We are facing an issue with Data Hub in which a Canonical item is duplicated, both in the database and in the resulting target ImpEx.
We have a Canonical item with a very simple unique key. This item is composed from a Raw item that comes from a SAP IDoc with many different segment definitions. In the database we see two rows of the same Canonical item type with the same integration key but different canonicalItemId values. One row has incomplete, empty attribute values, while the other has all attribute values correct. This matters because there is a dependency between Canonical items: when the 'resolve' expression is used, the Canonical item with the incomplete attribute values is the one that gets resolved, so the dependent Canonical item ends up with empty values.
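To make the symptom concrete, here is a minimal Python sketch of the check we are effectively doing by hand. It assumes the Canonical item rows have been exported into a list of dictionaries; the field names `integrationKey` and `canonicalItemId` and the sample values are illustrative only and may not match the real column names in the Data Hub schema.

```python
from collections import defaultdict


def find_duplicate_keys(rows):
    """Return integration keys that map to more than one canonicalItemId."""
    ids_by_key = defaultdict(set)
    for row in rows:
        ids_by_key[row["integrationKey"]].add(row["canonicalItemId"])
    # Keep only the keys that appear with several distinct ids.
    return {key: sorted(ids) for key, ids in ids_by_key.items() if len(ids) > 1}


# Hypothetical export: two rows share the key "K1", one of them incomplete.
rows = [
    {"integrationKey": "K1", "canonicalItemId": 101, "name": None},
    {"integrationKey": "K1", "canonicalItemId": 102, "name": "Complete"},
    {"integrationKey": "K2", "canonicalItemId": 103, "name": "Other"},
]
duplicates = find_duplicate_keys(rows)
print(duplicates)
```

Any key reported here corresponds to a Canonical item that 'resolve' could match against the incomplete row.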
What we have discovered is that the duplication happens when the number of Raw items in the database is greater than the value of the datahub.composition.batch.size property. It seems that on every iteration of a composition batch the Canonical item cache gets lost, so the key of an already existing Canonical item is not found and a new Canonical item is created with the same key.
We found that when the datahub.composition.batch.size value is reduced, more Canonical items are duplicated, and when the value is greater than the total number of Raw items, the duplication does not happen, either in the database or in the ImpEx file.
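The behaviour we suspect can be reproduced with a toy model. The Python sketch below is not Data Hub code; it only assumes, as a hypothesis, that the lookup of existing Canonical items is reset at the start of every composition batch. The function `compose` and the data layout are made up for illustration.

```python
from itertools import count


def compose(raw_items, batch_size):
    """Toy composition: group raw items into canonical items, batch by batch."""
    ids = count(1)
    canonical_items = []
    for start in range(0, len(raw_items), batch_size):
        # Hypothesis: the key lookup does not survive from one batch to the next.
        cache = {}
        for raw in raw_items[start:start + batch_size]:
            item = cache.get(raw["key"])
            if item is None:
                # Key not found in this batch, so a new canonical item is created.
                item = {
                    "canonicalItemId": next(ids),
                    "integrationKey": raw["key"],
                    "attributes": {},
                }
                cache[raw["key"]] = item
                canonical_items.append(item)
            item["attributes"].update(raw["attributes"])
    return canonical_items


# Six raw segments that all belong to the same integration key.
raw_items = [
    {"key": "K1", "attributes": {f"segment{i}": f"value{i}"}} for i in range(6)
]
for size in (6, 3, 2):
    print(size, len(compose(raw_items, size)))
```

With a batch size of 6 the model yields one Canonical item holding all six attributes; with 3 it yields two, and with 2 it yields three, each holding only part of the attributes. That matches what we observe: smaller batch sizes give more duplicates, and a batch size at least as large as the number of Raw items gives none.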
We are using Data Hub 22.214.171.124 with a customized Data Hub extension, including sapidocintegration and sapcoreconfiguration.