I've noticed strange behavior when using a Table_Comparison transform to sync tables with Data Services 4.2 SP8 Patch 2. Here's a demo scenario (Oracle) to illustrate:
First, create the required source and target tables:
CREATE TABLE TEST_SOURCE
(
  ID      NUMBER,
  CODE_A  VARCHAR2(50 BYTE),
  CODE_B  VARCHAR2(50 BYTE)
);

CREATE UNIQUE INDEX TEST_SOURCE_PK ON TEST_SOURCE(ID);

ALTER TABLE TEST_SOURCE ADD (
  CONSTRAINT TEST_SOURCE_PK PRIMARY KEY (ID)
  USING INDEX TEST_SOURCE_PK ENABLE VALIDATE);

CREATE TABLE TEST_TARGET
(
  ID      NUMBER,
  CODE_A  VARCHAR2(50 BYTE),
  CODE_B  VARCHAR2(50 BYTE)
);

CREATE UNIQUE INDEX TEST_TARGET_PK ON TEST_TARGET(ID);

ALTER TABLE TEST_TARGET ADD (
  CONSTRAINT TEST_TARGET_PK PRIMARY KEY (ID)
  USING INDEX TEST_TARGET_PK ENABLE VALIDATE);
Next, create a new data flow "DF_TEST_SYNC" in DS to perform the inserts, updates and deletes needed to keep the target in sync with the source (the intended result is sketched in SQL after the list):
Q_sort: order the data by the "ID" column.
Table_Comparison: "ID" set as the primary key column for the comparison.
Default target table options are kept.
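For clarity, this is roughly the end state the data flow is expected to produce, expressed as plain Oracle SQL. This is only a sketch of the intended outcome, not what Data Services generates internally:

MERGE INTO TEST_TARGET t
USING TEST_SOURCE s
ON (t.ID = s.ID)
WHEN MATCHED THEN
  UPDATE SET t.CODE_A = s.CODE_A, t.CODE_B = s.CODE_B
WHEN NOT MATCHED THEN
  INSERT (ID, CODE_A, CODE_B) VALUES (s.ID, s.CODE_A, s.CODE_B);

-- rows that no longer exist in the source must be removed from the target
DELETE FROM TEST_TARGET t
WHERE NOT EXISTS (SELECT 1 FROM TEST_SOURCE s WHERE s.ID = t.ID);

COMMIT;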
Test no.1
Prepare the source table by inserting a million rows:
TRUNCATE TABLE TEST_SOURCE;

INSERT INTO TEST_SOURCE
SELECT LEVEL, 'CODE_A_' || LEVEL, 'CODE_B_' || LEVEL
FROM DUAL
CONNECT BY LEVEL <= 1000000;

COMMIT;

-- make sure the target table is empty
TRUNCATE TABLE TEST_TARGET;
Sync both tables by running the data flow.
Number of rows in target: 1,000,000 => OK
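The row counts quoted throughout were checked with a simple query along these lines:

SELECT (SELECT COUNT(*) FROM TEST_SOURCE) AS SOURCE_ROWS,
       (SELECT COUNT(*) FROM TEST_TARGET) AS TARGET_ROWS
FROM DUAL;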
Delete the FIRST 50,000 IDs from the source table:
DELETE FROM TEST_SOURCE WHERE ID <= 50000;
COMMIT;
Run the data flow again => target now has 950,000 rows => OK
Test no.2
Reset the source and target tables:
TRUNCATE TABLE TEST_SOURCE;

INSERT INTO TEST_SOURCE
SELECT LEVEL, 'CODE_A_' || LEVEL, 'CODE_B_' || LEVEL
FROM DUAL
CONNECT BY LEVEL <= 1000000;

COMMIT;

-- make sure the target table is empty
TRUNCATE TABLE TEST_TARGET;
Sync both tables by running the data flow.
Number of rows in target: 1,000,000 => OK
Delete the LAST 50,000 IDs from the source table:
DELETE FROM TEST_SOURCE WHERE ID > 950000;
COMMIT;
Run the data flow again => target now has 999,473 rows => NOT OK. DS has detected rows to delete, but has removed only a fraction of the 50,000 rows that should be gone.
Run the data flow again => target now has 998,869 rows. This continues, run after run, until eventually all 50,000 rows are deleted.
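A quick way to count the stale rows still left in the target after each run (just a diagnostic query on my side, not part of the data flow):

SELECT COUNT(*) AS STALE_ROWS
FROM TEST_TARGET t
WHERE NOT EXISTS (SELECT 1 FROM TEST_SOURCE s WHERE s.ID = t.ID);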
I've seen a similar issue with DS 4.1: https://archive.sap.com/discussions/thread/3948660
Has anyone else encountered this problem with Data Services?
I forgot to add that I ran some additional tests to see whether they had an impact on the results. They didn't.