cancel
Showing results for 
Search instead for 
Did you mean: 

Table comparison

Former Member
0 Kudos

Hi,

     I have a full refresh file with 1 million records. I should load it into a table initially and extract it in a file. From the second day I should get only the inserts and deletes. I used table comparison stage to achieve this. I tested my job with 3 lakh records and the table comparison takes approximately 10 minutes. Is there any way to improve the performance of this stage.

Note: I checked there is an option where we can set the comparison method as "sorted input" instead of row by row select. For this option, is it enough if my source file alone is sorted.

Accepted Solutions (0)

Answers (2)

Answers (2)

Former Member
0 Kudos

The sorted input option will always give you the best performance and should always be used with the table comparison transform, where possible.  The input query needs to be sorted on the primary key of the comparison table.

Former Member
0 Kudos

Thank you for your inputs. I just have one clarification for this stage. I have another case where  I need to do a table compare on a table that is already having records, and these records are not sorted. Would the sorted input option work in this case also. I will ensure that the records are sorted based on keys in my input file

Former Member
0 Kudos

Thank you for your inputs. I just have one clarification for this stage. I have another case where  I need to do a table compare on a table that is already having records, and these records are not sorted. Would the sorted input option work in this case also. I will ensure that the records are sorted based on keys in my input file

Former Member
0 Kudos

Hi Aditya,

No in case of sorted input both your input as well as output(Target) table data should has sorted data on primary key.

If it is not sorted then it may take more time then usual.

Thanks,
Swapnil

Former Member
0 Kudos

Can you please tell me how to sort the data in table within the table comparison stage. I do not have control over the table. can this be achieved within the etl job

Former Member
0 Kudos

The table does not need to be sorted - the input query needs to be sorted based on the primary key of the target table.

Former Member
0 Kudos

Thanks for your confirmation. That was really helpful

mageshwaran_subramanian
Active Contributor
0 Kudos

If only inserted/deleted needed , use joins rather than table comparison transform for better performance.