
Transformation with Start and End Routine Performance Issues

Former Member

Hi All,

We use start and end routines extensively in transformations. Typically, we will initialize various internal tables in a start routine, and then read these in an end routine.

In one such example, I load around 11 internal tables in a start routine, then use this data to populate additional fields in the end routine.
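
For illustration, each of these lookups follows roughly the pattern below. The DSO, field, and variable names are made up for the example; _ty_s_tg_1 is the target structure type generated for the transformation.

* Global declarations of the transformation (shared by start and end routine)
DATA: gt_custattr TYPE STANDARD TABLE OF /bic/azcustat00.

* Start routine: fill the lookup table once per data package
SELECT * FROM /bic/azcustat00
  INTO TABLE gt_custattr
  FOR ALL ENTRIES IN SOURCE_PACKAGE
  WHERE customer = SOURCE_PACKAGE-customer.

* End routine: read the lookup table for every result record
DATA: ls_custattr TYPE /bic/azcustat00.
FIELD-SYMBOLS: <result_fields> TYPE _ty_s_tg_1.

LOOP AT RESULT_PACKAGE ASSIGNING <result_fields>.
  " WITH KEY on a standard table means a linear scan per record
  READ TABLE gt_custattr INTO ls_custattr
    WITH KEY customer = <result_fields>-customer.
  IF sy-subrc = 0.
    <result_fields>-cust_group = ls_custattr-cust_group.
  ENDIF.
ENDLOOP.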

If I leave the packet size at 50,000 records, the performance of each packet nosedives: it can take 3–4 hours to process each packet. If I drop the packet size to 10,000 records, the start routine drops to around 4 minutes and the end routine to around 10 minutes.

Obviously we are looking at the code in the start routines to identify whether any improvements can be made. However, I came across OSS Note 1178077 (Effect of start routine on performance and extraction), which recommends not using start routines. Can anyone explain this note in more detail? Is it only applicable in very specific circumstances?

The source DSO for the above load has some 250 fields and the target DSO has around 200 fields. We have no InfoSource between the source and target, just the transformation.

We understand that one possible explanation is that we are pulling too much data into the internal tables and the data package, so the system starts swapping, which would explain the negative impact on performance. Reducing the packet size does seem to help, but even with the small 10k data package it still takes quite a long time. It has also been recommended to use a single expert routine, as this is supposedly the fastest way to process the data. Does anyone have experience with this technique? Does an expert routine support delta loads without issues?

I look forward to your responses.

Regards,

Mark

Accepted Solutions (0)

Answers (3)

Former Member

Thanks for the replies.

We have spent a lot of time reviewing the transformations and have identified a fair few potential performance improvements.

However, we seem to be limited by the amount of time it takes to physically build the internal tables.

We have noticed that the SELECT ... FOR ALL ENTRIES IN option generates many single SELECTs on the database, which is very slow.

Does anyone have any tips on how to improve this?

Regards,

Mark

dennis_scoville4
Active Contributor

A couple of things that you can look at in your Start Routine would be to:

1. Create your internal tables as TYPE HASHED TABLE WITH UNIQUE KEY as much as possible. This requires unique records; otherwise it will throw a duplicate key error. If you cannot guarantee uniqueness, create the internal table as TYPE SORTED TABLE WITH NON-UNIQUE KEY and then use DELETE ADJACENT DUPLICATES to reduce the table size.

2. When performing the SELECT statements to build the internal tables, be sure to use ...FOR ALL ENTRIES IN SOURCE_PACKAGE WHERE... This limits the internal table to only the records needed for the reads in the End Routine, which reduces the amount of memory required to store the internal table and avoids swap-outs (those kill your performance). A sketch combining points 1 and 2 follows the list.

3. Ensure that you have proper indexing on the database tables being read to load the internal tables.
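
Not an authoritative template, just a rough sketch of how points 1 and 2 fit together; the lookup DSO active table (/bic/azcustat00) and the key field (customer) are invented for the example:

* Global declarations: sorted table with non-unique key on the lookup field
DATA: gt_custattr TYPE SORTED TABLE OF /bic/azcustat00
                  WITH NON-UNIQUE KEY customer.
* (or TYPE HASHED TABLE OF ... WITH UNIQUE KEY customer when uniqueness is guaranteed)

* Start routine
IF SOURCE_PACKAGE IS NOT INITIAL.
  " FOR ALL ENTRIES against the data package restricts the lookup
  " table to the records this package actually needs
  SELECT * FROM /bic/azcustat00
    INTO TABLE gt_custattr
    FOR ALL ENTRIES IN SOURCE_PACKAGE
    WHERE customer = SOURCE_PACKAGE-customer.

  " thin the table out if the source can deliver duplicate keys
  DELETE ADJACENT DUPLICATES FROM gt_custattr COMPARING customer.
ENDIF.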

former_member184494
Active Contributor

Some things which might help:

1. Have hashed internal tables if you are looping and selecting (see the sketch below).

2. Keep the packet size around 25K (just a rule of thumb) and increase the number of background processes in the DTP.
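
To illustrate the first point, a hashed lookup inside the end routine loop could look like the sketch below; the table and field names are illustrative only, and _ty_s_tg_1 stands for the generated target structure type:

* Global declarations: hashed table gives constant-time reads
DATA: gt_matattr TYPE HASHED TABLE OF /bic/azmatat00
                 WITH UNIQUE KEY material.

* End routine
DATA: ls_matattr TYPE /bic/azmatat00.
FIELD-SYMBOLS: <result_fields> TYPE _ty_s_tg_1.

LOOP AT RESULT_PACKAGE ASSIGNING <result_fields>.
  " WITH TABLE KEY uses the hash key instead of scanning the table
  READ TABLE gt_matattr INTO ls_matattr
    WITH TABLE KEY material = <result_fields>-material.
  IF sy-subrc = 0.
    <result_fields>-matl_group = ls_matattr-matl_group.
  ENDIF.
ENDLOOP.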