Today I will discuss performance tuning in SAP Data Services.

Performance tuning in SAP Data Services can be divided into two parts:

Configuration/Setting Changes:

1. Set the Degree of Parallelism (DOP) option of the data flow to a value greater than one, typically 3 or 4. Combine it with Data Integrator features such as table partitioning and file multithreading to get better performance, as sketched below.
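As an illustration of the table-partitioning side, here is a minimal Oracle sketch (the table and column names are hypothetical); with a range-partitioned source like this, Data Services can read the partitions in parallel when DOP and partitioned reading are enabled:

    -- Hypothetical Oracle source table, range-partitioned by date.
    -- Data Services can scan these partitions in parallel instead
    -- of performing a single full-table scan.
    CREATE TABLE sales_source (
        sale_id    NUMBER,
        sale_date  DATE,
        amount     NUMBER(12,2)
    )
    PARTITION BY RANGE (sale_date) (
        PARTITION p2022 VALUES LESS THAN (DATE '2023-01-01'),
        PARTITION p2023 VALUES LESS THAN (DATE '2024-01-01'),
        PARTITION pmax  VALUES LESS THAN (MAXVALUE)
    );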

2. On the source database, increase the size of database I/O and the size of the shared buffer so that more data is cached; this helps SELECT statements execute faster.
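This is a DBA task on the database side; a minimal sketch, assuming an Oracle source and treating the size as a placeholder (the right value depends on your version and available memory):

    -- Enlarge the buffer cache so more source data stays in memory;
    -- 4G is only an illustrative value, not a recommendation.
    ALTER SYSTEM SET db_cache_size = 4G SCOPE = BOTH;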

3. On the target database, disable archive logging and disable redo logging for all tables. This helps INSERT and UPDATE statements execute faster.
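On Oracle, for instance, per-table redo can be minimized as below (a minimal sketch with a hypothetical table name; archive logging itself is switched off at the database level by a DBA):

    -- Put the target table into NOLOGGING mode so that direct-path
    -- (bulk) inserts generate minimal redo.
    ALTER TABLE sales_target NOLOGGING;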

4. Increase the monitor sample rate to 40,000 rows or more, as per the standard recommendation, to reduce monitoring overhead and improve performance.

5. If a virus scanner is configured on the job server, exclude the Data Services logs from the scan to avoid performance degradation.

6. For the first execution, select the option COLLECT STATISTICS FOR OPTIMIZATION. From the second execution onwards, run with the collected statistics (the "Use collected statistics" option is selected by default).

7. Design jobs in such a way that one 'al_engine' process runs per CPU at a time.

8. For large data volumes, increase the Array Fetch Size value to avoid a round trip to the database for every row fetched. Set it to more than 800.


Design/Development Changes:

1. Push down as much of the transformation logic as possible to the database layer to leverage the power of the database engine. To check the optimized SQL that is pushed down to the database engine, use Validation > Display Optimized SQL on the data flow; a sketch of what fully pushed-down SQL looks like follows below.
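When a data flow is fully pushed down, the optimized SQL is typically a single INSERT ... SELECT statement that the database executes entirely on its own. A minimal sketch with hypothetical table and column names:

    -- Fully pushed-down data flow: the data movement and the
    -- transformation logic both run inside the database engine,
    -- so no rows travel through the Data Services job server.
    INSERT INTO sales_target (customer_id, total_amount)
    SELECT customer_id, SUM(amount)
    FROM   sales_source
    WHERE  sale_date >= DATE '2024-01-01'
    GROUP  BY customer_id;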


However, there are a few restrictions and caveats for the SQL pushdown operation:

  • When pushdown is not possible in a data flow, enable the bulk loader on the target table. Bulk loading is much faster than a direct (conventional) load.

  • If "SELECT DISTINCT" is used, the code will not be fully pushed down. If you cannot avoid it, apply the Distinct option in the last query transform before the target table (see the sketch after this list).
  • Use a single query transform when you want to apply both GROUP BY and ORDER BY.
  • Try to avoid data type conversions, which prevent full pushdown.
  • Try to avoid parallel execution of query transforms, which also prevents full pushdown.
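To illustrate the SELECT DISTINCT point, here is a minimal sketch (hypothetical names) of the optimized SQL you would hope to see when the Distinct option sits only in the final query transform:

    -- With Distinct applied only in the last query transform, the
    -- whole statement can still be pushed down as one DISTINCT query.
    INSERT INTO customer_target (customer_id, region)
    SELECT DISTINCT customer_id, region
    FROM   customer_source
    WHERE  status = 'ACTIVE';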

2. Join Rank: Define join ranks properly to make execution faster. Open the Query Editor and assign higher ranks to the larger tables; the highest-ranked table acts as the driving table of the join. The trace log file lets you see the order in which the Data Services Optimizer performs the joins, which helps you verify the improvement. To enable the trace, open the Execution Properties dialog, go to the Trace tab, and switch on the "Optimized Data Flow" trace option.

3. Set the "Rows per Commit" value on the target table between 500 and 2000 (the default is 1000).


4. Try to split the complex logic of a single data flow into multiple data flows. This makes maintenance easier and helps SQL pushdown.

5. Create indexes on the columns used in WHERE clauses as join conditions. This is a major point that can improve performance drastically.
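For example (hypothetical table and column names):

    -- Index the column used in the join/WHERE condition so the
    -- pushed-down SQL can use an index scan instead of a full scan.
    CREATE INDEX idx_sales_customer_id
        ON sales_source (customer_id);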

6. Full pushdown is not always possible. In that scenario, if the dataset is large, enable the bulk loader on the target table.

7. Use a join instead of the lookup function where possible, as shown in the sketch below.
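For instance, instead of calling the lookup function once per row, model the lookup as a join in a query transform so it can be pushed down as one set-based operation; the resulting SQL would look roughly like this (hypothetical names):

    -- A LEFT OUTER JOIN replaces a row-by-row lookup and lets the
    -- database resolve all matches in a single operation.
    SELECT s.sale_id,
           s.amount,
           c.customer_name
    FROM   sales_source s
    LEFT OUTER JOIN customer_master c
        ON c.customer_id = s.customer_id;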
