
Rows per commit & Network packet size

Former Member
0 Kudos

Hi Experts,

How do Rows per commit and Network packet size go hand in hand, and is there a best way to find the optimum values for both?

I read the below in an article, and that made me raise this question.

It is recommended that you set the Rows per commit and Network packet size parameters to avoid sending many partially filled packets over the network, and to ensure that the packet size accommodates all rows in the commit.
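That recommendation boils down to simple arithmetic: a commit batch of Rows per commit × average row size bytes gets split into network packets, and only the last packet can be partially filled. A quick sketch of that check (the row and packet sizes below are illustrative assumptions, not Data Services or SQL Server defaults):

```python
import math

# Back-of-envelope check: how many packets does one commit batch occupy,
# and how full is the last one? All numbers here are hypothetical.
def packets_per_commit(rows_per_commit, row_bytes, packet_bytes):
    batch_bytes = rows_per_commit * row_bytes
    packets = math.ceil(batch_bytes / packet_bytes)
    # Fill ratio of the final packet (1.0 when the batch divides evenly).
    last_fill = (batch_bytes % packet_bytes) / packet_bytes or 1.0
    return packets, last_fill

# 1000 rows of ~500 bytes each with a 4096-byte packet:
print(packets_per_commit(1000, 500, 4096))  # -> (123, 0.0703125)
```

With these made-up numbers the last packet is only ~7% full; nudging either parameter so the batch size lands close to a multiple of the packet size avoids that waste.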



Thanks,

AJ.

Accepted Solutions (1)


former_member213365
Active Participant
0 Kudos

I rarely change the network packet size, but only because I rarely work on SQL Server.  Oracle doesn't have that "feature".

With regard to the rows per commit, you should also consider the array fetch size.  Think of them as buckets that work together.  If you change one then you may need to change the other.  A big bucket for the source table and a small bucket for the target means that a lot of those source rows may be waiting - on the job server - for the target rows to get flushed to the database.

Both settings can have a dramatic impact not just on performance but also on memory.

My general rule of thumb is: wide row size = small array fetch size and small rows per commit.  It's a "rule of thumb" so it varies by the situation.  A target with three numeric columns?  I might go with a large rows per commit.  A target with 200 columns or perhaps with five VARCHAR(4000) columns?  Small rows per commit.

You also should take into account what is actually happening at the target table.  If your Dataflow updates or inserts only a couple of rows per day then why would you want a rows per commit of 5,000?  Better to bump it down to perhaps 100 to reduce the amount of memory that the buffer grabs when the Dataflow is instantiated.
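To put a rough number on that memory point (this linear sizing is my own simplification for illustration, not a documented Data Services formula):

```python
# Crude upper bound on a target buffer: rows_per_commit x bytes per row.
# Both the model and the numbers are illustrative assumptions.
def buffer_bytes(rows_per_commit, row_bytes):
    return rows_per_commit * row_bytes

wide_row = 5 * 4000  # e.g. five VARCHAR(4000) columns, single-byte charset
print(buffer_bytes(5000, wide_row))  # -> 100000000 (~95 MB per buffer)
print(buffer_bytes(100, wide_row))   # -> 2000000 (~2 MB per buffer)
```

For a trickle-feed target, dropping from 5,000 to 100 rows per commit shrinks that reserved buffer by a factor of 50.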

The rows per commit on the target table can be critical if you're using Number of Loaders > 1.  The reason for this is that each loader is a buffer.  When buffer #1 is full then it gets flushed to the database.  The larger the buffer the longer you have to wait to send off those rows.  Number of Loaders = 20 when you have a very large rows per commit value could mean that you have defeated the purpose of using 20 loader buffers.  There may never be more than 2 in use.  If you set the rows per commit to a lower value then there may be a much higher number of loaders/buffers that actually get used.
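That loader-starvation effect can be sketched with a toy queueing model (the model and numbers are my own assumptions, not how Data Services actually schedules loaders): a buffer needs some fill time to accumulate one commit batch and some flush time to write it out, so roughly flush_time / fill_time + 1 buffers are ever busy at once.

```python
import math

# Toy model, not Data Services internals: estimate how many of the
# configured loader buffers can actually be busy at the same time.
def loaders_busy(n_loaders, rows_per_commit, rows_per_sec, flush_secs):
    fill_secs = rows_per_commit / rows_per_sec   # time to fill one buffer
    return min(n_loaders, math.ceil(flush_secs / fill_secs) + 1)

# 20 loaders, source feeding 10,000 rows/s, one flush taking 0.5 s:
print(loaders_busy(20, 50_000, 10_000, 0.5))  # -> 2  (huge batches)
print(loaders_busy(20, 1_000, 10_000, 0.5))   # -> 6  (small batches)
```

With a very large rows per commit only a couple of the 20 buffers ever do work; smaller batches keep more of the configured loaders busy.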

Caveat: Be careful with Number of Loaders > 1.  You can create a huge headache for yourself if you don't use it right (i.e. database locks and deadlocks).  I can kill your ETL by changing Number of Loaders to 2 when the target is an Oracle table with bitmap indexes.

As a Performance Tuning consultant I love going into sites where somebody read on a forum that a rows per commit value of 5,000 was optimal.  Low hanging fruit like that is great!  As Severin stated above, you need to test these settings.  Use the defaults.  I use the defaults on probably half the Dataflows I work on.  Only after you determine that performance isn't optimal, and after determining that the performance problem isn't somewhere else in the Dataflow, should you start tweaking the array fetch size and rows per commit.

Former Member
0 Kudos

Great information, Jim. On a separate note, the post in the forum does say that 1000 is the optimal value for rows per commit, which is the default value in SAP Data Services. In my opinion, the point the author was trying to make was to use the default values unless you have poor performance.

Cheers,

Chaitanya

Former Member
0 Kudos

Thanks for sharing!

Answers (1)


Former Member
0 Kudos

Arun,

Check this link for how rows per commit impacts performance. This is independent of changing the default network packet size. Unless you are using the bulk loader option, the row commit size of 1000 seems to be the optimal value.

The impact of the CommitSize (SQL Server) - Enterprise Information Management - SCN Wiki

Cheers,

Chaitanya

Former Member
0 Kudos

Thanks for sharing.

The document clearly explains how the commit size works. But is there any way we can calculate and adjust the network packet size?

Regards,

AJ.

severin_thelen
Contributor
0 Kudos

I do not think that calculating the size would help you. In my opinion you have to test it: monitor network traffic, DB performance, and job duration, then run some test jobs with different row settings and pick the best result for your productive system.

I do not think there is another way, because there are a lot of things that can influence the result, so you cannot work this out purely theoretically.

Regards

Severin