Former Member

How to limit the usage of the transaction log

Good afternoon,

I'm looking for any improvement that would reduce the usage of my transaction log during the execution of some batches inserting millions of rows into a table. My batch is composed of 6 independent child processes; each process works on a set of 2000 references, knowing that each reference can insert 50 to more than 100 rows into the table.
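
Roughly, each child process does something like this for every chunk (the names below are placeholders, and the real schema and commit granularity differ):

    -- one chunk = 2000 references; each reference expands to 50-100+ rows
    begin tran
    insert my_target_table (ref_id, col_a, col_b)
    select s.ref_id, s.col_a, s.col_b
    from my_source_table s, my_chunk_refs r
    where s.ref_id = r.ref_id
    commit tran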

Example of metrics

time ; # of chunks of 2000 refs ; number of refs ; number of rows inserted

15:24 ; 29 ; 58000 ; 3190000
15:25 ; 31 ; 62000 ; 3410000
15:26 ; 31 ; 62000 ; 3410000
15:27 ; 13 ; 26000 ; 1430000
15:28 ; 20 ; 40000 ; 2200000
15:29 ; 41 ; 82000 ; 4510000
15:30 ; 33 ; 66000 ; 3630000
15:31 ; 64 ; 128000 ; 7040000
15:32 ; 34 ; 68000 ; 3740000
15:33 ; 30 ; 60000 ; 3300000
15:34 ; 29 ; 58000 ; 3190000
15:35 ; 17 ; 34000 ; 1870000
15:36 ; 21 ; 42000 ; 2310000
15:37 ; 29 ; 58000 ; 3190000
15:38 ; 21 ; 42000 ; 2310000
15:39 ; 7 ; 14000 ; 770000
TOTAL = 49500000

Today the batch completes correctly, but my syslogs fills up to 95%.

Knowing that my batch is currently tested on a reduced scope, the final scope might generate more data and fill my syslogs.

Moreover, I'll probably have other sets of batches launched in parallel against the same dataserver and database; they can also fill my syslogs.

To prevent syslogs from filling up completely, my only option might be to increase the log size.

On the other hand, I'm running unit tests to see how to tune my dataserver to limit how fast my syslogs fills.

From my understanding

  • Each spid works on log records stored in its ULC (user log cache).
  • When the modifications are finished or the ULC is full, the records are flushed to the log cache and so to syslogs.
  • Because we're talking about 'log' information, the flush to cache and disk is done serially. Then, every minute, the checkpoint process wakes up and checks whether the amount of modified data exceeds the 'recovery interval'; if so, it tries to write all dirty pages to disk. (The sketch after this list is how I try to observe this.)
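
To verify that picture while the batch runs, I take sp_sysmon samples (a sketch; the section names below are from my report and may differ slightly by version):

    -- one-minute sample taken while the batch is running
    sp_sysmon "00:01:00"
    go
    -- sections I read in the report:
    --   "Transaction Management": ULC flushes and their causes (full ULC, end of transaction, ...)
    --   "Recovery Management":    checkpoint activity
    -- and the configured recovery interval itself:
    sp_configure "recovery interval in minutes"
    go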

Based on that, I configured my 4K-page dataserver as follows (the exact commands are sketched after this list):

  • create a logonly cache with a 4K pool and 32K pool
  • bind the syslogs of my database to the new cache
  • configure 'user log cache size' (ULC) to 32K
  • configure sp_logiosize for my db to 32K
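
Concretely, the commands were along these lines (the cache name and sizes here are illustrative, not my real values):

    -- named log-only cache (size is illustrative)
    sp_cacheconfig "log_cache", "200M", logonly
    go
    -- add a 32K buffer pool to it (the 4K pool is the default on a 4K-page server)
    sp_poolconfig "log_cache", "100M", "32K"
    go
    -- bind syslogs of my database to the new cache
    -- (depending on the version, binding syslogs may require the db in single-user mode)
    sp_bindcache "log_cache", mydb, syslogs
    go
    -- 'user log cache size' is server-wide and given in bytes
    sp_configure "user log cache size", 32768
    go
    -- log I/O size is set per database, from inside it
    use mydb
    go
    sp_logiosize "32"
    go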

By doing so, I was able to limit the time it takes for the data to go from ULC to syslogs (by increasing the ULC size, I reduced the number of ULC flushes caused by a full ULC).

Then I increased 'disk i/o structures' and 'i/o batch size', with the goal of submitting as much I/O as possible to the system in order to limit the checkpoint time (see the sketch below).
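
For reference, that tuning was along these lines (the values are just what I tested, not recommendations):

    -- allow more concurrent outstanding disk I/Os (illustrative value)
    sp_configure "disk i/o structures", 4096
    go
    -- number of writes submitted per batch (illustrative value)
    sp_configure "i/o batch size", 256
    go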

I don't know if I succeeded, but what I saw with ASETUNE is:

  • syslogs gets filled up to 95% before the batch finishes
  • I can see a lot of 'physical_io' for the HOUSEKEEPER process
  • the other HK processes are doing I/Os, but not on the same scale as the HOUSEKEEPER.

Having said that, ASETUNE showed me:

  • the devices attached to syslogs are stressed
  • my syslogs gets filled; I can see it fill and then empty, but at some point it keeps filling faster than it is emptied, right up to the end of my batch
  • the data devices are also stressed, and the main object I can see on them is my target table.

But now I don't know how I should proceed to see whether there is anything else I could do to improve my batch execution without increasing the syslogs size.

What would be your suggestion to continue working on this topic?

Thanks for your feedback

Simon


2 Answers

  • Best Answer
    Dec 04, 2017 at 04:11 PM

    Sounds like you have a couple different issues ...

    - amount of log activity; possible solutions ... increase log size or decrease volume of log activity

    - filling up the log 'too fast'; possible solutions ... increase log size or truncate log more frequently

    "Duh, Mark!" ?

    --------------

    No definitive 'solutions' at this point, just a few questions/ramblings ...

    - how big is the log? [if it's only a couple hundred MBs then by all means consider bumping it up to 1-2GB]
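
    A quick way to check (sketch):

        -- overall database/log size and device fragments
        sp_helpdb mydb
        go
        -- used vs free space in the log itself
        use mydb
        go
        sp_spaceused syslogs
        go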

    - are you replicating out of this database and if so, have you looked at SRS performance and tuning options?

    - are you explicitly truncating the log (eg, 'dump tran ... to file or truncate only') and if so, can you truncate more frequently?
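
    For example (sketch; the dump file path is just a placeholder):

        -- keep the log chain for recovery
        dump transaction mydb to "/dumps/mydb_tran.dmp"
        go
        -- or simply free up log space, giving up point-in-time recovery
        dump transaction mydb with truncate_only
        go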

    - if log truncations are taking too long, is there a performance issue with writing to disk (assuming 'dump tran' to disk)?

    - are these batches **only** doing inserts, or are they also performing other operations (eg, updates/deletes) on the data?

    - any chance these batches are rebuilding previous data sets (eg, EOD/EOW processing that rebuilds clients' holdings/positions/P&L)? [I've worked with some firms - usually banks - where they daily rebuild client data sets - to speed up intraday queries - where 95-99% of the data typically doesn't change from day to day; by replacing the rebuild-everything-from-scratch process with a replace-only-the-recent-changes process I've been able to reduce log activity from 3-5GB down to a few 100MBs; so, wondering if a rework/rethink of the current process could reduce the volume of log activity?]

    - does the entire batch process need to be recoverable, ie, do you need to write the log to disk? [wondering if some portion of the activity could be performed in a separate 'staging' database where logs could be discarded/truncated more efficiently ... to include disabling HK on an associated cache, enabling 'delayed commit', etc]
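
    If the answer is 'no', that kind of setup could look like this (sketch; 'staging_db' is a made-up name, and the availability of 'delayed commit' depends on your ASE version):

        use master
        go
        -- truncate the staging db's log automatically at each checkpoint
        sp_dboption staging_db, "trunc log on chkpt", true
        go
        -- optionally relax log flushes at commit time (no guaranteed recoverability)
        sp_dboption staging_db, "delayed commit", true
        go
        use staging_db
        go
        checkpoint
        go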

    - have you reviewed the query plans for one of the batches and if so, any sign of deferred operations? [deferred operations tend to generate an excessive volume of log records, so eliminating/reducing deferred operations can often times reduce the volume of log activity]
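
    For example, a compile-only look at one of the batch statements (sketch):

        set showplan on
        go
        set noexec on
        go
        -- paste one representative batch statement here;
        -- look for "deferred" (eg, "deferred update") in the showplan output
        go
        set noexec off
        go
        set showplan off
        go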

    - do you have triggers or FK constraints on the table(s) in question and if so, would it be feasible to disable these options (triggers, FK constraints) during the batch processing window? [again, looking at ways to reduce log activity, though eliminating these options could speed up batch processing which in turn could fill the log even faster]
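
    A quick way to see what is attached to the target table (sketch; 'my_target_table' is a placeholder):

        -- referential integrity and check constraints on the table
        sp_helpconstraint my_target_table
        go
        -- names of any insert/update/delete triggers on it
        select object_name(instrig), object_name(updtrig), object_name(deltrig)
        from sysobjects
        where name = "my_target_table" and type = "U"
        go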


    • NP Simon.

      Looks like database recovery is not a show-stopper issue for your use-case.

      So your threshold procedure (if and when it gets triggered) should just truncate the log (a dump tran option) for this type of database where 'trunc log on chkpt' is configured. If you are doing large-scale inserts via bcp (or otherwise), just add multiple thresholds on the logsegment so that the log is truncated as often as possible (a sketch follows).
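
      A minimal sketch of that (the proc name and free-page levels are just examples):

          -- run from the database that owns the log
          use mydb
          go
          -- threshold action proc: just free the log space when fired
          create procedure trunc_log_proc
              @dbname      varchar(30),
              @segmentname varchar(30),
              @space_left  int,
              @status      int
          as
              dump transaction @dbname with truncate_only
              print "log truncated in %1! (threshold on %2!)", @dbname, @segmentname
          go
          -- attach it to logsegment at a couple of free-space levels (in pages)
          sp_addthreshold mydb, logsegment, 20000, trunc_log_proc
          go
          sp_addthreshold mydb, logsegment, 50000, trunc_log_proc
          go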

      HTH

      Avinash

  • Dec 04, 2017 at 11:10 PM

    The transaction log is shared by all SPIDs running DML in the database, so 100% of it may not be available to your SPIDs.

    Normally "run" and "sleep" approach should work.

    Check the log space available "before" you start DML SQL.

    Running sp_spaceused on syslogs is the simplest way to get the number of free pages.

    After running your "reasonable" number of DML operations and a commit, check the transaction log space again.

    You can build your own thresholds to "start" the next set of DMLs when the free space in syslogs is above a certain percentage, e.g. 50% free (or when a hard number of pages is free), and to stop after your reasonable number of DML transactions is done. (You can start with 1K DMLs and raise the throttle in chunks of 1K.)

    Then enter a "sleep" cycle for a random amount of time (e.g. 10 seconds to 100 seconds).

    During this sleep the log will be truncated, or a tran dump may happen, by way of a schedule or a threshold trigger.

    Then start the next "run" cycle.
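
    A rough sketch of such a throttle inside one child process (the 20000-page floor and 10-second sleep are arbitrary examples; lct_admin is just one way to read free log pages, sp_spaceused on syslogs shows the same thing interactively):

        -- before each "run" cycle: wait until syslogs has enough free pages
        declare @free_pages int
        select @free_pages = lct_admin("logsegment_freepages", db_id("mydb"))
        while (@free_pages < 20000)
        begin
            waitfor delay "00:00:10"    -- sleep; let scheduled dumps / thresholds catch up
            select @free_pages = lct_admin("logsegment_freepages", db_id("mydb"))
        end
        -- ... then run the next reasonable set of DMLs and commit ...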

    The following will be important factors in building your logic:

    - are the tran logs automatically truncated

    - if not, how frequently are the tran dumps done, or are they fired by thresholds

    - how wide is the table you are inserting into (fewer wide rows versus more short rows, in terms of log space)

    - do the tables have DML triggers, RI, or other constraints that slow down the DML

    You may have to test this in controlled way to get the right mix.

    HTH

    Avinash
