
update Index Stats using Sampling

Former Member
0 Kudos

Hi Team, good afternoon.

We currently have a job that runs UIS (update index statistics) with sampling = 30, and it is failing on a table that has 293,412,179 rows and one CI. The error is that tempdb is full for the DB maintenance account, which is mapped to the DBA tempdb:

Can't allocate space for object 'temp worktable' in database 'DBA_tempdb' because 'system' segment is full/has no free extents. If you ran out of space in syslogs, dump the transaction log. Otherwise, use ALTER DATABASE to increase the size of the segment.

When we run the same job with sampling = 10 it works fine.

Questions: What are the risks associated with changing the sampling? Will that impact the query plans for the app using that table?

-Sid
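For reference, this is roughly what the maintenance job runs (the table name below is only a placeholder), along with the break-fix the error message itself suggests:

-- illustrative only; real table and device names differ
update index statistics big_table with sampling = 30 percent
go
-- the error's suggested break-fix: grow the maintenance tempdb
-- alter database DBA_tempdb on <device_name> = <size_in_MB>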

Accepted Solutions (0)

Answers (3)

former_member182259
Contributor
0 Kudos

If you are on 15.7 SP130+, use update statistics with hashing instead of update statistics with sampling.   My opinion is that it is more accurate than sampling, and it would definitely eliminate the issue with tempdb, as hashing uses next to nothing in tempdb space - as a result it is often much faster as well, due to not having to wait for IOs.
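As a rough illustration (the table name is a placeholder; verify the exact options against your release's documentation):

-- hash-based statistics gathering instead of sampling
update index statistics big_table with hashing
go
-- there is also a partial option that hashes only low-domain columns
-- update index statistics big_table with partial_hashing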

simon_ogden
Participant
0 Kudos

Definitely agree that you should move to hash-based stats, but I would very strongly recommend not doing this in production without some significant testing in UAT first. It is much more stable as a feature in later releases and should be safe to use, but given its history of occasional bad histograms and stack traces it should not be undertaken lightly, and only implemented after some significant testing.

Former Member
0 Kudos

Thank you Jeff, appreciated. We are still on SP122, but we plan to go to SP135 in the next 3 months.

What I wanted to understand is the query plan behavior with different sampling percentages. Adding space is a break-fix, but with the same space and a different sampling percentage the job does work - how would the queries behave differently? The concern from the developers is "Will that impact the query plans for the app using that table?" I have requested them to see if they can provide any details on the queries hitting that table, as it currently has only one view dependent on the table.

-Sid

former_member182259
Contributor
0 Kudos

You definitely should think of moving off SP122 quicker.....not a lot of changes in SP135 from where you are currently.   WRT sampling and stats, think of it this way.   <start simplified explanation>

Normally, without sampling, we scan the table (or the index, if updating stats only on a single index) and record the distinct values along with their frequency of occurrence. When finished, we construct the (default) 20 step values and compute the cell histograms. For example, let's say we have a table with 100 rows with an identity column with values 1-100. When finished, the stats would have 20 histogram cells and we would have a weight of 5/100 for each step (0.05).

With stats sampling at 10%, we start scanning the table and read every 10th row. For each such row, we record the values for the index keys in a work table...so we would read rows 10, 20, 30, 40, 50, 60, 70, 80, 90 & 100. In our case we have gotten lucky, as the max value (100) was detected during the sampling....but we might not be able to create 20 steps as we only have 10 values, so we only create 10 histogram cells with 10/100 (0.10) as the weighting for each step.

Now, let's increase to 30% sampling (or let's say 33% for fun, to make things easier). Now the values we read are 3, 6, 9, 12, 15, 18, ...., 90, 93, 96, 99. We will have ~33 distinct values from which we can construct 20 steps, and the weight of each histogram cell would be ~7/100. However, we do have more steps; consequently, queries with IN() or OR might do better, as the finer granularity is less likely to aggregate above the ~40% selectivity at which a table scan is chosen - for example, where col in (15,25,35,45,55) might result in a table scan with 10% sampling due to costing each of the 5 SARGs (with aggregation due to the OR logic), whereas with 30% sampling it might not. However, a query with col=100 will hit the out-of-range histogram (if enabled) with 30% sampling, whereas it hits a cell with 10% sampling. In that case we are lucky, as we got the last row with 10% sampling vs. say if the values were more interspersed.

So, I would say that generally 30% sampling gives *better* statistics, but there may be edge cases where it is worse. I think the lower the sampling percentage, the bigger the issue would be with low(er)-cardinality columns - particularly where the number of distinct values in the column times the sampling fraction is less than the number of histogram steps requested. For really high-cardinality columns such as names, it likely makes minimal difference. In either case, if you have issues, it may be as much the number of histogram cells (the 'using N values' clause of update index statistics) as the sampling %...... One can't say what the exact impact would be without seeing the actual stats that result, as well as the query predicates.
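If the concern is histogram granularity rather than the sampling rate itself, a hedged sketch of asking for more steps while keeping sampling (table name and step count are illustrative):

-- request 40 histogram steps instead of the default 20, still sampling at 30%
update index statistics big_table using 40 values with sampling = 30 percent
go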

One of the differences is that in scanning the table, we end up creating a worktable that has to be sorted for each column, and then the frequency cells/histogram steps are derived from it.   This is where tempdb (and proc cache) get hit the hardest - and where a lot of the slowness comes from, as the worktables often end up flushed to disk and need physical IO to re-read when sorting/aggregating the stats.   With hash stats, we use an in-memory hash table for the values, hence no tempdb and no sorting, and hence a lot less PIO involved on the tempdb side. It is likely to give you better stats, as it reads the entire table, but I have found it runs 5x+ faster, which should give you the speed you were after with sampling.
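A minimal way to see the difference on a test copy (table name is a placeholder) - compare the write counts and elapsed times between the two runs:

set statistics io on
set statistics time on
go
update index statistics big_table with sampling = 30 percent   -- worktable + sort path
go
update index statistics big_table with hashing                  -- in-memory hash path
go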

Former Member
0 Kudos

Hey Jeff, update (index) stats with hashing has been available since 2012 (v15.7 ESD#2); you stressed SP130+ - is this due to bugs or something else?

We did notice that high domain hashing might produce less accurate histograms, and hash-based statistics may use a significant amount of tempdb buffer cache.

We also saw high CPU/IO usage related to update stats; not sure if SAP has a fix for CR# 757246.

former_member182259
Contributor
0 Kudos

Update stats with hashing was added for LOW domain attributes only starting in ESD #2, but really wasn't very usable until about ESD #4 (we tried using it on an early project with mixed results until ESD #4).    Perhaps this is why you noted high domain hashing using a lot of tempdb buffer cache???   I started running a lot of tests in SP100+ & SP110 as HIGH domain hashing was finally added.   As far as why SP130+: I would avoid SP12x due to some known issues, and anything earlier is simply too old.   There were some reported issues with hash stats in the SP110 & SP120 range - I was never sure of the exact scenarios as I never hit any; however, I do know that after SP130+, reports of issues are minimal.   One thing I did that may make a difference is that I always ran them without any attempt to use worker processes/consumers, and ran in the threaded kernel (vs. process).   I will also state that I ran the tests with set statistics resource on and a few other traces and compared to normal update stats without hashing - and update stats with hashing almost always used multiple orders of magnitude fewer resources in tempdb and proc cache.   Will update stats with hashing use buffers in the tempdb cache - absolutely - but far, far less than normal update stats.
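Something like the following shows the server-wide default and the per-statement resource usage mentioned above (parameter/option names as I understand them for 15.7 SP100+; please verify on your release; the table name is a placeholder):

sp_configure 'update statistics hashing'        -- server-wide default: on / off / partial
go
set statistics resource on                      -- report proc cache and tempdb usage per statement
go
update index statistics big_table with hashing
go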

As far as accuracy, 'high domain hashing might produce less accurate histograms' was a caution in some of the early descriptions (similar to stats sampling) - but I have never seen anything horribly inaccurate.   In fact, I have found larger discrepancies and problems with folks running with the default histogram steps - and certainly sampling is much more prone to inaccurate stats.   One has to recognize, of course, that histogram stats are just that - it isn't an index with pointers to every value, but rather a weighting of the distribution of values within a range.....which is quickly outdated with the first DML and is therefore only an estimate/approximation of the values.

WRT CR 757246 - that is a problem with sp_sysmon and those that continue to rely on it for monitoring.   To *REALLY* monitor the threaded kernel, you need to use monThread.   Values in monEngine as well as sp_sysmon are merely derived values that are sometimes of dubious accuracy due to the methods used to approximate the real values from monThread.   The specific CR cited notes that sp_sysmon reports *INACCURATE* IO busy values, which is unrelated to update stats but rather due to outstanding IOs - of course update stats can be an IO stressor, so it is an easy repro.   It was likely fixed in some release - if you are still on ESD #2, you really, really need to consider upgrading to SP130+.....it is ~4 years old at this point....and tons of issues have been identified/fixed.
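A minimal sketch of what "use monThread" means in practice (column names vary across 15.7 SPs, so this just dumps the whole table for inspection):

-- one row per kernel thread in threaded mode, with its CPU/tick counters
select * from master..monThread
go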

Former Member
0 Kudos

Thanks Jeff for your explanation; we are running SP132 and SP135. That bug is related to @@io_busy, which over-weighted IO busy; this is annoying since many DBAs still use sp_monitor to get the CPU/IO usage.

We actually use ProactiveDBA and the MDA tables for monitoring; I don't think anybody would rely on sp_sysmon after ASE v12.5.

"update (index) stats with hashing" did consume much less tempdb space, however we don't see much improvement on time consumption against a 700M table.

Run Command                                   CPU Time (ms)    Elapsed Time (ms)
Update statistics xxxx                        1691914          3996041
Update index statistics xxxx                  2086711          4433156
Update index statistics xxx with hashing      2018926          3872431
Update statistics xxx with hashing            1461211          3278796

former_member182259
Contributor
0 Kudos

Sorry - this wacky editor on SCN is giving me fits on cut & pasting....will try again...

What you might want to do is look at monProcessWaits before/after the update stats - if you see most of the time is on WaitEvent=29 or 144, then the time is all spent on IO and not a lot can be done about it other than speeding up the IO subsystem (compare WaitTime to Waits to get ms/IO)...
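A hedged sketch of that check (the SPID filter is a placeholder for the session running the update stats):

select w.WaitEventID,
       i.Description,
       w.Waits,
       w.WaitTime,
       w.WaitTime * 1.0 / w.Waits as AvgWaitMs    -- ms per wait, roughly ms per IO for events 29/144
from master..monProcessWaits w,
     master..monWaitEventInfo i
where w.WaitEventID = i.WaitEventID
  and w.SPID = 123            -- replace with the update stats session's spid
  and w.Waits > 0
order by w.WaitTime desc
go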

It is also possible that the particular table only has low-domain values, which doesn't necessarily require a lot of tempdb space.   The real gains I noted were when the tempdb space was reduced - e.g. if you run with set statistics io on, check out the drop in writes.....if it isn't that much, then that is why there is no real gain - and likely low-cardinality attributes....(or else someone enabled stats hashing by default - make sure you test with no_hashing and not just the defaults, as hashing can be inherited as well as set via sp_configure)....
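To rule out the inherited/default-hashing possibility described above, a sketch along these lines (table name is a placeholder):

sp_configure 'update statistics hashing'               -- check whether hashing is already the default
go
update index statistics big_table with no_hashing      -- force the old worktable/sort method
go
update index statistics big_table with hashing         -- force hash-based gathering
go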

former_member89972
Active Contributor
0 Kudos

Siddharth

Maybe you can take a different approach.

At some of the places I worked, I have seen tempdb configured with log and data mixed, just like the master database, so the full tempdb is available for growing data and log.  This is done because, unlike a normal user database, we do not need to worry about tempdb recovery.

This architecture does have the danger of a runaway transaction filling up the whole tempdb - although the risk of a runaway process exists in all set-ups.

But it also has the benefit of space for the DBA maintenance work, where the whole tempdb is available for you to use for worktables and/or the transaction log as needed.
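A minimal sketch of such a DBA temporary database (device name, size, and login are made up; syntax per my understanding of ASE 15.x, so please verify):

-- no separate 'log on' clause, so data and log share the full space
create temporary database DBA_tempdb on tempdb_dev = 8192
go
-- bind the maintenance login to it so its worktables land there
sp_tempdb 'bind', 'lg', 'dbmaint_login', 'DB', 'DBA_tempdb'
go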

Think about it, test it, and if you find it suitable, implement it.

HTH

Avinash

simon_ogden
Participant
0 Kudos

For me, I don't think it will make a huge amount of difference to plan choice, unless you don't have non-sampled densities in place.

When you update statistics with sampling, the pages and values it reads are treated as the 'full' table from the density point of view. If you have 100 unique values and you sample at 10%, it will calculate that you have 10 unique values and give you a total density of 0.1 instead of 0.01. This can make a big difference to costings in some areas (unknowns/subqueries).  You might also end up with slightly less granular histograms.
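The arithmetic behind that, as a rough illustration (table and column names are made up; the actual stored densities can be checked with optdiag):

select count(distinct col1) from big_table
go
-- say it returns 100 distinct values:
--   non-sampled total density    ~ 1/100 = 0.01
--   10% sampling sees ~10 values -> total density ~ 1/10 = 0.10
--   i.e. the optimizer's rows-per-value estimate is ~10x higher under 10% sampling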

For this reason, any densities that have been calculated without sampling will not be replaced with sampled densities by default. If you've never gathered a non-sampled density, then you may end up degrading your density further (I wouldn't know without checking whether a density calculated with 30% sampling is replaced when you sample at 10%).

If you don't have a non-sampled density and don't have the resources to generate one, you should consider adjusting the density by a factor equal to 1/sampling percent - see sp_modifystats.
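A hedged sketch of that adjustment (database/table/column names and the factor are illustrative; check the current total density, e.g. with optdiag, before deciding the direction and value):

-- multiply the stored total density for col1 by 0.1 to compensate for 10% sampling
sp_modifystats 'mydb..big_table', 'col1', 'MODIFY_DENSITY', 'total', 'factor', '0.1'
go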

Former Member
0 Kudos

Hi Simon, we recently started using UIS with sampling; before that we were simply doing US, not even UIS, on this DB. I have requested the app team to see if they can provide any details on the queries hitting that table, as it currently has only one view dependent on the table. -Sid