cancel
Showing results for 
Search instead for 
Did you mean: 

How long does it take to compile a DB2 access plan?

0 Kudos

I have coded a function module to extract data from the totals table, faglflext. I expected it to be fairly slow to run, as I have a where condition based on rassc, which is not the first column in any index. However, I found an index scan of an index containing rassc (as the last column) performed adequately in a test environment which contained around 98% of our production data. The function module is used infrequently, and a response time of several minutes is acceptable. Unfortunately, the first run in production was unacceptably slow, at around 30 minutes; running again an hour or two later took only seconds. Could it possibly take that long to build an access plan?

I noticed in testing the first time I executed the query after changing it, or when running it after a gap of perhaps a day or more, it would take several minutes, but it could be run again soon after in seconds.

The table is not buffered, and in any case, if I understand correctly, buffering would not be used because the query joins several tables. For additional reassurance the difference was not down to buffering, I added the BYPASSING BUFFER option; run time was still much faster after the first time.

I understand the first run may need to build the access plan, the plan is stored in a cache, and may be lost if the query is not reused for a while. I assumed this was the reason for the longer first time run, and implemented the code in production.

In production, the first query, on a quiet system at the weekend, ran for around 30 minutes. When repeated an hour later, it returned the same data in seconds. The table contains over 80 million rows, but only 10 were selected. Running again to select around 700 rows for a different trading partner also ran in seconds. Looking at the access plan for the long run, the query was executed as expected.

Could it really take half an hour to build an access plan for a query joining four tables? I could rewrite it as separate queries, finishing with a FOR ALL ENTRIES instead of the joins, but I am not sure that will help, as in the test environment (with prod-like data) performance is slightly worse.

Are there any other reasons the first run would be slower?

Why might the first run take so much longer in production than in test, even with almost the same data, when subsequent runs are similar in the two environments?

Any suggestions appreciated.

Accepted Solutions (1)

Accepted Solutions (1)

Frank-Martin
Product and Topic Expert
Product and Topic Expert

Hi Antony,

a prepare of a very complex SQL statement may take some seconds but I do not expect it to prepare for minutes. You can check this if you find the SQL statement in "DBACOCKPIT -> performance -> SQL cache" . Change the default layout and include column "Preparation Time".

You described that the query can not use RASSC as the first column in an index. Does the optimizer use the predicate on this column only as a "sargable" predicate or does it use a JUMPSCAN ?. Only predicates that can be used as START/STOP predicates during an index scan limit the number of index pages that need to be read. If the predicate is only used as a "sargable" predicate the query probably needs to do a full index scan to apply the predicate. This may be very inefficient depending on the size of the index.

Based on the symptoms you describe I would conclude that the first query runs slow due to a buffer pool warm up effect. The next execution may benefit from data and index pages that are already in the bufferpool. You can also check this looking at columns "Data Phsical Reads" and "Index Physcal Reads" in "DBACOCKPIT -> performance -> SQL cache" .

Maybe you can provide some more details about the access plan of your query.

Regards

Frank

0 Kudos

Thanks, Frank.

Is the bufferpool you mention something within DB2, independent of ECC? FAGLFLEXT is marked "Buffering not allowed", I tried specifying BYPASSING BUFFER in the ABAP query, and I read that ECC ignores buffers for joins anyway.

I cannot access DBACOCKPIT in production, but looked in non-prod, and see buffer pool references, but no sign of "Reads".

  • Rows Read / Rows Processed 20,322
  • BP Gets/ Rows Processed 65,899
  • BP Gets / Executions 21,966.3

RASSC is used as a sargeable predicate. I wondered whether DB2 would use a jump scan, but there is nothing to indicate that. Would it show as a start/stop predicate if it did?

Our user index starts with client (RCLNT), company code (RBUKRS), fiscal year (RYEAR), and ledger (RLDNR). The reference data tables in the join provide a list of companies which use the ledger, and the current financial year for each. I designed it this way to avoid trawling through prior years. The function is part of an infrequently-used process, but is likely to be repeated several times over a few days each time the process is used.

Before I added RASSC to the index, the access plan showed all the other tables joined on the left, then an IXSCAN and FETCH for each client/company/ledger/year. Performance for the initial run in non-prod environments was just about adequate, and for repeated requests satisfactory. However, in production the repeated requests did not seem to run any faster than the first, which caused problems.

To improve performance without creating yet another index, I added RASSC to the existing index. Strangely, DB2 no longer uses RYEAR as a start/stop predicate, and uses RLDNR only to satisfy the RBUKRS criterion. An RIDSCN merges scans of FAGLFLEXT~1 (RLDNR/RBUKRS/RACCT; two key columns used) and our user index (RCLNT/RBUKRS/.../RASSC; two key columns).

With the index change in production, the query ran even longer the first time (around 30'), but then much more quickly on the next two occasions (a few seconds). This was on a Sunday, with very little activity on the system; I expected run times then to be similar to our user testing environment, where we have a recent copy of production data.

My main concern is not really the long run time (we may have to live with that, or make a case for a new index), but the inconsistency. I put a lot of effort into tuning my queries in development and testing, but performance in production, even during the maintenance window, was so much worse it seems I cannot rely on any test results.

Regards,

Anthony.

The access plan below was from a test run in a QA environment with far less data than production. Unfortunately, I do not currently have access to production or our prod-like UAT environments. I tried to attach it as an image, but the dialog box did not close when the upload completed, and there is no sign of an attachment.

Access Plan Opt Level = 5 ; Parallelism = None

0 SELECT STATEMENT ( Estimated Costs = 1.469E+03 [timerons] ) num_rows tot_cost i/o_cost

1 RETURN 1.4127E-01 1.4691E+03 2.1153E+02

2 NLJOIN 1.4127E-01 1.4691E+03 2.1153E+02

3 [O] TBSCAN GENROW 1.0000E+00 1.8014E-04 0.0000E+00
4 [I] NLJOIN 1.4127E-01 1.4691E+03 2.1153E+02

5 [O] NLJOIN 1.8548E+01 1.2115E+03 1.7427E+02

6 [O] NLJOIN 7.4984E+01 3.9774E+01 5.7543E+00

7 [O] FETCH YGLT_LEDGER_FYR 2.3666E+00 6.9086E+00 1.0000E+00

8 IXSCAN YGLT_LEDGER_FYR~0 #key columns: 2 2.3666E+00 1.8490E-02 0.0000E+00

9 [I] FETCH TVARVC 3.1683E+01 1.3895E+01 2.0088E+00

10 IXSCAN TVARVC~0 #key columns: 0 3.1683E+01 6.9324E+00 1.0000E+00

11 [I] FETCH FAGLFLEXT 2.4736E-01 1.5641E+01 2.2473E+00

12 IXSCAN FAGLFLEXT~Z02 #key columns: 2 2.4736E-01 1.3937E+01 2.0000E+00

13 [I] FETCH TVARVC 7.6162E-03 1.3899E+01 2.0088E+00

14 IXSCAN TVARVC~0 #key columns: 2 3.1683E+01 6.9324E+00 1.0000E+00

---------------------------------------------------------------------------------
Predicates: (Step 12 - IXSCAN FAGLFLEXT)
---------------------------------------------------------------------------------
7 ) Used as Start Condition
-----------------------------------------------------------------------------
relational operator: | equal (=)
subquery input required: | No
filter factor: | 5.0000E-01
-----------------------------------------------------------------------------
predicate text:
(Q6.RCLNT = ?)

9 ) Used as Start Condition
-----------------------------------------------------------------------------
relational operator: | equal (=)
subquery input required: | No
filter factor: | 2.4038E-04
-----------------------------------------------------------------------------
predicate text:
(Q6.RBUKRS = Q5.LOW)

7 ) Used as Stop Condition
-----------------------------------------------------------------------------
relational operator: | equal (=)
subquery input required: | No
filter factor: | 5.0000E-01
-----------------------------------------------------------------------------
predicate text:
(Q6.RCLNT = ?)

9 ) Used as Stop Condition
-----------------------------------------------------------------------------
relational operator: | equal (=)
subquery input required: | No
filter factor: | 2.4038E-04
-----------------------------------------------------------------------------
predicate text:
(Q6.RBUKRS = Q5.LOW)

2 ) Sargable Predicate
-----------------------------------------------------------------------------
relational operator: | equal (=)
subquery input required: | No
filter factor: | 1.0000E+00
-----------------------------------------------------------------------------
predicate text:
(Q6.RLDNR = Q3.RLDNR)

13 ) Sargable Predicate
-----------------------------------------------------------------------------
relational operator: | not applicable
subquery input required: | No
filter factor: | 2.5186E-03
-----------------------------------------------------------------------------
predicate text:
((Q6.RBUKRS = ?) OR (Q6.RASSC = ?))

Answers (2)

Answers (2)

Frank-Martin
Product and Topic Expert
Product and Topic Expert
0 Kudos

Hi Antony,

I see. Without good index the query may not get much faster. If table sizes are larger on prod the buffer pool warm up effect may be just larger.

There is nothing obvious in the query that can be tuned. The duplicate redundant predicate on rldnr does not help but should not hurt much either. The strage clause "(SELECT Q1.$C0 FROM (VALUES 0) AS Q1 WHERE (? = ? SELECTIVITY 1.000000))" may be a result from this.

Regarding the OR part. Yes, Db2 may split this into seperate scans.

Regards

Frank

Frank-Martin
Product and Topic Expert
Product and Topic Expert
0 Kudos

Hi Antony,

yes. No buffering is used on application server side. The data bufferpool is the place where DB2 keeps the data in memory to avoid reading them from disk over and over again. If a query that needs o lof a data or index pages for the first time if may be much slower since the pages need to be read into the bufferpool from disk. If it is executed again most of the data may still be in the bufferpool depending on the bufferpool size.

Your ratio "BP Gets/ Rows Processed = 65,899" and "Rows Read / Rows Processed = 20,322" does not look good. For each row that needs to be looked at Db2 needs to do 65,899 buffer pool look ups. And most of the rows that are processed turn out to be irrelevant sund must of them are not returned by the query. Those are indicators that no good index exist.

Looking ot your IXSCAN output currently only index columns RCLNT and RBUKRS can be used efficiently as START/STOP predicates. The predicate on RLDNR and RASSC can only be used as "sargable predicates". "sargable predicates" do not limit the number of index pages that need to be evaluated. Therefore they are less efficient .No JUMPSCAN is used. A JUMPSCAN may not be attractive if the index columns between RCLNT, RBUKRS and RASSC have a high cardinality.

I wonder why there is an additional "((Q6.RBUKRS = ?) OR (Q6.RASSC = ?))" predicate in the index scan. Can you past the ABAP statement and the SQL statement text?

Regards

Frank

0 Kudos

Hi Frank.

I did not realise "Buffering not allowed" referred only to the application server. Thank you for enlightening me.

The high ratio of records read to records processed is expected. I always knew there was no good index. I am sure a new index would solve all my problems with this query very easily, but would add to the index maintenance overhead for more time-critical applications. Adding RASSC to the existing index immediately after RCLNT, RBUKRS, RYEAR, and RLDNR should work for me, but could impact existing queries using that index to search also by RACCT (the last index column before I added RASSC).

Adding RASSC to the end of an existing index was a low-impact compromise which improved run times significantly in non-prod environments, even on the first run, before the buffer pool was loaded, but in production the first run took even longer. It is the significant inconsistency between non-prod and prod, even when the data is almost the same, which concerns me; I am sure I must be missing something. I will try to get a DBACOCKPIT report from production to compare with my non-prod results, and see if that gives me any clues.

The OR you mentioned is from the ABAP query (the emboldened condition below). I want rows where the company I am interested in is either the main company or the trading partner. In most test environments, DB2 scanned the index for each company on the ledger, applying the OR predicate as shown above. In production, it used separate scans merged in a RIDSCN. Most of the work is the massive scan for RASSC, which generally returns no more than a few hundred records.

ABAP:

SELECT faglflext~ryear etc.

FROM yglt_ledger_fyr " links TVARVC list of company codes with a TVARVC fiscal year
INNER JOIN tvarvc AS yearvar ON yearvar~name = yglt_ledger_fyr~yearvar " fiscal year for the ledger and variant
INNER JOIN tvarvc AS cocdvar ON cocdvar~name = yglt_ledger_fyr~cocdvar " companies for the ledger and variant
INNER JOIN faglflext
ON faglflext~rldnr = yglt_ledger_fyr~rldnr
AND faglflext~ryear = yearvar~low
AND faglflext~rbukrs = cocdvar~low

APPENDING TABLE lt_balances
WHERE yglt_ledger_fyr~rldnr = lv_rldnr
AND faglflext~rldnr = lv_rldnr " Redundant, because enforced by the join and prior WHERE condition, but may help the optimizer.
AND ( faglflext~rbukrs = p_company_code OR faglflext~rassc = lv_rassc ).

-------------------------------------------------------------------------------------
Optimized Statement:
-------------------------------------------------------------------------------------
SELECT
Q6.RYEAR AS "RYEAR", Q6.RMVCT AS "RMVCT", Q6.RTCUR AS "RTCUR", etc.
FROM
(SELECT Q1.$C0 FROM (VALUES 0) AS Q1 WHERE (? = ? SELECTIVITY 1.000000)) AS
Q2, SAPQE1.YGLT_LEDGER_FYR AS Q3, SAPQE1.TVARVC AS Q4,
SAPQE1.TVARVC AS Q5, SAPQE1.FAGLFLEXT AS Q6
WHERE
(Q4.MANDT = ?) AND (Q4.NAME = Q3.YEARVAR) AND (Q5.MANDT = ?) AND (Q5.NAME =
Q3.COCDVAR) AND (Q6.RCLNT = ?) AND (Q6.RYEAR = Q4.LOW) AND (Q6.RBUKRS =
Q5.LOW) AND (Q6.RLDNR = ?) AND (Q3.RLDNR = ?) AND (Q3.MANDT = ?) AND
((Q6.RBUKRS = ?) OR (Q6.RASSC = ?))