Access path calculation of CBO

Former Member

Hello,

I have installed ABAP Trial 7.01 with MaxDB 7.7.06.07 to run some tests with the Cost Based Optimizer (CBO) of MaxDB. SAP uses host variables in the database interface, so the CBO normally calculates the access path without knowledge of the actual values. In standard SAP the access path is therefore the same regardless of whether one row or one million rows are selected. I have seen this behavior in DB2 and Oracle, but in MaxDB the final cost value appears to depend on the number of pages/rows read, i.e. on the values of the host variables.

Am I wrong or is the calculation of the access path in MaxDB depending on the actual values?

Regards Bernd

Accepted Solutions (1)

Former Member

Hi,

You are right, MaxDB takes the real values into account.

Depending on them, the optimizer decides whether using an index or a table scan would be better.

For joins, the calculation even takes into account how many rows of each table will fulfill the qualification on that table. This may lead to different join orders, depending on whether (for example) 1% or 80% of a table fulfills the qualification.
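As a sketch of this effect: assuming a hypothetical ORDERS table with an index on STATUS, where 'OPEN' matches about 1% of the rows and 'CLOSED' about 80%, MaxDB's EXPLAIN should show different strategies for the two values (the exact COSTVALUE output depends entirely on your data and statistics):

```sql
-- ~1% selectivity: reading a few index pages is cheap,
-- so the optimizer should choose the index on STATUS
EXPLAIN SELECT * FROM orders WHERE status = 'OPEN'

-- ~80% selectivity: nearly every data page contains qualifying
-- rows anyway, so a TABLE SCAN becomes the cheaper strategy
EXPLAIN SELECT * FROM orders WHERE status = 'CLOSED'
```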

Elke

Answers (2)

Former Member

Thanks.

Yes, I understand: BI is different from OLTP systems. When a query runs for several seconds or minutes, the additional effort of calculating alternative access paths is not that important.

Let's focus on OLTP systems, where in most cases bind variable peeking is switched off. A typical SQL statement may return 5 rows and run for 50 ms. In DB2 and Oracle 10 I can see the CBO's estimates based on host variables. When an estimate differs significantly from the actual pages read, because a value has a particularly high or low selectivity, I can change the SQL statement with hints. I can make that change precisely because I see a significant difference between what the CBO calculates in advance and the actual number of pages read.

What does MaxDB do? Does it calculate the best access plan every time the statement is executed? When a statement has a short runtime of just a few milliseconds, I would guess this additional effort may significantly increase the response time.

Regards Bernd

lbreddemann
Active Contributor

> What does MaxDB do? Does it calculate the best access plan every time the statement is executed? When a statement has a short runtime of just a few milliseconds, I would guess this additional effort may significantly increase the response time.

Yes, MaxDB re-optimizes the statement whenever it is re-issued with new variable values.

That's also true for OLTP systems.

So why do MaxDB systems nevertheless run especially well on OLTP workloads (which they do)?

The answer is: it's not the number of returned rows that determines the effort needed to figure out an access plan!

In fact, most OLTP-like queries are really simple to optimize, especially when you know about certain data skew.

In most cases the MaxDB optimizer can figure out a good execution plan very quickly.

Therefore, in general, the better execution plans justify the higher effort of permanent re-optimization.

I know this answer may not be really satisfying to you, but in fact there is no single "better way to do it".

It really depends on how you use the database.

regards,

Lars

Former Member

OK, your explanation confirmed what I expected. Do you know why MaxDB has such a different behavior compared to DB2/Oracle?

lbreddemann
Active Contributor

> Do you know why MaxDB has such a different behavior compared to DB2/Oracle?

Well, first of all: this is not that different from Oracle's behavior.

Oracle, too, tries to figure out which values are stored in the host variables before calculating an execution plan. That's called bind variable peeking.

And in BI systems bind variables aren't used at all, so the CBO knows exactly which values are requested.
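Bind variable peeking can be sketched like this in SQL*Plus (the orders table and the :status variable are made up for illustration):

```sql
VARIABLE status VARCHAR2(10)

-- First hard parse: Oracle "peeks" at :status = 'OPEN' and
-- builds a plan suited to that (selective) value
EXEC :status := 'OPEN'
SELECT * FROM orders WHERE status = :status;

-- Later executions reuse the cached plan, even though a
-- different plan might now be better for the unselective value
EXEC :status := 'CLOSED'
SELECT * FROM orders WHERE status = :status;
```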

The other thing is: by knowing the exact values, the optimizer has a much better chance of picking the most efficient execution plan, since it can take skewed data distributions into account (and many data sets do have a certain skew).

So by determining the execution plan anew for each call, the quality of the execution plans actually gets better.

On the other hand, this increases the demands on computing power and memory, so the ideal solution lies somewhere in the middle.

With version 11.2, Oracle's CBO, for example, got a bit cleverer again: it now checks internally whether a new optimization would lead to better results. In this case the database ends up with several reusable execution plans for the same statement.
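The feature described here corresponds to what Oracle documents as adaptive cursor sharing. Whether a cached cursor has been marked as sensitive to its bind values can be checked roughly like this (the filter on SQL_TEXT is just an example):

```sql
-- One row per child cursor: after executions with sufficiently
-- different selectivities, Oracle may add a bind-aware child
-- cursor with its own execution plan
SELECT sql_id, child_number, is_bind_sensitive, is_bind_aware
  FROM v$sql
 WHERE sql_text LIKE 'SELECT * FROM orders%';
```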

As you see, this is a pretty complex topic in which there is hardly one single right way.

regards,

Lars