Leverage Statement Cache for queries with large IN clause - ASE

Nov 21, 2016 at 01:50 PM


Former Member

We have a system that executes a small number of generic queries against ASE in response to on-demand requests. In general, the statements look like:

SELECT ... FROM ... JOIN ... JOIN ... JOIN ... JOIN ...

WHERE P1 = ? and P2 = ? and P3 = ?

AND P4 in ( L1, L2, L3, L4, L5, L6, ...)

The Question:

Is there a way to tune the optimizer so that all of these queries, regardless of the number of parameters in the IN list, use the same entry in the statement cache?

Details:

The value lists are generated on the fly from user input; their size varies between 100 and 1,000 items.

To improve performance, we turned on the statement cache and autoparam. This works wonders for repeated calls: the first call takes about 0.3 seconds (most of that time is parse and compile), while follow-up calls take 0.05 seconds (just execution from a cached plan).

We noticed that Sybase creates a distinct entry in the cache (and spends the time to optimize it) for each distinct number of values in the IN list: one entry for 4 parameters, one for 5 parameters, one for 6, and so on.

Two problems:

* Since the size of the IN clause varies, it takes many 'first time' calls before all the combinations are added to the statement cache.

* The cache size needs to be very large to hold all those query plans, with IN lists going up all the way to 1,000 items (the actual list is bigger, and is split into 1,000-item pieces).
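To make the problem concrete, here is a minimal Python sketch of the kind of client-side query builder involved (the table and function names are illustrative, not from the original system). The statement cache keys on the SQL text, so every distinct IN-list length produces a distinct cache entry:

```python
def build_query(values):
    # Hypothetical client-side builder: one "?" placeholder per IN-list value.
    placeholders = ", ".join("?" for _ in values)
    return ("SELECT ... FROM t "
            "WHERE P1 = ? AND P2 = ? AND P3 = ? "
            "AND P4 IN (" + placeholders + ")")

q4 = build_query([1, 2, 3, 4])
q5 = build_query([1, 2, 3, 4, 5])
# The SQL text differs for every list length, so ASE caches
# (and the optimizer compiles) a separate entry for each.
assert q4 != q5
```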

Thanks a lot!

Yair


2 Answers

Bret Halford
Nov 21, 2016 at 02:10 PM

An untested idea: can you modify the client system to always generate the query with the same number of entries by padding the list with duplicate values up to the maximum expected number?

SELECT ... FROM ... JOIN ... JOIN ... JOIN ... JOIN ...
WHERE P1 = ? and P2 = ? and P3 = ?
AND P4 in ( L1, L2, L3, L4, L5, L6, L6, L6, ..<990 more L6 entries>..., L6, L6)
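A minimal client-side sketch of this padding idea, in Python (the helper name and the target size of 1,000 are illustrative assumptions, not part of the original suggestion):

```python
def pad_in_list(values, target=1000):
    # Repeat the last value until the list has exactly `target` entries,
    # so the generated SQL text is identical for every input size and
    # hits the same statement-cache entry. Duplicates do not change the
    # result of an IN predicate.
    if not values or len(values) > target:
        raise ValueError("list must have between 1 and target items")
    return values + [values[-1]] * (target - len(values))

padded = pad_in_list([10, 20, 30], target=8)
# padded is [10, 20, 30, 30, 30, 30, 30, 30]: same members, fixed length.
```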

Former Member

Thanks for taking the time to look into this question.

This is an interesting idea. I have to test it: there is a note that large queries do not go through the statement cache, so I have to check whether padding would break that limit. We traditionally keep the IN list at 1,000 items, to stay under the 1,024 limit that some documents described as critical.

Former Member

Just tested, and it works! Many thanks! I will note that performance went bad when the IN list reached 1,500 items (I did not test exactly where the cutoff is, but it is OK at 1,000; the limit is probably 1,024).

I have to see how to implement the change, as the generated SQL will not be as intuitive as it was before. A small price to pay for performance :-)

I've also substituted NULL for the repeated entries, in the hope that Sybase will realize those NULLs can never match P4. That might save a few cycles.
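The NULL variant above could be sketched like this in Python (again a hypothetical helper; whether the optimizer actually discards the NULL entries is the poster's hope, not something confirmed in this thread). A comparison against NULL yields UNKNOWN, so the padding entries can never add spurious matches:

```python
def render_in_clause(values, target=1000):
    # Render the values as SQL literals and pad with NULL instead of a
    # duplicated value; "P4 = NULL" is never true, so the NULL entries
    # contribute no rows to the IN-list match.
    if not values or len(values) > target:
        raise ValueError("list must have between 1 and target items")
    entries = [str(v) for v in values] + ["NULL"] * (target - len(values))
    return "P4 IN (" + ", ".join(entries) + ")"

clause = render_in_clause([1, 2], target=4)
# clause is "P4 IN (1, 2, NULL, NULL)"
```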

Mark A Parsons Nov 21, 2016 at 02:26 PM

Are any other parts of the query built dynamically, besides the P4 IN list? If not, or if there is a small set of base queries, you may want to consider abstract query plans (AQPs); you can think of an AQP as the ultimate optimizer hint.

With AQPs you could disable the statement cache (probably at the session level, if other processes benefit from it); this would keep the application from flooding the statement/procedure cache with a large number of 'almost duplicate' query plans.

With the statement cache disabled you will incur optimization overhead on every query, but AQPs can greatly reduce that overhead.

Former Member

Thank you for taking the time to respond.

We are currently using an explicit 'plan' clause in the SELECT statement (based on Query Analyzer output). However, this is a risky proposition: the queries might change in the future, and the explicit plan could then result in poor performance.

We are looking for a solution that provides the benefit of one-time optimization without the downside risk of incorrect plans.
