Skip to Content
0
Former Member
May 17, 2013 at 08:11 AM

bad query plan for join on column with mostly ' ' content

219 Views

Hello,

We are running a 3rd party application for many years that had some performance problems.

Some queries do not use the correct index.

An example :

- table A column ref1 varchar(35) null. non unique index on this column.

select * from A where ref1 = '00000000123'

uses the index.

create table B (ref1 varchar(35) null)

insert B values ( '00000000123')

select * from A, B where A.ref1 = B.ref1

does a table scan on A :

|ROOT:EMIT Operator (VA = 3)

|

| |NESTED LOOP JOIN Operator (VA = 2) (Join Type: Inner Join)

| |

| | |SCAN Operator (VA = 0)

| | | FROM TABLE

| | | B

| | | Table Scan.

| | | Forward Scan.

| | | Positioning at start of table.

| | | Using I/O Size 4 Kbytes for data pages.

| | | With LRU Buffer Replacement Strategy for data pages.

| |

| | |SCAN Operator (VA = 1)

| | | FROM TABLE

| | | A

| | | Table Scan.

| | | Forward Scan.

| | | Positioning at start of table.

| | | Using I/O Size 8 Kbytes for data pages.

| | | With MRU Buffer Replacement Strategy for data pages.

There are no NULL values in the column, but most of the values are " ", as shown by the histogram in the optdiag result :

1 0x00000000 < 0x20

# 1 0.00000000 < " "

2 0xf8ee213f = 0x20

# 2 0.63255262 = " "

3 0x819da33c <= 0x303039313439323132373738

# 3 0.01997256 <= "009149212778"

4 0x3837ad3c <= 0x303130313032313733383231

# 4 0.02114449 <= "010102173821"

5 0x1dafa53c <= 0x303130313233313834363237

# 5 0.02022510 <= "010123184627"

.... etc

What I used to do in 12.5 and before :

- create a table X with column ref1 varchar(35),

- select non-blank values of A.ref1 into X and create an index on the column ref1

- run optdiag on A and X

- replace the statistics of A.ref1 by those of X.ref1 in the optdiag file of A

- reload the optdiag file into the database.

I did this for about 10 columns in the database.

I had hope that the optimizer in 15.7 would better understand these queries, but it doesn't.

Now I wonder if there isn't a more elegant way to make these queries choosing the right index.

Any ideas ?

Thanks,

Luc.