cancel
Showing results for 
Search instead for 
Did you mean: 

How can I make a Query consisting Count Distinct work faster in SAP Sybase IQ?

srujan_gannamaneni
Participant
0 Kudos

Hi,

I am having a large data set in a table where monthly data goes up to 2 Billion Records.

It consists of 2 columns Anum and Bnum out of which I have to select Anum when a column named Direction consists of X else Bnum over which I need to further apply distinct Count.

Is there any possible way to improve the performance of the query?

As of now the CPU utilization is reaching 94% and Query is taking around 30-40 minutes to return the result.

Regards

Srujan

Accepted Solutions (1)

Accepted Solutions (1)

c_baker
Employee
Employee
0 Kudos

Most likely, your Anum and Bnum have rolled over to flatFP due to the size of the table (depending on what the actual total distinct count on the columns Anum and Bnum are). As tokenized nBitFP indexes (default index in a new IQ 16 database) roller over around 1 million values (default setting of FP_NBIT_AUTOSIZE_LIMIT), IQ may be spending a lot of time scanning the columns to perform the selections and matches.

As mentioned, adding an HG index on 'Anum' and 'Bnum' will help, but you may also want to add an HG index on the 'Direction' column as well to improve the selectivity.

That being said - other things can adversly affect IQ performance. The first that comes to mind is the fact that you mention CPU utilization at 94%. Although you do not mention it - this may not be IQ but a lot of reading from disk (System I/O) and not necessarily IQ. How large are your memory cache settings in IQ (-iqlm, -iqtc, -iqmc) in your .cfg file? If these are undersized for the memory available, then IQ might rollover to flatFP earlier than 1 million values, or may need to continually read pages from the IQ mainstore due to the undersizing of -iqmc.

Have you read the IQ Sizing Guide at https://www.sap.com/documents/2017/02/363ddfab-a77c-0010-82c7-eda71af511fa.html ?

Chris

Answers (1)

Answers (1)

former_member199543
Contributor
0 Kudos

This is all about proper indexing. The best bet would be to set HG index in this scenario.

Do you REALLY need to execute this on the whole dataset?

srujan_gannamaneni
Participant
0 Kudos

Hi,

Yes we need to consider whole dataset (well atleast one month data which is around 2 billion).
And more over I have already tried creating HG index on Anum and Bnum, the co-ordinator node stopped responding which forced us to kill the process to bring the DB back to normal.

Regards

Srujan