03-01-2010 5:00 PM
Hi gurus,
My table has an estimated 2.8 million records with a blank SERNR (serial number).
Every SERNR has a SCAN_DATE with it. I just want the latest date from those 2.8 million records.
Which statement is faster?
SELECT sernr scan_date
  INTO TABLE temp
  FROM table_name
  WHERE sernr = ' '.
SORT temp BY sernr ASCENDING scan_date DESCENDING.
DELETE ADJACENT DUPLICATES FROM temp COMPARING sernr.
VS
SELECT MAX( scan_date )
  INTO TABLE temp
  FROM table_name
  WHERE sernr = ' '
  GROUP BY scan_date.
or is there any better way?
Edited by: siew lead choon on Mar 1, 2010 6:00 PM
03-01-2010 6:42 PM
Execute this and check:

DATA: start TYPE i,
      end   TYPE i,
      dif   TYPE i,
      avg   TYPE i.

CLEAR avg.
DO 5 TIMES.
  GET RUN TIME FIELD start.
  SELECT sernr scan_date
    INTO TABLE temp
    FROM table_name
    WHERE sernr = ' '.
  SORT temp BY sernr ASCENDING scan_date DESCENDING.
  DELETE ADJACENT DUPLICATES FROM temp COMPARING sernr.
  GET RUN TIME FIELD end.
  dif = end - start.
  WRITE: /001 'Time for SORT/DELETE ', dif, 'microseconds'.
  avg = avg + dif.
ENDDO.
avg = avg / 5.
WRITE: /001 'Average ', avg, 'microseconds'.
SKIP 1.

CLEAR avg.
DO 5 TIMES.
  GET RUN TIME FIELD start.
  SELECT MAX( scan_date )
    INTO TABLE temp
    FROM table_name
    WHERE sernr = ' '
    GROUP BY scan_date.
  GET RUN TIME FIELD end.
  dif = end - start.
  WRITE: /001 'Time for MAX ', dif, 'microseconds'.
  avg = avg + dif.
ENDDO.
avg = avg / 5.
WRITE: /001 'Average ', avg, 'microseconds'.
SKIP 1.
03-01-2010 7:38 PM
One more option you have is:
SELECT scan_date
  INTO l_scan_date
  FROM table_name
  UP TO 1 ROWS
  WHERE sernr = ' '
  ORDER BY scan_date DESCENDING.
ENDSELECT.
03-02-2010 9:20 AM
The MAX is of course the fastest!
And there is something odd with the SERNR: if it is initial, then there is only one value, so the GROUP BY is useless, and the DELETE ADJACENT DUPLICATES is also useless.
The only alternative is
SELECT scan_date
  FROM table_name
  INTO l_scan_date
  WHERE sernr = ' '.
  IF l_scan_date > max_date.
    max_date = l_scan_date.
  ENDIF.
ENDSELECT.
This can be o.k. if there are not too many hits.
The ORDER BY with UP TO 1 ROWS is also slower, because it must find all hits, sort them, and only then take the largest.
03-02-2010 10:45 AM
hi keshav and sudhi,
I have tried all 3 methods, but since the table has an estimated 2.8 million records with blank SERNR, each method still takes so long that the server times out in most cases.
hi Siegfried,
can you explain more on your alternative way?
Thanks all for the suggestion anyway.:)
03-02-2010 4:06 PM
Did you try this yet?
SELECT MAX( scan_date )
INTO l_scan_date
FROM table_name
WHERE sernr = ' '.
03-02-2010 4:34 PM
yup, i tried that... it's also very slow...
it's very slow only when the sernr is ' ', as there are too many blank sernr records... and i am not supplied enough WHERE fields to select more precise data...
03-02-2010 4:50 PM
Try selecting without the sernr something like :
data: begin of t_temp occurs 0,
        sernr     like table_name-sernr,
        scan_date like table_name-scan_date,
      end of t_temp.

select sernr scan_date into table t_temp from table_name where <keyfields>.
sort t_temp by sernr ascending scan_date descending.
read table t_temp index 1.

After the sort you should have the max date for the blank sernr in the first record.
03-02-2010 8:11 PM
Hi,
>
> its very slow only when the sernr is ' ' as there are too many blank sernr...and i am not supplied enough WHERE fields to select more precise data...
please post the table and index definitions and the execution plan from ST05...
Kind regards,
Hermann
03-02-2010 12:34 PM
> I have tried all the 3 methods but since the table has est 2.8 million records with blank sernr, each methods still takes very long till the server time out on most case.
What is "very long"? With 2.8 million records with sernr = ' ' it will always take long. How many records are in the whole table? Because it is a full table scan and there is no way to avoid it.
And your first option should actually never work with 2.8 million records; it would take ages, but I guess it dumps first.
03-03-2010 11:38 AM
> but since the table has est 2.8 million records with blank sernr
Are the 2.8 million the records with blank sernr, or the total of the table?
In the first case there is almost no chance that it is not a full table scan, i.e. it must be slow!
Then comes the hard question: is this really a problem? I have some doubts that this task is very reasonable and will be repeated very often.
Siegfried
03-03-2010 1:29 PM
there are 2.8mil records with blank sernr...total records are about 65 million...
03-03-2010 1:36 PM
Hi,
please post more details from ST05. Execution plan, indexes available, ... .
Kind regards,
Hermann
03-03-2010 3:45 PM
> there are 2.8mil records with blank sernr...total records are about 65 million...
o.k., that is what I expected.
So what is the runtime and what do you expect?
03-03-2010 4:47 PM
just want to know which statement is the fastest, since testing them all would take hours...
my table is a custom table with 3 of its own indexes... there are many other fields but i just simplified it here...
thanks all for help...:)
03-03-2010 5:20 PM
Hi,
Albert Einstein said:
Everything should be made as simple as possible, but not simpler.
>
> my table is a custom table and has 3 own indexes...there are many other fields but i just simplify it here...
>
In order to help you we need more information.
How long does your query run and what is your expectation?
What execution plan is currently used?
What do your indexes look like?
...
Kind regards,
Hermann
03-04-2010 3:07 PM
Hi,
if you are eager to improve performance, and possible capacity trade-offs don't bother you, you can create an index on the field you want to search by. But for a date it's not the best solution to keep 65 million different dates in an index. Still, it can be better than waiting several hours for the result.
And I thought of one crazy idea. You can probably control in some way the data that gets added to this table (via a BAdI or other enhancement). Create a Z* table with a single date field which keeps the latest date for your purposes. Selecting this value will definitely be faster.
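A minimal sketch of that idea, assuming a hypothetical one-row table ZLAST_SCAN with a single field LAST_DATE (both names are illustrative, not real objects), called from whatever enhancement writes to the big table:

```abap
* Sketch only: ZLAST_SCAN and LAST_DATE are hypothetical names.
* Call this from the BAdI/enhancement that inserts new records.
FORM update_last_scan USING p_scan_date TYPE d.
  DATA: l_last TYPE d,
        ls_row TYPE zlast_scan.

* Read the single stored value (the table is expected to hold one row).
  SELECT SINGLE last_date FROM zlast_scan INTO l_last.

  IF sy-subrc <> 0 OR p_scan_date > l_last.
*   First entry, or a newer date: overwrite the stored maximum.
    ls_row-last_date = p_scan_date.
    MODIFY zlast_scan FROM ls_row.
  ENDIF.
ENDFORM.
```

Reading the latest date then becomes a SELECT SINGLE on a one-row table instead of a scan over 65 million records.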
Good luck!
Edited by: Drabkov Anton on Mar 4, 2010 4:08 PM
Edited by: Drabkov Anton on Mar 4, 2010 4:11 PM
03-04-2010 3:39 PM
Nonsense, you don't have to wait for hours!
Just add another WHERE condition and you can compare on a much smaller basis of data.
But as I told you already, the second one is faster.
Your large task will always be slow, though: 2 is slow and 1 is slower.
Clear!?
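The "smaller basis of data" idea above can be sketched like this, assuming the latest scan is likely to be recent (the 30-day window and the name table_name are illustrative assumptions, not from the original thread):

```abap
* Sketch: restrict the scan to a recent date window so the database
* touches far fewer of the 2.8 million blank-SERNR rows.
DATA: l_scan_date TYPE d,
      l_from      TYPE d.

l_from = sy-datum - 30.            " assumption: latest scan is recent

SELECT MAX( scan_date )
  FROM table_name
  INTO l_scan_date
  WHERE sernr     = ' '
    AND scan_date >= l_from.

IF l_scan_date IS INITIAL.
* Nothing found in the window: fall back to the full (slow) query.
  SELECT MAX( scan_date )
    FROM table_name
    INTO l_scan_date
    WHERE sernr = ' '.
ENDIF.
```

Note that the window only pays off if the database can use an index involving scan_date (or sernr plus scan_date); otherwise it is still a full table scan.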