03-01-2010 5:00 PM
Hi gurus,
My table has an estimated 2.8 million records with a blank SERNR (serial number).
Every SERNR has a SCAN_DATE with it. I just want the latest date from those 2.8 million records.
Which statement is faster?
SELECT sernr scan_date
  INTO TABLE temp
  FROM table_name
  WHERE sernr = ' '.
SORT temp BY sernr ASCENDING scan_date DESCENDING.
DELETE ADJACENT DUPLICATES FROM temp COMPARING sernr.
VS
SELECT MAX( scan_date )
  INTO TABLE temp
  FROM table_name
  WHERE sernr = ' '
  GROUP BY scan_date.
or is there any better way?
Edited by: siew lead choon on Mar 1, 2010 6:00 PM
03-01-2010 6:42 PM
Execute this and check:

DATA: start TYPE i,
      end   TYPE i,
      dif   TYPE i,
      avg   TYPE i.

CLEAR avg.
DO 5 TIMES.
  GET RUN TIME FIELD start.
  SELECT sernr scan_date
    INTO TABLE temp
    FROM table_name
    WHERE sernr = ' '.
  SORT temp BY sernr ASCENDING scan_date DESCENDING.
  DELETE ADJACENT DUPLICATES FROM temp COMPARING sernr.
  GET RUN TIME FIELD end.
  dif = end - start.
  WRITE: /001 'Time for SORT/DELETE ', dif, 'microseconds'.
  avg = avg + dif.
ENDDO.
avg = avg / 5.
WRITE: /001 'Average ', avg, 'microseconds'.
SKIP 1.

CLEAR avg.
DO 5 TIMES.
  GET RUN TIME FIELD start.
  SELECT MAX( scan_date )
    INTO TABLE temp
    FROM table_name
    WHERE sernr = ' '
    GROUP BY scan_date.
  GET RUN TIME FIELD end.
  dif = end - start.
  WRITE: /001 'Time for MAX ', dif, 'microseconds'.
  avg = avg + dif.
ENDDO.
avg = avg / 5.
WRITE: /001 'Average ', avg, 'microseconds'.
SKIP 1.
03-01-2010 7:38 PM
One more option you have is:
SELECT scan_date
  INTO l_scan_date
  FROM table_name
  UP TO 1 ROWS
  WHERE sernr = ' '
  ORDER BY scan_date DESCENDING.
ENDSELECT.
03-02-2010 9:20 AM
The MAX is of course the fastest!
And there is something odd with the SERNR: if it is initial, then there is only one value, so the GROUP BY is useless, and the DELETE ADJACENT DUPLICATES is also useless.
The only alternative is
SELECT scan_date
  FROM table_name
  INTO l_scan_date
  WHERE sernr = ' '.
  IF l_scan_date > max_date.
    max_date = l_scan_date.
  ENDIF.
ENDSELECT.
This can be o.k. if there are not too many hits.
The ORDER BY with UP TO 1 ROWS is also slower, because it must find all hits, sort them, and only then take the largest.
03-02-2010 10:45 AM
hi keshav and sudhi,
I have tried all 3 methods, but since the table has an estimated 2.8 million records with blank SERNR, each method still takes so long that the server times out in most cases.
hi Siegfried,
can you explain more on your alternative way?
Thanks all for the suggestion anyway.:)
03-02-2010 4:06 PM
Did you try this yet?
SELECT MAX( scan_date )
INTO l_scan_date
FROM table_name
WHERE sernr = ' '.
03-02-2010 4:34 PM
yup, i tried that... it's also very slow...
it's very slow only when the sernr is ' ', as there are too many blank sernr records... and i am not supplied enough WHERE fields to select more precise data...
03-02-2010 4:50 PM
Try selecting without the sernr something like :
data: begin of t_temp occurs 0,
        sernr     like table_name-sernr,
        scan_date like table_name-scan_date,
      end of t_temp.

select sernr scan_date into table t_temp from table_name where <keyfields>.
sort t_temp by sernr ascending scan_date descending.
read table t_temp index 1.

After the sort you should have the max date for the blank sernr in the first record.
03-02-2010 8:11 PM
Hi,
>
> its very slow only when the sernr is ' ' as there are too many blank sernr...and i am not supplied enough WHERE fields to select more precise data...
please post the table and index definitions and the execution plan from ST05...
Kind regards,
Hermann
03-02-2010 12:34 PM
> I have tried all the 3 methods but since the table has est 2.8 million records with blank sernr, each methods still takes very long till the server time out on most case.
What is "very long"? With 2.8 million records with sernr = ' ' it will always take long. How many records are in the whole table? Because it is a full table scan and there is no way to avoid it.
And your first option should actually never work with 2.8 million records; it would take ages, but I guess it dumps first.
03-03-2010 11:38 AM
> but since the table has est 2.8 million records with blank sernr
Are the 2.8 million the records with blank sernr, or the total of the table?
In the first case there is almost no chance that it is not a full table scan, i.e. it must be slow!
Then comes the hard question: is this really a problem? I have some doubts that this task is very reasonable and will be repeated very often.
Siegfried
03-03-2010 1:29 PM
there are 2.8mil records with blank sernr...total records are about 65 million...
03-03-2010 1:36 PM
Hi,
please post more details from ST05. Execution plan, indexes available, ... .
Kind regards,
Hermann
03-03-2010 3:45 PM
> there are 2.8mil records with blank sernr...total records are about 65 million...
o.k., that is what I expected.
So what is the runtime and what do you expect?
03-03-2010 4:47 PM
just want to know which statement is the fastest, since testing them all would take hours...
my table is a custom table with 3 of its own indexes... there are many other fields but i just simplified it here...
thanks all for help...:)
03-03-2010 5:20 PM
Hi,
Albert Einstein said:
Everything should be made as simple as possible, but not simpler.
>
> my table is a custom table and has 3 own indexes...there are many other fields but i just simplify it here...
>
In order to help you we need more information.
How long does your query run and what is your expectation?
What execution plan is currently used?
What do your indexes look like?
...
Kind regards,
Hermann
03-04-2010 3:07 PM
Hi,
if you are eager to improve performance, and possible capacity trade-offs don't bother you, you can create an index on the field you want to search by. But for a date it's not the best solution to keep 65 million different dates in an index. Still, it can be better than waiting several hours for the result.
And I thought of one crazy idea. You can probably control in some way the data that gets added to this table (via a BAdI or other enhancement). Create a Z* table with a single date field which keeps the latest date for your purposes. Selecting this value will definitely be faster.
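A minimal sketch of that idea, assuming a hypothetical one-row table ZLAST_SCAN with a single field LAST_DATE (both names are illustrative, not real objects), called from whatever enhancement writes to the big table:

```abap
* Sketch only: ZLAST_SCAN and LAST_DATE are hypothetical names.
* Call this from the BAdI/enhancement that inserts new records.
FORM update_last_scan USING p_scan_date TYPE d.
  DATA: l_last TYPE d,
        ls_row TYPE zlast_scan.

* Read the single stored value (the table is expected to hold one row).
  SELECT SINGLE last_date FROM zlast_scan INTO l_last.

  IF sy-subrc <> 0 OR p_scan_date > l_last.
*   First entry, or a newer date: overwrite the stored maximum.
    ls_row-last_date = p_scan_date.
    MODIFY zlast_scan FROM ls_row.
  ENDIF.
ENDFORM.
```

Reading the latest date then becomes a SELECT SINGLE on a one-row table instead of a scan over 65 million records.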
Good luck!
Edited by: Drabkov Anton on Mar 4, 2010 4:08 PM
Edited by: Drabkov Anton on Mar 4, 2010 4:11 PM
03-04-2010 3:39 PM
Nonsense, you don't have to wait for hours!
Just add another WHERE condition and you can compare on a much smaller basis of data.
But as I told you already, the second one is faster.
Your large task will always be slow, though: 2 is slow and 1 is slower.
Clear!?
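The "smaller basis of data" idea above can be sketched like this, assuming the latest scan is likely to be recent (the 30-day window and the name table_name are illustrative assumptions, not from the original thread):

```abap
* Sketch: restrict the scan to a recent date window so the database
* touches far fewer of the 2.8 million blank-SERNR rows.
DATA: l_scan_date TYPE d,
      l_from      TYPE d.

l_from = sy-datum - 30.            " assumption: latest scan is recent

SELECT MAX( scan_date )
  FROM table_name
  INTO l_scan_date
  WHERE sernr     = ' '
    AND scan_date >= l_from.

IF l_scan_date IS INITIAL.
* Nothing found in the window: fall back to the full (slow) query.
  SELECT MAX( scan_date )
    FROM table_name
    INTO l_scan_date
    WHERE sernr = ' '.
ENDIF.
```

Note that the window only pays off if the database can use an index involving scan_date (or sernr plus scan_date); otherwise it is still a full table scan.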