Application Development Discussions
Join the discussions or start your own on all things application development, including tools and APIs, programming models, and keeping your skills sharp.
cancel
Showing results for 
Search instead for 
Did you mean: 

Select Query with Aggregate functions, Group By clause vs. Simple Select Query

Former Member
0 Kudos

Hi,

I have a requirement in which I need to sum the quantity field(mseg-menge) from Mseg table. In this I am using inner join also. I can write the select query in two ways:


1. Select with SUM and Group By  clause

SELECT mseg~matnr

           mseg~werks mseg~lgort SUM( mseg~menge ) mseg~bwart

         INTO TABLE gt_data1

         FROM mkpf

          INNER JOIN mseg

           ON mkpf~mblnr = mseg~mblnr

           AND mkpf~mjahr  = mseg~mjahr

          WHERE mseg~werks = p_site

          AND mseg~lgort = p_sloc

          AND mkpf~budat >= p_date

      GROUP BY mseg~werks mseg~lgort mseg~bwart

              mseg~matnr makt~maktx.

* P_site ,P_sloc, p_date are the selection screen parameters.

2. Select without SUM and Group By clause. Just by fetching all the data based on conditions and then adding quantity field by looping that internal table.

SELECT mseg~matnr

             mseg~werks mseg~lgort mseg~bwart mseg~menge

           INTO TABLE gt_data3

            FROM mkpf

            INNER JOIN mseg

            ON mkpf~mblnr = mseg~mblnr

            AND mkpf~mjahr  = mseg~mjahr

            WHERE mseg~werks = p_site

           AND mseg~lgort = p_sloc

           AND mkpf~budat >= p_date.

sort gt_data3 by matnr  werks lgort bwart.

loop at gt_data3 into gwa_data3.

move-corresponding gwa_data3 to gwa_data1.

clear gwa_data1-menge.

   at end of bwart.

     sum.

     move: gwa_data3-menge to gwa_data1-menge.

     append gwa_data1 to gt_data1.

clear : gwa_data3,gwa_data1.

     endat.

   endloop.

Please suggest , In terms of Performance which select query is better?

Regards,

Pankaj

1 ACCEPTED SOLUTION

raymond_giuseppi
Active Contributor
0 Kudos

Which of your servers are the most "comfortable", the application server or database server. aggregation functions will load the server database, loading retail positions and cumulation will load communication bandwidth and the application server. You should perform some test for your actual configuration, even if usually the aggregation give better results. (SAT/ST05)

Regards,

Raymond

8 REPLIES 8

raymond_giuseppi
Active Contributor
0 Kudos

Which of your servers are the most "comfortable", the application server or database server. aggregation functions will load the server database, loading retail positions and cumulation will load communication bandwidth and the application server. You should perform some test for your actual configuration, even if usually the aggregation give better results. (SAT/ST05)

Regards,

Raymond

former_member226419
Contributor
0 Kudos

This message was moderated.

naveen_inuganti2
Active Contributor
0 Kudos

You question is really about performance of "SUM with GROUP BY" (1st Query) and "ADDITIONAL LOOP" (2nd Query).

2nd Query works better especially when you have less amount of data... but as data increases you can see LOOP will be taking time more than "SUM and GROUP BY" clause.

Hence it tough to give judgement without looking at data volume and your system resources. However, if you want me suggest which one should be verified first then go ahead and try the first one.

One more thing, if you are dealing with high volume of data then think about using cursors ( you can still use group by clause there)!

Regards,

Naveen. I

0 Kudos

Hi Naveen,

Thanks for your reply.

I have huge amount of data. Moreover situation is very much confusing for me because it is difficult to judge the reason of slow execution .I don't know whether server is slow or this query is slow.

Regards,

Pankaj

0 Kudos

Pankaj - As I mentioned, when you deal with big data memory allocation to your program will also play key-role in response time. Hence, check and release unwanted memory assignment within the program. Also, fetch and process data with help of cursors and check if that helps (You can do it with appropriate package size or fetch into work area depending on data processing).

Regards,

Naveen

0 Kudos

Hi,

In the 1st query, let's assume that the DB server reads 1 billion records from the hard disk and sends the 1 billion records to the application server.

In the 2nd query, the DB server also reads 1 billion records from the hard disk.  However, after the aggregation, the DB server only sends 20 records to the application server.

Think about the network bandwidth, which query is faster?

0 Kudos

Hi Naveen,

I tried by using Cursor also but that didn't make much difference.

Select Queries are taking almost same amount of time.

Regards,

Pankaj

0 Kudos

I believe you have replicated/maintained and testing against expected amount of data. You don't see much difference between SELECT and FETCH statements, because there are no memory issues it seems. ...We can come back to this after checking couple of other things.

What are the expensive statements in trace results.. give us Net% figures. What is the Net% of this SELECT query? What is approx. volume of data/no. of records in your production system in these tables?

Do you have any Secondary Index close to your WHERE condition? Can you post execution procedure of this SELECT (Explain Option) from ST05? If there is no Index then we can also think about applying filter while processing data instead expensive where condition.

Try to check how FOR ALL ENTRIES clause reacting to this DB hit, as you can use oracle hint to controls units, I am thinking it should give you some good results.

Regards,

Naveen