cancel
Showing results for 
Search instead for 
Did you mean: 

BODS: How to return rows that have last transaction date?

0 Kudos

Requirement:

If an order has 2 or more transactions, then return only row with last transaction date.

This can be done by using SQL as below. Table st1, and st2 are same table.

But how to achieve this via BODS Data_Flow design?

SELECT *

FROM source_table st1

WHERE st1.order_type = 'PO' AND transaction_date = ( SELECT MAX(transaction_date) FROM source_table st2 WHERE st2.order_type = st1.order_type AND st2.order_no = st1.order_no);

Accepted Solutions (1)

Accepted Solutions (1)

rajan_burad
Active Participant
0 Kudos

Hi Rock,

I tried the same with your data.

Mapping:

Query_extract_timepart: I've added one new column "Time_part" to extract the time part:

Query_sort: Sorted order_no and Time_part

Query_Rank: used gen_row_num_by_group()

Query_filter: Use filter condition to filter the record with SEQ_ID = 1.

When you'll execute the job, you'll get the records with latest transaction_date. I've one more record for my testing and here is the screenshot and output for the same.

I'm not sure about performance with this approach.

Let us know if this helps to achieve your scenario.

Thanks,

Rajan

0 Kudos

I tried your solution with small set of source data (about 10 rows), and it is working fine!
Will try again with a huge number of data set.

Answers (6)

Answers (6)

kalyani_kolli
Explorer

Hi Rock,

In the first query--> Map the columns from source table like order_type,order_no, MAX(transaction_date) group by order_type,order_no

and then join those output columns with source table to get the latest transactions.

Hope this will help you..

0 Kudos

Hi! Kalyani...
my source_table contains approximately 500000++ rows data.

I afraid that the first query (which uses GROUP BY) may have performance issue.

Do u have any other ways?

kalyani_kolli
Explorer
0 Kudos

Hi Rock,

Try to use Data Transfer in order to get the better performance :).

rajan_burad
Active Participant
0 Kudos

Hi Rock,

Let us know if you achieved your solution 🙂

Thanks,

Rajan 🙂

0 Kudos

Hey...Dirk Venken/Rajan Burad/kalyani kolli...Thanks for ur valuable advice.
Let me try out.

BTW, here list out a few articles that related to solution that u all suggested, as reference.

Functioning of gen_row_num_by_group function in BODS [ Rajan Burad ]
https://blogs.sap.com/2017/08/16/functioning-of-gen_row_num_by_group-function-in-bods/

How to extract most recent data from a history table [ Dirk Venken ] https://wiki.scn.sap.com/wiki/display/EIM/How+to+extract+most+recent+data+from+a+history+table

Let the database do the hard work! Better performance in SAP Data Services thanks to full SQL-Pushdown[ Dirk Venken] - example 5. Another application of the Data_Transfer transform

https://blogs.sap.com/2014/02/13/let-the-database-do-the-hard-work-better-performance-in-sap-data-se...

Implementing Rank Functionality in SAP Data Services [Rich Hauser]

https://decisionfirst.files.wordpress.com/2013/03/rank-functionality.pdf

former_member187605
Active Contributor
0 Kudos

Check example 5. Another application of the Data_Transfer transform in

https://blogs.sap.com/2014/02/13/let-the-database-do-the-hard-work-better-performance-in-sap-data-se... for a very efficient implementation of kalyani's solution.

0 Kudos

Hey...Rajan. Thanks for your advice.

In this scenario, it may have 2 or more different order_no.

Your approach (with single global variable, $gv_transact_date in script) cannot resolve my problem.

Source_Table:

order_no, order_type, status, transaction_date

001, PO, NEW, 01/11/2018 10:00 AM

001, PO, PACKED, 01/11/2018 11:30 AM

002, PO, NEW, 01/11/2018 11:00 AM

002, PO, SHIPPED, 01/11/2018 11:40 AM

rajan_burad
Active Participant
0 Kudos

Hi Rock,

Take a script and create a global variable, $gv_transact_date (varchar type).

$gv_transact_date = sql('datastore','select max(transaction_date) from st1');

print($gv_transact_date);

DATAFLOW: In where clause give the condition as: transaction_date >= $gv_transact_date.

Execute your job, you'll get the records with last/latest transaction date.

Let us know if this helps!

Thanks,

Rajan 🙂