Best Practice for keeping track of former event

MarkusRichter
Explorer

Hi,

in my scenario I am dealing with database events on HANA: when a database record is inserted, some logic must be applied to that row and the result written to another database table within HANA. The tricky part is that the applied logic depends on what has already been inserted (e.g. insert the new record only if this has not been done before). This is more or less a replication task. But at the last SAP TechEd, ESP was named as one of the replication tools for HANA!!

Right now, I do not know how to solve this problem efficiently. Using streams will not work, because every time ESP is restarted all data gets replicated again, which duplicates the data in the target HANA tables ;-( Using windows (in combination with a file store) does not solve the issue either, as the HANA tables contain too much data. Therefore I created an additional HANA table to maintain a status flag for the records that have been replicated, and wrote a JDBC library that looks up replicated rows and sets a flag on the ones that have been replicated. My Java code works fine in a standalone program, but crashes the ESP server...

Before looking deeper into the crashing problem, I thought I would post a discussion and see whether there are other (better) approaches to keeping track of previously handled events. Any examples or suggestions?

Thanks and Best Regards.

Markus

Accepted Solutions (0)

Answers (1)

JWootton
Advisor

From what you describe - unless I've misunderstood - I don't think ESP is the right tool for you.  ESP is not a replication tool for SAP HANA and is not recommended for such.  It is one of the "data provisioning" tools available for HANA, and is recommended for use in capturing data from real-time event streams - but is not recommended for use in replicating data between databases.  With that said, ESP can be used with SAP Sybase Replication Server:  RepServer performs realtime change capture on a source database and turns those DB events into a real-time event stream that can be fed into ESP (ESP has support for RepServer) where the events can be processed or analyzed - e.g. you can monitor trends, patterns, etc.   But even here - if you just want to replicate the data from the source into a destination DB,  then RepServer doesn't need to go through ESP.

But if you're looking to replicate data from one HANA table to another, then I don't think ESP is your best option.

Now if you are working with an incoming stream of events (and yes, they could be database "events" - i.e. DB transactions), then it is possible to stream them through ESP and have ESP keep track of what's been stored in the target table.  i.e. if there really is a need for ESP here, you can design  your ESP data model to avoid duplicating data after a re-start.

And again, maybe I've misunderstood the use case...

MarkusRichter
Explorer

Hi Jeff,

thanks for the explanation. Now I understand the main purpose of ESP. Nevertheless, my use case is not really a 100% replication use case: I have database table A. Any time a record gets inserted into A, I need to do some complex data manipulation before storing the data in the target table B. This data manipulation includes generating additional values and splitting the record into several records. For this kind of data manipulation ESP does quite a nice job. I am not aware that such extensive data manipulation is available in Rep Server or SAP SLT. Having said this, I was actually able to write such an ESP script and run it successfully. My problem is that any time I restart the ESP server, all records of table A get processed again, which leads to duplicates in the target table B. How do I overcome the problem of re-processing former events of a database table? Due to the large number of events, I am not able to use the file store. I think this is a generic problem and not limited to my use case.

I appreciate any help.

Thanks and Regards.

Markus

vijaigeethach
Product and Topic Expert

Hi Markus,
It is difficult to give good suggestions without understanding the complete project. Here are a couple of ideas:
1. You can use the getData() function rather than the DB input adapter and pass a parameter to the query so that it reads only rows that have not been processed yet. For example, say there is a key column in your table with values 1,2,3,4,5... If you already processed 1,2,3 in your previous run, then in the next run you can make the query continue from the next id by changing the parameter value (parameters can be passed through ccr files, so we need not touch the project itself). Please see
   http://infocenter.sybase.com/help/topic/com.sybase.infocenter.dc01621.0512/doc/html/vge1312573080739...

2. You can read the content of table B once, at the start of the project, and store it in a window. Then read the content of table A, check the key values to see whether the data has already been processed, and proceed with your logic. The adapters can be started in sequence and stopped by using the esp_client command and by placing the adapters in ADAPTER start groups.

Thanks,
Geetha.
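
Geetha's second idea (load the keys of table B once, then skip rows of table A that are already there) can be sketched outside ESP roughly as follows. This is only an illustration in Python, with an in-memory SQLite database standing in for HANA and a plain set standing in for the ESP window. The table names `table_a` and `table_b`, the columns, and the `payload.upper()` placeholder for the "complex logic" are all assumptions, not anything from the actual project.

```python
import sqlite3


def load_processed_keys(conn):
    """Read the keys already present in table B once, at start-up."""
    return {row[0] for row in conn.execute("SELECT source_id FROM table_b")}


def replicate_new_rows(conn, processed):
    """Apply the logic only to rows of table A whose key is not yet known."""
    rows = conn.execute("SELECT id, payload FROM table_a ORDER BY id").fetchall()
    written = 0
    for row_id, payload in rows:
        if row_id in processed:
            continue  # already handled in an earlier run
        # Placeholder for the real transformation logic (assumption).
        conn.execute(
            "INSERT INTO table_b (source_id, payload) VALUES (?, ?)",
            (row_id, payload.upper()),
        )
        processed.add(row_id)
        written += 1
    conn.commit()
    return written


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.execute("CREATE TABLE table_b (source_id INTEGER, payload TEXT)")
    conn.executemany(
        "INSERT INTO table_a VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")]
    )
    first = replicate_new_rows(conn, load_processed_keys(conn))
    # Simulated restart: the in-memory set is lost and rebuilt from table B.
    second = replicate_new_rows(conn, load_processed_keys(conn))
    total = conn.execute("SELECT COUNT(*) FROM table_b").fetchone()[0]
    return first, second, total
```

The second run writes nothing because every key of table A is found in the rebuilt set. The cost is that the whole key set of table B has to fit in memory, which matters when the tables are large.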

JWootton
Advisor

Markus,

Thanks for clarifying - and in that case, yes, it does make sense to use ESP.  Longer term we plan to make this easier for you - we've seen others wanting to do the same.  For now, your best bet is one of the approaches Geetha has suggested: if there's a way to poll "Table A" to just get new records rather than everything, you can do that.  You can even set this up in a flex to fire getData() at regular intervals, and you can use a variable to keep track of the last record received, so that the next time getData() fires you get only the later ones. Otherwise, per Geetha's second option, set up some type of filter in ESP to filter out the rows already received.

Jeff
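
The polling approach Jeff describes (fire a query at intervals and remember the last record received) might look like this in outline. It is a Python sketch, not CCL: `poll_once` plays the role of one getData() firing, SQLite stands in for HANA, and it assumes table A has a monotonically increasing key column `id`. All names are hypothetical.

```python
import sqlite3


def fetch_after(conn, last_id):
    """Parameterised query: return only rows with a key above last_id."""
    return conn.execute(
        "SELECT id, payload FROM table_a WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()


def poll_once(conn, last_id, handle):
    """One polling cycle; returns the updated high-water mark."""
    for row_id, payload in fetch_after(conn, last_id):
        handle(row_id, payload)
        last_id = row_id  # remember the last record received
    return last_id


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT INTO table_a VALUES (?, ?)", [(1, "a"), (2, "b")])

    seen = []

    def handle(row_id, payload):
        seen.append(row_id)

    last_id = 0
    last_id = poll_once(conn, last_id, handle)  # picks up ids 1 and 2
    conn.execute("INSERT INTO table_a VALUES (3, 'c')")
    last_id = poll_once(conn, last_id, handle)  # picks up id 3 only
    last_id = poll_once(conn, last_id, handle)  # nothing new
    return seen, last_id
```

Here the variable lives only in memory, so it is lost when the process restarts.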

MarkusRichter
Explorer

Hi All,

thanks for the hints. It is good to know that I am not the only one who is asking for such an option 😉

The "window" option will not work for me as the amount of data that needs to be kept is way too big over time (daily growing).

The getData() option could be a promising solution. But I need to find out the best way of storing the last used value for narrowing the select statement, because the issue occurs after a restart of the ESP server. So I just need to store (and overwrite) the value of the select parameter after each execution of the getData() function.

Thanks and Best Regards.

Markus

former_member217348
Participant

Hi Markus,

I would suggest that you use a log store to keep the last used value.

Thanks,

Alice