What are the limitations of Hana Smart Data Integration /EIM

Former Member

Hi Experts,

We have an on-premise HANA 1 (SPS12) multi-node landscape in a scale-out configuration with different firewall zones. We are evaluating SAP HANA EIM (Smart Data Integration and Smart Data Quality) as a DP/ETL tool. In this regard I have the following questions:

1) Is there any limit on the data transfer using SDI for a single replication/pull request? For example, is there a maximum number of rows/columns that can be transferred from any source to HANA? Is there a limitation on data size (in MB/GB)?

2) Is there a limitation on the number of parallel replications/batch jobs/pull or push requests in the context of the DP Agent or DP Server?

3) I have read in the installation guide that the DP Server and DP Agent do not support load balancing in scale-out scenarios. Is there a workaround for this? If not, what would be an ideal setup to handle high loads (let's say a total of 5 billion entries in a day from 1000 executions)?

4) Can the OData adapter be installed and configured on a DP Agent instead of the DP Server?

5) Where can I find information such as the advantages/disadvantages of the various SAP-provided adapters, so I can make an informed choice?

Hi werner.daehn:

I have learnt a lot about SDI from your blogs. Can you please guide me to the answers to the above questions?

Thank you so much in advance.

Regards

Manas

Accepted Solutions (0)

Answers (1)


jeffrey_kresse
Employee
  1. No hard data-size limit per se, but HANA must have sufficient free memory to hold the incoming data. One possible exception is Oracle: we depend on Oracle LogMiner, which due to a performance limitation cannot process more than about 1 TB/day.
  2. Again, no hard limit, assuming the systems are sized appropriately ('appropriate' varies with data volume and parallelization).
  3. Based on the documentation I'm reading, the DP Agent and DP Server don't support load balancing, but failover is supported for real-time requests. That said, 5 bn records over the course of a day is not exceptionally high; assuming the loads are partitioned or spread across multiple agents, there should be no issue with that volume.
  4. The OData adapter is only deployed on the DP Server.
  5. Pages 18-20 of the PAM quick guide show the capabilities of each SDI adapter, including real-time support, source/target support, etc.
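To put the 5 billion rows/day figure from point 3 in perspective, here is a rough back-of-the-envelope calculation (plain Python; the agent count is an illustrative assumption, not an SDI sizing recommendation):

```python
# Rough throughput estimate for the 5 bn rows/day scenario in point 3.
# All figures are illustrative assumptions, not SDI benchmarks.

ROWS_PER_DAY = 5_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Sustained rate needed if the load were spread evenly over the day.
rows_per_second = ROWS_PER_DAY / SECONDS_PER_DAY
print(f"Sustained rate: {rows_per_second:,.0f} rows/s")

# Hypothetical partitioning across multiple DP Agents.
NUM_AGENTS = 10
per_agent = rows_per_second / NUM_AGENTS
print(f"Per agent with {NUM_AGENTS} agents: {per_agent:,.0f} rows/s")
```

At roughly 58k rows/s overall, a single-digit number of partitioned agents already brings the per-agent rate into unremarkable territory, which is the point Jeff makes above.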

Hope I have answered your questions adequately.

Jeff K

werner_daehn
Active Contributor

Nicely said.

Maybe a bit of a background:

You are connected to your local HANA. This is one session. The first time you do something with a remote source, a connection is created to that remote source and linked to your local session. A session can execute only a single command at a time (how would you run two select statements in parallel within a single session?), and the same applies when executing a query against a virtual table. So how do you run something in parallel? By using multiple sessions: scheduling, scripts, whatever. Then local HANA commands, with or without remote tables, can be executed in parallel and independently of each other.
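Werner's point, that parallelism comes from multiple sessions rather than from within one session, can be sketched generically. The snippet below uses Python threads with SQLite as a stand-in for HANA (purely to keep the example self-contained and runnable); each thread opens its own connection, i.e. its own session, so the two queries genuinely run side by side:

```python
import sqlite3
import threading

def run_query(db_path, query, results, idx):
    """Each thread opens its OWN connection (= its own session).

    A single connection executes one statement at a time, which is
    exactly the session limitation described above; parallelism comes
    from having several sessions, not from one.
    """
    conn = sqlite3.connect(db_path)  # one session per thread
    try:
        results[idx] = conn.execute(query).fetchone()[0]
    finally:
        conn.close()

# Illustrative setup: a throwaway database with one table.
conn = sqlite3.connect("demo.db")
conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
conn.execute("DELETE FROM t")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
conn.commit()
conn.close()

# Two "sessions" executing queries in parallel, the way a scheduler
# or script would fire off independent statements.
results = [None, None]
threads = [
    threading.Thread(target=run_query,
                     args=("demo.db", "SELECT COUNT(*) FROM t", results, 0)),
    threading.Thread(target=run_query,
                     args=("demo.db", "SELECT SUM(x) FROM t", results, 1)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # [100, 4950]
```

Against HANA the same pattern would use a HANA client connection per thread instead of SQLite, but the structural point is identical: one session, one statement; many sessions, parallel statements.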

(The above is also the reason why you can reset a remote connection by logging back into your HANA database.)

For that reason, and because everything is streaming, the answers to 1-3 become obvious.
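The "everything is streaming" remark is the key to why there is no inherent size limit: rows are pushed through in chunks, so the total volume never has to sit anywhere at once (only the target HANA tables must fit in memory, per point 1 above). A generic Python sketch of this chunked pattern (illustrative only, not the actual SDI/DPServer internals):

```python
def stream_rows(total_rows, chunk_size=1000):
    """Yield rows in fixed-size chunks.

    Memory use is bounded by chunk_size regardless of total_rows.
    This is an illustrative sketch of streaming in general, not the
    actual SDI implementation.
    """
    chunk = []
    for i in range(total_rows):
        chunk.append(i)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # flush the final partial chunk
        yield chunk

# Process one million "rows" while never holding more than 1000 at once.
max_in_flight = 0
processed = 0
for chunk in stream_rows(1_000_000, chunk_size=1000):
    max_in_flight = max(max_in_flight, len(chunk))
    processed += len(chunk)

print(processed, max_in_flight)  # 1000000 1000
```

Because the pipeline only ever buffers one chunk, the same code could process 5 billion rows with the same memory footprint; throughput, not capacity, becomes the limit.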

Regarding 4): adapters can theoretically be implemented in two languages and at two places: Java or C++, on the DP Agent or the DP Server. This has not been made a user choice (yet), so the rule is DP Agent = Java, DP Server = C++. The OData adapter was written in C++ and for the DP Server.

Ad 5): I guess your question was not so much about the different SDI adapters but about the different ETL technologies? In that case I would rather get the requirements from you first, and then we can go through the relevant pros/cons of each tool.

Generally speaking, SDA/SDI are used if your local HANA database is the center of everything. It is then the smallest install, and it supports batch ETL, real-time ETL, and federation.

Data Services is more for hub-and-spoke setups, where you have many sources, many targets, and a central ETL tool in the middle. But it is limited to batch (although Data Services does support a bit of real-time as well).

And then there are tons of other tools (SLT, RepServer, ...), all for their specific use cases.

-Werner

JaySchwendemann
Active Contributor

Nice explanations, Werner, thanks for that. About those "tons of other tools": unfortunately this holds true even when looking solely at SAP as a vendor. Do you by chance know of some high-level overview of use cases, i.e. guidance on when to use which tool, across the whole ETL and (transactional) integration domain? Or is this simply too broad a field? What about ETL only, then?

Cheers

Jens

werner_daehn
Active Contributor

jens.schwendemann

Oh, that's an easy one: SAP has clear guidelines on when to use which product. Unfortunately, these vary depending on whom you talk to.

  • If you ask the EIM team, SAP Data Intelligence can do everything.
  • If you ask the Cloud team, CPI can do everything.
  • If you ask the S/4HANA team, OData-based integration can do everything.

That is at least what I have seen at my customers.

The official answer from Christian Klein is this: https://www.sap.com/documents/2020/02/520ea921-847d-0010-87a3-c30de2ffd8ff.html

As you asked for ETL tools: Data Services is still the best tool on the entire market, in my personal opinion ("best" in terms of productivity and TCO). For real-time SAP data integration I promote a solution using Apache Kafka.

But as we are getting into a political minefield here, feel free to contact me by email.

There is an entire series on SAP Data Integration products published; maybe that is of value? E.g., this one as a starting point.