Former Member

Transition from Informatica ETL to SAP HANA system

Hi,

My company is looking to replace its existing DW solution with SAP HANA to resolve two problems: the performance of the overnight batch load (as of today it takes around 8 hours to load close to a billion de-normalized records into Oracle tables), and the trouble business users have retrieving data into Excel pivots and BO reports for decision making. Can someone please help me understand the following? We currently use Informatica as the ETL tool (we have some complex ETL logic in our Informatica jobs) and BO as the reporting tool. Oracle is the database that we use for reporting. So my questions are:

1) We currently use Informatica as the ETL tool to extract data from our Oracle-based source system and load it into an Oracle-based data warehouse. If we replace the Oracle DB with SAP HANA as our DW database, can we still use Informatica as the ETL tool, or should we use SAP Data Services instead?

2) What is the difference between SAP Data Services and SAP BO Data Integrator?

3) If I have to make the transition from Informatica to SAP HANA, what areas should I focus on, and what learning curve is involved? I do not want to do only the ETL part; my end goal is to be an SAP HANA solution architect. My core skill is ETL, so I want to start from there and progress towards the rest of the SAP HANA architecture.

I just wanted to know how SAP HANA connects all these pieces together (will it make Informatica obsolete in the near future?) and how I can make the transition to the ever-growing community of SAP HANA developers and contribute to it with my real-time learning experiences.

Thanks

Shanil.

2 Answers

  • Former Member
    Oct 04, 2012 at 01:04 PM

    Hi Shanil,

    Responses to your questions:

    1. Only the SAP Data Services ETL tool has been certified for HANA as of now. However, since HANA is based on open SQL standards, other ETL tools could technically work with it, but SAP would not support them in case of issues. Have a look at the last question in this link: http://www.experiencesaphana.com/docs/DOC-2117

    2. They are the same; it was just renamed to Data Services with BI 4.0. Data Services also includes the data quality part in addition to the ETL aspect, hence the name change, I believe.

    3. Since you already come from a data warehousing background, I suggest you learn the HANA data modelling aspects (the various views, engines, and calculation methods), followed by the reporting aspects of HANA. A good starting point is http://help.sap.com/hana_appliance/ which has a number of documents.

    Thanks,

    Anooj

  • Oct 04, 2012 at 02:45 PM

    Hi Shanil,

    I have a couple of questions in a different direction.

    1) I am sure you have analyzed the pain points of your data loading performance. Are they related to the E (Extraction), the T (Transformation), or the L (Loading)? HANA will help you only with the L part. Since your source system will continue to be Oracle, there will not be much impact on the E and T.

    2) You can look into the option of moving the T (Transformation) logic from the ETL tool into the HANA database (if you choose to implement it) and gain high performance benefits.

    3) If your L (Loading) time is due to a high number of de-normalized tables, don't expect a radically simplified, normalized data model in HANA. You may end up with some degree of de-normalization in HANA, as join operations on normalized tables are still expensive from a performance perspective.
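
    As a rough illustration of points 1 and 2, the sketch below times each phase separately and then runs the same transformation as a single SQL statement inside the database. It uses Python's built-in sqlite3 purely as a stand-in, not HANA or Oracle, and the table and column names (src_orders, dw_tool, dw_pushdown, qty, unit_price) are hypothetical. It also assumes the raw data is already staged in the target database, which is what makes the push-down possible. The timings it prints say nothing about real HANA performance; the point is only the shape of the measurement.

```python
import sqlite3
import time


def timed(label, fn, timings):
    """Run fn(), record its wall-clock duration under label, return its result."""
    start = time.perf_counter()
    result = fn()
    timings[label] = time.perf_counter() - start
    return result


def build_database(n_rows):
    """Create an in-memory stand-in database with a hypothetical staged source table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src_orders (id INTEGER, qty INTEGER, unit_price REAL)")
    conn.executemany(
        "INSERT INTO src_orders VALUES (?, ?, ?)",
        ((i, i % 7 + 1, 2.5) for i in range(n_rows)),
    )
    conn.execute("CREATE TABLE dw_tool (id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE dw_pushdown (id INTEGER, amount REAL)")
    return conn


def run(n_rows=50_000):
    conn = build_database(n_rows)
    timings = {}

    # E: pull the rows out of the database into the "ETL tool" (this process).
    rows = timed(
        "extract",
        lambda: conn.execute("SELECT id, qty, unit_price FROM src_orders").fetchall(),
        timings,
    )
    # T: apply the business rule row by row outside the database.
    transformed = timed(
        "transform_in_tool",
        lambda: [(row_id, qty * price) for row_id, qty, price in rows],
        timings,
    )
    # L: write the transformed rows back into the warehouse table.
    timed(
        "load",
        lambda: conn.executemany("INSERT INTO dw_tool VALUES (?, ?)", transformed),
        timings,
    )
    # Push-down: the same rule expressed as SQL, executed inside the database.
    timed(
        "pushdown",
        lambda: conn.execute(
            "INSERT INTO dw_pushdown SELECT id, qty * unit_price FROM src_orders"
        ),
        timings,
    )

    tool_total = conn.execute("SELECT SUM(amount) FROM dw_tool").fetchone()[0]
    push_total = conn.execute("SELECT SUM(amount) FROM dw_pushdown").fetchone()[0]
    conn.close()
    return timings, tool_total, push_total


if __name__ == "__main__":
    timings, tool_total, push_total = run()
    for label, seconds in timings.items():
        print(f"{label:>18}: {seconds:.4f} s")
    print("results match:", tool_total == push_total)
```

    If most of the 8 hours turns out to sit in the extract or transform phases, changing only the target database is unlikely to shorten the batch window much.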

    Rest of the information has been aptly provided by Anooj.

    Regards,

    Ravi

    • Hi Emilyn,

      The join operation is performed in different engines and uses temporary memory and resources for:

      - Joining the data sets

      - Merging the data

      - Sorting the data

      These operations take memory and resources, hence JOINs are a bit expensive. It is not an "issue" as such, but if the join has to be performed over large data sets, it is likely to have an impact on performance.
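
      To make the sorting and merging cost concrete, here is a minimal sketch of a generic textbook sort-merge join in Python. It is not how HANA's engines implement joins internally (that is not described in this thread); the function name and the sample rows are hypothetical. It only shows where the temporary memory goes: two sorted copies of the inputs plus an output buffer.

```python
def sort_merge_join(left, right, key_left=0, key_right=0):
    """Inner-join two lists of tuples on the given key positions."""
    # Sorting: each input is copied and sorted, which costs time and temporary memory.
    left_sorted = sorted(left, key=lambda row: row[key_left])
    right_sorted = sorted(right, key=lambda row: row[key_right])

    joined = []  # output buffer, grows with the number of matching pairs
    i = j = 0
    # Merging: walk both sorted inputs in step, advancing the side with the smaller key.
    while i < len(left_sorted) and j < len(right_sorted):
        left_key = left_sorted[i][key_left]
        right_key = right_sorted[j][key_right]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of right-hand rows that share this key.
            j_end = j
            while j_end < len(right_sorted) and right_sorted[j_end][key_right] == left_key:
                j_end += 1
            # Joining: pair every left row with this key with every right row in the run.
            while i < len(left_sorted) and left_sorted[i][key_left] == left_key:
                for right_row in right_sorted[j:j_end]:
                    joined.append(left_sorted[i] + right_row)
                i += 1
            j = j_end
    return joined


if __name__ == "__main__":
    orders = [(2, "order-b"), (1, "order-a"), (2, "order-c")]
    customers = [(2, "customer-x"), (3, "customer-y"), (1, "customer-z")]
    for row in sort_merge_join(orders, customers):
        print(row)
```

      With a billion rows on one side, the sorted copies and the output buffer are what make such a join noticeable, which is why some de-normalization can still pay off.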

      Regards

      Ravi