on 04-16-2021 7:59 AM
Hi Experts,
We have a requirement to transfer data from SAP HANA to Azure Data Lake Storage (ADLS Gen 2) in Parquet or ORC format.
Can someone share their experience if they have worked on a similar scenario?
Hi Pooja,
You can't generate Parquet or ORC file formats directly in Data Services, so you need to use another target that supports and manages those underlying formats. In this case a Hive template table would be a good choice.
If you can't use Hive, then typically Data Services will produce a simple text or CSV file that can be copied or uploaded to Azure. Once the file lands in its designated container, a data integration service such as Azure Data Factory can handle the conversion to Parquet.
Another option is to set up a dedicated self-hosted integration runtime close to Data Services; it provides the compute for the Parquet conversion and for the overall data integration.
As you may have realised by now, creating Parquet files directly is not straightforward and is resource-intensive. Nevertheless, if you are determined to produce Parquet within the BODS environment, I would suggest using Python with the pyarrow module to convert the output file from Data Services.
Thanks
Nawfal