Solved: IMPORT preprocessing for CSV?

former_member314792 · ‎01-29-2016

I'm seeing some unusual results from the IMPORT CSV command on HANA, with the flat file on the Linux system. Server is SPS09, with HANA Studio (on Windows) at patch 97. A typical import script looks like this:

import from csv file '/home/user/import/xxxx.txt'

into source.xxxx

with field delimited by '\|'

skip first 1 row

error log '/home/user/import/xxxx.errlog'

;

In particular, the error log messages listing the offending record are shown in a quote-enclosed, comma-delimited format even though the flat file is pipe ('|') delimited and has no framing quotes; this seems to occur only when a null input field is imported into a DECIMAL table column. If the column datatype is changed to NVARCHAR, then any error log messages are shown using the expected format (pipe-delimited, unquoted). Is there a conversion being done to each record before it is imported?

There appears also to be some differences in how the "wizard" import tool (in HANA Studio) and the sql import tool (on the HANA server) function:

* The wizard imports fields containing only 0x00 (ASCII 'null') as NULL, while the HANA import changes these to a single blank (0x20).

* The wizard does not seem to have a way to change the 'optionally enclosed by' character (does it default to """"?)

* The wizard does not allow an arbitrary character to be specified as a field delimiter; the CSV data we are importing use a pipe ("|") character.

* The HANA import does not have a way to set the input character set; any conversion to the character set of the database must be done beforehand.

* The documentation is for the HANA import is unclear as to how special and unprintable characters are encoded; for specials it appears that an escaped format (e.g. '\t' for TAB) or a doubled-character (e.g. entering two ""'s when one is desired).

* When a [ROLLBACK] error appears in the log, the field contents shown appear to be taken from the first occurrence and are not updated on subsequent occurences, even though the input records differ. This makes it difficult to correlate the source record in error.

I'm interested if there is some 'internal' documentation available that would help answer some of these questions or if the community has researched them.

Much thanks in Advance,

Donn

lbreddemann · ‎02-07-2016

Hi Donn

both data import functions are completely separate and independent features.

The server side import was meant to provide a fast way to import server side data. Flexibility and control was not much of concern with this feature.

The client side import is focussed more on the use case of easy, simple data import e.g. for a PoC kind of situation. Once-off-imports I would call them.

For "proper" data integration - and your questions indicate that you have the requirements for flexibility, ad-hoc transformation and data parsing - SAP HANA provides the SAP HANA Enterprise Information Management option that allows to do such things.

If that's still not sufficient, there always SAP Data Services.

I am quite aware of the data import features of other DBMS and all I can say to this end is that the overall development for this capability was done with the two options I mentioned.

Hope that helps a bit though.

Cheers,

Lars

lucas_oliveira · ‎02-02-2016

Hi Donn,

It makes more sense to address some of your questions on an SAP support incident since it seems you're not having a consistent behavior on the import tool.

BRs,

Lucas de Oliveira

IMPORT preprocessing for CSV?

Accepted Solutions (1)

Accepted Solutions (1)

Answers (1)

Answers (1)

Re: DRC e-Cockpit Add a New Button

Re: BTP Build Apps- Error while integration of BTP...

SAP MAXDB : Content Server Administration - Reposi...

Re: "Failed to update setup engine executables. Pr...

Re: Timer showing while sending the mails from SAP