Former Member
Oct 13, 2015 at 08:56 AM

Load PDF pages as Strings into a HANA Table

Hi SCN community,

I created a table in HANA that should be filled with the content of a locally stored PDF file.

The table has two columns, PAGE NUMBER and CONTENT.

Each row should represent one page of the PDF.
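
To make the intended layout concrete, here is a minimal Python sketch of the loading step only. It assumes the page texts have already been extracted into a list of strings, and it works with any DB-API style cursor whose driver accepts `?` placeholders. The table name `PDF_PAGES` and the column names `PAGE_NUMBER` and `CONTENT` are placeholders for illustration, not the actual names in my system.

```python
from typing import Iterable, List, Tuple

# Hypothetical table and column names; the real table only needs a
# page-number column and a content column.
INSERT_SQL = "INSERT INTO PDF_PAGES (PAGE_NUMBER, CONTENT) VALUES (?, ?)"


def pages_to_rows(page_texts: Iterable[str]) -> List[Tuple[int, str]]:
    """Pair each page's text with its 1-based page number."""
    return [(number, text) for number, text in enumerate(page_texts, start=1)]


def load_pages(cursor, page_texts: Iterable[str]) -> int:
    """Insert one row per page through a DB-API cursor and return the row count.

    Assumes the driver uses qmark ('?') parameter placeholders.
    """
    rows = pages_to_rows(page_texts)
    cursor.executemany(INSERT_SQL, rows)
    return len(rows)
```

Calling `load_pages(cursor, ["text of page 1", "text of page 2"])` would write two rows, one per page. Committing the transaction is left to the caller.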

I tried doing the extraction with an external Python script, but the content could not be extracted from many of the PDF files.

The reason might be the variety of PDF types and versions.

My questions are:

How can I extract this information from a PDF file and load it into my HANA table without using external tools such as Python?

Is there already a file upload / extract tool integrated within HANA?

Thanks in advance!

Sebastian