How to read data from a PDF file

Former Member

Hi folks,

I have the following requirement.

I receive PDF files from a customer that contain both tabular and non-tabular data.

I need to read this tabular and non-tabular data from the PDF files and store it in the SAP database.

Please suggest a way to read the data from a PDF file.

Regards

PG

3 REPLIES

former_member193284

Hi PG,

I read a PDF document for a PO. Here is the code, which might help you. Please refer to the GET_PDF_DATA subroutine. Hope this helps!

REPORT ZNF_MM_PO_CREATE.

PERFORM PDF_UPLOAD. " Upload PDF filename and path

PERFORM GET_PDF_DATA. " Get data in XML format

PERFORM F_TRANS_DATA. " Transfer Data from XML to Internal Table

FORM PDF_UPLOAD .

MOVE P_LOCT TO W_UPLOAD_FILE.

CL_GUI_FRONTEND_SERVICES=>GUI_UPLOAD(

EXPORTING

FILENAME = W_UPLOAD_FILE

FILETYPE = 'BIN' "Binary

IMPORTING

FILELENGTH = W_FILE_LEN

CHANGING

DATA_TAB = W_DATA_TAB

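EXCEPTIONS

OTHERS = 1 ). " catch-all for the classic exceptions

IF SY-SUBRC <> 0.

MESSAGE 'PDF upload failed' TYPE 'E'.

ENDIF.

ENDFORM. " PDF_UPLOAD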

********************************************************************

*Form GET_PDF_DATA

********************************************************************

*Text: Extract PDF data in XML format

********************************************************************

FORM GET_PDF_DATA .

* Local data declaration

DATA: LO_FP TYPE REF TO IF_FP VALUE IS INITIAL,

L_FPEX TYPE REF TO CX_FP_RUNTIME,

XML_DATA TYPE XSTRING,

LT_XML_DATA TYPE STANDARD TABLE OF XSTRING.

LO_FP = CL_FP=>GET_REFERENCE( ).

LOOP AT W_DATA_TAB ASSIGNING <FS_DATA_TAB>.

CONCATENATE W_PDF <FS_DATA_TAB>-DATA INTO W_PDF IN BYTE MODE.

ENDLOOP.

* Error handling

CATCH CX_FP_RUNTIME_INTERNAL INTO L_FPEX.

PERFORM ERROR USING L_FPEX 'INTERNAL ERROR'.

CATCH CX_FP_RUNTIME_SYSTEM INTO L_FPEX.

PERFORM ERROR USING L_FPEX 'SYSTEM ERROR'.

CATCH CX_FP_RUNTIME_USAGE INTO L_FPEX.

PERFORM ERROR USING L_FPEX 'USAGE ERROR'.

ENDTRY.

APPEND XML_DATA TO LT_XML_DATA.

* Get data in XML format

LO_PDFOBJ->GET_DATA(

IMPORTING

FORMDATA = XML_DATA ).

* Convert XSTRING to STRING format

CALL FUNCTION 'ECATT_CONV_XSTRING_TO_STRING'

EXPORTING

IM_XSTRING = XML_DATA

IMPORTING

EX_STRING = LV_XML_DATA_STRING.

ENDFORM. " GET_PDF_DATA

********************************************************************

*Form F_TRANS_DATA

********************************************************************

*Text: Extract data from PDF fields into internal table

********************************************************************

FORM F_TRANS_DATA .

* TYPE-POOLS: IXML.

* Local data declaration

DATA: L_IXML TYPE REF TO IF_IXML,

STREAMFACTORY TYPE REF TO IF_IXML_STREAM_FACTORY,

ISTREAM TYPE REF TO IF_IXML_ISTREAM,

DOCUMENT TYPE REF TO IF_IXML_DOCUMENT,

PARSER TYPE REF TO IF_IXML_PARSER,

NODE TYPE REF TO IF_IXML_NODE.

* DATA: L_FPEX TYPE REF TO CX_FP_RUNTIME.

DATA : V_LIFNR TYPE STRING,

V_PUR_GROUP TYPE STRING,

V_MATNR TYPE STRING,

V_REST TYPE STRING.

L_IXML = CL_IXML=>CREATE( ).

STREAMFACTORY = L_IXML->CREATE_STREAM_FACTORY( ).

ISTREAM =

STREAMFACTORY->CREATE_ISTREAM_STRING( LV_XML_DATA_STRING ).

DOCUMENT = L_IXML->CREATE_DOCUMENT( ).

PARSER = L_IXML->CREATE_PARSER( STREAM_FACTORY = STREAMFACTORY

ISTREAM = ISTREAM

DOCUMENT = DOCUMENT ).

PARSER->PARSE( ).

NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'DATE' ).

IF NOT NODE IS INITIAL.

Z_DATE = NODE->GET_VALUE( ).

ELSE.

MESSAGE E208(00) WITH TEXT-008 .

ENDIF.

NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'MATNR1' ).

WA_ITEM-MATNR = NODE->GET_VALUE( ).

NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'MENGE1' ).

WA_ITEM-MENGE = NODE->GET_VALUE( ).

WA_ITEM-MENGE = WA_ITEM-MENGE * 1000.

NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'PRICE1' ).

WA_ITEM-PRICE = NODE->GET_VALUE( ).

WA_ITEM-PRICE = WA_ITEM-PRICE / 1000000.

NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'LEWED1' ).

V_DATE = NODE->GET_VALUE( ).

CONCATENATE V_DATE+0(4) V_DATE+5(2) V_DATE+8(2) INTO WA_ITEM-EEIND.

NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'WERKS1' ).

WA_ITEM-WERKS = NODE->GET_VALUE( ).

APPEND WA_ITEM TO IT_ITEM.

ENDFORM. " f_trans_data

Edited by: Sumit Naik on Oct 13, 2010 10:52 PM


Hi Sumit,

Thanks for your reply.

Could you please post the code with corrected data declarations? The code in its current form has many errors, and I am not able to correct them all.

It would be a great help if you could post the running code (with changed names, since posting it as-is may conflict with your customer's security policy).

Thanks for your help.

Regards

PG


Hi Sumit,

This code has various inconsistencies, so it does not work.

For example, in FORM GET_PDF_DATA

you concatenate all data lines into W_PDF,

but you don't use this variable anywhere,

and so on...


Could you resolve these inconsistencies?


Thanks,
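
For anyone who lands on this thread later, here is a minimal sketch of how the gaps pointed out above could be closed. It is not the original author's code. The report name, the global declarations, the line type TY_RAW_LINE and the connection name 'ADS' are all assumptions, because the post does not show them. The main point is that W_PDF has to be handed to a PDF object (declared here as LO_PDFOBJ) inside a TRY block before GET_DATA can return anything.

```abap
REPORT ZPDF_READ_SKETCH. " hypothetical report name

" Assumed declarations; the post above does not show the originals.
TYPES TY_RAW_LINE TYPE X LENGTH 1024.

DATA: W_UPLOAD_FILE      TYPE STRING,
      W_FILE_LEN         TYPE I,
      W_DATA_TAB         TYPE STANDARD TABLE OF TY_RAW_LINE,
      W_PDF              TYPE XSTRING,
      LV_XML_DATA_STRING TYPE STRING.

PARAMETERS P_LOCT TYPE RLGRAP-FILENAME. " path of the PDF on the frontend

START-OF-SELECTION.
  PERFORM PDF_UPLOAD.
  PERFORM GET_PDF_DATA.
  " LV_XML_DATA_STRING now holds the form data as XML for F_TRANS_DATA.

FORM PDF_UPLOAD.
  W_UPLOAD_FILE = P_LOCT.
  CL_GUI_FRONTEND_SERVICES=>GUI_UPLOAD(
    EXPORTING
      FILENAME   = W_UPLOAD_FILE
      FILETYPE   = 'BIN'
    IMPORTING
      FILELENGTH = W_FILE_LEN
    CHANGING
      DATA_TAB   = W_DATA_TAB
    EXCEPTIONS
      OTHERS     = 1 ).
  IF SY-SUBRC <> 0.
    MESSAGE 'PDF upload failed' TYPE 'E'.
  ENDIF.
ENDFORM.

FORM GET_PDF_DATA.
  DATA: LO_FP     TYPE REF TO IF_FP,
        LO_PDFOBJ TYPE REF TO IF_FP_PDF_OBJECT,
        L_FPEX    TYPE REF TO CX_FP_RUNTIME,
        L_MSG     TYPE STRING,
        XML_DATA  TYPE XSTRING.
  FIELD-SYMBOLS <FS_DATA_TAB> TYPE TY_RAW_LINE.

  " Rebuild the PDF as one XSTRING and cut off the padding of the last line.
  CLEAR W_PDF.
  LOOP AT W_DATA_TAB ASSIGNING <FS_DATA_TAB>.
    CONCATENATE W_PDF <FS_DATA_TAB> INTO W_PDF IN BYTE MODE.
  ENDLOOP.
  IF W_FILE_LEN > 0 AND XSTRLEN( W_PDF ) > W_FILE_LEN.
    W_PDF = W_PDF(W_FILE_LEN).
  ENDIF.

  LO_FP = CL_FP=>GET_REFERENCE( ).

  TRY.
      " 'ADS' is the usual RFC destination name; adjust it to your system.
      LO_PDFOBJ = LO_FP->CREATE_PDF_OBJECT( CONNECTION = 'ADS' ).
      LO_PDFOBJ->SET_DOCUMENT( PDFDATA = W_PDF ).  " this is where W_PDF is used
      LO_PDFOBJ->SET_EXTRACTDATA( ).
      LO_PDFOBJ->EXECUTE( ).
      LO_PDFOBJ->GET_DATA( IMPORTING FORMDATA = XML_DATA ).
    CATCH CX_FP_RUNTIME_INTERNAL CX_FP_RUNTIME_SYSTEM CX_FP_RUNTIME_USAGE INTO L_FPEX.
      L_MSG = L_FPEX->GET_TEXT( ).
      MESSAGE L_MSG TYPE 'E'.
  ENDTRY.

  " Convert the XML from XSTRING to STRING for the iXML parser.
  CALL FUNCTION 'ECATT_CONV_XSTRING_TO_STRING'
    EXPORTING
      IM_XSTRING = XML_DATA
    IMPORTING
      EX_STRING  = LV_XML_DATA_STRING.
ENDFORM.
```

The sketch stops at LV_XML_DATA_STRING. F_TRANS_DATA from the post above can then parse it once WA_ITEM, IT_ITEM, Z_DATE and V_DATE are declared to match your own form fields. Note that, as far as I know, this route only returns field data for interactive Adobe forms that carry their data as XML. A plain PDF with printed tables would most likely come back without usable form data.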