Former Member

How to read data from a PDF file

Hi folks,

I have the following requirement.

I receive PDF files from a customer that contain both tabular and non-tabular data.

I need to read this tabular and non-tabular data from the PDF files and store it in the SAP database.

Please suggest a way to read the data from a PDF file.

Regards

PG


1 Answer

  • Posted on Oct 13, 2010 at 08:52 PM

    Hi PG,

    I read a PDF document for a PO. Here's the code, which might help you. Please refer to the GET_PDF_DATA subroutine. Hope this helps you!

    REPORT ZNF_MM_PO_CREATE.

    PERFORM PDF_UPLOAD. " Upload PDF filename and path

    PERFORM GET_PDF_DATA. " Get data in XML format

    PERFORM F_TRANS_DATA. " Transfer Data from XML to Internal Table

    FORM PDF_UPLOAD .

    MOVE P_LOCT TO W_UPLOAD_FILE.

    CL_GUI_FRONTEND_SERVICES=>GUI_UPLOAD(

    EXPORTING

    FILENAME = W_UPLOAD_FILE

    FILETYPE = 'BIN' "Binary

    IMPORTING

    FILELENGTH = W_FILE_LEN

    CHANGING

    DATA_TAB = W_DATA_TAB

    EXCEPTIONS

    OTHERS = 1 ).

    IF SY-SUBRC <> 0.

    MESSAGE 'PDF upload failed' TYPE 'E'.

    ENDIF.

    ENDFORM. " PDF_UPLOAD

    ********************************************************************

    *Form GET_PDF_DATA

    ********************************************************************

    *Text: Extract PDF data in XML format

    ********************************************************************

    FORM GET_PDF_DATA .

    * Local data declaration

    DATA: LO_FP TYPE REF TO IF_FP VALUE IS INITIAL,

    L_FPEX TYPE REF TO CX_FP_RUNTIME,

    XML_DATA TYPE XSTRING,

    LT_XML_DATA TYPE STANDARD TABLE OF XSTRING.

    LO_FP = CL_FP=>GET_REFERENCE( ).

    LOOP AT W_DATA_TAB ASSIGNING <FS_DATA_TAB>.

    CONCATENATE W_PDF <FS_DATA_TAB>-DATA INTO W_PDF IN BYTE MODE.

    ENDLOOP.

    * Error handling

    CATCH CX_FP_RUNTIME_INTERNAL INTO L_FPEX.

    PERFORM ERROR USING L_FPEX 'INTERNAL ERROR'.

    CATCH CX_FP_RUNTIME_SYSTEM INTO L_FPEX.

    PERFORM ERROR USING L_FPEX 'SYSTEM ERROR'.

    CATCH CX_FP_RUNTIME_USAGE INTO L_FPEX.

    PERFORM ERROR USING L_FPEX 'USAGE ERROR'.

    ENDTRY.

    APPEND XML_DATA TO LT_XML_DATA.

    * Get data in XML format

    LO_PDFOBJ->GET_DATA(

    IMPORTING

    FORMDATA = XML_DATA ).

    * Convert XSTRING to STRING format

    CALL FUNCTION 'ECATT_CONV_XSTRING_TO_STRING'

    EXPORTING

    IM_XSTRING = XML_DATA

    IMPORTING

    EX_STRING = LV_XML_DATA_STRING.

    ENDFORM. " GET_PDF_DATA
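
    As posted, the form never creates LO_PDFOBJ, never passes W_PDF to it, and has CATCH branches without an opening TRY. The following is a minimal sketch of how the extraction might be completed; it is not the author's original code. Assumptions: the line type TY_BIN_LINE, the form parameters, and the RFC destination name 'ADS' are illustrative and may differ on your system. The approach also assumes the PDF is an interactive form, since SET_EXTRACTDATA returns the form data rather than arbitrary page content.

    ```abap
    " Hypothetical line type for the binary table filled by GUI_UPLOAD.
    TYPES: BEGIN OF ty_bin_line,
             data(1024) TYPE x,
           END OF ty_bin_line,
           ty_bin_tab TYPE STANDARD TABLE OF ty_bin_line WITH DEFAULT KEY.

    FORM get_pdf_data USING    it_data TYPE ty_bin_tab
                               iv_len  TYPE i
                      CHANGING cv_xml  TYPE string.

      DATA: lo_fp     TYPE REF TO if_fp,
            lo_pdfobj TYPE REF TO if_fp_pdf_object,
            lx_fp     TYPE REF TO cx_fp_runtime,
            lv_pdf    TYPE xstring,
            lv_xml    TYPE xstring,
            lv_total  TYPE i,
            lv_msg    TYPE string.

      FIELD-SYMBOLS <ls_line> TYPE ty_bin_line.

      " Join the uploaded binary lines into one XSTRING.
      LOOP AT it_data ASSIGNING <ls_line>.
        CONCATENATE lv_pdf <ls_line>-data INTO lv_pdf IN BYTE MODE.
      ENDLOOP.

      " The last line is padded, so cut the result back to the real file length.
      lv_total = xstrlen( lv_pdf ).
      IF iv_len > 0 AND iv_len < lv_total.
        lv_pdf = lv_pdf(iv_len).
      ENDIF.

      lo_fp = cl_fp=>get_reference( ).

      TRY.
          " 'ADS' is an assumed destination name; adjust it to your system.
          lo_pdfobj = lo_fp->create_pdf_object( connection = 'ADS' ).
          lo_pdfobj->set_document( pdfdata = lv_pdf ).
          lo_pdfobj->set_extractdata( ).
          lo_pdfobj->execute( ).
          lo_pdfobj->get_data( IMPORTING formdata = lv_xml ).
        CATCH cx_fp_runtime_internal
              cx_fp_runtime_system
              cx_fp_runtime_usage INTO lx_fp.
          lv_msg = lx_fp->get_text( ).
          MESSAGE lv_msg TYPE 'E'.
      ENDTRY.

      " Convert the extracted XML from XSTRING to STRING, as in the post.
      CALL FUNCTION 'ECATT_CONV_XSTRING_TO_STRING'
        EXPORTING
          im_xstring = lv_xml
        IMPORTING
          ex_string  = cv_xml.

    ENDFORM.
    ```

    With W_DATA_TAB declared as TY_BIN_TAB, this could be called as PERFORM GET_PDF_DATA USING W_DATA_TAB W_FILE_LEN CHANGING LV_XML_DATA_STRING.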

    ********************************************************************

    *Form F_TRANS_DATA

    ********************************************************************

    *Text: Extract data from PDF fields into internal table

    ********************************************************************

    FORM F_TRANS_DATA .

    * TYPE-POOLS: IXML.

    * Local data declaration

    DATA: L_IXML TYPE REF TO IF_IXML,

    STREAMFACTORY TYPE REF TO IF_IXML_STREAM_FACTORY,

    ISTREAM TYPE REF TO IF_IXML_ISTREAM,

    DOCUMENT TYPE REF TO IF_IXML_DOCUMENT,

    PARSER TYPE REF TO IF_IXML_PARSER,

    NODE TYPE REF TO IF_IXML_NODE.

    * DATA: L_FPEX TYPE REF TO CX_FP_RUNTIME.

    DATA : V_LIFNR TYPE STRING,

    V_PUR_GROUP TYPE STRING,

    V_MATNR TYPE STRING,

    V_REST TYPE STRING.

    L_IXML = CL_IXML=>CREATE( ).

    STREAMFACTORY = L_IXML->CREATE_STREAM_FACTORY( ).

    ISTREAM =

    STREAMFACTORY->CREATE_ISTREAM_STRING( LV_XML_DATA_STRING ).

    DOCUMENT = L_IXML->CREATE_DOCUMENT( ).

    PARSER = L_IXML->CREATE_PARSER( STREAM_FACTORY = STREAMFACTORY

    ISTREAM = ISTREAM

    DOCUMENT = DOCUMENT ).

    PARSER->PARSE( ).

    NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'DATE' ).

    IF NOT NODE IS INITIAL.

    Z_DATE = NODE->GET_VALUE( ).

    ELSE.

    MESSAGE E208(00) WITH TEXT-008 .

    ENDIF.

    NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'MATNR1' ).

    WA_ITEM-MATNR = NODE->GET_VALUE( ).

    NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'MENGE1' ).

    WA_ITEM-MENGE = NODE->GET_VALUE( ).

    WA_ITEM-MENGE = WA_ITEM-MENGE * 1000.

    NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'PRICE1' ).

    WA_ITEM-PRICE = NODE->GET_VALUE( ).

    WA_ITEM-PRICE = WA_ITEM-PRICE / 1000000.

    NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'LEWED1' ).

    V_DATE = NODE->GET_VALUE( ).

    CONCATENATE V_DATE+0(4) V_DATE+5(2) V_DATE+8(2) INTO WA_ITEM-EEIND.

    NODE = DOCUMENT->FIND_FROM_NAME( NAME = 'WERKS1' ).

    WA_ITEM-WERKS = NODE->GET_VALUE( ).

    APPEND WA_ITEM TO IT_ITEM.

    ENDFORM. " f_trans_data
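
    Apart from DATE, each field above is read without checking that FIND_FROM_NAME actually found a node, so a missing field would lead to a dump on the initial reference. One possible guard is a small helper like the sketch below. The form name GET_FIELD_VALUE and its parameters are hypothetical and not part of the original program.

    ```abap
    " Hypothetical helper: returns the value of the first element called
    " IV_NAME, or an empty string when no such element exists.
    " Possible call (LV_VALUE is a STRING variable):
    "   PERFORM get_field_value USING document `MATNR1` CHANGING lv_value.
    FORM get_field_value USING    io_document TYPE REF TO if_ixml_document
                                  iv_name     TYPE string
                         CHANGING cv_value    TYPE string.

      DATA lo_node TYPE REF TO if_ixml_node.

      CLEAR cv_value.

      lo_node = io_document->find_from_name( name = iv_name ).

      " Only dereference the node when the search found one.
      IF lo_node IS BOUND.
        cv_value = lo_node->get_value( ).
      ENDIF.

    ENDFORM.
    ```

    The field name is passed as a backquoted string literal so that it matches the STRING typing of IV_NAME. The caller can then decide per field whether an empty result is an error, as the DATE branch does with its message.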

    Edited by: Sumit Naik on Oct 13, 2010 10:52 PM


    • Former Member

      Hi Sumit,

      This code has several inconsistencies, so it does not work.

      For example, in FORM GET_PDF_DATA you concatenate all the data lines into W_PDF, but you never use this variable anywhere, and so on.

      Could you resolve these inconsistencies?


      Thanks,




