10-10-2013 6:03 PM
Hi.
We are a manufacturing company and receive numerous Purchase Orders and Invoices via email as PDFs.
We spend a lot of time typing this information in. Is there a software product or service out there that will let me convert these PDFs into usable XML so I can import it? I've tried a bunch of "PDF to XML" converters, but all they do is export the internal tags, which is no use to me. I need an XML document that is clean and usable.
Many thanks in advance to anyone who can help me solve this one.
Jason.
10-17-2013 10:39 PM
Hello Jason,
I assume you need some code to extract data from a PDF file rendered by ADS (Adobe Document Services), and that your ABAP system has access to SAP NetWeaver ADS?
In that case you might use some of the code below.
IV_INPUT is an importing parameter of type XSTRING and is assumed to hold the rendered PDF file.
EV_OUTPUT is an exporting/returning parameter of type XSTRING and holds the data part of the PDF file in XML format.
You can parse EV_OUTPUT with the iXML library.
method extract.
  data:
    lv_dest   type rfcdest,
    lr_fp     type ref to if_fp,
    lr_pdfobj type ref to if_fp_pdf_object,
    lr_fpex   type ref to cx_fp_runtime.

  if iv_input is initial.
    write / 'error spot 1'.
*   Some error handling
    return.
  endif.

* Get the RFC destination of the ADS connection
  lv_dest = cl_fp=>get_ads_connection( ).
* Get FP reference
  lr_fp = cl_fp=>get_reference( ).

  try.
*     Create PDF object
      lr_pdfobj = lr_fp->create_pdf_object( connection = lv_dest ).
*     Set document
      lr_pdfobj->set_document( pdfdata = iv_input ).
*     Tell PDF object to extract data
      lr_pdfobj->set_task_extractdata( ).
*     Execute, call ADS
      lr_pdfobj->execute( ).
*     Get data
      lr_pdfobj->get_data( importing formdata = ev_output ).
    catch cx_fp_runtime_internal
          cx_fp_runtime_system
          cx_fp_runtime_usage into lr_fpex.
      write: / 'error spot 2', lr_fpex->errmsg.
*     Some error handling
    catch cx_root.
      write / 'error spot 3'.
*     Some error handling
  endtry.
endmethod.
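To give an idea of the iXML step afterwards, here is a minimal sketch that parses the extracted XML and lists each text value together with the name of its enclosing element. The method name LIST_FIELDS and its importing parameter IV_XML (type XSTRING, filled with EV_OUTPUT from the method above) are made up for illustration, and the element names you get back depend entirely on your form, so treat this as a starting point rather than finished code.

```abap
* Hypothetical helper: IV_XML is an importing parameter of type XSTRING
method list_fields.
  data:
    lr_ixml     type ref to if_ixml,
    lr_factory  type ref to if_ixml_stream_factory,
    lr_stream   type ref to if_ixml_istream,
    lr_document type ref to if_ixml_document,
    lr_parser   type ref to if_ixml_parser,
    lr_iterator type ref to if_ixml_node_iterator,
    lr_node     type ref to if_ixml_node,
    lr_parent   type ref to if_ixml_node,
    lv_name     type string,
    lv_value    type string.

* Set up the iXML parser on top of the XSTRING
  lr_ixml     = cl_ixml=>create( ).
  lr_factory  = lr_ixml->create_stream_factory( ).
  lr_stream   = lr_factory->create_istream_xstring( string = iv_xml ).
  lr_document = lr_ixml->create_document( ).
  lr_parser   = lr_ixml->create_parser( stream_factory = lr_factory
                                        istream        = lr_stream
                                        document       = lr_document ).

* parse( ) returns 0 when the document could be read
  if lr_parser->parse( ) <> 0.
    write / 'XML could not be parsed'.
    return.
  endif.

* Walk all nodes; for each text node print the enclosing element name
  lr_iterator = lr_document->create_iterator( ).
  lr_node = lr_iterator->get_next( ).
  while lr_node is not initial.
    if lr_node->get_type( ) = if_ixml_node=>co_node_text.
      lr_parent = lr_node->get_parent( ).
      lv_name   = lr_parent->get_name( ).
      lv_value  = lr_node->get_value( ).
      write: / lv_name, '=', lv_value.
    endif.
    lr_node = lr_iterator->get_next( ).
  endwhile.
endmethod.
```

Whitespace between elements may also show up as text nodes, so you will probably want to skip empty values and map the remaining element names to the fields of your own import structure.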