PDF as Source in Text Data Processing

Hi All,

I have read that Text Data Processing supports PDF, Word, and other binary formats, but I don't understand how to use a PDF or Word file as a source.

Can anyone explain how to do this, or suggest a workaround?

Thanks,

srinivas

1 Answer

  • Best Answer
    Jul 20, 2017 at 07:03 AM

    In the file format definition, set Type to Unstructured Text. The output schema will look like this:

    Then use a TDP Entity_Extraction transform to process the contents.
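
For readers who want a mental model of what the Unstructured Text type does, here is a rough Python analogy. It is not Data Services code and does not reproduce the product's actual output schema. It only sketches the idea that each source file becomes one record, with the whole file content carried in a single field for a later extraction step. The names `DocumentRow`, `read_unstructured`, `file_name`, and `data` are hypothetical and chosen for illustration.

```python
from pathlib import Path
from typing import Iterator, NamedTuple, Sequence


class DocumentRow(NamedTuple):
    """One record per source file (hypothetical shape, not the Data Services schema)."""
    file_name: str
    data: bytes


def read_unstructured(
    folder: Path, patterns: Sequence[str] = ("*.pdf", "*.docx")
) -> Iterator[DocumentRow]:
    """Yield one row per matching file, keeping the raw content in a single field."""
    for pattern in patterns:
        # Sort so the row order is stable within each pattern.
        for path in sorted(folder.glob(pattern)):
            yield DocumentRow(file_name=path.name, data=path.read_bytes())
```

A downstream step would then take each row's `data` field and run text extraction on it, which is the role the Entity_Extraction transform plays in the data flow described above.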
