Former Member

TREX does not search PDF files

Hi,

we have another problem with TREX 6.0.

Our file repository is working fine and search works for .txt files, but it does not work for PDF files. Our PDF files are indexed correctly, but a search returns no results for this kind of file.

What can we do?

Kind regards

Thomas


4 Answers

  • Best Answer
    Former Member
    Posted on Jan 10, 2005 at 07:32 PM

    Your situation may already be solved. However, a few details were not mentioned: How many PDFs were being indexed? What was the size of the files? Did you check the TREX Monitor to ensure that all the PDFs had been sent through the entire system? In the crawler monitor, did it report the number of files you believe to be in the index? By default, TREX holds documents in a queue for 30 minutes between processes unless you either reset this property or flush the queue.

    There is a document, TREXRecomenations, which gives some very good tips regarding file size and other common settings. For PDF it states:

    You want to index very large documents in PDF format from Adobe. These documents are not being indexed because they fail to pass the preprocessing stage.

    Limitation: PDF is a complicated file format to preprocess. Typically PDF files larger than 15 MB cause problems. The time taken for preprocessing and filtering rises to over an hour and the process delivers bad results.

    Recommendation: You should avoid the indexing and processing of PDF files that are larger than 15 MB.

    If you cannot find this document, let me know and I can forward it to you.
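
    To check a file repository against the 15 MB recommendation quoted above, a small script can list the oversized PDFs before they are sent to the index. This is only a sketch: the function name `find_large_pdfs` and the example path are made up for illustration, it uses nothing TREX-specific, and it assumes the repository is reachable as an ordinary directory tree.

```python
import os

# 15 MB, the threshold quoted from the recommendations document above.
SIZE_LIMIT_BYTES = 15 * 1024 * 1024


def find_large_pdfs(root, limit=SIZE_LIMIT_BYTES):
    """Return (path, size_in_bytes) pairs for PDFs under root larger than limit.

    Hypothetical helper: it only walks the file system and knows nothing
    about TREX itself.
    """
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                path = os.path.join(dirpath, name)
                size = os.path.getsize(path)
                if size > limit:
                    oversized.append((path, size))
    # Largest files first, so the worst offenders are at the top.
    return sorted(oversized, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # "/data/repository" is a placeholder path, not taken from the thread.
    for path, size in find_large_pdfs("/data/repository"):
        print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```

    Files reported by such a check are candidates for splitting or for exclusion from the crawl.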


  • Posted on Oct 05, 2004 at 09:04 AM

    Hi Thomas,

    PDFs indexed correctly but no PDFs found, right?

    Do the PDFs by any chance contain scans or faxes or the like?

    Then they contain only bitmaps and no text, and only their properties, titles and descriptions will be indexed.

    Or do they maybe contain scans and OCRed text as hidden text?

    Then please read this post/thread:

    teaching-trex-to-index-certain-type-of-document

    Regards,

    Karsten
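
    A rough way to tell the two cases apart without opening each file is to look for font resources in the raw PDF bytes: a pure scan usually has none, while a PDF with real or hidden OCR text normally references at least one font. The sketch below is a heuristic only, not a PDF parser. The name `pdf_mentions_fonts` is made up for illustration, and the check can miss fonts in PDFs whose object dictionaries are stored in compressed object streams.

```python
def pdf_mentions_fonts(path, chunk_size=1 << 20):
    """Heuristic: return True if the raw file bytes contain a "/Font" name.

    Assumption: image-only (scanned) PDFs usually have no font resources.
    A False result is a hint that the file may hold only bitmaps, not proof.
    """
    marker = b"/Font"
    tail = b""  # end of the previous chunk, kept to catch a split marker
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return False
            if marker in tail + chunk:
                return True
            tail = chunk[-(len(marker) - 1):]
```

    Files for which this returns False are the first ones to inspect by hand, for example by trying to select text in a PDF viewer.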


  • Former Member
    Posted on Oct 13, 2004 at 09:01 AM

    Hi everybody, same problem.

    When I create a very simple PDF file (only a few words) from a *.txt file, it is indexed perfectly and a search for the words in the PDF file finds it. But this only works for such simple files. When I take a manual from SAP and try to index it, it does not work.

    When I search for part of the title, the document is shown, but with "No document excerpt available". So I assume that TREX is not able to read the document, not even parts of it. But the document contains text that should be indexed.

    Does anybody have an idea how this problem may be fixed?

    Eik


  • Former Member
    Posted on Jan 08, 2005 at 01:42 PM

    Hi,

    It happened to me at first as well. I uploaded 4 PDF documents to the portal. TREX ran successfully (according to the queue status), but the search returned no documents. Classification was somehow working as expected. I then uploaded another PDF document and the result was still the same. The problem was somehow resolved after a .txt document was uploaded and a reindex request was submitted and ran.

    I still don't know what the cause of the problem is, but I'm happy that it is working now.

    Good luck to all of you

