DC Metadata for PDF files not indexing

stephen_spalding
Contributor

Hi all,

We followed the instructions for configuring TREX to index DC Metadata for MS Word, MS Excel, and PDF files. It seems to work just fine for MS Word and Excel files, but it does not work for PDF files. I can go into more detail if necessary, but I was just wondering if anyone else had this problem and has a solution for it.

Thanks!

-StephenS

Accepted Solutions (1)

Former Member

Hi Stephen,

Have you maintained the MIME types in the TREXValidMimeTypes.ini file? Are you getting any specific errors?

Regards,

Shyam

stephen_spalding
Contributor

Our TREXValidMimeTypes.ini file has the following line in it:

application/pdf

I checked through the TREXPreprocess**.trc files and do not see any specific errors.

-StephenS

Answers (1)

stephen_spalding
Contributor

I have figured out what the problem was. It turns out that TREX is successfully indexing the DC Metadata for PDF files.

We are still using Windows XP as our desktop operating system, and in XP you can right-click a Word, Excel, or PDF file and choose 'Properties'. When the properties dialog comes up, you select the 'Summary' tab to get to the Title, Subject, Author, Category, Keywords, and Comments attributes (the DC Metadata).

However, there is another way to get to the DC Metadata: open the file and select File -> Properties. For Word and Excel files, you can apparently update the DC Metadata either way and TREX will pick it up. For PDF files, the Right Click -> Properties -> Summary method does not work: if you set the DC Metadata this way, TREX does not pick it up. You must use a PDF editor to set the metadata. It almost seems as if there are two separate sets of metadata for PDFs.
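For anyone who wants to check which of the two sets a given PDF actually carries, here is a rough Python sketch that looks for Title, Author, Subject, and Keywords inside the PDF file itself. It rests on assumptions that are mine and not confirmed anywhere in this thread: that TREX only sees metadata embedded in the document, and that the values are stored as plain, unencrypted literal strings in the PDF's document information dictionary. It will miss hex or UTF-16 strings, compressed object streams, and XMP packets, so an empty result is not proof that the metadata is missing. The function name `embedded_pdf_info` is just an illustrative helper, not part of any SAP or TREX tool.

```python
import re

# Entries of the PDF document information dictionary that correspond to
# the Summary-style fields discussed above.
INFO_KEYS = ("Title", "Author", "Subject", "Keywords")


def embedded_pdf_info(data: bytes) -> dict:
    """Return Info entries stored as plain literal strings in raw PDF bytes.

    Simplified on purpose: handles "/Key (value)" with backslash-escaped
    parentheses only. Hex strings, nested unescaped parentheses, UTF-16
    text, and compressed or encrypted objects are not handled.
    """
    found = {}
    for key in INFO_KEYS:
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\()])*)\)"
        match = re.search(pattern, data)
        if match:
            # Undo the escapes for "(", ")" and "\" inside the literal string.
            value = re.sub(rb"\\([()\\])", rb"\1", match.group(1))
            found[key] = value.decode("latin-1")
    return found


# Usage (the file name is a placeholder):
#   with open("report.pdf", "rb") as f:
#       print(embedded_pdf_info(f.read()))
```

If a PDF whose fields were filled in through the XP Summary tab comes back empty here, while one edited in a PDF editor shows the values, that would be consistent with the behaviour described above.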

I hope this helps someone else who comes across this issue.

Thanks!

-StephenS