Extra characters when exporting Crystal Report to PDF

davehuang

Hi, I'm experiencing the problem described in this KB article when using Crystal Reports Developer for Visual Studio SP21 (which I think is the latest version currently available). The KB article says the cause is usp10.dll being loaded from the Windows System directory instead of from C:\Program Files (x86)\SAP BusinessObjects\Crystal Reports for .NET Framework 4.0\Common\SAP BusinessObjects Enterprise XI 4.0\win32_x86, but I've checked with Process Explorer and Process Monitor, and it's using the DLL from the Crystal directory.

This problem has been reported on the forums numerous times, e.g., here, here, here, and here, but there never seems to be a real fix, only workarounds 😞 The "best" workaround is Vitaly Vedmetskiy's comment in this thread, where he suggests a specific version of usp10.dll to use (as a followup to a comment from someone at SAP who said that "We need a very specific version of usp10.dll for our text rendering to work correctly." Which raises the question of why SAP is distributing version 1.626.7601.23259 if CR's text rendering doesn't work properly with that version.) But as I said, that's just a workaround. Crystal shouldn't need a very specific version of usp10.dll, especially not one from all the way back in 2005 (which is the date of usp10.dll 1.422.3790.1830). There's nothing wrong with the newer versions of usp10.dll.

The reason newer versions of usp10.dll don't work is that they now use an OpenType font's Standard Ligatures ('liga') feature to do automatic ligature substitution. E.g., if some text has the characters "fi", the font can tell the renderer (e.g., usp10.dll) to display that as the single ligature glyph "ﬁ". The bug in Crystal's PDF export is that it doesn't handle that substitution correctly. When it generates a subset of the font to embed in the PDF, it maps the first character of the ligature to the glyph of the whole ligature. For example, if the text contains "fi", it maps "f" to the "fi" ligature, turning every other "f" (even when not followed by "i") into the "fi" ligature.
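To make the failure mode concrete, here is a toy model in Python. It is only a sketch of the behavior described above, not Crystal's actual code: the two-entry ligature list, the `shape` helper, and the rule that the ligature glyph wins the encoding slot are all assumptions made for illustration.

```python
# Toy model: glyphs are named by what they draw; the ligature list is illustrative,
# not read from Calibri.
LIGATURES = ("ti", "fi")

def shape(text):
    """Return (source_chars, glyph) pairs, merging known pairs into one glyph."""
    clusters = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LIGATURES:
            clusters.append((pair, pair))
            i += 2
        else:
            clusters.append((text[i], text[i]))
            i += 1
    return clusters

def buggy_export(text):
    """Mimic the suspected bug: a ligature is filed under its first character only."""
    encoding = {}  # character code -> glyph drawn for that code
    codes = []
    for chars, glyph in shape(text):
        code = chars[0]
        # Assumption: when a plain glyph and a ligature compete, the ligature wins.
        if len(chars) > 1 or code not in encoding:
            encoding[code] = glyph
        codes.append(code)
    return "".join(codes), encoding

def render(codes, encoding):
    """Draw each stored code with whatever glyph the encoding assigns to it."""
    return "".join(encoding[c] for c in codes)

stored, encoding = buggy_export("ta ti fa fi")
print(repr(stored))                    # 'ta t fa f'
print(repr(render(stored, encoding)))  # 'tia ti fia fi'
```

The stored text loses the "i" of each ligature, while the drawn text gains an "i" after every "t" and "f", which matches both symptoms reported in this thread.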

To demonstrate, I created a simple report, which just has a single text box with the text "ta ti fa fi" in Calibri font (most of the other "C" fonts that were introduced with Windows Vista will show the same problem—Candara, Constantia, and Corbel). Here's how it looks in the report design view. Notice how the "ti" and "fi" are connected.

If you click in it to edit the text, they change to separate letters:

Then if you go to the Report Preview tab, right-click and Export to a PDF file, it turns into this:

The PDF is compressed, but there are various utilities that can uncompress it... if you do that and look at the text that's being displayed, you'll see:

[t, 13, a t , f, 19, a f] TJ

That means display a "t", move an extra 13 units to the right, display "a t ", display "f", move an extra 19 units to the right, display "a f". Put that together and you see it thinks it should be displaying "ta t fa f". And indeed, if you copy the text from Acrobat Reader and paste it, that's exactly what you get. So if it thinks it's displaying "ta t fa f", why does it look like "tia ti fia fi"? Because there's a mapping in the PDF between characters and the glyph (image) that should be displayed for that character. And the PDF generated by CR maps "t" to the glyph for the "ti" ligature, and it maps "f" to the glyph for the "fi" ligature.
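If you don't have a PDF utility handy, a few lines of Python can do the same inspection. This is a rough sketch under several assumptions: the content stream is Flate-compressed with no predictor, strings are plain literal strings without escaped parentheses, and the sample stream below is made up to mirror the array shown above, written in full PDF syntax with each string in parentheses. The helper names are hypothetical.

```python
import re
import zlib

STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
TJ_RE = re.compile(r"\[(.*?)\]\s*TJ", re.DOTALL)
STRING_RE = re.compile(r"\(([^()]*)\)")

def inflate_streams(pdf_bytes):
    """Yield the inflated contents of every stream that zlib accepts."""
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the newline left between the data and "endstream".
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # stream uses some other filter, or is not compressed

def tj_strings(content):
    """Return the text of each TJ array, ignoring the spacing numbers."""
    text = content.decode("latin-1")
    return ["".join(STRING_RE.findall(body)) for body in TJ_RE.findall(text)]

# Made-up stand-in for the page content stream of the exported report.
sample = b"BT /F1 11 Tf [(t) 13 (a t ) (f) 19 (a f)] TJ ET"
pdf = (b"1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
       + zlib.compress(sample) + b"\nendstream\nendobj\n")

for content in inflate_streams(pdf):
    print(tj_strings(content))  # ['ta t fa f']
```

To try it on a real export, read the file with `open(path, "rb").read()` and pass the bytes to `inflate_streams`; streams that are fonts or images will either be skipped or produce no TJ matches.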

It appears that CR's PDF export is calling usp10.dll's ScriptShape function to determine the list of glyphs for a piece of text, but it doesn't seem to do the right thing when multiple characters are replaced by a single glyph. Ideally, the export would be fixed so it handles that case properly. But I guess a quick and dirty fix would be to tell usp10 to disable processing of the "liga" feature. The reason old versions of usp10.dll "fix" the problem is that they don't process "liga" at all, so disabling "liga" processing on a new usp10.dll would give the effect of reverting to an old usp10.dll, without the hassle of having to actually get an old version of the DLL.
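For what "handling that case properly" could look like, here is a minimal Python sketch. It is one possible approach and not a description of how Crystal works or will be fixed: every distinct glyph gets its own code, and a separate table records which source characters each code stands for, which is the role a ToUnicode CMap plays in a PDF. The ligature list and function names are assumptions for illustration.

```python
LIGATURES = ("ti", "fi")  # toy stand-in for a font's 'liga' substitutions

def shape(text):
    """Return (source_chars, glyph) pairs, merging known pairs into one glyph."""
    clusters = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LIGATURES:
            clusters.append((pair, pair))
            i += 2
        else:
            clusters.append((text[i], text[i]))
            i += 1
    return clusters

def export(text):
    """Give every distinct glyph its own code and record the text it stands for."""
    glyph_codes = {}  # glyph -> code used in the content stream
    to_unicode = {}   # code -> source characters (one code may map to two characters)
    codes = []
    for chars, glyph in shape(text):
        if glyph not in glyph_codes:
            glyph_codes[glyph] = len(glyph_codes) + 1
        code = glyph_codes[glyph]
        to_unicode[code] = chars
        codes.append(code)
    return codes, glyph_codes, to_unicode

codes, glyph_codes, to_unicode = export("ta ti fa fi")
drawn = {code: glyph for glyph, code in glyph_codes.items()}
print("".join(drawn[c] for c in codes))       # what a viewer would draw
print("".join(to_unicode[c] for c in codes))  # what copy/paste would yield
```

Because "t" and the "ti" ligature no longer share a code, the plain "t" keeps its own glyph, and the ligature's code still maps back to both of its characters for text extraction.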

Anyways, I hope the Crystal Reports developers will fix this once and for all.

Accepted Solutions (1)

0 Kudos

Hi David,

Thanks for the details. I was sure they fixed this but it appears it's been broken again.

I'll escalate it in the morning and get DEV to look into it again.

It may be an issue in the export DLL now. CR 14.2.4 now uses usp10 1.6, so it may be that they need to undo whatever they did before to make it work.

Don

FYI, the CR formatting engine is based on hardware and software including printer drivers, USP10, GDI, GDIPlus and the various frameworks. There are a lot of dependencies involved in trying to get this right. One change is all it takes to break things...

This was all fixed in a Registry key:

KBA 2534523 - Words are incorrectly exported to PDF format from Crystal Reports when using the font Calibri

Symptom

  • Words are incorrect.
  • Letter "i" or "a" or "f" is added to many words.
  • A word like "test" is exported to PDF as "testie".
  • When exporting a report from Crystal Reports to PDF format that contains text formatted to use the font Calibri, many of the words in the PDF document generated are incorrect.

Environment

  • SAP Crystal Reports 2011
  • SAP Crystal Reports 2013
  • SAP Crystal Reports 2016
  • Crystal Reports, Developer for Visual Studio

Reproducing the Issue

  1. In Crystal Reports, create a report off any data source.
  2. Insert a database field, or text object on the report that contains text.
  3. Format the database field or text object to use the font Calibri
  4. Export the report to PDF format
  5. When opening the PDF document generated, notice many of the words are incorrect, like:
    - quotes, becomes: quoties
    - test, becomes: tiest
    - date, becomes: datie

Cause

This is a program error.

Solution

  • This issue is fixed in the patches listed in the "Support Packages & Patches" section below.

    The "Support Packages & Patches" section will be populated with the relevant patch levels once they are released.

    For Business Intelligence Platform maintenance schedule and strategy see the Knowledge Base Article 2144559 in References section.

  • To work around the issue, add the registry value UseCustomEncoding and set it to zero (0):

    1. Close Crystal Reports designer.

    2. Open the Microsoft Registry Editor.
    ( In MS Windows, under the menu Start, select Run, and type: regedit )

    3. Navigate to the following path:

    HKEY_CURRENT_USER\SOFTWARE\SAP Business Objects\Suite XI 4.0\Crystal Reports\Export\PDF

    For CR for VS 32 bit use this key:

    HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\SAP BusinessObjects\Crystal Reports for .NET Framework 4.0\Crystal Reports\Export\PDF

    For CR for VS 64 bit use this key:

    HKEY_LOCAL_MACHINE\SOFTWARE\SAP BusinessObjects\Crystal Reports for .NET Framework 4.0\Crystal Reports\Export\PDF

    Note: The last part of the path may not exist. If that is the case, simply add the missing keys.

    4. Right-click the PDF key and select: "New - DWORD Value"
    5. Set the name to: UseCustomEncoding
    6. Set the value to: 0
    7. Restart Crystal Reports.
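If the same setting has to go onto several machines, the manual steps can be replaced by a .reg file. The Python sketch below only builds the file text; the `reg_file` helper and the constant name are hypothetical, and the key path is the CR for VS 32 bit one quoted above. Swap in whichever path applies to your installation.

```python
def reg_file(key_path, name="UseCustomEncoding", value=0):
    """Return .reg file text that sets a DWORD value under key_path."""
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",
        f'"{name}"=dword:{value:08x}',
        "",
    ])

# Key path for CR for VS 32 bit, copied from the workaround steps.
CR_FOR_VS_32BIT = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\SAP BusinessObjects"
    r"\Crystal Reports for .NET Framework 4.0\Crystal Reports\Export\PDF"
)

print(reg_file(CR_FOR_VS_32BIT))
```

Save the output as a .reg file and import it with regedit or `reg import`. Importing creates any missing keys along the path, and the HKEY_LOCAL_MACHINE paths need an elevated prompt.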

Keywords

Calibri font, "a", usp10 1.6 CR, xport, Calibri, PDF

CRYSTAL REPORTS FOR VS 2010SP021

BI PLATFORM SERVERS 4.1SP011

BOP BI PLATFORM SERVERS 4.2 SP005

SBOP BI PLATFORM SERVERS 4.3

davehuang

Hi, setting the UseCustomEncoding registry value to 0 does seem to fix the problem for me. Thanks! I do notice that copying text from the PDF still results in the wrong characters (using the example of "ta ti fa fi" from my original post, if I copy that from the PDF and paste it into Notepad, it still results in "ta t fa f"). However, it displays and prints correctly, which is my main concern. It would be nice if the text in the PDF were also correct though, so text searches would work.

What's the downside to setting this registry entry? (I assume that since it's not the default behavior, there must be some tradeoff).

0 Kudos

mmmm I don't see that copy/paste problem....

Trade off is it works.

davehuang

Hmm, this is the PDF I got with UseCustomEncoding set to 0: LigatureTest.pdf

I've opened it in a few PDF viewers: Adobe Acrobat Reader DC, the Chrome web browser, the Windows 10 PDF viewer app, and SumatraPDF, and they all display it correctly ("ta ti fa fi"). But if I highlight the text, press Ctrl+C to copy it, then press Ctrl+V in Notepad, they all copy/paste "ta t fa f".

And if the only effect of setting the registry entry is that it makes things work, what's the point of having a registry entry for it? Why would you ever want it to not work? I'm just puzzled as to why I have to set a registry entry, instead of it being the default behavior.

Answers (1)

0 Kudos

Due to bugs in USP10.dll version 1.4, CR uses a registry key to work around them.

We updated to 1.6, so there may be bugs in it also.

I noticed Adobe is using version 10 of usp10.dll (the Windows 10 version), so there may be issues converting/copying the characters.

If you don't want to use what works then don't use that font.

Don