Nov 13, 2018 at 11:38 AM

OCR parameters explanation


I've got some questions regarding the parameters of the OCR service.
As stated in the documentation, there are several options for the page segmentation mode and for the type of machine learning model.
The descriptions of these parameters are very short. Does anyone know where I can find a more in-depth description?

Questions about the modelType
Regarding the different modelTypes, I would like to know the difference between lstmPrecise, lstmFast and lstmStandard. I am familiar with LSTM cells, but I couldn't find any information on what makes the "precise" model precise, the "fast" model fast, and so on.

There is also a model with "LSTM cells and standard processing algorithms". Is there any information on which standard processing algorithms are used?

I am also looking for information on the training of these models.

Questions about the pageSegMode

Most of the options are pretty self-explanatory; however, I stumbled upon pageSegMode 13 - "Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific".
I know Tesseract as a free software for optical character recognition. Is the OCR service SAP provides based on Tesseract?
What Tesseract-specific hacks are bypassed?
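
To make clear which parameters I mean, here is a minimal sketch of how I put the two options together. The helper build_ocr_options is hypothetical and only for illustration. The key names and model types are the ones mentioned above. That the options travel as a JSON string, and that the valid modes run from 0 to 13, are my assumptions and not something I found confirmed in the documentation.

```python
import json

# Model types named in this post; the service may accept further values.
MODEL_TYPES = {"lstmPrecise", "lstmFast", "lstmStandard"}

# Assumption: page segmentation modes are numbered 0..13 (13 = "Raw line").
MAX_PAGE_SEG_MODE = 13


def build_ocr_options(model_type: str, page_seg_mode: int) -> str:
    """Hypothetical helper: serialize the two parameters into a JSON string."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown modelType: {model_type!r}")
    if not 0 <= page_seg_mode <= MAX_PAGE_SEG_MODE:
        raise ValueError(f"pageSegMode out of range: {page_seg_mode}")
    return json.dumps({"modelType": model_type, "pageSegMode": page_seg_mode})


if __name__ == "__main__":
    # The combination I am asking about: fast model, raw-line segmentation.
    print(build_ocr_options("lstmFast", 13))
```

The validation only reflects the values I have seen so far, so it may well be stricter or looser than what the service really accepts.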

I really hope that someone out there can help me with these questions, or at least has an idea who might know the answers.

Thanks in advance and best regards,