ONLINE HELP
WINDEV, WEBDEV AND WINDEV MOBILE

  • Overview
  • How to use native OCR?
  • Language model
  • Reading the image or PDF
  • Remarks
Overview
An OCR (Optical Character Recognition) system analyzes an image to extract the text it contains. Starting with version 26, you can integrate OCR features into your applications and sites.
The OCR engine is a neural network. It reads the text contained in images.
For example, take a picture of a contract with your phone and retrieve its text in Word.
OCR is also very useful in a document management system (DMS), where the extracted text can be used to index document contents.
How to use native OCR?
To retrieve text via native OCR:
  1. If necessary, load the model corresponding to the language used.
  2. Use OCRExtractText, indicating the name of the image or PDF document to be analyzed.
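The steps above can be sketched in WLanguage as follows. This is a minimal sketch, not a reference implementation: it assumes the default language model already matches the text, and "contract.png" is a hypothetical file placed in the directory of the executable.

```wlanguage
// Minimal sketch: "contract.png" is a hypothetical file name
sText is string

// Extract all the text found in the image
sText = OCRExtractText(fExeDir() + fSep() + "contract.png")

IF sText = "" THEN
	// Nothing was recognized (area too small, strong skew, no printed text, ...)
	Info("No text was recognized.")
ELSE
	// Display the recognized text in the debug trace window
	Trace(sText)
END
```

The same call can be given a PDF document instead of an image.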

Language model

By default, the following language models are provided: English, French and Spanish. The model that corresponds to the current language is used.
To recognize other languages via native OCR, simply:
  1. Provide the neural network training model corresponding to the language (".traineddata" file to include in the directory of the executable).
  2. Use OCRLoadLanguage to load the desired language.
iPhone/iPad: On iOS, Apple's native OCR is used. This native OCR is only available from iOS 13 onwards, and it currently supports English only.
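As a hedged illustration of loading an additional language, the sketch below assumes that OCRLoadLanguage accepts a WLanguage language constant (here languageGerman) and that the matching ".traineddata" file has already been copied to the directory of the executable. The exact syntax should be checked on the help page of the function, and "invoice_de.png" is a hypothetical file name.

```wlanguage
// Assumption: OCRLoadLanguage takes a language constant and looks for
// the corresponding ".traineddata" file in the directory of the executable
OCRLoadLanguage(languageGerman)

// "invoice_de.png" is a hypothetical file name
sText is string = OCRExtractText(fExeDir() + fSep() + "invoice_de.png")
Trace(sText)
```

Loading only the languages that are actually needed is consistent with the recommendation to limit the number of models used.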

Reading the image or PDF

OCRExtractText returns all the text from the image. All content other than text is ignored. If required, this function can be used to analyze only part of an image: simply enter the coordinates of the part to be analyzed.
OCRExtractTextBlock analyzes an image and returns a set of rectangles each containing a block of text.

Remarks

  • To get the best results possible, it is recommended to:
    • Use a high-resolution image.
    • Crop the image around the text if possible (avoid unnecessary areas).
    • Limit text skew. If the image is slightly skewed, OCR may be able to detect the text, but the quality will be affected.
      iPhone/iPad: Skewed images can be read.
    • Limit the number of models/languages used.
  • If the selected area is too small, it will not be possible to retrieve the corresponding text (for example, an area reduced to a single number or letter).
  • If the image used corresponds to an Image control, the OCR works directly on the source image. Changes made in the Image control (image size, for example) are therefore not taken into account. To apply these changes, save the image first.
  • If the image used (via an Image control or not) is a PDF file, it is analyzed at a resolution of 300 DPI.
  • OCR can only detect printed text. It cannot recognize handwritten text.
  • "White" text is not recognized.
Related Examples:
OCR functions (WINDEV unit example): This example shows how to use OCR functions in WINDEV.
Minimum version required
  • Version 26

Last update: 03/27/2025
