OCRDetectTextArea returns all the text areas of an image without reading the text. Each text area is defined by a polygon.
Remark: To get both the text and its areas, it is recommended to call OCRExtractTextBlock directly. OCRExtractTextBlock is no slower at runtime.
Example
MyImage is Image
MyPolygonArray is array of Polygon
MyPolygonArray = OCRDetectTextArea(MyImage)
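Building on the example above, the following sketch shows one way to walk through the returned polygons and list their vertices. The file path is hypothetical, and the code assumes that a Polygon variable exposes its vertices through the Points property, with each point carrying X and Y properties. Check these names against the documentation of the Polygon and Point types for your version.

```
// Hypothetical image path, used for illustration only
sImagePath is string = "C:\Temp\scan.png"
arrAreas is array of Polygon
arrAreas = OCRDetectTextArea(sImagePath)

Trace("Text areas found: " + arrAreas..Count)

FOR EACH plgArea OF arrAreas
	// Assumption: the vertices are available via the Points property
	FOR EACH ptVertex OF plgArea..Points
		Trace("  (" + ptVertex..X + ", " + ptVertex..Y + ")")
	END
	Trace("---")
END
```

The image path is passed directly to the function here, which the syntax allows. An Image variable or an Image control could be used instead.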
Syntax
<Result> = OCRDetectTextArea(<Image to use>)
<Result>: Array of Polygon variables
Array of Polygon variables corresponding to the different text areas.
<Image to use>: Control name, Image variable, character string
Image in which the text areas must be detected. The image can correspond to:
  • an Image control,
  • an Image variable,
  • an Image Memo item,
  • the path of an image file,
  • the path of a PDF file.
    Caution: this file must contain only one page.
    Reminder: you can extract a page from a PDF file as an image using PDFExtractPage. This image can be processed by OCRDetectTextArea.
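Because a PDF file passed directly must contain a single page, a multi-page document can be handled by extracting each page as an image first. The sketch below illustrates this approach. The file path is hypothetical, and the code assumes that PDFNumberOfPages takes the path of the PDF file and that PDFExtractPage takes the path and a page number starting at 1 and returns an Image variable. Check both signatures against their reference pages before relying on them.

```
// Hypothetical PDF path, used for illustration only
sPdfPath is string = "C:\Temp\report.pdf"
imgPage is Image
arrAreas is array of Polygon

// Assumption: PDFNumberOfPages(<PDF file>) returns the page count
nPageCount is int = PDFNumberOfPages(sPdfPath)

FOR nPage = 1 TO nPageCount
	// Assumption: PDFExtractPage(<PDF file>, <Page number>) returns an Image
	imgPage = PDFExtractPage(sPdfPath, nPage)
	arrAreas = OCRDetectTextArea(imgPage)
	Trace("Page " + nPage + ": " + arrAreas..Count + " text area(s)")
END
```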
Remarks
  • The Legacy and LSTM engines are available for WINDEV applications (Windows and Linux).
  • The .traineddata models are required, even if the text is not read.
  • To get the best results possible, it is recommended to:
    • Use a high-resolution image.
    • Crop the image around the text if possible (avoid unnecessary areas).
    • Limit text skew. If the image is slightly skewed, OCR may be able to detect the text, but the quality will be affected.
    • Limit the number of models/languages used.
  • If the image is passed as an Image control, the function works directly on the source image of the control. The changes made in the Image control (image size, for example) are therefore not taken into account. To apply these changes, the image must be saved first.
  • If the image (passed via an Image control or not) is a PDF file, its resolution is set to 300 DPI.
  • OCR can only detect printed text. It cannot recognize handwritten text.
  • "White" text is not recognized.
Business / UI classification: Business Logic
Component: wd290ocr.dll
Minimum version required
  • Version 26

Last update: 05/26/2022
