OCRTesseract
The OCRTesseract class provides an interface to the tesseract-ocr API (v3.02.02) in C++.
Note that it is compiled only when tesseract-ocr is correctly installed.
@note
- (C++) An example of OCRTesseract recognition combined with scene text detection can be found in the end_to_end_recognition demo: https://github.com/opencv/opencv_contrib/blob/master/modules/text/samples/end_to_end_recognition.cpp
- (C++) Another example of OCRTesseract recognition combined with scene text detection can be found in the webcam_demo: https://github.com/opencv/opencv_contrib/blob/master/modules/text/samples/webcam_demo.cpp
Member of Text
-
Creates an instance of the OCRTesseract class. Initializes Tesseract.
Declaration
Objective-C
+ (nonnull OCRTesseract *)create:(nonnull NSString *)datapath language:(nonnull NSString *)language char_whitelist:(nonnull NSString *)char_whitelist oem:(ocr_engine_mode)oem psmode:(page_seg_mode)psmode;
Swift
class func create(datapath: String, language: String, char_whitelist: String, oem: ocr_engine_mode, psmode: page_seg_mode) -> OCRTesseract
Parameters
datapath
the name of the parent directory of tessdata, ending with “/”, or NULL to use the system’s default directory.
language
an ISO 639-3 language code; NULL defaults to “eng”.
char_whitelist
specifies the list of characters used for recognition. NULL defaults to “0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ”.
oem
tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT is used. See the tesseract-ocr API documentation for other possible values.
psmode
tesseract-ocr offers different Page Segmentation Modes (PSM); by default tesseract::PSM_AUTO (fully automatic layout analysis) is used. See the tesseract-ocr API documentation for other possible values.
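The datapath description above requires a trailing “/”. As a minimal sketch of how a caller might guarantee that before calling create, the helper below appends the separator when it is missing. The name normalizedDatapath and the example path are hypothetical and not part of OpenCV; an empty string is passed through unchanged rather than turned into “/”.

```swift
// Hypothetical helper (not part of OpenCV): makes sure the tessdata parent
// directory path ends with "/", as the datapath parameter expects.
func normalizedDatapath(_ path: String) -> String {
    // Leave an empty path untouched so it is not silently turned into "/".
    if path.isEmpty || path.hasSuffix("/") {
        return path
    }
    return path + "/"
}

// Example path, chosen only for illustration.
let datapath = normalizedDatapath("/usr/local/share")
print(datapath)  // prints "/usr/local/share/"
```

The resulting string could then be passed as the datapath argument of any of the create overloads.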
-
Creates an instance of the OCRTesseract class. Initializes Tesseract.
Declaration
Objective-C
+ (nonnull OCRTesseract *)create:(nonnull NSString *)datapath language:(nonnull NSString *)language char_whitelist:(nonnull NSString *)char_whitelist oem:(ocr_engine_mode)oem;
Swift
class func create(datapath: String, language: String, char_whitelist: String, oem: ocr_engine_mode) -> OCRTesseract
Parameters
datapath
the name of the parent directory of tessdata, ending with “/”, or NULL to use the system’s default directory.
language
an ISO 639-3 language code; NULL defaults to “eng”.
char_whitelist
specifies the list of characters used for recognition. NULL defaults to “0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ”.
oem
tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT is used. See the tesseract-ocr API documentation for other possible values. This overload has no psmode parameter and uses tesseract::PSM_AUTO (fully automatic layout analysis).
-
Creates an instance of the OCRTesseract class. Initializes Tesseract.
Declaration
Objective-C
+ (nonnull OCRTesseract *)create:(nonnull NSString *)datapath language:(nonnull NSString *)language char_whitelist:(nonnull NSString *)char_whitelist;
Swift
class func create(datapath: String, language: String, char_whitelist: String) -> OCRTesseract
Parameters
datapath
the name of the parent directory of tessdata, ending with “/”, or NULL to use the system’s default directory.
language
an ISO 639-3 language code; NULL defaults to “eng”.
char_whitelist
specifies the list of characters used for recognition. NULL defaults to “0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ”. This overload has no oem or psmode parameters; tesseract::OEM_DEFAULT and tesseract::PSM_AUTO (fully automatic layout analysis) are used.
-
Creates an instance of the OCRTesseract class. Initializes Tesseract.
Declaration
Objective-C
+ (nonnull OCRTesseract *)create:(nonnull NSString *)datapath language:(nonnull NSString *)language;
Swift
class func create(datapath: String, language: String) -> OCRTesseract
Parameters
datapath
the name of the parent directory of tessdata, ending with “/”, or NULL to use the system’s default directory.
language
an ISO 639-3 language code; NULL defaults to “eng”. This overload uses the default character whitelist “0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ”, tesseract::OEM_DEFAULT, and tesseract::PSM_AUTO (fully automatic layout analysis).
-
Creates an instance of the OCRTesseract class. Initializes Tesseract.
Declaration
Objective-C
+ (nonnull OCRTesseract *)create:(nonnull NSString *)datapath;
Swift
class func create(datapath: String) -> OCRTesseract
Parameters
datapath
the name of the parent directory of tessdata, ending with “/”, or NULL to use the system’s default directory. This overload uses the default language “eng”, the default character whitelist “0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ”, tesseract::OEM_DEFAULT, and tesseract::PSM_AUTO (fully automatic layout analysis).
-
Creates an instance of the OCRTesseract class. Initializes Tesseract.
This overload takes no arguments and uses the defaults throughout: the system's default tessdata directory, the language "eng", the character whitelist "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", tesseract::OEM_DEFAULT, and tesseract::PSM_AUTO (fully automatic layout analysis). See the tesseract-ocr API documentation for other possible values.
Declaration
Objective-C
+ (nonnull OCRTesseract *)create;
Swift
class func create() -> OCRTesseract
-
Recognize text using the tesseract-ocr API.
Takes an image as input and returns the recognized text. The underlying C++ API can optionally also provide the Rects for individual text elements found (e.g. words) and the list of those text elements with their confidence values; the declaration below returns only the text.
Declaration
Objective-C
- (nonnull NSString *)run:(nonnull Mat *)image min_confidence:(int)min_confidence component_level:(int)component_level;
Swift
func run(image: Mat, min_confidence: Int32, component_level: Int32) -> String
Parameters
image
Input image, CV_8UC1 or CV_8UC3.
component_level
OCR_LEVEL_WORD (by default), or OCR_LEVEL_TEXTLINE.
-
Recognize text using the tesseract-ocr API.
Takes an image as input and returns the recognized text. The underlying C++ API can optionally also provide the Rects for individual text elements found (e.g. words) and the list of those text elements with their confidence values; the declaration below returns only the text.
Declaration
Objective-C
- (nonnull NSString *)run:(nonnull Mat *)image min_confidence:(int)min_confidence;
Swift
func run(image: Mat, min_confidence: Int32) -> String
Parameters
image
Input image, CV_8UC1 or CV_8UC3.
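A rough sketch of how the create and run declarations on this page might be combined from Swift is shown below. It assumes the framework is imported as opencv2 and was built with tesseract-ocr support, and that Imgcodecs.imread(filename:) and Mat.empty() are available from the same framework. The function name recognizeText, its parameters, and the min_confidence value of 0 are placeholders chosen for illustration, not recommended settings.

```swift
// Guarded so the file still compiles where the opencv2 framework is not linked.
#if canImport(opencv2)
import opencv2

// Hypothetical wrapper: loads an image from disk and returns the recognized
// text, or nil when the image cannot be read.
func recognizeText(imagePath: String, tessdataParent: String) -> String? {
    // imread yields an empty Mat when the file is missing or unreadable.
    let image = Imgcodecs.imread(filename: imagePath)
    if image.empty() {
        return nil
    }

    // Three-argument overload: the OCR engine mode and page segmentation
    // mode fall back to their defaults.
    let ocr = OCRTesseract.create(
        datapath: tessdataParent,
        language: "eng",
        char_whitelist: "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

    // min_confidence: 0 is a placeholder value for this sketch.
    return ocr.run(image: image, min_confidence: 0)
}
#endif
```

A caller would pass the path to an image and the parent directory of tessdata (ending with "/"), then check the result for nil before using the text.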
-
Declaration
Objective-C
- (void)setWhiteList:(NSString*)char_whitelist NS_SWIFT_NAME(setWhiteList(char_whitelist:));
Swift
func setWhiteList(char_whitelist: String)
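The char_whitelist argument is a plain string of allowed characters. As a small sketch of one way to assemble such a string, the helper below expands a character range into a string. The name characters(from:through:) is hypothetical and not part of OpenCV, and the commented-out call at the end assumes an existing OCRTesseract instance named ocr.

```swift
// Hypothetical helper (not part of OpenCV): returns the characters whose
// Unicode scalar values run from `first` through `last`, inclusive.
func characters(from first: Character, through last: Character) -> String {
    guard let low = first.unicodeScalars.first?.value,
          let high = last.unicodeScalars.first?.value,
          low <= high else { return "" }
    var result = ""
    for value in low...high {
        // Skip values that are not valid Unicode scalars.
        if let scalar = Unicode.Scalar(value) {
            result.append(Character(scalar))
        }
    }
    return result
}

// A digits-only whitelist, e.g. for numeric fields.
let digits = characters(from: "0", through: "9")

// Rebuilds the default whitelist documented for create.
let alphanumeric = digits
    + characters(from: "a", through: "z")
    + characters(from: "A", through: "Z")

print(digits)  // prints "0123456789"

// With an existing OCRTesseract instance named `ocr` (assumed here):
// ocr.setWhiteList(char_whitelist: digits)
```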