Recognition settings

Aspose.OCR for JavaScript via C++ provides good recognition accuracy and performance out of the box. However, some images (for example, rotated, noisy, or multi-column ones) need non-default settings to be recognized reliably.

To fine-tune the recognition settings, pass an optional Module.WasmAsposeOCRRecognitionSettings object to the recognition function.

Language configuration

Set the recognition language and restrict the set of characters to be recognized.

Setting | Value | Default value | Description
language_alphabet | Object | Module.Language.NONE | Specify the language for recognition. It is highly recommended to always provide this setting.
alphabet | String | All symbols | A custom list of characters to be recognized, provided as a case-sensitive string. Characters that are not in the list are ignored.
allowed_characters | Object | Module.CharactersAllowedType.ALL | A predefined whitelist of characters that the Aspose.OCR engine will look for.
ignoredCharacters | String | none | A blacklist of characters that are ignored during recognition.
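
As a rough illustration of how the language settings can be combined, the sketch below copies a few options onto a settings object. The helper name applyLanguageSettings and its option names (language, alphabet, ignored) are hypothetical; only the setting names come from the table above. In real use, the first argument would be the recognition settings object and the language a value such as Module.Language.UKR. A plain object and a placeholder string stand in here so that the snippet runs on its own.

```javascript
// Hypothetical helper: copies language-related options onto a settings object.
// Only the property names language_alphabet, alphabet and ignoredCharacters
// come from the settings table; everything else is illustrative.
function applyLanguageSettings(settings, options) {
  if (options.language !== undefined) {
    settings.language_alphabet = options.language;
  }
  if (typeof options.alphabet === "string" && options.alphabet.length > 0) {
    // The whitelist is case-sensitive: "A" and "a" are different characters.
    settings.alphabet = options.alphabet;
  }
  if (typeof options.ignored === "string" && options.ignored.length > 0) {
    settings.ignoredCharacters = options.ignored;
  }
  return settings;
}

// A plain object stands in for the real recognition settings object.
const demoSettings = applyLanguageSettings({}, {
  language: "UKR",          // placeholder for Module.Language.UKR
  alphabet: "0123456789.,", // look for digits and separators only
  ignored: "|",             // never report the vertical bar
});
console.log(demoSettings);
```

Options that are not provided are left untouched, so the defaults listed in the table still apply to them.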

Document structure detection

Determine how to handle complex text layouts (multi-column texts, tables, and so on).

Setting | Value | Default value | Description
detect_areas_mode | detect_areas_mode_enum | detect_areas_mode_enum::DOCUMENT | Manually override the default document areas detection model.
lines_filtration | Boolean | false | Set to true to recognize text in tables. Set to false to improve performance by ignoring table separators and treating tables as plain text lines.
all_image | Boolean | false | Force recognition of the entire image as a single block of text. Enable this setting (set it to true) only for very simple, one-line images.
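
The two Boolean settings above are easier to reason about as presets. The following is a minimal sketch under that assumption: the function name applyLayoutPreset and the layout names "table" and "single-line" are made up for illustration, and a plain object stands in for the real settings object.

```javascript
// Hypothetical presets built from the documented lines_filtration and
// all_image settings. The layout names are illustrative only.
function applyLayoutPreset(settings, layout) {
  switch (layout) {
    case "table":
      settings.lines_filtration = true;  // recognize text inside tables
      settings.all_image = false;
      break;
    case "single-line":
      settings.lines_filtration = false;
      settings.all_image = true;         // treat the whole image as one block of text
      break;
    default:
      // Fall back to the documented default values.
      settings.lines_filtration = false;
      settings.all_image = false;
  }
  return settings;
}

console.log(applyLayoutPreset({}, "table"));
console.log(applyLayoutPreset({}, "single-line"));
```

The presets never enable both settings at once, which reflects the recommendation to reserve all_image for very simple, one-line images.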

Preprocessing

Improve recognition accuracy for low-quality images.

Setting | Value | Default value | Description
correct_skew | Boolean | true | Automatically correct image tilt (deskew) before proceeding to recognition. Since most images have some degree of tilt, it is recommended to keep this setting on.
skew | Floating point number | 0.0 | Manually rotate the image by the specified angle, in degrees. Recommended for significantly tilted images (more than 15° in either direction), for which the automatic skew correction may fail. Values from -360 to 0 rotate the image counterclockwise; values from 0 to 360 rotate it clockwise.
auto_contrast | Boolean | false | Automatically increase the contrast of images before proceeding to recognition. Enable for blurry images and out-of-focus photos.
auto_denoising | Boolean | false | Automatically remove noise from images before proceeding to recognition.
upscale_small_font | Boolean | false | Improve small font recognition and detection of dense lines. Recommended for endnotes, food labels, and other images with small text.
threshold_value | Number | 0 | Override the automatic binarization settings.
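
The sign convention of the skew setting is easy to get wrong, so a small guard can help. The sketch below is only an illustration: applyManualSkew is a hypothetical helper, not part of the library, and a plain object stands in for the real settings object. It checks the documented range of -360 to 360 and reports the rotation direction that the value implies.

```javascript
// Hypothetical helper: validates a manual rotation angle, stores it in the
// documented skew setting and returns the implied rotation direction.
function applyManualSkew(settings, degrees) {
  if (!Number.isFinite(degrees) || degrees < -360 || degrees > 360) {
    throw new RangeError("skew must be a number between -360 and 360");
  }
  settings.skew = degrees;
  if (degrees < 0) {
    return "counterclockwise"; // negative angles rotate counterclockwise
  }
  if (degrees > 0) {
    return "clockwise";        // positive angles rotate clockwise
  }
  return "none";               // 0 leaves the image as it is
}

const preprocessing = {};
console.log(applyManualSkew(preprocessing, -90), preprocessing.skew);
```

The helper leaves correct_skew untouched, because the settings table does not describe how the two settings interact.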

Example

The following code example shows how to fine-tune recognition:

// Prepare images (filename must already hold the location of the image to recognize)
var source = Module.WasmAsposeOCRInput();
source.url = filename;
var content = new Module.WasmAsposeOCRInputs();
content.push_back(source);
// Recognize rotated photos of book pages in Ukrainian
var settings = Module.WasmAsposeOCRRecognitionSettings();
settings.language_alphabet = Module.Language.UKR;
// Negative angle: rotate the image 90 degrees counterclockwise
settings.skew = -90;
// Override the default document areas detection mode
settings.detect_areas_mode = Module.DetectAreasMode.CURVED_TEXT;
// Recognize image
var result = Module.AsposeOCRRecognize(content, settings);
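
When the same steps are repeated for many images, they can be wrapped in a single function. The following is a minimal sketch, assuming the module exposes exactly the members used in the example above. The wrapper name recognizeWithOverrides is hypothetical, and fakeModule is a stand-in that performs no recognition and exists only so that the snippet runs without the real library.

```javascript
// Sketch: wraps the steps from the example in one function.
// ocrModule is expected to expose the same members the example uses.
function recognizeWithOverrides(ocrModule, filename, overrides) {
  const source = ocrModule.WasmAsposeOCRInput();
  source.url = filename;
  const content = new ocrModule.WasmAsposeOCRInputs();
  content.push_back(source);

  const settings = ocrModule.WasmAsposeOCRRecognitionSettings();
  // Copy only the provided overrides; all other settings keep their defaults.
  for (const [name, value] of Object.entries(overrides || {})) {
    settings[name] = value;
  }
  return ocrModule.AsposeOCRRecognize(content, settings);
}

// Stand-in for the real module (an assumption for this demo only).
// It echoes back what it received instead of recognizing anything.
const fakeModule = {
  WasmAsposeOCRInput: () => ({ url: "" }),
  WasmAsposeOCRInputs: class {
    constructor() { this.items = []; }
    push_back(item) { this.items.push(item); }
  },
  WasmAsposeOCRRecognitionSettings: () => ({ correct_skew: true, skew: 0.0 }),
  AsposeOCRRecognize: (content, settings) => ({
    files: content.items.map((item) => item.url),
    settings: settings,
  }),
};

const echoed = recognizeWithOverrides(fakeModule, "page.png", { skew: -90 });
console.log(echoed.files, echoed.settings);
```

With the real library, Module would be passed in place of fakeModule, and the overrides would use actual values such as Module.Language.UKR for language_alphabet.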