Defining the whitelist of characters

Restricting recognition to a subset of characters instead of the full set can greatly improve recognition accuracy, increase speed, and reduce resource consumption. A list of characters can also be identified automatically from an image using the built-in Aspose.OCR mechanisms.

You can define the list of characters the Aspose.OCR engine will look for by passing them as a case-sensitive string to the setAllowedCharacters method of the RecognitionSettings object.
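
Because the allowed characters are given as an ordinary string, the string can be assembled in code rather than typed by hand. The following is a minimal plain-Java sketch that does not depend on Aspose.OCR: the class WhitelistBuilder and its range method are hypothetical helpers introduced only for illustration, not part of the library.

```java
// Hypothetical helper, NOT part of Aspose.OCR: assembles a whitelist string.
final class WhitelistBuilder {

    // Returns every character from first to last, inclusive, in code point order.
    static String range(char first, char last) {
        if (first > last) {
            throw new IllegalArgumentException("first must not be greater than last");
        }
        StringBuilder sb = new StringBuilder();
        // An int counter avoids char overflow when last is '\uffff'.
        for (int c = first; c <= last; c++) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Uppercase Latin letters plus decimal digits, for example for serial numbers.
        String allowed = range('A', 'Z') + range('0', '9');
        System.out.println(allowed); // prints ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
    }
}
```

The resulting string would then be passed to setAllowedCharacters. Since the string is case-sensitive, lowercase letters have to be added explicitly (for example with range('a', 'z')) if they should be recognized as well.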

Alternatively, you can use one of the presets:

| Preset | Subset of characters |
| --- | --- |
| CharactersAllowedType.ALL | All characters. |
| CharactersAllowedType.LATIN_ALPHABET | Latin / English text (A to Z and a to z), without accented characters. |
| CharactersAllowedType.DIGITS | Binary, octal, decimal, or hexadecimal numbers (0-9 and A to F). |

Characters that do not match the provided list are ignored.
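
To make the case-sensitive matching concrete, here is a small plain-Java model of the effect on the output text. It is only an illustration under the assumption that non-matching characters simply do not appear in the result; it does not describe how the engine works internally, and the class WhitelistDemo and its keepAllowed method are hypothetical names, not part of Aspose.OCR.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, NOT part of Aspose.OCR: shows case-sensitive whitelist matching.
final class WhitelistDemo {

    // Keeps only the characters of text that occur in allowed (exact, case-sensitive match).
    static String keepAllowed(String text, String allowed) {
        Set<Character> allowedSet = new HashSet<>();
        for (char c : allowed.toCharArray()) {
            allowedSet.add(c);
        }
        StringBuilder kept = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (allowedSet.contains(c)) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // 'A' and '-' are dropped: the list contains lowercase 'a' only and no hyphen.
        System.out.println(keepAllowed("Abc-123", "abc123")); // prints bc123
    }
}
```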

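AsposeOCR api = new AsposeOCR();
// Restrict recognition to the digits preset
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setAllowedCharacters(CharactersAllowedType.DIGITS);
// Prepare batch ("filters" is a PreprocessingFilter assumed to be defined earlier)
OcrInput input = new OcrInput(InputType.SingleImage, filters);
input.add("source.png"); // placeholder file name
// Recognize images
ArrayList<RecognitionResult> results = api.Recognize(input, recognitionSettings);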