Defining the whitelist of characters

Restricting recognition to a subset of characters instead of the full set can greatly improve recognition accuracy, increase speed, and reduce resource consumption. The list of characters can also be identified automatically from an image using the built-in Aspose.OCR mechanisms.

You can define the list of characters that the Aspose.OCR engine will look for by passing them as a case-sensitive string to the setAllowedCharacters method of the RecognitionSettings object.

Alternatively, you can use one of the presets:

Preset	Subset of characters
CharactersAllowedType.ALL	All characters.
CharactersAllowedType.LATIN_ALPHABET	Latin / English text (`A` to `Z` and `a` to `z`), without accented characters.
CharactersAllowedType.DIGITS	Binary, octal, decimal, or hexadecimal numbers (`0` to `9` and `A` to `F`).

Characters that do not match the provided list are ignored.

import com.aspose.ocr.*;
import java.util.ArrayList;

AsposeOCR api = new AsposeOCR();
// Restrict recognition to the DIGITS preset
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setAllowedCharacters(CharactersAllowedType.DIGITS);
// Prepare batch
OcrInput images = new OcrInput(InputType.SingleImage);
images.add("image.png");
// Recognize images
ArrayList<RecognitionResult> results = api.Recognize(images, recognitionSettings);
// Print the text recognized from the first image
System.out.println(results.get(0).recognitionText);
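
To make the effect of a case-sensitive whitelist concrete, here is a minimal plain-Java sketch that does not use Aspose.OCR at all. The class WhitelistSketch and the helper keepAllowed are hypothetical names introduced for illustration. The sketch filters an already recognized string, whereas the engine most likely applies the restriction during recognition, so treat it as a model of the rule "characters not in the list are ignored" and not as the library's implementation.

```java
import java.util.HashSet;
import java.util.Set;

// Conceptual illustration only: this is NOT how Aspose.OCR filters characters internally.
final class WhitelistSketch {

    // Hypothetical helper: keeps only the characters of `text` that appear in `allowed`.
    // Matching is case-sensitive, so 'a' and 'A' are separate entries.
    static String keepAllowed(String text, String allowed) {
        Set<Character> whitelist = new HashSet<>();
        for (char c : allowed.toCharArray()) {
            whitelist.add(c);
        }
        StringBuilder kept = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (whitelist.contains(c)) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String raw = "Invoice No. A-1042f";
        System.out.println(keepAllowed(raw, "0123456789"));       // prints 1042
        System.out.println(keepAllowed(raw, "0123456789ABCDEF")); // prints A1042
    }
}
```

In the second call the lowercase `f` is dropped because only uppercase `A` to `F` are listed. A custom string passed to setAllowedCharacters should likewise contain both cases of every letter you expect to see.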
