Working with custom dictionaries

Recognized text may contain specialized terminology, abbreviations, and other words that are missing from common spelling dictionaries. The spelling corrector may then treat these words as errors and replace them incorrectly. To avoid this, you can provide your own word lists in addition to Aspose.OCR’s built-in dictionaries. Words listed in a custom dictionary are treated as correctly spelled.

For example, if the text contains the phrase “Helloo, world!”, it will be automatically corrected to “Hello, world!”. However, if you add the word “helloo” to the dictionary, the phrase will remain unchanged.
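
The effect of a custom entry can be illustrated with a toy corrector. This is a simplified sketch, not Aspose.OCR's actual algorithm: it assumes an unknown word is replaced with the most frequent known word that is at most one edit away, and the DictionaryLookupSketch class and its methods are hypothetical names used only for illustration:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy corrector; NOT the Aspose.OCR implementation.
public class DictionaryLookupSketch {
    private final Map<String, Long> frequencies = new HashMap<>();

    /** Registers a word (stored in lowercase) together with its frequency. */
    public void addWord(String word, long frequency) {
        frequencies.put(word.toLowerCase(Locale.ROOT), frequency);
    }

    /** Keeps known words unchanged; otherwise returns the most frequent word one edit away. */
    public String correct(String word) {
        String key = word.toLowerCase(Locale.ROOT);
        if (frequencies.containsKey(key)) {
            return word; // the word is in the dictionary, so it is considered correct
        }
        String best = word;
        long bestFrequency = -1;
        for (Map.Entry<String, Long> entry : frequencies.entrySet()) {
            if (withinOneEdit(key, entry.getKey()) && entry.getValue() > bestFrequency) {
                best = entry.getKey();
                bestFrequency = entry.getValue();
            }
        }
        return best; // unchanged if no candidate was found
    }

    /** Returns true if a and b differ by at most one insertion, deletion, or substitution. */
    static boolean withinOneEdit(String a, String b) {
        if (Math.abs(a.length() - b.length()) > 1) {
            return false;
        }
        int i = 0, j = 0, edits = 0;
        while (i < a.length() && j < b.length()) {
            if (a.charAt(i) == b.charAt(j)) {
                i++;
                j++;
                continue;
            }
            if (++edits > 1) {
                return false;
            }
            if (a.length() > b.length()) {
                i++;            // extra character in a
            } else if (a.length() < b.length()) {
                j++;            // extra character in b
            } else {
                i++;            // substitution
                j++;
            }
        }
        // A leftover trailing character counts as one more edit.
        return edits + (a.length() - i) + (b.length() - j) <= 1;
    }

    public static void main(String[] args) {
        DictionaryLookupSketch corrector = new DictionaryLookupSketch();
        corrector.addWord("hello", 50000);
        System.out.println(corrector.correct("helloo")); // prints "hello"
        corrector.addWord("helloo", 20000);              // the custom entry
        System.out.println(corrector.correct("helloo")); // prints "helloo"
    }
}
```

The sketch works on lowercase words only and does not restore the original capitalization of a corrected word.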

Dictionary file format

The user dictionary is provided as a UTF-8 encoded text file with Windows or Unix line endings. Each line contains one word in lowercase, followed by its frequency, which is separated from the word by a single space or tab:

helloo 20000
heloo 22000
helooo 19998
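
To catch formatting mistakes before passing a file to Aspose.OCR, you can check it against the format described above. The following is a minimal sketch that uses only the Java standard library. The UserDictionaryFormat class and its parse method are hypothetical helpers, not part of the Aspose.OCR API; skipping blank lines and requiring a non-negative integer frequency are assumptions of this sketch:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; not part of the Aspose.OCR API.
public class UserDictionaryFormat {

    /** Parses dictionary lines into a word-to-frequency map, rejecting malformed entries. */
    public static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> entries = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim(); // also drops a trailing '\r' left by Windows line endings
            if (line.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            // The word and its frequency are separated by a single space or tab.
            String[] parts = line.split("[ \t]", -1);
            if (parts.length != 2 || parts[0].isEmpty()) {
                throw new IllegalArgumentException("Expected '<word> <frequency>': " + raw);
            }
            String word = parts[0];
            if (!word.equals(word.toLowerCase(Locale.ROOT))) {
                throw new IllegalArgumentException("Word must be lowercase: " + word);
            }
            long frequency = Long.parseLong(parts[1]); // throws NumberFormatException if not an integer
            if (frequency < 0) {
                throw new IllegalArgumentException("Frequency must not be negative: " + raw);
            }
            entries.put(word, frequency);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Validate the file given as the first argument, or fall back to sample lines.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8)
                : Arrays.asList("helloo 20000", "heloo\t22000", "helooo 19998");
        System.out.println(parse(lines).size() + " valid entries");
    }
}
```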

To use a custom dictionary, pass the path to the dictionary file to the useUserDictionary method of the RecognitionResult class:

import com.aspose.ocr.*;
import java.util.ArrayList;

AsposeOCR api = new AsposeOCR();
// Add an image to OcrInput object
OcrInput input = new OcrInput(InputType.SingleImage);
input.add("source.png");
// Recognize image
ArrayList<RecognitionResult> results = api.Recognize(input);
// Use custom dictionary
results.get(0).useUserDictionary("dictionary.txt");
// Output corrected results
String correctedResult = results.get(0).getSpellCheckCorrectedText(SpellCheckLanguage.Eng);
System.out.println("Recognition result:\n" + correctedResult + "\n\n");