Recognition languages

Aspose.OCR for Java can recognize text in a large number of languages and all popular writing scripts, including text that mixes several languages.

To specify a language for recognition, pass one of the following values to the setLanguage method of the RecognitionSettings class:

Value Language
Language.None Extended Latin characters, including diacritics
Language.Latin Extended Latin characters, including diacritics
Language.Cyrillic Cyrillic characters
Language.Bel Belarusian
Language.Bul Bulgarian
Language.Chi Chinese (more than 6,000 characters)
Language.Cze Czech
Language.Dan Danish
Language.Deu German
Language.Dum Dutch
Language.Eng English
Language.Est Estonian
Language.Fin Finnish
Language.Fra French
Language.Hin Hindi
Language.Ita Italian
Language.Kaz Kazakh
Language.Lav Latvian
Language.Lit Lithuanian
Language.Nor Norwegian
Language.Pol Polish
Language.Por Portuguese
Language.Rum Romanian
Language.Rus Russian
Language.Slk Slovak
Language.Slv Slovene
Language.Spa Spanish
Language.Srp Serbian
Language.Srp_hrv Serbo-Croatian
Language.Swe Swedish
Language.Ukr Ukrainian

If no language is set, the OCR engine assumes that the text is written in extended Latin.


The following code sample demonstrates how to specify the recognition language:

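import com.aspose.ocr.AsposeOCR;
import com.aspose.ocr.Language;
import com.aspose.ocr.RecognitionResult;
import com.aspose.ocr.RecognitionSettings;

AsposeOCR api = new AsposeOCR();
RecognitionSettings recognitionSettings = new RecognitionSettings();
// Recognize Ukrainian text
recognitionSettings.setLanguage(Language.Ukr);
RecognitionResult result = api.RecognizePage("source.png", recognitionSettings);
System.out.println("Recognition result:\n" + result.recognitionText + "\n\n");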
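When the language comes from user input or a configuration file, the value passed to setLanguage has to be chosen at run time. The following is a minimal sketch of one way to do that, and it is not part of Aspose.OCR: the LanguageLookup class, its resolve method, and the choice of ISO 639-1 codes as input are all assumptions made for illustration. It returns the name of a constant from the table above (only a subset is mapped) and falls back to Latin, mirroring the default behavior described earlier.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Aspose.OCR.
public class LanguageLookup {
    // Assumed mapping from ISO 639-1 codes to the names of Language constants
    // listed in the table above. Only a subset is shown.
    private static final Map<String, String> CODE_TO_CONSTANT = Map.of(
        "en", "Eng",
        "de", "Deu",
        "fr", "Fra",
        "uk", "Ukr",
        "ru", "Rus",
        "zh", "Chi",
        "hi", "Hin"
    );

    // Returns the constant name for the given code, or "Latin" when the code
    // is null or not in the map.
    public static String resolve(String isoCode) {
        if (isoCode == null) {
            return "Latin";
        }
        String key = isoCode.trim().toLowerCase(Locale.ROOT);
        return CODE_TO_CONSTANT.getOrDefault(key, "Latin");
    }

    public static void main(String[] args) {
        System.out.println(resolve("uk"));
        System.out.println(resolve("xx"));
    }
}
```

Assuming Language is a standard Java enum, the returned name could then be converted with Language.valueOf(LanguageLookup.resolve(code)) and passed to setLanguage. Returning a plain string keeps the sketch compilable without the Aspose.OCR library on the classpath.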