Recognition languages

Aspose.OCR for Python via .NET can recognize text in a large number of languages and all popular writing scripts, including mixed-language texts.

To specify the recognition language, set the language property of the recognition settings to one of the following values:

Value | Alphabet
Language.MULTILANGUAGE, Aspose.OCR.Language.AUTO, Aspose.OCR.Language.UNIVERSAL | Automatically detects the language. Supports multiple language families, including Latin, Cyrillic, Arabic, Chinese and more.
Language.EXT_LATIN | Auto-detect all supported Latin characters and diacritics
Language.CYRILLIC | Auto-detect all supported Cyrillic characters
Language.CHINESE | All Chinese languages. Mixed-language Chinese/English texts also supported.
Language.DEVANAGARI, Language.INDIC | Indic texts based on Devanagari script, including mixed Devanagari/English texts.
Language.EUROPEAN | Mixed-language Cyrillic/English texts (experimental).
Language.PERSO_ARABIC, Language.ISLAMIC | Mixed-language texts with Arabic, Persian and English.

Language.AFR | Afrikaans
Language.ALN | Albanian
Language.ARA | Arabic, including texts in mixed Arabic/English
Language.AWA | Awadhi
Language.AZB | Azerbaijani (Azeri)
Language.BCL | Bikol
Language.BEL | Belarusan (Belorussian)
Language.BEM | Bemba (Chibemba)
Language.BEW | Betawi
Language.BGC | Haryanvi
Language.BHO | Bhojpuri
Language.BHR | Malagasy
Language.BJJ | Kanauji
Language.BOS | Bosnian
Language.BUL | Bulgarian
Language.CAT | Catalan
Language.CCX | Zhuang
Language.CDO | Min Dong
Language.CEB | Cebuano
Language.CES | Czech
Language.CHE | Chechen
Language.CMN | Mandarin (Chinese)
Language.CPX | Pu-Xian
Language.DAN | Danish
Language.DEU | German
Language.DHD | Dhundari
Language.DIQ | Dimli
Language.DOC | Dong
Language.ENG | English
Language.EST | Estonian
Language.FIN | Finnish
Language.FRA | French
Language.GAN | Gan
Language.GAX | Oromo
Language.GBM | Garhwali
Language.GLG | Galician
Language.GLK | Gilaki
Language.GUZ | Gusii
Language.HAK | Hakka
Language.HAU | Hausa
Language.HBS | Serbo-Croatian (Latin)
Language.HIL | Hiligaynon
Language.HIN | Hindi
Language.HMN | Hmong
Language.HNE | Chattisgarhi (Laria, Khaltahi)
Language.HRV | Croatian
Language.HSN | Xiang
Language.HUN | Hungarian (Magyar)
Language.ILO | Ilocano
Language.IND | Indonesian
Language.ITA | Italian
Language.JPN | Japanese (mixed texts in Japanese and English are also supported)
Language.KAN | Mixed-language Kannada/English texts.
Language.KAZ | Kazakh
Language.KBD | Kabardian
Language.KFY | Kumauni
Language.KIN | Rwanda
Language.KLN | Nandi
Language.KMR | Kurdish (Kurmanji)
Language.KNC | Kanuri
Language.KNN | Konkani
Language.KON | Kikongo
Language.KOR | Korean (mixed texts in Korean and English are also supported)
Language.LATIN | Latin
Language.LAV | Latvian
Language.LIT | Lithuanian
Language.LMN | Lamani (Lambadi)
Language.LNC | Occitan
Language.LUO | Luo
Language.MAG | Magahi
Language.MAI | Maithili
Language.MAK | Makassar (Makasar)
Language.MAR | Marathi
Language.MER | Meru
Language.MIN | Minangkabau
Language.MLY | Malay (Melayu)
Language.MNP | Min Bei
Language.MON | Mongolian
Language.MTQ | Muong
Language.MTR | Mewari
Language.MUI | Musi
Language.MUP | Malvi
Language.NAN | Min Nan
Language.NBL | Ndebele
Language.NDS | Low German
Language.NEP | Nepali
Language.NLD | Dutch
Language.NOR | Norwegian
Language.NSO | Sotho (Northern)
Language.NYA | Chichewa (Chewa, Nyanja)
Language.PAG | Pangasinan
Language.PAM | Kapampangan
Language.PCC | Bouyei (Buyi, Giáy)
Language.PES | Persian (Farsi), including texts in mixed Persian/English
Language.PLM | Palembang
Language.POL | Polish
Language.POR | Portuguese
Language.QUC | K'iche'
Language.QXA | Quechua
Language.RJB | Rajbanshi
Language.RON | Romanian
Language.RUF | Luguru
Language.RUS | Russian
Language.RWR | Marwari
Language.SAS | Sasak
Language.SLK | Slovak
Language.SLV | Slovene (Slovenian)
Language.SNA | Shona (Karanga)
Language.SOM | Somali
Language.SOT | Sotho (Southern)
Language.SPA | Spanish
Language.SRP | Serbian (Cyrillic)
Language.SRR | Serer-Sine
Language.SSW | Swati (Swazi)
Language.SUK | Sukuma
Language.SUN | Sundanese (Sunda)
Language.SWE | Swedish
Language.SWH | Swahili
Language.TAM | Mixed-language Tamil/English texts.
Language.TEL | Mixed-language Telugu/English texts.
Language.TGL | Tagalog (Pilipino)
Language.TOI | Tonga
Language.TSN | Tswana
Language.TSO | Tsonga
Language.TUK | Turkmen
Language.TUM | Tumbuka
Language.TUR | Turkish
Language.UIG | Uyghur, including texts in mixed Uyghur/English
Language.UKR | Ukrainian
Language.UMB | Umbundu
Language.URD | Urdu, including texts in mixed Urdu/English
Language.VIE | Vietnamese
Language.VMW | Makua (Makhuwa)
Language.WAL | Wolaytta
Language.WAR | Waray-Waray
Language.WBR | Wagdi
Language.WTM | Mewati
Language.WUU | Wu (Changzhou)
Language.XHO | Xhosa
Language.YAO | Yao
Language.YOR | Yoruba
Language.YUE | Cantonese
Language.ZUL | Zulu

If the language property is not set, the OCR engine assumes that the text is written in extended Latin.
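When the language comes from user input or a configuration file, it usually arrives as a string such as "ukr" rather than as an enumeration value. The following is a minimal sketch of one way to map such a string to a value from the list above and fall back to extended Latin when nothing is provided. The resolve_language helper is hypothetical and not part of Aspose.OCR, and the Language class here is a small stand-in with a few values from the list, so that the sketch runs without the library installed.

```python
from enum import Enum, auto


class Language(Enum):
    """Stand-in for the library's Language values (hypothetical subset)."""
    EXT_LATIN = auto()
    CYRILLIC = auto()
    ENG = auto()
    DEU = auto()
    UKR = auto()


def resolve_language(code, languages=Language, default="EXT_LATIN"):
    """Map a code such as "ukr" to the matching attribute of `languages`.

    Hypothetical helper, not part of Aspose.OCR. Returns the `default`
    value when `code` is None or empty, mirroring the extended Latin
    fallback. Raises ValueError for an unknown code instead of silently
    picking a language.
    """
    if not code:
        return getattr(languages, default)
    name = code.strip().upper()
    value = getattr(languages, name, None)
    if value is None:
        raise ValueError(f"Unsupported recognition language: {code!r}")
    return value


print(resolve_language("ukr").name)  # UKR
print(resolve_language(None).name)   # EXT_LATIN
```

With the real library, the idea would be to pass its Language type as the languages argument and assign the result to the language property of the recognition settings. This assumes that the type exposes its values as attributes, as the Language.UKR usage in this article suggests.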

Example

The following code sample demonstrates how to specify the recognition language:

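# Assumes the aspose-ocr-python-net package, which exposes the API as aspose.ocr
from aspose.ocr import AsposeOcr, InputType, Language, OcrInput, RecognitionSettings

# Instantiate Aspose.OCR API
api = AsposeOcr()
# Add image to the recognition batch
# (named ocr_input so that the built-in input() used at the end is not shadowed)
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("source.png")
# Recognize Ukrainian text
recognition_settings = RecognitionSettings()
recognition_settings.language = Language.UKR
# Recognize the image
result = api.recognize(ocr_input, recognition_settings)
# Print recognition result
print(result[0].recognition_text)
# Keep the console window open until the user presses Enter
input("Press Enter to continue...")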