Recognition languages

Contents
[ ]

Aspose.OCR for Python via .NET can recognize a text in a large number of languages and all popular writing scripts, including texts with mixed languages.

To specify a language for recognition, provide one of the following values in language property of recognition settings:

Value Alphabet
Language.EXT_LATIN Auto-detect all supported Latin characters and diacritics
Language.CYRILLIC Auto-detect all supported Cyrillic characters
Language.CHINESE All Chinese languages. Mixed-language Chinese/English texts also supported.
Language.DEVANAGARI
Language.INDIC
Indic texts based on Devanagari script, including mixed Devanagari/English texts.
Language.EUROPEAN Mixed-language Cyrillic/English texts (experimental).
Language.AFR Afrikaans
Language.ALN Albanian
Language.ARA Arabic, including texts in mixed Arabic/English
Language.AWA Awadhi
Language.AZB Azerbaijani (Azeri)
Language.BCL Bikol
Language.BEL Belarusan (Belorussian)
Language.BEM Bemba (Chibemba)
Language.BEW Betawi
Language.BGC Haryanvi
Language.BHO Bhojpuri
Language.BHR Malagasy
Language.BJJ Kanauji
Language.BOS Bosnian
Language.BUL Bulgarian
Language.CAT Catalan
Language.CCX Zhuang
Language.CDO Min Dong
Language.CEB Cebuano
Language.CES Czech
Language.CHE Chechen
Language.CMN Mandarin (Chinese)
Language.CPX Pu-Xian
Language.DAN Danish
Language.DEU German
Language.DHD Dhundari
Language.DIQ Dimli
Language.DOC Dong
Language.ENG English
Language.EST Estonian
Language.FIN Finnish
Language.FRA French
Language.GAN Gan
Language.GAX Oromo
Language.GBM Garhwali
Language.GLG Galician
Language.GLK Gilaki
Language.GUZ Gusii
Language.HAK Hakka
Language.HAU Hausa
Language.HBS Serbo-Croatian (Latin)
Language.HIL Hiligaynon
Language.HIN Hindi
Language.HMN Hmong
Language.HNE Chattisgarhi (Laria, Khaltahi)
Language.HRV Croatian
Language.HSN Xiang
Language.HUN Hungarian (Magyar)
Language.ILO Ilocano
Language.IND Indonesian
Language.ITA Italian
Language.JPN Japanese (mixed texts in Japanese and English are also supported)
Language.KAN Mixed-language Kannada/English texts.
Language.KAZ Kazakh
Language.KBD Kabardian
Language.KFY Kumauni
Language.KIN Rwanda
Language.KLN Nandi
Language.KMR Kurdish (Kurmanji)
Language.KNC Kanuri
Language.KNN Konkani
Language.KON Kikongo
Language.KOR Korean (mixed texts in Korean and English are also supported)
Language.LATIN Latin
Language.LAV Latvian
Language.LIT Lithuanian
Language.LMN Lamani (Lambadi)
Language.LNC Occitan
Language.LUO Luo
Language.MAG Magahi
Language.MAI Maithili
Language.MAK Makassar (Makasar)
Language.MAR Marathi
Language.MER Meru
Language.MIN Minangkabau
Language.MLY Malay (Melayu)
Language.MNP Min Bei
Language.MON Mongolian
Language.MTQ Muong
Language.MTR Mewari
Language.MUI Musi
Language.MUP Malvi
Language.NAN Min Nan
Language.NBL Ndebele
Language.NDS Low German
Language.NEP Nepali
Language.NLD Dutch
Language.NOR Norwegian
Language.NSO Sotho (Northern)
Language.NYA Chichewa (Chewa, Nyanja)
Language.PAG Pangasinan
Language.PAM Kapampangan
Language.PCC Bouyei (Buyi, Giáy)
Language.PES Persian (Farsi), including texts in mixed Persian/English
Language.PLM Palembang
Language.POL Polish
Language.POR Portuguese
Language.QUC K’iche'
Language.QXA Quechua
Language.RJB Rajbanshi
Language.RON Romanian
Language.RUF Luguru
Language.RUS Russian
Language.RWR Marwari
Language.SAS Sasak
Language.SLK Slovak
Language.SLV Slovene (Slovenian)
Language.SNA Shona (Karanga)
Language.SOM Somali
Language.SOT Sotho (Southern)
Language.SPA Spanish
Language.SRP Serbian (Cyrillic)
Language.SRR Serer-Sine
Language.SSW Swati (Swazi)
Language.SUK Sukuma
Language.SUN Sundanese (Sunda)
Language.SWE Swedish
Language.SWH Swahili
Language.TAM Mixed-language Tamil/English texts.
Language.TEL Mixed-language Telugu/English texts.
Language.TGL Tagalog (Pilipino)
Language.TOI Tonga
Language.TSN Tswana
Language.TSO Tsonga
Language.TUK Turkmen
Language.TUM Tumbuka
Language.TUR Turkish
Language.UIG Uyghur, including texts in mixed Uyghur/English
Language.UKR Ukrainian
Language.UMB Umbundu
Language.URD Urdu, including texts in mixed Urdu/English
Language.VIE Vietnamese
Language.VMW Makua (Makhuwa)
Language.WAL Wolaytta
Language.WAR Waray-Waray
Language.WBR Wagdi
Language.WTM Mewati
Language.WUU Wu (Changzhou)
Language.XHO Xhosa
Language.YAO Yao
Language.YOR Yoruba
Language.YUE Cantonese
Language.ZUL Zulu

If this parameter is omitted, the OCR engine will assume that the text is written in extended Latin.

Example

The following code sample demonstrates how to specify the recognition language:

# Instantiate Aspose.OCR API
api = AsposeOcr()
# Add image to the recognition batch
input = OcrInput(InputType.SINGLE_IMAGE)
input.add("source.png")
# Recognize Ukrainian text
recognitionSettings = RecognitionSettings()
recognitionSettings.language = Language.UKR
# Recognize the image
result = api.recognize(input, recognitionSettings)
# Print recognition result
print(result[0].recognition_text)
input("Press Enter to continue...")