Language detection

Aspose.OCR for Java can automatically identify the following languages and language families in the provided image using the DetectLanguages method:

  • Latin
  • Cyrillic
  • Arabic
  • Chinese
  • Japanese
  • Korean
  • Hindi
  • Tamil
  • Telugu
  • Kannada

The method accepts a collection of images in any of the supported formats and returns the detection results as a list of LanguageDetectionOutput objects with the following properties:

| Property | Type | Description |
|----------|------|-------------|
| source | String | The full path or URL of the source file. If the file is provided as a BufferedImage, InputStream, an array of pixels, or a Base64 string, this value will be empty. |
| page | int | Page number. When working with single-page images, this value is always 0. |
| languages | List<Map.Entry<Language, Float>> | The recognition languages detected in the image, each paired with its probability. |

Example

The following code sample demonstrates how to detect the languages in an image:

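import com.aspose.ocr.*;
import java.util.Map;

// Create the OCR engine and add the image to a single-image input batch
AsposeOCR api = new AsposeOCR();
OcrInput input = new OcrInput(InputType.SingleImage);
input.add("source.png");
// Detect the languages and take the result for the first (only) image
LanguageDetectionOutput result = api.DetectLanguages(input).get(0);
System.out.println("File: " + result.source);
System.out.println("Page: " + result.page);
// Each entry pairs a detected language with its probability
for (Map.Entry<Language, Float> l : result.languages)
    System.out.println("Language: " + l.getKey() + " : " + l.getValue());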
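
Because languages is a list of language and probability pairs, a common follow-up is to pick the single most probable entry. The sketch below shows one possible way to do that with the standard Java streams API. It does not call Aspose.OCR: the Lang enum is a hypothetical stand-in for the library's Language type, the mostLikely helper is not part of the API, and the probabilities are made-up sample values used only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class TopLanguage {
    // Hypothetical stand-in for the library's Language type,
    // so that this sketch compiles without Aspose.OCR on the classpath.
    enum Lang { LATIN, CYRILLIC, ARABIC }

    // Hypothetical helper: returns the entry with the highest probability,
    // or an empty Optional when nothing was detected.
    static <L> Optional<Map.Entry<L, Float>> mostLikely(List<Map.Entry<L, Float>> languages) {
        return languages.stream().max(Map.Entry.comparingByValue());
    }

    public static void main(String[] args) {
        // Made-up sample data shaped like the languages property.
        List<Map.Entry<Lang, Float>> detected = List.of(
                Map.entry(Lang.LATIN, 0.82f),
                Map.entry(Lang.CYRILLIC, 0.15f),
                Map.entry(Lang.ARABIC, 0.03f));

        mostLikely(detected).ifPresent(e ->
                System.out.println("Most likely: " + e.getKey() + " (" + e.getValue() + ")"));
    }
}
```

With a real detection result, the same helper could presumably be applied to result.languages directly, since it only relies on the List<Map.Entry<Language, Float>> shape described in the table above.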