Supported file formats
Recognized image formats
Aspose.OCR for Python via Java can recognize text in files produced by a scanner or camera, in the following formats:
Extension | Details |
---|---|
.PDF | Portable Document Format. |
.JPG | JPEG, the most popular format for smartphone photos. |
.PNG | Portable Network Graphics, 24-bit with transparency. |
.TIFF or .TIF | Tag Image File Format, commonly used for high-quality scanning. Multi-page TIFF images are fully supported. |
.GIF | Graphics Interchange Format, limited to 256 colors. |
.BMP | Bitmap image file. |
.DJVU | DjVu, primarily designed for scanned documents, containing a combination of text, line drawings, indexed color images, and photographs. |
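Before passing files to the library, it can be handy to filter them by extension. The following is a minimal sketch using only the Python standard library. It does not call Aspose.OCR itself, and the names `SUPPORTED_INPUT_EXTENSIONS` and `is_supported_input` are hypothetical helpers introduced for illustration. The set contains only the extensions listed in the table above.

```python
from pathlib import Path

# Extensions copied from the table of recognized formats (hypothetical helper set).
SUPPORTED_INPUT_EXTENSIONS = {
    ".pdf", ".jpg", ".png", ".tiff", ".tif", ".gif", ".bmp", ".djvu",
}


def is_supported_input(path: str) -> bool:
    """Return True if the file extension appears in the table above.

    The comparison is case-insensitive, so "SCAN.TIF" and "scan.tif"
    are treated the same way.
    """
    return Path(path).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS
```

For example, `is_supported_input("scan.TIF")` returns `True`, while `is_supported_input("notes.docx")` returns `False`. This is only an extension check; it does not verify the actual file contents.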
Additional recognition options
- You can recognize all files of the above-mentioned formats in a folder. The number of files is unlimited; however, subfolders are not processed.
- You can recognize all files of the above-mentioned formats in a ZIP archive. The number of files is unlimited, but the library does not process nested folders and archives.
- Aspose.OCR for Python via Java can read an image from a public URL that points directly to the file. However, it cannot extract images from HTML pages and does not support authentication.
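The constraints above can be checked in advance to predict which files would be picked up. The sketch below uses only the Python standard library and does not call Aspose.OCR; all function names are hypothetical. The URL check is a rough heuristic based on the extension, not the library's own logic.

```python
import zipfile
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

# Extensions from the table of recognized formats (hypothetical helper set).
IMAGE_EXTENSIONS = {".pdf", ".jpg", ".png", ".tiff", ".tif", ".gif", ".bmp", ".djvu"}


def files_in_folder(folder: str) -> list[str]:
    """List supported files directly inside the folder; subfolders are skipped."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def files_in_zip(archive: str) -> list[str]:
    """List supported files at the top level of a ZIP archive.

    Entries inside nested folders are skipped, and nested archives are
    excluded because ".zip" is not a supported image extension.
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            info.filename
            for info in zf.infolist()
            if not info.is_dir()
            and "/" not in info.filename
            and PurePosixPath(info.filename).suffix.lower() in IMAGE_EXTENSIONS
        )


def is_direct_file_url(url: str) -> bool:
    """Heuristic: HTTP(S) URL without credentials whose path ends in a supported extension."""
    parts = urlparse(url)
    return (
        parts.scheme in {"http", "https"}
        and parts.username is None
        and PurePosixPath(parts.path).suffix.lower() in IMAGE_EXTENSIONS
    )
```

A URL such as `https://example.com/page.html` fails the last check, which matches the rule that images are not extracted from HTML pages. A direct link served without a file extension would also fail, so treat the result as a hint rather than a guarantee.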
Recognition results
Recognition results are returned in the most popular document and data exchange formats:
Format | Details |
---|---|
.TXT | Plain text |
.HTML | Web page |
.RTF | A universal format for exchanging rich text documents between different word processing programs |
.DOCX | Microsoft Word document |
.XLSX | Microsoft Excel spreadsheet |
.PDF | Portable Document Format |
.EPUB | Popular e-book file format |
JSON | A popular open-standard format, widely used in software development and data exchange |
XML | Extensible Markup Language, a universal format for most systems |
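When saving results next to the source image, the output file name can be derived from the chosen format. This is a minimal standard-library sketch; `OUTPUT_EXTENSIONS` and `output_path` are hypothetical names, and the code only builds a path. It does not call the Aspose.OCR save API.

```python
from pathlib import Path

# Output formats from the table above, written as file extensions (hypothetical helper set).
OUTPUT_EXTENSIONS = {
    ".txt", ".html", ".rtf", ".docx", ".xlsx", ".pdf", ".epub", ".json", ".xml",
}


def output_path(source: str, extension: str) -> Path:
    """Build the result file path by swapping the source extension.

    Accepts the extension with or without a leading dot, in any case,
    and raises ValueError for formats not listed in the table.
    """
    ext = extension.lower()
    if not ext.startswith("."):
        ext = "." + ext
    if ext not in OUTPUT_EXTENSIONS:
        raise ValueError(f"Unsupported result format: {extension}")
    return Path(source).with_suffix(ext)
```

For instance, `output_path("scans/invoice.tiff", "DOCX")` yields a path named `invoice.docx` in the same folder as the source.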