Document areas detection

A scanned image or photograph of a text document may contain many blocks of different content: text paragraphs, tables, illustrations, formulas, and the like. Detecting, ordering, and classifying the areas of interest on a page is the cornerstone of accurate OCR. This process is called document areas detection.


Aspose.OCR offers several document areas detection algorithms, allowing you to choose the one that works best for your specific content.

You can manually override the default document areas detection algorithm if you are unhappy with the results or get unwanted artifacts. The document structure analysis algorithm is specified in the optional detect_areas_mode parameter of the recognition settings.

// Provide the image
std::string file = "source.png";
AsposeOCRInput source;
source.url = file.c_str();
std::vector<AsposeOCRInput> content = { source };
// Fine-tune recognition
RecognitionSettings settings;
settings.detect_areas_mode = detect_areas_mode_enum::CURVED_TEXT;
// Extract text from the image
auto result = asposeocr_recognize(content.data(), content.size(), settings);
// Output the recognized text
size_t buffer_size = 0;
wchar_t* buffer = asposeocr_serialize_result(result, buffer_size, export_format::text);
std::wcout << std::wstring(buffer) << std::endl;
// Release the resources
asposeocr_free_result(result);

Aspose.OCR for C++ supports the following document structure analysis algorithms, provided in the detect_areas_mode_enum enumeration:

| Name | Value | Description | Use cases |
| --- | --- | --- | --- |
| detect_areas_mode_enum::NONE | 0 | Do not analyze document structure. Never disable automatic document areas detection when working with multi-paragraph and multi-column documents, tables, or photos. This can significantly reduce recognition accuracy. | Simple images containing a few lines of text without illustrations or formatting; applications requiring maximum recognition speed; web applications |
| detect_areas_mode_enum::DOCUMENT | 1 | Detects large blocks of text, such as paragraphs and columns. Optimal for multi-column documents with illustrations. See detect_areas_mode_enum::DOCUMENT for additional details. | Contracts; books; articles; newspapers; high-quality scans |
| detect_areas_mode_enum::PHOTO | 2 | Finds small text blocks inside complex images. See detect_areas_mode_enum::PHOTO for additional details. | Driver’s licenses; social security cards; government and work IDs; visas; photos; screenshots; advertisements |
| detect_areas_mode_enum::MIXED_TEXT | 3 | The combination of detect_areas_mode_enum::DOCUMENT and detect_areas_mode_enum::PHOTO. See detect_areas_mode_enum::MIXED_TEXT for additional details. | Posters; billboards; datasheets; random photos; batch recognition |
| detect_areas_mode_enum::TABLE | 4 | Detects cells in tabular structures. See detect_areas_mode_enum::TABLE for additional details. | Tables; invoices |
| detect_areas_mode_enum::CURVED_TEXT | 5 | Auto-straightens curved lines and finds text blocks inside the resulting image. See detect_areas_mode_enum::CURVED_TEXT for additional details. | Photos of books, magazine articles, and other curved pages |
| detect_areas_mode_enum::UNIVERSAL | 6 | Optimal choice for general image processing. However, specialized algorithms can provide faster or more accurate results for their intended use cases. See detect_areas_mode_enum::UNIVERSAL for additional details. | On average, this algorithm achieves good results with most image types. |
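
The use cases listed above can be turned into a simple lookup when the kind of content is known ahead of time. The following is a minimal sketch, not part of the library: DetectAreasMode is a local stand-in that copies the names and values of detect_areas_mode_enum so the example compiles without the Aspose.OCR header, and ContentKind, suggest_mode, and check_suggestions are hypothetical names invented for illustration.

```cpp
#include <cassert>

// Stand-in for the library's detect_areas_mode_enum (assumption: names and
// values copied from the documented enumeration). Real code would use the
// enumeration from the Aspose.OCR header instead.
enum class DetectAreasMode {
    NONE = 0,
    DOCUMENT = 1,
    PHOTO = 2,
    MIXED_TEXT = 3,
    TABLE = 4,
    CURVED_TEXT = 5,
    UNIVERSAL = 6
};

// Hypothetical content categories, loosely following the documented use cases.
enum class ContentKind {
    FewPlainLines,       // a few lines of text, no illustrations or formatting
    ScannedDocument,     // contracts, books, articles, newspapers
    IdOrScreenshot,      // IDs, visas, screenshots, advertisements
    PosterOrMixedBatch,  // posters, datasheets, mixed batches
    TableOrInvoice,      // tables, invoices
    CurvedPagePhoto,     // photos of books and other curved pages
    Unknown              // nothing is known about the image
};

// Suggests a detection mode for a content category.
// Anything unrecognized falls back to the general-purpose mode.
inline DetectAreasMode suggest_mode(ContentKind kind) {
    switch (kind) {
        case ContentKind::FewPlainLines:      return DetectAreasMode::NONE;
        case ContentKind::ScannedDocument:    return DetectAreasMode::DOCUMENT;
        case ContentKind::IdOrScreenshot:     return DetectAreasMode::PHOTO;
        case ContentKind::PosterOrMixedBatch: return DetectAreasMode::MIXED_TEXT;
        case ContentKind::TableOrInvoice:     return DetectAreasMode::TABLE;
        case ContentKind::CurvedPagePhoto:    return DetectAreasMode::CURVED_TEXT;
        case ContentKind::Unknown:            return DetectAreasMode::UNIVERSAL;
    }
    return DetectAreasMode::UNIVERSAL;  // reached only for out-of-range values
}

// Small usage example: checks two of the mappings.
inline void check_suggestions() {
    assert(suggest_mode(ContentKind::ScannedDocument) == DetectAreasMode::DOCUMENT);
    assert(suggest_mode(ContentKind::Unknown) == DetectAreasMode::UNIVERSAL);
}
```

In an application that links against Aspose.OCR, the stand-in enumeration would be dropped and the chosen value assigned to settings.detect_areas_mode directly. Treat the mapping as a starting point: the right mode for a given image is best confirmed by comparing results.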

Performance impact

| Pipeline | Time |
| --- | --- |
| detect_areas_mode_enum::PHOTO | 2.9 seconds |
| detect_areas_mode_enum::PHOTO + Recognition | 4.7 seconds |
| detect_areas_mode_enum::CURVED_TEXT + Recognition | 8.5 seconds |
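
Timings like these depend on the image and the hardware, so it can be worth measuring the modes on your own content. Below is a minimal timing sketch built only on the C++ standard library; average_seconds is a hypothetical helper name, and it is not how the figures above were obtained.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `stage` `runs` times and returns the average wall-clock time
// of a single run, in seconds.
inline double average_seconds(const std::function<void()>& stage, int runs = 3) {
    assert(runs > 0 && "at least one run is required");
    using clock = std::chrono::steady_clock;  // monotonic, unaffected by system clock changes
    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) {
        stage();
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / runs;
}
```

To compare modes, one option is to wrap the asposeocr_recognize call and the matching asposeocr_free_result call in a lambda, pass it to the helper once per detect_areas_mode value, and compare the returned averages.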