Preparing content for recognition

Aspose.OCR for C++ provides a standardized way to prepare your content for OCR. Each image, document, folder, or URL is described by an AsposeOCRInput structure. Multiple AsposeOCRInput structures are collected in a std::vector sequence container, which lets you process a single image or a large number of images (for example, pages from an auto-feed scanner) with a single API call.
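As a minimal sketch of the batching idea, the code below wraps a list of file paths into a std::vector of input structures. To keep it compilable on its own, it declares a local stand-in for AsposeOCRInput that models only the url member; the member type (const char*) and the make_batch helper are assumptions made for illustration, not declarations taken from the library header.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the library's AsposeOCRInput so this sketch compiles alone.
// Only the url member is modelled; its type is an assumption.
struct AsposeOCRInput {
    const char* url = nullptr;
};

// Hypothetical helper: creates one AsposeOCRInput per path.
// The structures point into `paths`, so `paths` must outlive the batch.
std::vector<AsposeOCRInput> make_batch(const std::vector<std::string>& paths) {
    std::vector<AsposeOCRInput> batch;
    batch.reserve(paths.size());
    for (const std::string& path : paths) {
        assert(!path.empty());  // an empty path cannot identify a source
        AsposeOCRInput input;
        input.url = path.c_str();
        batch.push_back(input);
    }
    return batch;
}
```

The resulting vector is what would be handed to the recognition call. Because url stores a raw pointer in this sketch, the strings that own the paths have to stay alive until recognition has finished.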

Supported content types

Depending on the content type, you should set different AsposeOCRInput structure members:

Content type Required structure members
File (of any supported format)
  • url - provide an absolute or relative path to the source file.
Directory
  • url - provide an absolute or relative path to the directory with images and PDF documents. Sub-directories will be ignored.
URL
  • url - provide a public URL to an image or PDF document.
Raw image data
  • raw_data - provide an image as raw bytes or pixels.
  • width - image width, in pixels (only required when providing an image as a pixel array).
  • height - image height, in pixels (only required when providing an image as a pixel array).
  • raw_data_size - size of the raw_data member.
  • raw_data_type - image color model (when providing an image as a pixel array) or file format (when providing an image file as a byte array).
Recognition settings
  • special_settings - provide a pointer to a RecognitionSettings structure. If the pointer is not NULL, these content-specific recognition settings are used instead of the recognition settings passed to the asposeocr_recognize() function.
    This is useful when you want to change the recognition language or adjust the processing options for a single image in the set.
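The raw image members and the special_settings override can be sketched as follows. This is an illustration under stated assumptions, not the library's declarations: the two structures are local stand-ins, every member type is a guess, the RawDataKind enumeration and the language field are hypothetical, and raw_data_size is assumed to be a byte count. The helper functions from_rgb_pixels and effective_settings are likewise invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins so the sketch compiles alone; member types are assumptions.
struct RecognitionSettings {
    std::string language = "eng";  // hypothetical field
};

enum class RawDataKind { RgbPixels, FileBytes };  // hypothetical values

struct AsposeOCRInput {
    const char* url = nullptr;
    const void* raw_data = nullptr;
    int width = 0;
    int height = 0;
    std::size_t raw_data_size = 0;  // assumed to be in bytes
    RawDataKind raw_data_type = RawDataKind::FileBytes;
    const RecognitionSettings* special_settings = nullptr;
};

// Describes a tightly packed 8-bit RGB buffer (3 bytes per pixel).
// The buffer is not copied, so `pixels` must outlive the returned input.
AsposeOCRInput from_rgb_pixels(const std::vector<std::uint8_t>& pixels,
                               int width, int height) {
    assert(width > 0 && height > 0);
    assert(pixels.size() == static_cast<std::size_t>(width) * height * 3);
    AsposeOCRInput input;
    input.raw_data = pixels.data();
    input.width = width;
    input.height = height;
    input.raw_data_size = pixels.size();
    input.raw_data_type = RawDataKind::RgbPixels;
    return input;
}

// Mirrors the rule described above: a non-null special_settings
// takes precedence over the settings shared by the whole batch.
const RecognitionSettings& effective_settings(const AsposeOCRInput& input,
                                              const RecognitionSettings& defaults) {
    return input.special_settings != nullptr ? *input.special_settings : defaults;
}
```

For a pixel array, width and height are needed because the buffer itself carries no dimensions; for an encoded file passed as bytes, the format header already contains them, which is why the table marks those two members as conditional.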

Examples

The following code samples demonstrate how to prepare content for recognition: