Image preprocessing
The accuracy and reliability of text recognition depend heavily on the quality of the original image. Aspose.OCR offers a large number of fully automated and manual image processing filters that enhance an image before it is sent to the OCR engine.
Each preprocessing filter increases the image processing time. The approximate additional time required for preprocessing (as a percentage of the minimum image processing pipeline) is listed in the Performance impact column.
Filter | Action | Performance impact | Usage scenarios |
---|---|---|---|
Skew correction | Automatically straighten images aligned at a slight angle to the horizontal. | 12% | Skewed images |
Rotation | Manually rotate severely skewed images. | 7.5% | Rotated images |
Noise removal | Automatically remove dirt, spots, scratches, glare, unwanted gradients, and other noise from photos and scans. | 175% extra time, 38% more memory (1) | Photos, old books, newspapers, postcards, documents with stains and dirt |
Contrast correction | Automatically adjust the image contrast. | 7.5% | Photos, old papers, text on a background |
Resizing | Proportionally scale images up or down, or manually define the width and height of the image. | up to 100% (2) | Medication guides, food labels, full-sized photos from modern cameras and smartphones, scanned images at very high DPI |
Binarization | Convert images to black and white automatically, or manually adjust the criteria that determine whether a pixel is considered black or white. | 0.9% | Always used for text detection and most automatic image corrections |
Conversion to grayscale | Discard color information from images and leave only shades of gray. | 0.5% | Photos, scanned ID cards, full-color scans |
Color inversion | Swap image colors so that light areas appear dark and dark areas appear light. | 0.25% | White text on black background, advertisements, business cards, screenshots |
Dilation | Increase the thickness of characters in an image by adding pixels to the edges of high-contrast objects, such as letters. | 3.1% | Receipts, printouts with very thin font |
Median filter | Blur noisy images while preserving the edges of high-contrast objects like letters. | 6.25% | Photos taken in low light conditions, poor quality printouts, highly compressed JPEGs |
Notes
- (1) Automatic noise removal uses a powerful artificial intelligence algorithm that consumes significant computing resources and RAM. Use it with care, especially when developing public websites and mobile apps.
- (2) Resizing takes between 6% and 100% more time than the minimum processing pipeline, depending on the original image size.
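The figures in the Performance impact column can be combined into a rough estimate for a chain of filters. The sketch below is illustrative only: `FilterCost`, `estimate_extra_time_percent` and `example_chain` are hypothetical helpers, not part of Aspose.OCR, and it assumes that per-filter overheads simply add up, which the table does not guarantee.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Approximate extra processing time of one filter, in percent of the
// minimum pipeline, copied by hand from the "Performance impact" column.
struct FilterCost {
    std::string name;
    double extra_time_percent;
};

// Sum the per-filter overheads of a chain.
// Assumption: the costs are additive. This is an approximation only.
double estimate_extra_time_percent(const std::vector<FilterCost>& chain) {
    double total = 0.0;
    for (const FilterCost& f : chain) {
        assert(f.extra_time_percent >= 0.0);
        total += f.extra_time_percent;
    }
    return total;
}

// Example chain: skew correction, contrast correction and dilation.
std::vector<FilterCost> example_chain() {
    return {
        {"Skew correction", 12.0},
        {"Contrast correction", 7.5},
        {"Dilation", 3.1},
    };
}
```

For the example chain the estimate is about 22.6% on top of the minimum pipeline. Adding noise removal alone would contribute a further 175%, which is why it is worth treating separately.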
Chaining preprocessing filters
Multiple preprocessing filters can be applied to the same image to further improve the recognition quality. The filters are applied one by one in the order they are added to the custom_preprocessing_filters structure (up to 12 filters are allowed).
Note that each filter requires additional time and resources on the computer running the application. Do not add extra filters if you are satisfied with the recognition accuracy, especially when developing web applications.
custom_preprocessing_filters filters_;
filters_.filter_1 = OCR_IMG_PREPROCESS_THRESHOLD({THRESHOLD});
filters_.filter_2 = OCR_IMG_PREPROCESS_AUTOSKEW;
filters_.filter_3 = OCR_IMG_PREPROCESS_ROTATE({ANGLE});
filters_.filter_4 = OCR_IMG_PREPROCESS_AUTODENOISING;
filters_.filter_5 = OCR_IMG_PREPROCESS_CONTRAST_CORRECTION;
filters_.filter_6 = OCR_IMG_PREPROCESS_SCALE({RATIO});
filters_.filter_7 = OCR_IMG_PREPROCESS_RESIZE({WIDTH}, {HEIGHT});
filters_.filter_8 = OCR_IMG_PREPROCESS_GRAYSCALE;
filters_.filter_9 = OCR_IMG_PREPROCESS_INVERT;
filters_.filter_10 = OCR_IMG_PREPROCESS_DILATE;
filters_.filter_11 = OCR_IMG_PREPROCESS_MEDIAN;
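Because filters run in the order they are added, the same set of filters can produce different images depending on their sequence. The following self-contained sketch illustrates this on an in-memory grayscale buffer. It does not use the Aspose.OCR API: `Image`, `Filter`, `make_threshold`, `make_invert` and `apply_chain` are hypothetical stand-ins, and the exact thresholding rule used by the library is an assumption here.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// A grayscale image stored as one byte per pixel (0 = black, 255 = white).
using Image = std::vector<std::uint8_t>;
using Filter = std::function<void(Image&)>;

// Hypothetical stand-in for a threshold filter: pixels at or above the
// threshold become white, all others become black.
Filter make_threshold(std::uint8_t threshold) {
    return [threshold](Image& img) {
        for (std::uint8_t& p : img) p = (p >= threshold) ? 255 : 0;
    };
}

// Hypothetical stand-in for color inversion.
Filter make_invert() {
    return [](Image& img) {
        for (std::uint8_t& p : img) p = static_cast<std::uint8_t>(255 - p);
    };
}

// Apply the filters one by one, in the order they were added.
Image apply_chain(Image img, const std::vector<Filter>& chain) {
    assert(chain.size() <= 12);  // mirrors the 12-filter limit described above
    for (const Filter& f : chain) f(img);
    return img;
}
```

For the pixels {10, 100, 200}, applying `make_threshold(100)` and then `make_invert()` yields {255, 0, 0}, while the reverse order yields {255, 255, 0}. The filter set is the same but the output differs, so the order of filter_1 ... filter_N is worth choosing deliberately.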
Viewing preprocessed images
Aspose.OCR for C++ offers an easy way to save preprocessed images using the asposeocr_preprocess_page_and_save() or preprocess_page_and_save_from_raw_bytes() functions. These functions apply preprocessing filters to the image and save the resulting image to a file.
You can use this file to analyze the effectiveness of preprocessing filters, exclude unnecessary filters that consume resources without affecting the result, or show the result of preprocessing in the user interface.
// Path to the source image.
std::string image_path = "source.png";
// Filters are applied in the order filter_1, filter_2, and so on.
custom_preprocessing_filters filters_;
filters_.filter_1 = OCR_IMG_PREPROCESS_GRAYSCALE;
filters_.filter_2 = OCR_IMG_PREPROCESS_THRESHOLD(20);
filters_.filter_3 = OCR_IMG_PREPROCESS_BINARIZE;
filters_.filter_4 = OCR_IMG_PREPROCESS_RESIZE(1500, 2500);
filters_.filter_5 = OCR_IMG_PREPROCESS_SCALE(0.8);
filters_.filter_6 = OCR_IMG_PREPROCESS_DILATE;
filters_.filter_7 = OCR_IMG_PREPROCESS_ROTATE(-20);
filters_.filter_8 = OCR_IMG_PREPROCESS_INVERT;
// Apply the filters and save the preprocessed image to result.png.
asposeocr_preprocess_page_and_save(image_path.c_str(), "result.png", filters_);
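When deciding whether a filter is worth its cost, it can help to measure the overhead on your own images instead of relying only on the approximate figures in the table. The sketch below is a minimal, library-independent timing helper. `time_ms` and `overhead_percent` are hypothetical names introduced for illustration and are not part of Aspose.OCR.

```cpp
#include <cassert>
#include <chrono>

// Run a callable once and return the elapsed wall-clock time in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Extra time of a filtered run relative to a baseline run, in percent.
double overhead_percent(double baseline_ms, double filtered_ms) {
    assert(baseline_ms > 0.0);
    return (filtered_ms - baseline_ms) / baseline_ms * 100.0;
}
```

In practice the callable passed to `time_ms` would wrap a call such as asposeocr_preprocess_page_and_save(), once with and once without the filter in question. Single measurements are noisy and include file I/O, so averaging several runs gives a more reliable comparison.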