Although you can extract text from color or grayscale scans and photographs, the Aspose.OCR engine always uses black-and-white images to detect text and perform automatic corrections. The conversion to black and white is performed automatically; this process is called binarization.
By default, Aspose.OCR automatically calculates the optimal binarization parameters. Some preprocessing filters also convert the image to black and white automatically when they are applied. To convert the image to black and white before performing the recognition, apply the OCR_IMG_PREPROCESS_BINARIZE preprocessing filter:
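std::string image_path = "source.png";
// Request binarization as the preprocessing step
custom_preprocessing_filters filters_;
filters_.filter_1 = OCR_IMG_PREPROCESS_BINARIZE;
// Apply the filter and save the black-and-white image
asposeocr_preprocess_page_and_save(image_path.c_str(), "result.png", filters_);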
In some rare cases, you may need to override the automatic binarization settings to get more accurate recognition results:
If you notice that part of the text disappears from the recognition results, try manually specifying the threshold that determines whether a pixel is considered black or white. A pixel lighter than the threshold is treated as white; any other pixel is treated as black. In other words, the higher the threshold value, the more content is sent for recognition, including words printed in very light colors. If the threshold is set to 0, black and white are assigned automatically based on the content of the image.
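The threshold rule described above can be illustrated with a minimal sketch. It assumes 8-bit grayscale pixels where 0 is black and 255 is white; the helpers `binarize` and `count_black` are hypothetical and are not part of the Aspose.OCR API, which performs this step internally.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Aspose.OCR function): binarize 8-bit grayscale
// pixels, assuming 0 = black and 255 = white. A pixel lighter than the
// threshold becomes white; every other pixel becomes black.
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& gray,
                                   std::uint8_t threshold) {
    std::vector<std::uint8_t> out;
    out.reserve(gray.size());
    for (std::uint8_t p : gray) {
        out.push_back(p > threshold ? 255 : 0);
    }
    return out;
}

// Hypothetical helper: count the pixels that ended up black, that is,
// the pixels that remain visible as content after binarization.
std::size_t count_black(const std::vector<std::uint8_t>& bw) {
    std::size_t n = 0;
    for (std::uint8_t p : bw) {
        if (p == 0) ++n;
    }
    return n;
}
```

For a row such as {250, 180, 40, 250, 180, 40} (paper, light gray text, dark text), a threshold of 100 keeps only the two dark pixels black, while a threshold of 200 also keeps the two light gray pixels. This mirrors the statement that a higher threshold sends more content for recognition.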
To specify the binarization threshold, provide it in the OCR_IMG_PREPROCESS_THRESHOLD preprocessing filter or set the threshold_value property in the recognition settings. To rely on automatic processing, do not add the filter and do not set threshold_value.
std::string image_path = "source.png";
// Buffer that receives the recognized text
const size_t len = 4096;
wchar_t buffer[len] = { 0 };
RecognitionSettings settings;
// 0 keeps automatic binarization; set a non-zero threshold to override it
settings.threshold_value = 0;
size_t res_len = aspose::ocr::page_settings(image_path.c_str(), buffer, len, settings);
std::wcout << buffer;
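The text does not say how Aspose.OCR calculates the threshold when automatic binarization is used. For intuition only, the sketch below shows Otsu's method, a well-known way to derive a threshold from the image content. It is an assumption chosen for illustration, not a description of the library's algorithm, and `otsu_threshold` is a hypothetical helper.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Aspose.OCR function): choose a threshold with
// Otsu's method. Pixels <= t form the dark class, pixels > t the light class;
// the returned t maximizes the between-class variance of the two groups.
int otsu_threshold(const std::vector<std::uint8_t>& gray) {
    // Histogram of the 256 possible gray levels
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : gray) ++hist[p];

    const double total = static_cast<double>(gray.size());
    double sum_all = 0.0;
    for (int i = 0; i < 256; ++i) sum_all += i * static_cast<double>(hist[i]);

    double sum_dark = 0.0;     // sum of gray levels in the dark class
    double weight_dark = 0.0;  // number of pixels in the dark class
    double best_var = -1.0;
    int best_t = 0;

    for (int t = 0; t < 256; ++t) {
        weight_dark += static_cast<double>(hist[t]);
        sum_dark += t * static_cast<double>(hist[t]);
        if (weight_dark == 0.0) continue;        // dark class still empty
        const double weight_light = total - weight_dark;
        if (weight_light == 0.0) break;          // light class is empty

        const double mean_dark = sum_dark / weight_dark;
        const double mean_light = (sum_all - sum_dark) / weight_light;
        const double diff = mean_dark - mean_light;
        const double var = weight_dark * weight_light * diff * diff;
        if (var > best_var) {
            best_var = var;
            best_t = t;
        }
    }
    return best_t;
}
```

On an image with two clearly separated groups of gray levels (dark text on light paper), the returned value falls between the two groups, which is the kind of split a manual threshold would otherwise have to find by trial and error.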
Binarization is always used for text detection and automatic image corrections.