Conversion to grayscale

In most cases, color is not needed for recognition and can even mislead OCR algorithms. Grayscale images are processed more efficiently and typically yield fewer specks, cleaner backgrounds, and crisper text than color images. Converting to grayscale can also improve the results of other processing filters, such as automatic deskewing.
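To make the idea concrete, here is a minimal sketch of what a grayscale conversion does to a single pixel. It assumes the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114); this is an illustration only, the weights Aspose.OCR uses internally are not documented here, and `rgb_to_gray` is a hypothetical helper name.

```python
def rgb_to_gray(r: int, g: int, b: int) -> int:
    """Collapse an 8-bit RGB pixel into one 8-bit brightness value.

    Assumption: BT.601 luma weights, scaled by 1000 so the math stays
    in integers. Adding 500 before dividing rounds to the nearest value.
    """
    return (299 * r + 587 * g + 114 * b + 500) // 1000


# A tiny 2x2 "image" as rows of (R, G, B) tuples
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]

# Three color channels per pixel become a single brightness channel
gray_image = [[rgb_to_gray(*pixel) for pixel in row] for row in image]
print(gray_image)
```

Green contributes most to the result and blue least, which roughly follows how bright each primary color appears to the human eye.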

Aspose.OCR for Python via .NET provides a method for converting an image to grayscale before proceeding with image processing or OCR.

Grayscale conversion is automatically performed when applying the median filter.

Grayscale conversion

To convert an image to grayscale, run it through the to_grayscale processing filter.

# Import paths are an assumption; adjust them to your installed package version
from aspose.ocr import *
from aspose.ocr.models.preprocessingfilters import *

# Instantiate Aspose.OCR API
api = AsposeOcr()
# Initialize image processing
filters = PreprocessingFilter()
filters.add(PreprocessingFilter.to_grayscale())
# Add image to the recognition batch and apply processing filter
# (named ocr_input to avoid shadowing Python's built-in input())
ocr_input = OcrInput(InputType.SINGLE_IMAGE, filters)
ocr_input.add("source.png")
# Save processed image to the "result" folder
ImageProcessing.save(ocr_input, "result")
# Recognize the image
result = api.recognize(ocr_input)
# Print recognition result
print(result[0].recognition_text)

Usage scenarios

Grayscale conversion is recommended for the following images:

Photos.
Scanned ID cards and other personal documents.
Full-color scans.

In addition, grayscale conversion may reduce the size of the image data compared to the original.
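The potential saving can be estimated for uncompressed pixel data, where a color pixel takes three 8-bit channels and a grayscale pixel takes one. The sketch below assumes an A4 page scanned at 300 DPI (2480 x 3508 pixels); `raw_size_bytes` is a hypothetical helper, and the sizes of compressed files such as PNG or JPEG will differ.

```python
def raw_size_bytes(width: int, height: int, channels: int,
                   bits_per_channel: int = 8) -> int:
    """Size of uncompressed pixel data, ignoring headers and compression."""
    return width * height * channels * bits_per_channel // 8


# Assumed example: an A4 page scanned at 300 DPI
width, height = 2480, 3508

rgb_bytes = raw_size_bytes(width, height, channels=3)   # full color
gray_bytes = raw_size_bytes(width, height, channels=1)  # grayscale

print(f"RGB:       {rgb_bytes:,} bytes")
print(f"Grayscale: {gray_bytes:,} bytes")
print(f"Ratio:     {rgb_bytes // gray_bytes}x")
```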

Improvements in recognition accuracy and image quality depend heavily on the original image and should be verified empirically.
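One simple way to run such a test is to recognize the same image with and without the filter and score each result against a manually typed reference text. The sketch below uses only Python's standard difflib module; the `similarity` helper and the sample strings are hypothetical stand-ins for real recognition results.

```python
from difflib import SequenceMatcher


def similarity(recognized: str, expected: str) -> float:
    """Return a 0..1 score of how closely recognized text matches a reference."""
    return SequenceMatcher(None, expected, recognized, autojunk=False).ratio()


# Reference text typed by hand from the source image
expected = "Invoice 1024"

# Hypothetical OCR results; in practice, take these from the
# recognition_text of a run without and with the grayscale filter
text_without_filter = "lnvoice 1O24"
text_with_filter = "Invoice 1O24"

score_without = similarity(text_without_filter, expected)
score_with = similarity(text_with_filter, expected)

print(f"Without filter: {score_without:.3f}")
print(f"With filter:    {score_with:.3f}")
```

Running the comparison over a representative batch of images, rather than a single file, gives a more reliable picture of whether the filter helps.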
