Aspose.OCR for Python via .NET 23.9.1 - Release Notes

This article summarizes the changes, enhancements, and bug fixes in the Aspose.OCR for Python via .NET 23.9.1 (September 2023) release.

What was changed

Key	Summary	Category
OCRPY‑41	Improved support for multi-threaded recognition. The speed of batch recognition has been significantly increased (up to 2 times).	Enhancement

Public API changes and backwards compatibility

This section lists all public API changes introduced in Aspose.OCR for Python via .NET 23.9.1 that may affect the code of existing applications.

Added public APIs:

No changes.

Updated public APIs:

No changes.

Removed public APIs:

No changes.

Changes in application logic

Compatibility: fully backward compatible.

Multithreading support has been significantly redesigned. Now it works differently depending on the number of images in the recognition batch:

Recognizing one image

This scenario applies to the recognition of a single image or a single-page PDF. For example:

from aspose.ocr import AsposeOcr, InputType, OcrInput, RecognitionSettings

# Instantiate Aspose.OCR API
api = AsposeOcr()
# Add the image to the recognition batch
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("source.png")
# Limit resource usage to 4 threads
recognition_settings = RecognitionSettings()
recognition_settings.threads_count = 4
# Recognize the image
result = api.recognize(ocr_input, recognition_settings)
# Print recognition result
print(result[0].recognition_text)

The recognition behavior has not changed from previous versions. If threads_count is not configured, Aspose.OCR for Python via .NET uses all CPU cores/threads to recognize the provided image; otherwise, it uses the number of threads specified in threads_count.

Recognizing multiple files/pages

This scenario is used for bulk recognition of several images or recognition of a multi-page document (PDF, DjVu). It is also applicable when processing files from a folder or ZIP archive. For example:

from aspose.ocr import AsposeOcr, InputType, OcrInput, RecognitionSettings

# Instantiate Aspose.OCR API
api = AsposeOcr()
# Add images to the recognition batch
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("source1.png")
ocr_input.add("source2.png")
ocr_input.add("source3.png")
# Limit resource usage to 6 threads
recognition_settings = RecognitionSettings()
recognition_settings.threads_count = 6
# Recognize the images
results = api.recognize(ocr_input, recognition_settings)
# Print recognition results
for result in results:
    print(result.recognition_text)

Each image from the batch is processed in one separate thread. If more than one thread is available, images are recognized in parallel.

Previously, images from a batch were processed one by one.

The number of images processed simultaneously cannot exceed the value of the threads_count recognition setting. If threads_count is not configured or exceeds the number of CPU threads, the limit is the total number of CPU threads.
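The limit described above can be sketched in plain Python. This only illustrates the rule as stated here and is not the library's implementation: the helper name max_parallel_images and its parameters are hypothetical, and capping the result by the batch size is an assumption that follows from each image being processed in one thread.

```python
import os


def max_parallel_images(image_count, threads_count=None, cpu_threads=None):
    """Hypothetical helper: upper bound on images recognized at the same time."""
    if cpu_threads is None:
        # os.cpu_count() may return None; fall back to a single thread
        cpu_threads = os.cpu_count() or 1
    # An unset or oversized threads_count falls back to the CPU thread count
    if threads_count is None or threads_count > cpu_threads:
        limit = cpu_threads
    else:
        limit = threads_count
    # One image occupies one thread, so the batch size is also a bound
    return min(limit, image_count)


print(max_parallel_images(3, threads_count=6, cpu_threads=8))    # 3
print(max_parallel_images(10, threads_count=6, cpu_threads=8))   # 6
print(max_parallel_images(10, threads_count=16, cpu_threads=8))  # 8
print(max_parallel_images(10, cpu_threads=8))                    # 8
```

With the three-image batch from the example and threads_count set to 6, all three images can be recognized at once on a machine that has at least six CPU threads.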

Parallel processing roughly doubles the batch recognition speed (up to 2 times faster) compared to the previous one-by-one approach.
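The actual gain depends on the batch and the hardware, so it may be worth measuring on your own system. The following is a minimal sketch using only the standard library; the helper names measure_seconds and speedup are hypothetical, and the timings in the example are made-up numbers chosen to show how a 100% speed increase corresponds to a factor of 2.

```python
import time


def measure_seconds(task):
    """Return the wall-clock time, in seconds, taken by calling task()."""
    start = time.perf_counter()
    task()
    return time.perf_counter() - start


def speedup(baseline_seconds, new_seconds):
    """How many times faster the new run is compared to the baseline."""
    return baseline_seconds / new_seconds


# Made-up timings for illustration: 12 s before, 6 s after
print(speedup(12.0, 6.0))  # 2.0
```

To time a real batch, pass a callable that wraps the recognition call, for example a lambda that calls api.recognize with the batch and settings from the example above.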

Recognition of a single image is unaffected.