Dewarping

Geometric distortions is a very common case when dealing with photos of books, magazines, multi-page documents, and similar content. They can be caused by physical page curvature or camera lens distortion (ultra-wide and fisheye lenses, as well as entry-level smartphone lenses).

Warped images are very hard to be processed by most OCR algorithms. Thus, image straightening and distortion removal is critical to the recognition process as it directly affects the reliability and efficiency of segmentation and text extraction. Aspose.OCR implements a processing filter for automated correction of geometric distortions before proceeding to recognition.

Important considerations

Due to the high complexity of the underlying neural network, dewarping requires a lot of resources and time. We recommend using it for pinpoint processing of individual images that are otherwise unrecognizable. This processing filter is not recommended for high load web applications that recognize images close to real time.

  • Dewarping one image may take up to 45 seconds.
  • Dewarping is very resource-intensive (CPU and RAM).
  • After removing the image curvature, text lines will have some wave-like distortion. Thus, it is highly recommended to use DetectAreasMode.CURVED_TEXT areas detection mode to extract text from the dewarped image. Other area detection modes may produce inaccurate results.
  • Dewarping also corrects image tilt. We recommend disabling automatic skew correction - it will have no effect and may even result in severe image distortion.

Dewarping

To straighten the curved image, run it through AutoDewarping processing filter.

Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
// Fix curvilinear distortions
Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter filters = new Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter();
filters.Add(Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter.AutoDewarping());
// Add an image to OcrInput object and apply processing filters
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage, filters);
input.Add("source.png");
// Save processed image to the folder
Aspose.OCR.ImageProcessing.Save(input, @"C:\result");
// Recognize image
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.Recognize(input);
foreach(Aspose.OCR.RecognitionResult result in results)
{
	Console.WriteLine(result.RecognitionText);
}
Curved page photo Dewarped image

Usage scenarios

Automatic dewarping is recommended for:

  • Photos of book pages, magazines.
  • Curved paper.
  • Ultra wide-angle and fisheye photos.