PDF (Portable Document Format) has become the de facto standard for document storage and sharing in both personal and professional contexts. PDF documents can be viewed on virtually any computing platform, including Windows, macOS, Linux, and mobile operating systems such as iOS and Android. This cross-platform compatibility means PDFs can be shared with and opened by almost any user.
As a result, PDF has also become one of the most popular formats for scanning paper documents, especially due to its ability to combine multiple pages into a single file. This format is widely used for the exchange of contracts, invoices, legal documents, passports and ID cards, and many other documents between individuals, businesses, banks and government agencies.
However, a scanned PDF is essentially a collection of images, which brings serious disadvantages in accessibility and usability: the text on its pages cannot be selected, copied, or searched, and it cannot be read by text-to-speech software.
Aspose.OCR for .NET offers you a fast, easy and highly reliable way to convert any scanned PDF into a fully searchable and indexable document. It accurately recognizes page content, converting it into a machine-readable text layer that can be selected, copied, read by text-to-speech software, and even automatically processed by translators, summarizers, and other AI-powered analytics tools.
Moreover, you will not lose the original content. The original images are placed in the background, and the recognized text is placed as an invisible but searchable and selectable overlay on top of the images. This way, all notes, images, marks, signatures and other data remain in the document, allowing it to be used in digital archives.
In the example below, we will demonstrate how to digitize a signed contract sent to you as a scanned PDF. You will only need 24 lines of code (including comments) - see for yourself.
Use your own document or download the sample delivery agreement, Delivery-Agreement.pdf.
Reference the Aspose.OCR namespace to improve code readability:
using Aspose.OCR;
Apply the license:
License license = new License();
license.SetLicense("Aspose.OCR.lic");
Load the scanned PDF:
OcrInput pdf = new OcrInput(InputType.PDF);
pdf.Add("Delivery-Agreement.pdf");
Recognize the text from the document:
AsposeOcr api = new AsposeOcr();
List<RecognitionResult> result = api.Recognize(pdf);
Save the result as a searchable PDF:
AsposeOcr.SaveMultipageDocument("Readable-Contract.pdf", SaveFormat.Pdf, result);
using Aspose.OCR;

namespace ConvertScannedPDF
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Apply license
            License license = new License();
            license.SetLicense("Aspose.OCR.lic");
            // Load the scanned PDF
            OcrInput pdf = new OcrInput(InputType.PDF);
            pdf.Add("Delivery-Agreement.pdf");
            // Recognize the text from document
            AsposeOcr api = new AsposeOcr();
            List<RecognitionResult> result = api.Recognize(pdf);
            // Save searchable PDF
            AsposeOcr.SaveMultipageDocument("Readable-Contract.pdf", SaveFormat.Pdf, result);
            // Report progress
            Console.WriteLine($@"Recognition finished. See '{Directory.GetCurrentDirectory()}\Readable-Contract.pdf'.");
        }
    }
}
Run the program directly from Visual Studio, or build it and run the executable from the command line. Recognition takes a few seconds, depending on your system performance.
A Readable-Contract.pdf file will be created in the program's working directory. As you can see, you can select the text in the file, search it, and even have it read aloud.