While many businesses, organizations and individuals have been actively reducing their reliance on paper documents, paper is still the most widespread format for storage and sharing. For example, scans of paper documents are routinely used to exchange signed contracts, NDAs, invoices and other legally binding documents.
Scanned documents backed by physical archives are sufficient for regulatory compliance, legal purposes, long-term backup and redundancy. However, business cases frequently arise for creating new documents based on existing scanned content or portions of existing documents. Here are some common scenarios where this practice is advantageous:
Aspose.OCR for .NET makes it easy to convert a scanned image, an image-based PDF, or even a photo into an editable DOCX or RTF document or a Microsoft Excel spreadsheet (XLSX). Content is recognized with high accuracy and speed, saving you the time and effort of manual typing and avoiding human errors, especially when working with large volumes of text. The library supports 28 languages based on Latin, Cyrillic and Asian scripts, which makes it applicable to most businesses and organizations on a global scale. All popular typefaces such as Arial, Times New Roman, Courier New, Tahoma, Calibri and more are detected and recognized in regular, bold and italic styles.
In the example below, we will show you how to convert a signed contract scanned to PDF into an editable Microsoft Word document. You will only need 24 lines of code (including comments) - see for yourself.
Use your own scan or download the sample scanned agreement in PDF format.
Delivery-Agreement.pdf
Reference the Aspose.OCR namespace to improve the code readability:
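using Aspose.OCR;
Apply the license:
License license = new License();
license.SetLicense("Aspose.OCR.lic");
Load the scanned agreement:
OcrInput scans = new OcrInput(InputType.PDF);
scans.Add("Delivery-Agreement.pdf");
Recognize the text from the document:
AsposeOcr api = new AsposeOcr();
List<RecognitionResult> results = api.Recognize(scans);
Automatically correct spelling errors and save the editable document in Microsoft Word (DOCX) format:
AsposeOcr.SaveMultipageDocument("contract.docx", SaveFormat.Docx, results, true);
Full code: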
using Aspose.OCR;

namespace EditScan
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Apply license
            License license = new License();
            license.SetLicense("Aspose.OCR.lic");
            // Load the scanned agreement
            OcrInput scans = new OcrInput(InputType.PDF);
            scans.Add("Delivery-Agreement.pdf");
            // Recognize the text from document
            AsposeOcr api = new AsposeOcr();
            List<RecognitionResult> results = api.Recognize(scans);
            // Automatically correct spelling errors and save editable document in Microsoft Word (DOCX) format
            AsposeOcr.SaveMultipageDocument("contract.docx", SaveFormat.Docx, results, true);
            // Report progress
            Console.WriteLine($@"The scan has been converted to '{Directory.GetCurrentDirectory()}\contract.docx'.");
        }
    }
}
Run the program directly from Visual Studio, or build it and execute the file from the command line. Wait a few seconds; the exact time depends on your system performance.
A contract.docx file will be created in the program's working directory. You can open it in Microsoft Word, edit the content, search for text, and copy and paste the text into other documents.
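Because both the scan and the resulting document are resolved against the working directory, it may help to build their full paths explicitly and to check that the scan exists before starting recognition. The following is only a minimal sketch using standard System.IO calls; the ScanPaths class and its InWorkingDirectory helper are hypothetical names introduced for illustration and are not part of Aspose.OCR.

```csharp
using System;
using System.IO;

public static class ScanPaths
{
    // Hypothetical helper (not part of Aspose.OCR): resolves a file name
    // against the current working directory using the platform's separator.
    public static string InWorkingDirectory(string fileName)
    {
        return Path.Combine(Directory.GetCurrentDirectory(), fileName);
    }

    public static void Main()
    {
        string input = InWorkingDirectory("Delivery-Agreement.pdf");
        string output = InWorkingDirectory("contract.docx");

        // Stop early with a readable message if the scan is missing.
        if (!File.Exists(input))
        {
            Console.WriteLine($"Scan not found: '{input}'.");
            return;
        }

        // The recognition and saving calls from the program above would go here,
        // with `input` and `output` passed in place of the bare file names.
        Console.WriteLine($"The scan will be converted to '{output}'.");
    }
}
```

Using Path.Combine instead of a hard-coded backslash keeps the reported path valid on operating systems other than Windows as well.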