Saving recognition results as a searchable PDF

To convert recognition results into a searchable and indexable PDF document, use the SaveMultipageDocument method of the Aspose.OCR.AsposeOcr class. This is useful for recognizing books, contracts, articles, and other multi-page printouts, as well as for batch recognition. Pass Aspose.OCR.SaveFormat.Pdf as the saveFormat parameter.

In addition to the recognized text, the resulting PDF can contain the original images in the background, with the text placed on top as a transparent overlay that can be searched, selected, and copied. The type of the PDF document is controlled by the selected save format:

Format	Description
`Aspose.OCR.SaveFormat.Pdf`	The original images are placed in the background; the recognized text is placed as an invisible but searchable and selectable overlay on top of the images. Can be useful if you need to keep all notes, images, marks and other data along with the text.
`Aspose.OCR.SaveFormat.PdfNoImg`	The PDF document containing only the recognized text. The original images are not saved along with the recognition results. This can be useful when digitizing large amounts of high-quality text (such as books) so that the resulting file takes up much less space than using the `Aspose.OCR.SaveFormat.Pdf` parameter.
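
As a minimal sketch of switching between the two formats, the snippet below saves the same recognition results twice: once with the page images in the background and once as text only. It uses only the calls shown in the examples on this page. The file names are placeholders, and the Aspose.OCR package and the input images are assumed to be available.

```csharp
using System.Collections.Generic;
using Aspose.OCR;

AsposeOcr recognitionEngine = new AsposeOcr();
OcrInput input = new OcrInput(InputType.SingleImage);
input.Add("page1.png"); // placeholder file names
input.Add("page2.png");
List<RecognitionResult> results = recognitionEngine.Recognize(input);

// Searchable PDF: original images behind an invisible text overlay
AsposeOcr.SaveMultipageDocument("with-images.pdf", SaveFormat.Pdf, results);
// Text-only PDF: no images are stored
AsposeOcr.SaveMultipageDocument("text-only.pdf", SaveFormat.PdfNoImg, results);
```

Recognition runs once; only the save step is repeated, so comparing the two outputs does not require recognizing the pages again.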

When an original image is placed in the background, only the following processing filters are applied to the background image:

Skew correction (including automatic deskewing and manual rotation).
Resizing.

Other image processing filters are applied during recognition, but do not affect the background image in a searchable PDF.

To balance file size against the image quality of saved PDFs, use the optional optimizePdf parameter, which accepts a value of the Aspose.OCR.PdfOptimizationMode enumeration:

Name	Value	Description
`NONE`	0	Do not optimize PDF size.
`MAXIMUM_QUALITY`	1	Default. Optimize file size while preserving the highest image quality.
`HIGH_QUALITY`	2	Smaller PDF file size at the expense of slight image downsampling.
`BALANCED`	3	Downsample images to balance file size and image quality.
`AGGRESSIVE`	4	Significantly reduce the PDF file size at the expense of lower image quality.

The resulting PDF file size depends on the size and complexity of the original image.
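
One way to pick an optimization mode is to save the same results with every mode and compare the file sizes. The sketch below does that. It assumes that optimizePdf can be passed as a named argument after the results list; the exact position of the parameter is not stated on this page. The file names are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Aspose.OCR;

AsposeOcr recognitionEngine = new AsposeOcr();
OcrInput input = new OcrInput(InputType.SingleImage);
input.Add("page1.png"); // placeholder file names
input.Add("page2.png");
List<RecognitionResult> results = recognitionEngine.Recognize(input);

// Save one PDF per optimization mode and report its size on disk
foreach (PdfOptimizationMode mode in Enum.GetValues(typeof(PdfOptimizationMode)))
{
    string path = $"result-{mode}.pdf";
    // Assumption: optimizePdf is accepted as a named optional argument
    AsposeOcr.SaveMultipageDocument(path, SaveFormat.Pdf, results, optimizePdf: mode);
    Console.WriteLine($"{mode}: {new FileInfo(path).Length} bytes");
}
```

The sizes printed depend entirely on the input images, so it is worth running the comparison on a representative sample of your own documents.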

You can optionally enable automatic spelling correction for recognition results, provide a custom dictionary, or specify the font to be embedded into the PDF document. The font setting applies only when saving recognition results as a text-only PDF (Aspose.OCR.SaveFormat.PdfNoImg).

Save PDF to file

using System.Collections.Generic;

Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add images to OcrInput object
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage);
input.Add("page1.png");
input.Add("page2.png");
// Recognize images
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.Recognize(input);
// Save all pages to a single searchable PDF file
Aspose.OCR.AsposeOcr.SaveMultipageDocument("result.pdf", Aspose.OCR.SaveFormat.Pdf, results);

Write PDF to memory

using System.Collections.Generic;
using System.IO;

Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add images to OcrInput object
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage);
input.Add("page1.png");
input.Add("page2.png");
// Recognize images
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.Recognize(input);
// Write all pages to an in-memory PDF
using (MemoryStream ms = new MemoryStream())
{
	Aspose.OCR.AsposeOcr.SaveMultipageDocument(ms, Aspose.OCR.SaveFormat.Pdf, results);
	// Use the PDF content here, before the stream is disposed
}
