Please note that all comparison tools are available in the Aspose.PDF.Drawing library.
When working with PDF documents, there are times when you need to compare the content of two documents to identify differences. The Aspose.PDF for .NET library provides a powerful toolset for this purpose. In this article, we’ll explore how to compare PDF documents using a couple of simple code snippets.
The comparison functionality in Aspose.PDF allows you to compare two PDF documents page by page. You can choose to compare either specific pages or entire documents. The resulting comparison document highlights differences, making it easier to identify changes between the two files.
Here is a list of possible ways to compare PDF documents using the Aspose.PDF for .NET library:

- Comparing Specific Pages: compare the first pages of two PDF documents.
- Comparing Entire Documents: compare the entire content of two PDF documents.
- Compare PDF documents graphically:
  - Compare PDF with the GetDifference method: individual images where changes are marked.
  - Compare PDF with the CompareDocumentsToPdf method: a PDF document with images where changes are marked.
The first code snippet demonstrates how to compare the first pages of two PDF documents.

Document Initialization. The code opens two PDF documents from the data directory (‘dataDir’). In practice, replace the sample file names with the paths to your own files.

Comparison Process. The comparison is limited to the first page of each document (‘Pages[1]’) and is configured through ‘SideBySideComparisonOptions’:

- ‘AdditionalChangeMarks = true’ - this option displays additional change markers. These markers point to differences that are present on other pages, even if they are not on the page currently being compared.
- ‘ComparisonMode = ComparisonMode.IgnoreSpaces’ - this mode tells the comparer to ignore spaces in the text and to focus only on changes within words.
- ‘DeleteColor’ and ‘InsertColor’ - the colors used to mark deleted and inserted content (dark gray and light yellow in this example).
```csharp
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ComparingSpecificPages()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentCompare();

    // Open PDF documents
    using (var document1 = new Aspose.Pdf.Document(dataDir + "ComparingSpecificPages1.pdf"))
    {
        using (var document2 = new Aspose.Pdf.Document(dataDir + "ComparingSpecificPages2.pdf"))
        {
            // Compare the first page of each document
            Aspose.Pdf.Comparison.SideBySidePdfComparer.Compare(
                document1.Pages[1],
                document2.Pages[1],
                dataDir + "ComparingSpecificPages_out.pdf",
                new Aspose.Pdf.Comparison.SideBySideComparisonOptions
                {
                    AdditionalChangeMarks = true,
                    ComparisonMode = Aspose.Pdf.Comparison.ComparisonMode.IgnoreSpaces,
                    DeleteColor = Color.DarkGray,
                    InsertColor = Color.LightYellow
                });
        }
    }
}
```
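To make the idea behind ‘ComparisonMode.IgnoreSpaces’ more concrete, here is a minimal sketch of a whitespace-insensitive string comparison. It is only an illustration of the concept and not how Aspose.PDF implements the mode internally; the class ‘IgnoreSpacesSketch’ and its methods are hypothetical names introduced for this example.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Aspose.PDF: it only illustrates what
// "ignoring spaces" can mean when two pieces of text are compared.
public static class IgnoreSpacesSketch
{
    // Returns the input with every whitespace character removed.
    public static string StripWhitespace(string text) =>
        new string(text.Where(c => !char.IsWhiteSpace(c)).ToArray());

    // Two strings count as equal here when they match after whitespace removal.
    public static bool EqualIgnoringSpaces(string a, string b) =>
        string.Equals(StripWhitespace(a), StripWhitespace(b), StringComparison.Ordinal);
}
```

Under this definition, "Hello   world" and "Hello world" are treated as equal, while "Hello world" and "Hello word" are not, because the second pair differs inside a word.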
The second code snippet expands the scope to compare the entire content of two PDF documents.
```csharp
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ComparingEntireDocuments()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentCompare();

    // Open PDF documents
    using (var document1 = new Aspose.Pdf.Document(dataDir + "ComparingEntireDocuments1.pdf"))
    {
        using (var document2 = new Aspose.Pdf.Document(dataDir + "ComparingEntireDocuments2.pdf"))
        {
            // Compare the entire documents
            Aspose.Pdf.Comparison.SideBySidePdfComparer.Compare(
                document1,
                document2,
                dataDir + "ComparingEntireDocuments_out.pdf",
                new Aspose.Pdf.Comparison.SideBySideComparisonOptions
                {
                    AdditionalChangeMarks = true,
                    ComparisonMode = Aspose.Pdf.Comparison.ComparisonMode.IgnoreSpaces,
                    DeleteColor = Color.DarkGray,
                    InsertColor = Color.LightYellow
                });
        }
    }
}
```
The comparison results generated by these snippets are PDF documents that you can open in a viewer such as Adobe Acrobat. If you use the Two-page view in Adobe Acrobat, you’ll see the changes side by side.
By setting ‘AdditionalChangeMarks’ to ‘true’, you can also see markers for changes that occur on other pages, even if those changes aren’t on the page currently being viewed.
Aspose.PDF for .NET provides robust tools for comparing PDF documents, whether you need to compare specific pages or entire documents. By using options like ‘AdditionalChangeMarks’ and different ‘ComparisonMode’ settings, you can tailor the comparison process to your specific needs. The resulting document provides a clear, side-by-side view of changes, making it easier to track revisions and ensure document accuracy.
When collaborating on documents, especially in professional environments, you often end up with multiple versions of the same file.
You can use the GraphicalPdfComparer class to compare PDF documents and pages. The class is suitable for comparing changes in a page’s graphic content.
With Aspose.PDF for .NET, it’s possible to compare documents and pages and output the comparison result to a PDF document or image file.
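You can set the following class properties:

- Resolution: the resolution, in DPI, at which pages are rendered for comparison and output.
- Color: the color of the change marks.
- Threshold: the change threshold, in percent. A non-zero value lets you ignore graphical changes that are too small to matter.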
The class has a method that allows you to get page image differences in a form suitable for further processing: ImagesDifference GetDifference(Page page1, Page page2).
This method returns an object of the ImagesDifference class, which contains an image of the first page being compared and an array of differences. The array of differences and the original image both have the RGB24bpp pixel format.
ImagesDifference allows you to generate a difference image and to get an image of the second page being compared by adding the array of differences to the original image. To do this, use the ImagesDifference.DifferenceToImage and ImagesDifference.GetDestinationImage methods, respectively.
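The relationship between the original image, the array of differences, and the destination image can be sketched with plain byte buffers. This is a minimal illustration that assumes the differences are stored as signed per-channel deltas over RGB24 data; the actual internal layout of ImagesDifference is not documented here, and the class ‘DifferenceArraySketch’ and its methods are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of Aspose.PDF. It models RGB24 images as flat
// byte arrays (3 bytes per pixel) and differences as signed per-channel deltas.
public static class DifferenceArraySketch
{
    // Computes destination minus source for every channel value.
    public static short[] ComputeDifference(byte[] source, byte[] destination)
    {
        if (source.Length != destination.Length)
            throw new ArgumentException("Buffers must have the same length.");

        var diff = new short[source.Length];
        for (int i = 0; i < source.Length; i++)
            diff[i] = (short)(destination[i] - source[i]);
        return diff;
    }

    // Rebuilds the destination image by adding the differences to the source.
    public static byte[] ApplyDifference(byte[] source, short[] diff)
    {
        if (source.Length != diff.Length)
            throw new ArgumentException("Buffers must have the same length.");

        var result = new byte[source.Length];
        for (int i = 0; i < source.Length; i++)
            result[i] = (byte)(source[i] + diff[i]);
        return result;
    }
}
```

A channel whose delta is zero is unchanged between the two pages, so only the non-zero entries need to be highlighted when a difference image is drawn.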
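The following code uses the GetDifference method to compare the first pages of two PDF documents and to generate visual representations of the differences between them.
It produces two PNG images:

- A difference image (ComparePDFWithGetDifferenceMethodDiffPngFilePath_out.png), which highlights the differences between the two pages in red on a white background.
- A destination image (ComparePDFWithGetDifferenceMethodDestPngFilePath_out.png), which is the image of the second page reconstructed from the first page and the differences.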
This process can be useful for visually comparing changes or differences between two versions of a document.
```csharp
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ComparePDFWithGetDifferenceMethod()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentCompare();

    // Open PDF documents
    using (var document1 = new Aspose.Pdf.Document(dataDir + "ComparePDFWithGetDifferenceMethod1.pdf"))
    {
        using (var document2 = new Aspose.Pdf.Document(dataDir + "ComparePDFWithGetDifferenceMethod2.pdf"))
        {
            // Create comparer
            var comparer = new Aspose.Pdf.Comparison.GraphicalPdfComparer();

            // Compare the first page of each document
            using (var imagesDifference = comparer.GetDifference(document1.Pages[1], document2.Pages[1]))
            {
                // Save the difference image
                using (var diffImg = imagesDifference.DifferenceToImage(Aspose.Pdf.Color.Red, Aspose.Pdf.Color.White))
                {
                    diffImg.Save(dataDir + "ComparePDFWithGetDifferenceMethodDiffPngFilePath_out.png");
                }

                // Save the reconstructed image of the second page
                using (var destImg = imagesDifference.GetDestinationImage())
                {
                    destImg.Save(dataDir + "ComparePDFWithGetDifferenceMethodDestPngFilePath_out.png");
                }
            }
        }
    }
}
```
The following code snippet uses the CompareDocumentsToPdf method, which compares two documents and generates a PDF report of the comparison results.
```csharp
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ComparePDFWithCompareDocumentsToPdfMethod()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentCompare();

    // Open PDF documents
    using (var document1 = new Aspose.Pdf.Document(dataDir + "ComparePDFWithCompareDocumentsToPdfMethod1.pdf"))
    {
        using (var document2 = new Aspose.Pdf.Document(dataDir + "ComparePDFWithCompareDocumentsToPdfMethod2.pdf"))
        {
            // Create comparer
            var comparer = new Aspose.Pdf.Comparison.GraphicalPdfComparer()
            {
                Threshold = 3.0,
                Color = Aspose.Pdf.Color.Blue,
                Resolution = new Aspose.Pdf.Devices.Resolution(300)
            };

            // Compare and write the result to a PDF file
            comparer.CompareDocumentsToPdf(document1, document2, dataDir + "compareDocumentsToPdf_out.pdf");
        }
    }
}
```
You can use the TextPdfComparer class to compare documents and individual pages. The class allows comparing documents page by page or as a single continuous content stream (without page separation). Comparison methods return an array of differences that can be passed to any class implementing the interfaces IStringOutputGenerator or IFileOutputGenerator to produce a formatted comparison output. Output is supported in: HTML, Markdown, PDF, and JSON. You can also use the ComparisonStatistics class to obtain statistics on the performed comparison operations.
| Method | Description | Parameters | Return Value |
|---|---|---|---|
| `CompareDocumentsPageByPage(Document document1, Document document2, ComparisonOptions options)` | Compares two PDF documents page by page. | `document1`: the first document; `document2`: the second document; `options`: comparison options (see below). | `List<List<DiffOperation>>`: list of differences for each page. |
| `CompareDocumentsPageByPage(Document document1, Document document2, ComparisonOptions options, string resultPdfDocumentPath)` | Same as above, but also saves the comparison results to a PDF file. | Same parameters as above, plus `resultPdfDocumentPath`: path to the output file. | `List<List<DiffOperation>>`: list of differences. |
| `CompareFlatDocuments(Document document1, Document document2, ComparisonOptions options)` | Compares two PDF documents as a single continuous text (merging all pages). | Same parameters as in `CompareDocumentsPageByPage`. | `List<DiffOperation>`: list of all differences. |
| `CompareFlatDocuments(Document document1, Document document2, ComparisonOptions options, string resultPdfDocumentPath)` | Flat document comparison with the result saved to a PDF file. | Same parameters as in `CompareFlatDocuments`, plus `resultPdfDocumentPath`. | `List<DiffOperation>`: list of differences. |
| `ComparePages(Page page1, Page page2, ComparisonOptions options)` | Compares two individual pages. | `page1`: the first page; `page2`: the second page; `options`: comparison options. | `List<DiffOperation>`: list of page differences. |
| `CreateComparisonStatistics(List<DiffOperation> diffs)` | Generates comparison statistics for a list of operations (single page). | `diffs`: list of `DiffOperation`. | `TextItemComparisonStatistics` (see below). |
| `CreateComparisonStatistics(List<List<DiffOperation>> diffs)` | Generates comparison statistics for a list of operations across pages (document level). | `diffs`: list of lists of `DiffOperation`. | `DocumentComparisonStatistics` (see below). |
| `AssemblySourcePageText(List<DiffOperation> diffs)` | Reconstructs the original (pre-change) page text. | `diffs`: list of operations. | `string`: original text. |
| `AssemblyDestinationPageText(List<DiffOperation> diffs)` | Reconstructs the modified (post-change) page text. | `diffs`: list of operations. | `string`: modified text. |
ComparisonOptions

Parameters affecting the comparison process:

| Property | Type | Description |
|---|---|---|
| `ExtractionArea` | `Rectangle` | The area from which text will be extracted. Not compatible with `ExcludeTables`, `ExcludeAreas1`, or `ExcludeAreas2`. |
| `ExcludeTables` | `bool` | Exclude tables from comparison. Not compatible with `ExtractionArea`. |
| `ExcludeAreas1` | `Rectangle[]` | Array of areas to be excluded for the first document. |
| `ExcludeAreas2` | `Rectangle[]` | Array of areas to be excluded for the second document. |
| `EditOperationsOrder` | `EditOperationsOrder` (enum) | Order of applying insert/delete operations (default is `DeleteFirst`). |
DiffOperation

Represents a single difference operation. Contains:

| Property | Type | Description |
|---|---|---|
| `Operation` | `Operation` (enum) | Type of operation (`Equal`, `Delete`, `Insert`). |
| `Text` | `string` | Text associated with the operation. |
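The following sketch shows how a list of difference operations can be turned back into the original and the modified text, which is the idea behind AssemblySourcePageText and AssemblyDestinationPageText. It does not call Aspose.PDF: ‘SketchOperation’, ‘SketchDiffOperation’, and ‘DiffAssemblySketch’ are hypothetical stand-ins that only mirror the Operation and Text properties described above.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-ins for the Operation enum and DiffOperation class.
public enum SketchOperation { Equal, Delete, Insert }

public sealed class SketchDiffOperation
{
    public SketchOperation Operation { get; }
    public string Text { get; }

    public SketchDiffOperation(SketchOperation operation, string text)
    {
        Operation = operation;
        Text = text;
    }
}

public static class DiffAssemblySketch
{
    // The source text consists of everything that was kept or deleted.
    public static string AssembleSource(IEnumerable<SketchDiffOperation> diffs)
    {
        var builder = new StringBuilder();
        foreach (var diff in diffs)
            if (diff.Operation != SketchOperation.Insert)
                builder.Append(diff.Text);
        return builder.ToString();
    }

    // The destination text consists of everything that was kept or inserted.
    public static string AssembleDestination(IEnumerable<SketchDiffOperation> diffs)
    {
        var builder = new StringBuilder();
        foreach (var diff in diffs)
            if (diff.Operation != SketchOperation.Delete)
                builder.Append(diff.Text);
        return builder.ToString();
    }
}
```

For the operations Equal "The ", Delete "quick", Insert "slow", Equal " fox", the source text is "The quick fox" and the destination text is "The slow fox".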
ComparisonStatistics

| Class | Description |
|---|---|
| `TextItemComparisonStatistics` | Statistics for an individual text (page). Includes total characters, number of insertions, deletions, and corresponding operations. |
| `DocumentComparisonStatistics` (inherits from `TextItemComparisonStatistics`) | Extended statistics for the entire document, including a list of `PagesStatistics` (per-page statistics). |
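As a rough illustration of the kind of numbers such statistics contain, the sketch below tallies characters and operations over a list of differences. It is an assumption-based example, not the library's implementation: ‘StatsOperation’, ‘SketchStatistics’, and ‘ComparisonStatisticsSketch’ are hypothetical names, the field names do not correspond to the real properties of TextItemComparisonStatistics, and "total characters" is defined here simply as the combined length of all operation texts.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the Operation enum.
public enum StatsOperation { Equal, Delete, Insert }

// Hypothetical result type; the field names are illustrative only.
public sealed class SketchStatistics
{
    public int TotalCharacters;
    public int InsertedCharacters;
    public int DeletedCharacters;
    public int InsertOperations;
    public int DeleteOperations;
}

public static class ComparisonStatisticsSketch
{
    // Walks the operations once and accumulates character and operation counts.
    public static SketchStatistics Summarize(
        IEnumerable<(StatsOperation Operation, string Text)> diffs)
    {
        var stats = new SketchStatistics();
        foreach (var (operation, text) in diffs)
        {
            stats.TotalCharacters += text.Length;
            switch (operation)
            {
                case StatsOperation.Insert:
                    stats.InsertOperations++;
                    stats.InsertedCharacters += text.Length;
                    break;
                case StatsOperation.Delete:
                    stats.DeleteOperations++;
                    stats.DeletedCharacters += text.Length;
                    break;
            }
        }
        return stats;
    }
}
```

Document-level statistics could then be built by running the same tally for each page and keeping the per-page results in a list, which matches the per-page structure described in the table above.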