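Merge PDF Documents in C#
Overview
This article explains how to merge, combine, or concatenate multiple PDF files into a single PDF using C#. The sections below cover merging by file path, from memory streams, from arrays of files or streams, from a folder, with form fields, and with a generated table of contents.
Concatenate PDF Files Using File Paths
PdfFileEditor is a class in the Aspose.Pdf.Facades namespace that lets you concatenate multiple PDF files. You can concatenate files using not only FileStreams but also MemoryStreams. This article explains both approaches and demonstrates them with code snippets.
The Concatenate method of the PdfFileEditor class can be used to concatenate two PDF files. This overload takes three parameters: the first input PDF, the second input PDF, and the output PDF. The output PDF contains the pages of both input files.
The following C# code snippet shows how to concatenate PDF files using file paths.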
// For complete examples and data files, please go to https://github.com/aspose-pdf/Aspose.Pdf-for-.NET
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_Pages();
// Create PdfFileEditor object
PdfFileEditor pdfEditor = new PdfFileEditor();
// Concatenate files
pdfEditor.Concatenate(dataDir + "input.pdf", dataDir + "input2.pdf", dataDir + "ConcatenateUsingPath_out.pdf");
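Because this overload works on paths, a missing input file is a common cause of failure. The following is a minimal sketch, in plain .NET with no Aspose call, of a pre-flight check that could run before concatenation. The class and method names (`InputCheck`, `FindMissing`) are hypothetical and not part of Aspose.PDF.

```csharp
using System.IO;
using System.Linq;

public static class InputCheck
{
    // Returns the paths that do not point to an existing file, in the order given.
    // Hypothetical helper for illustration; it does not call Aspose.PDF.
    public static string[] FindMissing(params string[] inputPaths)
    {
        return inputPaths.Where(path => !File.Exists(path)).ToArray();
    }
}
```

If `InputCheck.FindMissing(first, second)` returns an empty array, the same two paths can be handed to `pdfEditor.Concatenate(first, second, output)`.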
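In some cases, when the documents contain a large number of outlines, you can skip copying them by setting CopyOutlines to false, which improves concatenation performance.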
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_Pages();
PdfFileEditor pfe = new PdfFileEditor();
// Collect only the PDF files in the directory
string[] files = Directory.GetFiles(dataDir, "*.pdf");
// Skip copying outlines to speed up concatenation
pfe.CopyOutlines = false;
pfe.Concatenate(files, dataDir + "CopyOutline_out.pdf");
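When the input list comes from a directory listing, it may help to filter and order it explicitly, since a listing can include non-PDF files and the results of earlier runs. The sketch below is plain .NET and makes no Aspose call. The names `PdfInputs` and `Collect` are hypothetical, and the `_out.pdf` check is an assumption that simply mirrors the output naming used in this article.

```csharp
using System;
using System.IO;
using System.Linq;

public static class PdfInputs
{
    // Returns the PDF files in 'directory', sorted by path (case-insensitive),
    // skipping files that end in "_out.pdf" so earlier outputs are not merged again.
    // Hypothetical helper for illustration; it does not call Aspose.PDF.
    public static string[] Collect(string directory)
    {
        return Directory.GetFiles(directory)
            .Where(path => path.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase))
            .Where(path => !path.EndsWith("_out.pdf", StringComparison.OrdinalIgnoreCase))
            .OrderBy(path => path, StringComparer.OrdinalIgnoreCase)
            .ToArray();
    }
}
```

The returned array can be passed to `pfe.Concatenate(files, outputPath)` after setting `pfe.CopyOutlines = false`.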
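Concatenate Multiple PDF Files Using MemoryStreams
The Concatenate method of the PdfFileEditor class takes the source PDF files and the destination PDF file as parameters. These parameters can be either paths to PDF files on disk or streams such as MemoryStreams. In this example, we first create two file streams to read the PDF files from disk and read their contents into byte arrays. We then wrap the byte arrays in MemoryStreams, pass those streams to the Concatenate method, and merge them into a single output file.
The following C# code snippet shows how to concatenate multiple PDF files using MemoryStreams: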
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Create two file streams to read the PDF files
FileStream fs1 = new FileStream(dataDir + "inFile.pdf", FileMode.Open, FileAccess.Read);
FileStream fs2 = new FileStream(dataDir + "inFile2.pdf", FileMode.Open, FileAccess.Read);
// Create byte arrays to hold the contents of the PDF files
byte[] buffer1 = new byte[Convert.ToInt32(fs1.Length)];
byte[] buffer2 = new byte[Convert.ToInt32(fs2.Length)];
// Read the PDF file contents into the byte arrays
fs1.Read(buffer1, 0, buffer1.Length);
fs2.Read(buffer2, 0, buffer2.Length);
// Close the input files; their contents are now in memory
fs1.Close();
fs2.Close();
// Wrap the byte arrays in MemoryStreams and concatenate those streams
using (MemoryStream pdfStream = new MemoryStream())
using (MemoryStream fileStream1 = new MemoryStream(buffer1))
using (MemoryStream fileStream2 = new MemoryStream(buffer2))
{
    // Create an instance of the PdfFileEditor class to concatenate streams
    PdfFileEditor pdfEditor = new PdfFileEditor();
    // Concatenate both input MemoryStreams and save to the output MemoryStream
    pdfEditor.Concatenate(fileStream1, fileStream2, pdfStream);
    // Convert the output MemoryStream back to a byte array
    byte[] data = pdfStream.ToArray();
    // Write the byte array to the output PDF file
    using (FileStream output = new FileStream(dataDir + "merged_out.pdf", FileMode.Create, FileAccess.Write))
    {
        output.Write(data, 0, data.Length);
    }
}
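`Stream.Read` is allowed to return fewer bytes than requested, so a single call is not guaranteed to fill a buffer. A more defensive way to load a file into memory, sketched here using only `System.IO`, is to copy the file stream into a `MemoryStream`. The helper name `LoadIntoMemory` is hypothetical.

```csharp
using System.IO;

public static class StreamHelpers
{
    // Copies the whole file into a MemoryStream and rewinds it so that a
    // consumer starts reading at the first byte.
    // Hypothetical helper for illustration; it does not call Aspose.PDF.
    public static MemoryStream LoadIntoMemory(string path)
    {
        var memory = new MemoryStream();
        using (FileStream file = File.OpenRead(path))
        {
            file.CopyTo(memory); // CopyTo keeps reading until the end of the file
        }
        memory.Position = 0;
        return memory;
    }
}
```

Two streams loaded this way can serve as the first two arguments of `pdfEditor.Concatenate(fileStream1, fileStream2, pdfStream)`. The caller remains responsible for disposing them.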
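Concatenate an Array of PDF Files Using File Paths
To concatenate more than two PDF files, use the overload of the Concatenate method that accepts an array of PDF file paths. The output is saved as a single file built from all the files in the array. The following C# code snippet shows how to concatenate an array of PDF files using file paths.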
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_Pages();
// Create PdfFileEditor object
PdfFileEditor pdfEditor = new PdfFileEditor();
// Array of files
string[] filesArray = new string[2];
filesArray[0] = dataDir + "input.pdf";
filesArray[1] = dataDir + "input2.pdf";
// Concatenate files
pdfEditor.Concatenate(filesArray, dataDir + "ConcatenateArrayOfFilesWithPath_out.pdf");
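Concatenate an Array of PDF Files Using Streams
Concatenating an array of PDF files is not limited to files on disk. You can also concatenate an array of PDF files supplied as streams by using the appropriate overload of the Concatenate method. First create an array of input streams and a stream for the output PDF, then call the Concatenate method. The result is written to the output stream. The following C# code snippet shows how to concatenate an array of PDF files using streams.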
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_Pages();
// Create PdfFileEditor object
PdfFileEditor pdfEditor = new PdfFileEditor();
// Array of input streams
FileStream[] inputStreams = new FileStream[2];
inputStreams[0] = new FileStream(dataDir + "input.pdf", FileMode.Open);
inputStreams[1] = new FileStream(dataDir + "input2.pdf", FileMode.Open);
// Output stream
using (FileStream outputStream = new FileStream(dataDir + "ConcatenateArrayOfPdfUsingStreams_out.pdf", FileMode.Create))
{
    // Concatenate files
    pdfEditor.Concatenate(inputStreams, outputStream);
}
// Close the input streams
foreach (FileStream inputStream in inputStreams)
{
    inputStream.Close();
}
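Streams opened into an array are easy to leak if an exception is thrown part-way through opening or merging. One possible pattern, sketched in plain .NET, opens every input, hands the array to a callback, and disposes all streams in a `finally` block. The names `StreamArrays` and `WithInputStreams` are hypothetical, and the callback stands in for whatever merge call you use.

```csharp
using System;
using System.IO;

public static class StreamArrays
{
    // Opens every path for reading, hands the streams to 'merge', and
    // disposes all opened streams afterwards, even if 'merge' throws.
    // Hypothetical helper for illustration; it does not call Aspose.PDF.
    public static void WithInputStreams(string[] paths, Action<Stream[]> merge)
    {
        var streams = new Stream[paths.Length];
        try
        {
            for (int i = 0; i < paths.Length; i++)
            {
                streams[i] = new FileStream(paths[i], FileMode.Open, FileAccess.Read);
            }
            merge(streams);
        }
        finally
        {
            foreach (Stream stream in streams)
            {
                // Entries stay null if opening failed before reaching them.
                stream?.Dispose();
            }
        }
    }
}
```

A call might look like `StreamArrays.WithInputStreams(paths, inputs => pdfEditor.Concatenate(inputs, outputStream));`, with `outputStream` managed by its own `using` statement.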
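Concatenate All PDF Files in a Particular Folder
You can even read all the PDF files in a particular folder at runtime and concatenate them without knowing the file names. Simply provide the path of the directory that contains the PDF documents you want to concatenate.
A C# code snippet for this task appears in the section on merging PDF files in a folder at the end of this article.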
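Concatenate PDF Forms and Keep Field Names Unique
The PdfFileEditor class in the Aspose.Pdf.Facades namespace provides the capability to concatenate PDF files. If the PDF files to be concatenated contain form fields with similar field names, Aspose.PDF can keep the fields in the resulting PDF unique, and you can specify the suffix used to make the names unique. Setting the KeepFieldsUnique property of PdfFileEditor to true makes field names unique when PDF forms are concatenated. The UniqueSuffix property of PdfFileEditor specifies a user-defined suffix format that is appended to a field name to make it unique. This string must contain the %NUM% substring, which is replaced with a number in the resulting file.
The following code snippet shows how to do this.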
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Set input and output file paths
string inputFile1 = dataDir + "inFile1.pdf";
string inputFile2 = dataDir + "inFile2.pdf";
string outFile = dataDir + "ConcatenatePDFForms_out.pdf";
// Instantiate PdfFileEditor object
PdfFileEditor fileEditor = new PdfFileEditor();
// Keep a unique field name for every field
fileEditor.KeepFieldsUnique = true;
// Format of the suffix that is added to a field name to make it unique when forms are concatenated
fileEditor.UniqueSuffix = "_%NUM%";
// Concatenate the files into a resultant PDF file
fileEditor.Concatenate(inputFile1, inputFile2, outFile);
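Since the suffix format must contain `%NUM%`, a small guard can catch a malformed suffix before any merging starts. This is only a sketch in plain .NET. The class `UniqueSuffixGuard` and its `Require` method are hypothetical and not part of Aspose.PDF.

```csharp
using System;

public static class UniqueSuffixGuard
{
    private const string Placeholder = "%NUM%";

    // Returns the suffix unchanged if it contains the placeholder,
    // otherwise throws so the mistake is caught before merging.
    // Hypothetical helper for illustration; it does not call Aspose.PDF.
    public static string Require(string suffix)
    {
        if (suffix == null || !suffix.Contains(Placeholder))
        {
            throw new ArgumentException("Suffix must contain " + Placeholder, nameof(suffix));
        }
        return suffix;
    }
}
```

It could be used as `fileEditor.UniqueSuffix = UniqueSuffixGuard.Require("_%NUM%");`.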
Concatenate PDF Files and Create a Table of Contents
Concatenate PDF Files
The following code snippet shows how to concatenate the PDF files.
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Create PdfFileEditor object
PdfFileEditor pdfEditor = new PdfFileEditor();
// Open the input and output streams so that they are disposed after concatenation
using (FileStream input1 = new FileStream(dataDir + "input1.pdf", FileMode.Open))
using (FileStream input2 = new FileStream(dataDir + "input2.pdf", FileMode.Open))
using (FileStream output = new FileStream(dataDir + "ConcatenatePdfFilesAndCreateTOC_out.pdf", FileMode.Create))
{
    // Save concatenated output file
    pdfEditor.Concatenate(input1, input2, output);
}
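Insert a Blank Page
Once the PDF files have been concatenated, we can insert a blank page at the beginning of the document and create the table of contents on it. To do this, load the concatenated file into a Document object and call the Pages.Insert(…) method to insert the blank page.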
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Load the concatenated file
Aspose.Pdf.Document concatenated_pdfDocument = new Aspose.Pdf.Document(new FileStream(dataDir + "Concatenated_Table_Of_Contents.pdf", FileMode.Open));
// Insert an empty page at the beginning to hold the table of contents
concatenated_pdfDocument.Pages.Insert(1);
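Add Text Stamps
To create the table of contents, we add text stamps to the first page using the PdfFileStamp and Stamp objects. The Stamp class provides the BindLogo(...) method to add FormattedText, and the SetOrigin(..) method to specify where the text stamp is placed. In this article we concatenate two PDF files, so we need two text stamp objects, one pointing to each document.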
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Set text stamp to display the string "Table Of Contents"
Aspose.Pdf.Facades.Stamp stamp = new Aspose.Pdf.Facades.Stamp();
stamp.BindLogo(new FormattedText("Table Of Contents", System.Drawing.Color.Maroon, System.Drawing.Color.Transparent, Aspose.Pdf.Facades.FontStyle.Helvetica, EncodingType.Winansi, true, 18));
// Specify the origin of the stamp. The X coordinate is one third of the page width
stamp.SetOrigin((new PdfFileInfo(new FileStream(dataDir + "input1.pdf", FileMode.Open)).GetPageWidth(1) / 3), 700);
// Show the stamp on the first page only
stamp.Pages = new int[] { 1 };
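Create Local Links
Next we need to add links that point to pages inside the concatenated file. To do this, we can use the CreateLocalLink(..) method of the PdfContentEditor class. In the following code snippet, we pass Transparent as the fourth argument so that the rectangle around the link is invisible.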
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Now we need to add links for the documents listed in the table of contents
PdfContentEditor contentEditor = new PdfContentEditor();
// Bind the PDF file in which we added the blank page
contentEditor.BindPdf(new FileStream(dataDir + "Concatenated_Table_Of_Contents.pdf", FileMode.Open));
// Create link for first document
contentEditor.CreateLocalLink(new System.Drawing.Rectangle(150, 650, 100, 20), 2, 1, System.Drawing.Color.Transparent);
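The destination page of each link follows from simple arithmetic: with a one-page table of contents in front, the first document starts on page 2, and every later document starts after the combined page count of the documents before it. The sketch below computes these start pages in plain .NET. The names `TocPages` and `StartPages` are hypothetical, and the page counts are assumed to come from something like `PdfFileInfo.NumberOfPages`.

```csharp
using System;

public static class TocPages
{
    // Given the page count of each source document (in merge order) and the
    // number of pages inserted in front for the table of contents, returns
    // the 1-based page on which each document starts in the merged file.
    // Hypothetical helper for illustration; it does not call Aspose.PDF.
    public static int[] StartPages(int[] pageCounts, int tocPageCount = 1)
    {
        var starts = new int[pageCounts.Length];
        int next = tocPageCount + 1;
        for (int i = 0; i < pageCounts.Length; i++)
        {
            starts[i] = next;
            next += pageCounts[i];
        }
        return starts;
    }
}
```

For two documents of 3 and 5 pages, `TocPages.StartPages(new[] { 3, 5 })` yields 2 and 5, which are the values to pass as the destination page of each `CreateLocalLink` call.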
Complete Code
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Create PdfFileEditor object
PdfFileEditor pdfEditor = new PdfFileEditor();
// Create a MemoryStream object to hold the resultant PDF file
using (MemoryStream Concatenated_Stream = new MemoryStream())
{
    // Concatenate the input files into the MemoryStream
    using (FileStream input1 = new FileStream(dataDir + "input1.pdf", FileMode.Open))
    using (FileStream input2 = new FileStream(dataDir + "input2.pdf", FileMode.Open))
    {
        pdfEditor.Concatenate(input1, input2, Concatenated_Stream);
    }
    // Load the concatenated file
    Aspose.Pdf.Document concatenated_pdfDocument = new Aspose.Pdf.Document(Concatenated_Stream);
    // Insert an empty page at the beginning to hold the table of contents
    concatenated_pdfDocument.Pages.Insert(1);
    // Hold the resultant file with the empty page added
    using (MemoryStream Document_With_BlankPage = new MemoryStream())
    {
        // Save output file
        concatenated_pdfDocument.Save(Document_With_BlankPage);
        using (var Document_with_TOC_Heading = new MemoryStream())
        {
            // Add the table of contents heading as a stamp to the PDF file
            PdfFileStamp fileStamp = new PdfFileStamp();
            // Bind the input file
            fileStamp.BindPdf(Document_With_BlankPage);
            // Set text stamp to display the string "Table Of Contents"
            Aspose.Pdf.Facades.Stamp stamp = new Aspose.Pdf.Facades.Stamp();
            stamp.BindLogo(new FormattedText("Table Of Contents", System.Drawing.Color.Maroon, System.Drawing.Color.Transparent, Aspose.Pdf.Facades.FontStyle.Helvetica, EncodingType.Winansi, true, 18));
            // Specify the origin of the stamp. The X coordinate is one third of the page width
            stamp.SetOrigin((new PdfFileInfo(Document_With_BlankPage).GetPageWidth(1) / 3), 700);
            // Show the stamp on the first page only
            stamp.Pages = new int[] { 1 };
            // Add stamp to PDF file
            fileStamp.AddStamp(stamp);
            // Create stamp text for the first item in the table of contents
            var Document1_Link = new Aspose.Pdf.Facades.Stamp();
            Document1_Link.BindLogo(new FormattedText("1 - Link to Document 1", System.Drawing.Color.Black, System.Drawing.Color.Transparent, Aspose.Pdf.Facades.FontStyle.Helvetica, EncodingType.Winansi, true, 12));
            // Specify the origin of the stamp
            Document1_Link.SetOrigin(150, 650);
            // Set the pages on which the stamp should be displayed
            Document1_Link.Pages = new int[] { 1 };
            // Add stamp to PDF file
            fileStamp.AddStamp(Document1_Link);
            // Create stamp text for the second item in the table of contents
            var Document2_Link = new Aspose.Pdf.Facades.Stamp();
            Document2_Link.BindLogo(new FormattedText("2 - Link to Document 2", System.Drawing.Color.Black, System.Drawing.Color.Transparent, Aspose.Pdf.Facades.FontStyle.Helvetica, EncodingType.Winansi, true, 12));
            // Specify the origin of the stamp
            Document2_Link.SetOrigin(150, 620);
            // Set the pages on which the stamp should be displayed
            Document2_Link.Pages = new int[] { 1 };
            // Add stamp to PDF file
            fileStamp.AddStamp(Document2_Link);
            // Save updated PDF file
            fileStamp.Save(Document_with_TOC_Heading);
            fileStamp.Close();
            // Now we need to add links for the documents listed in the table of contents
            PdfContentEditor contentEditor = new PdfContentEditor();
            // Bind the PDF file that now contains the table of contents heading
            contentEditor.BindPdf(Document_with_TOC_Heading);
            // Create link for the first document, which starts on page 2
            contentEditor.CreateLocalLink(new System.Drawing.Rectangle(150, 650, 100, 20), 2, 1, System.Drawing.Color.Transparent);
            // Create link for the second document.
            // PdfFileInfo.NumberOfPages returns the page count of the first document.
            // We add 2: one for the table of contents page and one to reach the first page after the first document.
            contentEditor.CreateLocalLink(new System.Drawing.Rectangle(150, 620, 100, 20), new PdfFileInfo(dataDir + "input1.pdf").NumberOfPages + 2, 1, System.Drawing.Color.Transparent);
            // Save updated PDF
            contentEditor.Save(dataDir + "Concatenated_Table_Of_Contents.pdf");
        }
    }
}
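Concatenate PDF Files in a Folder
The PdfFileEditor class in the Aspose.Pdf.Facades namespace provides the capability to concatenate PDF files. You can even read all the PDF files in a particular folder at runtime and concatenate them without knowing the file names. Simply provide the path of the directory that contains the PDF documents you want to concatenate.
Try the following C# code snippet to do this with Aspose.PDF: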
// The path to the documents directory.
string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Retrieve the names of all the PDF files in the directory
string[] fileEntries = Directory.GetFiles(dataDir, "*.pdf");
// Get the current system date and format it
string date = DateTime.Now.ToString("MM-dd-yyyy");
// Get the current system time (24-hour clock, hours and minutes) and format it
string hoursMinutes = DateTime.Now.ToString("HH-mm");
// Set the name of the final resultant PDF document
string masterFileName = date + "_" + hoursMinutes + "_out.pdf";
// Instantiate PdfFileEditor object
Aspose.Pdf.Facades.PdfFileEditor pdfEditor = new PdfFileEditor();
// Call the Concatenate method of the PdfFileEditor object to concatenate
// all input files into a single output file
pdfEditor.Concatenate(fileEntries, dataDir + masterFileName);
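A timestamped output name is easier to reason about when it is built in one place from an explicit time value. In .NET format strings, lowercase `hh` is the 12-hour clock and uppercase `HH` the 24-hour clock, so `hh` produces the same name twelve hours apart. The following sketch, in plain .NET, builds the name from a supplied `DateTime`. The names `OutputNames` and `Build` are hypothetical.

```csharp
using System;
using System.Globalization;

public static class OutputNames
{
    // Builds a name such as "03-05-2024_14-07_out.pdf" from the given time,
    // using the 24-hour clock and the invariant culture so that the result
    // does not depend on the machine's regional settings.
    // Hypothetical helper for illustration; it does not call Aspose.PDF.
    public static string Build(DateTime now)
    {
        string date = now.ToString("MM-dd-yyyy", CultureInfo.InvariantCulture);
        string time = now.ToString("HH-mm", CultureInfo.InvariantCulture);
        return date + "_" + time + "_out.pdf";
    }
}
```

Calling `OutputNames.Build(DateTime.Now)` gives a file name that can be appended to the data directory path and passed as the output argument of `Concatenate`.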