Convert PDF to EPUB, LaTeX, Text, XPS in C#
Convert PDF to EPUB
Try converting PDF to EPUB online
Aspose.PDF for .NET offers a free online application, “PDF to EPUB”, where you can try the functionality and check the quality of the conversion.
EPUB is a free and open e-book standard from the International Digital Publishing Forum (IDPF). Files have the extension .epub. EPUB is designed for reflowable content, meaning that an EPUB reader can optimize text for a particular display device. EPUB also supports fixed-layout content. The format is intended as a single format that publishers and conversion houses can use in-house, as well as for distribution and sale. It supersedes the Open eBook standard.
The following code snippet also works with the Aspose.PDF.Drawing library.
Aspose.PDF for .NET supports converting PDF documents to EPUB format. It provides a class named EpubSaveOptions, which is passed as the second argument to the Document.Save(..) method to generate an EPUB file.
The following C# code snippet converts a PDF document to EPUB.
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ConvertPDFtoEPUB()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion();
    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "PDFToEPUB.pdf"))
    {
        // Instantiate EPUB save options
        var options = new Aspose.Pdf.EpubSaveOptions();
        // Specify the layout for contents (Flow produces reflowable text)
        options.ContentRecognitionMode = Aspose.Pdf.EpubSaveOptions.RecognitionMode.Flow;
        // Save EPUB document
        document.Save(dataDir + "PDFToEPUB_out.epub", options);
    }
}
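The same calls can be applied to a whole folder of PDF files. The following is a minimal sketch, not part of the official examples: the class name EpubBatchSketch, the helper GetEpubPath, and the folder arguments are hypothetical, and the Aspose.PDF package is assumed to be referenced by the project.

```csharp
using System.IO;
using Aspose.Pdf;

public static class EpubBatchSketch
{
    // Hypothetical helper: maps "<any dir>/book.pdf" to "<outputDir>/book.epub"
    public static string GetEpubPath(string pdfPath, string outputDir)
    {
        var name = Path.GetFileNameWithoutExtension(pdfPath) + ".epub";
        return Path.Combine(outputDir, name);
    }

    // Converts every *.pdf file found directly in inputDir (no recursion)
    public static void ConvertFolder(string inputDir, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        foreach (var pdfPath in Directory.EnumerateFiles(inputDir, "*.pdf"))
        {
            using (var document = new Document(pdfPath))
            {
                // Same options as in the single-file snippet
                var options = new EpubSaveOptions
                {
                    ContentRecognitionMode = EpubSaveOptions.RecognitionMode.Flow
                };
                document.Save(GetEpubPath(pdfPath, outputDir), options);
            }
        }
    }
}
```

Each document is opened in its own using block so that its resources are released before the next file is processed.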
Convert PDF to LaTeX/TeX
Aspose.PDF for .NET supports converting PDF to LaTeX/TeX. LaTeX is a plain-text file format with special markup, used in the TeX-based document preparation system for high-quality typesetting.
Try converting PDF to LaTeX/TeX online
Aspose.PDF for .NET offers a free online application, “PDF to LaTeX”, where you can try the functionality and check the quality of the conversion.
To convert PDF files to TeX, Aspose.PDF provides the LaTeXSaveOptions class, whose OutDirectoryPath property sets the directory for saving temporary images during the conversion process.
The following code snippet shows the process of converting PDF files into the TeX format with C#.
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ConvertPDFtoTeX()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion();
    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "PDFToTeX.pdf"))
    {
        // Instantiate LaTeX save options
        var saveOptions = new Aspose.Pdf.LaTeXSaveOptions();
        // Specify the output directory
        string pathToOutputDirectory = dataDir;
        // Set the output directory path on the save options object
        saveOptions.OutDirectoryPath = pathToOutputDirectory;
        // Save PDF document in LaTeX format
        document.Save(dataDir + "PDFToTeX_out.tex", saveOptions);
    }
}
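Because OutDirectoryPath receives the images produced during conversion, it can be convenient to give each conversion its own directory. The following is a minimal sketch under that assumption: the class TeXConversionSketch, the helper BuildTexPath, and the directory layout are hypothetical, and only the LaTeXSaveOptions.OutDirectoryPath property and Document.Save call come from the snippet above.

```csharp
using System.IO;
using Aspose.Pdf;

public static class TeXConversionSketch
{
    // Hypothetical helper: maps "<any dir>/paper.pdf" to "<outputDir>/paper.tex"
    public static string BuildTexPath(string pdfPath, string outputDir)
    {
        return Path.Combine(outputDir, Path.GetFileNameWithoutExtension(pdfPath) + ".tex");
    }

    // Converts one PDF and keeps the .tex file and its images in the same directory
    public static string ConvertToTeX(string pdfPath, string outputDir)
    {
        // Create the directory up front so the converter has a place to write images
        Directory.CreateDirectory(outputDir);
        var texPath = BuildTexPath(pdfPath, outputDir);
        using (var document = new Document(pdfPath))
        {
            var saveOptions = new LaTeXSaveOptions { OutDirectoryPath = outputDir };
            document.Save(texPath, saveOptions);
        }
        return texPath;
    }
}
```

Keeping the .tex file next to its images is a design choice for this sketch, not a requirement of the library.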
Convert PDF to Text
Aspose.PDF for .NET supports converting a whole PDF document or individual pages to a text file.
Convert whole PDF document to Text file
You can convert a PDF document to a TXT file using the Visit method of the TextAbsorber class.
The following code snippet shows how to extract the text from all pages.
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ConvertPDFtoTXT()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion();
    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "demo.pdf"))
    {
        // Collect the text of every page in the document
        var ta = new Aspose.Pdf.Text.TextAbsorber();
        ta.Visit(document);
        // Save the extracted text in a text file
        System.IO.File.WriteAllText(dataDir + "input_Text_Extracted_out.txt", ta.Text);
    }
}
Try converting PDF to Text online
Aspose.PDF for .NET offers a free online application, “PDF to Text”, where you can try the functionality and check the quality of the conversion.
Convert PDF page to text file
You can also convert selected pages of a PDF document to a TXT file with Aspose.PDF for .NET. To do this, call the Visit method of the TextAbsorber class on each page you need.
The following code snippet shows how to extract the text from particular pages.
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ConvertPDFtoTXT()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion();
    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "demo.pdf"))
    {
        var ta = new Aspose.Pdf.Text.TextAbsorber();
        // Page numbers are 1-based
        var pages = new[] { 1, 3, 4 };
        foreach (var page in pages)
        {
            // The absorber accumulates the text of each visited page
            ta.Visit(document.Pages[page]);
        }
        // Save the extracted text in a text file
        System.IO.File.WriteAllText(dataDir + "input_Text_Extracted_out.txt", ta.Text);
    }
}
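The snippet above gathers the text of all selected pages into one file. If each page should go to its own file instead, one option is to create a new TextAbsorber per page. The following is a minimal sketch under that assumption: the class PageTextSketch, the helper ValidPages, and the page_N.txt naming scheme are hypothetical, and the Aspose.PDF package is assumed to be referenced.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public static class PageTextSketch
{
    // Hypothetical helper: keeps only the 1-based page numbers that exist
    // in a document with pageCount pages, dropping duplicates
    public static List<int> ValidPages(IEnumerable<int> requested, int pageCount)
    {
        return requested.Where(p => p >= 1 && p <= pageCount).Distinct().ToList();
    }

    // Writes the text of each requested page to <outputDir>/page_<number>.txt
    public static void SavePagesSeparately(string pdfPath, string outputDir, IEnumerable<int> pages)
    {
        Directory.CreateDirectory(outputDir);
        using (var document = new Document(pdfPath))
        {
            foreach (var pageNumber in ValidPages(pages, document.Pages.Count))
            {
                // A new absorber per page keeps each page's text separate
                var absorber = new TextAbsorber();
                absorber.Visit(document.Pages[pageNumber]);
                File.WriteAllText(Path.Combine(outputDir, $"page_{pageNumber}.txt"), absorber.Text);
            }
        }
    }
}
```

Filtering the page numbers first avoids indexing a page that does not exist, for example when page 4 is requested from a three-page document.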
Convert PDF to XPS
Aspose.PDF for .NET can convert PDF files to XPS format. The C# code snippet in this section shows how.
Try converting PDF to XPS online
Aspose.PDF for .NET offers a free online application, “PDF to XPS”, where you can try the functionality and check the quality of the conversion.
The XPS file type is primarily associated with the XML Paper Specification by Microsoft Corporation. The XML Paper Specification (XPS), formerly codenamed Metro and subsuming the Next Generation Print Path (NGPP) marketing concept, is Microsoft’s initiative to integrate document creation and viewing into the Windows operating system.
To convert PDF files to XPS, Aspose.PDF provides the XpsSaveOptions class, which is passed as the second argument to the Document.Save(..) method to generate the XPS file.
Since the 24.2 release, Aspose.PDF can convert a searchable PDF to XPS while keeping the text selectable in the resulting XPS file. To preserve the text, set the XpsSaveOptions.SaveTransparentTexts property to true.
The following code snippet shows the process of converting a PDF file into XPS format.
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ConvertPDFtoXPS()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion();
    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "demo.pdf"))
    {
        // Keep the text of a searchable PDF selectable in the XPS output
        var xpsOptions = new Aspose.Pdf.XpsSaveOptions
        {
            SaveTransparentTexts = true
        };
        // Save XPS document
        document.Save(dataDir + "PDFtoXPS_out.xps", xpsOptions);
    }
}
Convert PDF to Markdown
Aspose.PDF for .NET can convert PDF files to Markdown (MD) format. The C# code snippet in this section shows how.
Markdown is a lightweight markup language that formats plain text so that it stays easy for people to read and easy for software to convert into richer publishing formats.
Optimize image usage by PDF to Markdown converter
You may notice that the directories with extracted images contain fewer images than the PDF files themselves.
Markdown syntax cannot set an image size, so without the MarkdownSaveOptions.UseImageHtmlTag option, copies of the same picture at different sizes are saved as separate images.
With MarkdownSaveOptions.UseImageHtmlTag enabled, only unique images are saved, and each one is scaled in the document through the img tag.
The code opens a PDF document, configures the parameters for converting it to a Markdown file (saving any images in the folder named “images”), and saves the resulting Markdown file to the specified output path.
The following code snippet shows the process of converting a PDF file into MD format.
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ConvertPDFtoMarkup()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion();
    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "demo.pdf"))
    {
        // Create an instance of MarkdownSaveOptions to configure the Markdown export settings
        var saveOptions = new Aspose.Pdf.MarkdownSaveOptions()
        {
            // Set to false to prevent the use of HTML <img> tags for images in the Markdown output
            UseImageHtmlTag = false
        };
        // Specify the directory name where resources (like images) will be stored
        saveOptions.ResourcesDirectoryName = "images";
        // Save PDF document in Markdown format to the specified output file path using the defined save options
        document.Save(dataDir + "PDFtoMarkup_out.md", saveOptions);
    }
}
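The snippet above sets UseImageHtmlTag to false. For comparison, the following is a minimal sketch of the enabled variant described earlier, in which unique images are saved once and scaled through img tags. The class MarkdownImagesSketch, the helper BuildOptions, and the path arguments are hypothetical; the two option properties are the ones used in the snippet above.

```csharp
using Aspose.Pdf;

public static class MarkdownImagesSketch
{
    // Hypothetical helper: builds save options with or without HTML <img> tags
    public static MarkdownSaveOptions BuildOptions(bool useImgTags)
    {
        return new MarkdownSaveOptions
        {
            // true: images are referenced through <img> tags and can be scaled
            UseImageHtmlTag = useImgTags,
            // Images are written to a folder with this name
            ResourcesDirectoryName = "images"
        };
    }

    // Converts one PDF to Markdown with <img> tags enabled
    public static void ConvertWithImgTags(string pdfPath, string markdownPath)
    {
        using (var document = new Document(pdfPath))
        {
            document.Save(markdownPath, BuildOptions(true));
        }
    }
}
```

Running both variants on the same PDF and comparing the contents of the images folder is a simple way to see the effect of the option.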
Convert PDF to MobiXml
MobiXML is a popular eBook format designed to be used on mobile platforms. The following code snippet shows how to convert a PDF document to a MobiXML file.
// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ConvertPdfToMobiXml()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion();
    // Open PDF document
    using (var document = new Aspose.Pdf.Document(dataDir + "PDFToXML.pdf"))
    {
        // Save PDF document in MobiXML format
        document.Save(dataDir + "PDFToXML_out.xml", Aspose.Pdf.SaveFormat.MobiXml);
    }
}