Convert other file formats to PDF in Python

Overview

This article explains how to convert several other file formats to PDF using Python. It covers the following topics.

Format: EPUB

Format: Markdown

Format: MD

Format: PCL

Format: Text

Format: TXT

Format: XPS

Convert EPUB to PDF

Aspose.PDF for Python via .NET allows you to convert EPUB files to PDF format.

EPUB (short for electronic publication) is a free and open e-book standard from the International Digital Publishing Forum (IDPF). Files have the extension .epub. EPUB is designed for reflowable content, meaning that an EPUB reader can optimize text for a particular display device.

EPUB also supports fixed-layout content. The format is intended as a single format that publishers and conversion houses can use in-house, as well as for distribution and sale. It supersedes the Open eBook standard. EPUB 3 is also endorsed for the packaging of content by the Book Industry Study Group (BISG), a leading book trade association for standardized best practices, research, information and events.

Steps to convert EPUB to PDF in Python:

  1. Load EPUB Document
  2. Convert EPUB to PDF
  3. Print Confirmation

The following code snippet shows how to convert EPUB files to PDF format with Python.


    from os import path

    import aspose.pdf as apdf


    def convert_epub_to_pdf(data_dir, infile, outfile):
        """Convert an EPUB file in data_dir to PDF."""
        path_infile = path.join(data_dir, infile)
        path_outfile = path.join(data_dir, "python", outfile)

        # EpubLoadOptions tells Document to read the input as EPUB
        load_options = apdf.EpubLoadOptions()
        document = apdf.Document(path_infile, load_options)

        # Save the loaded document as PDF
        document.save(path_outfile)
        print(infile + " converted into " + outfile)
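
The snippet above writes its result into a `python` subfolder of the data directory. The following is a minimal sketch of that path handling, assuming the subfolder may not exist yet and that the output should reuse the input's base name. `build_output_path` is a hypothetical helper for illustration, not part of Aspose.PDF.

```python
import os
from os import path


def build_output_path(data_dir, infile, subfolder="python"):
    """Return <data_dir>/<subfolder>/<input base name>.pdf, creating the folder."""
    out_dir = path.join(data_dir, subfolder)
    # Create the output folder up front in case the save call does not create it
    os.makedirs(out_dir, exist_ok=True)
    # Swap the input extension (.epub, .md, ...) for .pdf
    stem, _ext = path.splitext(path.basename(infile))
    return path.join(out_dir, stem + ".pdf")
```

The returned path can be passed to `document.save()` in place of a hand-built `path_outfile`.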

Convert Markdown to PDF

This feature is supported by version 19.6 or greater.

Aspose.PDF for Python via .NET converts Markdown files to PDF, which makes documents easier to share and print while preserving their formatting.

The following code snippet shows how to use this functionality with the Aspose.PDF library:


    from os import path

    import aspose.pdf as apdf


    def convert_markdown_to_pdf(data_dir, infile, outfile):
        """Convert a Markdown file in data_dir to PDF."""
        path_infile = path.join(data_dir, infile)
        path_outfile = path.join(data_dir, "python", outfile)

        # MdLoadOptions tells Document to read the input as Markdown
        load_options = apdf.MdLoadOptions()
        document = apdf.Document(path_infile, load_options)
        document.save(path_outfile)
        print(infile + " converted into " + outfile)

Convert PCL to PDF

PCL (Printer Command Language) is a Hewlett-Packard printer language developed to access standard printer features. PCL levels 1 through 5e/5c are command-based languages using control sequences that are processed and interpreted in the order they are received. At a consumer level, PCL data streams are generated by a print driver. PCL output can also be easily generated by custom applications.

To convert PCL to PDF, Aspose.PDF provides the PclLoadOptions class, which is used to initialize the LoadOptions object. This object is then passed as an argument when the Document object is initialized, and it helps the PDF rendering engine determine the input format of the source document.

The following code snippet shows the process of converting a PCL file into PDF format.


    from os import path

    import aspose.pdf as apdf


    def convert_pcl_to_pdf(data_dir, infile, outfile):
        """Convert a PCL file in data_dir to PDF."""
        path_infile = path.join(data_dir, infile)
        path_outfile = path.join(data_dir, "python", outfile)

        load_options = apdf.PclLoadOptions()
        # Suppress errors encountered while loading the PCL file
        load_options.supress_errors = True

        document = apdf.Document(path_infile, load_options)
        document.save(path_outfile)

        print(infile + " converted into " + outfile)

Convert Text to PDF

Aspose.PDF for Python via .NET supports converting plain text and pre-formatted text files to PDF format.

Converting text to PDF means adding text fragments to the PDF page. Text files contain one of two types of text: pre-formatted text (for example, 25 lines with 80 characters per line) and non-formatted text (plain text). Depending on our needs, we can control how the text is added ourselves or entrust it to the library’s algorithms.


    from os import path

    import aspose.pdf as apdf


    def add_text_page(document, font):
        """Add a page with the margins and monospace font used for text output."""
        page = document.pages.add()
        page.page_info.margin.left = 20
        page.page_info.margin.right = 10
        page.page_info.default_text_state.font = font
        page.page_info.default_text_state.font_size = 12
        return page


    def convert_text_to_pdf(data_dir, infile, outfile):
        """Convert a plain or pre-formatted text file in data_dir to PDF."""
        path_infile = path.join(data_dir, infile)
        path_outfile = path.join(data_dir, "python", outfile)

        with open(path_infile, "r") as file:
            lines = file.readlines()

        monospace_font = apdf.text.FontRepository.find_font("Courier New")

        document = apdf.Document()
        page = add_text_page(document, monospace_font)

        for line in lines:
            if line.startswith("\x0c"):
                # A form feed at the start of a line begins a new page
                page = add_text_page(document, monospace_font)
            else:
                page.paragraphs.add(apdf.text.TextFragment(line))

        document.save(path_outfile)

        print(infile + " converted into " + outfile)
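
When we control the layout ourselves, the page-splitting logic can be separated from the PDF calls. The following is a rough sketch, assuming pages of 25 lines with 80 characters per line as in the example dimensions above, and assuming a form feed anywhere in the text forces a page break. `paginate` is a hypothetical helper, not part of Aspose.PDF, and it cuts long lines at the column limit rather than wrapping on word boundaries.

```python
def paginate(text, width=80, lines_per_page=25):
    """Split text into pages; each page is a list of lines."""
    pages = []
    # A form feed (\x0c) forces a page break in pre-formatted text
    for chunk in text.split("\x0c"):
        lines = []
        for raw in chunk.splitlines():
            # Cut lines longer than `width`; keep empty lines as empty strings
            pieces = [raw[i:i + width] for i in range(0, len(raw), width)]
            lines.extend(pieces or [""])
        # Cut the chunk into pages of at most `lines_per_page` lines
        for start in range(0, max(len(lines), 1), lines_per_page):
            pages.append(lines[start:start + lines_per_page])
    return pages
```

Each inner list can then be written to its own page by adding one text fragment per line, as the loop in the conversion snippet does.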

Convert XPS to PDF

Aspose.PDF for Python via .NET supports converting XPS files to PDF format. This section shows how to do it.

The XPS file type is primarily associated with the XML Paper Specification by Microsoft Corporation. The XML Paper Specification (XPS), formerly codenamed Metro and subsuming the Next Generation Print Path (NGPP) marketing concept, is Microsoft’s initiative to integrate document creation and viewing into its Windows operating system.

The following code snippet shows the process of converting an XPS file into PDF format with Python.


    from os import path

    import aspose.pdf as apdf


    def convert_xps_to_pdf(data_dir, infile, outfile):
        """Convert an XPS file in data_dir to PDF."""
        path_infile = path.join(data_dir, infile)
        path_outfile = path.join(data_dir, "python", outfile)

        # XpsLoadOptions tells Document to read the input as XPS
        load_options = apdf.XpsLoadOptions()
        document = apdf.Document(path_infile, load_options)
        document.save(path_outfile)

        print(infile + " converted into " + outfile)

Convert PostScript to PDF

Aspose.PDF for Python via .NET supports converting PostScript files to PDF format. One of its features is that you can specify a set of font folders to be used during conversion.

The following code snippet can be used to convert a PostScript file into PDF format with Aspose.PDF for Python via .NET:


    from os import path

    import aspose.pdf as apdf


    def convert_ps_to_pdf(data_dir, infile, outfile):
        """Convert a PostScript file in data_dir to PDF."""
        path_infile = path.join(data_dir, infile)
        path_outfile = path.join(data_dir, "python", outfile)

        # PsLoadOptions tells Document to read the input as PostScript
        load_options = apdf.PsLoadOptions()
        document = apdf.Document(path_infile, load_options)
        document.save(path_outfile)

        print(infile + " converted into " + outfile)

Convert XML to PDF

The XML format is used to store structured data. There are several ways to convert XML to PDF in Aspose.PDF. The example in this section passes an XSLT template to XmlLoadOptions.

The following code snippet can be used to convert an XML file to PDF format with Aspose.PDF for Python:


    from os import path

    import aspose.pdf as apdf


    def convert_xml_to_pdf(data_dir, infile, outfile):
        """Convert an XML file in data_dir to PDF using an XSLT template."""
        path_infile = path.join(data_dir, infile)
        path_outfile = path.join(data_dir, "python", outfile)

        # "template.xslt" is the XSLT file applied to the XML input
        load_options = apdf.XmlLoadOptions("template.xslt")
        document = apdf.Document(path_infile, load_options)
        document.save(path_outfile)

        print(infile + " converted into " + outfile)

Convert XSL-FO to PDF

The following code snippet can be used to convert an XSL-FO file to PDF format with Aspose.PDF for Python via .NET:


    from os import path

    import aspose.pdf as apdf


    def convert_xslfo_to_pdf(data_dir, xsltfile, xmlfile, outfile):
        """Convert an XML file in data_dir to PDF using an XSL-FO stylesheet."""
        path_xsltfile = path.join(data_dir, xsltfile)
        path_xmlfile = path.join(data_dir, xmlfile)
        path_outfile = path.join(data_dir, "python", outfile)

        load_options = apdf.XslFoLoadOptions(path_xsltfile)
        # Raise an exception as soon as a parsing error is encountered
        load_options.parsing_errors_handling_type = apdf.XslFoLoadOptions.ParsingErrorsHandlingTypes.ThrowExceptionImmediately
        document = apdf.Document(path_xmlfile, load_options)
        document.save(path_outfile)

        print(xmlfile + " converted into " + outfile)
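
Because the example above asks the loader to throw on the first parsing error, a caller that converts several files may want to record failures and continue. The following is a minimal sketch under that assumption. `convert_all` is a hypothetical helper, and it catches `Exception` broadly because this article does not name the exception type the library raises.

```python
def convert_all(files, convert):
    """Call convert(infile) for each file; return (converted, failed)."""
    converted, failed = [], []
    for infile in files:
        try:
            convert(infile)
            converted.append(infile)
        except Exception as exc:  # broad catch: the exception type is an unknown here
            # Keep the file name and message so failures can be reported later
            failed.append((infile, str(exc)))
    return converted, failed
```

Here `convert` is any one-argument callable that wraps the conversion steps shown above for a single input file.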

Convert LaTeX/TeX to PDF

The LaTeX file format is a text file format whose markup is written in LaTeX, a member of the TeX family of languages. LaTeX (pronounced "lay-tek" or "lah-tek") is a document preparation system and document markup language. It is widely used for the communication and publication of scientific documents in many fields, including mathematics, physics, and computer science. It also has a prominent role in the preparation and publication of books and articles that contain complex multilingual materials, such as Sanskrit and Arabic, including critical editions. LaTeX uses the TeX typesetting program for formatting its output, and is itself written in the TeX macro language.

The following code snippet shows the process of converting a LaTeX file to PDF format with Python.


    from os import path

    import aspose.pdf as apdf


    def convert_latex_to_pdf(data_dir, infile, outfile):
        """Convert a LaTeX/TeX file in data_dir to PDF."""
        path_infile = path.join(data_dir, infile)
        path_outfile = path.join(data_dir, "python", outfile)

        # LatexLoadOptions tells Document to read the input as LaTeX
        load_options = apdf.LatexLoadOptions()
        document = apdf.Document(path_infile, load_options)
        document.save(path_outfile)

        print(infile + " converted into " + outfile)
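
Most of the conversions in this article differ only in the load-options class passed to `Document`. The following sketch collects those classes into one lookup keyed by file extension. The extension mapping and both function names are assumptions made for illustration, while the class names are the ones used in the snippets above. XML and XSL-FO are left out because their load options take an XSLT file argument, and plain text is left out because it is built page by page rather than loaded.

```python
from os import path

# File extension -> name of the load-options class used in this article
# (the extensions themselves are assumed, not taken from the library)
LOAD_OPTIONS_BY_EXT = {
    ".epub": "EpubLoadOptions",
    ".md": "MdLoadOptions",
    ".pcl": "PclLoadOptions",
    ".xps": "XpsLoadOptions",
    ".ps": "PsLoadOptions",
    ".tex": "LatexLoadOptions",
}


def load_options_name(infile):
    """Return the load-options class name for infile, based on its extension."""
    ext = path.splitext(infile)[1].lower()
    if ext not in LOAD_OPTIONS_BY_EXT:
        raise ValueError("no load options known for " + repr(ext))
    return LOAD_OPTIONS_BY_EXT[ext]


def convert_to_pdf(infile, outfile):
    """Load infile with the matching load options and save it as PDF."""
    # Imported lazily so the lookup above works without Aspose.PDF installed
    import aspose.pdf as apdf

    load_options = getattr(apdf, load_options_name(infile))()
    document = apdf.Document(infile, load_options)
    document.save(outfile)
```

Keeping the lookup separate from the conversion makes it possible to reject unsupported inputs before any file is opened.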