Convert HTML to PDF in Python

A PDF file is a fixed-layout document that contains the text, graphics, hyperlinks, buttons, form fields, multimedia, and other information needed to display it. PDFs support password protection, encryption, and digital signatures to safeguard sensitive information. They can be viewed on virtually any device with a PDF reader, which most browsers and operating systems include. PDFs also support compression, which keeps file sizes small for sharing and storage.

This guide explains how to convert an HTML document to Portable Document Format (PDF) using Aspose.HTML for Python via .NET. It covers in detail how to convert HTML to PDF using the convert_html() methods of the Converter class and how to apply PdfSaveOptions. You can also try the Online HTML Converter to test the Aspose.HTML functionality and convert HTML on the fly.

To follow this tutorial, install and configure Aspose.HTML for Python via .NET in your Python project. The code examples show how to convert HTML to PDF and generate PDF files using the library.

HTML to PDF with a single line of code

The methods of the Converter class are the easiest way to convert HTML code into various formats. You can convert HTML to PDF in your Python application with a single line of code:

from aspose.html import *
from aspose.html.converters import *
from aspose.html.saving import *

# Convert HTML to PDF
Converter.convert_html("document.html", PdfSaveOptions(), "document.pdf")
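In a script, the output name is often derived from the input name rather than hard-coded. The following is a minimal sketch of that idea, not part of the library: `pdf_path_for` and `convert_file` are hypothetical helper names, and the Aspose.HTML imports are deferred into the function so that the path helper can be used on its own.

```python
from pathlib import Path


def pdf_path_for(html_path: str) -> Path:
    """Return the input path with its extension replaced by .pdf."""
    return Path(html_path).with_suffix(".pdf")


def convert_file(html_path: str) -> Path:
    """Convert one HTML file to a PDF next to the source (hypothetical wrapper)."""
    # Imported here so that pdf_path_for() works without Aspose.HTML installed.
    from aspose.html.converters import Converter
    from aspose.html.saving import PdfSaveOptions

    target = pdf_path_for(html_path)
    # Same call as in the one-line example: source path, save options, output path.
    Converter.convert_html(html_path, PdfSaveOptions(), str(target))
    return target
```

Calling `convert_file("document.html")` would then be expected to write `document.pdf` beside the source file.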

Online HTML Converter

You can test Aspose.HTML for Python via .NET and perform HTML conversion in real time. Load an HTML file from your local file system or a URL, select the desired output format, and run the provided code example. The example uses the default save options, which keeps the conversion simple. Once it completes, you receive the converted file in the format of your choice.

Convert HTML to PDF using PdfSaveOptions

With Aspose.HTML for Python via .NET, you can convert files programmatically with full control over a wide range of conversion parameters. To convert HTML to PDF with custom PdfSaveOptions, follow these steps:

1. Load an HTML file using one of the HTMLDocument() constructors of the HTMLDocument class. In the example below, the HTML document is initialized from a file.
2. Create a new PdfSaveOptions object and specify the required properties. The PdfSaveOptions class provides numerous properties that control how the HTML is rendered to PDF.
3. Call the convert_html() method of the Converter class. In the example below, the HTMLDocument, the PdfSaveOptions, and the output file path are passed to convert_html().

The following Python code example shows how to use PdfSaveOptions and create a PDF file with custom save options:

import os
from aspose.html import *
from aspose.html.converters import *
from aspose.html.saving import *
from aspose.html.drawing import *
from aspose.html.rendering import MediaType
from aspose.html.rendering.pdf import *

# Setup directories and define paths
output_dir = "output/"
input_dir = "data/"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

document_path = os.path.join(input_dir, "aspose.html")
save_path = os.path.join(output_dir, "aspose-output.pdf")

# Initialize an HTML document from the file
document = HTMLDocument(document_path)

# Initialize PdfSaveOptions
options = PdfSaveOptions()
# Use a 680 x 500 page with a margin of 10 on every side
options.page_setup.any_page = Page(Size(680, 500), Margin(10, 10, 10, 10))
# Apply the CSS rules written for the "print" media type
options.css.media_type = MediaType.PRINT

# Convert HTML to PDF
Converter.convert_html(document, options, save_path)

This example converts an HTML document to a PDF file using custom save options. The process involves initializing the HTML document, setting save options such as the page size and the CSS media type, and then performing the conversion. The resulting PDF file is saved to the specified output directory.
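The page size in the example is given as plain numbers, Size(680, 500). Assuming these are interpreted as CSS pixels at 96 pixels per inch (an assumption about the default unit that should be checked against the API reference), a paper size known in millimetres can be converted with a little arithmetic. The helper names below are hypothetical and do not come from the library.

```python
CSS_PX_PER_INCH = 96.0  # CSS reference pixel density (assumed default unit)
MM_PER_INCH = 25.4


def mm_to_px(mm: float) -> float:
    """Convert millimetres to CSS pixels."""
    return mm / MM_PER_INCH * CSS_PX_PER_INCH


def page_size_px(width_mm: float, height_mm: float) -> tuple:
    """Return (width, height) in whole pixels for a page given in millimetres."""
    return round(mm_to_px(width_mm)), round(mm_to_px(height_mm))


# A4 is 210 mm x 297 mm
a4_width, a4_height = page_size_px(210, 297)
print(a4_width, a4_height)  # prints "794 1123"
```

Under that assumption, the resulting pair could be passed to Size() in place of the hard-coded 680 and 500.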

You can evaluate the quality of the conversion by trying the product. The following figure shows the result of converting the aspose.html file to PDF format:

[Figure: the aspose.html file converted to PDF]

PdfSaveOptions Class

Aspose.HTML for Python via .NET provides the PdfSaveOptions class, which gives you more control over how documents are saved in PDF format. Some of its properties are inherited from base classes such as PdfRenderingOptions or RenderingOptions. PdfSaveOptions lets you customize the rendering process: you can specify the page size, margins, file permissions, CSS handling, and so on. The properties available in PdfSaveOptions are described below:

page_setup – This property provides access to a PageSetup object used to configure the layout and settings of the output PDF pages to fit specific printing or display requirements.
horizontal_resolution – This property controls the horizontal resolution for both internal images used during processing and any external images included in the HTML. By default, it is set to 300 dpi.
vertical_resolution – Similar to horizontal_resolution, this property manages the vertical resolution for internal and external images during PDF generation. Like its horizontal counterpart, it defaults to 300 dpi.
background_color – This property sets or retrieves the background color that fills each PDF document page. The default value is transparent, but this can be customized to suit branding or aesthetic preferences, ensuring consistency across all pages.
css – This property uses a CssOptions object to configure the processing of CSS properties during HTML to PDF conversion. It allows precise control over how styles from the HTML are interpreted and applied in the resulting PDF.
document_info – This property contains metadata about the output PDF document, such as title, author, subject, and keywords. This metadata helps with document management, indexing, and search.
form_field_behaviour – This property specifies the behavior of interactive form fields in the generated PDF.
jpeg_quality – This property determines the JPEG compression quality used for images embedded in a PDF document. The default quality is set to 95, providing a good balance between image fidelity and file size. Setting this property allows you to optimize file size or image quality based on your specific needs.
encryption – This property provides detailed information about PDF document encryption, including password protection and permission settings. If it is not configured, no encryption is applied, but setting this property allows you to distribute and control access to sensitive PDF content securely.
is_tagged_pdf – When set to True, a tagged layout is created within the PDF document, which improves accessibility for users with disabilities. Tagging gives the content a structure that assistive technology can navigate and helps the document meet accessibility standards.
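Because these options are set by plain attribute assignment, a misspelled property name can be easy to miss. The sketch below shows one possible guard: a hypothetical `apply_settings` helper (not part of Aspose.HTML) that sets simple-valued properties such as jpeg_quality or is_tagged_pdf and refuses names the object does not have. A stand-in object is used so the sketch runs without the library; in real code the target would be a PdfSaveOptions instance.

```python
from types import SimpleNamespace


def apply_settings(options, settings: dict):
    """Set each named attribute on options, rejecting names the object lacks."""
    for name, value in settings.items():
        if not hasattr(options, name):
            raise AttributeError(f"unknown save option: {name}")
        setattr(options, name, value)
    return options


# Stand-in for PdfSaveOptions() so the example runs without Aspose.HTML installed.
demo = SimpleNamespace(jpeg_quality=95, is_tagged_pdf=False)
apply_settings(demo, {"jpeg_quality": 80, "is_tagged_pdf": True})
```

Properties that hold objects, such as page_setup or document_info, are configured through their own attributes and are outside the scope of this helper.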

HTML to PDF conversion and PDF flattening

Aspose.HTML for Python via .NET offers the form_field_behaviour property of the PdfSaveOptions class to flatten PDF documents when they are converted from HTML or MHTML. This property specifies the behavior of form fields in the PDF document. If the value is set to FormFieldBehaviour.FLATTENED, all form fields in the PDF document are flattened.

import os
from aspose.html import *
from aspose.html.converters import *
from aspose.html.saving import *
from aspose.html.rendering.pdf import *

# Setup directories and define paths
data_dir = "data/"
output_dir = "output/"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
source_path = os.path.join(data_dir, "SampleHtmlForm.html")
result_path = os.path.join(output_dir, "form-flattened.pdf")

# Initialize an HTML document from the file
document = HTMLDocument(source_path)

# Prepare PDF save options
options = PdfSaveOptions()
options.form_field_behaviour = FormFieldBehaviour.FLATTENED

# Convert HTML to PDF
Converter.convert_html(document, options, result_path)
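The same options object can be reused to flatten a whole folder of HTML forms. The following is a rough sketch under stated assumptions: `html_files` and `flatten_all` are hypothetical helpers, and the explicit import locations are inferred from the wildcard imports in the example above, so they may need adjusting.

```python
from pathlib import Path


def html_files(folder) -> list:
    """Return the .html and .htm files in folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in {".html", ".htm"}
    )


def flatten_all(data_dir, output_dir) -> list:
    """Convert every HTML file in data_dir to a flattened PDF in output_dir."""
    # Deferred imports keep html_files() usable without Aspose.HTML installed.
    # Module locations are assumed from the wildcard imports used in this guide.
    from aspose.html import HTMLDocument
    from aspose.html.converters import Converter
    from aspose.html.saving import PdfSaveOptions
    from aspose.html.rendering.pdf import FormFieldBehaviour

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    # One options object, reused for every document
    options = PdfSaveOptions()
    options.form_field_behaviour = FormFieldBehaviour.FLATTENED

    results = []
    for source in html_files(data_dir):
        target = out / (source.stem + ".pdf")
        document = HTMLDocument(str(source))
        Converter.convert_html(document, options, str(target))
        results.append(target)
    return results
```

For example, `flatten_all("data/", "output/")` would be expected to produce one flattened PDF per HTML file found in the data folder.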

How to convert HTML to XPS

Aspose.HTML for Python via .NET supports HTML to XPS conversion. To do this, create an XpsSaveOptions object and pass it to the convert_html() method:

options = XpsSaveOptions()

XpsSaveOptions lets you customize the rendering process through the page_setup, background_color, css, horizontal_resolution, and vertical_resolution properties.

Download the Aspose.HTML for Python via .NET library to successfully, quickly, and easily convert your HTML, MHTML, EPUB, SVG, and Markdown documents to the most popular formats.

Aspose.HTML offers a free online HTML to PDF Converter that converts HTML to PDF quickly, easily, and with high quality. Just upload your files, convert them, and get the result in a few seconds!
