Convert HTML to PDF in Python
Python HTML to PDF Conversion
Aspose.PDF for Python via .NET lets you convert existing HTML documents to PDF with flexible rendering options. You can fine-tune how the output is generated to match your layout, styling, accessibility, and archiving requirements.
Convert HTML to PDF
The following Python example shows the basic workflow for converting an HTML document to PDF.
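- Create an instance of the HtmlLoadOptions class.
- Optionally set page_layout_option, for example to HtmlPageLayoutOption.SCALE_TO_PAGE_WIDTH.
- Initialize a Document object with the source HTML file and the load options.
- Save the output PDF document by calling document.save().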
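from os import path
import aspose.pdf as ap
# self.data_dir, infile, and outfile are supplied by the surrounding class and its caller.
path_infile = path.join(self.data_dir, infile)
path_outfile = path.join(self.data_dir, "python", outfile)
load_options = ap.HtmlLoadOptions()
# Scale the HTML content to fit the page width.
load_options.page_layout_option = ap.HtmlPageLayoutOption.SCALE_TO_PAGE_WIDTH
# Load the HTML with the configured options, then save it as a PDF.
document = ap.Document(path_infile, load_options)
document.save(path_outfile)
print(infile + " converted into " + outfile)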
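The snippet above relies on self.data_dir, infile, and outfile from a surrounding class. A standalone version might look like the sketch below. The function names default_output_path and convert_html_to_pdf are illustrative and not part of Aspose.PDF; only the HtmlLoadOptions, HtmlPageLayoutOption, Document, and save calls are taken from the example above.

```python
from pathlib import Path


def default_output_path(html_path):
    """Return the input path with its extension replaced by .pdf (hypothetical helper)."""
    return Path(html_path).with_suffix(".pdf")


def convert_html_to_pdf(html_path, pdf_path=None):
    """Convert one HTML file to PDF, scaling the content to the page width."""
    # Imported here so the path helper above works without the library installed.
    import aspose.pdf as ap

    pdf_path = Path(pdf_path) if pdf_path else default_output_path(html_path)
    load_options = ap.HtmlLoadOptions()
    load_options.page_layout_option = ap.HtmlPageLayoutOption.SCALE_TO_PAGE_WIDTH
    document = ap.Document(str(html_path), load_options)
    document.save(str(pdf_path))
    return pdf_path
```

With this sketch, calling convert_html_to_pdf("input.html") would write input.pdf next to the source file, where input.html is a placeholder name.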
Related conversions
- Convert PDF to HTML when you need web-ready output from existing PDF files.
- Convert other file formats to PDF for EPUB, XPS, text, and PostScript conversion workflows.
- Convert images to PDF when your source content is image-based instead of HTML markup.
Try converting HTML to PDF online
Aspose provides an online “HTML to PDF” application where you can test the conversion quality and inspect the output.
Convert HTML to PDF using media type
This example shows how to choose which CSS media type, screen or print, is applied when an HTML file is converted to PDF.
- Create an instance of the HtmlLoadOptions() class.
- Set html_media_type to apply CSS rules intended for screen or print layouts, such as HtmlMediaType.SCREEN or HtmlMediaType.PRINT.
- Load the HTML into an ap.Document using the load options.
- Save the document as a PDF.
from os import path
import aspose.pdf as ap
path_infile = path.join(self.data_dir, infile)
path_outfile = path.join(self.data_dir, "python", outfile)
load_options = ap.HtmlLoadOptions()
load_options.html_media_type = ap.HtmlMediaType.SCREEN
document = ap.Document(path_infile, load_options)
document.save(path_outfile)
print(infile + " converted into " + outfile)
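To see what the media type changes, it can help to convert a page whose styles differ between screen and print. The sketch below is one way to set that up. SAMPLE_HTML, write_sample, and convert_with_media_type are hypothetical names introduced for illustration; only html_media_type and HtmlMediaType come from the example above.

```python
from pathlib import Path

# Hypothetical test page: each paragraph is hidden under one of the two media types.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head>
<style>
  @media screen { h1 { color: blue; } .print-only { display: none; } }
  @media print  { h1 { color: black; } .screen-only { display: none; } }
</style>
</head>
<body>
  <h1>Media type test</h1>
  <p class="screen-only">Visible with screen rules.</p>
  <p class="print-only">Visible with print rules.</p>
</body>
</html>
"""


def write_sample(path):
    """Write the sample page to disk and return its path."""
    path = Path(path)
    path.write_text(SAMPLE_HTML, encoding="utf-8")
    return path


def convert_with_media_type(html_path, pdf_path, media_name):
    """Convert using the HtmlMediaType member named by media_name ("SCREEN" or "PRINT")."""
    import aspose.pdf as ap  # imported lazily; requires Aspose.PDF for Python via .NET

    load_options = ap.HtmlLoadOptions()
    load_options.html_media_type = getattr(ap.HtmlMediaType, media_name)
    ap.Document(str(html_path), load_options).save(str(pdf_path))
```

Converting the same sample once with "SCREEN" and once with "PRINT" and comparing the two PDFs is a quick way to observe the setting. Which paragraph appears in each file depends on how the converter applies the media rules, so treat this as an experiment rather than a guaranteed result.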
Prioritize the CSS @page rule during HTML-to-PDF conversion
Some documents use the @page rule for page layout. If those styles conflict with other settings, you can control the priority with is_priority_css_page_rule.
- Create an instance of the HtmlLoadOptions class.
- Set is_priority_css_page_rule = False to let other styles take precedence over @page rules.
- Load the HTML into an ap.Document with the configured options.
- Save the document as a PDF.
from os import path
import aspose.pdf as ap
path_infile = path.join(self.data_dir, infile)
path_outfile = path.join(self.data_dir, "python", outfile)
load_options = ap.HtmlLoadOptions()
# Let other styles take precedence over @page rules.
load_options.is_priority_css_page_rule = False
document = ap.Document(path_infile, load_options)
document.save(path_outfile)
print(infile + " converted into " + outfile)
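Before deciding on this setting, it may be useful to check whether the source HTML declares any @page rules at all. The following is a minimal sketch using only the Python standard library. find_page_rules is a hypothetical helper, and the regular expression is deliberately simple: it does not handle nested margin boxes inside @page or rules in external stylesheets.

```python
import re

# Matches an @page at-rule together with its declaration block.
PAGE_RULE = re.compile(r"@page\b[^{]*\{[^}]*\}", re.IGNORECASE)


def find_page_rules(css_or_html):
    """Return every @page rule found in the given CSS or HTML text."""
    return PAGE_RULE.findall(css_or_html)


sample = "<style>@page { size: A4; margin: 2cm; } body { margin: 0; }</style>"
print(find_page_rules(sample))  # ['@page { size: A4; margin: 2cm; }']
```

If the list is empty, the priority setting has nothing to arbitrate for inline styles.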
Convert HTML to PDF with embedded fonts
This example shows how to convert an HTML file to PDF while embedding fonts. If you need the output PDF to preserve the original typography, set is_embed_fonts to True.
- Create HtmlLoadOptions() to configure HTML-to-PDF conversion.
- Set is_embed_fonts = True to embed the fonts used in the HTML directly into the PDF.
- Load the HTML into an ap.Document with these options.
- Save the document as a PDF.
from os import path
import aspose.pdf as ap
path_infile = path.join(self.data_dir, infile)
path_outfile = path.join(self.data_dir, "python", outfile)
load_options = ap.HtmlLoadOptions()
load_options.is_embed_fonts = True
document = ap.Document(path_infile, load_options)
document.save(path_outfile)
print(infile + " converted into " + outfile)
Render HTML content on a single PDF page
This example demonstrates how to convert an HTML file into a single-page PDF using Aspose.PDF for Python via .NET. Use the is_render_to_single_page property when you want the full HTML content rendered onto one continuous page.
- Create an instance of HtmlLoadOptions() to configure the conversion process.
- Enable is_render_to_single_page to render the full HTML content on one page.
- Load the document with the configured options into an ap.Document.
- Save the result as a PDF file.
from os import path
import aspose.pdf as ap
path_infile = path.join(self.data_dir, infile)
path_outfile = path.join(self.data_dir, "python", outfile)
options = ap.HtmlLoadOptions()
options.is_render_to_single_page = True
doc = ap.Document(path_infile, options)
doc.save(path_outfile)
Create logical structure from HTML tags
Logical structure preserves the semantic hierarchy of the original HTML, such as headings, paragraphs, and lists; a PDF that carries it is called a tagged PDF. This makes the resulting PDF more accessible, searchable, and suitable for structured document workflows.
When logical structure is enabled during conversion, the HTML DOM is mapped into a PDF tag tree rather than rendered only as visual content.
To meet accessibility requirements, a PDF should include logical structure elements that define reading order, provide alternate text for screen readers, and preserve the hierarchy of the content.
The quality of the logical structure in the output PDF depends directly on the quality of the original HTML markup. Poorly structured or invalid HTML may result in incomplete or inaccurate tagging in the converted PDF.
- Create an HtmlLoadOptions instance to control how the HTML is converted.
- Set create_logical_structure = True so the HTML markup is converted to PDF logical structure elements.
- Open the HTML file using the configured options.
- Save the structured PDF.
import aspose.pdf as ap
# Path to the source HTML
input_html_path = "input.html"
# Path for the Logical Structure PDF
output_pdf_path = "output_logical_structure.pdf"
# Initialize HtmlLoadOptions
options = ap.HtmlLoadOptions()
# Convert HTML markup to PDF logical structure elements
options.create_logical_structure = True
# Load the HTML document with the configured options
with ap.Document(input_html_path, options) as document:
    # Save the tagged PDF document
    document.save(output_pdf_path)
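Because tagging quality depends on the source markup, a quick check of the HTML before conversion can catch obvious problems. The sketch below uses only the standard library. StructureChecker and check_html_structure are hypothetical names, and the two checks (skipped heading levels and images with no alt attribute) are examples rather than a complete accessibility audit.

```python
from html.parser import HTMLParser


class StructureChecker(HTMLParser):
    """Collect simple markup problems that tend to weaken a PDF tag tree."""

    def __init__(self):
        super().__init__()
        self.last_heading_level = 0
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # Flag heading levels that skip a step, e.g. <h1> followed by <h3>.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if level > self.last_heading_level + 1:
                self.problems.append(f"<{tag}> skips a heading level")
            self.last_heading_level = level
        # Flag images that carry no alt attribute for screen readers.
        elif tag == "img" and "alt" not in dict(attrs):
            self.problems.append("<img> without alt text")


def check_html_structure(html_text):
    """Return a list of problem descriptions for the given HTML text."""
    checker = StructureChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems
```

An empty result does not guarantee a well-tagged PDF; it only rules out the two issues this sketch looks for.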
Convert MHTML to PDF
This example shows how to convert an MHT or MHTML file into a PDF document using Aspose.PDF for Python via .NET with specific page dimensions.
- Create an instance of ap.MhtLoadOptions() to configure MHTML file processing.
- Set the loading parameters you need, such as the page size.
- Initialize the document with the input file and configured loading options.
- Save the resulting document as a PDF.
from os import path
import aspose.pdf as ap
path_infile = path.join(self.data_dir, infile)
path_outfile = path.join(self.data_dir, "python", outfile)
load_options = ap.MhtLoadOptions()
load_options.page_info.width = 842
load_options.page_info.height = 1191
document = ap.Document(path_infile, load_options)
document.save(path_outfile)
print(infile + " converted into " + outfile)
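The width and height used above, 842 and 1191, match an A3 page if the values are read as PDF points (1/72 inch). That unit is an assumption here, so verify it against the library documentation before relying on it. The hypothetical helper below shows the arithmetic for deriving such values from millimetres.

```python
POINTS_PER_INCH = 72
MM_PER_INCH = 25.4


def mm_to_points(mm):
    """Convert millimetres to PDF points, rounded to the nearest whole point."""
    return round(mm / MM_PER_INCH * POINTS_PER_INCH)


# ISO 216 A3 is 297 mm x 420 mm.
a3_width, a3_height = mm_to_points(297), mm_to_points(420)
print(a3_width, a3_height)  # 842 1191
```

The same helper gives 595 by 842 for A4 (210 mm x 297 mm), which could be assigned to page_info.width and page_info.height in the same way.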
