Convert HTML to MHTML in Python
MHTML is a web page archive format that combines HTML code and related resources, such as images, stylesheets, and scripts, into a single file. This is particularly useful for archiving or sharing web pages in a single file. HTML to MHTML conversion preserves the entire structure and formatting of the web page as it appears in the browser, ensuring consistency when the MHTML file is opened. Additionally, MHTML files enable offline access to web pages, providing a convenient way to view content without an Internet connection.
In this article, you will find information on how to convert HTML to MHTML and how to use MHTMLSaveOptions.
To follow this tutorial, install and configure Aspose.HTML for Python via .NET in your Python project. The code examples below show how to convert HTML to MHTML and generate MHTML files with this library.
Convert HTML to MHTML
Converting a file to another format with the convert_html() method is a sequence of operations that includes loading and saving the document. The following steps show how to convert HTML to MHTML:
- Load the HTML file using the HTMLDocument class.
- Create an instance of the MHTMLSaveOptions class to control the HTML to MHTML conversion options.
- Use the convert_html() method of the Converter class to save the HTML document as an MHTML file. The method takes the document, the options, and the output file path save_path, and performs the conversion.
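The three steps above can be sketched as a small helper. This is a minimal sketch rather than an official sample: the function name html_to_mhtml and the file names in the usage note are hypothetical, and the code assumes that Aspose.HTML for Python via .NET is installed. The imports sit inside the function only so that the module defining it can be loaded before the package is available.

```python
def html_to_mhtml(source_path: str, save_path: str) -> str:
    """Convert the HTML file at source_path to an MHTML file at save_path."""
    # Requires the Aspose.HTML for Python via .NET package.
    from aspose.html import HTMLDocument
    from aspose.html.converters import Converter
    from aspose.html.saving import MHTMLSaveOptions

    # Step 1: load the HTML file.
    document = HTMLDocument(source_path)

    # Step 2: create the save options (default values are used here).
    options = MHTMLSaveOptions()

    # Step 3: convert the document and write the MHTML file.
    Converter.convert_html(document, options, save_path)
    return save_path
```

Calling html_to_mhtml("document.html", "document.mht") would then write the archive to document.mht, provided document.html exists.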
HTML to MHTML in a single line of code
The methods of the Converter class are the easiest way to convert HTML code into various formats. You can convert HTML to MHTML in your Python application with literally a single line of code!
from aspose.html import *
from aspose.html.converters import *
from aspose.html.saving import *

# Convert HTML to MHTML
Converter.convert_html("document.html", MHTMLSaveOptions(), "document.mht")
Convert HTML to MHTML using MHTMLSaveOptions
When converting HTML to MHTML using Aspose.HTML for Python via .NET, you can customize the conversion process using MHTMLSaveOptions. The following Python code example shows how to create an MHTML file with custom save options:
import os
from aspose.html import *
from aspose.html.converters import *
from aspose.html.saving import *

# Prepare directories and paths
output_dir = "output/"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

# Prepare HTML code with a link to another file and save it to "document1.html"
code = '<span>Hello, World!!</span> <a href="document2.html">click</a>'
with open("document1.html", "w") as file:
    file.write(code)

# Prepare HTML code and save it to "document2.html"
code = "<span>Hello, World!!</span>"
with open("document2.html", "w") as file:
    file.write(code)

save_path = os.path.join(output_dir, "output-options.mht")

# Set the resource handling depth to 1 so that directly linked resources are converted too
options = MHTMLSaveOptions()
options.resource_handling_options.max_handling_depth = 1

# Convert HTML to MHTML, starting from the document prepared above
Converter.convert_html("document1.html", options, save_path)
In the example above, setting max_handling_depth = 1 means that only pages directly referenced from the saved document will be handled.
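One way to check what a given depth setting actually embedded is to list the parts of the resulting file. MHTML is a MIME multipart document, so Python's standard email package can read it. The sketch below does not use Aspose.HTML. The helper name list_mhtml_parts and the small hand-written sample archive are illustrative assumptions, and real converter output will carry different headers and probably more parts.

```python
from email import message_from_bytes, policy


def list_mhtml_parts(data: bytes) -> list[tuple[str, str]]:
    """Return (content type, Content-Location) for each part of an MHTML archive."""
    message = message_from_bytes(data, policy=policy.default)
    parts = []
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        location = str(part.get("Content-Location", ""))
        parts.append((part.get_content_type(), location))
    return parts


# Hand-written sample for illustration only, not real converter output.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; type="text/html"; boundary="----sample"\r\n'
    b"\r\n"
    b"------sample\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Location: document1.html\r\n"
    b"\r\n"
    b'<span>Hello, World!!</span> <a href="document2.html">click</a>\r\n'
    b"------sample\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Location: document2.html\r\n"
    b"\r\n"
    b"<span>Hello, World!!</span>\r\n"
    b"------sample--\r\n"
)

for content_type, location in list_mhtml_parts(SAMPLE):
    print(content_type, location)
```

For a file produced by the converter, pass its bytes instead of SAMPLE, for example the result of reading save_path in binary mode. If a linked page such as document2.html does not appear among the parts, it was not embedded at the chosen depth.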
Save Options – MHTMLSaveOptions Class
MHTMLSaveOptions lets you customize the conversion process. Its resource_handling_options property (of type ResourceHandlingOptions) controls how external resources referenced in the HTML document are managed during conversion. It exposes options such as resource_url_restriction, page_url_restriction, and max_handling_depth, among others.
Property | Description |
---|---|
page_url_restriction | Gets or sets restrictions applied to URLs of handled pages. The default value is ROOT_AND_SUB_FOLDERS. |
resource_url_restriction | Gets or sets restrictions applied to URLs of handled resources such as CSS, JavaScript, and images. The default value is SAME_HOST. |
max_handling_depth | Gets or sets the maximum depth of linked pages that are handled. A value of 1 means that only pages directly referenced from the saved document are handled. Use it to make sure the linked content you need is embedded in the MHTML file, so the archive keeps the integrity and appearance of the original HTML content. |
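The properties in the table can be set together on one options object. The following is a hedged sketch: the helper name build_mhtml_options is hypothetical, and it assumes that the restriction values are exposed as an enum named UrlRestriction in aspose.html.saving, which the text above does not confirm. Check the API reference for the exact name before relying on it.

```python
def build_mhtml_options(depth: int = 1):
    """Create MHTMLSaveOptions with explicit resource-handling settings."""
    # Assumption: UrlRestriction is the enum that holds the restriction values.
    from aspose.html.saving import MHTMLSaveOptions, UrlRestriction

    options = MHTMLSaveOptions()
    handling = options.resource_handling_options

    # Handle linked pages up to `depth` levels away from the saved document.
    handling.max_handling_depth = depth

    # Restate the default values from the table explicitly.
    handling.page_url_restriction = UrlRestriction.ROOT_AND_SUB_FOLDERS
    handling.resource_url_restriction = UrlRestriction.SAME_HOST
    return options
```

The returned object can be passed as the second argument of Converter.convert_html(). Spelling out the defaults is optional, but it keeps the restrictions visible next to the depth setting when the configuration is reviewed later.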
Aspose.HTML offers a free online HTML to MHTML Converter that converts HTML to MHTML quickly, easily, and with high quality. Just upload your files, convert them, and get the results in a few seconds!