Convert MHTML to DOCX in Python
Converting MHTML to DOCX is useful when a saved web page needs to be edited or shared as a Word document. Aspose.HTML for Python via .NET performs this conversion with the convert_mhtml() methods of the Converter class.
In this article, you will find information on how to convert MHTML to DOCX using the Aspose.HTML Python library and how to apply DocSaveOptions.
To follow this tutorial, install and configure Aspose.HTML for Python via .NET in your Python project.
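The examples in this article read an MHTML file from disk. If you have no such file at hand, the following sketch assembles a minimal one with Python's standard email package, relying on the fact that MHTML is a MIME multipart/related message. The names build_mhtml and write_sample_mhtml, the sample markup, and the Content-Location URL are placeholders introduced here for illustration, and a file this small will not exercise images, stylesheets, or other embedded resources.

```python
import os
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mhtml(html, location="http://localhost/index.html"):
    """Wrap an HTML string in a MIME multipart/related (MHTML) container."""
    message = MIMEMultipart("related", type="text/html")
    message["Subject"] = "Sample MHTML document"

    # The root HTML resource; Content-Location records its original address.
    root = MIMEText(html, "html", "utf-8")
    root["Content-Location"] = location
    message.attach(root)
    return message.as_bytes()


def write_sample_mhtml(path):
    """Write a tiny MHTML file to `path`, creating parent folders if needed."""
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    html = "<html><body><h1>Hello</h1><p>Sample MHTML content.</p></body></html>"
    with open(path, "wb") as f:
        f.write(build_mhtml(html))
    return path
```

Calling write_sample_mhtml(os.path.join("data", "document.mht")) creates a file at the input path that the conversion example in this article reads.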
Convert MHTML to DOCX using DocSaveOptions
The convert_mhtml() methods are the most common way to convert MHTML into various formats. With Aspose.HTML for Python via .NET, you can convert MHTML to DOCX programmatically and control a wide range of conversion parameters.
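Before customizing anything, it may help to see the smallest form of the call. The sketch below wraps the convert_mhtml(stream, options, save_path) call used in this article's full example in a helper and passes a DocSaveOptions object with nothing changed. The helper name convert_mhtml_to_docx is hypothetical, and the explicit import locations (aspose.html.converters and aspose.html.saving) are inferred from the wildcard imports in the full example, so treat them as assumptions.

```python
import os


def convert_mhtml_to_docx(document_path, save_path):
    """Convert an MHTML file to DOCX using default DocSaveOptions."""
    if not os.path.isfile(document_path):
        raise FileNotFoundError(document_path)

    # Imported lazily so this helper can be defined even where the
    # library is not installed (module locations are assumptions).
    from aspose.html.converters import Converter
    from aspose.html.saving import DocSaveOptions

    # Make sure the output folder exists before saving
    out_dir = os.path.dirname(save_path)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)

    # Open the MHTML file as a binary stream and convert with defaults
    with open(document_path, "rb") as stream:
        Converter.convert_mhtml(stream, DocSaveOptions(), save_path)
    return save_path
```

The input check runs before the library is imported, so a wrong path is reported as an ordinary FileNotFoundError rather than as a conversion failure.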
To convert MHTML to DOCX with DocSaveOptions specified, follow these steps:
- Open an existing MHTML file. In the example, we use the open() method to open and read MHTML from the file system at the specified path.
- Create an instance of the DocSaveOptions class. Its properties control many parameters of the MHTML to DOCX conversion. In the example, we use the page_setup property, which specifies the page size of the DOCX document, together with the document_format and css.media_type properties.
- Use one of the convert_mhtml() methods of the Converter class to save MHTML as a DOCX file. In the following example, the convert_mhtml() method takes the stream, the options, and the output file path save_path, and performs the conversion.
The following Python code example shows how to convert MHTML to DOCX using DocSaveOptions:
import os
from aspose.html import *
from aspose.html.converters import *
from aspose.html.saving import *
from aspose.html.drawing import *
# Enum locations below are assumed to mirror the .NET namespaces
from aspose.html.rendering import MediaType
from aspose.html.rendering.doc import DocumentFormat

# Setup directories and define paths
output_dir = "output/"
input_dir = "data/"
os.makedirs(output_dir, exist_ok=True)
document_path = os.path.join(input_dir, "document.mht")
save_path = os.path.join(output_dir, "document.docx")

# Open an existing MHTML file for reading
with open(document_path, "rb") as stream:

    # Create an instance of DocSaveOptions
    options = DocSaveOptions()
    options.page_setup.any_page = Page(Size(400, 400), Margin(10, 10, 10, 10))
    # Assign the enum values; a bare attribute access such as
    # options.document_format.DOCX would leave the option unchanged
    options.document_format = DocumentFormat.DOCX
    options.css.media_type = MediaType.SCREEN

    # Convert MHTML to DOCX
    Converter.convert_mhtml(stream, options, save_path)
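After the conversion, a quick sanity check can confirm that the output is structurally a DOCX package. A DOCX file is a ZIP archive that contains, among other parts, [Content_Types].xml and word/document.xml, so the standard zipfile module is enough for a first look. This check is an addition to the article and not part of Aspose.HTML; the helper name looks_like_docx is hypothetical, and passing the check does not prove that the document renders as intended.

```python
import os
import zipfile

# Parts that a Word package is expected to contain
REQUIRED_PARTS = ("[Content_Types].xml", "word/document.xml")


def looks_like_docx(path):
    """Return True if `path` is a ZIP archive holding the core DOCX parts."""
    if not os.path.isfile(path) or not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        names = set(archive.namelist())
    return all(part in names for part in REQUIRED_PARTS)
```

For the example above, looks_like_docx(save_path) would be called once the with block has finished.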
Save Options – DocSaveOptions Class
Aspose.HTML for Python via .NET allows converting MHTML to DOCX using default or custom save options. The DocSaveOptions class is configured to save the document as DOCX and includes the following properties:
- page_setup – Defines the page layout, including page size and margins, so that the output document matches the desired format.
- document_format – Sets the file format of the output document. The default is DOCX.
- horizontal_resolution – Sets or gets the horizontal resolution for internal images, in pixels per inch. The default is 300 dpi. Higher resolutions can improve rendering quality but increase file size.
- vertical_resolution – Sets or gets the vertical resolution for internal images, in pixels per inch. The default is 300 dpi. As with horizontal_resolution, this value affects both clarity and file size.
- background_color – Sets the background color of the rendered output. If not set, the background is transparent.
- css – Gets a CssOptions object, which configures CSS processing. For example, the css.media_type property selects the media type (such as screen or print) whose CSS rules are applied during rendering.
- font_embedding_rule – Sets the rule that controls whether and how fonts are embedded in the output document. The default value is NONE.
Some properties of this class are inherited from base classes, such as DocRenderingOptions or RenderingOptions.
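To make the resolution trade-off concrete, the arithmetic below estimates how many pixels an embedded image occupies at a given resolution. It is plain Python and does not call Aspose.HTML; the function names and the 4 by 6 inch sample size are illustrative assumptions. Because the pixel count grows with the product of the horizontal and vertical resolutions, halving both from 300 to 150 dpi cuts the pixel count to a quarter.

```python
def pixel_dimensions(width_in, height_in, horizontal_dpi=300, vertical_dpi=300):
    """Return (width, height) in pixels for an image of the given size in inches."""
    width_px = round(width_in * horizontal_dpi)
    height_px = round(height_in * vertical_dpi)
    return width_px, height_px


def pixel_count(width_in, height_in, horizontal_dpi=300, vertical_dpi=300):
    """Return the total number of pixels at the given resolutions."""
    width_px, height_px = pixel_dimensions(
        width_in, height_in, horizontal_dpi, vertical_dpi
    )
    return width_px * height_px


full = pixel_count(4, 6)            # 1200 x 1800 pixels at 300 dpi
half = pixel_count(4, 6, 150, 150)  # 600 x 900 pixels at 150 dpi
print(full, half, full / half)      # prints: 2160000 540000 4.0
```

The actual file size also depends on image content and compression, so the ratio is a rough guide rather than a prediction.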
Download the Aspose.HTML for Python via .NET library to convert HTML, MHTML, EPUB, SVG, and Markdown documents to the most popular formats.
Aspose.HTML also offers a free online MHTML to DOCX Converter: upload your files, convert them, and get the results in a few seconds.