Convert HTML to DOCX in Java

A DOCX file is a Microsoft Word document that typically contains text but can also hold a wide range of other data, including tables, graphics, video, and sound. DOCX files are easy to edit and relatively compact. The format is popular because it offers many options for authoring documents and is supported by numerous programs.

In this article, you will find information on how to convert an HTML document to the DOCX file format using the convertHTML() methods of the Converter class, and how to apply the DocSaveOptions and ICreateStreamProvider parameters.

HTML to DOCX with a single line of Java code

Aspose.HTML for Java offers a simple way to convert HTML to DOCX. Using the static methods of the Converter class, you can convert an HTML document into a DOCX file with a single line of code. The call below passes the HTML content, a base URI ("."), the save options, and the output file path:

    // Invoke the convertHTML() method to convert HTML code to DOCX
    com.aspose.html.converters.Converter.convertHTML("<h1>Convert HTML to DOCX!</h1>", ".", new DocSaveOptions(), Path.combine(getOutputDir(), "convert-with-single-line.docx"));
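The snippet above relies on Path.combine() and getOutputDir(), which look like helpers from the examples project rather than part of the JDK (that is our assumption). If you run the conversion from a standalone program, a rough equivalent built on java.nio.file might look like the sketch below. The OutputPaths class, the outputPath() method, and the "output" directory name are hypothetical and used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class OutputPaths {

    // Hypothetical stand-in for Path.combine(getOutputDir(), fileName):
    // joins a directory and a file name and makes sure the directory exists.
    static String outputPath(String outputDir, String fileName) {
        Path dir = Paths.get(outputDir);
        try {
            // Does nothing if the directory is already there
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return dir.resolve(fileName).toString();
    }

    public static void main(String[] args) {
        // "output" is an illustrative directory name, not one used by the library
        String savePath = outputPath("output", "convert-with-single-line.docx");
        System.out.println(savePath);
    }
}
```

The resulting string can then be passed as the last argument of convertHTML() in place of the Path.combine(...) expression.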

Convert HTML to DOCX

Let’s walk through the step-by-step instructions for a simple HTML to DOCX conversion scenario:

  1. Load an HTML file using one of the HTMLDocument() constructors of the HTMLDocument class. You can load HTML from a file, HTML code, a stream, or a URL (see the Creating an HTML Document article). In this example, we use the HTMLDocument(address) constructor, which initializes an HTML document from a file.
  2. Create a new DocSaveOptions object.
  3. Use the convertHTML(document, options, savePath) method of the Converter class to save HTML as a DOCX file.

Please review the following Java code snippet, which shows the HTML to DOCX conversion process with step-by-step instructions:

    import com.aspose.html.HTMLDocument;
    import com.aspose.html.saving.DocSaveOptions;

    // Prepare a path to a source HTML file
    String documentPath = Path.combine(getDataDir(), "canvas.html");

    // Prepare a path for saving the converted file
    String savePath = Path.combine(getOutputDir(), "canvas-output.docx");

    // Initialize an HTML document from the file
    HTMLDocument document = new HTMLDocument(documentPath);
    try {
        // Initialize DocSaveOptions
        DocSaveOptions options = new DocSaveOptions();

        // Convert HTML to DOCX while the document is still open
        com.aspose.html.converters.Converter.convertHTML(document, options, savePath);
    } finally {
        // Release the document's resources once the conversion has finished
        document.dispose();
    }

You can download the complete examples and data files from GitHub.
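After a conversion, you may want a quick sanity check of the result. A DOCX file is a ZIP package, so one rough check is to confirm that the archive contains [Content_Types].xml and a main document part. The sketch below uses only the JDK and assumes the main part is stored as word/document.xml, which is the usual location but not one we have verified for this library's output. The DocxSanityCheck class and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class DocxSanityCheck {

    // Returns true if the stream is a ZIP archive holding the two entries
    // a typical DOCX package contains. This is a heuristic, not a validator.
    static boolean looksLikeDocx(InputStream in) {
        Set<String> names = new HashSet<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException ex) {
            return false; // unreadable or corrupt archive
        }
        // Assumption: the main document part lives at word/document.xml
        return names.contains("[Content_Types].xml") && names.contains("word/document.xml");
    }

    // Demo helper: builds a small in-memory ZIP with the given entry names.
    static byte[] zipOf(String... entryNames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            // Check a real file, for example the converted canvas-output.docx
            try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
                System.out.println(looksLikeDocx(in));
            }
        } else {
            // No argument given: run against an in-memory stand-in package
            byte[] demo = zipOf("[Content_Types].xml", "word/document.xml");
            System.out.println(looksLikeDocx(new ByteArrayInputStream(demo)));
        }
    }
}
```

Passing the path of the converted file as the only argument runs the check against that file. A true result only means the package has the expected shape, not that Word will render it as intended.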

Convert HTML to DOCX using DocSaveOptions

With Aspose.HTML for Java, you can convert files programmatically with full control over a wide range of conversion parameters. To convert HTML to DOCX with custom DocSaveOptions, follow these steps:

  1. Load an HTML file using one of the HTMLDocument() constructors of the HTMLDocument class.
  2. Create a new DocSaveOptions object and specify the required properties. Use the getPageSetup() method to specify the page size and margins for the output document.
  3. Use the convertHTML() method of the Converter class to save HTML as a DOCX file.

The following Java example shows how to use DocSaveOptions and create a DOCX file with custom page size and margins:

    import com.aspose.html.HTMLDocument;
    import com.aspose.html.drawing.Margin;
    import com.aspose.html.drawing.Page;
    import com.aspose.html.saving.DocSaveOptions;

    // Prepare a path to a source HTML file
    String documentPath = Path.combine(getDataDir(), "canvas.html");

    // Prepare a path for saving the converted file
    String savePath = Path.combine(getOutputDir(), "canvas-output-options.docx");

    // Initialize an HTML document from the file
    HTMLDocument document = new HTMLDocument(documentPath);
    try {
        // Initialize DocSaveOptions. Set up the page size 600x400 pixels and margins
        DocSaveOptions options = new DocSaveOptions();
        options.getPageSetup().setAnyPage(new Page(new com.aspose.html.drawing.Size(600, 400), new Margin(10, 10, 10, 10)));

        // Convert HTML to DOCX while the document is still open
        com.aspose.html.converters.Converter.convertHTML(document, options, savePath);
    } finally {
        // Release the document's resources once the conversion has finished
        document.dispose();
    }

The DocSaveOptions() constructor initializes an instance of the DocSaveOptions class, which is then passed to the convertHTML() method. The method takes the document, the options, and the output file path (savePath), and performs the conversion.
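The example sets the page size in pixels, while Word expresses page dimensions in points and twips (1 point = 20 twips, 72 points = 1 inch). As a rough guide to what 600x400 pixels means on paper, the sketch below converts pixels to those units, assuming the common CSS convention of 96 pixels per inch. Whether the library applies exactly this mapping internally is an assumption we have not verified, and the PageUnits class is a hypothetical helper.

```java
final class PageUnits {

    // Assumption: 96 pixels per inch, as in the CSS reference pixel
    static final double PIXELS_PER_INCH = 96.0;
    static final double POINTS_PER_INCH = 72.0;
    static final int TWIPS_PER_POINT = 20;

    // Converts a length in pixels to typographic points
    static double pxToPoints(double px) {
        return px * POINTS_PER_INCH / PIXELS_PER_INCH;
    }

    // Converts a length in pixels to twips, rounded to the nearest whole twip
    static long pxToTwips(double px) {
        return Math.round(pxToPoints(px) * TWIPS_PER_POINT);
    }

    public static void main(String[] args) {
        int[] lengths = {600, 400, 10}; // page width, page height, margin from the example
        for (int px : lengths) {
            System.out.println(px + " px = " + pxToPoints(px) + " pt = " + pxToTwips(px) + " twips");
        }
        // Prints:
        // 600 px = 450.0 pt = 9000 twips
        // 400 px = 300.0 pt = 6000 twips
        // 10 px = 7.5 pt = 150 twips
    }
}
```

Under that assumption, the 600x400 pixel page corresponds to 6.25 x about 4.17 inches, noticeably smaller than A4 or Letter.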


Aspose.HTML offers a free online HTML to DOCX Converter that converts HTML to DOCX quickly, easily, and with high quality. Just upload your files, convert them, and get the result in a few seconds!
