Save HTML Document – C# Examples

After downloading an existing file or creating an HTML document from scratch, you can save the changes using one of the HTMLDocument.Save() methods. There are overloaded methods to save a document to a file, URL, or streams.

The API provides the Aspose.Html.Saving namespace with the SaveOptions and ResourceHandlingOptions classes, which let you set options for saving operations.
The API also provides the Aspose.Html.Saving.ResourceHandlers namespace, which contains the ResourceHandler and FileSystemResourceHandler classes responsible for handling resources.

Please note that there are two different concepts for creating output files:

The first concept is based on producing HTML-like files as output. SaveOptions, the base class for this approach, helps to handle the saving of related resources such as scripts, styles, and images. The ResourceHandler class is responsible for handling resources. It is designed to save HTML content and resources into streams and provides methods that let you control what is done with each resource.
The second concept is used to create a visual representation of HTML as a result. The base class for this concept is RenderingOptions; it has specialized methods to specify the page size, page margins, resolution, user styles, etc.

This article only describes how to use SaveOptions and ResourceHandler classes. To read more about the rendering mechanism, please follow the Renderers and Rendering Options articles.

SaveOptions & ResourceHandlingOptions

SaveOptions is a base class that lets you specify additional options for saving operations and helps to manage linked resources. The ResourceHandlingOptions property of the SaveOptions class is used to configure resource handling. The ResourceHandlingOptions class represents the resource handling options; the available ones are listed in the following table:

Option	Description
UrlRestriction	Applies restrictions to the host or folders where resources are located.
MaxHandlingDepth	If you need to save not only the specified HTML document but also the HTML pages linked from it, this option controls the depth of linked pages that are saved.
JavaScript	Specifies how JavaScript files are treated: they can be saved as separate linked files, embedded into the HTML file, or ignored.
Default	Specifies how resources other than JavaScript files are handled. The value is an enum that represents the default way of resource handling. Currently, the Save, Ignore, and Embed values are supported. The default value is Save.
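
As a rough illustration of how the options in the table fit together, the sketch below sets each of them on an HTMLSaveOptions instance before saving. The output file name is hypothetical. The enum member names used here (UrlRestriction.RootAndSubFolders, ResourceHandling.Embed, ResourceHandling.Save) are assumptions that should be checked against the API reference for your version of the library.

```csharp
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;

// Hypothetical output location, chosen only for this sketch
string outputPath = Path.Combine(Directory.GetCurrentDirectory(), "save-with-options_out.html");

// Initialize a document from a content string
using (HTMLDocument document = new HTMLDocument("<p>Hello, World!</p>", "."))
{
    HTMLSaveOptions options = new HTMLSaveOptions();

    // Assumed enum member: handle only resources in the document's folder and its subfolders
    options.ResourceHandlingOptions.UrlRestriction = UrlRestriction.RootAndSubFolders;

    // Save the document plus the HTML pages linked directly from it
    options.ResourceHandlingOptions.MaxHandlingDepth = 1;

    // Assumed enum member: embed scripts into the HTML file instead of saving them separately
    options.ResourceHandlingOptions.JavaScript = ResourceHandling.Embed;

    // Save every other resource as a separate linked file (the documented default)
    options.ResourceHandlingOptions.Default = ResourceHandling.Save;

    document.Save(outputPath, options);
}
```

Setting Default explicitly is redundant here, since Save is already the default; it is shown only to make the full set of options visible in one place.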

Save HTML

Once you have finished your changes in HTML, you may want to save the document. You can do it using one of the Save() methods of the HTMLDocument class. The following example is the easiest way to save an HTML file:

// Save HTML to a file using C#

// Prepare an output path for a document saving
string documentPath = Path.Combine(OutputDir, "save-to-file.html");

// Initialize an empty HTML document
using (HTMLDocument document = new HTMLDocument())
{
    // Create a text element and add it to the document
    Text text = document.CreateTextNode("Hello, World!");
    document.Body.AppendChild(text);

    // Save the HTML document to the file on a disk
    document.Save(documentPath);
}

In the example above, we use the HTMLDocument() constructor for initializing an empty HTML document. The CreateTextNode(data) method of the HTMLDocument class creates a text node given the specified string. The Save(path) method saves the document to a local file specified by path.

The sample above is quite simple. However, in real-life applications, you often need additional control over the saving process. The next few sections describe how to use resource handling options or save your document in different formats.

Save HTML to a File

The following code snippet shows how to use the ResourceHandlingOptions property of the SaveOptions class to manage files linked to your document.

// Save HTML with linked resources using C#

// Prepare an output path for an HTML document
string documentPath = Path.Combine(OutputDir, "save-with-linked-file.html");

// Prepare a simple HTML file with a linked document
File.WriteAllText(documentPath, "<p>Hello, World!</p>" +
                                "<a href='linked.html'>linked file</a>");

// Prepare a simple linked HTML file
File.WriteAllText(Path.Combine(OutputDir, "linked.html"), "<p>Hello, linked file!</p>");

// Load the "save-with-linked-file.html" into memory
using (HTMLDocument document = new HTMLDocument(documentPath))
{
    // Create a save options instance
    HTMLSaveOptions options = new HTMLSaveOptions();

    // The following line with value '0' cuts off all other linked HTML files while saving this instance
    // If you remove this line or change the value to '1', the 'linked.html' file will be saved to the output folder as well
    options.ResourceHandlingOptions.MaxHandlingDepth = 0;

    // Save the document with the save options
    document.Save(Path.Combine(OutputDir, "save-with-linked-file_out.html"), options);
}

Save HTML to a Local File System Storage

The HTML document can contain different resources like CSS, external images and files. Aspose.HTML for .NET provides a way to save HTML with all linked files – the ResourceHandler class is developed for saving HTML content and resources to streams. This class is responsible for handling resources and provides methods that allow you to control what is done with each resource.

Let’s consider an example of saving HTML with resources to user-specified local file storage. The source with-resources.html document and its linked image file are in the same directory. The FileSystemResourceHandler(customOutDir) constructor takes a path indicating where the document with resources will be saved and creates a FileSystemResourceHandler object. The Save(resourceHandler) method takes this object and saves the HTML to the output storage.

// Save HTML with resources to local storage using C#

// Prepare a path to a source HTML file
string inputPath = Path.Combine(DataDir, "with-resources.html");

// Prepare a full path to an output directory
string customOutDir = Path.Combine(Directory.GetCurrentDirectory(), "./../../../../tests-out/saving/");

// Load the HTML document from a file
using (HTMLDocument doc = new HTMLDocument(inputPath))
{
    // Save HTML with resources
    doc.Save(new FileSystemResourceHandler(customOutDir));
}

Save HTML to a Zip Archive

You can implement ResourceHandler in a custom class such as ZipResourceHandler. Such a handler lets you create a structured, compressed archive containing HTML documents and their associated resources, which suits scenarios such as archiving and storage optimization. The HandleResource() method of the ZipResourceHandler class customizes how individual resources are processed and stored in the Zip archive.

In the following example, the ZipResourceHandler class is used to save the with-resources.html document along with its linked resources to a Zip archive:

// Save an HTML document with all linked resources into a ZIP archive using C#

// Prepare a path to a source HTML file
string inputPath = Path.Combine(DataDir, "with-resources.html");

string dir = Directory.GetCurrentDirectory();

// Prepare a full path to an output zip storage
string customArchivePath = Path.Combine(dir, "./../../../../tests-out/saving/archive.zip");

// Load the HTML document
using (HTMLDocument doc = new HTMLDocument(inputPath))
{
    // Initialize an instance of the ZipResourceHandler class
    using (ZipResourceHandler resourceHandler = new ZipResourceHandler(customArchivePath))
    {
        // Save HTML with resources to a Zip archive
        doc.Save(resourceHandler);
    }
}

The ResourceHandler class is intended to be implemented by users. The ZipResourceHandler class extends the ResourceHandler base class and provides a convenient way to manage the entire process of handling and storing the resources linked to an HTML document in a Zip archive:

// Custom resource handler to save HTML with resources into a ZIP archive

internal class ZipResourceHandler : ResourceHandler, IDisposable
{
    private FileStream zipStream;
    private ZipArchive archive;
    private int streamsCounter;
    private bool initialized;

    public ZipResourceHandler(string name)
    {
        DisposeArchive();
        zipStream = new FileStream(name, FileMode.Create);
        archive = new ZipArchive(zipStream, ZipArchiveMode.Update);
        initialized = false;
    }

    public override void HandleResource(Resource resource, ResourceHandlingContext context)
    {
        string zipUri = (streamsCounter++ == 0
            ? Path.GetFileName(resource.OriginalUrl.Href)
            : Path.Combine(Path.GetFileName(Path.GetDirectoryName(resource.OriginalUrl.Href)),
                Path.GetFileName(resource.OriginalUrl.Href)));
        string samplePrefix = String.Empty;
        if (initialized)
            samplePrefix = "my_";
        else
            initialized = true;

        using (Stream newStream = archive.CreateEntry(samplePrefix + zipUri).Open())
        {
            resource.WithOutputUrl(new Url("file:///" + samplePrefix + zipUri)).Save(newStream, context);
        }
    }

    private void DisposeArchive()
    {
        if (archive != null)
        {
            archive.Dispose();
            archive = null;
        }

        if (zipStream != null)
        {
            zipStream.Dispose();
            zipStream = null;
        }

        streamsCounter = 0;
    }

    public void Dispose()
    {
        DisposeArchive();
    }
}
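
To check what a handler like this actually wrote, you can open the resulting archive with the standard System.IO.Compression API and list its entries. The sketch below is not part of the library: ListEntries is a hypothetical helper, and the archive path is a placeholder that you would point at the file produced by ZipResourceHandler.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: returns one "name, length" line per entry in a ZIP archive
static List<string> ListEntries(string archivePath)
{
    var lines = new List<string>();
    using (ZipArchive zip = ZipFile.OpenRead(archivePath))
    {
        foreach (ZipArchiveEntry entry in zip.Entries)
            lines.Add($"{entry.FullName}, length:{entry.Length}");
    }
    return lines;
}

// Placeholder path: replace with the archive written by ZipResourceHandler
string archivePath = Path.Combine(Directory.GetCurrentDirectory(), "archive.zip");

if (File.Exists(archivePath))
{
    foreach (string line in ListEntries(archivePath))
        Console.WriteLine(line);
}
```

Run this only after the ZipResourceHandler instance has been disposed, because the archive is finalized on disk when the handler releases its ZipArchive and FileStream.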

Save HTML to Memory Streams

The MemoryResourceHandler class, an implementation of ResourceHandler, allows saving HTML to memory streams. The following code shows how to use the MemoryResourceHandler class to store an HTML document in memory, collecting and printing information about the handled resources.

Initialize an HTML Document using the specified HTML file path.
Create an instance of the MemoryResourceHandler class. This class is designed to capture and store resources within memory streams during the resource-handling process.
Call the Save() method of the HTML document and pass it the MemoryResourceHandler instance as an argument. This associates the resource handling logic of the MemoryResourceHandler with the HTML document-saving process.
Use the PrintInfo() method of the MemoryResourceHandler to print information about the handled resources.

// Save HTML with resources to memory streams using C#

// Prepare a path to a source HTML file
string inputPath = Path.Combine(DataDir, "with-resources.html");

// Load the HTML document
using (HTMLDocument doc = new HTMLDocument(inputPath))
{
    // Create an instance of the MemoryResourceHandler class and save HTML to memory
    MemoryResourceHandler resourceHandler = new MemoryResourceHandler();
    doc.Save(resourceHandler);
    resourceHandler.PrintInfo();
}

After the example runs, information about the memory storage is printed:

uri:memory:///with-resources.html, length:256
uri:memory:///photo1.png, length:57438

ResourceHandler is a base class that supports the creation and management of output streams. The MemoryResourceHandler class lets you capture and store resources in in-memory streams, providing a dynamic and flexible way to handle resources without physically saving them to the file system. The following code snippet shows the implementation of ResourceHandler in the MemoryResourceHandler class:

// In-memory resource handler that captures and stores HTML resources as streams

internal class MemoryResourceHandler : ResourceHandler
{
    public List<Tuple<Stream, Resource>> Streams;

    public MemoryResourceHandler()
    {
        Streams = new List<Tuple<Stream, Resource>>();
    }

    public override void HandleResource(Resource resource, ResourceHandlingContext context)
    {
        MemoryStream outputStream = new MemoryStream();
        Streams.Add(Tuple.Create<Stream, Resource>(outputStream, resource));
        resource
            .WithOutputUrl(new Url(Path.GetFileName(resource.OriginalUrl.Pathname), "memory:///"))
            .Save(outputStream, context);
    }

    public void PrintInfo()
    {
        foreach (Tuple<Stream, Resource> stream in Streams)
            Console.WriteLine($"uri:{stream.Item2.OutputUrl}, length:{stream.Item1.Length}");
    }
}
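
Once resources are captured, each stream in the Streams list can be read back like any other .NET stream. The sketch below shows one way to turn a captured stream into text. ReadAllText is a hypothetical helper, a plain MemoryStream stands in for a stream collected by MemoryResourceHandler, and the content is assumed to be UTF-8 encoded.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: rewinds a captured stream and decodes it as UTF-8 text.
// The stream is left open so that it can be read again later.
static string ReadAllText(Stream stream)
{
    stream.Position = 0;
    using (var reader = new StreamReader(stream, Encoding.UTF8, true, 1024, leaveOpen: true))
    {
        return reader.ReadToEnd();
    }
}

// Stand-in for one of the streams collected by MemoryResourceHandler
var captured = new MemoryStream();
byte[] bytes = Encoding.UTF8.GetBytes("<p>Hello, World!</p>");
captured.Write(bytes, 0, bytes.Length);

Console.WriteLine(ReadAllText(captured));
```

With a real handler, the same helper could be applied to the Item1 member of an entry in Streams. Rewinding first matters because the stream position sits at the end after the resource has been written.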

Save HTML to MHTML

In some cases, you need to save your web page as a single file. An MHTML document is handy for this purpose since it is a web-page archive that stores everything inside itself. The HTMLSaveFormat enumeration specifies the format in which the document is saved: HTML, MHTML, or MD. The example below shows how to use the Save(path, saveFormat) method to save HTML as MHTML.

// Save HTML as MHTML using C#

// Prepare an output path for a document saving
string savePath = Path.Combine(OutputDir, "save-to-mhtml.mht");

// Prepare a simple HTML file with a linked document
File.WriteAllText("save-to-mhtml.html", "<p>Hello, World!</p>" +
                                        "<a href='linked-file.html'>linked file</a>");

// Prepare a simple linked HTML file
File.WriteAllText("linked-file.html", "<p>Hello, linked file!</p>");

// Load the "save-to-mhtml.html" into memory
using (HTMLDocument document = new HTMLDocument("save-to-mhtml.html"))
{
    // Save the document to MHTML format
    document.Save(savePath, HTMLSaveFormat.MHTML);
}

The saved “save-to-mhtml.mht” file stores the HTML of the “save-to-mhtml.html” and “linked-file.html” files.

Save HTML to Markdown

Markdown is a markup language with plain-text syntax. As in the HTML to MHTML example, you can use HTMLSaveFormat to save HTML as MD. Please take a look at the following example:

// Save HTML as Markdown using C#

// Prepare an output path for a document saving
string documentPath = Path.Combine(OutputDir, "save-html-to-markdown.md");

// Prepare HTML code
string html_code = "<H2>Hello, World!</H2>";

// Initialize a document from a string variable
using (HTMLDocument document = new HTMLDocument(html_code, "."))
{
    // Save the document as a Markdown file
    document.Save(documentPath, HTMLSaveFormat.Markdown);
}

For more information on how to use the HTML Converter, please visit the Convert HTML to Markdown article.

Save SVG

Usually, you see SVG as part of an HTML file, where it represents vector data on the page: images, icons, tables, etc. However, SVG can also be extracted from the web page, and you can manipulate it in much the same way as an HTML document.

Since SVGDocument and HTMLDocument are based on the same WHATWG DOM standard, all operations such as loading, reading, editing, converting, and saving are similar for both documents. So, all examples that manipulate an HTMLDocument apply to SVGDocument as well.

To save your changes, use the following code:

// Create and save SVG image using C#

// Prepare an output path for a document saving
string documentPath = Path.Combine(OutputDir, "create-and-save-svg.svg");

// Prepare SVG code
string code = @"
    <svg xmlns='http://www.w3.org/2000/svg' height='200' width='300'>
        <g fill='none' stroke-width='10' stroke-dasharray='30 10'>
            <path stroke='red' d='M 25 40 l 215 0' />
            <path stroke='black' d='M 35 80 l 215 0' />
            <path stroke='blue' d='M 45 120 l 215 0' />
        </g>
    </svg>";

// Initialize an SVG instance from the content string
using (SVGDocument document = new SVGDocument(code, "."))
{
    // Save the SVG file to a disk
    document.Save(documentPath);
}

For more information about SVG Basics Drawing and the API usage for processing and rendering SVG documents, see the Aspose.SVG for .NET Documentation.

You can download the complete examples and data files from GitHub.
