Aspose.HTML for .NET 24.2.0 – Save HTML to a Stream

Aspose.HTML for .NET 24.2.0

In version 24.2.0, the IOutputStorage interface was deprecated. It will continue to work until version 24.5.0, which removes it. If you use an earlier version of Aspose.HTML for .NET, we recommend that you upgrade and migrate to the ResourceHandler class before then.

Deprecated | New
IOutputStorage Interface | ResourceHandler Class

An HTML document can reference various resources, such as CSS files, external images, and other files. In this article, we consider how to save an HTML document together with its resources to a Zip archive and to memory streams. Aspose.HTML for .NET continues to develop and enhance the ways to save HTML documents with all linked files. Each example is shown twice: first with the deprecated IOutputStorage interface, then with the new ResourceHandler class that replaces it.

ResourceHandler Class Vs IOutputStorage Interface

The ResourceHandler class lets developers override a single HandleResource() method, in which you create a stream yourself and release it wherever you need to. The method receives the Resource object being saved, so you can inspect it and process more information about it. By adopting the ResourceHandler class, developers get a more streamlined and expressive way to manage resources, which results in cleaner, more maintainable, and more flexible code when saving HTML documents.

Save HTML to a Zip Archive

Here, we examine the same example twice: saving the with-resources.html file to a Zip archive using the deprecated IOutputStorage interface, and then using the new ResourceHandler class.

Using IOutputStorage – 24.1.0 and Earlier Versions


The IOutputStorage interface was a base interface that supported the creation and management of output streams. You can implement the IOutputStorage interface by creating a ZipStorage class to save HTML with resources to a Zip archive:

using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;
using System.IO.Compression;
...

    // Prepare a path to a source HTML file
    string inputPath = Path.Combine(DataDir, "with-resources.html");

    var dir = Directory.GetCurrentDirectory();

    // Prepare a full path to an output Zip storage
    string customArchivePath = Path.Combine(dir, "./../../../../tests-out/old/archive.zip");

    // Load the HTML document
    using (var doc = new HTMLDocument(inputPath))
    {
        // Initialize an instance of the ZipStorage class
        using (var zipStorage = new ZipStorage(customArchivePath))
        {
            // Save HTML with resources to a Zip archive
            doc.Save(zipStorage);
        }
    }

The following code snippet shows the ZipStorage class, which implements IOutputStorage to save an HTML document with resources to a Zip archive.

    internal class ZipStorage : IOutputStorage, IDisposable
    {
        private FileStream zipStream;
        private ZipArchive archive;
        private int streamsCounter;
        private bool initialized;

        public ZipStorage(string name)
        {
            DisposeArchive();
            zipStream = new FileStream(name, FileMode.Create);
            archive = new ZipArchive(zipStream, ZipArchiveMode.Update);
            initialized = false;
        }

        public OutputStream CreateStream(OutputStreamContext context)
        {
            // The first stream (the HTML document) is named by its file name only;
            // every later stream (a resource) also keeps its parent folder name
            var zipUri = (streamsCounter++ == 0 ? Path.GetFileName(context.Uri) :
                Path.Combine(Path.GetFileName(Path.GetDirectoryName(context.Uri)), Path.GetFileName(context.Uri)));

            // Every entry after the first one gets the sample "my_" prefix
            var samplePrefix = String.Empty;
            if (initialized)
                samplePrefix = "my_";
            else
                initialized = true;

            // Create a Zip entry and wrap its stream in an OutputStream
            var newStream = archive.CreateEntry(samplePrefix + zipUri).Open();
            var outputStream = new OutputStream(newStream, "file:///" + samplePrefix + zipUri);
            return outputStream;
        }

        public void ReleaseStream(OutputStream stream)
        {
            stream.Flush();
            stream.Close();
        }

        private void DisposeArchive()
        {
            if (archive != null)
            {
                archive.Dispose();
                archive = null;
            }
            if (zipStream != null)
            {
                zipStream.Dispose();
                zipStream = null;
            }
            streamsCounter = 0;
        }

        public void Dispose()
        {
            DisposeArchive();
        }
    }
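The entry-naming rules inside CreateStream() may be easier to follow in isolation. The sketch below is not part of Aspose.HTML: the ZipEntryNaming class and its BuildEntryName() method are hypothetical names that reproduce the same rules with System.IO only. One deliberate difference, labeled in the code, is that it joins the folder and file name with a forward slash (the separator the ZIP format expects), whereas the listing above uses Path.Combine() and therefore the platform separator.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only (not an Aspose.HTML type).
public static class ZipEntryNaming
{
    // index is the zero-based order in which streams are requested:
    //   index 0  -> file name only (the HTML document itself)
    //   index >0 -> "my_" + parent folder name + "/" + file name
    public static string BuildEntryName(string path, int index)
    {
        string fileName = Path.GetFileName(path);
        if (index == 0)
            return fileName;

        // Name of the folder that directly contains the file; may be empty
        string parent = Path.GetFileName(Path.GetDirectoryName(path)) ?? string.Empty;

        // Assumption of this sketch: always use '/' instead of the platform separator
        string relative = parent.Length == 0 ? fileName : parent + "/" + fileName;
        return "my_" + relative;
    }
}
```

For example, BuildEntryName("/data/site/with-resources.html", 0) gives "with-resources.html", and BuildEntryName("/data/site/images/photo1.png", 1) gives "my_images/photo1.png". Both paths are made-up sample values.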

Using the New ResourceHandler Class – Since Version 24.2.0


The ResourceHandler class is a base class intended for customer implementations. The following C# example shows how to save an HTML document with resources to a Zip archive using a ZipResourceHandler class derived from ResourceHandler:

using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;
using Aspose.Html.Saving.ResourceHandlers;
using System.IO.Compression;
...

    // Prepare a path to a source HTML file
    string inputPath = Path.Combine(DataDir, "with-resources.html");

    var dir = Directory.GetCurrentDirectory();

    // Prepare a full path to an output Zip storage
    string customArchivePath = Path.Combine(dir, "./../../../../tests-out/new/archive.zip");

    // Load the HTML document
    using (var doc = new HTMLDocument(inputPath))
    {
        // Initialize an instance of the ZipResourceHandler class
        using (var resourceHandler = new ZipResourceHandler(customArchivePath))
        {
            // Save HTML with resources to a Zip archive
            doc.Save(resourceHandler);
        }
    }

The following code snippet shows the ZipResourceHandler class, which derives from ResourceHandler to save an HTML document with resources to a Zip archive. Its HandleResource() method is called for each resource during saving and writes that resource to a new entry in the archive:

    internal class ZipResourceHandler : ResourceHandler, IDisposable
    {
        private FileStream zipStream;
        private ZipArchive archive;
        private int streamsCounter;
        private bool initialized;

        public ZipResourceHandler(string name)
        {
            DisposeArchive();
            zipStream = new FileStream(name, FileMode.Create);
            archive = new ZipArchive(zipStream, ZipArchiveMode.Update);
            initialized = false;
        }

        public override void HandleResource(Resource resource, ResourceHandlingContext context)
        {
            // The first resource (the HTML document) is named by its file name only;
            // every later resource also keeps its parent folder name
            var zipUri = (streamsCounter++ == 0
                ? Path.GetFileName(resource.OriginalUrl.Href)
                : Path.Combine(Path.GetFileName(Path.GetDirectoryName(resource.OriginalUrl.Href)),
                    Path.GetFileName(resource.OriginalUrl.Href)));

            // Every entry after the first one gets the sample "my_" prefix
            var samplePrefix = String.Empty;
            if (initialized)
                samplePrefix = "my_";
            else
                initialized = true;

            // Create a Zip entry, save the resource into it, and release the stream
            using (var newStream = archive.CreateEntry(samplePrefix + zipUri).Open())
            {
                resource.WithOutputUrl(new Url("file:///" + samplePrefix + zipUri)).Save(newStream, context);
            }
        }

        private void DisposeArchive()
        {
            if (archive != null)
            {
                archive.Dispose();
                archive = null;
            }

            if (zipStream != null)
            {
                zipStream.Dispose();
                zipStream = null;
            }

            streamsCounter = 0;
        }

        public void Dispose()
        {
            DisposeArchive();
        }
    }
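Once the archive has been written and the handler disposed, it can be useful to check which entries ended up in it. The following is a minimal sketch that uses only the standard System.IO.Compression types, not Aspose.HTML. The ZipInspector class and ListEntries() method are hypothetical names chosen for this illustration. On .NET Framework, ZipFile additionally requires a reference to the System.IO.Compression.FileSystem assembly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only (not an Aspose.HTML type).
public static class ZipInspector
{
    // Returns one line per archive entry: "<entry name> (<uncompressed size> bytes)"
    public static List<string> ListEntries(string archivePath)
    {
        var lines = new List<string>();

        // OpenRead opens the archive in read-only mode
        using (ZipArchive archive = ZipFile.OpenRead(archivePath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
                lines.Add($"{entry.FullName} ({entry.Length} bytes)");
        }

        return lines;
    }
}
```

Calling ZipInspector.ListEntries(customArchivePath) after the using block that owns the handler has finished returns one line for the HTML document and one for each saved resource.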

Save HTML to Memory Streams

Let’s consider a C# example of saving an HTML file with linked resources to memory streams, first using the deprecated IOutputStorage interface and then the new ResourceHandler class. The source with-resources.html document and the linked files are in the same directory.

Using IOutputStorage – 24.1.0 and Earlier Versions


The IOutputStorage interface implementation allowed saving HTML to memory streams:

using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;
using System.Collections.Generic;
...

    // Prepare a path to a source HTML file
    string inputPath = Path.Combine(DataDir, "with-resources.html");

    // Initialize an HTML document
    using (var doc = new HTMLDocument(inputPath))
    {
        // Create an instance of the MemoryOutputStorage class and save HTML to memory
        var memoryStorage = new MemoryOutputStorage();
        doc.Save(memoryStorage);
        memoryStorage.PrintInfo();
    }

After the example runs, information about the memory storage is printed:

uri:memory:///with-resources.html, length:256
uri:memory:///photo1.png, length:57438

The following code snippet shows the MemoryOutputStorage class, which implements IOutputStorage to save an HTML document to memory streams.

    internal class MemoryOutputStorage : IOutputStorage
    {
        public List<Tuple<OutputStream, string>> Streams;

        public MemoryOutputStorage()
        {
            Streams = new List<Tuple<OutputStream, string>>();
        }

        public OutputStream CreateStream(OutputStreamContext context)
        {
            var normalizedPath = new Url(context.Uri).Pathname;
            var uri = new Url(Path.GetFileName(normalizedPath), "memory:///").Href;
            var outputStream = new OutputStream(new MemoryStream(), uri);
            Streams.Add(Tuple.Create(outputStream, uri));
            return outputStream;
        }

        public void ReleaseStream(OutputStream stream)
        {
            stream.Flush();
        }

        public void PrintInfo()
        {
            foreach (var stream in Streams)
                Console.WriteLine($"uri:{stream.Item2}, length:{stream.Item1.Length}");
        }
    }

Using the New ResourceHandler Class – Since Version 24.2.0


The following C# example shows how to save an HTML document to memory streams using a MemoryResourceHandler class derived from ResourceHandler:

using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;
using Aspose.Html.Saving.ResourceHandlers;
using System.Collections.Generic;
...

    // Prepare a path to a source HTML file
    string inputPath = Path.Combine(DataDir, "with-resources.html");

    // Load the HTML document
    using (var doc = new HTMLDocument(inputPath))
    {
        // Create an instance of the MemoryResourceHandler class and save HTML to memory
        var resourceHandler = new MemoryResourceHandler();
        doc.Save(resourceHandler);
        resourceHandler.PrintInfo();
    }

Instead of the CreateStream() and ReleaseStream() methods that the MemoryOutputStorage class implemented for the IOutputStorage interface, the MemoryResourceHandler class overrides a single HandleResource() method of the ResourceHandler class, in which you create a stream yourself and release it wherever you need to. The method also receives the Resource object, so you can inspect it and process more information about it:

    internal class MemoryResourceHandler : ResourceHandler
    {
        public List<Tuple<Stream, Resource>> Streams;

        public MemoryResourceHandler()
        {
            Streams = new List<Tuple<Stream, Resource>>();
        }

        public override void HandleResource(Resource resource, ResourceHandlingContext context)
        {
            var outputStream = new MemoryStream();
            Streams.Add(Tuple.Create<Stream, Resource>(outputStream, resource));
            resource
                .WithOutputUrl(new Url(Path.GetFileName(resource.OriginalUrl.Pathname), "memory:///"))
                .Save(outputStream, context);
        }

        public void PrintInfo()
        {
            foreach (var stream in Streams)
                Console.WriteLine($"uri:{stream.Item2.OutputUrl}, length:{stream.Item1.Length}");
        }
    }

As you can see, the new approach operates directly on Resource objects, eliminating the need for additional classes such as OutputStream. This simplifies the code and makes the interaction with resources more explicit. Thus, the ResourceHandler base class offers a simpler and more expressive way to handle resources when saving HTML documents to memory streams.

After the example runs, information about the memory storage is printed:

uri:memory:///with-resources.html, length:256
uri:memory:///photo1.png, length:57438
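The output above reports only the length of each stream. To look at the saved content itself, a stream can be rewound and read back. The sketch below uses only standard System.IO types. The MemoryStreamText class and ReadAllText() method are hypothetical names, and the sketch assumes that the stream is seekable and still open after saving, which matches the example above, where PrintInfo() reads Length after Save() has returned.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only (not an Aspose.HTML type).
public static class MemoryStreamText
{
    // Reads the whole stream as text and then restores the original position.
    // Assumes a seekable, still-open stream such as a MemoryStream.
    public static string ReadAllText(Stream stream)
    {
        long savedPosition = stream.Position;
        stream.Position = 0; // rewind: after writing, the position is at the end

        // leaveOpen: true keeps the caller's stream usable after the reader is disposed;
        // a byte order mark, if present, is detected and skipped
        using (var reader = new StreamReader(stream, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            string text = reader.ReadToEnd();
            stream.Position = savedPosition;
            return text;
        }
    }
}
```

With the MemoryResourceHandler shown earlier, MemoryStreamText.ReadAllText(resourceHandler.Streams[0].Item1) would return the markup of the first saved resource. Reading a binary resource such as an image as text is not meaningful; for those, copy the bytes instead.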
