Aspose.HTML for .NET 24.2.0 – Save HTML to a Stream
Aspose.HTML for .NET 24.2.0
In version 24.2.0, the IOutputStorage interface has been deprecated but will continue to work until version 24.5.0 is released. If you use earlier versions of Aspose.HTML for .NET, we recommend that you upgrade and migrate to the new version, as version 24.5.0 will remove this deprecated interface.
Deprecated | New |
---|---|
IOutputStorage Interface | ResourceHandler Class |
An HTML document can contain different resources such as CSS, external images, and other files. In this article, we consider how to save HTML documents with their resources to a Zip archive and to memory streams. Aspose.HTML for .NET continues to improve the ways to save HTML documents together with all linked files. Here we look at examples that save files using the deprecated interface and show the equivalent solutions built on the new class.
ResourceHandler Class Vs IOutputStorage Interface
The ResourceHandler class allows developers to implement the HandleResource() method, in which you create a stream yourself and release it wherever you need. The method also receives the Resource being saved, so you can inspect it and act on more information about it. By adopting the ResourceHandler class, developers get a more streamlined and expressive approach to managing resources, resulting in cleaner, more maintainable, and more flexible code when saving HTML documents.
Save HTML to a Zip Archive
Here, we examine the same example, saving the with-resources.html file to a Zip archive, first using the deprecated IOutputStorage interface and then using the new ResourceHandler class.
Using IOutputStorage – 24.1.0 and Earlier Versions
The IOutputStorage interface was a base interface that supported the creation and management of output streams. You can implement the IOutputStorage interface in a custom ZipStorage class to save HTML with resources to a Zip archive:
using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;
using System.IO.Compression;
...

    // Prepare a path to a source HTML file
    string inputPath = Path.Combine(DataDir, "with-resources.html");

    var dir = Directory.GetCurrentDirectory();

    // Prepare a full path to an output Zip storage
    string customArchivePath = Path.Combine(dir, "./../../../../tests-out/old/archive.zip");

    // Load the HTML document
    using (var doc = new HTMLDocument(inputPath))
    {
        // Initialize an instance of the ZipStorage class
        using (var zipStorage = new ZipStorage(customArchivePath))
        {
            // Save HTML with resources to a Zip archive
            doc.Save(zipStorage);
        }
    }
The following code snippet shows the ZipStorage class, an implementation of the IOutputStorage interface that saves an HTML document with resources to a Zip archive:
    internal class ZipStorage : IOutputStorage, IDisposable
    {
        private FileStream zipStream;
        private ZipArchive archive;
        private int streamsCounter;   // Number of streams created so far
        private bool initialized;     // Becomes true once the main document has been handled

        public ZipStorage(string name)
        {
            DisposeArchive();
            zipStream = new FileStream(name, FileMode.Create);
            archive = new ZipArchive(zipStream, ZipArchiveMode.Update);
            initialized = false;
        }

        // Called for the document and for every linked resource
        public OutputStream CreateStream(OutputStreamContext context)
        {
            // The first stream (the main document) goes to the archive root;
            // resources keep the name of their immediate parent folder
            var zipUri = (streamsCounter++ == 0 ? Path.GetFileName(context.Uri) :
                Path.Combine(Path.GetFileName(Path.GetDirectoryName(context.Uri)), Path.GetFileName(context.Uri)));

            // Every entry except the first one gets the "my_" sample prefix
            var samplePrefix = String.Empty;
            if (initialized)
                samplePrefix = "my_";
            else
                initialized = true;

            var newStream = archive.CreateEntry(samplePrefix + zipUri).Open();
            var outputStream = new OutputStream(newStream, "file:///" + samplePrefix + zipUri);
            return outputStream;
        }

        // Called when writing to the stream is finished
        public void ReleaseStream(OutputStream stream)
        {
            stream.Flush();
            stream.Close();
        }

        private void DisposeArchive()
        {
            if (archive != null)
            {
                archive.Dispose();
                archive = null;
            }
            if (zipStream != null)
            {
                zipStream.Dispose();
                zipStream = null;
            }
            streamsCounter = 0;
        }

        public void Dispose()
        {
            DisposeArchive();
        }
    }
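The entry name in CreateStream() is built with Path.Combine(), which joins parts with the platform directory separator (a backslash on Windows), whereas the Zip format specification expects forward slashes in entry names. The following is a minimal sketch of an alternative that always produces forward slashes. The ZipEntryNames class and its ToEntryName() method are hypothetical helpers, not part of Aspose.HTML, and the sample URIs (including the with-resources_files folder name) are assumptions made for illustration only:

```csharp
using System;

public static class ZipEntryNames
{
    // Hypothetical helper (not part of Aspose.HTML): builds a Zip entry name
    // from a resource URI, always using '/' as the separator.
    public static string ToEntryName(string uri, bool isMainDocument)
    {
        // Split on '/' and drop the empty parts produced by "file:///"
        string[] parts = uri.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 0)
            throw new ArgumentException("URI has no path segments.", nameof(uri));

        string fileName = parts[parts.Length - 1];

        // The main document goes to the archive root; linked resources keep
        // their immediate parent folder, mirroring the logic in CreateStream()
        if (isMainDocument || parts.Length < 2)
            return fileName;

        return parts[parts.Length - 2] + "/" + fileName;
    }

    public static void Main()
    {
        // Assumed sample URIs, for illustration only
        Console.WriteLine(ToEntryName("file:///C:/out/with-resources.html", true));
        // prints: with-resources.html
        Console.WriteLine(ToEntryName("file:///C:/out/with-resources_files/photo1.png", false));
        // prints: with-resources_files/photo1.png
    }
}
```

The sketch deliberately ignores query strings and fragments; if your resource URIs can contain them, strip them before building the entry name.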
Using new ResourceHandler class – Since Version 24.2.0
The ResourceHandler class is designed to be extended by your own code. The following C# example shows how to save an HTML document with resources to a Zip archive using ZipResourceHandler, a custom class derived from ResourceHandler:
using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;
using Aspose.Html.Saving.ResourceHandlers;
using System.IO.Compression;
...

    // Prepare a path to a source HTML file
    string inputPath = Path.Combine(DataDir, "with-resources.html");

    var dir = Directory.GetCurrentDirectory();

    // Prepare a full path to an output Zip storage
    string customArchivePath = Path.Combine(dir, "./../../../../tests-out/new/archive.zip");

    // Load the HTML document
    using (var doc = new HTMLDocument(inputPath))
    {
        // Initialize an instance of the ZipResourceHandler class
        using (var resourceHandler = new ZipResourceHandler(customArchivePath))
        {
            // Save HTML with resources to a Zip archive
            doc.Save(resourceHandler);
        }
    }
The following code snippet shows the ZipResourceHandler class, which derives from ResourceHandler and saves an HTML document with resources to a Zip archive. Its HandleResource() method is called for each resource during the saving process and writes that resource to its own archive entry:
    internal class ZipResourceHandler : ResourceHandler, IDisposable
    {
        private FileStream zipStream;
        private ZipArchive archive;
        private int streamsCounter;   // Number of resources handled so far
        private bool initialized;     // Becomes true once the main document has been handled

        public ZipResourceHandler(string name)
        {
            DisposeArchive();
            zipStream = new FileStream(name, FileMode.Create);
            archive = new ZipArchive(zipStream, ZipArchiveMode.Update);
            initialized = false;
        }

        // Called for the document and for every linked resource
        public override void HandleResource(Resource resource, ResourceHandlingContext context)
        {
            // The first resource (the main document) goes to the archive root;
            // other resources keep the name of their immediate parent folder
            var zipUri = (streamsCounter++ == 0
                ? Path.GetFileName(resource.OriginalUrl.Href)
                : Path.Combine(Path.GetFileName(Path.GetDirectoryName(resource.OriginalUrl.Href)),
                    Path.GetFileName(resource.OriginalUrl.Href)));

            // Every entry except the first one gets the "my_" sample prefix
            var samplePrefix = String.Empty;
            if (initialized)
                samplePrefix = "my_";
            else
                initialized = true;

            // Create the entry stream, save the resource into it, and release the stream right away
            using (var newStream = archive.CreateEntry(samplePrefix + zipUri).Open())
            {
                resource.WithOutputUrl(new Url("file:///" + samplePrefix + zipUri)).Save(newStream, context);
            }
        }

        private void DisposeArchive()
        {
            if (archive != null)
            {
                archive.Dispose();
                archive = null;
            }

            if (zipStream != null)
            {
                zipStream.Dispose();
                zipStream = null;
            }

            streamsCounter = 0;
        }

        public void Dispose()
        {
            DisposeArchive();
        }
    }
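To see the entry-per-resource pattern from HandleResource() in isolation, the following sketch uses only System.IO.Compression and no Aspose.HTML types. It writes a few text entries into an archive held in memory and then reopens the archive to list them, which is also a simple way to check what a handler has produced. The ZipRoundTrip class, its method names, and the sample entry contents are assumptions made for this illustration; unlike the handler above, the sketch opens the archive in ZipArchiveMode.Create and writes to a MemoryStream rather than a FileStream:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipRoundTrip
{
    // Writes each (entry name, text) pair as a separate entry and returns the archive bytes
    public static byte[] Write(IEnumerable<KeyValuePair<string, string>> files)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen keeps the buffer usable after the archive is disposed
            using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
            {
                foreach (var file in files)
                {
                    // Same pattern as in HandleResource(): one entry, one short-lived stream
                    using (Stream entryStream = archive.CreateEntry(file.Key).Open())
                    {
                        byte[] bytes = Encoding.UTF8.GetBytes(file.Value);
                        entryStream.Write(bytes, 0, bytes.Length);
                    }
                }
            } // Disposing the archive finishes writing it to the buffer

            return buffer.ToArray();
        }
    }

    // Reopens the archive for reading and returns the full names of its entries
    public static List<string> ListEntries(byte[] zipBytes)
    {
        var names = new List<string>();
        using (var archive = new ZipArchive(new MemoryStream(zipBytes), ZipArchiveMode.Read))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
                names.Add(entry.FullName);
        }
        return names;
    }

    public static void Main()
    {
        // Assumed sample content, for illustration only
        var files = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("with-resources.html", "<p>Hello</p>"),
            new KeyValuePair<string, string>("my_with-resources_files/style.css", "p { color: red; }")
        };

        foreach (string name in ListEntries(Write(files)))
            Console.WriteLine(name);
    }
}
```

To inspect an archive saved to disk, such as the one created by ZipResourceHandler, you can open it with ZipFile.OpenRead() and enumerate Entries in the same way.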
Save HTML to Memory Streams
Let’s consider a C# example of saving an HTML file with linked resources to memory streams using the deprecated IOutputStorage interface and the new ResourceHandler class. The source with-resources.html document and the linked files are in the same directory.
Using IOutputStorage – 24.1.0 and Earlier Versions
An implementation of the IOutputStorage interface allowed saving HTML to memory streams:
using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;
using System.Collections.Generic;
...

    // Prepare a path to a source HTML file
    string inputPath = Path.Combine(DataDir, "with-resources.html");

    // Initialize an HTML document
    using (var doc = new HTMLDocument(inputPath))
    {
        // Create an instance of the MemoryOutputStorage class and save HTML to memory
        var memoryStorage = new MemoryOutputStorage();
        doc.Save(memoryStorage);
        memoryStorage.PrintInfo();
    }
After the example runs, information about the memory storage is printed:
uri:memory:///with-resources.html, length:256
uri:memory:///photo1.png, length:57438
The following code snippet shows the MemoryOutputStorage class, an implementation of the IOutputStorage interface that saves an HTML document to memory streams:
    internal class MemoryOutputStorage : IOutputStorage
    {
        // Each created stream together with the URI it was saved under
        public List<Tuple<OutputStream, string>> Streams;

        public MemoryOutputStorage()
        {
            Streams = new List<Tuple<OutputStream, string>>();
        }

        // Called for the document and for every linked resource
        public OutputStream CreateStream(OutputStreamContext context)
        {
            // Keep only the file name and place it under the "memory:///" base
            var normalizedPath = new Url(context.Uri).Pathname;
            var uri = new Url(Path.GetFileName(normalizedPath), "memory:///").Href;
            var outputStream = new OutputStream(new MemoryStream(), uri);
            Streams.Add(Tuple.Create(outputStream, uri));
            return outputStream;
        }

        // Called when writing to the stream is finished; the stream stays open for later reading
        public void ReleaseStream(OutputStream stream)
        {
            stream.Flush();
        }

        public void PrintInfo()
        {
            foreach (var stream in Streams)
                Console.WriteLine($"uri:{stream.Item2}, length:{stream.Item1.Length}");
        }
    }
Using new ResourceHandler class – Since Version 24.2.0
The following C# example saves an HTML document to memory streams using MemoryResourceHandler, a custom class derived from ResourceHandler:
using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Saving;
using Aspose.Html.Saving.ResourceHandlers;
using System.Collections.Generic;
...

    // Prepare a path to a source HTML file
    string inputPath = Path.Combine(DataDir, "with-resources.html");

    // Load the HTML document
    using (var doc = new HTMLDocument(inputPath))
    {
        // Create an instance of the MemoryResourceHandler class and save HTML to memory
        var resourceHandler = new MemoryResourceHandler();
        doc.Save(resourceHandler);
        resourceHandler.PrintInfo();
    }
Instead of the CreateStream() and ReleaseStream() methods that MemoryOutputStorage implemented for the IOutputStorage interface, MemoryResourceHandler overrides a single HandleResource() method of the ResourceHandler class, in which you create a stream yourself and release it wherever you need. The method also receives the Resource being saved, so you can inspect it and use more information about it:
    internal class MemoryResourceHandler : ResourceHandler
    {
        // Each created stream together with the resource saved into it
        public List<Tuple<Stream, Resource>> Streams;

        public MemoryResourceHandler()
        {
            Streams = new List<Tuple<Stream, Resource>>();
        }

        // Called for the document and for every linked resource
        public override void HandleResource(Resource resource, ResourceHandlingContext context)
        {
            var outputStream = new MemoryStream();
            Streams.Add(Tuple.Create<Stream, Resource>(outputStream, resource));

            // Keep only the file name, place it under the "memory:///" base, and save the resource
            resource
                .WithOutputUrl(new Url(Path.GetFileName(resource.OriginalUrl.Pathname), "memory:///"))
                .Save(outputStream, context);
        }

        public void PrintInfo()
        {
            foreach (var stream in Streams)
                Console.WriteLine($"uri:{stream.Item2.OutputUrl}, length:{stream.Item1.Length}");
        }
    }
As you can see, the new approach works directly with Resource objects, eliminating the need for additional classes such as OutputStream. This simplifies the code and makes interaction with resources more explicit. Thus, the ResourceHandler base class offers a simpler and more expressive way to handle resources when saving HTML documents to memory streams.
After the example runs, information about the memory storage is printed:
uri:memory:///with-resources.html, length:256
uri:memory:///photo1.png, length:57438
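PrintInfo() reports only the length of each stream. If you also need the saved content, keep in mind that a MemoryStream is positioned at its end after writing. The following sketch, which uses only the .NET base class library, reads the content back with ToArray(), which copies the whole buffer regardless of the current position. The MemoryStreamReadBack class and its ReadAllText() method are hypothetical helpers, and decoding as UTF-8 is an assumption about how the document was saved. Because the handler above stores its streams as Stream, you would need to cast an item to MemoryStream before passing it in:

```csharp
using System;
using System.IO;
using System.Text;

public static class MemoryStreamReadBack
{
    // Hypothetical helper: returns the text held in a MemoryStream,
    // regardless of the stream's current position
    public static string ReadAllText(MemoryStream stream)
    {
        // ToArray() copies the entire buffer and ignores Position,
        // so no rewinding is needed; UTF-8 is an assumption
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    public static void Main()
    {
        // Simulate a resource that has just been saved into memory
        var stream = new MemoryStream();
        byte[] html = Encoding.UTF8.GetBytes("<p>Hello</p>");
        stream.Write(html, 0, html.Length);      // Position is now at the end

        Console.WriteLine(stream.Length);        // prints: 12
        Console.WriteLine(ReadAllText(stream));  // prints: <p>Hello</p>
    }
}
```

If you prefer to read through a StreamReader instead, set the stream's Position to 0 first.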