Convert HTML from ZIP archive to PDF – C# example
In this article, we create a custom Message Handler to do a specific task – convert HTML from ZIP archive to PDF.
There are many reasons to convert HTML from a ZIP archive to PDF. PDF has advantages that many other formats lack: most programs and apps support it, and web browsers such as Chrome and Firefox can display it directly. PDF files are optimized for printing, which makes them well suited to producing physical copies of your documents. The format also supports several compression algorithms, lets you configure security settings for a file, and more.
Create a Custom Message Handler
Aspose.HTML for .NET lets you create custom message handlers. Let’s design a custom handler for working with ZIP archives. Take the following steps:
Import the Aspose.Html.Net namespace. It contains the classes and interfaces responsible for network processing.
To create a custom message handler, define your own class derived from the MessageHandler class, which is the base type for message handlers. Implementing IDisposable provides a mechanism for the deterministic release of unmanaged resources.
using System;
using Aspose.Html.Net;
...

class ZipArchiveMessageHandler : MessageHandler, IDisposable
{
    // Members are added in the following steps
}
You have now defined your own ZipArchiveMessageHandler class; next, add the required operations to it.
Add a constructor to the ZipArchiveMessageHandler class that stores the archive path and adds a filter to the Filters collection.
Override the Invoke() method of the MessageHandler class to implement the custom message handler behaviour.
using System;
using System.Net;
using Aspose.Html;
using Aspose.Html.Net;
using Aspose.Html.Net.MessageFilters;
using Aspose.Zip;
...

// Define the ZipArchiveMessageHandler class, derived from the MessageHandler class
class ZipArchiveMessageHandler : MessageHandler, IDisposable
{
    private string filePath;
    private Archive archive;

    // Store the archive path and restrict the handler to the "zip" protocol
    public ZipArchiveMessageHandler(string path)
    {
        this.filePath = path;
        Filters.Add(new ProtocolMessageFilter("zip"));
    }

    // Override the Invoke() method
    public override void Invoke(INetworkOperationContext context)
    {
        // Look up the requested resource in the archive (GetFile() is defined below)
        var buff = GetFile(context.Request.RequestUri.Pathname.TrimStart('/'));
        if (buff != null)
        {
            // The resource was found in the archive: return it as the response
            context.Response = new ResponseMessage(HttpStatusCode.OK)
            {
                Content = new ByteArrayContent(buff)
            };
            context.Response.Headers.ContentType.MediaType = MimeType.FromFileExtension(context.Request.RequestUri.Pathname);
        }
        else
        {
            // The resource is not in the archive
            context.Response = new ResponseMessage(HttpStatusCode.NotFound);
        }

        // Call the next message handler
        Next(context);
    }
}
Let’s take a closer look at this code snippet:
First of all, the custom ZipArchiveMessageHandler inherits from the base MessageHandler class. It has two fields: the archive and the string path to the archive.
Message handlers support filtering. Here a protocol (scheme) filter is added, so this message handler works only with the "zip" protocol: if a resource has the "zip" protocol, it is processed by ZipArchiveMessageHandler.
Filtering messages by resource protocol is implemented by the ProtocolMessageFilter class. The ProtocolMessageFilter() constructor initializes a new instance of the class and takes the "zip" protocol as a parameter.
The Invoke() method implements the message handler behaviour. It is called for each handler in the pipeline and takes a context as a parameter. Invoke() delegates the lookup to the GetFile() method, which searches the ZIP archive for the requested resource and returns its data as a byte array; Invoke() then forms the Response from that result. This follows the chain-of-responsibility pattern: once the handler has done its part, it calls Next(context) to pass control to the next handler.
The context provides contextual information for network services: the subject of the operation is passed in through it, and the result of the operation is returned through it. In Aspose.HTML, the context is represented by the INetworkOperationContext interface, which has two properties, Request and Response. Request gets or sets the request message, and Response gets or sets the response message. The Request contains the information for a web request, such as the URL (the path to a resource) and headers. The Response contains the response returned by the endpoint.
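To make the filtering and Next(context) flow concrete, here is a minimal, library-independent sketch of the same idea. All type names in it (SketchContext, SketchHandler, ZipSketchHandler, SketchPipeline) are hypothetical stand-ins invented for illustration; they are not Aspose.HTML types, and the real pipeline may differ in detail. The sketch assumes that a handler whose filter does not match is simply skipped, and it uses an in-memory dictionary in place of a real archive.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are NOT Aspose.HTML types.
public sealed class SketchContext
{
    public Uri RequestUri { get; }
    public string Response { get; set; }   // stays null until a handler answers

    public SketchContext(Uri requestUri) => RequestUri = requestUri;
}

public abstract class SketchHandler
{
    // Schemes this handler accepts; an empty set means "accept everything".
    public HashSet<string> Schemes { get; } =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public bool Matches(SketchContext context) =>
        Schemes.Count == 0 || Schemes.Contains(context.RequestUri.Scheme);

    // Each handler does its part, then decides whether to call next.
    public abstract void Invoke(SketchContext context, Action<SketchContext> next);
}

public sealed class ZipSketchHandler : SketchHandler
{
    private readonly IReadOnlyDictionary<string, string> entries;

    public ZipSketchHandler(IReadOnlyDictionary<string, string> entries)
    {
        this.entries = entries;
        Schemes.Add("zip");   // plays the role of the protocol filter
    }

    public override void Invoke(SketchContext context, Action<SketchContext> next)
    {
        // "/test.html" becomes "test.html", the name used inside the archive
        var name = context.RequestUri.AbsolutePath.TrimStart('/');
        context.Response = entries.TryGetValue(name, out var body) ? body : "404 Not Found";
        next(context);        // pass control down the chain
    }
}

public static class SketchPipeline
{
    public static void Run(IReadOnlyList<SketchHandler> handlers, SketchContext context) =>
        RunFrom(handlers, 0, context);

    private static void RunFrom(IReadOnlyList<SketchHandler> handlers, int index, SketchContext context)
    {
        // Skip handlers whose filter does not match this request
        while (index < handlers.Count && !handlers[index].Matches(context)) index++;
        if (index == handlers.Count) return;

        handlers[index].Invoke(context, ctx => RunFrom(handlers, index + 1, ctx));
    }
}
```

With a single ZipSketchHandler registered, a request for zip:///test.html is answered from the dictionary, while an https:// request passes through untouched because the scheme filter does not match.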
Define the GetFile(), GetArchive(), and Dispose() Methods
using System.IO;
using System.Linq;
using Aspose.Zip;
...

    // Return the content of the archive entry with the given path, or null if it is missing
    byte[] GetFile(string path)
    {
        // Archive entry names use forward slashes
        path = path.Replace(@"\", @"/");
        var result = GetArchive().Entries.FirstOrDefault(x => path == x.Name);
        if (result != null)
        {
            using (var fs = result.Open())
            using (MemoryStream ms = new MemoryStream())
            {
                fs.CopyTo(ms);
                return ms.ToArray();
            }
        }
        return null;
    }

    // Open the archive on first use and reuse it afterwards
    Archive GetArchive()
    {
        return archive ??= new Archive(filePath);
    }

    // Release the archive if it was opened
    public void Dispose()
    {
        archive?.Dispose();
    }
You can download the complete examples and data files from GitHub.
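The same lookup can be sketched without Aspose.ZIP, using the ZipArchive class from System.IO.Compression in the .NET base class library. This is a swapped-in technique shown only to illustrate the logic of GetFile(); the ZipLookupSketch class and its method names are hypothetical. Note that in System.IO.Compression the full path of an entry is exposed as FullName, which is why the sketch compares against that property.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper for illustration; it uses System.IO.Compression, not Aspose.ZIP.
public static class ZipLookupSketch
{
    // Entry names inside a ZIP use forward slashes and have no leading slash.
    public static string NormalizeEntryPath(string path) =>
        path.Replace('\\', '/').TrimStart('/');

    // Returns the entry's bytes, or null when the archive has no entry with that name.
    public static byte[] ReadEntry(ZipArchive archive, string path)
    {
        var name = NormalizeEntryPath(path);
        var entry = archive.Entries.FirstOrDefault(e => e.FullName == name);
        if (entry == null) return null;

        using (var source = entry.Open())
        using (var buffer = new MemoryStream())
        {
            source.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

Returning null for a missing entry mirrors the handler above, which turns a null result into a NotFound response.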
Add ZipArchiveMessageHandler to the Pipeline
You now need to add ZipArchiveMessageHandler to the pipeline. Use the Add() method, which takes the zip object as a parameter and adds ZipArchiveMessageHandler to the end of the message handlers’ collection.
The INetworkService.MessageHandlers property gets a list of MessageHandler instances to be invoked as a RequestMessage executes.
using System;
using System.IO;
using Aspose.Html;
using Aspose.Html.Net;
using Aspose.Html.Rendering.Pdf;
using Aspose.Html.Services;
...

    // DataDir and OutputDir are assumed to be defined by the surrounding example project

    // Prepare path to a source zip file
    string documentPath = Path.Combine(DataDir, "test.zip");

    // Prepare path for converted file saving
    string savePath = Path.Combine(OutputDir, "zip-to-pdf.pdf");

    // Create an instance of ZipArchiveMessageHandler
    using var zip = new ZipArchiveMessageHandler(documentPath);

    // Create an instance of the Configuration class
    using var configuration = new Configuration();

    // Add ZipArchiveMessageHandler to the chain of existing message handlers
    configuration
        .GetService<INetworkService>()
        .MessageHandlers.Add(zip);

    // Initialize an HTML document with specified configuration
    using var document = new HTMLDocument("zip:///test.html", configuration);

    // Create the PDF Device
    using var device = new PdfDevice(savePath);

    // Render ZIP to PDF
    document.RenderTo(device);
In the example, the ZIP archive (test.zip) contains the HTML document (test.html), in which all related resources have paths relative to the HTML document.
Note: The HTMLDocument(address, configuration) constructor takes an absolute address that points into the ZIP archive (zip:///test.html in the example), while all related resources are referenced by relative paths in the HTML document and in the example’s code.
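As a rough illustration of why relative resource paths work with an absolute zip:/// document address, the sketch below resolves a relative reference against the document address with System.Uri and derives a candidate archive entry name from it. This is an assumption about how such resolution can be modelled, not a description of Aspose.HTML internals; the ZipAddressSketch class and ToEntryName method are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration; not part of Aspose.HTML.
public static class ZipAddressSketch
{
    // Resolves a reference found in the HTML against the document address,
    // then strips the leading slash to obtain a candidate archive entry name.
    public static string ToEntryName(string documentAddress, string reference)
    {
        var baseUri = new Uri(documentAddress);      // e.g. zip:///test.html
        var resolved = new Uri(baseUri, reference);  // standard relative-URI resolution
        return Uri.UnescapeDataString(resolved.AbsolutePath).TrimStart('/');
    }
}
```

For instance, calling ZipAddressSketch.ToEntryName("zip:///test.html", "css/site.css") yields an entry name that a handler like the one above could look up in the archive.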
Please read the Fine-Tuning Converters article to learn more about converting HTML to PDF using the RenderTo(device) method.
Aspose.HTML provides a free online ZIP to PDF Converter that allows you to quickly, easily and clearly convert HTML from ZIP archive to PDF. Upload, convert files and get results in seconds. No additional software is required. Try our robust Converter for free now!