Creating a Document
The HTMLDocument class is the starting point for the Aspose.HTML class library. You can load an HTML page into the Document Object Model (DOM) by using the HTMLDocument class and then programmatically read, modify, and remove HTML in the document.
HTMLDocument has several overloaded constructors that let you create a blank document or load one from a file or stream.
Create a New HTML Document
To generate a document programmatically from scratch, use the constructor without parameters.
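The snippet that originally accompanied this section is not present here, so the following is a minimal sketch rather than the authoritative example. It assumes the Aspose.HTML for .NET package is referenced, and it adds a text node only to show that the blank document is usable; the text content is an illustrative assumption.

```csharp
using System;
using Aspose.Html;

class Program
{
    static void Main()
    {
        // Create a blank document with the parameterless constructor.
        using (var document = new HTMLDocument())
        {
            // Illustrative step: add a text node so the document has some content.
            var text = document.CreateTextNode("Hello World!");
            document.Body.AppendChild(text);

            // Print the resulting markup.
            Console.WriteLine(document.DocumentElement.OuterHTML);
        }
    }
}
```

The `using` block disposes the document and releases its resources once you are done with it.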
Load from a File
To load an HTMLDocument from an existing file, pass the file path to the constructor.
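As a hedged sketch of loading from a file, the example below first writes a small HTML file so that it is self-contained. The file name `document.html` and its content are hypothetical and chosen only for illustration.

```csharp
using System;
using System.IO;
using Aspose.Html;

class Program
{
    static void Main()
    {
        // Hypothetical input file, created here so the example is self-contained.
        string documentPath = Path.Combine(Directory.GetCurrentDirectory(), "document.html");
        File.WriteAllText(documentPath, "<p>Hello World!</p>");

        // Load the document from the file path.
        using (var document = new HTMLDocument(documentPath))
        {
            // Print the loaded markup.
            Console.WriteLine(document.DocumentElement.OuterHTML);
        }
    }
}
```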
Load from a URL
You can also load a web page into an HTMLDocument by passing its URL.
If you pass a URL that cannot be reached at the moment, the library throws a DOMException with the specialized code ‘NetworkError’ to inform you that the selected resource cannot be found.
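The behavior described above can be sketched as follows. The URL is a placeholder, and the example assumes that the exception type lives in the `Aspose.Html.Dom` namespace; it prints only the exception message and does not rely on any particular property for the error code.

```csharp
using System;
using Aspose.Html;

class Program
{
    static void Main()
    {
        // Placeholder address; replace it with the page you want to load.
        string url = "https://example.com/";

        try
        {
            // Load the web page into the DOM.
            using (var document = new HTMLDocument(url))
            {
                Console.WriteLine(document.DocumentElement.OuterHTML);
            }
        }
        catch (Aspose.Html.Dom.DOMException ex)
        {
            // Assumed namespace for DOMException; thrown when the resource cannot be reached.
            Console.WriteLine("Failed to load the page: " + ex.Message);
        }
    }
}
```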
Load from HTML Code
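The text for this section did not survive, so what follows is a minimal sketch under stated assumptions rather than the original example. It assumes the constructor overload that takes the HTML content as a string together with a base URI; the markup and the base URI `"."` are illustrative.

```csharp
using System;
using Aspose.Html;

class Program
{
    static void Main()
    {
        // Illustrative HTML held in an in-memory string.
        string htmlCode = "<p>Hello World!</p>";

        // The second argument is the base URI used to resolve relative links.
        using (var document = new HTMLDocument(htmlCode, "."))
        {
            // Print the parsed document.
            Console.WriteLine(document.DocumentElement.OuterHTML);
        }
    }
}
```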
Working with SVG Documents
Since Scalable Vector Graphics (SVG) is part of the W3C standards and can be embedded into an HTMLDocument, we implemented SVGDocument and all its functionality. Our implementation is based on the official SVG 2 specification, so you can load, read, and manipulate SVG documents as the specification describes.
Since SVGDocument and HTMLDocument are based on the same WHATWG DOM standard, all operations such as loading, reading, editing, converting, and saving are similar for both document types. All examples that manipulate an HTMLDocument therefore apply to SVGDocument as well.
You can load an SVG document from an in-memory System.String variable in the same way as an HTML document.
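A hedged sketch of loading SVG from a string is shown below. It assumes that SVGDocument is located in the `Aspose.Html.Dom.Svg` namespace and offers the same content-plus-base-URI constructor as HTMLDocument; the SVG markup is illustrative.

```csharp
using System;
using Aspose.Html.Dom.Svg;

class Program
{
    static void Main()
    {
        // Illustrative SVG markup held in a System.String variable.
        string svgCode =
            "<svg xmlns='http://www.w3.org/2000/svg'>" +
            "<circle cx='50' cy='50' r='40'/>" +
            "</svg>";

        // Load the SVG content; "." is used as the base URI.
        using (var document = new SVGDocument(svgCode, "."))
        {
            // Print the parsed SVG markup.
            Console.WriteLine(document.DocumentElement.OuterHTML);
        }
    }
}
```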
Working with MHTML Documents
MHTML stands for MIME encapsulation of aggregate HTML documents. It is a specialized format for creating web page archives. The Aspose.HTML library supports this format with some limitations: only rendering operations from MHTML to the supported output formats are available. For more details, please read the Converting Between Formats article.
Working with EPUB Documents
EPUB, an electronic publication format, has the same limitation as MHTML: only rendering operations from EPUB to the supported output formats are available. For more details, please read the Converting Between Formats article.
Asynchronous Loading
Loading a document can be a resource-intensive operation, since it requires loading not only the document itself but also all linked resources and processing all scripts. For this reason, you can use asynchronous operations to load an HTMLDocument without blocking the main thread.
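One way to sketch asynchronous loading is to subscribe to the ReadyStateChange event and then start navigation. The example assumes the `OnReadyStateChange` event, the `ReadyState` property, and the `Navigate` method of HTMLDocument; the URL, the wait handle, and the 30-second timeout are illustrative choices made so that a console program does not exit before loading finishes.

```csharp
using System;
using System.Threading;
using Aspose.Html;

class Program
{
    static void Main()
    {
        using (var loaded = new ManualResetEventSlim(false))
        using (var document = new HTMLDocument())
        {
            // Subscribe before navigating so that no state change is missed.
            document.OnReadyStateChange += (sender, @event) =>
            {
                // "complete" means the document and its resources are loaded.
                if (document.ReadyState == "complete")
                {
                    Console.WriteLine(document.DocumentElement.OuterHTML);
                    loaded.Set();
                }
            };

            // Placeholder address; navigation starts the loading operation.
            document.Navigate("https://example.com/");

            // The main thread is free to do other work here.

            // Wait (with an illustrative timeout) before disposing the document.
            loaded.Wait(TimeSpan.FromSeconds(30));
        }
    }
}
```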
ReadyStateChange is not the only event that can be used to handle an asynchronous loading operation; you can also subscribe to the Load event.
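A comparable sketch using the Load event is shown below. It assumes the `OnLoad` event of HTMLDocument; as before, the URL, the wait handle, and the timeout are illustrative.

```csharp
using System;
using System.Threading;
using Aspose.Html;

class Program
{
    static void Main()
    {
        using (var loaded = new ManualResetEventSlim(false))
        using (var document = new HTMLDocument())
        {
            // The Load event is raised once the document has finished loading.
            document.OnLoad += (sender, @event) =>
            {
                Console.WriteLine(document.DocumentElement.OuterHTML);
                loaded.Set();
            };

            // Placeholder address; navigation starts the loading operation.
            document.Navigate("https://example.com/");

            // Wait (with an illustrative timeout) before disposing the document.
            loaded.Wait(TimeSpan.FromSeconds(30));
        }
    }
}
```

Compared with ReadyStateChange, this handler needs no state check, because it runs only after loading has completed.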