Extract Images From Website – C# Examples
Whether you are a web developer, graphic designer, researcher, journalist, or student, you will often need images for your own project. Downloading images from a website manually takes a long time: you have to copy each image URL and then save the image. Instead, you can use the Aspose.HTML for .NET library to extract images from a website programmatically.
This article shows how to extract different types of images from a website using the Aspose.HTML for .NET API. The C# examples below download the images for you, which is faster than finding and saving them by hand. Let's try!
Extract Images from Website – C# code
Most pictures in an HTML document are represented using the <img> element. Here is an example of how to use Aspose.HTML for .NET to find images specified by this element. To download images from a website, take the following steps:
- Use the HTMLDocument(Url) constructor to create an instance of the HTMLDocument class and pass it the URL of the website from which you want to extract images.
- Use the GetElementsByTagName("img") method to collect all <img> elements. The method returns a list of the HTML document's <img> elements.
- Use the Select() method with GetAttribute("src") to extract the src attribute of each <img> element, and Distinct() to build a collection of unique relative image URLs.
- Create absolute image URLs using the Url class and the BaseURI property of the HTMLDocument class.
- For each absolute URL, create a request using the RequestMessage(url) constructor.
- Use the document's Context.Network.Send(request) method to send the request, then check that the response was successful.
- Finally, if the response was successful, use the File.WriteAllBytes() method to save each image to a local file.
using Aspose.Html;
using Aspose.Html.Net;
using System.IO;
using System.Linq;
...
    // Open a document you want to extract images from
    using var document = new HTMLDocument("https://docs.aspose.com/svg/net/drawing-basics/svg-shapes/");

    // Collect all <img> elements
    var images = document.GetElementsByTagName("img");

    // Create a distinct collection of relative image URLs,
    // skipping <img> elements that have no src attribute
    var urls = images.Select(element => element.GetAttribute("src"))
                     .Where(src => !string.IsNullOrEmpty(src))
                     .Distinct();

    // Create absolute image URLs
    var absUrls = urls.Select(src => new Url(src, document.BaseURI));

    // OutputDir is assumed to be a string with the path to an existing output folder
    foreach (var url in absUrls)
    {
        // Create an image request message
        using var request = new RequestMessage(url);

        // Send the request and receive the image
        using var response = document.Context.Network.Send(request);

        // Check whether the response is successful
        if (response.IsSuccess)
        {
            // Save the image to the local file system
            File.WriteAllBytes(Path.Combine(OutputDir, url.Pathname.Split('/').Last()), response.Content.ReadAsByteArray());
        }
    }
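The example above names each saved file after the last segment of the URL path (url.Pathname.Split('/').Last()). Two images from different folders can share that segment, and a URL ending in "/" has no last segment at all. The following is a minimal sketch of one way to handle both cases. It uses the standard System.Uri type instead of the Aspose.HTML Url class, and the ImageFileNames class, the GetLocalFileName method, and the "image" fallback name are hypothetical names chosen for illustration, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ImageFileNames
{
    // Hypothetical helper (not part of Aspose.HTML): derives a local file name
    // from an absolute image URL and avoids reusing a name already handed out.
    public static string GetLocalFileName(Uri url, ISet<string> usedNames)
    {
        // Last path segment, e.g. "logo.png" for https://example.com/img/logo.png
        string name = Path.GetFileName(url.AbsolutePath);

        // A path that ends with "/" has no last segment, so use a fallback name
        if (string.IsNullOrEmpty(name))
        {
            name = "image";
        }

        string candidate = name;
        int counter = 1;

        // ISet.Add returns false when the name is already taken;
        // append a numeric suffix until the name is unique within this run
        while (!usedNames.Add(candidate))
        {
            candidate = Path.GetFileNameWithoutExtension(name) + "_" + counter + Path.GetExtension(name);
            counter++;
        }

        return candidate;
    }
}
```

With a shared HashSet<string> (created with StringComparer.OrdinalIgnoreCase if the target file system is case-insensitive), the URLs https://example.com/a/logo.png and https://example.com/b/logo.png would be saved as logo.png and logo_1.png rather than the second overwriting the first.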
Note: It is important to adhere to copyright laws and obtain proper permission or licensing before using saved images for commercial purposes. We do not support data extraction and use of other people’s files for commercial purposes without their permission.
Extract Icons – C# code
Icons are a kind of image in HTML documents that are specified using <link> elements with the rel attribute set to icon. Let's look at how to extract icons from a website using the Aspose.HTML for .NET library:
- Use the HTMLDocument(Url) constructor to create an instance of the HTMLDocument class and pass it the URL of the website from which you want to extract icons.
- Use the GetElementsByTagName("link") method to collect all <link> elements.
- To filter out non-icon links, use the Where() method, which filters the collection based on the link => link.GetAttribute("rel") == "icon" expression. Thus, the icons collection will contain only <link> elements whose rel attribute has the value icon.
- Use the Select() method with GetAttribute("href") to extract the href attribute of each <link> element, and Distinct() to build a collection of unique relative icon URLs.
- Create absolute icon URLs using the Url class and the BaseURI property of the HTMLDocument class.
- Then, for each absolute URL, create a request using the RequestMessage class.
- Use the document's Context.Network.Send(request) method to send the request, then check that the response was successful.
- If the response was successful, use the File.WriteAllBytes() method to save each icon to a local file. As a result, you will have a collection of icons from the website in your local folder.
using Aspose.Html;
using Aspose.Html.Net;
using System.IO;
using System.Linq;
...
    // Open a document you want to extract icons from
    using var document = new HTMLDocument("https://docs.aspose.com/html/net/message-handlers/");

    // Collect all <link> elements
    var links = document.GetElementsByTagName("link");

    // Leave only "icon" elements
    var icons = links.Where(link => link.GetAttribute("rel") == "icon");

    // Create a distinct collection of relative icon URLs,
    // skipping <link> elements that have no href attribute
    var urls = icons.Select(icon => icon.GetAttribute("href"))
                    .Where(href => !string.IsNullOrEmpty(href))
                    .Distinct();

    // Create absolute icon URLs
    var absUrls = urls.Select(src => new Url(src, document.BaseURI));

    // OutputDir is assumed to be a string with the path to an existing output folder
    foreach (var url in absUrls)
    {
        // Create an icon request message
        using var request = new RequestMessage(url);

        // Send the request and receive the icon
        using var response = document.Context.Network.Send(request);

        // Check whether the response is successful
        if (response.IsSuccess)
        {
            // Save the icon to the local file system
            File.WriteAllBytes(Path.Combine(OutputDir, url.Pathname.Split('/').Last()), response.Content.ReadAsByteArray());
        }
    }
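The filter in this example compares the whole rel value with icon, so a link such as rel="shortcut icon" or rel="ICON" is skipped. In HTML, rel holds a space-separated list of case-insensitive tokens, so a token-based check may catch more icons. The sketch below shows one way to do that using only standard .NET types. The IconLinks class and the IsIconRel method are hypothetical names for illustration and are not part of Aspose.HTML.

```csharp
using System;
using System.Linq;

public static class IconLinks
{
    // Hypothetical helper (not part of Aspose.HTML): returns true when the
    // rel attribute value contains "icon" as one of its whitespace-separated tokens.
    public static bool IsIconRel(string rel)
    {
        // A missing or blank rel attribute cannot describe an icon
        if (string.IsNullOrWhiteSpace(rel))
        {
            return false;
        }

        // Passing a null separator array splits on any whitespace character
        return rel
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Any(token => token.Equals("icon", StringComparison.OrdinalIgnoreCase));
    }
}
```

Assuming this helper is available, the filter could be written as links.Where(link => IconLinks.IsIconRel(link.GetAttribute("rel"))). Tokens such as apple-touch-icon are deliberately not matched, because they are separate keywords rather than the icon token itself.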
You can use these C# examples to automate extracting all images from a website, which can be helpful for tasks such as archiving, researching, analyzing web content, or any other application for personal use. They are also useful for web designers and developers who want to pull images from sites without digging through the source code.
You can download the complete C# examples and data files from GitHub.
Aspose.HTML offers HTML Web Applications, an online collection of free converters, mergers, SEO tools, HTML code generators, URL tools, and more. The applications work on any operating system with a web browser and do not require any additional software installation. Easily convert, merge, encode, generate HTML code, extract data from the web, or analyze web pages in terms of SEO wherever you are. Use our collection of HTML Web Applications to handle everyday tasks and keep your workflow seamless!