Extract SVG From Website Using Aspose.HTML for Java
SVG is a vector graphics format designed primarily for the web, often used in HTML documents. The main advantage of SVG is its exceptional ability to scale to any size without losing quality. In addition, SVG offers several benefits, including programmability, small file size, styling options, interactivity, and more, all of which can improve the visual appeal and functionality of a web page.
Downloading SVG is not as easy as it may seem. If you have ever used the right mouse button to save or open an image from a web page, you have probably noticed that SVG files are challenging to extract from a website. Sometimes, right-clicking does not allow you to open it in a new tab or save it. So what can you do? You can manually inspect the HTML code to identify SVG tags and determine where the SVG content begins and ends. Fortunately, there is a more straightforward solution: you can use Aspose.HTML for Java to download SVG files from a website programmatically.
SVG graphics on web pages can be embedded in two ways: as inline SVG within the HTML or as external SVG referenced via URLs. In this article, we explore how to extract both inline and external SVGs using the Aspose.HTML for Java API. With this approach, you can automatically collect every SVG from a website without manually hunting through the code. Let’s dive in and make SVG extraction effortless!
Extract SVG from Website – Inline SVG
Inline SVG images are <svg> elements whose content describes the image. Inline SVG refers to embedding SVG code directly into HTML code rather than linking to an external SVG file. This is a popular technique for creating website icons, logos, and other graphical elements.
To save inline SVG images, we find all <svg> elements in an HTML document and use the OuterHTML property of the Element class to get their content. To download SVG from a website, follow these steps:
- Use the HTMLDocument(Url) constructor to create an instance of HTMLDocument, passing the URL of the web page containing inline SVG images.
- Call the getElementsByTagName("svg") method to collect all <svg> elements present in the HTML document.
- Create a loop to iterate through each SVG image in the images collection.
- For each image in the collection, use the getOuterHTML() method to get the SVG element content, and then use the FileHelper.writeAllText() method to write the SVG content into a local .svg file.
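// How to extract inline SVG images from a webpage using Java
// Note: FileHelper and $o(...) are not JDK APIs; they are assumed here to be
// helper utilities from the article's example project.
import com.aspose.html.HTMLDocument;
import com.aspose.html.collections.HTMLCollection;

// Open a document you want to download inline SVG images from
final HTMLDocument document = new HTMLDocument("https://products.aspose.com/html/net/");

// Collect all inline SVG images
HTMLCollection images = document.getElementsByTagName("svg");

for (int i = 0; i < images.getLength(); i++) {
    // Save every image to its own file: 0.svg, 1.svg, ...
    // (Java has no string interpolation, so the index is concatenated explicitly)
    FileHelper.writeAllText($o(i + ".svg"), images.get_Item(i).getOuterHTML());
}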
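If the FileHelper and $o helpers are not available in your project, the saving step can be sketched with the JDK alone. The class and method names below (InlineSvgWriter, ensureNamespace, writeAll) are hypothetical and not part of Aspose.HTML. The sketch also assumes that the markup returned for an inline <svg> element may lack the xmlns attribute, which HTML does not require but a standalone .svg file does, so it adds one when it is missing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Aspose.HTML): saves SVG markup strings as numbered files.
final class InlineSvgWriter {

    static final String SVG_NS = "http://www.w3.org/2000/svg";

    // Matches a default namespace declaration such as xmlns="..." but not xmlns:xlink="..."
    private static final Pattern DEFAULT_XMLNS = Pattern.compile("\\sxmlns\\s*=");

    // Adds the SVG namespace to the first <svg> tag when it has no default xmlns attribute.
    // Simplification: assumes the opening tag contains no '>' inside attribute values.
    static String ensureNamespace(String markup) {
        int start = markup.indexOf("<svg");
        if (start < 0) {
            return markup;
        }
        int end = markup.indexOf('>', start);
        if (end < 0) {
            return markup;
        }
        String openTag = markup.substring(start, end);
        if (DEFAULT_XMLNS.matcher(openTag).find()) {
            return markup;
        }
        int insertAt = start + "<svg".length();
        return markup.substring(0, insertAt)
                + " xmlns=\"" + SVG_NS + "\""
                + markup.substring(insertAt);
    }

    // Writes each markup string to outputDir as 0.svg, 1.svg, ... and returns the written paths.
    static List<Path> writeAll(List<String> svgMarkup, Path outputDir) {
        List<Path> written = new ArrayList<>();
        try {
            Files.createDirectories(outputDir);
            for (int i = 0; i < svgMarkup.size(); i++) {
                Path target = outputDir.resolve(i + ".svg");
                byte[] bytes = ensureNamespace(svgMarkup.get(i)).getBytes(StandardCharsets.UTF_8);
                Files.write(target, bytes);
                written.add(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }
}
```

To use it with the example above, collect the result of getOuterHTML() for each element into a List<String> and pass that list, together with an output directory, to InlineSvgWriter.writeAll.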
Note: Always respect copyright laws when working with SVG files. Some SVG files, such as company logos or branded graphics, may be protected, and using them without permission may be considered plagiarism. Before extracting or using any SVG files in your projects, check the website’s terms of use or contact the site owner to get proper permission.
Extract SVG from Website – External SVG
External SVG is an SVG file stored outside an HTML document and loaded into the document using, for example, an <img> tag. Separating SVG files from HTML makes it possible to reuse the same SVG image in multiple places without duplicating the code, making web pages more efficient and easier to maintain.
External SVG images are represented by the <img> element, which can also refer to other types of images, so the SVG images must be filtered out of the full set. Let’s look at how to download SVG from a website using the Aspose.HTML for Java library:
- Create an instance of the HTMLDocument class using the HTMLDocument(Url) constructor and pass the URL of the website from which you want to extract external SVG images.
- Collect all <img> elements in the HTML document using the getElementsByTagName("img") method.
- Extract the src attribute from each image element using the getAttribute("src") method and store the values in a Set.
- Keep only the SVG image URLs by checking whether each URL ends with .svg, and add those to a new list.
- Create absolute SVG image URLs using the Url class and the BaseURI property of the HTMLDocument class.
- Iterate through the absolute URLs and create a request using the RequestMessage class for each SVG URL.
- Send each request using document.getContext().getNetwork().send(request) and check the response for success.
- Finally, if the response is successful, use the FileHelper.writeAllBytes() method to save the SVG content to the local file system.
// Download external SVG images from HTML using Java
// Note: FileHelper and $o(...) are not JDK APIs; they are assumed here to be
// helper utilities from the article's example project.
import com.aspose.html.HTMLDocument;
import com.aspose.html.Url;
import com.aspose.html.collections.HTMLCollection;
import com.aspose.html.dom.Element;
import com.aspose.html.net.RequestMessage;
import com.aspose.html.net.ResponseMessage;

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Open a document you want to download external SVGs from
final HTMLDocument document = new HTMLDocument("https://products.aspose.com/html/net/");

// Collect all image elements
HTMLCollection images = document.getElementsByTagName("img");

// Create a distinct collection of relative image URLs
Set<String> urls = new HashSet<>();
for (Element element : images) {
    String src = element.getAttribute("src");
    // Skip <img> elements without a src attribute to avoid a NullPointerException below
    if (src != null) {
        urls.add(src);
    }
}

// Filter out non-SVG images
List<String> svgUrls = new ArrayList<>();
for (String url : urls) {
    if (url.endsWith(".svg")) {
        svgUrls.add(url);
    }
}

// Create absolute SVG image URLs
List<Url> absUrls = svgUrls.stream()
        .map(src -> new Url(src, document.getBaseURI()))
        .collect(Collectors.toList());

// Download every SVG image
for (Url url : absUrls) {
    // Create a downloading request
    final RequestMessage request = new RequestMessage(url);

    // Download SVG image
    final ResponseMessage response = document.getContext().getNetwork().send(request);

    // Check whether response is successful
    if (response.isSuccess()) {
        // Use the last path segment as the local file name
        String[] split = url.getPathname().split("/");
        String path = split[split.length - 1];

        // Save file to a local file system
        FileHelper.writeAllBytes($o(path), response.getContent().readAsByteArray());
    }
}
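The plain endsWith(".svg") check in the example skips references such as logo.svg?v=2 or Arrow.SVG. As a minimal sketch of a more tolerant filter, the class below uses only the JDK's java.net.URI to resolve each src value against a base URI and to test the path part of the result. The class and method names (SvgUrlFilter, isSvg, resolveSvgUrls, fileName) are hypothetical and not part of Aspose.HTML; the example.com addresses mentioned afterwards are placeholders.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of Aspose.HTML): picks SVG references out of src values.
final class SvgUrlFilter {

    // True when the path part ends with ".svg", ignoring case, query string, and fragment.
    // Opaque URIs such as data: URIs have no path and are therefore rejected.
    static boolean isSvg(URI uri) {
        String path = uri.getPath();
        return path != null && path.toLowerCase(Locale.ROOT).endsWith(".svg");
    }

    // Resolves every src value against baseUri and returns the distinct SVG URLs in input order.
    static List<URI> resolveSvgUrls(String baseUri, Collection<String> srcValues) {
        URI base = URI.create(baseUri);
        Set<URI> unique = new LinkedHashSet<>();
        for (String src : srcValues) {
            if (src == null || src.trim().isEmpty()) {
                continue; // <img> without a usable src attribute
            }
            URI absolute;
            try {
                absolute = base.resolve(src.trim());
            } catch (IllegalArgumentException e) {
                continue; // skip values that are not valid URI references
            }
            if (isSvg(absolute)) {
                unique.add(absolute);
            }
        }
        return new ArrayList<>(unique);
    }

    // Returns the last path segment, suitable as a local file name.
    static String fileName(URI uri) {
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

For instance, with the base URI https://example.com/html/net/, the src value /img/logo.svg?v=2 resolves to https://example.com/img/logo.svg?v=2 and is kept, while photo.png and data: URIs are dropped. The resulting strings could then be passed to the Url and RequestMessage classes as in the example above.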
This approach automates the extraction of external SVG images from a web page, saving you the time and effort of manually downloading each file. This is great for designers and developers who want to pull SVGs from sites without needing to dive into the source code.