Extract SVG From Website Using Java
SVG is a vector graphics format designed primarily for the web, often used in HTML documents. The main advantage of SVG is its exceptional ability to scale to any size without losing quality. In addition, SVG offers several benefits, including programmability, small file size, styling options, interactivity, and more, all of which can improve the visual appeal and functionality of a web page.
Downloading SVG is not as easy as it may seem. If you have ever used the right mouse button to save or open an image from a web page, you have probably noticed that SVG files are challenging to extract from a website. Sometimes, right-clicking does not allow you to open it in a new tab or save it. So what can you do? You can manually inspect the HTML code to identify SVG tags and determine where the SVG content begins and ends. Fortunately, there is a more straightforward solution: you can use Aspose.HTML for Java to download SVG files from a website programmatically.
SVG graphics on web pages can be embedded in two ways: as inline SVG within the HTML or as external SVG referenced via URLs. In this article, we explore how to extract both inline and external SVGs using the Aspose.HTML for Java API. With this approach, you can automatically collect every SVG from a website without manually hunting through the code. Let’s dive in and make SVG extraction effortless!
Extract SVG from Website – Inline SVG
Inline SVG images are <svg> elements whose content describes the image. Inline SVG means embedding SVG code directly in the HTML rather than linking to an external SVG file. This is a popular technique for creating website icons, logos, and other graphical elements.
To save inline SVG images, we find all <svg> elements in an HTML document and use the OuterHTML property of the Element class (the getOuterHTML() method in Java) to get their content. To download SVG from a website, follow these steps:
- Use the HTMLDocument(Url) constructor to create an instance of HTMLDocument, passing the URL of the web page containing inline SVG images.
- Call the getElementsByTagName("svg") method to collect all <svg> elements present in the HTML document.
- Create a loop to iterate through each SVG image in the images collection.
- For each image in the collection, use the getOuterHTML() method to get the SVG element content, and then use the FileHelper.writeAllText() method to write the SVG content into a local .svg file.
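// Package names follow the Aspose.HTML for Java API reference; verify them against your version.
// FileHelper is the helper class used by this example and must be available in your project.
import com.aspose.html.HTMLDocument;
import com.aspose.html.collections.HTMLCollection;

// Open a document you want to download inline SVG images from
final HTMLDocument document = new HTMLDocument("https://products.aspose.com/html/net/");

// Collect all inline SVG images
HTMLCollection images = document.getElementsByTagName("svg");

for (int i = 0; i < images.getLength(); i++) {
    // Save every image to the local file system as 0.svg, 1.svg, ...
    FileHelper.writeAllText(i + ".svg", images.get_Item(i).getOuterHTML());
}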
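One practical detail when saving inline markup as a file: an <svg> element embedded in HTML may not carry the xmlns="http://www.w3.org/2000/svg" declaration, and a standalone .svg file needs it to be rendered as SVG. The following is a minimal sketch using only the Java standard library, not part of the Aspose.HTML API. The class name StandaloneSvg and the method ensureNamespace are hypothetical, and the string handling assumes the opening tag contains no ">" inside an attribute value.

```java
final class StandaloneSvg {

    // The SVG namespace URI defined by the SVG specification
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    /**
     * Returns the markup with an xmlns attribute added to the first <svg> tag
     * when that tag does not already declare one. Hypothetical helper for illustration.
     */
    static String ensureNamespace(String outerHtml) {
        int start = outerHtml.indexOf("<svg");
        if (start < 0) {
            return outerHtml; // no <svg> tag found, leave the text untouched
        }
        int tagEnd = outerHtml.indexOf('>', start);
        if (tagEnd < 0) {
            return outerHtml; // malformed opening tag, leave the text untouched
        }
        // Only the opening tag is inspected; "xmlns:xlink=" does not match "xmlns="
        String openingTag = outerHtml.substring(start, tagEnd);
        if (openingTag.contains("xmlns=")) {
            return outerHtml;
        }
        int insertAt = start + "<svg".length();
        return outerHtml.substring(0, insertAt)
                + " xmlns=\"" + SVG_NS + "\""
                + outerHtml.substring(insertAt);
    }
}
```

With this helper, the string returned by getOuterHTML() could be passed through ensureNamespace() before it is written to disk.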
Note: Always respect copyright laws when working with SVG files. Some SVG files, such as company logos or branded graphics, may be protected, and using them without permission may be considered plagiarism. Before extracting or using any SVG files in your projects, check the website’s terms of use or contact the site owner to get proper permission.
Extract SVG from Website – External SVG
External SVG is an SVG file stored outside an HTML document and loaded into the document using, for example, an <img> tag. Separating SVG files from HTML makes it possible to reuse the same SVG image in multiple places without duplicating the code, making web pages more efficient and easier to maintain.
External SVG images are represented by the <img> element. Because <img> can also refer to other image types, the collected images must be filtered to keep only SVG files. Let’s look at how to download SVG from a website using the Aspose.HTML for Java library:
- Create an instance of the HTMLDocument class using the HTMLDocument(Url) constructor and pass the URL of the website from which you want to extract external SVG images.
- Collect all <img> elements in the HTML document using the getElementsByTagName("img") method.
- Extract the src attribute from each image element using the getAttribute("src") method and store the values in a Set.
- Filter only .svg image URLs by checking if each URL ends with .svg, and add those to a new list.
- Create absolute SVG image URLs using the Url class and the BaseURI property of the HTMLDocument class.
- Iterate through the absolute URLs and create a request using the RequestMessage class for each SVG URL.
- Send each request using document.getContext().getNetwork().send(request) and check the response for success.
- Finally, if the response is successful, use the FileHelper.writeAllBytes() method to save the SVG content to the local file system.
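The collect, filter, and resolve steps above can be tried in isolation. The following is a minimal sketch using only java.net.URI from the Java standard library rather than the Aspose.HTML Url class. The class name SvgUrlFilter and the method absoluteSvgUrls are hypothetical. One deliberate difference from the steps above: the ".svg" check is applied to the path of the resolved URL, so a src such as icons/arrow.svg?v=2, which a plain endsWith(".svg") test on the raw string would miss, is still recognized.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class SvgUrlFilter {

    /**
     * Resolves each src value against the base URI and returns the distinct
     * absolute URLs whose path ends with ".svg". Hypothetical helper for illustration.
     */
    static List<URI> absoluteSvgUrls(URI baseUri, List<String> srcValues) {
        // LinkedHashSet drops duplicates and keeps the order of first appearance
        Set<URI> result = new LinkedHashSet<>();
        for (String src : srcValues) {
            if (src == null || src.isBlank()) {
                continue; // <img> without a usable src attribute
            }
            URI resolved;
            try {
                resolved = baseUri.resolve(src.trim());
            } catch (IllegalArgumentException e) {
                continue; // src is not a syntactically valid URI
            }
            // getPath() is null for opaque URIs such as data: URLs
            String path = resolved.getPath();
            if (path != null && path.toLowerCase(Locale.ROOT).endsWith(".svg")) {
                result.add(resolved);
            }
        }
        return new ArrayList<>(result);
    }
}
```

For the base https://example.com/html/net/ and the src values /img/logo.svg, icons/arrow.svg?v=2, and photo.png, the method returns https://example.com/img/logo.svg and https://example.com/html/net/icons/arrow.svg?v=2.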
// Package names follow the Aspose.HTML for Java API reference; verify them against your version.
// FileHelper is the helper class used by this example and must be available in your project.
import com.aspose.html.HTMLDocument;
import com.aspose.html.Url;
import com.aspose.html.collections.HTMLCollection;
import com.aspose.html.dom.Element;
import com.aspose.html.net.RequestMessage;
import com.aspose.html.net.ResponseMessage;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Open a document you want to download external SVGs from
final HTMLDocument document = new HTMLDocument("https://products.aspose.com/html/net/");

// Collect all image elements
HTMLCollection images = document.getElementsByTagName("img");

// Create a distinct collection of image URLs taken from the src attributes
Set<String> urls = new HashSet<>();
for (Element element : images) {
    String src = element.getAttribute("src");
    // Skip <img> elements that have no src attribute
    if (src != null) {
        urls.add(src);
    }
}

// Filter out non-SVG images
List<String> svgUrls = new ArrayList<>();
for (String url : urls) {
    if (url.endsWith(".svg")) {
        svgUrls.add(url);
    }
}

// Create absolute SVG image URLs
List<Url> absUrls = svgUrls.stream()
        .map(src -> new Url(src, document.getBaseURI()))
        .collect(Collectors.toList());

// Download every SVG image
for (Url url : absUrls) {
    // Create a downloading request
    final RequestMessage request = new RequestMessage(url);

    // Download SVG image
    final ResponseMessage response = document.getContext().getNetwork().send(request);

    // Check whether response is successful
    if (response.isSuccess()) {
        // Use the last path segment of the URL as the local file name
        String[] split = url.getPathname().split("/");
        String path = split[split.length - 1];

        // Save file to a local file system
        FileHelper.writeAllBytes(path, response.getContent().readAsByteArray());
    }
}
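The example above names each saved file after the last segment of the URL path, so two images such as /a/logo.svg and /b/logo.svg would be written to the same local file. The following is a minimal sketch, using only the Java standard library, of one way to keep such names distinct. The class name UniqueFileNames, its methods, and the fallback name image.svg are hypothetical and not part of the Aspose.HTML API.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

final class UniqueFileNames {

    // File names that have already been handed out
    private final Set<String> used = new HashSet<>();

    /** Returns a file name for the URL that differs from every name returned before. */
    String next(URI url) {
        String name = lastSegment(url.getPath());
        String candidate = name;
        int counter = 1;
        // Set.add returns false when the name is already taken
        while (!used.add(candidate)) {
            candidate = withSuffix(name, counter++);
        }
        return candidate;
    }

    /** Returns the last non-empty path segment, or a fallback name if there is none. */
    private static String lastSegment(String path) {
        if (path != null) {
            String[] segments = path.split("/");
            for (int i = segments.length - 1; i >= 0; i--) {
                if (!segments[i].isEmpty()) {
                    return segments[i];
                }
            }
        }
        return "image.svg"; // hypothetical fallback for URLs without a file name
    }

    /** Turns "logo.svg" and 1 into "logo-1.svg". */
    private static String withSuffix(String name, int n) {
        int dot = name.lastIndexOf('.');
        if (dot <= 0) {
            return name + "-" + n; // no extension to preserve
        }
        return name.substring(0, dot) + "-" + n + name.substring(dot);
    }
}
```

Calling next() for /a/logo.svg, /b/logo.svg, and /c/logo.svg in that order yields logo.svg, logo-1.svg, and logo-2.svg. The returned name could then be used as the path argument when saving the response content.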
This approach automates the extraction of external SVG images from a web page, saving you the time and effort of manually downloading each file. This is great for designers and developers who want to pull SVGs from sites without needing to dive into the source code.