Extract SVG From Website Using Java

SVG is a vector graphics format designed primarily for the web, often used in HTML documents. The main advantage of SVG is its exceptional ability to scale to any size without losing quality. In addition, SVG offers several benefits, including programmability, small file size, styling options, interactivity, and more, all of which can improve the visual appeal and functionality of a web page.

Downloading SVG is not as easy as it may seem. If you have ever used the right mouse button to save or open an image from a web page, you have probably noticed that SVG files are challenging to extract from a website. Sometimes, right-clicking does not allow you to open it in a new tab or save it. So what can you do? You can manually inspect the HTML code to identify SVG tags and determine where the SVG content begins and ends. Fortunately, there is a more straightforward solution: you can use Aspose.HTML for Java to download SVG files from a website programmatically.

SVG graphics on web pages can be embedded in two ways: as inline SVG within the HTML or as external SVG referenced via URLs. In this article, we explore how to extract both inline and external SVGs using the Aspose.HTML for Java API. With this approach, you can automatically collect every SVG from a website without manually hunting through the code. Let’s dive in and make SVG extraction effortless!

Extract SVG from Website – Inline SVG

Inline SVG images are <svg> elements embedded directly in the HTML markup, so the image description is part of the page itself rather than a separate file that the page links to. This is a popular technique for creating website icons, logos, and other graphical elements.
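
To make the idea of "where the SVG content begins and ends" concrete, here is a minimal sketch that scans an HTML string for inline <svg> blocks using only the Java standard library. The class name InlineSvgScan, the method findInlineSvg, and the sample markup are made up for this illustration and are not part of Aspose.HTML for Java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an Aspose.HTML class.
class InlineSvgScan {
    // Matches from an opening <svg ...> tag to the nearest closing </svg> tag.
    // DOTALL lets "." span line breaks; CASE_INSENSITIVE also accepts <SVG>.
    private static final Pattern SVG_BLOCK = Pattern.compile(
            "<svg\\b[^>]*>.*?</svg>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns every inline SVG block found in the given HTML text.
    static List<String> findInlineSvg(String html) {
        List<String> result = new ArrayList<>();
        Matcher matcher = SVG_BLOCK.matcher(html);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample markup: one inline SVG and one ordinary raster image.
        String html = "<p>Logo:</p>"
                + "<svg width=\"10\" height=\"10\"><circle cx=\"5\" cy=\"5\" r=\"4\"/></svg>"
                + "<img src=\"photo.png\">";
        for (String svg : findInlineSvg(html)) {
            System.out.println(svg);
        }
    }
}
```

A pattern like this is fragile: nested <svg> elements or a ">" character inside an attribute value will confuse it. Working with a parsed document, as in the steps that follow, is the more reliable route.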

To save inline SVG images, we find all <svg> elements in an HTML document and read their markup with the getOuterHTML() method of the Element class. To download inline SVG from a website, follow these steps:

  1. Use the HTMLDocument(Url) constructor to create an instance of HTMLDocument, passing the URL of the web page containing inline SVG images.
  2. Call the getElementsByTagName("svg") method to collect all <svg> elements present in the HTML document.
  3. Create a loop to iterate through each SVG image in the images collection.
  4. For each image in the collection, use the getOuterHTML() method to get the SVG element content, and then use the FileHelper.writeAllText() method to write the SVG content into a local .svg file.

// Open a document you want to download inline SVG images from
final HTMLDocument document = new HTMLDocument("https://products.aspose.com/html/net/");

// Collect all inline SVG images
HTMLCollection images = document.getElementsByTagName("svg");

for (int i = 0; i < images.getLength(); i++) {
    // Save each image to the local file system as 0.svg, 1.svg, and so on
    FileHelper.writeAllText(i + ".svg", images.get_Item(i).getOuterHTML());
}
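
One detail worth checking after saving inline SVG: a standalone .svg file needs the SVG namespace declaration (xmlns="http://www.w3.org/2000/svg") on its root element to render in browsers, whereas inline SVG inside an HTML page is often written without it. The following sketch, using only the Java standard library, adds the attribute when it is missing. SvgNamespace and ensureNamespace are hypothetical names for this illustration, and whether the markup returned by getOuterHTML() already contains the declaration is not confirmed here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an Aspose.HTML class.
class SvgNamespace {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    // Opening <svg ...> tag; group 1 captures its attribute text.
    private static final Pattern OPEN_TAG =
            Pattern.compile("<svg\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // A default namespace declaration: "xmlns=", but not "xmlns:xlink=".
    private static final Pattern DEFAULT_NS = Pattern.compile("(^|\\s)xmlns\\s*=");

    // Returns the markup with xmlns added to the first <svg> tag if absent.
    static String ensureNamespace(String svgMarkup) {
        Matcher tag = OPEN_TAG.matcher(svgMarkup);
        if (!tag.find()) {
            return svgMarkup; // no <svg> tag found, leave the text untouched
        }
        if (DEFAULT_NS.matcher(tag.group(1)).find()) {
            return svgMarkup; // namespace already declared
        }
        // Insert the attribute right after the tag name.
        int insertAt = tag.start() + "<svg".length();
        return svgMarkup.substring(0, insertAt)
                + " xmlns=\"" + SVG_NS + "\""
                + svgMarkup.substring(insertAt);
    }

    public static void main(String[] args) {
        System.out.println(ensureNamespace("<svg width=\"10\"><rect/></svg>"));
    }
}
```

In the example above, such a helper could wrap the markup before it is written to disk, for instance by passing the result of getOuterHTML() through ensureNamespace().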

Note: Always respect copyright laws when working with SVG files. Some SVG files, such as company logos or branded graphics, may be protected, and using them without permission may infringe copyright or trademark rights. Before extracting or using any SVG files in your projects, check the website’s terms of use or contact the site owner to get proper permission.

Extract SVG from Website – External SVG

External SVG is an SVG file stored outside an HTML document and loaded into the document using, for example, an <img> tag. Separating SVG files from HTML makes it possible to reuse the same SVG image in multiple places without duplicating the code, making web pages more efficient and easier to maintain.

External SVG images are represented by the <img> element, which can also refer to other types of images, so the collected images must be filtered to keep only SVG files. Let’s look at how to download external SVG files from a website using the Aspose.HTML for Java library:

  1. Create an instance of the HTMLDocument class using the HTMLDocument(Url) constructor and pass the URL of the website from which you want to extract external SVG images.
  2. Collect all <img> elements in the HTML document using the getElementsByTagName("img") method.
  3. Extract the src attribute from each image element using the getAttribute("src") method and store the values in a Set.
  4. Filter only .svg image URLs by checking if each URL ends with .svg, and add those to a new list.
  5. Create absolute SVG image URLs using the Url class and the getBaseURI() method of the HTMLDocument class.
  6. Iterate through the absolute URLs and create a request using the RequestMessage class for each SVG URL.
  7. Send each request using document.getContext().getNetwork().send(request) and check the response for success.
  8. Finally, if the response is successful, use the FileHelper.writeAllBytes() method to save the SVG content to the local file system.

// Standard library imports used by this snippet (Aspose.HTML imports are omitted)
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Open a document you want to download external SVGs from
final HTMLDocument document = new HTMLDocument("https://products.aspose.com/html/net/");

// Collect all image elements
HTMLCollection images = document.getElementsByTagName("img");

// Create a distinct collection of relative image URLs
Set<String> urls = new HashSet<>();
for (Element element : images) {
    urls.add(element.getAttribute("src"));
}

// Filter out non-SVG images (an <img> without a src attribute yields null)
List<String> svgUrls = new ArrayList<>();
for (String url : urls) {
    if (url != null && url.endsWith(".svg")) {
        svgUrls.add(url);
    }
}

// Create absolute SVG image URLs
List<Url> absUrls = svgUrls.stream()
    .map(src -> new Url(src, document.getBaseURI()))
    .collect(Collectors.toList());

// Download each SVG image
for (Url url : absUrls) {
    // Create a downloading request
    final RequestMessage request = new RequestMessage(url);

    // Send the request and receive the response
    final ResponseMessage response = document.getContext().getNetwork().send(request);

    // Check whether response is successful
    if (response.isSuccess()) {
        // Use the last segment of the URL path as the file name
        String[] split = url.getPathname().split("/");
        String path = split[split.length - 1];

        // Save file to a local file system
        FileHelper.writeAllBytes(path, response.getContent().readAsByteArray());
    }
}
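
A plain endsWith(".svg") check on the raw src value skips URLs that carry a query string (for example logo.svg?v=2) or an uppercase extension. As a possible refinement of the filtering and URL-resolution steps, here is a minimal sketch built on java.net.URI from the standard library. The names SvgUrlFilter, resolveSvgUrls, and fileName, as well as the example.com addresses, are assumptions made for this illustration and are not part of Aspose.HTML for Java.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not an Aspose.HTML class.
class SvgUrlFilter {
    // Resolves src values against the base URI and keeps distinct URLs
    // whose path (ignoring query string and letter case) ends with ".svg".
    static List<URI> resolveSvgUrls(String baseUri, List<String> srcValues) {
        URI base = URI.create(baseUri);
        Set<URI> result = new LinkedHashSet<>(); // keeps order, drops duplicates
        for (String src : srcValues) {
            if (src == null || src.trim().isEmpty()) {
                continue; // <img> without a usable src attribute
            }
            URI absolute;
            try {
                absolute = base.resolve(src.trim());
            } catch (IllegalArgumentException e) {
                continue; // skip values that are not valid URIs
            }
            String path = absolute.getPath(); // null for opaque URIs such as data:
            if (path != null && path.toLowerCase(Locale.ROOT).endsWith(".svg")) {
                result.add(absolute);
            }
        }
        return new ArrayList<>(result);
    }

    // Returns the last path segment, suitable as a local file name.
    static String fileName(URI uri) {
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        List<String> srcValues = Arrays.asList(
                "/img/logo.svg", "icons/a.svg?v=2", "photo.png", "/img/logo.svg");
        for (URI uri : resolveSvgUrls("https://example.com/html/net/", srcValues)) {
            System.out.println(uri + " -> " + fileName(uri));
        }
    }
}
```

Note that two different URLs can share the same last path segment, in which case the later download would overwrite the earlier file; adding a counter or a hash of the URL to the file name is one way to avoid that.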

This approach automates the extraction of external SVG images from a web page, saving you the time and effort of manually downloading each file. This is great for designers and developers who want to pull SVGs from sites without needing to dive into the source code.

Aspose.HTML provides a set of free online HTML Web Applications, including converters, mergers, SEO tools, HTML code generators, URL utilities, and more. These browser-based tools work on all operating systems and don’t require any additional software installation. Whether you need to convert or merge files, extract web data, generate HTML code, or analyze pages for SEO, you can do it all right on the web. Streamline your daily tasks and increase your productivity with our easy-to-use HTML Web Apps – anytime, anywhere.
