Convert PDF to Images formats

Aspose.PDF for Java allows you to convert PDF documents to image formats such as BMP, JPEG, GIF, PNG, EMF, TIFF, and SVG using two approaches. The conversion is done using Device and using SaveOption.

There are several classes in the library that allow you to use a virtual device to transform images. DocumentDevice is oriented for conversion whole document, but ImageDevice - for a particular page.

Convert PDF using DocumentDevice class

Aspose.PDF for Java makes a possible to convert PDF Pages to TIFF images.

The TiffDevice class allows you to convert PDF pages to TIFF images. This class provides a method named Process which allows you to convert all the pages in a PDF file to a single TIFF image.

Convert PDF Pages to One TIFF Image

Aspose.PDF for Java explain how to convert all pages in a PDF file to a single TIFF image:

  1. Create an object of the Document class.
  2. Call the Process method to convert the document.
  3. To set the output file’s properties, use the TiffSettings class.

The following code snippet shows how to convert all the PDF pages to a single TIFF image.

public static void ConvertPDFtoTiffAllPages() throws IOException {

        // Open document
        String pdfDocumentFileName = Paths.get(_dataDir.toString(), "PageToTIFF.pdf").toString();
        Document pdfDocument = new Document(pdfDocumentFileName);

        // Create Resolution object
        Resolution resolution = new Resolution(300);

        // Create TiffSettings object
        TiffSettings tiffSettings = new TiffSettings();
        tiffSettings.setCompression(CompressionType.None);
        tiffSettings.setDepth(ColorDepth.Default);
        tiffSettings.setShape(ShapeType.Landscape);
        tiffSettings.setSkipBlankPages(false);

        // Create TIFF device
        TiffDevice tiffDevice = new TiffDevice(resolution, tiffSettings);

        // Convert a particular page and save the image to stream
        tiffDevice.process(pdfDocument, _dataDir + "AllPagesToTIFF_out.tif");
    }

Convert Single Page to TIFF Image

Aspose.PDF for Java allows to convert a particular page in a PDF file to a TIFF image, use an overloaded version of the Process(..) method which takes a page number as an argument for conversion. The following code snippet shows how to convert the first page of a PDF to TIFF format.

public static void ConvertPDFtoTiffSinglePage() throws IOException {

        // Open document
        String pdfDocumentFileName = Paths.get(_dataDir.toString(), "PageToTIFF.pdf").toString();
        String tiffFileName = Paths.get(_dataDir.toString(), "PageToTIFF_out.tif").toString();
        Document pdfDocument = new Document(pdfDocumentFileName);

        // Create Resolution object
        Resolution resolution = new Resolution(300);

        // Create TiffSettings object
        TiffSettings tiffSettings = new TiffSettings();
        tiffSettings.setCompression(CompressionType.None);
        tiffSettings.setDepth(ColorDepth.Default);
        tiffSettings.setShape(ShapeType.Landscape);

        // Create TIFF device
        TiffDevice tiffDevice = new TiffDevice(resolution, tiffSettings);

        // Convert a particular page and save the image to stream
        tiffDevice.process(pdfDocument, 1, 1, tiffFileName);
    }

Use Bradley algorithm during conversion

Aspose.PDF for Java has been supporting the feature to convert PDF to TIFF using LZW compression and then with the use of AForge, Binarization can be applied. However one of the customers requested that for some images, they need to get the Threshold using Otsu, so they also would like to use Bradley.

    public static void ConvertPDFtoTiffBradleyBinarization() throws IOException {

        // Open document
        String pdfDocumentFileName = Paths.get(_dataDir.toString(), "PageToTIFF.pdf").toString();
        Document pdfDocument = new Document(pdfDocumentFileName);

        String outputImageFileName = Paths.get(_dataDir.toString(), "resultant_out.tif").toString();
        String outputBinImageFileName = Paths.get(_dataDir.toString(), "37116-bin_out.tif").toString();

        java.io.OutputStream outputImageFile = new java.io.FileOutputStream(outputImageFileName);
        java.io.OutputStream outputBinImageFile = new java.io.FileOutputStream(outputBinImageFileName);

        // Create Resolution object
        Resolution resolution = new Resolution(300);
        // Create TiffSettings object
        TiffSettings tiffSettings = new TiffSettings();
        tiffSettings.setCompression(CompressionType.LZW);
        tiffSettings.setDepth(ColorDepth.Format1bpp);

        // Create TIFF device
        TiffDevice tiffDevice = new TiffDevice(resolution, tiffSettings);
        // Convert a particular page and save the image to stream
        tiffDevice.process(pdfDocument, outputImageFile);
        outputImageFile.close();

        // Create stream object to save the output image
        java.io.InputStream inStream = new java.io.FileInputStream(outputImageFileName);
        tiffDevice.binarizeBradley(inStream, outputBinImageFile, 0.1);

    }

Convert PDF Pages to Pixelated TIFF Images

To convert all pages in a PDF file to Pixelated TIFF format, use Pixelated option of IndexedConversionType

   public static void ConvertPDFtoTIFF_Pixelated() throws IOException {
        // Convert PDF Pages to Pixelated TIFF Images
        // To convert all pages in a PDF file to Pixelated TIFF format, use Pixelated
        // option of IndexedConversionType.

        // Open document
        String pdfDocumentFileName = Paths.get(_dataDir.toString(), "PageToTIFF.pdf").toString();
        Document pdfDocument = new Document(pdfDocumentFileName);

        // Create stream object to save the output image
        java.io.OutputStream imageStream = new java.io.FileOutputStream("Image.tiff");

        // Create Resolution object
        com.aspose.pdf.devices.Resolution resolution = new com.aspose.pdf.devices.Resolution(300);

        // instantiate TiffSettings object
        com.aspose.pdf.devices.TiffSettings tiffSettings = new com.aspose.pdf.devices.TiffSettings();

        // set the compression of resultant TIFF image
        tiffSettings.setCompression(com.aspose.pdf.devices.CompressionType.CCITT4);
        // set the color depth for resultant image
        tiffSettings.setDepth(com.aspose.pdf.devices.ColorDepth.Format4bpp);
        // skip blank pages while rendering PDF to TIFF
        tiffSettings.setSkipBlankPages(true);
        // set image brightness
        tiffSettings.setBrightness(.5f);

        // set IndexedConversion Type, default value is simple
        tiffSettings.setIndexedConversionType(IndexedConversionType.Pixelated);

        // Create TiffDevice object with particular resolution
        TiffDevice tiffDevice = new TiffDevice(2480, 3508, resolution, tiffSettings);

        // Convert a particular page (Page 1) and save the image to stream
        tiffDevice.process(pdfDocument, 1, 1, imageStream);

        // Close the stream
        imageStream.close();
    }

Convert PDF using ImageDevice class

ImageDevice is the ancestor for BmpDevice, JpegDevice, GifDevice, PngDevice and EmfDevice.

  • The BmpDevice class allows you to convert PDF pages to BMP images.
  • The EmfDevice class allows you to convert PDF pages to EMF images.
  • The JpegDevice class allows you to convert PDF pages to JPEG images.
  • The PngDevice class allows you to convert PDF pages to PNG images.
  • The GifDevice class allows you to convert PDF pages to GIF images.

Let’s take a look at how to convert a PDF page to an image.

BmpDevice class provides a method named Process which allows you to convert a particular page of the PDF file to BMP image format. The other classes have the same method. So, if we need to convert a PDF page to an image, we just instantiate the required class.

The following code snippet shows this possibility:

public final class ConvertPDFtoImage {
    private static Path _dataDir = Paths.get("/home/aspose/pdf-examples/Samples");

    public static void ConvertPDFusingImageDevice() {
        // Create Resolution object
        Resolution resolution = new Resolution(300);
        BmpDevice bmpDevice = new BmpDevice(resolution);
        JpegDevice jpegDevice = new JpegDevice(resolution);
        GifDevice gifDevice = new GifDevice(resolution);
        PngDevice pngDevice = new PngDevice(resolution);
        EmfDevice emfDevice = new EmfDevice(resolution);

        Document document = new Document(_dataDir + "ConvertAllPagesToBmp.pdf");

        ConvertPDFtoImage(bmpDevice, "bmp", document);
        ConvertPDFtoImage(jpegDevice, "jpeg", document);
        ConvertPDFtoImage(gifDevice, "gif", document);
        ConvertPDFtoImage(pngDevice, "png", document);
        ConvertPDFtoImage(emfDevice, "emf", document);

    }
}

public static void ConvertPDFtoImage(ImageDevice imageDevice, string ext, Document pdfDocument) {
        for (int pageCount = 1; pageCount <= pdfDocument.Pages.Count; pageCount++) {
            java.io.OutputStream outputBinImageFile = new java.io.FileOutputStream(
                    _dataDir + "image" + pageCount + "_out." + ext);
            // Convert a particular page and save the image to stream
            imageDevice.Process(pdfDocument.Pages[pageCount], imageStream);

            // Close stream
            imageStream.Close();
        }
    }

Convert PDF using SaveOptions class

This part of article shows you how to convert PDF to SVG using Java and SaveOptions class.

Scalable Vector Graphics (SVG) is a family of specifications of an XML-based file format for two-dimensional vector graphics, both static and dynamic (interactive or animated). The SVG specification is an open standard that has been under development by the World Wide Web Consortium (W3C) since 1999.

SVG images and their behaviors are defined in XML text files. This means that they can be searched, indexed, scripted and if required, compressed. As XML files, SVG images can be created and edited with any text editor, but it is often more convenient to create them with drawing programs such as Inkscape.

Convert PDF pages to SVG images

Aspose.PDF for Java supports the feature to convert PDF file to SVG format. To accomplish this requirement, the SvgSaveOptions class has been introduced into the com.aspose.pdf package. Instantiate an object of SvgSaveOptions and pass it as a second argument to the Document.save(..) method.

The following code snippet shows the steps for converting a PDF file to SVG format.

package com.aspose.pdf.examples;

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

import com.aspose.pdf.*;

public class ConvertPDFtoSVG {
    private ConvertPDFtoSVG() {
    }
    // The path to the documents directory.
    private static Path _dataDir = Paths.get("/home/admin1/pdf-examples/Samples");

    public static void main(String[] args) throws IOException {
        String pdfFileName = Paths.get(_dataDir.toString(), "input.pdf").toString();
        String svgFileName = Paths.get(_dataDir.toString(), "PDFToSVG_out.svg").toString();
        
        // Load PDF document
        Document doc = new Document(pdfFileName);
        
        // Instantiate an object of SvgSaveOptions
        SvgSaveOptions saveOptions = new SvgSaveOptions();
        
        // Do not compress SVG image to Zip archive
        saveOptions.CompressOutputToZipArchive = false;
        
        // Save the output in SVG files
        doc.save(svgFileName, saveOptions);
    }
}