Extract Images from PDF File using Python

Do you need to separate images from your PDF files? To simplify managing, archiving, analyzing, or sharing the images in your documents, use Aspose.PDF for Python to extract them from PDF files.

Images are held in the XImage collection of each page's resources collection. To extract an image from a particular page, get it from that page's Images collection using the image's index.

Indexing the Images collection returns an XImage object. This object provides a save() method, which can be used to save the extracted image. The following code snippet shows how to extract an image from a PDF file.


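    import io

    import aspose.pdf as ap

    # Placeholder paths; replace with your own files
    input_file = "input.pdf"
    output_image = "output.jpg"

    # Open document
    document = ap.Document(input_file)

    # Extract a particular image: the first image on the second page
    # (Aspose.PDF collections are indexed from 1)
    x_image = document.pages[2].resources.images[1]

    # Save output image; the with block closes the file even if save() fails
    with io.FileIO(output_image, "w") as output_stream:
        x_image.save(output_stream)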
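
To extract every image rather than a single one, the same lookup can be repeated over all pages and all image indexes. The following is a minimal sketch, not an official Aspose example. The helper name `extract_all_images` and the output naming scheme are hypothetical. It assumes that the pages and images collections support `len()` and 1-based indexing, and that `.jpg` is a suitable extension for the saved data.

```python
import io
from pathlib import Path


def extract_all_images(document, output_dir):
    """Save every image in `document` to `output_dir` and return the written paths.

    `document` is expected to expose pages[n].resources.images[m] with a
    save(stream) method on each image, as in the snippet above.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []

    # Assumption: both collections are indexed from 1 and support len()
    for page_number in range(1, len(document.pages) + 1):
        images = document.pages[page_number].resources.images
        for image_number in range(1, len(images) + 1):
            # Hypothetical naming scheme; the .jpg extension is an assumption
            target = out / f"page{page_number}_image{image_number}.jpg"
            with io.FileIO(str(target), "w") as stream:
                images[image_number].save(stream)
            saved.append(target)

    return saved
```

With Aspose.PDF, this would be called as `extract_all_images(ap.Document(input_file), "images")`. Pages without images are skipped, because the inner loop has nothing to iterate over.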