Extract Images from Document

Aspose.Words - Extract Images from Document

To extract all images, or only images of a specific type, from a document, follow these steps:

  • Use the Document.getChildNodes method to select all Shape nodes.
  • Iterate through the resulting node collection.
  • Check Shape.hasImage() to skip shapes that do not contain an image.
  • Get the image data through Shape.getImageData().
  • Save the image data to a file.

Java

// The path to the documents directory.
String dataDir = Utils.getDataDir(AsposeExtractImages.class);

Document doc = new Document(dataDir + "document.doc");

NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
int imageIndex = 0;
for (Shape shape : (Iterable<Shape>) shapes)
{
    if (shape.hasImage())
    {
        // Plain concatenation avoids MessageFormat's locale-specific
        // digit grouping once imageIndex reaches 1000.
        String imageFileName = "Aspose.Images." + imageIndex
                + FileFormatUtil.imageTypeToExtension(shape.getImageData().getImageType());
        shape.getImageData().save(dataDir + imageFileName);

        imageIndex++;
    }
}
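
The steps above mention extracting only images of a specific type, while the loop saves every image it finds. One way to add that filter is sketched below. ImageTypeFilter is a hypothetical helper, not part of Aspose.Words, and the sketch assumes the extension string produced by FileFormatUtil.imageTypeToExtension may carry a leading dot, as the file name pattern in the example suggests.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of Aspose.Words or Apache POI.
final class ImageTypeFilter {
    private final Set<String> wanted = new HashSet<>();

    // Accepts extensions with or without a leading dot, in any letter case.
    ImageTypeFilter(String... extensions) {
        for (String ext : extensions) {
            wanted.add(normalize(ext));
        }
    }

    // Returns true when the given extension is one of the wanted types.
    boolean accepts(String extension) {
        return extension != null && wanted.contains(normalize(extension));
    }

    // Strips a leading dot and lower-cases, so ".JPEG" and "jpeg" compare equal.
    private static String normalize(String ext) {
        String trimmed = ext.trim();
        if (trimmed.startsWith(".")) {
            trimmed = trimmed.substring(1);
        }
        return trimmed.toLowerCase(Locale.ROOT);
    }
}
```

Inside the loop, the hasImage() check could then be combined with filter.accepts(FileFormatUtil.imageTypeToExtension(shape.getImageData().getImageType())), where filter is, for example, new ImageTypeFilter("png", "jpeg"). The comparison is purely textual, so "jpg" and "jpeg" count as different types in this sketch.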

Apache POI HWPF XWPF - Extract Images from Document

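The PicturesTable.getAllPictures method returns every picture stored in a .doc file opened with HWPFDocument. For .docx files, XWPFDocument.getAllPictures plays the same role.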

Java

// The path to the documents directory.
String dataDir = Utils.getDataDir(ApacheExtractImages.class);

HWPFDocument doc = new HWPFDocument(new FileInputStream(dataDir + "document.doc"));
List<Picture> pics = doc.getPicturesTable().getAllPictures();

for (Picture pic : pics)
{
    // try-with-resources closes the stream even if the write fails.
    try (FileOutputStream outputStream = new FileOutputStream(dataDir + "Apache_" + pic.suggestFullFileName()))
    {
        outputStream.write(pic.getContent());
    }
}
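
Both examples write straight into dataDir and would silently overwrite a file that already has the same name. A minimal sketch of a more defensive save step follows, using only java.nio. ImageFileWriter and its write method are hypothetical names introduced here, not part of Apache POI or Aspose.Words.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of Apache POI or Aspose.Words.
final class ImageFileWriter {
    // Writes content into dir under fileName, creating dir if it is missing.
    // If the name is already taken, "_1", "_2", ... is inserted before the extension.
    // Not safe against concurrent writers: the existence check and the write are separate steps.
    static Path write(Path dir, String fileName, byte[] content) {
        try {
            Files.createDirectories(dir);
            int dot = fileName.lastIndexOf('.');
            String base = dot > 0 ? fileName.substring(0, dot) : fileName;
            String ext = dot > 0 ? fileName.substring(dot) : "";
            Path target = dir.resolve(fileName);
            for (int n = 1; Files.exists(target); n++) {
                target = dir.resolve(base + "_" + n + ext);
            }
            return Files.write(target, content);
        } catch (IOException e) {
            // Rethrown unchecked so callers inside simple loops need no throws clause.
            throw new UncheckedIOException(e);
        }
    }
}
```

In the Apache POI loop, the body could then become ImageFileWriter.write(Paths.get(dataDir), "Apache_" + pic.suggestFullFileName(), pic.getContent()).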

Download Running Code

Download the Extract Images from Document example from any of the social coding sites listed below: