Extract Vector Data from a PDF file using Java
Access vector data from a PDF document
Use GraphicsAbsorber to inspect vector graphic elements on a page and write their basic geometry to a text file.
- Open the source PDF Document.
- Create a GraphicsAbsorber and visit the target Page.
- Iterate through the extracted GraphicElement objects.
- Build the output text with element geometry and operator counts.
- Write the extracted vector data to the output file.
public static void extractGraphicsElements(Path inputFile, Path outputFile) throws Exception {
    // try-with-resources closes the document when the block exits
    try (Document document = new Document(inputFile.toString())) {
        GraphicsAbsorber absorber = new GraphicsAbsorber();
        // Collect the vector graphics found on page 1
        absorber.visit(document.getPages().get_Item(1));
        StringBuilder text = new StringBuilder();
        int index = 1;
        // Append one report line per extracted element
        for (GraphicElement element : absorber.getElements()) {
            text.append("Element ").append(index)
                .append(": Rectangle = ").append(element.getRectangle())
                .append(", Position = ").append(element.getPosition())
                .append(", Operators = ").append(element.getOperators().size())
                .append("\n");
            index++;
        }
        Files.writeString(outputFile, text.toString());
    }
}
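The report-building step above can be separated from the PDF library so that the line format is easy to test on its own. The following is a minimal sketch under that assumption: `ElementInfo` and `formatReport` are hypothetical names introduced here for illustration, and the string fields stand in for whatever `getRectangle()` and `getPosition()` print.

```java
import java.util.List;

public class VectorReportSketch {

    // Hypothetical stand-in for the values read from each GraphicElement.
    static final class ElementInfo {
        final String rectangle;
        final String position;
        final int operatorCount;

        ElementInfo(String rectangle, String position, int operatorCount) {
            this.rectangle = rectangle;
            this.position = position;
            this.operatorCount = operatorCount;
        }
    }

    // Builds one line per element, numbered from 1, in the same layout as the method above.
    static String formatReport(List<ElementInfo> elements) {
        StringBuilder text = new StringBuilder();
        int index = 1;
        for (ElementInfo e : elements) {
            text.append("Element ").append(index)
                .append(": Rectangle = ").append(e.rectangle)
                .append(", Position = ").append(e.position)
                .append(", Operators = ").append(e.operatorCount)
                .append("\n");
            index++;
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        List<ElementInfo> sample = List.of(
            new ElementInfo("0,0,100,50", "(0, 0)", 4),
            new ElementInfo("10,10,20,20", "(10, 10)", 7));
        System.out.print(formatReport(sample));
    }
}
```

Keeping the formatting in a plain method like this means the report layout can be changed or checked without opening a PDF.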
Save page vector graphics to SVG
- Open the source PDF Document.
- Get the target Page from the document.
- Save the page vector graphics to the output SVG file.
public static void saveVectorGraphicsToSvg(Path inputFile, Path outputFile) {
    try (Document document = new Document(inputFile.toString())) {
        Page page = document.getPages().get_Item(1);
        page.trySaveVectorGraphics(outputFile.toString());
    }
}
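The method above does not tell the caller whether an SVG file was actually produced. One way to find out, sketched here with the standard library only, is to check the output path after the call. `isNonEmptyFile` is a hypothetical helper introduced for illustration and is not part of the PDF library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class OutputCheckSketch {

    // Returns true when the path is a regular file holding at least one byte.
    // An unreadable or missing file counts as "not written".
    static boolean isNonEmptyFile(Path file) {
        try {
            return Files.isRegularFile(file) && Files.size(file) > 0;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the SVG output path.
        Path output = Files.createTempFile("page", ".svg");
        System.out.println(isNonEmptyFile(output)); // still empty at this point
        Files.writeString(output, "<svg xmlns=\"http://www.w3.org/2000/svg\"/>");
        System.out.println(isNonEmptyFile(output)); // now has content
        Files.delete(output);
    }
}
```

A check like this could run right after `saveVectorGraphicsToSvg(inputFile, outputFile)` to decide whether to report the page as having no vector graphics.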
Save each extracted element to a separate SVG
- Open the source PDF Document.
- Create a GraphicsAbsorber and visit the target Page.
- Create the output directory for the extracted subpaths.
- Iterate through the extracted GraphicElement objects.
- Save each element to a separate SVG file.
public static void extractSubpathsToSvgs(Path inputFile, Path outputDir) throws Exception {
    try (Document document = new Document(inputFile.toString())) {
        GraphicsAbsorber absorber = new GraphicsAbsorber();
        absorber.visit(document.getPages().get_Item(1));
        // createDirectories also creates missing parents and accepts an existing directory
        Path subpathsDir = outputDir.resolve("subpaths");
        Files.createDirectories(subpathsDir);
        int index = 1;
        // Write subpath_1.svg, subpath_2.svg, ... in extraction order
        for (GraphicElement element : absorber.getElements()) {
            element.saveToSvg(subpathsDir.resolve("subpath_" + index + ".svg").toString());
            index++;
        }
    }
}
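With the naming scheme above, `subpath_10.svg` sorts before `subpath_2.svg` in a plain alphabetical listing. If the file order matters, an optional variation is to zero-pad the index. The sketch below uses only the standard library. `elementPath` and the three-digit width are assumptions made for illustration.

```java
import java.nio.file.Path;
import java.util.Locale;

public class OutputNamingSketch {

    // Builds a zero-padded file name such as subpath_007.svg inside the given directory.
    static Path elementPath(Path directory, int index) {
        String name = String.format(Locale.ROOT, "subpath_%03d.svg", index);
        return directory.resolve(name);
    }

    public static void main(String[] args) {
        Path dir = Path.of("output", "subpaths");
        for (int index = 1; index <= 3; index++) {
            System.out.println(elementPath(dir, index));
        }
    }
}
```

The resulting path can be passed to the save call in place of the string concatenation, for example `elementPath(subpathsDir, index).toString()`.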
Combine extracted elements into a single SVG
- Open the source PDF Document.
- Create a GraphicsAbsorber and visit the target Page.
- Create the SVG wrapper content.
- Iterate through the extracted GraphicElement objects and append each SVG fragment.
- Write the combined SVG output to the target file.
public static void extractListOfElementsToSingleImage(Path inputFile, Path outputFile) throws Exception {
    try (Document document = new Document(inputFile.toString())) {
        GraphicsAbsorber absorber = new GraphicsAbsorber();
        absorber.visit(document.getPages().get_Item(1));
        StringBuilder svg = new StringBuilder();
        // Open the wrapper element that will contain every fragment
        svg.append("<svg xmlns=\"http://www.w3.org/2000/svg\">\n");
        for (GraphicElement element : absorber.getElements()) {
            svg.append(element.saveToSvg()).append("\n");
        }
        // Close the wrapper and write the combined document
        svg.append("</svg>\n");
        Files.writeString(outputFile, svg.toString());
    }
}
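The wrapper element in the method above declares no `width`, `height`, or `viewBox`, so the rendered size is left to the viewer. A minimal sketch of a wrapper that sets them explicitly follows. It uses only the standard library; `wrap` is a hypothetical helper, the width and height are assumed to be supplied by the caller (for example from the page size), and each fragment is assumed to be markup that is valid inside an `<svg>` root.

```java
import java.util.List;
import java.util.Locale;

public class SvgWrapperSketch {

    // Wraps the given SVG fragments in one root element with an explicit size and viewBox.
    static String wrap(List<String> fragments, double width, double height) {
        StringBuilder svg = new StringBuilder();
        // Locale.ROOT keeps the decimal separator a dot regardless of system locale.
        svg.append(String.format(Locale.ROOT,
            "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"%.2f\" height=\"%.2f\" viewBox=\"0 0 %.2f %.2f\">\n",
            width, height, width, height));
        for (String fragment : fragments) {
            svg.append(fragment).append("\n");
        }
        svg.append("</svg>\n");
        return svg.toString();
    }

    public static void main(String[] args) {
        // The path data and the 595 x 842 size are sample values for the demonstration.
        List<String> fragments = List.of(
            "<path d=\"M 10 10 L 100 10\" stroke=\"black\"/>",
            "<path d=\"M 10 20 L 100 20\" stroke=\"black\"/>");
        System.out.print(wrap(fragments, 595, 842));
    }
}
```

The strings returned by the per-element calls could be collected into a list and passed to a helper like this before writing the file.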
Extract a single vector element
- Open the source PDF Document.
- Create a GraphicsAbsorber and visit the target Page.
- Get the target GraphicElement from the extracted elements collection.
- Check whether the element is an XFormPlacement and select the nested element when needed.
- Save the selected vector element to the output SVG file.
public static void extractSingleVectorElement(Path inputFile, Path outputFile) {
    try (Document document = new Document(inputFile.toString())) {
        GraphicsAbsorber graphicsAbsorber = new GraphicsAbsorber();
        Page page = document.getPages().get_Item(1);
        graphicsAbsorber.visit(page);
        // Read item 1 only when the collection is large enough
        if (graphicsAbsorber.getElements().size() > 1) {
            GraphicElement candidate = graphicsAbsorber.getElements().get_Item(1);
            if (candidate instanceof XFormPlacement) {
                // A form placement groups nested elements; save one of those instead
                XFormPlacement placement = (XFormPlacement) candidate;
                if (placement.getElements().size() > 2) {
                    placement.getElements().get_Item(2).saveToSvg(outputFile.toString());
                }
            } else {
                candidate.saveToSvg(outputFile.toString());
            }
        }
    }
}
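The method above writes nothing, and reports nothing, when either size check fails. One way to make that case explicit is to return an `Optional` from the lookup so the caller has to handle a missing element. The sketch below shows the idea on a plain `java.util.List`. `itemAt` is a hypothetical helper, and using a standard list in place of the library's element collection is an assumption made for illustration.

```java
import java.util.List;
import java.util.Optional;

public class ElementLookupSketch {

    // Returns the item at the given zero-based index, or empty when the index is out of range.
    static <T> Optional<T> itemAt(List<T> items, int index) {
        if (index < 0 || index >= items.size()) {
            return Optional.empty();
        }
        return Optional.ofNullable(items.get(index));
    }

    public static void main(String[] args) {
        // Strings stand in for extracted elements.
        List<String> elements = List.of("first", "second");
        System.out.println(itemAt(elements, 1).orElse("no element at index 1"));
        System.out.println(itemAt(elements, 5).orElse("no element at index 5"));
    }
}
```

With a lookup like this, the caller can log or throw when the element is absent instead of returning without output.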