Extract PDF Links in Java
You can inspect PDF links by iterating over page annotations and filtering for AnnotationType.Link.
Extract link annotations
Use this example when you need the page index and bounding rectangle of each link annotation on a page. The code reads the first page only.
- Open the source PDF Document.
- Iterate through the page annotations and filter for link annotations.
- Read the page index and rectangle for each matching link.
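import java.nio.file.Path;
// Assumption: the types used here come from Aspose.PDF for Java (com.aspose.pdf).
import com.aspose.pdf.Annotation;
import com.aspose.pdf.AnnotationType;
import com.aspose.pdf.Document;
import com.aspose.pdf.LinkAnnotation;

public static void extractLinkAnnotation(Path inputFile) {
    // try-with-resources closes the document when the block exits.
    try (Document document = new Document(inputFile.toString())) {
        // get_Item(1) selects the first page, so only that page is inspected.
        for (Annotation annotation : document.getPages().get_Item(1).getAnnotations()) {
            if (annotation.getAnnotationType() == AnnotationType.Link && annotation instanceof LinkAnnotation) {
                LinkAnnotation linkAnnotation = (LinkAnnotation) annotation;
                System.out.println("Page: " + linkAnnotation.getPageIndex()
                        + ", location: " + linkAnnotation.getRect());
            }
        }
    }
}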
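Once the page index and rectangle values have been read, a common follow-up is to check which link covers a given point on the page. The following is a minimal sketch of that check, assuming the four corner values of each rectangle have been copied into a plain value object. LinkBox, contains, and hits are hypothetical names used for illustration only and are not part of the PDF library. The sketch also assumes the usual PDF convention that a rectangle is described by its lower-left and upper-right corners.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type, independent of any PDF library.
// It holds the page index and corner coordinates of one link rectangle.
final class LinkBox {
    final int page;
    final double llx;
    final double lly;
    final double urx;
    final double ury;

    LinkBox(int page, double x1, double y1, double x2, double y2) {
        this.page = page;
        // Normalize so that (llx, lly) is always the lower-left corner,
        // even if the caller passes the corners in the opposite order.
        this.llx = Math.min(x1, x2);
        this.lly = Math.min(y1, y2);
        this.urx = Math.max(x1, x2);
        this.ury = Math.max(y1, y2);
    }

    // True when the point lies inside the rectangle or on its border.
    boolean contains(double x, double y) {
        return x >= llx && x <= urx && y >= lly && y <= ury;
    }

    // Returns every box on the given page that contains the point.
    static List<LinkBox> hits(List<LinkBox> boxes, int page, double x, double y) {
        List<LinkBox> result = new ArrayList<>();
        for (LinkBox box : boxes) {
            if (box.page == page && box.contains(x, y)) {
                result.add(box);
            }
        }
        return result;
    }
}
```

Keeping the coordinates in a small value object like this means the hit test can run after the document has been closed, since it no longer depends on any library object.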
Extract hyperlink destinations
Use this example when you need to read the target URIs of web link annotations. The code reads the first page only.
- Open the source PDF Document.
- Find LinkAnnotation objects whose action is a GoToURIAction.
- Print the page index and URI target for each hyperlink.
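import java.nio.file.Path;
// Assumption: the types used here come from Aspose.PDF for Java (com.aspose.pdf).
import com.aspose.pdf.Annotation;
import com.aspose.pdf.AnnotationType;
import com.aspose.pdf.Document;
import com.aspose.pdf.GoToURIAction;
import com.aspose.pdf.LinkAnnotation;

public static void extractHyperlinks(Path inputFile) {
    try (Document document = new Document(inputFile.toString())) {
        // get_Item(1) selects the first page, so only that page is inspected.
        for (Annotation annotation : document.getPages().get_Item(1).getAnnotations()) {
            if (annotation.getAnnotationType() == AnnotationType.Link && annotation instanceof LinkAnnotation) {
                LinkAnnotation linkAnnotation = (LinkAnnotation) annotation;
                // Links with any other action type are skipped; only URI actions carry a web target.
                if (linkAnnotation.getAction() instanceof GoToURIAction) {
                    GoToURIAction action = (GoToURIAction) linkAnnotation.getAction();
                    System.out.println("Page: " + linkAnnotation.getPageIndex() + ", URI: " + action.getURI());
                }
            }
        }
    }
}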
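The URI strings read from link annotations are often repeated across a document and may include targets that are not web addresses, such as mailto links. The following is a minimal sketch of a cleanup step, assuming the URI values have first been collected into a list of strings instead of being printed. LinkTargetFilter and distinctWebTargets are hypothetical names used for illustration. The sketch uses only the Java standard library and is not part of the PDF library.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, independent of any PDF library: cleans up URI
// strings that were collected from link annotations.
final class LinkTargetFilter {

    private LinkTargetFilter() {
    }

    // Returns the distinct http/https targets in first-seen order.
    // Null, blank, malformed, and non-web entries are skipped.
    static List<String> distinctWebTargets(List<String> rawTargets) {
        Set<String> seen = new LinkedHashSet<>();
        for (String raw : rawTargets) {
            if (raw == null || raw.trim().isEmpty()) {
                continue;
            }
            String candidate = raw.trim();
            try {
                String scheme = new URI(candidate).getScheme();
                if (scheme == null) {
                    continue; // relative reference, not a web address
                }
                String lower = scheme.toLowerCase(Locale.ROOT);
                if (lower.equals("http") || lower.equals("https")) {
                    seen.add(candidate);
                }
            } catch (URISyntaxException e) {
                // Malformed target: skip it rather than fail the whole run.
            }
        }
        return new ArrayList<>(seen);
    }
}
```

Duplicates are detected by exact string comparison after trimming, so two URIs that differ only in letter case or a trailing slash are kept as separate entries. That choice keeps the sketch simple; stricter normalization would be a separate decision.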