Extract Links from the PDF File

A PDF file represents links as annotations, so extracting the links on a page means collecting all of its LinkAnnotation objects.

  1. Create a Document object.
  2. Get the Page you want to extract links from.
  3. Create an AnnotationSelector object, passing it a LinkAnnotation so that it selects only the LinkAnnotation objects on the specified page.
  4. Pass the AnnotationSelector object to the Page object’s accept method.
  5. Get all the selected link annotations as a java.util.List using the AnnotationSelector object’s getSelected method.

The following code snippet shows you how to extract links from a PDF file.

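    // Requires the Aspose.PDF classes, e.g. import com.aspose.pdf.*;
    // _dataDir is a field defined elsewhere that points to the sample data folder.
    public static void ExtractLinksFromThePDFFile() {
        // Load the PDF file
        Document document = new Document(_dataDir + "UpdateLinks.pdf");
        // Get the first page (page numbering starts at 1)
        Page page = document.getPages().get_Item(1);

        // Create a selector that matches link annotations only
        AnnotationSelector selector = new AnnotationSelector(new LinkAnnotation(page, Rectangle.getTrivial()));
        // Let the selector visit the annotations on the page
        page.accept(selector);
        // Retrieve the selected link annotations
        java.util.List<Annotation> list = selector.getSelected();
        for (Annotation annot : list) {
            System.out.println("Annotation located: " + annot.getRect());
        }

        // Extraction does not modify the document; uncomment to save a copy anyway
        //document.save(_dataDir + "ExtractLinks_out.pdf");
    }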
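
The accept and getSelected steps follow the visitor pattern: the page hands each of its annotations to the selector, and the selector keeps only the kind it was built for. The library-free sketch below illustrates that flow under simplifying assumptions. Every type in it (SelectorSketch, Page, Selector, LinkAnnot, TextAnnot) is a hypothetical stand-in written for this illustration, not part of the Aspose.PDF API, and the real AnnotationSelector is configured with a prototype annotation rather than hard-coded to links.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectorSketch {

    // Hypothetical stand-in for an annotation; not an Aspose.PDF type.
    interface Annot {
        void accept(Selector selector);
    }

    // Stand-in for a link annotation; rect is just a label here.
    static final class LinkAnnot implements Annot {
        final String rect;
        LinkAnnot(String rect) { this.rect = rect; }
        @Override public void accept(Selector selector) { selector.visit(this); }
    }

    // Stand-in for any non-link annotation.
    static final class TextAnnot implements Annot {
        @Override public void accept(Selector selector) { selector.visit(this); }
    }

    // Collects link annotations and ignores everything else.
    static final class Selector {
        private final List<Annot> selected = new ArrayList<>();
        void visit(LinkAnnot link) { selected.add(link); }
        void visit(TextAnnot text) { /* not a link: skip */ }
        List<Annot> getSelected() { return selected; }
    }

    // Stand-in for a page: forwards the selector to each annotation.
    static final class Page {
        private final List<Annot> annotations = new ArrayList<>();
        void add(Annot a) { annotations.add(a); }
        void accept(Selector selector) {
            for (Annot a : annotations) {
                a.accept(selector);
            }
        }
    }

    // Mirrors steps 3 to 5: create a selector, pass it to the page, read the result.
    static List<Annot> selectLinks(Page page) {
        Selector selector = new Selector();
        page.accept(selector);
        return selector.getSelected();
    }

    public static void main(String[] args) {
        Page page = new Page();
        page.add(new LinkAnnot("10,10,100,20"));
        page.add(new TextAnnot());
        page.add(new LinkAnnot("10,40,100,50"));
        for (Annot a : selectLinks(page)) {
            System.out.println("Link located: " + ((LinkAnnot) a).rect);
        }
    }
}
```

Because the page only forwards the selector and never inspects annotation types itself, the same accept call can serve selectors for other annotation kinds without changes to the page.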