Extract Tagged Content from PDF

In this article, you will learn how to extract tagged content from a PDF document using C#.

The following code snippets also work with the Aspose.PDF.Drawing library.

Getting Tagged PDF Content

To get the content of a PDF document with tagged text, Aspose.PDF offers the TaggedContent property of the Document class.

The following code snippet shows how to get the content of a PDF document with tagged text:
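A minimal sketch of this step, assuming the Aspose.PDF for .NET package is referenced. The class name, title string, and output file name are placeholders chosen for illustration, and SetTitle and SetLanguage are used here as recalled from the ITaggedContent interface:

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Tagged;

public static class TaggedContentExample
{
    public static void Run()
    {
        // Create a new PDF document.
        using (var document = new Document())
        {
            // TaggedContent is the entry point for working with tagged PDF.
            ITaggedContent taggedContent = document.TaggedContent;

            // Set the title and natural language of the document.
            taggedContent.SetTitle("Simple Tagged Pdf Document");
            taggedContent.SetLanguage("en-US");

            // Placeholder output path; adjust to your environment.
            document.Save("TaggedPDFContent.pdf");
        }
    }
}
```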

Getting Root Structure

To get the root structure of a tagged PDF document, Aspose.PDF offers the StructTreeRootElement property of the ITaggedContent interface, together with the StructureElement class that represents the root structure element. The following code snippet shows how to get the root structure of a tagged PDF document:
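A short sketch of reading both objects, assuming the Aspose.PDF for .NET package is referenced. The RootElement property is not named in the text above; it is used here as recalled from the ITaggedContent interface, and the class name and output file name are placeholders:

```csharp
using Aspose.Pdf;
using Aspose.Pdf.LogicalStructure;
using Aspose.Pdf.Tagged;

public static class RootStructureExample
{
    public static void Run()
    {
        using (var document = new Document())
        {
            ITaggedContent taggedContent = document.TaggedContent;
            taggedContent.SetTitle("Tagged Pdf Document");
            taggedContent.SetLanguage("en-US");

            // StructTreeRootElement gives access to the StructTreeRoot object of the PDF.
            StructTreeRootElement structTreeRootElement = taggedContent.StructTreeRootElement;

            // RootElement (assumed property name) gives the root structure element.
            StructureElement rootElement = taggedContent.RootElement;

            // Placeholder output path; adjust to your environment.
            document.Save("GetRootElement.pdf");
        }
    }
}
```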

Accessing Child Elements

To access the child elements of a tagged PDF document, Aspose.PDF offers the ElementList class. The following code snippet shows how to access the child elements of a tagged PDF document:
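One way this could look, assuming the Aspose.PDF for .NET package is referenced and an existing tagged file is available. The input file name is hypothetical, and the ChildElements property and the StructureElement properties read below are used as recalled from the library rather than taken from the text above:

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.LogicalStructure;
using Aspose.Pdf.Tagged;

public static class ChildElementsExample
{
    public static void Run()
    {
        // "StructureElementsTree.pdf" is a hypothetical tagged input file.
        using (var document = new Document("StructureElementsTree.pdf"))
        {
            ITaggedContent taggedContent = document.TaggedContent;

            // Children of the structure tree root are returned as an ElementList.
            ElementList elementList = taggedContent.StructTreeRootElement.ChildElements;

            foreach (Element element in elementList)
            {
                // Only structure elements carry the descriptive properties below.
                if (element is StructureElement structureElement)
                {
                    Console.WriteLine("Title: " + structureElement.Title);
                    Console.WriteLine("Language: " + structureElement.Language);
                    Console.WriteLine("ActualText: " + structureElement.ActualText);
                    Console.WriteLine("AlternativeText: " + structureElement.AlternativeText);
                }
            }
        }
    }
}
```

The same loop can be applied to the ChildElements of any StructureElement to walk deeper levels of the tree.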

Tagging Images in Existing PDF

To tag images in an existing PDF document, Aspose.PDF offers the FindElements method of the StructureElement class. You can add alternative text for figures using the AlternativeText property of the FigureElement class.

The following code snippet shows how to tag images in an existing PDF document:
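A hedged sketch of this approach, assuming the Aspose.PDF for .NET package is referenced. The input and output file names and the alternative text are placeholders, RootElement is used as recalled from the ITaggedContent interface, and the boolean argument to FindElements is assumed to request a recursive search:

```csharp
using Aspose.Pdf;
using Aspose.Pdf.LogicalStructure;
using Aspose.Pdf.Tagged;

public static class TagImagesExample
{
    public static void Run()
    {
        // "input.pdf" is a hypothetical existing tagged document containing figures.
        using (var document = new Document("input.pdf"))
        {
            ITaggedContent taggedContent = document.TaggedContent;
            taggedContent.SetTitle("Document with images");
            taggedContent.SetLanguage("en-US");

            StructureElement rootElement = taggedContent.RootElement;

            // Find every figure element below the root (true = assumed recursive search).
            foreach (FigureElement figureElement in rootElement.FindElements<FigureElement>(true))
            {
                // Placeholder text; real alternative text should describe each image.
                figureElement.AlternativeText = "Figure alternative text";
            }

            // Placeholder output path; adjust to your environment.
            document.Save("output.pdf");
        }
    }
}
```

In practice each figure would receive its own description rather than one shared string, since alternative text is meant to convey what the specific image shows.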