Parse PDF documents

Parsing PDF documents means extracting various kinds of information from a PDF file. This section covers how to:

  • Extract Text from PDF. Text parsing, or extraction, is the most common operation performed on existing PDFs. You will learn how to parse text from a whole document, a particular page, or a particular region of a page.
  • Extract Images from PDF. Image extraction retrieves the images stored in a PDF, just as text extraction does for text.
  • Extract Fonts from PDF. Font extraction retrieves the fonts used in a PDF document.
  • Extract Data from the Form. If you have a set of PDF documents that contain forms, you will probably need to get the data out of those forms. This article explains how to extract AcroForms data with Aspose.PDF for .NET.
  • Extract Text from Stamps. Get the text contained in the stamps of your PDF document.
  • Extract Data from Table. Get data from a table in a PDF document.
  • Extract Vector Data from PDF. Get the properties of vector graphics (paths, polygons, polylines), such as position, color, and line width.