Extract Text from PDF using Go

Extract Text From PDF Document

Extracting text from a PDF document is a common and useful task. PDFs often contain critical information that needs to be accessed, analyzed, or processed for various purposes, and extracting the text makes it easier to reuse that information in databases, reports, or other documents.

Extracting text makes PDF content searchable, allowing users to locate specific information quickly without manually reviewing the entire document.

To extract text from a PDF document, use the ExtractText function. The following code snippet shows how to extract text from a PDF file using Go via C++.

  1. Open a PDF document with the given filename.
  2. Call ExtractText to get the text content of the PDF document.
  3. Print the extracted text to the console.
  4. Close the PDF document to release the allocated resources.

    package main

    import (
        "fmt"
        "log"

        "github.com/aspose-pdf/aspose-pdf-go-cpp"
    )

    func main() {
        // Open(filename string) opens a PDF-document with filename
        pdf, err := asposepdf.Open("sample.pdf")
        if err != nil {
            log.Fatal(err)
        }
        // Close() releases allocated resources for PDF-document.
        // Deferred right after a successful Open so it runs when main returns.
        defer pdf.Close()

        // ExtractText() returns PDF-document contents as plain text
        txt, err := pdf.ExtractText()
        if err != nil {
            log.Fatal(err)
        }
        // Print the extracted text to the console
        fmt.Printf("Extracted text:\n%s\n", txt)
    }
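
Once the text has been extracted, it can be searched like any other string. The following is a minimal sketch of that idea, using only the Go standard library. The `Match` type and the `findLines` helper are hypothetical names introduced for illustration and are not part of the Aspose.PDF API. The sample string stands in for the value returned by ExtractText.

```go
package main

import (
	"fmt"
	"strings"
)

// Match is a hypothetical helper type: one line of extracted text
// that contains the search term.
type Match struct {
	Line int    // 1-based line number within the extracted text
	Text string // the matching line, trimmed of surrounding whitespace
}

// findLines returns every line of txt that contains term, ignoring case.
func findLines(txt, term string) []Match {
	var matches []Match
	needle := strings.ToLower(term)
	for i, line := range strings.Split(txt, "\n") {
		if strings.Contains(strings.ToLower(line), needle) {
			matches = append(matches, Match{Line: i + 1, Text: strings.TrimSpace(line)})
		}
	}
	return matches
}

func main() {
	// Stand-in for the string returned by pdf.ExtractText().
	txt := "Invoice 2024-001\nTotal due: 120.00 EUR\nThank you for your business."

	for _, m := range findLines(txt, "total") {
		fmt.Printf("line %d: %s\n", m.Line, m.Text)
	}
}
```

In a real program, `txt` would be the result of `pdf.ExtractText()` from the snippet above. The search is a plain case-insensitive substring match, chosen for simplicity; a regular expression could be swapped in where more precise matching is needed.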