TextExtractor Plugin

Overview

The Aspose.Pdf.LowCode.TextExtractor plugin extracts text from PDF documents. Configure the operation with TextExtractorOptions, add one or more input documents, and call Process.

Extract text from a PDF document

// For complete examples and data files, visit https://github.com/aspose-pdf/Aspose.PDF-for-.NET
private static void ExtractTextWithPlugin()
{
    // The path to the documents directory
    var dataDir = RunExamples.GetDataDir_AsposePdf_Text();

    // Create PDF TextExtractor plugin
    using (var plugin = new Aspose.Pdf.LowCode.TextExtractor())
    {
        // Configure the extraction options and add the input file
        var options = new Aspose.Pdf.LowCode.TextExtractorOptions();
        options.FormattingMode = Aspose.Pdf.Text.TextFormattingMode.Pure;
        options.AddInput(new Aspose.Pdf.LowCode.FileDataSource(dataDir + "ExtractText.pdf"));

        // Extract text
        var result = plugin.Process(options);

        // Get the extracted text from the first result as a string
        var extractedText = result.ResultCollection[0].ToString();
    }
}

Result

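Process returns a result object whose ResultCollection holds the extraction results. Call ToString() on an entry, as in the example above, to get the extracted text as a string.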