Table recognition

Aspose.OCR for .NET now provides a dedicated API for detecting table layout and recognizing table data in images, scanned documents, screenshots, and photos. To extract table text, call the universal Aspose.OCR.AsposeOcr.Recognize method with the DetectAreasMode recognition setting set to DetectAreasMode.TABLE.

This method accepts an OcrInput object and optional recognition settings.

Recognition results are returned as a list of Aspose.OCR.RecognitionResult objects. Each result contains the extracted table data and the detected regions, and can be exported to various formats. Additionally, you can retrieve the table’s row and column structure by calling the GetTableData() method on the returned results.

DetectAreasMode.TABLE and GetTableData

The following code example shows how to extract text from a table and get its row and column structure:

Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add images to OcrInput object
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage);
input.Add("source1.png");
input.Add("source2.jpg");
// Configure recognition settings for table
Aspose.OCR.RecognitionSettings settings = new Aspose.OCR.RecognitionSettings();
settings.DetectAreasMode = Aspose.OCR.DetectAreasMode.TABLE;
// Recognize tables in the images, passing the settings object configured above
Aspose.OCR.OcrOutput results = recognitionEngine.Recognize(input, settings);
// Get the page / row / cell structure of the recognized tables
// (OCRTable and the related types are assumed to be in scope through a using directive)
OCRTable table = results.GetTableData();
foreach (OCRTablePage p in table.Pages)
{
    Console.WriteLine($"page {p.PageIndex}");
    foreach (OCRTableRow r in p.Rows)
    {
        Console.Write($"row {r.RowIndex}\t");
        foreach (OCRTableCell c in r.Cells)
        {
            Console.Write($"col {c.ColumnIndex}  {c.Text} \t");
        }

        Console.WriteLine();
    }
}
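
Once the cell text has been read, a common next step is to save it as CSV. The following is a minimal sketch of that step and is not part of Aspose.OCR: TableCsv, EscapeField, and ToCsv are hypothetical helper names, and the sample rows are made up for illustration. It works on plain strings, so it runs without the library, and it quotes fields in the way RFC 4180 describes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an Aspose.OCR type): turns rows of cell text into CSV.
public static class TableCsv
{
    // Quote a field when it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes.
    public static string EscapeField(string value)
    {
        value = value ?? string.Empty;
        bool needsQuotes = value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + value.Replace("\"", "\"\"") + "\"" : value;
    }

    // Join the cells of each row with commas and the rows with line breaks.
    public static string ToCsv(IEnumerable<IEnumerable<string>> rows)
    {
        return string.Join(
            Environment.NewLine,
            rows.Select(row => string.Join(",", row.Select(cell => EscapeField(cell)))));
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data standing in for recognized cell text.
        var rows = new List<List<string>>
        {
            new List<string> { "Item", "Price" },
            new List<string> { "Widget, large", "4.50" }
        };

        Console.WriteLine(TableCsv.ToCsv(rows));
        // Prints:
        // Item,Price
        // "Widget, large",4.50
    }
}
```

With the types from the example above, the rows of one page could be supplied as p.Rows.Select(r => r.Cells.Select(c => c.Text)), assuming the Rows and Cells collections can be used with LINQ.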