Hello, world!

In this article, you will learn how to build a basic Node.js application that reads an image file and extracts the text from it with Aspose.OCR for Node.js via C++.

You will need

  • A computer with Node.js 14 or later.
  • Any text editor.
  • An image containing text. You can simply download the one from this article.
  • 15 minutes of spare time.
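
If you are not sure which Node.js version is installed, a short check like the one below can confirm it before you start. This is only a minimal sketch: meetsMinimumNode is a hypothetical helper name chosen for illustration and is not part of Aspose.OCR or Node.js.

```javascript
// Minimal sketch: check that the running Node.js meets a minimum major version.
// meetsMinimumNode is a hypothetical helper, not part of any library.
function meetsMinimumNode(version, minimumMajor) {
  // process.versions.node looks like "18.17.1"; the major part precedes the first dot.
  const major = Number.parseInt(version.split(".")[0], 10);
  return Number.isInteger(major) && major >= minimumMajor;
}

if (!meetsMinimumNode(process.versions.node, 14)) {
  console.error(`Node.js 14 or later is required, found ${process.versions.node}`);
  process.exitCode = 1;
}
```

Running node --version in the command prompt gives the same information without any code.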

Preparing

  1. Create a directory on your system to hold the project files, for example, C:\Aspose-OCR-Example\.
    This directory is referred to below as the project directory.

  2. Create a node_modules directory in the project directory.

  3. Download the Aspose.OCR for Node.js via C++ distribution package.

  4. Unpack the downloaded package to the aspose-ocr directory inside the node_modules directory.

  5. Download a sample image to the project directory under the name source.png:

    Source image
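
Before moving on, you may want to confirm that the package ended up where Node.js expects it. The sketch below uses the standard require.resolve function to test whether the import path used later in this article can be found. canResolve is a hypothetical helper name, and the snippet assumes it is run from a script in the project directory.

```javascript
// Hypothetical helper: report whether a module specifier can be resolved
// from the location of this script.
function canResolve(specifier) {
  try {
    require.resolve(specifier);
    return true;
  } catch (err) {
    // require.resolve throws when the module cannot be found.
    return false;
  }
}

if (!canResolve("aspose-ocr/lib/asposeocr")) {
  console.warn("aspose-ocr was not found; check that it was unpacked to node_modules/aspose-ocr");
}
```

If the warning appears, recheck steps 2 to 4 above.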

Coding

  1. Create an index.js file in the project directory. It will serve as the main (startup) script of the project.

  2. Import the Aspose.OCR and file system modules:

    const Module = require("aspose-ocr/lib/asposeocr");
    const fs = require("fs");
    
  3. Start recognition once the Aspose.OCR for Node.js via C++ module finishes loading:

    Module.onRuntimeInitialized = async _ => {
    }
    
  4. Load the image into the module’s virtual storage:

    fs.readFile("source.png", (err, imageData) => {
       // Stop if the image cannot be read
       if (err) throw err;
       const imageBytes = new Uint8Array(imageData);
       let internalFileName = "temp";
       let stream = Module.FS.open(internalFileName, "w+");
       Module.FS.write(stream, imageBytes, 0, imageBytes.length, 0);
       Module.FS.close(stream);
    });
    
  5. Add the image to the recognition batch:

    let source = Module.WasmAsposeOCRInput();
    source.url = internalFileName;
    let batch = new Module.WasmAsposeOCRInputs();
    batch.push_back(source);
    
  6. Specify the recognition language:

    let recognitionSettings = Module.WasmAsposeOCRRecognitionSettings();
    recognitionSettings.language_alphabet = Module.Language.ENG;
    
  7. Send the image for recognition:

    var result = Module.AsposeOCRRecognize(batch, recognitionSettings);
    
  8. Output the recognized text to the console:

    var text = Module.AsposeOCRSerializeResult(result, Module.ExportFormat.text);
    console.log(text);
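
The file-loading step can also be written with promises, which makes a missing or unreadable image easier to report. The following is a minimal sketch, not the library's own method: readImageBytes is a hypothetical helper built only on the standard fs.promises.readFile function.

```javascript
const fs = require("fs");

// Hypothetical helper: read a file and return its bytes as a Uint8Array.
// Rejects with a descriptive error if the file cannot be read.
async function readImageBytes(filePath) {
  try {
    const data = await fs.promises.readFile(filePath);
    return new Uint8Array(data);
  } catch (err) {
    throw new Error(`Cannot read image "${filePath}": ${err.message}`);
  }
}

// Example use (reports an error if source.png is missing):
readImageBytes("source.png").then(
  bytes => console.log(`Loaded ${bytes.length} bytes`),
  err => console.error(err.message)
);
```

The returned bytes can then be written to the virtual storage with Module.FS.write exactly as shown in step 4.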
    

Full listing (index.js)

const Module = require("aspose-ocr/lib/asposeocr");
const fs = require("fs");

Module.onRuntimeInitialized = async _ => {
   // Load image file
   fs.readFile("source.png", (err, imageData) => {
      // Stop if the image cannot be read
      if (err) throw err;
      // Save image to the virtual storage
      const imageBytes = new Uint8Array(imageData);
      let internalFileName = "temp";
      let stream = Module.FS.open(internalFileName, "w+");
      Module.FS.write(stream, imageBytes, 0, imageBytes.length, 0);
      Module.FS.close(stream);
      // Add image to recognition batch
      let source = Module.WasmAsposeOCRInput();
      source.url = internalFileName;
      let batch = new Module.WasmAsposeOCRInputs();
      batch.push_back(source);
      // Specify recognition language
      let recognitionSettings = Module.WasmAsposeOCRRecognitionSettings();
      recognitionSettings.language_alphabet = Module.Language.ENG;
      // Send image for OCR
      var result = Module.AsposeOCRRecognize(batch, recognitionSettings);
      // Output image text to the console
      var text = Module.AsposeOCRSerializeResult(result, Module.ExportFormat.text);
      console.log(text);
   });
}
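
If you prefer async/await over nested callbacks, the onRuntimeInitialized hook used in the listing above can be wrapped in a Promise. The sketch below is illustrative only: whenRuntimeReady is a hypothetical helper, and a stand-in object is used in place of the real module so the snippet runs on its own. It assumes the hook is assigned before the runtime finishes loading, as in the listing.

```javascript
// Hypothetical helper: wrap the onRuntimeInitialized callback in a Promise.
// mod is any module object that calls mod.onRuntimeInitialized() when ready,
// such as the one returned by require("aspose-ocr/lib/asposeocr").
function whenRuntimeReady(mod) {
  return new Promise(resolve => {
    mod.onRuntimeInitialized = () => resolve(mod);
  });
}

// Demonstration with a stand-in object instead of the real library.
const fakeModule = {};
const ready = whenRuntimeReady(fakeModule);
// The real runtime calls the hook itself; here the call is simulated.
setTimeout(() => fakeModule.onRuntimeInitialized(), 10);
ready.then(() => console.log("runtime ready"));
```

With the real module, the recognition code would go after awaiting whenRuntimeReady(Module) inside an async function.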

Running

  1. Open the command prompt and navigate to the project directory.
  2. Run the index.js script with the following command:
    node --no-experimental-fetch index
  3. Wait for recognition to complete. It may take a while depending on the image size and your system performance.

You will see the extracted text in the console output:

Hello. World! I can read this text

What’s next

Congratulations! You have performed OCR on an image and extracted the machine-readable text from it using Node.js.