Extract Tables from PDF in Node.js
Extract tables while converting PDF to CSV files
Convert PDF to CSV
If a PDF contains tables, they are saved in separate CSV files. To extract the tables from a PDF document, use the AsposePdfTablesToCSV function. The following code snippets show how to convert a PDF file in a Node.js environment.
CommonJS:
- Call require and import the asposepdfnodejs module as the AsposePdf variable.
- Specify the name of the PDF file that will be converted.
- Call AsposePdf as a Promise and perform the conversion. Receive the object if successful.
- Call the function AsposePdfTablesToCSV.
- Convert the PDF file. If json.errorCode is 0, the result is saved in CSV files named from the template "ResultPdfTablesToCSV{0:D2}.csv", and json.filesNameResult holds the resulting file names. If json.errorCode is not 0, the error information is contained in json.errorText.
const AsposePdf = require('asposepdfnodejs');
const pdf_file = 'Aspose.pdf';
AsposePdf().then(AsposePdfModule => {
  /*Convert a PDF-file to CSV (extract tables) with template "ResultPdfTablesToCSV{0:D2}.csv" ({0}, {0:D2}, {0:D3}, ... format page number), TAB as delimiter and save*/
  const json = AsposePdfModule.AsposePdfTablesToCSV(pdf_file, "ResultPdfTablesToCSV{0:D2}.csv", "\t");
  console.log("AsposePdfTablesToCSV => %O", json.errorCode == 0 ? json.filesNameResult : json.errorText);
});
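The file name template in the snippet above uses a page-number placeholder ({0}, {0:D2}, {0:D3}, ...). The following is a minimal sketch of how such a template could map to file names, assuming the library follows the usual convention where Dn means zero-padding to n digits. The expandTemplate helper is hypothetical and is not part of asposepdfnodejs; it only illustrates the naming pattern.

```javascript
// Hypothetical helper, NOT part of asposepdfnodejs.
// Replaces "{0}" or "{0:Dn}" with the page number, zero-padded to n digits.
// Assumption: "Dn" means "decimal, at least n digits", as in .NET-style format strings.
function expandTemplate(template, pageNumber) {
  return template.replace(/\{0(?::D(\d+))?\}/g, (_match, width) =>
    String(pageNumber).padStart(width ? Number(width) : 0, '0')
  );
}

// Predict the file name for page 3 with the template used above.
console.log(expandTemplate('ResultPdfTablesToCSV{0:D2}.csv', 3));
// prints "ResultPdfTablesToCSV03.csv"
```

In practice the names reported in json.filesNameResult are the authoritative ones; a helper like this is only useful for anticipating where the output will land.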
ECMAScript/ES6:
- Import the asposepdfnodejs module.
- Specify the name of the PDF file that will be converted.
- Initialize the AsposePdf module. Receive the object if successful.
- Call the function AsposePdfTablesToCSV.
- Convert the PDF file. If json.errorCode is 0, the result is saved in CSV files named from the template "ResultPdfTablesToCSV{0:D2}.csv", and json.filesNameResult holds the resulting file names. If json.errorCode is not 0, the error information is contained in json.errorText.
import AsposePdf from 'asposepdfnodejs';
const AsposePdfModule = await AsposePdf();
const pdf_file = 'Aspose.pdf';
/*Convert a PDF-file to CSV (extract tables) with template "ResultPdfTablesToCSV{0:D2}.csv" ({0}, {0:D2}, {0:D3}, ... format page number), TAB as delimiter and save*/
const json = AsposePdfModule.AsposePdfTablesToCSV(pdf_file, "ResultPdfTablesToCSV{0:D2}.csv", "\t");
console.log("AsposePdfTablesToCSV => %O", json.errorCode == 0 ? json.filesNameResult : json.errorText);
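Both snippets check json.errorCode inline and then leave the extracted files on disk. As a rough sketch of what post-processing might look like, the helpers below turn the result object into either a value or a thrown Error, and split tab-delimited text into rows. Both unwrapResult and parseDelimited are hypothetical names, not library functions. The result objects here are mocks that use only the errorCode, errorText and filesNameResult fields shown above, and the parser assumes cells contain no quotes, tabs or line breaks.

```javascript
// Hypothetical helper: returns filesNameResult on success, throws otherwise.
// Relies only on the fields used in the snippets above.
function unwrapResult(json) {
  if (json.errorCode !== 0) {
    throw new Error(`AsposePdfTablesToCSV failed (${json.errorCode}): ${json.errorText}`);
  }
  return json.filesNameResult;
}

// Hypothetical helper: splits delimited text into an array of rows,
// each row being an array of cell strings. No quoted-field handling.
function parseDelimited(text, delimiter = '\t') {
  return text
    .split(/\r?\n/)
    .filter(line => line.length > 0) // drop empty lines, e.g. a trailing newline
    .map(line => line.split(delimiter));
}

// Mock result object standing in for a real call to AsposePdfTablesToCSV.
const mockResult = { errorCode: 0, errorText: '', filesNameResult: ['ResultPdfTablesToCSV01.csv'] };
console.log(unwrapResult(mockResult));

// Mock file content; real content would be read from the files listed above.
const sample = 'Name\tQty\nBolt\t12\n';
console.log(parseDelimited(sample));
```

Throwing on a non-zero error code keeps the success path free of conditionals, which can be convenient when the conversion is one step in a longer pipeline.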