Rud5G / extract-pdf-content

A data extraction example showing how to get a PDF's content.

Extract PDF Content

This repository contains examples showing how to use PDF.js together with Lodash to extract data from a PDF.

There are two example applications: a web application to ease data exploration, and a CLI application to ease data entry from a Node.js application.

NOTE: These are prototypes for further exploration and will need to be customised to a specific use case.
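
As a rough illustration of the kind of post-processing involved, the sketch below groups text items into lines of text. It is not taken from this repository. It assumes items shaped like those returned by PDF.js's `page.getTextContent()`, where each item carries a `str` and a `transform` array whose fifth and sixth entries are the x and y positions. The function name `groupItemsIntoLines`, the `tolerance` parameter, and the sample items are all made up for illustration. Plain array methods are used so the sketch has no dependencies; Lodash helpers such as `_.groupBy` and `_.sortBy` could do the same job.

```javascript
// Hypothetical helper: group PDF.js-style text items into lines of text.
// Each item is assumed to look like { str, transform }, where transform[4]
// is the x position and transform[5] is the y position on the page.
function groupItemsIntoLines(items, tolerance = 2) {
  const rows = [];
  for (const item of items) {
    const x = item.transform[4];
    const y = item.transform[5];
    // Reuse an existing row if this item sits at (almost) the same height.
    let row = rows.find((r) => Math.abs(r.y - y) <= tolerance);
    if (!row) {
      row = { y, cells: [] };
      rows.push(row);
    }
    row.cells.push({ x, str: item.str });
  }
  // PDF coordinates start at the bottom-left, so a larger y is higher up.
  rows.sort((a, b) => b.y - a.y);
  // Within each row, order the cells left to right and join their text.
  return rows.map((row) =>
    row.cells
      .sort((a, b) => a.x - b.x)
      .map((cell) => cell.str)
      .join(" ")
  );
}

// Made-up sample items, shaped like the `items` array of getTextContent().
const sampleItems = [
  { str: "Total:", transform: [1, 0, 0, 1, 50, 700] },
  { str: "Invoice", transform: [1, 0, 0, 1, 50, 750] },
  { str: "42.00", transform: [1, 0, 0, 1, 200, 700.5] },
  { str: "#1001", transform: [1, 0, 0, 1, 120, 750] },
];

const lines = groupItemsIntoLines(sampleItems);
console.log(lines); // → ["Invoice #1001", "Total: 42.00"]
```

The tolerance matters because items on the same visual line often differ slightly in their y position. A real document will likely need a different value, or a different grouping rule altogether.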

Usage

Choose between:

  • a CLI implementation, ideally set up and used on a server (requires Node.js to be installed):
    cd server/
    npm install
    node index.js
  • a web application implementation (open the app folder, then open index.htm in a web browser)

License

Apache-2.0

Languages

HTML 71.2%, JavaScript 28.8%