openlawnz / openlawnz-parsers

PDF data extraction parsers that get published onto npm. Standalone, but run in conjunction with the openlawnz-pipeline.

openlawnz-parsers

This package is used by the openlawnz-pipeline during PDF conversion.

It is standalone so that it can be versioned and others can easily work on it.

Commands

npm install
npm run build
npm run build:watch
npm run test
npm run test:coverage
npm run lint

Input

Input is a JSON file produced by either:

  • PDF.js text output; or
  • Azure Cognitive Services OCR

See /testData/initFromConversion for an example input file.
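
As a rough illustration of the input step, the sketch below reads a conversion output file from disk and parses it as JSON. The helper names `loadConversionOutput` and `isJsonContainer` are hypothetical and are not part of this package's API. The schema of the input file is not described here, so the parsed value is deliberately left typed as `unknown`.

```typescript
import * as fs from "node:fs";

// Hypothetical helper (not part of openlawnz-parsers): reads a conversion
// output file and returns the parsed JSON. The input schema is not
// documented in this README, so the result is typed as `unknown`.
export function loadConversionOutput(filePath: string): unknown {
  const raw = fs.readFileSync(filePath, "utf8");
  // JSON.parse throws a SyntaxError if the file is not valid JSON.
  return JSON.parse(raw);
}

// Hypothetical narrowing helper: true for objects and arrays,
// false for primitives and null.
export function isJsonContainer(value: unknown): value is object {
  return typeof value === "object" && value !== null;
}
```

A caller would pass the path of one of the example files under /testData/initFromConversion and check the result with `isJsonContainer` before inspecting its fields.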

License: GNU General Public License v3.0

Languages

TypeScript 99.9%, JavaScript 0.1%