miodeqqq / testing-pdf_libs

Analyzing a set of Python PDF libraries for general performance and accuracy.

Python-PDF libraries performance tests

Performance checks for reading PDF files:

  • gathering the number of pages using Python libraries.
  • ... some day ...
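
A minimal sketch of how a page-count check could be timed, assuming each library is wrapped in a function that takes a PDF path and returns a page count. The names `naive_page_count` and `time_counter` are hypothetical and are not taken from this repository. The naive counter is only a standard-library stand-in for a real library call and will miscount PDFs that keep their page objects in compressed object streams.

```python
import re
import time
from pathlib import Path
from typing import Callable, Dict, List

# Matches "/Type /Page" but not "/Type /Pages" (the page-tree node).
_PAGE_OBJECT = re.compile(rb"/Type\s*/Page(?![A-Za-z])")


def naive_page_count(pdf_path: Path) -> int:
    """Hypothetical stand-in counter: counts page objects in the raw bytes.

    Not reliable for PDFs that use compressed object streams.
    """
    return len(_PAGE_OBJECT.findall(pdf_path.read_bytes()))


def time_counter(counter: Callable[[Path], int], pdf_dir: Path) -> List[Dict]:
    """Run `counter` on every *.pdf in `pdf_dir`; record result and duration."""
    rows = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        start = time.perf_counter()
        pages = counter(pdf_path)
        elapsed = time.perf_counter() - start
        rows.append({"file": pdf_path.name, "pages": pages, "seconds": elapsed})
    return rows
```

To compare libraries, one would pass a different `counter` per library to `time_counter` and compare the `pages` values (accuracy) and `seconds` values (performance) across the resulting rows.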

Current stable version: v1.0

Release date: 26.03.2019

Author:

Maciej Januszewski (maciek@mjanuszewski.pl)

Prerequisites:

  • First, start an Apache Tika server (required for the Tika-based tests):
docker pull logicalspark/docker-tikaserver
docker run -d -p 9998:9998 logicalspark/docker-tikaserver
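
Before starting the tests it can help to confirm that the container is actually answering on port 9998. The following is a small sketch using only the standard library. The function name `tika_is_up` is hypothetical, and it assumes the server exposes Tika Server's `/version` endpoint at the address mapped by the `docker run` command above.

```python
import urllib.error
import urllib.request


def tika_is_up(base_url: str = "http://localhost:9998", timeout: float = 3.0) -> bool:
    """Return True if a GET to <base_url>/version answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/version", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


if __name__ == "__main__":
    print("Tika server reachable:", tika_is_up())
```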

Running manually:

./run.py <path/to/pdfs_data/>

Sample plot outputs:

- Scatter plot (generated by plotly)

- Box plot (generated by plotly)

Languages

Python 99.7%, Shell 0.3%