thomassrob / linkedin-pdf-parsing

Parsing resumes in PDF format from LinkedIn

linkedin pdf parsing

Parses LinkedIn resumes exported as PDF. The script takes a folder of PDF files, scans each one for its Experience and Education sections, extracts the data found there, and writes it to a database with the following structure:

[Image: database structure diagram]
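
The scan for the Experience and Education sections can be pictured with a minimal sketch. It is not the script's actual code: it assumes the PDF text has already been extracted (for example with PDFMiner), that each section heading sits on a line of its own, and that a small hypothetical list of other headings (OTHER_HEADINGS) marks where a section ends. The function name split_sections is also made up for illustration.

```python
# Hypothetical sketch: the real script's parsing rules are not documented here.
WANTED = ("Experience", "Education")

# Assumed list of other headings that end a section; adjust to the real PDFs.
OTHER_HEADINGS = ("Summary", "Skills", "Languages", "Certifications")


def split_sections(text):
    """Collect the non-empty lines that follow each wanted heading."""
    sections = {name: [] for name in WANTED}
    current = None  # name of the section being read, or None
    for raw in text.splitlines():
        line = raw.strip()
        if line in WANTED:
            current = line          # a wanted section starts here
        elif line in OTHER_HEADINGS:
            current = None          # any other heading closes the section
        elif current and line:
            sections[current].append(line)
    return sections


# Made-up sample text standing in for the output of a PDF text extractor.
sample = """Jane Doe
Summary
Engineer
Experience
Developer at Acme
2010 - 2014
Education
MSc, Example University
Skills
Python
"""

sections = split_sections(sample)
```

Matching headings by exact line keeps the sketch short. Real LinkedIn exports may need looser matching, since headings can vary by language or layout.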

Requirements

Python 2.7

PDFMiner

Usage

 script.py -i inputfolder -o outputfile

The script searches 'inputfolder' for PDF files and creates a database at the path given by 'outputfile'.

Example usage:

python path/to/script.py -i home/mypdfs -o home/mydb.db
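
How the -i and -o options might be handled is sketched below. This is an illustration under assumptions, not the code in script.py: it assumes the output is an SQLite file, and the table name resumes, its single filename column, and the helper names parse_args, find_pdfs and create_db are all hypothetical. The actual database structure is the one shown in the diagram, which this sketch does not try to reproduce.

```python
import argparse
import glob
import os
import sqlite3


def parse_args(argv=None):
    """Read the -i (input folder) and -o (output file) options."""
    parser = argparse.ArgumentParser(description="Parse LinkedIn PDF resumes.")
    parser.add_argument("-i", dest="inputfolder", required=True,
                        help="folder containing the PDF files")
    parser.add_argument("-o", dest="outputfile", required=True,
                        help="path of the database to create")
    return parser.parse_args(argv)


def find_pdfs(folder):
    """Return the PDF files directly inside folder, sorted by name.

    The pattern is case-sensitive on most systems, so '*.PDF' is not matched.
    """
    return sorted(glob.glob(os.path.join(folder, "*.pdf")))


def create_db(path, pdf_paths):
    """Create the database file and record one row per PDF found.

    Assumption: SQLite with a hypothetical one-column table.
    """
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS resumes (filename TEXT)")
    conn.executemany(
        "INSERT INTO resumes (filename) VALUES (?)",
        [(os.path.basename(p),) for p in pdf_paths],
    )
    conn.commit()
    conn.close()


# Parsing the example command line from above, without touching the disk.
args = parse_args(["-i", "home/mypdfs", "-o", "home/mydb.db"])
```

A full run would then call create_db(args.outputfile, find_pdfs(args.inputfolder)), with the extraction step filling in the real tables.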

License: GNU General Public License v2.0
