codetronaut / doc_tag_test

This tool searches a hierarchy of PDF files for one or more keywords and generates an HTML report of the results.


Doc-Tag

Description

This tool searches for a given word within the PDF files in a specific directory. It accepts one or more keywords and generates an HTML report describing what was found.
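The search described above can be sketched roughly as follows. This is a minimal illustration, not the code in Doc_Tag.py: the names `find_pdfs`, `count_keywords`, and `search_tree` are hypothetical, and `extract_text` is assumed to be any callable that turns a PDF path into a string (for example, a thin wrapper around PDFMiner).

```python
import os
import re


def find_pdfs(root):
    """Yield the path of every .pdf file under root, including subdirectories."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(".pdf"):
                yield os.path.join(dirpath, name)


def count_keywords(text, keywords):
    """Return {keyword: number of case-insensitive occurrences in text}."""
    return {
        kw: len(re.findall(re.escape(kw), text, flags=re.IGNORECASE))
        for kw in keywords
    }


def search_tree(root, keywords, extract_text):
    """Map each PDF path to its keyword counts, skipping files with no hits.

    extract_text is an assumed helper: a callable taking a PDF path and
    returning the document text as a string.
    """
    results = {}
    for path in find_pdfs(root):
        counts = count_keywords(extract_text(path), keywords)
        if any(counts.values()):
            results[path] = counts
    return results
```

Passing the text extractor in as an argument keeps the directory walk and the matching logic independent of any particular PDF library.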

How To Run This Project

Since the project uses Python 2.7, I recommend running it with a REPL for Python 2.7.

Run the Application

# Install Python 2.7 
$ sudo apt install python2
$ python2 --version

# Clone to your local drive
$ git clone https://github.com/codetronaut/doc_tag_test.git

# Move into the project directory
$ cd doc_tag_test

# Run
$ python2 Doc_Tag.py

Note: You can also change the directory to be searched. Right now it is hard-coded to the PDFs directory.
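One way to avoid editing the hard-coded path is to read the directory from the command line. The sketch below is only an illustration under assumptions: the `--directory` option and the `parse_args` function are hypothetical and are not part of the current Doc_Tag.py. The default stays at `PDFs` so that running without arguments behaves as before.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Search PDF files for keywords and write an HTML report."
    )
    # Hypothetical option: the directory to search, defaulting to PDFs.
    parser.add_argument(
        "-d",
        "--directory",
        default="PDFs",
        help="directory to search for PDF files (default: PDFs)",
    )
    return parser.parse_args(argv)
```

With something like this in place, the search directory would come from `parse_args().directory` instead of a constant in the source.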

Tools Used:

It is written entirely in Python with the help of two key modules: PDFMiner and Markdown.
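To give a rough idea of the report step, here is a small sketch that turns search results into an HTML table. It deliberately swaps in the Python 3 standard library (`html.escape`) in place of the Markdown module the project uses, so it shows the idea only and is not the project's actual report code. The function name `render_report` and the `{pdf_path: {keyword: count}}` input shape are assumptions.

```python
import html


def render_report(results):
    """Render {pdf_path: {keyword: count}} as a simple HTML page."""
    rows = []
    for path in sorted(results):
        for keyword, count in sorted(results[path].items()):
            # Escape file names and keywords so they cannot break the markup.
            rows.append(
                "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                    html.escape(path), html.escape(keyword), count
                )
            )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>Doc-Tag report</title></head>\n'
        "<body>\n<table>\n"
        "<tr><th>File</th><th>Keyword</th><th>Matches</th></tr>\n"
        + "\n".join(rows)
        + "\n</table>\n</body></html>\n"
    )
```

The returned string can be written to a file such as `report.html` and opened in a browser.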

Contribution:

Willing to contribute? Let's get started. The main issues to be resolved are:

  • Porting the current version (Python 2.7) to Python 3.x.
  • Restoring support for some modules that are deprecated (for Python 2.7).
  • Refactoring the code and adding more comments so that newcomers can understand it easily.
  • Send me any other ideas you have :)

Peace!


License: BSD 3-Clause "New" or "Revised" License


Languages

Python 54.2%, HTML 44.6%, CSS 0.8%, TeX 0.2%, Makefile 0.2%, Shell 0.1%