There are 0 repositories under the pdf-analysis topic.
PDF Analysis: Extracts words and their frequencies from PDF files, preparing text data for topic analysis of annual reports of German car manufacturers (e.g. Volkswagen, Porsche and Audi). Please note that words are only extracted; stemming is not applied. To improve this, use nltk.stem.snowball.SnowballStemmer('german'), for example.
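As a rough illustration of the stemming suggestion, here is a minimal sketch and not the repository's actual code. It assumes the text has already been extracted from the PDF into a string; the function name `word_frequencies` and the sample sentence are hypothetical. The stemmer is passed in as an optional callable, so the counting also works when NLTK is not installed.

```python
import re
from collections import Counter


def word_frequencies(text, stem=None):
    """Count lower-cased words in text (hypothetical helper).

    If stem is given, each word is mapped through it before counting,
    so inflected forms that share a stem are counted together.
    """
    # Runs of Unicode letters only; digits and underscores are excluded.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if stem is not None:
        words = [stem(word) for word in words]
    return Counter(words)


# Assumption: NLTK is optional here. If it is missing, counts stay unstemmed.
try:
    from nltk.stem.snowball import SnowballStemmer
    german_stem = SnowballStemmer("german").stem
except ImportError:
    german_stem = None

# Placeholder text standing in for the content of an annual report.
sample = "Die Umsätze stiegen, der Umsatz je Fahrzeug stieg ebenfalls."
for word, count in word_frequencies(sample, stem=german_stem).most_common(5):
    print(word, count)
```

Passing the stemmer as a parameter keeps the counting step independent of NLTK, which makes it easy to compare stemmed and unstemmed frequencies on the same text.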
Arch Linux packaged version of pdfid, the PDF analysis tool from Kali Linux. The original author is DidierStevensSuite; his license applies.