rahul8590 / cmp646_ir

All my codebase for IR Class CMP646

p0

Running parse.py on the medium set gives a token count of 193981880. Counting the same set with cat piped into wc -w gives the same token count, 193981880.
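The agreement with wc -w suggests the count is of whitespace-delimited words. The sketch below is not the repository's parse.py; it is a minimal illustration, assuming tokens are runs of non-whitespace bytes (which is how wc -w counts in the C locale). The name `count_words` and the `chunk_size` parameter are hypothetical. Reading in chunks keeps memory use flat on large OCR files.

```python
import io


def count_words(stream, chunk_size=1 << 20):
    """Count whitespace-delimited tokens in a binary stream, one chunk at a time.

    Assumption: a token is a maximal run of non-whitespace bytes,
    which mirrors what `wc -w` reports in the C locale.
    """
    total = 0
    in_word = False  # True if the previous chunk ended in the middle of a word
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pieces = chunk.split()  # bytes.split() with no argument splits on ASCII whitespace
        total += len(pieces)
        # If the previous chunk ended mid-word and this chunk starts with a
        # non-whitespace byte, its first piece continues that word, so it was
        # counted twice; undo one count.
        if in_word and pieces and not chunk[:1].isspace():
            total -= 1
        in_word = not chunk[-1:].isspace()
    return total


if __name__ == "__main__":
    sample = io.BytesIO(b"hello  world\nfoo")
    print(count_words(sample, chunk_size=4))
```

For a file on disk, open it in binary mode and pass the handle, for example `count_words(open(path, "rb"))`.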

Instructions to Run

Copy parse.py into the top-level directory, the one whose subdirectories contain all the files, and run it from there.

Sample output

for /rahul_extra/books-medium/FCE3DE743CEC29EC/annalsofmusicina00laheuoft_ocrml.xml token count => 193677928
for /rahul_extra/books-medium/F40282EF2829178C/biologyintroduct00connrich_ocrml.xml token count => 193816296
for /rahul_extra/books-medium/66E683B4EBBF3C70/broadbroadoceans00jonerich_ocrml.xml token count => 193981880
the final token count is 193981880
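In the sample output, the per-file numbers increase and the last one equals the final count, so they appear to be running totals rather than per-file counts. The sketch below reproduces that output shape under that assumption. It is not the repository's parse.py: the function name `count_tree`, the `suffix` filter, and whitespace splitting as the tokenizer are all illustrative choices.

```python
import os


def count_tree(root, suffix=".xml"):
    """Walk every subdirectory of `root` and print a running token total per file.

    Assumptions (not confirmed by the source): only files ending in `suffix`
    are counted, and a token is any whitespace-delimited run of bytes.
    """
    running_total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Reads the whole file into memory; fine for a sketch, but a
            # chunked reader would be safer for very large files.
            with open(path, "rb") as handle:
                running_total += len(handle.read().split())
            print(f"for {path} token count => {running_total}")
    print(f"the final token count is {running_total}")
    return running_total


if __name__ == "__main__":
    # Run from the top-level directory that holds the per-book subdirectories.
    count_tree(".")
```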


Languages

Python 57.3%, TeX 42.5%, Shell 0.2%