Scan directories for MS Word documents
The script in this repository crawls through directories, looks for MS Word documents, extracts their content, and prints it in the browser.
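The crawl-and-extract flow described above can be sketched roughly as follows. This is an illustration in Python using only the standard library, not the repository's actual code: the function names `find_word_documents` and `extract_docx_text` are hypothetical, and the sketch only handles `.docx` files, which are ZIP archives holding the body text in `word/document.xml`.

```python
import os
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def find_word_documents(root_dir, extensions=(".docx",)):
    """Yield paths of files under root_dir (recursively) with a matching extension.

    Hypothetical helper; `extensions` must be a tuple of lowercase suffixes.
    """
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if name.lower().endswith(extensions):
                yield os.path.join(dirpath, name)


def extract_docx_text(path):
    """Return the text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")
    tree = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in tree.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> run elements.
        runs = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

A caller would loop over `find_word_documents(some_root)` and pass each path to `extract_docx_text`, then render the returned string however the front end requires.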
Remember to replace the Windows \ with / in the paths if you're running the script on Linux.
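As a rough illustration of the separator change, here is a small Python sketch that converts a backslash-separated path to forward slashes. The helper name `to_posix_path` is hypothetical and not part of this repository; the sketch relies on the standard library's `pathlib.PureWindowsPath`, which parses Windows-style paths on any operating system.

```python
from pathlib import PureWindowsPath


def to_posix_path(windows_style_path):
    """Return the path with Windows backslashes replaced by forward slashes."""
    # PureWindowsPath only manipulates the string; it never touches the disk.
    return PureWindowsPath(windows_style_path).as_posix()


print(to_posix_path("documents\\reports\\summary.docx"))
```

Building paths from their components with a path library, rather than hard-coding the separator, would avoid the manual edit altogether.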
Requirements
- a folder named /documents in the root directory, containing the documents to scan.
Known issues
- on Windows, the script can't output .doc files properly; it prints a string of random characters instead (for example: Y, B8L 1(IzZYrH9pd4n(KgVB,lDAeX)Ly5otebW3gp...).
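One possible explanation, which the source does not confirm, is that legacy .doc files are binary OLE2 compound files rather than ZIP/XML archives like .docx, so reading them as text yields unreadable bytes. The Python sketch below shows how a script could tell the two formats apart by their leading bytes before choosing how to read them. The function name `detect_word_format` is hypothetical.

```python
# Signature of an OLE2 compound file (legacy .doc, but also .xls and .ppt)
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Signature of a ZIP local file header (.docx, but also any other ZIP archive)
ZIP_MAGIC = b"PK\x03\x04"


def detect_word_format(path):
    """Guess the container format of a file from its first bytes.

    Returns "doc", "docx", or "unknown". The check identifies only the
    container, so it cannot prove the file is really a Word document.
    """
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header.startswith(OLE2_MAGIC):
        return "doc"
    if header.startswith(ZIP_MAGIC):
        return "docx"
    return "unknown"
```

Files reported as "doc" would then need a dedicated binary parser or an external converter instead of the text-based path used for .docx.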
Resources
- based on a Stack Overflow answer
TODO:
- create an interface that allows the upload of multiple forms;
- extract the recursive search into its own function;
- refactor the main class to allow scaling;
- add markup parser;
- add more supported files.