cainesap / wordtracker

Tracking changes in word use through comparison of reference and web corpora


WordTracker

Tracking changes in word use through comparison of reference and web corpora.

Step 1, extractPlainText.sh: extracts the texts from the BNC XML files, together with date of publication (where available), text class (written or spoken; academic, discussion, fiction, etc.), and document title. Requires the XSL files justTheWords.xsl (from the BNC distribution) and metadata.xsl (self-authored).
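As a rough illustration of what such an extraction step might look like, here is a minimal sketch. It is not the repository's extractPlainText.sh. It assumes that xsltproc (from libxslt) is installed, that both stylesheets sit in the working directory and emit plain text, and that the BNC XML files live under a directory named by BNC_DIR. The directory names, the output suffixes, and the helper functions out_path, extract_one and extract_all are all hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch only; the real extractPlainText.sh may work differently.
# Assumes xsltproc is available and both stylesheets produce plain-text output.

BNC_DIR="${BNC_DIR:-./BNC/Texts}"   # assumed location of the BNC XML files
OUT_DIR="${OUT_DIR:-./plaintext}"   # assumed output directory

# Map an input XML path to an output path in OUT_DIR with the given suffix.
out_path() {
    base=$(basename "$1" .xml)
    printf '%s/%s.%s\n' "$OUT_DIR" "$base" "$2"
}

# Run both stylesheets over one BNC file: words to .txt, metadata to .meta.
extract_one() {
    xml="$1"
    xsltproc justTheWords.xsl "$xml" > "$(out_path "$xml" txt)"
    xsltproc metadata.xsl "$xml" > "$(out_path "$xml" meta)"
}

# Walk the corpus directory and process every XML file found.
extract_all() {
    mkdir -p "$OUT_DIR"
    find "$BNC_DIR" -name '*.xml' | while IFS= read -r xml; do
        extract_one "$xml"
    done
}
```

The block only defines functions; calling extract_all after setting BNC_DIR and OUT_DIR would run the extraction. Keeping the text and the metadata in separate files per document is a design choice of this sketch, not something the original script is known to do.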

Andrew Caines, apc38 at cam.ac.uk, November 2017

Languages

XSLT 68.3%, Shell 31.7%