mediagestalt / Collocation

iPython notebook that generates collocational statistics for word-pairs in a multi-file directory.

Collocation

iPython notebook that determines collocates for documents in a multi-file directory. The complete version is found here: http://mediagestalt.com/thesis/Collocations.html

The data is from the Canadian House of Commons Parliamentary debates, published as Hansard. It can be downloaded as a zip file here: https://dataverse.library.ualberta.ca/dvn/dv/hansard

The data includes transcripts for the years 2006 to 2015 inclusive (Parliaments 39 to 41).

This repo can also be viewed in iPython notebook format at: http://nbviewer.ipython.org/github/mediagestalt/Collocations. Download the directory and explore the data in your own way.
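
As a starting point for that kind of exploration, the following is a minimal sketch of window-based collocation counting over a directory of text files. It is not the notebook's own code. The function names, the `*.txt` file pattern, the five-token window, and the mutual information formula (one common corpus-linguistics variant) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, window=5):
    """Count the words found within `window` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node occurrence itself
                counts[tokens[j]] += 1
    return counts


def mutual_information(tokens, node, window=5, min_count=1):
    """Score each collocate with MI = log2(observed / expected).

    Assumed formula: expected = f(node) * f(word) * span / N, where
    span = 2 * window and N is the number of tokens in the text.
    """
    freq = Counter(tokens)
    n = len(tokens)
    scores = {}
    for word, observed in collocates(tokens, node, window).items():
        if observed < min_count:
            continue
        expected = freq[node] * freq[word] * 2 * window / n
        scores[word] = math.log2(observed / expected)
    return scores


def collocates_in_directory(directory, node, window=5, pattern="*.txt"):
    """Sum collocate counts for `node` over every matching file in `directory`.

    The "*.txt" pattern is a guess; adjust it to the layout of your data.
    """
    totals = Counter()
    for path in sorted(Path(directory).glob(pattern)):
        text = path.read_text(encoding="utf-8", errors="ignore")
        totals.update(collocates(tokenize(text), node, window))
    return totals


if __name__ == "__main__":
    sample = "The privacy of citizens matters. Privacy rights protect citizens."
    print(collocates(tokenize(sample), "privacy", window=2).most_common(3))
```

To run it on the downloaded data, point `collocates_in_directory` at the unzipped folder and pass the node word of interest. Windows here are counted within each file, so a collocation never spans two documents.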

License: MIT License


Languages

Language: Jupyter Notebook 100.0%