Jonathan Dunn's repositories
text_analytics
Basic text analytics and natural language processing in Python
corpus_similarity
Measure the similarity of text corpora for 74 languages
common_crawl_corpus
Scripts for building a geo-located web corpus using Common Crawl data
corpus_analysis
Exercise notebooks to accompany the text_analytics package
earthLings
Corpus-based language and dialect mapping
political_classification
Code from "Profile-based authorship analysis"
pacific_CodeSwitch
Code-switching detection for Pacific languages