David McClure's repositories
open-syllabus-project
What can be learned from 1M+ college course syllabi? (OLD)
svg-to-wkt
Convert SVG to WKT for use on maps.
literary-interior
Surveying the literary interior.
pyspark-deploy
Lightweight Spark + Python cluster deployment.
lint-analysis
Analysis rig for literary interior.
sentence-ordering
Sentence ordering.
gutenberg-catalog
JSON dump of the Project Gutenberg catalog.
bloom-canon
Bloom's canon, CSV + JSON.
name-dataset
Probably the biggest dataset of names, worldwide.
pull-twitter-followers
Harvest Twitter account followers, via RQ + SQLite.
pyspark-deploy-example
Example setup / test driver for pyspark-deploy.
twitter-ext
Twitter + Spark
better_profanity
Blazingly fast cleaning of swear words (and their leetspeak) in strings.
docker-ami
Build a base Docker AMI.
Named-Entity-Recognition-NER-Papers
An elaborate and exhaustive paper list for Named Entity Recognition (NER).
tf-aws-vpc
VPC + internet gateway + subnet + route table.
us-pop-density
US population density metrics.