groceryheist's repositories
afew
An initial tagging script for notmuch mail
articlequality
A library for performing automatic detection of assessment classes of Wikipedia article text.
cdsc_reddit
Pipeline for building research datasets from Reddit pushshift dumps
dec_datetime
A webpage showing the current date according to a 365-day year and the current decimal time (a day of 10 hours, each with 100 minutes).
identifying_competition_and_mutualism_on_reddit
Reproduction code for "Identifying Competition and Mutualism between Online Groups"
identifying_competition_and_mutualism_reddit_overleaf
R+LaTeX knitr source code for building the article: TeBlunthuis, N., & Hill, B. M. (2022). Identifying competition and mutualism between online groups. International AAAI Conference on Web and Social Media (ICWSM 2022), 16, 993–1004. https://doi.org/10.1609/icwsm.v16i1.19352
myconfigs
groceryheist's config files
python-mediawiki-utilities
A set of utilities for accessing and processing MediaWiki data.
RemembR
A simple utility that manages a collection of objects and saves them to disk. Useful in reproducible research workflows that require caching intermediate results from expensive computations or robustness checks and loading them in knitr. Interoperable with the pyRemembeR Python package to support projects that use both languages. Uses file locking so that multiple threads or processes can operate on the same cache.
brms
brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
editquality
Supervised learning approach to determining the quality of edits in Wikipedia
mittens
A fast implementation of GloVe, with optional retrofitting
mw_revert_tool_detector
Utility for figuring out which tool was used to revert changes to MediaWiki wikis.
presto-ethereum
Presto Ethereum Connector
pyRembr
A simple utility for data scientists to save intermediate objects from either R or Python. Useful in reproducible research workflows that require caching intermediate results from expensive computations or robustness checks and loading them in knitr. Interoperable with the pyRemembeR Python package to support projects that use both languages. Uses file locking so that multiple threads or processes can operate on the same cache.
python-mwxml
A set of utilities for processing MediaWiki XML dump data.
scikit-learn
scikit-learn: machine learning in Python
timeseries-bootcamp
SJMC Time Series Boot Camp - Materials by Josephine Lukito and Jordan Foley
trawlDiversity
Long-term trends in regional species richness
twitter_scripts
Repo for some scripts I've used to get data from Twitter. Built on tweepy.
UWBotThings
The Patron Saint of inconsequential things