Anton Prokopyev's repositories
amazon-reviews
LASSO + XGBoost + text2vec ensemble to predict sentiment in R
franz-plugins
Franz Plugin Repository
are-bots-more-emotional
Sentiment Analysis of Twitter Spam
Business-Analyst-Nanodegree
Udacity Business Analyst Nanodegree
Severity_models_light_gbm
Modelling average claim cost using LightGBM, written as a tutorial on this algorithm
twitter_scraping
Grab all of a user's tweets (and get past the 3200-tweet limit)
gis-projects
GIS and Remote Sensing repo
active_stream
Active learning support for targeted Twitter stream
datacleaner
A Python tool that automatically cleans data sets and readies them for analysis.
facebook-page-post-scraper
Data scraper for Facebook Pages, plus the code accompanying the blog post "How to Scrape Data From Facebook Page Posts for Statistical Analysis"
GBM-tune
Tuning GBMs (hyperparameter tuning) and impact on out-of-sample predictions
gunsales
Statistical analysis of monthly background checks of gun purchases
kaggle-house-prices
House Prices: Advanced Regression Techniques
Kaggle-Quora
Kaggle Quora Question Pairs Competition
kaggle-quora-dup
Solution to Kaggle's Quora Duplicate Question Detection Competition
kaggle-quora-question-pairs
My solution to the Kaggle Quora Question Pairs competition (Top 2%, Private LB log loss 0.13497).
Machine-Learning-Engineer-Nanodegree
Projects done on Udacity: Machine Learning Engineer Nanodegree
messenger-platform-samples
Messenger Platform samples for sending and receiving messages. Walk through the Get Started guide with this code: https://developers.facebook.com/docs/messenger-platform/quickstart
nics-firearm-background-checks
Monthly data from the FBI's National Instant Criminal Background Check System, converted from PDF to CSV.
NLP_HSE_school
HSE MIEM NLP school project
timeseries-rady
Time-series analysis in R
wikipedia-word-frequency
Gather modern English word frequencies from all enwiki articles.
yelp-challenge-2017
Project for a Data Mining and Predictive Analytics Class