Thom Su's repositories
Udemine-Scraper
Scrapes course descriptions and reviews from Udemy.
Box_Office_Success
Predict movie profitability from budget, genres, Facebook likes, and many other features using logistic regression, an assortment of tree-based models, and SVM.
Dress-Image-Recognition
Pattern Classification of Dresses
Mass-Shootings-Time-Series
Time series forecasting with SARIMA, VAR, Fast Fourier Transform, Exponential Smoothing, Prophet and LSTM Network on US gun violence incidents that result in multiple casualties.
TripAdvisor_Recommender
Singular Value Decomposition (SVD) recommender system built with the Surprise library
Twitter-Data-Visualization
Visual exploration of Democratic presidential candidate tweets' metadata.
Bike-Sharing-Analysis
Analyze one year of bike sharing data to uncover insights on casual (non-member) riders in order to formulate marketing strategies aimed at converting casual riders into annual members.
City-Employee-Payroll
Cross-analysis of city employee payroll in America's two largest cities, New York City and Los Angeles, with PySpark
Missing-Values-Experiment
A fun mini-experiment comparing the predictions of tree ensembles trained without missing-value replacement against those of logistic regression trained with median and mode imputation.
NQueens
Variation of the original n-queens problem in which the position of the first queen is fixed and the function must place the remaining queens so that no two queens share a row, column, or diagonal. Rather than only the standard chessboard, the board used for this simulation can be any size from 5x5 up to 10x10.
nyc-mhtn-ds-012819-lectures
Lecture repo for 012819 cohort
Online_Retail_w_PySpark
Using PySpark to perform EDA and Customer Segmentation.
SQuAD-Question-Answering
Predict the answer to a question given a context passage in which the answer may be found, using the Stanford Question Answering Dataset (SQuAD).
Style-Transfer-PyTorch
Generate images that match the style of another image.
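
The NQueens entry above describes a fixed-first-queen variant, which can be illustrated with plain backtracking. The following is only a minimal sketch under stated assumptions, not the code from that repository: the function name `solve_n_queens_fixed`, the zero-based `(row, col)` convention, and the choice to return a single solution (or `None`) are all hypothetical.

```python
from typing import List, Optional, Tuple


def solve_n_queens_fixed(n: int, first: Tuple[int, int]) -> Optional[List[int]]:
    """Place n queens on an n x n board with one queen fixed at `first`.

    Returns a list `cols` where cols[r] is the column of the queen in row r,
    or None if no placement exists. Names and conventions are illustrative.
    """
    if not 5 <= n <= 10:
        raise ValueError("board size must be between 5 and 10")
    fixed_row, fixed_col = first
    if not (0 <= fixed_row < n and 0 <= fixed_col < n):
        raise ValueError("fixed queen is off the board")

    cols: List[int] = [-1] * n
    cols[fixed_row] = fixed_col
    # Track occupied columns and both diagonal directions for O(1) checks.
    used_cols = {fixed_col}
    used_diag = {fixed_row - fixed_col}   # top-left to bottom-right
    used_anti = {fixed_row + fixed_col}   # top-right to bottom-left

    def place(row: int) -> bool:
        if row == n:
            return True
        if row == fixed_row:
            # This row already holds the fixed queen.
            return place(row + 1)
        for col in range(n):
            if col in used_cols or row - col in used_diag or row + col in used_anti:
                continue
            cols[row] = col
            used_cols.add(col)
            used_diag.add(row - col)
            used_anti.add(row + col)
            if place(row + 1):
                return True
            # Backtrack: undo this placement and try the next column.
            used_cols.remove(col)
            used_diag.remove(row - col)
            used_anti.remove(row + col)
            cols[row] = -1
        return False

    return cols if place(0) else None
```

A call such as `solve_n_queens_fixed(8, (0, 0))` returns one valid placement with the first queen kept in the top-left corner. Some fixed positions admit no solution at all: on a 6x6 board, a queen fixed in a corner makes the function return `None`.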