Data Science Projects

Collection of smaller data science projects.

Ames House Price Prediction

Slightly edited and condensed version of one of five projects for EPFL course «Data Science: Applied Machine Learning».

Main techniques: EDA, data preparation and cleaning, outlier removal, regression modelling with various regressors, scikit-learn pipelines.
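As a rough illustration of the pipeline approach named above, here is a minimal sketch that combines imputation, scaling, one-hot encoding and a regressor in one scikit-learn pipeline. The data is synthetic, and the column names and the choice of `Ridge` are assumptions made for this example, not the project's actual setup.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for a house price table (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
houses = pd.DataFrame({
    "living_area": rng.normal(1500, 400, n),
    "year_built": rng.integers(1900, 2010, n).astype(float),
    "neighborhood": rng.choice(["A", "B", "C"], n),
})
price = (
    100 * houses["living_area"]
    + 500 * (houses["year_built"] - 1900)
    + rng.normal(0, 10_000, n)
)
# Blank out a few values so the imputer has something to do.
houses.loc[::25, "living_area"] = np.nan

numeric = ["living_area", "year_built"]
categorical = ["neighborhood"]

# Numeric columns: median imputation, then scaling. Categorical: one-hot.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", Ridge(alpha=1.0)),  # any scikit-learn regressor fits here
])

# Preprocessing is refit inside every fold, which avoids leakage.
scores = cross_val_score(model, houses, price, cv=5, scoring="r2")
mean_r2 = scores.mean()
print(f"Mean R^2 over 5 folds: {mean_r2:.3f}")
```

Swapping the final step for another regressor leaves the preprocessing untouched, which makes it easy to compare several models.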

Costa Rican Household Poverty Level Prediction

Final project for Advanced Machine Learning course at FHNW.

Students were pointed to the Kaggle competition and had to analyze the data, train models and submit predictions.
Main techniques: classification modelling with various classifiers, scikit-learn pipelines, LightGBM, hyperparameter tuning with grid search, randomized search, Hyperopt and scikit-optimize, category encoding, data aggregation with featuretools, permutation feature importance, creating interaction features, oversampling with imbalanced-learn.
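Two of the techniques listed above, randomized hyperparameter search and permutation feature importance, can be sketched with scikit-learn alone. This is an illustrative example under several assumptions: the data is synthetic, a `RandomForestClassifier` stands in for LightGBM, `class_weight="balanced"` stands in for oversampling with imbalanced-learn, and macro F1 is assumed as the metric.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic, imbalanced 3-class problem (not the competition data).
X, y = make_classification(
    n_samples=600,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    weights=[0.6, 0.3, 0.1],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Randomized search samples a fixed number of parameter combinations.
search = RandomizedSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "max_depth": [3, 5, None],
        "min_samples_leaf": [1, 3, 5],
    },
    n_iter=5,
    scoring="f1_macro",
    cv=3,
    random_state=0,
)
search.fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time on held-out data
# and measure how much the score drops.
result = permutation_importance(
    search.best_estimator_,
    X_test,
    y_test,
    scoring="f1_macro",
    n_repeats=5,
    random_state=0,
)
ranking = result.importances_mean.argsort()[::-1]  # most important first
print("Best parameters:", search.best_params_)
print("Features ranked by importance:", ranking.tolist())
```

Measuring importance on the held-out split rather than the training split keeps the ranking from rewarding features the model has merely memorised.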

Spaceship Titanic Challenge

Quick EDA and some modelling test runs for Kaggle's Spaceship Titanic Challenge.

Main techniques: EDA, data preparation and cleaning, classification modelling with various classifiers, scikit pipelines.

San Francisco Purchasing Data

Project for course «Data Science Project Competence» at FHNW.

Only the data was provided. Students were asked to analyse it, present insights and propose appropriate data products. In addition, I created a working data dashboard with Streamlit.

Swiss Real Estate Analysis & Price Prediction

Quick EDA and regression modelling for a Machine Learning Lab during Data Science studies at FHNW.

Just the real estate data was given.
The project was set up as a closed Kaggle competition. Students had to compete against and beat the teachers' models.

Podcast Lengths Analysis

Quick examination of podcast lengths to help quantify creative choices for podcast producers. I analysed ~225k episodes of ~1.8k iTunes podcasts and 37k episodes of ~800 Spotify podcasts.

Findings:

A prototypical length of a podcast episode is around 40 minutes.
90% of all podcast episodes have a length between 20 and 60 minutes.
Typical lengths vary between the different genres – with median values between 15 and 65 minutes.
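Summary figures like the ones above (an overall median, the share of episodes within a range, medians per genre) can be computed along the following lines. This is only a sketch: the episode lengths and genre names are invented for the example and are not the analysed data.

```python
from statistics import median

# Hypothetical episode lengths in minutes, keyed by genre (invented values).
episodes = {
    "news": [12, 15, 18, 22, 25],
    "comedy": [45, 55, 62, 70, 58],
    "education": [30, 35, 40, 42, 38],
}

# Flatten all genres into one list for the overall statistics.
all_lengths = [m for lengths in episodes.values() for m in lengths]

overall_median = median(all_lengths)
share_20_to_60 = sum(20 <= m <= 60 for m in all_lengths) / len(all_lengths)
median_by_genre = {genre: median(lengths) for genre, lengths in episodes.items()}

print(f"Overall median: {overall_median} min")
print(f"Share between 20 and 60 min: {share_20_to_60:.0%}")
print("Median by genre:", median_by_genre)
```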
