chejuichia / DataScienceStudy

SpringBoard Data Science Projects

Technical Portfolio

Juichia's Data Science & Machine Learning Projects at a Glance

Winner of Data Science Hackathon

Which locations have demand for more EV charging infrastructure? Which types of charging infrastructure are lacking or popular in those locations? Which locations have demand for EVs at the Porsche Taycan price level? Who are the consumers that live around these locations of demand? What are the preferred EV attributes for these consumers in these locations?

Working in a team of three, we submitted a proposal to answer these questions that was selected from 19+ teams to compete in the final hackathon, and our final analysis and presentation won second place in the competition. All deliverables were completed within 10 days.

Key Skills & Deliverables

Capstone Project 1

Restaurants are a big part of modern life and make up a significant portion of small businesses everywhere. They are known for high turnover and failure rates, especially within the first year of business. Failed restaurants are costly to the restaurateurs, to the industry, and to the diners. When restaurants fail as businesses, the restaurateurs and their financial sponsors bear the economic burden, the food services industry loses the potential growth that new ideas bring, and diners lose out on opportunities for new, enjoyable experiences. This data science study aims to help restaurateurs incorporate more success factors into their strategies when opening new restaurants.

Key Skills & Deliverables

Capstone Project 2

What is the seasonality of products in a store? How does demand for the same item compare across different stores, and how does that comparison change by season? What will the demand for a product be in the next few months? Accurate forecasting of sales and demand is an important part of managing the supply chain for both online and physical retail. This data science study analyzes the sales of 50 different items at 10 stores across 5 years and works through a process of fine-tuning the demand forecasts.

Key Skills & Deliverables

Machine Learning Projects

Predicted housing prices with linear regression models; scored and fine-tuned the models

Modeled gender as a binary outcome from height and weight features using logistic regression; scored and fine-tuned the models

Predicted movie ratings from reviews using Bayesian methods

Predicted demographic features from census data using an ML pipeline of logistic regression, a gradient boosting (GBM) classifier, and random forests

Statistics Projects

Applied the Central Limit Theorem to a sampling distribution and calculated critical values and confidence intervals for hospital charges

Key Skills

  • Use the Central Limit Theorem, the z-statistic and t-statistic
  • Estimate the population mean and standard deviation from a sample
  • Sampling distribution of a test statistic
  • Calculate a confidence interval
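The skills listed above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the project's notebook code: the `charges` values are simulated placeholders, the function name `mean_confidence_interval` is hypothetical, and the interval uses a z critical value, which relies on the Central Limit Theorem and a reasonably large sample (a t critical value would be the usual choice for small samples).

```python
import math
import random
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds of a z-based interval for the population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
    std_err = sd / math.sqrt(n)         # standard error of the sample mean
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


random.seed(0)
# Hypothetical right-skewed "charges"; placeholder data, not the project's dataset.
charges = [random.lognormvariate(9.0, 0.8) for _ in range(500)]

lower, upper = mean_confidence_interval(charges)
print(f"95% CI for the mean charge: ({lower:.0f}, {upper:.0f})")
```

Even though the simulated charges are skewed, the sampling distribution of their mean is approximately normal at this sample size, which is what justifies the z-based interval.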

Ran experiment replicates for hypothesis tests on subgroups of hospital charge data

Key Skills

  • Test the differences between multiple subgroups
  • Calculate the p-values for the differences between subgroups
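One common way to obtain a p-value from replicates is a permutation test on the difference of means. The sketch below assumes that approach; the source does not state which replicate scheme the project used, and the function name and the two simulated groups are hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_replicates=2000, seed=0):
    """Two-sided permutation test for a difference in means between two groups."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_replicates):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled values and split them again.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_replicates


rng = random.Random(1)
# Hypothetical subgroups with different true means; placeholder data only.
group_a = [rng.gauss(9000, 2000) for _ in range(80)]
group_b = [rng.gauss(12000, 2000) for _ in range(80)]

p_value = permutation_p_value(group_a, group_b)
print(f"p-value: {p_value:.4f}")
```

The p-value is the fraction of shuffled replicates whose difference is at least as large as the observed one, so a small value suggests the subgroups genuinely differ.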

Used the pymc3 library to model hospital charges and their range of values

Key Skills

  • Estimation of Parameters
  • Simulation with random variates
  • Model distributions
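The three skills above can be sketched without pymc3. The example below is a simplified stand-in, not the project's Bayesian model: it assumes a gamma distribution for charges, estimates the shape and scale by the method of moments, and then simulates new values with random variates. The data and the name `fit_gamma_moments` are hypothetical.

```python
import random
import statistics


def fit_gamma_moments(data):
    """Estimate gamma (shape, scale) by matching the sample mean and variance."""
    mean = statistics.fmean(data)
    var = statistics.variance(data)
    shape = mean ** 2 / var   # for a gamma distribution, mean = shape * scale
    scale = var / mean        # and variance = shape * scale ** 2
    return shape, scale


rng = random.Random(42)
# Placeholder "observed" charges drawn from a known gamma distribution.
observed = [rng.gammavariate(2.0, 5000.0) for _ in range(2000)]

shape, scale = fit_gamma_moments(observed)

# Simulate new charges from the fitted distribution.
simulated = [rng.gammavariate(shape, scale) for _ in range(2000)]

print(f"estimated shape={shape:.2f}, scale={scale:.0f}")
print(f"observed mean={statistics.fmean(observed):.0f}, "
      f"simulated mean={statistics.fmean(simulated):.0f}")
```

A Bayesian treatment would place priors on the shape and scale and sample their posterior, which yields a range of plausible parameter values rather than the single point estimate shown here.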

Data Wrangling Projects

Data wrangling techniques in Python applied to a dataset in JSON format

Key Skills

  • JSON Manipulation and Extraction
  • Applied Plotting and Charting
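As an illustration of JSON extraction, the sketch below flattens a nested list field into one row per item and counts the values. The records, field names, and the helper `flatten_themes` are all hypothetical and do not reflect the project's dataset.

```python
import json
from collections import Counter

# Hypothetical nested records; placeholder data only.
raw = """
[
  {"name": "Project A", "region": "North",
   "themes": [{"code": "1", "label": "Health"}, {"code": "2", "label": "Education"}]},
  {"name": "Project B", "region": "South",
   "themes": [{"code": "2", "label": "Education"}]},
  {"name": "Project C", "region": "North", "themes": []}
]
"""

records = json.loads(raw)


def flatten_themes(records):
    """Return one flat row per (record, theme) pair."""
    rows = []
    for record in records:
        for theme in record.get("themes", []):
            rows.append({
                "name": record["name"],
                "region": record["region"],
                "theme": theme["label"],
            })
    return rows


rows = flatten_themes(records)
theme_counts = Counter(row["theme"] for row in rows)
print(theme_counts.most_common())
```

Flat rows like these are straightforward to load into a table or to plot as a bar chart of counts.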

SQL queries applied to three tables from a SQL database

Key Skills

  • SQL Queries
  • Time Series Analysis
  • Applied Plotting and Charting
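A query that joins three tables and aggregates by month might look like the sketch below. It uses an in-memory SQLite database through Python's `sqlite3` module; the `members`, `facilities`, and `bookings` tables and their contents are hypothetical, since the source does not describe the schema or the database engine used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows; placeholder data only.
conn.executescript("""
CREATE TABLE members (member_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE facilities (facility_id INTEGER PRIMARY KEY, name TEXT, fee REAL);
CREATE TABLE bookings (
    booking_id INTEGER PRIMARY KEY,
    member_id INTEGER,
    facility_id INTEGER,
    start_date TEXT
);
INSERT INTO members VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO facilities VALUES (1, 'Court', 10.0), (2, 'Pool', 5.0);
INSERT INTO bookings VALUES
    (1, 1, 1, '2020-01-05'),
    (2, 2, 1, '2020-01-20'),
    (3, 1, 2, '2020-02-03');
""")

# Join all three tables and aggregate bookings per calendar month.
query = """
SELECT strftime('%Y-%m', b.start_date) AS month,
       COUNT(*)                        AS n_bookings,
       COUNT(DISTINCT m.member_id)     AS n_members,
       SUM(f.fee)                      AS revenue
FROM bookings AS b
JOIN members AS m ON m.member_id = b.member_id
JOIN facilities AS f ON f.facility_id = b.facility_id
GROUP BY month
ORDER BY month;
"""

monthly = conn.execute(query).fetchall()
conn.close()

for month, n_bookings, n_members, revenue in monthly:
    print(month, n_bookings, n_members, revenue)
```

The monthly rows form a simple time series that can be charted directly.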

Data wrangling techniques in Python applied to a dataset sourced from an API

Key Skills

  • API Connection and Requests
  • JSON Extraction and Interpretation
  • Time Series Analysis
  • Statistics


Languages

Jupyter Notebook 99.4%, Makefile 0.2%, Python 0.2%, Batchfile 0.1%, TSQL 0.0%