Data Science Hackathon: Kickstarter
The following content is the final presentation for the Data Science Hackathon in Hong Kong. The dataset contains details of Kickstarter projects from 2009 to 2017.
Kickstarter Distribution Worldwide (2009 - 2017)
Kickstarter Goal & Pledged Amount Trend Worldwide (2009 - 2016, 2017)
Kickstarter Goal & Pledged Amount Trend Worldwide (2009 - 2017)
Logistic Regression
- Target: successful or not
- Features: category, country, goal (USD), length of campaign
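The setup above can be sketched roughly as follows. This is a minimal illustration with scikit-learn on synthetic data, not the team's actual code: the column names (`category`, `country`, `goal_usd`, `campaign_days`, `successful`), the made-up labelling rule, and the choice of one-hot encoding and scaling are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are assumptions, not the real schema.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "category": rng.choice(["Music", "Games", "Film"], size=n),
    "country": rng.choice(["US", "GB", "HK"], size=n),
    "goal_usd": rng.lognormal(mean=8.0, sigma=1.0, size=n),
    "campaign_days": rng.integers(7, 61, size=n),
})
# Made-up rule so the label carries some signal: smaller goals succeed more often.
noise = rng.normal(0.0, 1.0, size=n)
df["successful"] = (np.log(df["goal_usd"]) + noise < 8.0).astype(int)

categorical = ["category", "country"]
numeric = ["goal_usd", "campaign_days"]

# One-hot encode the categorical features and scale the numeric ones.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numeric),
])
model = make_pipeline(preprocess, LogisticRegression(max_iter=1000))
model.fit(df[categorical + numeric], df["successful"])

# Estimated probability of success for each project.
proba = model.predict_proba(df[categorical + numeric])[:, 1]
```

Wrapping the preprocessing and the classifier in one pipeline keeps the encoding fitted on the same rows as the model, which matters once a train/test split is introduced.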
Decision Tree
- Target: successful or not
- Features: Time to get funded, goal real (USD)
- Methodology: Split the data into training (75%) and test (25%) sets. Limit the tree to a maximum depth of 4.
- Accuracy Score: 66%
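The methodology above might look like the following with scikit-learn. It is a sketch on synthetic data, so the accuracy it yields is not comparable to the 66% reported; the variable names, the random seed, and the labelling rule are assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the two features; values are made up.
rng = np.random.default_rng(0)
n = 1000
days_to_fund = rng.integers(1, 61, size=n)
goal_usd = rng.lognormal(mean=8.0, sigma=1.0, size=n)
X = np.column_stack([days_to_fund, goal_usd])
# Made-up label rule: smaller goals succeed more often.
y = (np.log(goal_usd) + rng.normal(0.0, 1.0, size=n) < 8.0).astype(int)

# 75% training / 25% test split, as in the methodology.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Cap the tree at a depth of 4.
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(X_train, y_train)

# Accuracy on the held-out test set.
accuracy = accuracy_score(y_test, tree.predict(X_test))
```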
LightGBM
- Target: successful or not
- Features: category, main_category, currency, country, goal (USD), length of campaign, deadline month, deadline day, launch month, launch day
- Methodology: Split the pre-2017 data into training (70%) and test (30%) sets, then model with LightGBM using a further random split (LightGBM uses cross-validation during model development)
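The date-derived features and the pre-2017 split could be prepared along these lines. This sketch uses only the Python standard library and hypothetical records; the field names (`launched`, `deadline`), the helper names, the reading of "day" as day of month, and the reading of "prior 2017" as launch date before 2017 are all assumptions.

```python
import random
from datetime import date, timedelta

# Hypothetical records: ten projects launched in 2016 and two launched in 2017.
projects = [
    {"launched": date(2016, m, 5), "deadline": date(2016, m, 5) + timedelta(days=30)}
    for m in range(1, 11)
] + [
    {"launched": date(2017, 2, 1), "deadline": date(2017, 3, 3)},
    {"launched": date(2017, 5, 10), "deadline": date(2017, 6, 9)},
]

def date_features(project):
    """Derive campaign length and month/day features from the two dates."""
    launched, deadline = project["launched"], project["deadline"]
    return {
        "campaign_days": (deadline - launched).days,
        "launch_month": launched.month,
        "launch_day": launched.day,  # assumption: day of month
        "deadline_month": deadline.month,
        "deadline_day": deadline.day,  # assumption: day of month
    }

def split_prior_to(projects, cutoff_year, train_fraction, seed):
    """Keep projects launched before cutoff_year, then split them at random."""
    eligible = [p for p in projects if p["launched"].year < cutoff_year]
    shuffled = eligible[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

train, test = split_prior_to(projects, cutoff_year=2017, train_fraction=0.7, seed=0)
train_features = [date_features(p) for p in train]
```

The resulting feature dictionaries, together with the categorical columns, would then be passed to the model; the projects from 2017 are left out of both sets.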
Feature Importance
Model accuracy by iteration
After the first few iterations, the cross-validation results do not improve.