Amy Koldeway, Sarah Cross
We will use a decision tree model to answer the question: Can an AI predict the rating of an Epicurious recipe?
We will use the existing dataset of Epicurious recipes found on Kaggle (https://www.kaggle.com/hugodarwood/epirecipes). The dataset covers recipes created in 2005-2016. If time allows, we will scrape 2017-2019 data and add it to the dataset.
We will perform the following cleaning steps:
- drop the NaN records
- remove outliers
- remove records with a zero rating that do not have a review count (this assumes 0 is a valid rating score)
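The cleaning steps above could be sketched in pandas roughly as follows. This is a minimal sketch, not our final pipeline: the column names (rating, calories, review_count) and the toy values are hypothetical placeholders, and the 1.5 x IQR rule is only one possible way to define an outlier, which we have not settled on yet.

```python
import pandas as pd


def clean_recipes(df, numeric_cols, rating_col="rating", reviews_col="review_count"):
    """Apply the three planned cleaning steps to a recipes DataFrame.

    All column names are hypothetical and must be mapped to the real dataset.
    """
    # Step 1: drop records that contain NaN in any column.
    df = df.dropna()

    # Step 2: remove outliers. Assumption: a value is an outlier if it lies
    # more than 1.5 * IQR outside the first or third quartile.
    for col in numeric_cols:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        df = df[df[col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

    # Step 3: remove zero ratings that have no reviews behind them.
    # After step 1 no count is missing, so "no review count" is taken to
    # mean a count of 0 (assumption).
    unreviewed_zero = (df[rating_col] == 0) & (df[reviews_col] == 0)
    return df[~unreviewed_zero].reset_index(drop=True)


# Toy data, made up for illustration only.
recipes = pd.DataFrame({
    "rating": [4.375, 0.0, 0.0, 3.75, None, 5.0, 4.375, 3.125],
    "calories": [400, 350, 420, 30_000_000, 500, 450, 380, 410],
    "review_count": [10, 0, 3, 5, 2, 8, 4, 6],
})
cleaned = clean_recipes(recipes, numeric_cols=["calories"])
print(cleaned)
```

In this sketch a recipe rated 0 with at least one review is kept, since we assume 0 is a valid score when someone actually submitted it.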
We will investigate how to identify the right features for the decision tree model and perform additional cleaning to ensure that all recipes have those features populated.
We will calculate the accuracy, recall, precision and F1 score of our model to evaluate its performance. Based on those results, we will perform optimizations to improve the scores.
While the calculations will be done in a Jupyter notebook or Colab, we will build a webpage that illustrates the steps we took and the results we obtained, for easy readability.
Our dataset is around 30k records, so we will run the model on either our own computers or Colab; this is undecided at this point.
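To make the four metrics concrete, here is a small sketch that computes them from first principles. It assumes the rating has been turned into a binary label (for example 1 = highly rated, 0 = not), which is a simplification we have not decided on; the function name binary_metrics and the label lists are made up for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical true labels and model predictions.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In practice we would likely rely on a library implementation of these metrics; the hand-written version is here only to show what each number measures. If we keep more than two rating classes, the per-class scores would need to be averaged in some way, which is another open choice.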