pochoi / sta141b-discussion

STA141B Discussion

Chi Po Choi

STA-141b Final Project instructions

Description: You should work in a group of no more than 4 people on a final data science project. The purpose of the project is to give you real data science experience, including posing questions, finding data, exploring and visualizing the data, analyzing the data, and summarizing your findings. As a group, you should begin with something you are curious about. For example, in my lecture 'What happened in Ohio?' I looked at the presidential election in Ohio. Then we processed the data, visualized it, and asked specific questions.

Data Sources: Based on what you have learned you can extract data from pretty much anywhere, but for inspiration you can look at the following links:

Grading criteria

  1. Code: we will grade the code according to the rubric

  2. Data Extraction: Anything that is done to get your data into memory, including web APIs, web scraping, and reading data from files. Is your data extracted from an online source? Are there multiple data sources? Is your data stored in a difficult format on the machine?

  3. Data munging and storage: do you process the data in a clear, efficient, and organized way? Do you join multiple data sources appropriately? Did you work with unstructured data? Do you store your processed data in an efficient way, using databases or well-thought-out data structures?

  4. Visualization: do your visualizations follow the principles of graphical excellence? Do your visualizations support your conclusions?

  5. Exploratory data analysis and transformations: Did you explore the data before moving on with your analysis? Looking at the data can mean summary statistics, dealing with missingness, visualization, etc.

  6. Statistics: is your use of statistics and machine learning valid? Did you choose appropriate methods based on your questions, the data, and your assumptions?

  7. Organization and summaries: Are there clear research questions that you asked, and did you address these in an orderly fashion? Do you make well-justified conclusions? Is your project easy to read?

We will grade each of these according to a scale, with the highest grades going only to the best examples in each category. Then we will drop the lowest 2 of these scores, so that we promote excellence without necessarily requiring that you hit all of these bases. We will also adjust for group size, adding to the grades of smaller groups and penalizing larger groups. You should have an amount of material roughly proportional to the number of people in your group.
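
As a rough illustration of what criteria 3 and 5 are asking about, here is a minimal sketch that joins two small data sources, stores them in a database, and counts missing values before any analysis. Everything in it is hypothetical: the toy county data, the table names, and the helper functions `load_votes`, `build_database`, `joined_rows`, and `missing_counts` are assumptions made for the example, not part of the course materials or the rubric. It uses only the Python standard library.

```python
import csv
import io
import sqlite3

# Hypothetical toy data standing in for two sources,
# e.g. a downloaded CSV file and records returned by a web API.
VOTES_CSV = """county,votes
Adams,1200
Allen,
Athens,950
"""
POPULATION = [("Adams", 27000), ("Allen", 102000), ("Brown", 43000)]


def load_votes(text):
    """Parse the CSV text; an empty vote count becomes None (missing)."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        votes = int(rec["votes"]) if rec["votes"] else None
        rows.append((rec["county"], votes))
    return rows


def build_database(votes, population):
    """Store both sources as tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE votes (county TEXT PRIMARY KEY, votes INTEGER)")
    conn.execute(
        "CREATE TABLE population (county TEXT PRIMARY KEY, population INTEGER)"
    )
    conn.executemany("INSERT INTO votes VALUES (?, ?)", votes)
    conn.executemany("INSERT INTO population VALUES (?, ?)", population)
    conn.commit()
    return conn


def joined_rows(conn):
    """Left join: keep every county in votes, even without a population match."""
    query = """
        SELECT v.county, v.votes, p.population
        FROM votes AS v
        LEFT JOIN population AS p ON p.county = v.county
        ORDER BY v.county
    """
    return conn.execute(query).fetchall()


def missing_counts(rows):
    """Count missing values per column as a basic exploratory check."""
    names = ("county", "votes", "population")
    return {name: sum(row[i] is None for row in rows) for i, name in enumerate(names)}


conn = build_database(load_votes(VOTES_CSV), POPULATION)
rows = joined_rows(conn)
print(rows)
print(missing_counts(rows))
```

The left join is a deliberate choice here: it keeps counties that have no match in the second source, so the mismatch shows up as a missing value instead of silently dropping the row. A real project would likely use larger data and possibly other tools, but the same questions apply: how the sources are joined, where the result is stored, and what is missing.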
