ssrosa / dsc-1-01-27-section-recap-summary-online-ds-pt-100118

Section Recap

Introduction

This short lesson summarizes the topics we covered in section 01 and why they'll be important to you as a Data Scientist.

Objectives

  • Understand and explain what was covered in this section
  • Understand and explain why this section will help you become a data scientist

Key Takeaways

  • There is a lot to learn about data science, but most of the time you are doing one of four things: predicting a continuous value (regression), predicting a category (classification), identifying unusual data (anomaly detection), or generating recommendations.
  • Data science is not just about selecting and tuning machine learning models. Much of the value comes from understanding the business needs and formulating the problem thoughtfully, and most of the effort goes into the early stages of finding, cleaning, exploring, and simplifying the data so that it's ready to be run against your models.
  • It's important to use professional tools. Jupyter Notebook is a great environment for combining your notes and your code. Git allows you to keep track of your changes, and GitHub allows you to share them with your team. Conda virtual environments ensure that the libraries you use for one project won't break another project you are working on. Testing frameworks like pytest are a powerful tool for ensuring that any code you choose to re-use does what you expect.
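
To make the last point about testing concrete, here is a minimal sketch of what a pytest-style test for a small piece of reusable cleaning code might look like. The function clean_prices, its behavior, and the sample values are hypothetical and chosen only for illustration; they do not come from the lesson.

```python
# Hypothetical helper for illustration only; not part of the lesson's code.
def clean_prices(raw_values):
    """Convert raw price strings to floats, skipping blank or unparseable entries."""
    cleaned = []
    for value in raw_values:
        # Remove surrounding whitespace, currency symbols, and thousands separators.
        text = value.strip().replace("$", "").replace(",", "")
        if not text:
            continue  # skip empty entries
        try:
            cleaned.append(float(text))
        except ValueError:
            continue  # skip entries that are not numbers, such as "n/a"
    return cleaned


# pytest collects functions whose names start with "test_" and runs them;
# a plain assert that fails is reported as a failing test.
def test_clean_prices_parses_currency_strings():
    assert clean_prices(["$1,200.50", " 30 "]) == [1200.5, 30.0]


def test_clean_prices_skips_bad_entries():
    assert clean_prices(["", "n/a", "7"]) == [7.0]
```

If this were saved in a file such as test_cleaning.py (the file name is an assumption), running pytest in that directory should discover and run both test functions. Keeping tests next to code you plan to re-use means a later change that breaks the expected behavior is caught right away.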
