learn-co-students / dsc-2-18-01-introduction-online-ds-pt-112618

Introduction

This lesson summarizes the topics we'll be covering in section 18 and why they'll be important to you as a data scientist.

Objectives

You will be able to:

  • Understand and explain what is covered in this section
  • Understand and explain why the section will help you to become a data scientist

HTML, CSS and Web Scraping

While many companies provide access to information via APIs, sometimes you have to scrape the information you need for your analysis from web pages designed to be read by people. In this section we'll build upon what we learned about client/server architectures, the HTTP(S) request/response cycle, and retrieving data from APIs to learn how to obtain data by scraping web pages.

HTML

We kick off the section with an introduction to the HyperText Markup Language (HTML) - the "language of the web". We then get you "up to speed" with some of the more recent developments in HTML by introducing HTML5 semantic elements - designed to make HTML more consistent and readable. After that we look into the process for handling new HTML elements that you might not have encountered before.
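To make the idea of semantic elements concrete, here is a minimal sketch that lists the elements found in a small HTML5 fragment, using only Python's standard library. The fragment and the `TagCollector` class are made up for illustration and are not part of the lesson materials.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, invented for this example.
# <article>, <header>, <section> and <footer> are HTML5 semantic elements.
SAMPLE_HTML = """
<article>
  <header><h1>Quarterly Report</h1></header>
  <section><p>Revenue grew this quarter.</p></section>
  <footer>Published by the data team</footer>
</article>
"""


class TagCollector(HTMLParser):
    """Records the name of every opening tag, in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed(SAMPLE_HTML)
print(collector.tags)
```

The element names alone already describe the role of each part of the page, which is what makes semantic markup easier to read for both people and programs.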

CSS

Next up, we introduce Cascading Style Sheets (CSS), which control how web pages look. We start with the concept of separating content from presentation, then introduce CSS itself, work through a code-along, and finish with three labs so you can practice working with HTML and CSS.

Web Scraping

Finally, we introduce Beautiful Soup, a Python package for parsing HTML and scraping websites, and then give you some practice using it to retrieve information from a website.
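As a rough preview of what scraping with Beautiful Soup looks like, the sketch below parses an inline HTML string rather than a live website. The markup, class names, and URLs are hypothetical, and the example assumes the `beautifulsoup4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, invented for this example. In practice the
# HTML would come from the body of an HTTP response.
SAMPLE_HTML = """
<html><body>
  <h1>Book Listings</h1>
  <ul>
    <li class="book"><a href="/books/1">Dune</a> <span class="price">$9.99</span></li>
    <li class="book"><a href="/books/2">Emma</a> <span class="price">$4.50</span></li>
  </ul>
</body></html>
"""

# "html.parser" is Python's built-in parser, so no extra parser is needed.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

books = []
for item in soup.find_all("li", class_="book"):
    link = item.find("a")
    price = item.find("span", class_="price")
    books.append(
        {
            "title": link.get_text(strip=True),
            "url": link["href"],
            "price": price.get_text(strip=True),
        }
    )

for book in books:
    print(book)
```

The pattern is the same on a real page: find the elements that wrap each record, then pull the text and attributes you need out of each one.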

Summary

You will often find that the information you want to retrieve isn't available via an API. When that's the case, it's incredibly important to be proficient with web scraping so that you can retrieve the information you need for your analysis.
