RossThorn / bs4-WebScraping

A simple example of web scraping and crawling with the BeautifulSoup4 Python package

bs4-WebScraping

A simple example of web scraping and crawling with the BeautifulSoup4 Python package.

Created for the UW-Madison Cart Lab Education Series (CLES).

Install BeautifulSoup (the beautifulsoup4 package on PyPI) and Requests (the requests package) to run these examples!

The scripts scrape my portfolio page and list every project on it. portfolioAscrape.py also visits each project page to extract that project's text description.
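The two-step pattern described above (list the projects, then visit each project page) can be sketched roughly as follows. This is not the code from the repository: the HTML snippets, the class names (project, description), the paths, and the helper functions are all made up for illustration, and the pages are inline strings so the sketch runs without network access. In a real script, each HTML string would come from something like requests.get(url).text.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for the portfolio page and its project pages.
# In a real script each string would come from requests.get(url).text.
PAGES = {
    "/portfolio": """
        <html><body>
          <a class="project" href="/projects/map-a">Map A</a>
          <a class="project" href="/projects/map-b">Map B</a>
        </body></html>
    """,
    "/projects/map-a": '<html><body><p class="description">First map.</p></body></html>',
    "/projects/map-b": '<html><body><p class="description">Second map.</p></body></html>',
}


def list_projects(html):
    """Return (title, link) pairs for every project link on the page."""
    soup = BeautifulSoup(html, "html.parser")
    return [(a.get_text(strip=True), a["href"])
            for a in soup.find_all("a", class_="project")]


def get_description(html):
    """Return the text of the description paragraph, or '' if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("p", class_="description")
    return tag.get_text(strip=True) if tag else ""


def crawl(pages, start="/portfolio"):
    """Scrape the listing, then visit each project page for its description."""
    results = []
    for title, link in list_projects(pages[start]):
        results.append((title, get_description(pages[link])))
    return results


if __name__ == "__main__":
    for title, description in crawl(PAGES):
        print(title + ": " + description)
```

The scraping step (list_projects) and the crawling step (get_description on each linked page) are kept separate here so that either can be adapted to a different page layout without touching the other.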

If you'd rather write the output to a file, replace each print statement with file-writing code:

Adapting portfolioscrape.py to write projects to file

Open a file at the beginning of the script.

file = open("Projects.txt", "w")

Replace each print statement with code that writes the same text to the open file.

for project in projects:
    file.write(project.text+"\n")

Finally, close the file at the end of the script.

file.close()
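As an alternative to the explicit open and close calls, a with block closes the file automatically, even if an error occurs partway through the loop. The sketch below is a minimal version under a few assumptions: write_projects is a hypothetical helper, not part of the repository, and the sample list uses stand-in objects with a .text attribute in place of the BeautifulSoup tags the script would produce.

```python
from types import SimpleNamespace


def write_projects(projects, path="Projects.txt"):
    """Write the text of each project on its own line; return the line count."""
    count = 0
    # The with block closes the file when the block ends, even on error.
    with open(path, "w", encoding="utf-8") as out_file:
        for project in projects:
            out_file.write(project.text + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Stand-in objects with a .text attribute, mimicking BeautifulSoup tags.
    sample = [SimpleNamespace(text="Map A"), SimpleNamespace(text="Map B")]
    write_projects(sample)
```

Naming the handle out_file rather than file is only a readability choice; either name works in Python 3.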

This code is a must-have for any hardcore Ross Thorn fan that wants to keep up to date on what I'm doing.

Languages

Language:Python 100.0%