r-trump / presidential-documents-scraping

Scraping Presidential Documents from the Web

Scraping Presidential Speeches & Preparing the Speeches for Text Analysis

This repo contains a code sample for scraping presidential documents from the American Presidency Project (APP) website and preparing the speeches for text analysis.
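As a rough illustration of what preparing a speech for text analysis can involve, here is a minimal Python sketch. It is not the notebook's actual preprocessing: the helper name `prepare_text` and the cleaning steps (lowercasing, dropping punctuation, splitting on whitespace) are assumptions chosen for illustration.

```python
import re


def prepare_text(raw: str) -> list[str]:
    """Hypothetical helper: turn raw speech text into lowercase word tokens."""
    # Collapse newlines, tabs, and repeated spaces, then lowercase.
    text = re.sub(r"\s+", " ", raw).strip().lower()
    # Replace anything that is not a letter, digit, apostrophe, or space.
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    # Split on whitespace; empty strings are discarded automatically.
    return text.split()
```

Real analyses usually need more than this (stop-word removal, stemming, handling of non-ASCII characters), so treat it as a starting point only.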

The code lets researchers scrape the speeches of every US president across their entire time in office. It can easily be adapted to change the document category (e.g. declarations, executive orders, etc.), the set of presidents, and the time period.

The repo also includes the speeches of all presidents (including Biden's first year) in the folder separate_folders. In total, there are 73,707 speeches. Each president's speeches are stored in their own folder as .txt files.
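The data described above can be read into memory with a few lines of Python. The sketch below assumes a layout of one subfolder per president directly under separate_folders, with UTF-8 encoded .txt files. The function name `load_speeches` is hypothetical and not part of the repo.

```python
from pathlib import Path


def load_speeches(root: str = "separate_folders") -> dict[str, list[str]]:
    """Hypothetical loader: map each subfolder name to the texts of its .txt files."""
    corpus: dict[str, list[str]] = {}
    for folder in sorted(Path(root).iterdir()):
        # Skip stray files sitting next to the president folders.
        if not folder.is_dir():
            continue
        # Assumption: files are UTF-8; undecodable bytes are replaced, not fatal.
        corpus[folder.name] = [
            path.read_text(encoding="utf-8", errors="replace")
            for path in sorted(folder.glob("*.txt"))
        ]
    return corpus
```

If the layout matches these assumptions, `sum(len(texts) for texts in load_speeches().values())` gives the total number of speeches, which can be checked against the count stated above.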

Please feel free to reach out to me at burcukolcakk@gmail.com if you have any questions or spot any errors!

License: MIT License

Languages

Language: Jupyter Notebook 100.0%