adityaazad79 / Youtube_Channel_Scraping_with_AWS_Deployment

YouTube Scraper

This is a Python Flask web application that extracts data from the first 5 videos on a YouTube search page.

The application extracts the following data for each of the 5 videos:

  • Link to the video
  • Thumbnail image URL
  • Title of the video
  • Number of views
  • Time of posting
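
As a rough illustration of the per-video record described above, the five fields could be grouped in a small data class. This is only a sketch: the class name `VideoRecord`, the field names, and the sample values are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, asdict


@dataclass
class VideoRecord:
    """One scraped search result (hypothetical field names)."""
    link: str           # URL of the video
    thumbnail_url: str  # URL of the thumbnail image
    title: str          # title of the video
    views: str          # view count kept as displayed text
    posted: str         # posting time kept as displayed text


# Placeholder values for illustration only, not real scraped data.
record = VideoRecord(
    link="https://www.youtube.com/watch?v=EXAMPLE_ID",
    thumbnail_url="https://example.com/thumbnail.jpg",
    title="Example video",
    views="1.2M views",
    posted="2 weeks ago",
)

# asdict() turns the record into a plain dict, which is convenient for CSV export.
row = asdict(record)
```

Keeping the views and posting time as text avoids guessing how the page formats them; converting them to numbers or dates would be a separate step.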

The extracted data is then saved in a CSV file named youtube_scrap.csv.
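
A minimal sketch of how such a CSV export could look with the standard library `csv` module is shown below. The function name `save_to_csv` and the column names are assumptions made for illustration; the actual column layout of youtube_scrap.csv may differ.

```python
import csv

# Hypothetical column names; the real file may use different headers.
FIELDNAMES = ["link", "thumbnail", "title", "views", "posted"]


def save_to_csv(videos, path="youtube_scrap.csv"):
    """Write at most the first five video dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(videos[:5])  # keep only the first five results
    return path
```

Calling `save_to_csv(videos)` with a list of dicts keyed by the column names above would overwrite youtube_scrap.csv in the current working directory on each run.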

Deployment

The project has been deployed on AWS with all of the functionality described above.

Project Github Repository Link - Github repo link

AWS Deployment Link - Click here for live site (Link Down)

Screenshots

Screenshot 1: Home page

Screenshot 2: Scraped result

Requirements

This application requires the following Python libraries:

  • Flask
  • Flask-Cors
  • BeautifulSoup
  • Selenium
  • ChromeDriver Manager

Technologies Used

  • Python 3.7
  • Github v1
  • AWS - Elastic Beanstalk
  • AWS - CodePipeline

Installation

  1. Clone this repository.
  2. Install the dependencies:
  $ pip install -r requirements.txt
  3. Download the Chrome driver from here and add it to your system path.
  4. Run the application:
  $ python application.py
  5. Open your web browser and go to http://127.0.0.1:8000 to see the application running.

Usage

  1. Input a search query in the text box and click the "Scrape" button.

  2. The application will scrape YouTube for the top 5 videos related to the search query, and display their links, titles, thumbnails, views and posting times.

  3. The scraped data will also be saved in a CSV file named youtube_scrap.csv in the same directory as the application.py file.
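
The first two steps above can be sketched roughly as follows: turn the search query into a YouTube search URL, then keep only the first five results. The helper names `build_search_url` and `top_results` are hypothetical, and the application may build its URL differently. The fetching and parsing of the page (done with Selenium and BeautifulSoup in this project) is not shown.

```python
from urllib.parse import urlencode

MAX_RESULTS = 5  # the application reports the first five videos


def build_search_url(query):
    """Return a YouTube search results URL for the given query text."""
    params = urlencode({"search_query": query.strip()})
    return "https://www.youtube.com/results?" + params


def top_results(results, limit=MAX_RESULTS):
    """Keep only the first `limit` scraped results, in page order."""
    return list(results)[:limit]
```

`urlencode` takes care of escaping spaces and special characters, so a query typed into the text box can be passed in unchanged.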

Contributing

Aditya Azad - Initial work

This program was created as a learning exercise, and contributions are not currently being accepted. However, you are welcome to use and modify the code for your own purposes.

Acknowledgments

This project was a part of the Data Science course provided by PW Skills.

Languages

Python 41.8%, CSS 35.9%, HTML 22.3%