Kareem-Emad / youtube_metadata_scraper

An expansion of the YouTube-8M dataset that collects more data about the videos, such as likes/views and channel info, by scraping YouTube

Youtube Metadata Scraper

Python 3.6 PyPI license

This repo extends the YouTube-8M dataset with additional metadata. Rather than the video data itself, the project scrapes metadata about each video, such as view counts, comments, etc.

The program takes a directory of TFRecord files that you have downloaded from the YouTube-8M dataset, decodes them to recover the video URLs, and then scrapes each video page for more data.
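As an illustration of the "decode to video URL" step, here is a minimal sketch, not this repo's actual code. It assumes the records carry the short pseudo-IDs that YouTube-8M publishes, and that the dataset's ID lookup endpoint (a `.js` file under `data.yt8m.org` that returns a line of the form `i("pseudo","real");`) is used to resolve them. The function names `lookup_url`, `parse_lookup_response`, and `watch_url` are hypothetical, and fetching the lookup URL is left out so the sketch stays offline.

```python
import re

# Assumption: lookup URL pattern of the YouTube-8M ID mapping service.
LOOKUP_TEMPLATE = "http://data.yt8m.org/2/j/i/{prefix}/{pseudo_id}.js"

# Assumption: the lookup body looks like  i("nXSc","0sf943sWZls");
_RESPONSE_RE = re.compile(r'^i\("(?P<pseudo>[^"]+)","(?P<real>[^"]+)"\);?\s*$')


def lookup_url(pseudo_id):
    """Build the lookup URL for a pseudo-ID read from a TFRecord."""
    return LOOKUP_TEMPLATE.format(prefix=pseudo_id[:2], pseudo_id=pseudo_id)


def parse_lookup_response(body):
    """Return the real video ID from a lookup body, or None if it does not match."""
    match = _RESPONSE_RE.match(body.strip())
    if match is None:
        return None
    return match.group("real")


def watch_url(video_id):
    """Build the public watch URL that a scraper would then request."""
    return "https://www.youtube.com/watch?v=" + video_id
```

A lookup can fail for videos that have since been removed, which is why `parse_lookup_response` returns `None` instead of raising; a caller would skip those records.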

Setup

You need the video-level TFRecord files downloaded from the YouTube-8M dataset.

pip install -r requirements.txt

Linting

flake8

How to Run

python main.py y8m_tf_records_data_directory commit_every_x_videos

Example

python main.py ./data 20
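The second argument suggests that results are saved in batches rather than after every video. The sketch below shows one way such a command line and batching loop could look; it is an assumption about the behavior, not the contents of `main.py`. The names `parse_args`, `process_in_batches`, `scrape`, and `commit` are hypothetical placeholders.

```python
import argparse


def parse_args(argv):
    """Parse the two positional arguments shown in the usage line."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("data_dir", help="directory holding the TFRecord files")
    parser.add_argument("commit_every", type=int,
                        help="number of videos to scrape between commits")
    return parser.parse_args(argv)


def process_in_batches(video_urls, commit_every, scrape, commit):
    """Scrape each URL and call commit() once per full batch.

    scrape and commit are caller-supplied callables (hypothetical here):
    scrape(url) returns one metadata record, commit(records) persists a list.
    Returns the total number of records committed.
    """
    pending = []
    committed = 0
    for url in video_urls:
        pending.append(scrape(url))
        if len(pending) >= commit_every:
            commit(pending)
            committed += len(pending)
            pending = []  # start a fresh list so the committed one is untouched
    if pending:  # flush the final partial batch
        commit(pending)
        committed += len(pending)
    return committed
```

With `commit_every` set to 20, as in the example command, an interrupted run would lose at most the 19 videos scraped since the last commit.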
