maancham / IMDB-Crawler

Automatic IMDB movies, ratings, and reviews crawler

General info

This repository contains an IMDB crawler bot that I wrote to scrape several kinds of IMDB pages, including the Movie page, the Ratings page, and the Cast page.

Technologies

  • Scrapy: Main framework for crawling and extracting info from web content.
  • BeautifulSoup: Used for parts of the data cleaning and manipulation.
  • MongoDB: All records are saved in a MongoDB instance. This allows for an easily scalable storage system and robust CSV outputs.
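
To illustrate how the storage side of such a stack might fit together, here is a minimal sketch of a Scrapy-style item pipeline that writes items to MongoDB, plus a helper that renders stored records as CSV. It is not taken from this repository's code: the class name `MongoPipeline`, the function `records_to_csv`, the connection URI, the `imdb` database, the `movies` collection, and the `imdb_id` key are all hypothetical placeholders.

```python
import csv
import io


class MongoPipeline:
    """Scrapy-style item pipeline that upserts each item into MongoDB."""

    def __init__(self, uri="mongodb://localhost:27017", db_name="imdb",
                 collection_name="movies"):
        # All three defaults are placeholders, not this project's real settings.
        self.uri = uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Imported here so the module can be loaded without pymongo installed.
        from pymongo import MongoClient
        self.client = MongoClient(self.uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        doc = dict(item)
        # Upsert on a hypothetical "imdb_id" key so re-crawls do not
        # create duplicate records.
        self.collection.update_one(
            {"imdb_id": doc["imdb_id"]}, {"$set": doc}, upsert=True
        )
        return item


def records_to_csv(records, fieldnames):
    """Render an iterable of dict records (e.g. collection.find()) as CSV text."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops fields not listed, such as MongoDB's "_id".
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()
```

In a Scrapy project, a pipeline like this would typically be enabled through the `ITEM_PIPELINES` setting. Upserting on a stable key is one possible design choice; it keeps repeated crawls of the same page from piling up duplicate documents.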

Additional info

The code would benefit from a cleanup and could probably be turned into a proper app. Unfortunately, I won't be able to devote any more time to this project. Feel free to contact me if you have any questions about it.

Languages

Python 100.0%