Haytes / scraping-netflix

A Python API that scrapes movie information from Netflix. A nice substitute for their now-privatized API. Used by http://www.tomatoflix.com

Scraping Netflix - The Unofficial Way!

A Python class that scrapes information about Netflix movies that are available for streaming. Used by [www.tomatoflix.com][1].

Warning: Netflix's APIs have changed and this package has not yet been updated to follow those changes.


This project started because I wanted to create [Tomatoflix][1], an interactive website that helps lazy people like me find random Netflix movies to watch. I was surprised to find out that Netflix privatized their API. I took matters into my own hands and decided to forge a Netflix API of my own.


  • Python 3
  • Modules:
    • BeautifulSoup
    • Requests
    • Fuzzywuzzy (Not very impressed with this one... Open to fuzzy matching alternatives. But it'll do for now.)
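On the fuzzy matching note: one alternative that needs no extra dependency is `difflib.get_close_matches` from the standard library. The snippet below is only a sketch of how "did you mean" suggestions could be produced with it. It is not what this package currently does, and `suggest_titles` and the sample catalog are made up for illustration.

```python
from difflib import get_close_matches


def suggest_titles(query, titles, limit=5, cutoff=0.6):
    """Return up to `limit` titles that loosely match `query`, best match first.

    Hypothetical helper, not part of this package.
    """
    # Compare case-insensitively, but return titles as originally spelled.
    by_lowercase = {}
    for title in titles:
        by_lowercase.setdefault(title.lower(), title)
    hits = get_close_matches(query.lower(), list(by_lowercase), n=limit, cutoff=cutoff)
    return [by_lowercase[hit] for hit in hits]


# Made-up catalog, only to show the call.
catalog = ["Jerry Maguire", "Jerry Springer: The Opera", "The Matrix"]
print(suggest_titles("jery maguire", catalog))
```

Raising `cutoff` makes the matching stricter, while lowering it returns more (and looser) suggestions.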


Clone this repository to your local machine and it should be good to go. I'm currently working on making it installable via pip.


from netflix import *

# Insert netflix ID as a raw string
# To find Netflix ID:
# Sign into Netflix > Chrome Developer Tools > Resources > Cookies > www.netflix.com > NetflixId
netflix_id = r'INSERT NETFLIX ID HERE'

movie = Netflix(netflix_id)

# Initialization only has to be done once.
# initialize() creates JSON files for all of the major genres; data is pulled from these files afterwards.
movie.initialize(netflix_id)

>Genres were successfully downloaded as JSON files

# search() checks whether the movie is available on Netflix streaming.
# Other methods are chained to search() and return specific information about the movie.
movie.search('Jerry Maguire').duration()
movie.search('Jerry Maguire').netflix_rating()

>Movie was found
>2hr 18m
>3.6 stars

Check out the example.py file. E-mail any specific questions to jameskang410@gmail.com
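Since `duration()` and `netflix_rating()` return strings, you may want numbers for sorting or filtering. Here is a rough sketch of two helpers that could do the conversion. It assumes the strings look like the sample output above (`2hr 18m` and `3.6 stars`); other formats, such as a number of seasons for TV shows, simply return `None`. Both function names are hypothetical and not part of this package.

```python
import re


def duration_to_minutes(text):
    """Convert a string like '2hr 18m' to minutes; return None if it doesn't match."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*hr)?\s*(?:(\d+)\s*m)?\s*", text)
    # Reject non-matching strings and strings with neither hours nor minutes.
    if not match or not any(match.groups()):
        return None
    hours, minutes = (int(group) if group else 0 for group in match.groups())
    return hours * 60 + minutes


def rating_to_float(text):
    """Convert a string like '3.6 stars' to a float; return None if no number leads it."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else None


print(duration_to_minutes("2hr 18m"))
print(rating_to_float("3.6 stars"))
```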

All Available Functions

| __Functions__ | __Return Data Type__ | __Description__ |
| --- | --- | --- |
| initialize(_netflix\_id\_as\_string_) | None | Creates a JSON file for each movie, organized by genre. This method has to be run __only once__ and __should not be run after the JSON files have been pulled successfully__. This will minimize your chances of getting "caught" by Netflix (as if they don't know what we're up to...). |
| all\_titles() | List | Returns a list of every title that's available for streaming on Netflix. Loop through this list to get information about every movie. |
| search(_movie\_string_) | None | Checks if the string is a movie that is currently available on Netflix. Prints one of the following messages: ```Movie was found``` or ```Movie could not be found. Did you mean any of the following movies?```. If the movie is not found, a list of movies that partially matched the search string is printed to the console. __In order to find specific information about a movie, the algorithm must find a movie match.__ |
| movie\_number() | Int | Returns the Netflix movie ID number. |
| genres() | List | Returns a list of genres the movie belongs to on Netflix. |
| title() | String | Returns the title of the movie. |
| tv\_show() | String | Returns _"Y"_ if the movie is considered a TV show. Returns _"N"_ if it is only a movie. |
| synopsis() | String | Returns the synopsis for the movie. |
| year() | Int | Returns the year the movie was made. NOTE: This year does not always match the year listed on other movie websites like Rotten Tomatoes. |
| netflix\_rating() | String | Returns the average Netflix rating for the movie. |
| cert\_rating() | String | Returns the maturity rating for the movie. |
| actors\_list() | List | Returns a list of the prominent actors in the movie. |
| actors\_string() | String | Returns a string of the prominent actors in the movie. |
| url() | String | Returns the non-Netflix-member-friendly URL for the movie. |
| duration() | String | Returns the duration (hours and minutes or number of seasons) of the movie or TV show. |
| box\_art() | String | Returns the URL for the small box art of the movie. |
| large\_box\_art() | String | Returns the URL for the large box art of the movie. NOTE: Because of the different layout of Netflix movie pages, this method does not always work. |

[1]: http://www.tomatoflix.com
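Because `initialize()` should only be run once, it can help to guard the call so a script doesn't re-scrape by accident. The sketch below skips the call when JSON files are already present. Where the package actually writes its JSON files is not documented here, so `data_dir` and the `*.json` pattern are assumptions you would need to adjust, and `initialize_once` is a hypothetical helper rather than part of the package.

```python
from pathlib import Path


def initialize_once(client, netflix_id, data_dir, pattern="*.json"):
    """Call client.initialize() only if no JSON files exist yet in data_dir.

    Returns True if initialize() was called, False if it was skipped.
    `data_dir` and `pattern` are assumptions about where the files end up.
    """
    if any(Path(data_dir).glob(pattern)):
        return False  # JSON files already present, so skip the scrape
    client.initialize(netflix_id)
    return True
```

With the class from this repo, usage would look something like `initialize_once(movie, netflix_id, ".")`, assuming the JSON files are written to the current directory.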



