Product popularity prediction

A small test project that scrapes some parts of the web and searches Twitter for mentions of a product keyword. It stores this data (rather inefficiently) and can later plot a historical trend for the popularity of the product, based on mentions.

It was later expanded as an academic endeavor with sentiment-analysis (code not available here) to try and distinguish between negative and positive feedback/popularity for trending products.

This code is written using the Python programming language (Python 2.7) and will only work on the Google App Engine platform as a web application.

IMPORTANT: This code will not work straight out of the box. It crawled specific web-pages with specific structures and the chances are high that those page structures have changed since this code was written (within three weeks of May 2013). Furthermore, it uses Twitter API v1 which is now retired.

It is put here for reference and for posterity; I doubt I'll be making any updates to it seeing as it was a small (successful) experiment.

Main modules

main.py - The main entry point of the web application. Handle the presentation of the home page and all the other web pages
process.py - Handles task queueing and dispatching.
twitter - Handles Twitter search data retrieval
scraper - Handles web page search, mining and scraping.
others - Utility scripts to handle various utility functionality

Libraries/tools used

Beautiful Soup 4 for scraping the interwebs
gae-pytz for help with timezones
Flot Charts for plotting charts
Twitter Bootstrap for UI framework

About

A small Twitter/web-crawler that attempts to visualize popularity of products

Languages

Language:Python 50.7%Language:JavaScript 49.3%Language:CSS 0.1%