hack-c / taxidata

Playing with the NYC Taxi data dump

 /$$$$$$$$ /$$$$$$  /$$   /$$ /$$$$$$ /$$$$$$$   /$$$$$$  /$$$$$$$$ /$$$$$$ 
|__  $$__//$$__  $$| $$  / $$|_  $$_/| $$__  $$ /$$__  $$|__  $$__//$$__  $$
   | $$  | $$  \ $$|  $$/ $$/  | $$  | $$  \ $$| $$  \ $$   | $$  | $$  \ $$
   | $$  | $$$$$$$$ \  $$$$/   | $$  | $$  | $$| $$$$$$$$   | $$  | $$$$$$$$
   | $$  | $$__  $$  >$$  $$   | $$  | $$  | $$| $$__  $$   | $$  | $$__  $$
   | $$  | $$  | $$ /$$/\  $$  | $$  | $$  | $$| $$  | $$   | $$  | $$  | $$
   | $$  | $$  | $$| $$  \ $$ /$$$$$$| $$$$$$$/| $$  | $$   | $$  | $$  | $$
   |__/  |__/  |__/|__/  |__/|______/|_______/ |__/  |__/   |__/  |__/  |__/  

by Charlie Hack

Here are a few utilities, scripts and queries I made while examining the NYC Taxi data set with BigQuery.

  • preprocess.py
    This script can be tweaked to format geodata for use with OpenHeatMap.

  • taxidata.sql
    A few BigQuery SQL queries that were used for the article in SuperCompressor.

  • taxidata.ipynb
    A basic visual exploration of the data, with a few plots. Run ipython notebook --pylab inline to display the plots inline in the notebook.
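
To give a rough idea of the kind of formatting preprocess.py is meant for, here is a minimal sketch. It is not the script itself. It assumes the trip rows carry pickup_latitude and pickup_longitude columns, and that the heat map tool accepts a CSV with lat, lon and value columns. Both the column names and the helper names (bin_pickups, write_heatmap_csv) are assumptions for illustration, so adjust them to your data and to whatever OpenHeatMap expects.

```python
import csv
import io
from collections import Counter


def bin_pickups(rows, precision=3):
    """Count pickups per rounded (lat, lon) cell.

    `rows` is any iterable of dicts with the assumed keys
    'pickup_latitude' and 'pickup_longitude'.
    """
    counts = Counter()
    for row in rows:
        try:
            lat = float(row["pickup_latitude"])
            lon = float(row["pickup_longitude"])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or malformed coordinates
        if lat == 0.0 and lon == 0.0:
            continue  # skip rows where the coordinates are just zeros
        counts[(round(lat, precision), round(lon, precision))] += 1
    return counts


def write_heatmap_csv(counts, out):
    """Write one 'lat,lon,value' line per cell (assumed output layout)."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lat", "lon", "value"])
    for (lat, lon), n in sorted(counts.items()):
        writer.writerow([lat, lon, n])


# Tiny made-up sample, only to show the shape of the input.
sample = io.StringIO(
    "pickup_latitude,pickup_longitude\n"
    "40.75012,-73.99321\n"
    "40.75049,-73.99288\n"
    "0,0\n"
    "40.64130,-73.77810\n"
)
counts = bin_pickups(csv.DictReader(sample))
out = io.StringIO()
write_heatmap_csv(counts, out)
print(out.getvalue())
```

Rounding to three decimal places groups points into cells roughly a city block wide. Lower the precision for a coarser map or raise it for a finer one.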

The data is BigQuery'd up and ready to go here.
