liwencong1995 / SDS430D-Honors-Project-2017-2018

How much NYC taxi data do I want to load into MySQL?

liwencong1995 opened this issue

Uber Launch: May 4, 2011
Uber data covers: 2014-2015
Lyft Launch: July 25, 2014
Lyft data covers: 2015-now

Taxi yellow data:
2010 - 2016

Taxi green data:
Aug 2013 - 2016

I am thinking about deleting the yellow taxi data from 2010 and 2016 in order to save space.
I tried loading different numbers of years of data into the MySQL database and then running simple queries. Only with a single year of data loaded did a query return output before timing out.
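
One way to keep only a single year before loading is to filter the raw CSV first, so MySQL never sees the other rows. This is a rough sketch, not what the project actually ran: the function name `write_year_subset`, the column name `pickup_datetime`, and the timestamp layout `YYYY-MM-DD HH:MM:SS` are all assumptions for illustration, and the real files may name the pickup column differently.

```python
import csv
import io
from datetime import datetime


def write_year_subset(src, dst, year, column="pickup_datetime"):
    """Copy rows whose pickup timestamp falls in `year` from src to dst.

    `src` and `dst` are open text file objects. Returns the number of rows kept.
    The column name and timestamp format are assumptions, not the real schema.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        # Parse the pickup time and keep the row only if the year matches.
        picked_up = datetime.strptime(row[column], "%Y-%m-%d %H:%M:%S")
        if picked_up.year == year:
            writer.writerow(row)
            kept += 1
    return kept


# Tiny in-memory sample standing in for a real trip file.
sample = io.StringIO(
    "pickup_datetime,fare_amount\n"
    "2014-12-31 23:59:00,7.5\n"
    "2015-01-01 00:05:00,12.0\n"
    "2015-06-15 08:30:00,9.0\n"
    "2016-01-01 00:00:00,5.0\n"
)
out = io.StringIO()
kept = write_year_subset(sample, out, 2015)
print(kept, "rows kept for 2015")
```

The rows are streamed one at a time, so the full file never has to fit in memory; the filtered output can then be loaded into MySQL in place of the original file.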

Uber: over 4.5 million pickups from Apr to Sep 2014 and 14.3 million pickups from Jan to June 2015
Lyft: 2015-2016
Yellow: 2015
Green: Aug 2013-2016

Only the 2017 yellow taxi data was used in the final analysis.