DawnnnJ / Where-should-a-taxi-driver-pickup-passengers

Find the locations with most pick-ups and lucrative trips in NYC using clustering analysis.

Where-should-a-taxi-driver-pickup-passengers

Keywords: cluster analysis, Python, Google Maps API

Introduction

This project analyzes taxi data from New York City. It uses cluster analysis to identify the locations with the most pick-ups and the locations that generate the most lucrative trips. The results are presented using the Google Maps API. They can help taxi drivers decide where to wait for passengers.

Data

NYC cab data is available from the NYC Taxi & Limousine Commission’s Trip Record Data site: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml.

The demonstration code uses the March 2016 green cab data downloaded from the link above.

Analysis

Use k-means cluster analysis to identify:

  • Pick-up locations with the most pick-ups.
  • Pick-up locations of the most lucrative trips.
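As a rough illustration of the k-means step, here is a minimal standard-library sketch of Lloyd's algorithm applied to (latitude, longitude) pairs. It is not the code in cluster_analysis.py, which may well use a library implementation. The function name `kmeans`, the sample coordinates, and the use of plain squared distance on degrees (a simplification that is tolerable at city scale) are all assumptions made for this example.

```python
import random
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: (point[0] - centroids[c][0]) ** 2
        + (point[1] - centroids[c][1]) ** 2,
    )


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    # Final assignment so the labels match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Made-up pick-up coordinates: six in one area, three in another.
pickups = [(40.750 + 0.001 * i, -73.990) for i in range(6)]
pickups += [(40.680, -73.950 + 0.001 * i) for i in range(3)]

centroids, labels = kmeans(pickups, k=2)

# Rank clusters by how many pick-ups they contain, busiest first.
ranked = [(centroids[c], n) for c, n in Counter(labels).most_common()]
```

Ranking the clusters by member count gives the centroids with the most pick-ups. Running the same clustering on only the trips judged lucrative would give the second set of locations.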

Here we define lucrative trips as those generating the highest fare for the least amount of time spent.
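One way to make this definition concrete is to compute the fare earned per minute of trip time. The sketch below does that for a few made-up trips. The field names (`pickup_datetime`, `dropoff_datetime`, `fare_amount`) and the timestamp format are assumptions for illustration and may not match the column names in the TLC files or the logic in cluster_analysis.py.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def fare_per_minute(trip):
    """Fare divided by trip duration in minutes; None if the duration is not positive."""
    start = datetime.strptime(trip["pickup_datetime"], FMT)
    end = datetime.strptime(trip["dropoff_datetime"], FMT)
    minutes = (end - start).total_seconds() / 60.0
    if minutes <= 0:
        return None  # zero-length or reversed trips cannot be rated
    return float(trip["fare_amount"]) / minutes


# Made-up example rows, not taken from the real data set.
trips = [
    {"pickup_datetime": "2016-03-01 08:00:00",
     "dropoff_datetime": "2016-03-01 08:10:00", "fare_amount": "12.50"},
    {"pickup_datetime": "2016-03-01 09:00:00",
     "dropoff_datetime": "2016-03-01 09:30:00", "fare_amount": "18.00"},
    {"pickup_datetime": "2016-03-01 10:00:00",
     "dropoff_datetime": "2016-03-01 10:00:00", "fare_amount": "5.00"},
]

rates = [fare_per_minute(t) for t in trips]

# Keep only rateable trips, most lucrative first.
rated = [(r, t) for r, t in zip(rates, trips) if r is not None]
rated.sort(key=lambda pair: pair[0], reverse=True)
```

How many of the top-rated trips count as "lucrative" (a fixed number, a percentile, or a threshold) is a separate choice that this sketch leaves open.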

Code:

  • cluster_analysis.py --> location.csv
  • cat heatmap-start.txt > heatmap.html
  • python latlng.py location1.csv >> heatmap.html
  • cat heatmap-end.txt >> heatmap.html
  • open heatmap.html

Output

The interactive output can be found in the googlemap repository.

  • Pick-up locations with the most pick-ups.

[Image: us_map 1]

  • Pick-up locations of the most lucrative trips.

[Image: us_map 3]

Reference: https://github.com/parrt/msan692/blob/master/notes/sfpd.md
