In this project, I used New York City's CitiBike data from March of each year from 2014 to 2018. Restricting the data to one month per year made it easier to work with while still providing 3.5 million records and showing changes in CitiBike usage over time.
Data cleaning was done in the attached Jupyter Notebook, where some fields were changed and columns were added for clarity.
The Tableau workbook was pushed to GitHub using Git Large File Storage.
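As a rough illustration of the kind of cleaning described above, here is a minimal pandas sketch. It is not the notebook's actual code. The input column names (`tripduration` in seconds, `gender` coded as 0/1/2, `birth year`) are assumptions about the raw CitiBike export, and the function name and the added columns (`trip_minutes`, `rider_age`, `year`) are hypothetical.

```python
import pandas as pd


def clean_trips(raw: pd.DataFrame, year: int) -> pd.DataFrame:
    """Return a cleaned copy of one March trip file (illustrative only)."""
    df = raw.copy()  # leave the caller's DataFrame untouched

    # Assumed coding of the raw gender field: 0 = unknown, 1 = male, 2 = female.
    df["gender"] = df["gender"].map({0: "Unknown", 1: "Male", 2: "Female"})

    # Assumed: tripduration is recorded in seconds; minutes are easier to read.
    df["trip_minutes"] = df["tripduration"] / 60.0

    # Approximate rider age; stays missing (NaN) when birth year is missing.
    df["rider_age"] = year - df["birth year"]

    # Tag each row with its year so the five March files can be stacked.
    df["year"] = year
    return df
```

Each year's file could be passed through a function like this and the results combined with `pd.concat` before loading them into Tableau.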
Key findings:
While overall rides have drastically increased from 2014 to 2018, there were decreases from 2014 to 2015 and from 2016 to 2017. These decreases appear to be related to software issues and a large percentage of bikes being out of service (CitiBike's goal is for 90% of its bikes to be in service, and in both of those years that number dipped below 50%).
Stations that are close to other stations tend to have shorter average trip durations, while the more remote stations have longer trip durations. Stations that are closer together also tend to be more popular in terms of usage. These findings are unsurprising, as the density of the stations tends to reflect population density.
Female ridership has increased at a greater rate than male ridership, but male ridership is still more than three times as large. On average, female riders ride for a longer duration than male riders. This is true across all age groups.
Surprisingly, age has little to do with the average duration of a trip. If anything, elderly riders tend to ride for longer than younger riders, though this isn't conclusive due to the relatively small sample size of elderly riders.
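The gender and age comparisons above can be sketched in pandas as a grouped average. This is only an illustration under assumptions, not the analysis behind the Tableau workbook. The column names (`rider_age`, `gender`, `trip_minutes`), the function name, and the age bin edges are all hypothetical. The ride count is returned alongside the mean so that small groups, such as elderly riders, are easy to spot.

```python
import pandas as pd


def mean_duration_by_group(trips: pd.DataFrame) -> pd.DataFrame:
    """Average trip length and ride count per age group and gender."""
    # Hypothetical age bins; each bin includes its left edge only.
    bins = [0, 25, 35, 45, 55, 65, 120]
    labels = ["<25", "25-34", "35-44", "45-54", "55-64", "65+"]

    binned = trips.assign(
        age_group=pd.cut(trips["rider_age"], bins=bins, labels=labels, right=False)
    )

    # observed=True keeps only the age groups that actually occur in the data.
    return (
        binned.groupby(["age_group", "gender"], observed=True)["trip_minutes"]
        .agg(mean_minutes="mean", rides="count")
        .reset_index()
    )
```

Comparing `mean_minutes` across rows with very different `rides` values is where the sample-size caveat matters: a mean computed from a handful of rides is much noisier than one computed from hundreds of thousands.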