Insights from NYC Taxi Data (Part II)

Using a random sample of 100,000 transactions from Chris Whong's FOIL'ed (Freedom of Information Law) data from the Taxi & Limousine Commission of NYC, I started to look at when transactions occur. There are times when it seems impossible to catch a cab. Every cab that passes you is already taken, and every time you see an available one, someone upstreams you or hails it from across the street.

I subsetted the data to only look at weekday traffic, and plotted the number of transactions by the hour of the day. 
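As a rough sketch of that weekday subset and hourly count, the snippet below uses only the Python standard library. The function name `weekday_trips_by_hour`, the sample timestamps, and the `YYYY-MM-DD HH:MM:SS` pickup-time format are assumptions made for illustration, not details taken from the original analysis.

```python
from collections import Counter
from datetime import datetime


def weekday_trips_by_hour(pickup_times):
    """Count weekday (Mon-Fri) trips for each hour of the day (0-23)."""
    counts = Counter()
    for stamp in pickup_times:
        # Assumed timestamp format; adjust to match the actual data file.
        pickup = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if pickup.weekday() < 5:  # Monday is 0, Sunday is 6
            counts[pickup.hour] += 1
    # Return a 24-element list so hours with no trips show up as 0.
    return [counts.get(hour, 0) for hour in range(24)]


# Made-up sample rows, only to show the shape of the input.
sample = [
    "2013-01-07 08:45:00",  # Monday
    "2013-01-07 08:59:10",  # Monday
    "2013-01-08 19:45:30",  # Tuesday
    "2013-01-12 19:10:00",  # Saturday, dropped by the weekday filter
]
hourly = weekday_trips_by_hour(sample)
print(hourly[8], hourly[19])
```

The resulting list can be handed directly to any plotting tool to draw transactions against hour of day.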

Not surprisingly, there's very little cab activity at 4AM. Then there's a surge right before work starts, at around 9AM, followed by a strange midday peak at around 1:30PM. The number of transactions peaks during the PM rush hour, at around 7:45PM, and stays relatively high through the after-work hours (maybe people go out for drinks after work and then rush home). In general, the PM rush hour has many more transactions than the AM rush hour. Maybe people tend to take the subway to work and then, after a long day, general exhaustion, and some decision fatigue, opt to take a cab back home. (You could test that hypothesis if the MTA released data on the number of people riding the subway at a given time.)

Since the data set also offers the total distance and the total trip time, I decided to calculate the average speed of each cab ride. Presumably, this includes time spent waiting at red lights, sitting in traffic, and paying the fare, because the average speed is actually pretty slow:

You might expect the average speed of a trip to be related to the number of cars on the road, and it seems like it is. Here is a graph of average trip speed by time of day:

What does this say? You can catch a really fast cab between 4 and 5 in the morning. During rush hour and workday hours, you might as well take a Citibike. But if it's getting late and you just want to get home fast and with little effort, a cab seems to be the way to go.
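For anyone who wants to reproduce the speed-by-hour curve, here is a minimal sketch of one way to do it. It assumes each trip is reduced to a pickup hour, a distance in miles, and a duration in seconds, and it averages the per-trip speeds within each hour; the function name `mean_speed_by_hour`, the units, the sample values, and the choice to skip trips with zero distance or time are all my assumptions rather than the exact method behind the graph.

```python
from collections import defaultdict


def mean_speed_by_hour(trips):
    """Average per-trip speed (mph) for each pickup hour.

    trips: iterable of (pickup_hour, distance_miles, duration_seconds).
    """
    speeds = defaultdict(list)
    for hour, miles, seconds in trips:
        # Skip records that cannot yield a meaningful speed (assumed cleanup rule).
        if seconds <= 0 or miles <= 0:
            continue
        speeds[hour].append(miles / (seconds / 3600.0))
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(speeds.items())}


# Made-up sample trips, only to show the shape of the input.
sample_trips = [
    (4, 5.0, 600),   # 5 miles in 10 minutes
    (4, 2.0, 360),   # 2 miles in 6 minutes
    (18, 1.5, 900),  # 1.5 miles in 15 minutes
    (18, 0.0, 0),    # unusable record, skipped
]
speed_by_hour = mean_speed_by_hour(sample_trips)
print(speed_by_hour)
```

Averaging per-trip speeds weights every ride equally; dividing total distance by total time within each hour is another reasonable definition and would weight longer rides more heavily.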