COOL CITIBIKE DATA VISUALIZATION

nyc citibike trips in september

(click on the image to see the video)

Hello! I moved to New York City this month. I still have big plans for chs-tonight, but I wanted to do something related to NYC. So I dug around and found that CitiBike posts their usage data and decided to make a little visualization of that data.

Check out the code here. As it says, this is heavily indebted to nyc-taxi.

I chose the dataset for September. It's a CSV file with a little less than 1.3 million records in it. Cool. Using byline and moment, I iterated over each line of the file and extracted/formatted the start & end coordinates and the datetime of the trip:

function parseLine(line) {
  var cells = line.split(',')
  var startLat = parseFloat(cells[5].slice(1,-1))
  var startLng = parseFloat(cells[6].slice(1,-1))
  var endLat = parseFloat(cells[9].slice(1,-1))
  var endLng = parseFloat(cells[10].slice(1,-1))
  var startTime = cells[1].slice(1,-1).split('/').join(' ').split(' ')
  if(startTime[0].length<2) {startTime[0] = '0'+startTime[0]}
  if(startTime[1].length<2) {startTime[1] = '0'+startTime[1]}
  var endTime = cells[2].slice(1,-1).split('/').join(' ').split(' ')
  if(endTime[0].length<2) {endTime[0] = '0'+endTime[0]}
  if(endTime[1].length<2) {endTime[1] = '0'+endTime[1]}
  var trip = {
    startTime: moment(startTime.join('-').split(':').join('-'), "MM-DD-YYYY-HH-mm-ss"),
    endTime: moment(endTime.join('-').split(':').join('-'), "MM-DD-YYYY-HH-mm-ss"),
    coordinates: [[+startLat,+startLng], [+endLat,+endLng]],
  }
  return trip
}
var stream = byline(fs.createReadStream('./data/201509-citibike-tripdata.csv', {encoding: 'utf8'}))
stream.on('data', function(line) {
  var trip = parseLine(line)
}

Cool. Now that I had the star and end coordinates, I used OSRM to create routes based on their bicycle profile. What this means is that I don't have exact routes each rider took, but assuming each rider obeyed traffic laws and took the most efficient route to their destinations, this should be pretty close to what they did:

osrm.route({coordinates: trip.coordinates, printInstructions: true}, function(err, route) {
  if(route.route_geometry) {
    var coords = polyline.decode(route.route_geometry, 6)
  }
})

Each route segment was stored in a LevelDB, which resulted in over 48 million segments.

I used geojson.io to get bbox coordinates, took a screenshot of the areas I wanted to examine, and used node-canvas to render frames of trips overtop the screenshot. Of course, this was a bad idea.

After a while, I had a folder with more than 15,000 frames, weighing in at around 32 gigabytes. My computer ran out of harddrive space and started freaking out, so I stopped the render process and deleted a good portion of the frames.

So, the video you see at the top is sadly not representative of all NYC CitiBike trips in September. If anyone wants to buy me a much bigger hard drive, I'd be glad to do this all again and include all the trips.

Now, what's the first thing you notice about CitiBike activity in that video? You're completely right: There's not much activity in Manhattan above 59th Street or Brooklyn and none in Queens. Any idea why?

citibike stations manhattan citibike stations queens

There aren't any stations in those places! I am an analyst now. Thank you, thank you.

This was a pretty fun project and I got some cool visualizations out of it. For your viewing pleasure, there are two more videos here, one for lower Manhattan and one for Brooklyn. Enjoy.

manhattan citibike trips in september

brooklyn citibike trips in september

11-25-15