Follow-up post

See here for my follow-up post with more data and comparisons of Dublin with several other cities.


The Dublin Bikes scheme, run by the French advertising multinational J.C. Decaux, has been in operation for about 7 weeks now. J.C. Decaux report that over 16,000 people have signed up so the scheme seems to be popular. It has certainly been very useful to me on a few occasions.

Shortly after the scheme began I noticed that there was an iPhone app, written by Fusio, which could be used to find out how many bikes were docked at each of the 40 stations around the city in "real-time". Unfortunately J.C. Decaux succeeded in getting Fusio to withdraw their app only a few days after it appeared, but the app gave me an idea. I thought it might be interesting to keep a record of how many bikes were available at the various stations around the city and see what I could learn by analysing the data at a later date.

I discovered that the information I was interested in is available on the Dublin Bikes website and that the data could be easily obtained at URLs like this. I thus set a simple Python script running to poll these URLs about once a minute for each station. This script has been running since 19 Sep 2009 so I have 40 days of data at the time of writing, 29 Oct 2009. Since I've had some spare time recently, I thought that it was about time to try analysing my data a little.
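A polling loop of this kind might look roughly like the sketch below. It is not the script used for this post: `fetch_bikes` is a hypothetical callable standing in for whatever request and parsing the station URLs need, and the CSV file name and column layout are assumptions made for illustration.

```python
import csv
import time
from datetime import datetime, timezone


def poll_once(station_ids, fetch_bikes, now):
    """Return one (ISO timestamp, station id, docked bikes) row per station.

    fetch_bikes is a hypothetical function: station id -> number of docked bikes.
    """
    stamp = now.isoformat()
    rows = []
    for station_id in station_ids:
        try:
            rows.append((stamp, station_id, fetch_bikes(station_id)))
        except Exception:
            # A failed request simply leaves a gap for this station and minute.
            continue
    return rows


def run_forever(station_ids, fetch_bikes, path="bikes.csv", interval_s=60):
    """Append one batch of rows to `path` roughly every `interval_s` seconds."""
    station_ids = list(station_ids)
    while True:
        rows = poll_once(station_ids, fetch_bikes, datetime.now(timezone.utc))
        with open(path, "a", newline="") as f:
            csv.writer(f).writerows(rows)
        time.sleep(interval_s)
```

Keeping the fetch step separate from the loop means the record format can be checked with a fake `fetch_bikes` before pointing it at a real site.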

In addition to gathering the data from the web, I decided to estimate how long it takes for an event (i.e. a bike being docked/undocked) to reach the above URLs. I thus spent about 15 minutes beside a station on the way home from work one evening and watched as bikes were docked/undocked while simultaneously polling the data from the web. It looked like the data took about 90 seconds to update.

Raw data

I thought I'd start by displaying the data in a fairly raw form. I thus generated a graph showing bike usage across all stations as a function of time. To generate a data-point for the below graph I added up the total number of docked bikes across all of the 40 stations at a given moment in time. The red line in the graph thus displays the number of bikes available throughout the city as a function of time. The blue line displays the maximum number of bikes that have ever been available up to that time (it has increased as J.C. Decaux have added bikes to the scheme).
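The two lines described above can be sketched as follows. This is a minimal illustration, assuming the records have already been aligned so that every station has one reading per timestamp; the `(timestamp, station, bikes)` tuple layout is an assumption, not the format actually used for this post.

```python
from collections import defaultdict
from itertools import accumulate


def total_available(records):
    """Sum docked bikes over all stations for each timestamp (the red line).

    records: iterable of (timestamp, station, bikes).
    Returns a list of (timestamp, total) sorted by timestamp.
    """
    totals = defaultdict(int)
    for timestamp, _station, bikes in records:
        totals[timestamp] += bikes
    return sorted(totals.items())


def running_maximum(series):
    """Largest total seen so far at each timestamp (the blue line)."""
    times = [t for t, _ in series]
    values = [v for _, v in series]
    return list(zip(times, accumulate(values, max)))
```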

Total number of available bikes as a function of time

In the middle of the night, when it is not possible to undock a bike and when most people have returned their bikes, the red line moves horizontally. If all bikes were always returned to a station at night and no bikes were ever decommissioned then the red line would always meet the blue line at night. Evidently this is not the case. We can see from the blue line that the maximum number of bikes that has ever been available is 449. However this was only achieved on one date, 11 Oct 2009, and since 22 Oct 2009 there have never been more than 425 bikes available. It looks like J.C. Decaux have either decommissioned some bikes or they keep a cache of about 25 bikes off the streets. (Incidentally there are 795 docking pillars distributed across the 40 stations around the city.)

Note also the three smallest local minima taken by the red line (the obvious downward spikes). These occur on the dates: 6, 18, 26 Oct 2009. These do not represent enormous numbers of people suddenly undocking bikes. They either mean that most of the bike stations were temporarily suspended on these dates or that J.C. Decaux's website was not displaying the correct number of available bikes. At least for 26 Oct 2009 the former explanation is probably the case. This was a bank holiday Monday and also the day of the Dublin city marathon. I walked by a couple of bike stations on this day and noted that they were suspended.

Other than the above three dates there have never been fewer than 300 bikes available, meaning that no more than about 150 bikes have ever been in use. We can thus see that at peak times bike usage reaches about 33%.
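The arithmetic behind that figure is spelled out below as a small sketch. It takes the all-time maximum of 449 bikes as the fleet size, which is one possible choice of denominator; using the more recent ceiling of 425 would give a slightly lower percentage.

```python
def peak_usage(fleet_size, min_available):
    """Fraction of the fleet in use when the fewest bikes are docked."""
    return (fleet_size - min_available) / fleet_size


# 449 - 300 = 149 bikes in use at the busiest moment observed.
print(f"{peak_usage(449, 300):.0%}")  # prints 33%
```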

Let's zoom in and take a closer look at a region of the above chart.

Total number of available bikes as a function of time

If we look closely it looks like there is a fairly consistent pattern of usage in the above. For example, there usually tend to be 3 local minima on weekdays. I thought I'd spend a little time investigating this.

Times of day

Motivated by the above, let's investigate the busiest and quietest times of day. I can think of two different interpretations of what we might call a "busy" time of day from the point of view of Dublin Bikes usage. They seem likely to correlate fairly strongly with each other but they do measure different sorts of busyness. These are:
  • Times of day with the greatest number of bikes in use (i.e. the fewest bikes docked at the stations).
  • Times of day when the turnover of bikes was the highest (i.e. the rate at which bikes are docked and undocked is the greatest).
For this post, I decided to focus on the first of the above two types of busyness. I thus divided the time of day up into 5 minute chunks and looked at the average number of bikes available (i.e. docked at a station) across all stations for each little 5 minute chunk. For obvious reasons, I chose to carry this out separately for weekdays and weekends. I graphed the results and obtained the following:

    Average number of available bikes on weekdays as function of time of day

    Average number of available bikes on weekends as function of time of day

    The weekday graph is the more interesting of the two, showing clearly (I think) that people are using these bikes to go to and return from work and, to a lesser extent, to get around at lunchtime. The most heavily used times are: 08:50-08:55, 13:20-13:25 and 17:50-17:55.
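The 5 minute binning described above can be sketched as follows. This is only an illustration under assumptions: the input is taken to be a series of `(datetime, total available bikes)` pairs, and bins are keyed by their start time in minutes after midnight.

```python
from collections import defaultdict
from datetime import datetime


def average_by_time_of_day(series, bin_minutes=5):
    """Average available bikes per time-of-day bin, split by weekday/weekend.

    series: iterable of (datetime, total available bikes).
    Returns {"weekday": {bin_start_minute: mean}, "weekend": {...}}.
    """
    sums = {"weekday": defaultdict(float), "weekend": defaultdict(float)}
    counts = {"weekday": defaultdict(int), "weekend": defaultdict(int)}
    for when, total in series:
        # datetime.weekday() is 0 for Monday, so 5 and 6 are Saturday and Sunday.
        kind = "weekend" if when.weekday() >= 5 else "weekday"
        minute_of_day = when.hour * 60 + when.minute
        bin_start = minute_of_day - minute_of_day % bin_minutes
        sums[kind][bin_start] += total
        counts[kind][bin_start] += 1
    return {
        kind: {b: sums[kind][b] / counts[kind][b] for b in sorted(sums[kind])}
        for kind in sums
    }


def busiest_bin(averages):
    """Bin start (minutes after midnight) with the fewest available bikes."""
    return min(averages, key=averages.get)
```

With this keying, the 08:50-08:55 chunk corresponds to bin 530.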

    One thing I like about the above simple results is that they give us information about the average length of the working day in Dublin. With this in mind, the following questions occurred to me:

  • How do the above results compare to those for, say, Paris (particularly as the French are often alleged to work shorter hours than people in other countries)? Perhaps I will start recording that Parisian data after all!
  • Another way to continue these investigations would be to look at the busiest times of day on a per station basis. The stations grouped around the IFSC come to mind. This is partly because I work there and partly because I suspect the IFSC is a fairly homogeneous area with a concentration of people working slightly longer hours than the average for the city. I think it would be interesting if it was possible to demonstrate/quantify/refute this claim using this bike data. (It is worth pointing out that I would probably need more data to get statistically significant results here: the fewer the stations, the "noisier" the data. For example, all else being equal, if there are 4 IFSC stations then to compensate for using 4 stations instead of 40, I would need 60 weeks of data instead of 6.)
  • A final thought here is to try to estimate if, on average, people leave work earlier/later on Fridays or arrive earlier/later on Mondays etc.
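The "60 weeks instead of 6" figure in the IFSC point above follows from keeping the total number of station-weeks fixed. The sketch below assumes each station-week is an independent, equally noisy observation, so the noise in an average falls like one over the square root of the number of station-weeks; real stations will not satisfy this exactly.

```python
import math


def weeks_needed(weeks_all, stations_all, stations_subset):
    """Weeks of data a subset needs to match the station-weeks of the full set."""
    return weeks_all * stations_all / stations_subset


def relative_noise(stations, weeks):
    """Noise of an average, up to a constant factor, under the independence assumption."""
    return 1 / math.sqrt(stations * weeks)


print(weeks_needed(6, 40, 4))  # 60.0
```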
Busy stations and quiet stations

    Another thing which I thought it might be interesting to do was to measure busyness on a per station basis. There is no analogue of the type of busyness I looked at above for times of day since there is no fixed set of bikes which "belong" to a given station. However the second type of busyness, turnover of bikes, makes perfect sense on a per station basis. I thus counted how many times the number of bikes at a station changed from one record to the next in my data-set. Each time the number of bikes at a station changes I know that somebody either removed or returned a bike at that station. We must bear in mind the 90 second delay here. It will dampen the effect we're trying to measure in our data but it seems unlikely that it should bias our results.
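The change-counting described above can be sketched like this, assuming records of the form `(timestamp, station, bikes)`; the layout and function name are illustrative rather than taken from the script behind this post.

```python
from collections import defaultdict


def count_changes(records):
    """Count, per station, how often consecutive readings differ.

    records: iterable of (timestamp, station, bikes).
    Returns {station: number of changes between consecutive records}.
    """
    by_station = defaultdict(list)
    for timestamp, station, bikes in records:
        by_station[station].append((timestamp, bikes))

    changes = {}
    for station, rows in by_station.items():
        rows.sort()  # order each station's readings by timestamp
        counts = [bikes for _, bikes in rows]
        changes[station] = sum(a != b for a, b in zip(counts, counts[1:]))
    return changes
```

A bike docked and another undocked between two polls cancel out, so the count is a lower bound on the true number of events.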

    Carrying out the above I found that the busiest station was Custom House Quay and the quietest (by far) was Hardwicke Street. In an attempt to give some idea of the distribution of this per-station-busyness I present a table of relative busyness below. The number opposite each station is its busyness relative to Custom House Quay.

    List of stations in increasing order of relative busyness for full day.
    Hardwicke Street 0.20
    Blessington Street 0.31
    Parnell Square North 0.35
    Georges Quay 0.36
    Greek Street 0.40
    Fitzwilliam Square West 0.43
    Bolton Street 0.44
    Cathal Brugha Street 0.46
    Eccles Street 0.48
    Custom House 0.49
    Golden Lane 0.51
    Ormond Quay Upper 0.52
    Leinster Street South 0.53
    James Street East 0.54
    Mountjoy Square West 0.56
    Townsend Street 0.56
    Christchurch Place 0.57
    Earlsfort Terrace 0.57
    St. Stephen's Green East 0.57
    Parnell Street 0.59
    Merrion Square West 0.62
    High Street 0.62
    Merrion Square East 0.63
    Jervis Street 0.65
    Dame Street 0.65
    Fownes Street Upper 0.65
    Talbot Street 0.65
    Molesworth Street 0.67
    Portobello Harbour 0.75
    St. Stephen's Green South 0.76
    Charlemont Street 0.77
    Wilton Terrace 0.79
    Princes Street / O'Connell Street 0.82
    Grantham Street 0.87
    Chatham Street 0.89
    Exchequer Street 0.90
    Smithfield 0.91
    Herbert Place 0.94
    Pearse Street 0.95
    Custom House Quay 1.00
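Relative figures like those in the list above can be derived from raw change counts by dividing by the largest count. The sketch below uses made-up counts purely for illustration; the raw counts behind the table are not given in this post.

```python
def relative_busyness(changes):
    """Scale per-station change counts so the busiest station scores 1.0.

    changes: {station: change count}, with at least one non-zero count.
    Returns (station, relative busyness) pairs in increasing order.
    """
    busiest = max(changes.values())
    scaled = [(station, round(n / busiest, 2)) for station, n in changes.items()]
    return sorted(scaled, key=lambda item: item[1])


# Hypothetical counts, not the real data.
example = {"Station A": 200, "Station B": 1000, "Station C": 950}
print(relative_busyness(example))
```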

    Ranking stations by busyness, emptiness and "fullness".

    The above table provides the ranking of stations as regards busyness. It occurred to me to wonder how much this ranking would change if I restricted my attention to busyness for some fixed period of the day. I thus decided to calculate a busyness ranking for 8-10am on weekdays. In addition I decided it might be interesting to investigate which stations are most often empty (i.e. have no bikes docked) and which stations are most often full (i.e. have the maximum number of possible bikes docked). The below table contains these rankings in alphabetical order of station.
    Ranks run from 1 to 40. A higher rank means busier (busyness columns), more often empty (emptiness column) or more often full (fullness column).

    Station                            Busyness    Busyness           Emptiness  Fullness
                                       (full day)  (8-10am weekdays)
    Blessington Street                 2           3                  28         15
    Bolton Street                      7           12                 6          22
    Cathal Brugha Street               8           8                  20         13
    Charlemont Street                  31          36                 11         39
    Chatham Street                     35          31                 33         21
    Christchurch Place                 17          18                 7          8
    Custom House                       10          15                 3          35
    Custom House Quay                  40          39                 27         30
    Dame Street                        25          17                 22         27
    Earlsfort Terrace                  18          27                 32         28
    Eccles Street                      9           7                  19         31
    Exchequer Street                   36          25                 34         16
    Fitzwilliam Square West            6           11                 31         17
    Fownes Street Upper                26          14                 5          24
    Georges Quay                       4           9                  4          29
    Golden Lane                        11          16                 8          3
    Grantham Street                    34          37                 13         36
    Greek Street                       5           6                  24         38
    Hardwicke Street                   1           1                  36         1
    Herbert Place                      38          38                 14         25
    High Street                        22          22                 23         26
    James Street East                  14          19                 38         18
    Jervis Street                      24          5                  9          11
    Leinster Street South              13          13                 29         23
    Merrion Square East                23          24                 35         19
    Merrion Square West                21          30                 40         14
    Molesworth Street                  28          26                 37         6
    Mountjoy Square West               15          23                 18         34
    Ormond Quay Upper                  12          10                 25         4
    Parnell Square North               3           2                  12         2
    Parnell Street                     20          4                  2          12
    Pearse Street                      39          40                 16         5
    Portobello Harbour                 29          33                 15         40
    Princes Street / O'Connell Street  33          20                 26         9
    Smithfield                         37          34                 17         37
    St. Stephen's Green East           19          21                 39         10
    St. Stephen's Green South          30          35                 30         7
    Talbot Street                      27          29                 10         33
    Townsend Street                    16          28                 1          32
    Wilton Terrace                     32          32                 21         20

    Hardwicke Street stands out a mile. It is the least busy, the least often full and almost the most often empty. Other notable stations are the joint busiest: Custom House Quay and Pearse Street. Also, Merrion Square West is most often empty, Townsend Street is least often empty and Portobello Harbour is most often full.
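Emptiness and fullness rankings of this kind could be computed along the following lines. This is a sketch under assumptions: `capacity` is a hypothetical mapping from station to its number of docking pillars, and ties are broken alphabetically, which may differ from how the ranks above were produced.

```python
from collections import defaultdict


def empty_full_fractions(records, capacity):
    """Fraction of readings at which each station was empty or full.

    records: iterable of (timestamp, station, bikes).
    capacity: {station: number of docking pillars} (assumed to be known).
    Returns {station: (fraction empty, fraction full)}.
    """
    seen = defaultdict(int)
    empty = defaultdict(int)
    full = defaultdict(int)
    for _timestamp, station, bikes in records:
        seen[station] += 1
        empty[station] += bikes == 0
        full[station] += bikes >= capacity[station]
    return {s: (empty[s] / seen[s], full[s] / seen[s]) for s in seen}


def rank(scores):
    """Rank stations so that 1 is the lowest score; ties are broken by name."""
    ordered = sorted(scores, key=lambda s: (scores[s], s))
    return {station: position for position, station in enumerate(ordered, start=1)}
```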

    Further thoughts

    I found this quite an interesting little project and so far I have only looked into a few very simple questions. When I have more energy, time and data I'd quite like to do some more substantial analysis. In addition to questions mentioned above, the below also seem potentially interesting to me:
  • Do particularly quiet days correlate with daily rainfall (in mm)?
  • How much is total daily usage increasing as a function of time? (I've actually looked at this briefly but it's quite noisy and would be better studied with several months of data.)
  • Can we identify particular stations as bike-sources (where bikes tend to be taken from rather than returned to) or vice-versa? Perhaps this question might be most interesting at critical times of day, like 8-9am on weekdays.
  • J.C. Decaux has vans redistributing bikes around the city to try to deal with bike clustering. Can we see the results of this van-action in our data (e.g. occasions when the number of bikes at a station with particularly few bikes suddenly jumps) and estimate how much of this is taking place? (and why are they neglecting poor Hardwicke Street?)
  • Is a particular day of the week busiest?
  • Could I produce a nice heat-map type animation of the bikes moving around the city? (Ben suggested this interesting idea.)
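As a starting point for the van question above, sudden refills of a nearly empty station could be flagged with something like the sketch below. The thresholds `min_jump` and `low_level` are arbitrary guesses, and a large group of riders arriving together would be flagged too, so this can only suggest candidates rather than prove that a van was involved.

```python
def sudden_jumps(rows, min_jump=5, low_level=3):
    """Find readings where a nearly empty station suddenly gains many bikes.

    rows: list of (timestamp, bikes) for one station, sorted by timestamp.
    Returns (timestamp, bikes before, bikes after) for each flagged jump.
    """
    return [
        (t2, before, after)
        for (_t1, before), (t2, after) in zip(rows, rows[1:])
        if before <= low_level and after - before >= min_jump
    ]
```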
Postscript on data sources

    As mentioned above, the bike data can be easily obtained at URLs like this but there are a few other sources. In particular, J.C. Decaux's own mobile app 'allbikesnow', previously called 'abikenow', uses a different source of data which has the interesting property that results come with a timestamp.

    I am indebted to Richard Bean for pointing out that some of the URLs have changed. For example the old app's URLs like this or this no longer work. Furthermore J.C. Decaux have gone to some trouble to put some rather flimsy security in place on their new app, presumably to frustrate attempts at using their API outside of their own app. I thought I would record the details of how to get around this pseudo-security here for the benefit of anyone interested in using the 'allbikesnow' data.

    The base URL is here but you need to jump through a couple of hoops to get it to work for you. First you need a token which you can obtain by visiting this URL. Tokens seem to be valid for 10 minutes. For example, I just queried and received the following JSON response:

    With this token in hand, you can now get the bike data by substituting it into URLs like this. For example I just queried and got the following JSON response:
    This gives bike availability data for all stations in the city. To get the city code used in the above URL ("dublin" in the above example) you can visit a URL like this which will respond with a blob of JSON containing information about the various cities that J.C. Decaux's API supports.
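Since tokens seem to last about 10 minutes, a long-running poller would need to refresh them periodically. The sketch below shows one way to cache a token and renew it shortly before it expires. `fetch_token` is a hypothetical callable that performs the token request and returns the token string; the 600 second lifetime comes from the observation above and the 30 second safety margin is an arbitrary choice.

```python
import time


class TokenCache:
    """Hold a token and fetch a fresh one once the old one is nearly expired."""

    def __init__(self, fetch_token, lifetime_s=600, margin_s=30, clock=time.monotonic):
        self._fetch_token = fetch_token  # hypothetical: () -> token string
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a token that should still be valid, refreshing if needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch_token()
            self._expires_at = now + self._lifetime_s - self._margin_s
        return self._token
```

Each data request would then call `get()` and substitute the result into the data URL.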
