Rush Hour Dynamics: Using Python to Study the London Underground
Camilla Montonen
PyData Paris 2015
Full slides available at http://cs-with-python.github.io/
1 of 58 10/04/15 00:30
Introduction
Background
Bryn Mawr College 2013
University of Edinburgh 2014
Currently working in QA at Caplin Systems Ltd.
Member of PyLadies London and Women in Data. If you're ever in London, please drop in to one of our meetups!
There are interesting data problems everywhere...
Python gives you the tools, but you have to ask the questions!
Back in August 2014...
Which Tube line should I take to work?
Some days it was all good...
Other days... not so good
A pattern starts to emerge
Source: BBC News (http://news.bbc.co.uk/1/hi/in_pictures/8092917.stm)
Observation: delays or suspensions at one station can affect remote stations
Questions
1. What are the most "important" stations in the London Underground network?
2. How does suspending these "important" stations affect the rest of the network?
Let's bring the Python to the Data
In the beginning, there was the 'Data'
How do I translate a physical map of the London Underground into a graph I can process with Python?
Start
Goal
Data collection:
It would be cool to program some kind of OCR to automatically read the data from the map and produce a data file! But alas, I had to resort to manually creating a data file:
#Station        #Neighbour(line)
Acton Town      Chiswick Park (District), South Ealing (Picadilly), Turnham Green (Picadilly)
Aldgate         Tower Hill (Circle; District), Liverpool Street (Metropolitan; Circle; District)
Aldgate East    Tower Hill (District), Liverpool Street (HammersmithCity; Metropolitan)
Alperton        Sudbury Town (Picadilly), Park Royal (Picadilly)
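A data file in this shape can be read into an adjacency structure with a few lines of plain Python. The sketch below is only an illustration, not the parser used for the talk: the function names `parse_line` and `parse_stations` are hypothetical, and it assumes that the station column is separated from the neighbour column by a tab or by two or more spaces (the real separator is not visible in the slides).

```python
import re

# Matches one "Neighbour Name (Line; Line)" entry in the neighbour column.
NEIGHBOUR = re.compile(r"\s*([^,(]+?)\s*\(([^)]*)\)")


def parse_line(line):
    """Split one data line into (station, {neighbour: [lines]}).

    Assumption: the two columns are separated by a tab or by 2+ spaces.
    """
    parts = re.split(r"\t|\s{2,}", line.strip(), maxsplit=1)
    station = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    neighbours = {}
    for name, lines in NEIGHBOUR.findall(rest):
        neighbours[name] = [item.strip() for item in lines.split(";")]
    return station, neighbours


def parse_stations(text):
    """Build {station: {neighbour: [lines]}} from the whole file contents."""
    graph = {}
    for raw in text.splitlines():
        # Skip blank lines and the '#'-prefixed header row.
        if not raw.strip() or raw.startswith("#"):
            continue
        station, neighbours = parse_line(raw)
        graph[station] = neighbours
    return graph


sample = (
    "#Station\t#Neighbour(line)\n"
    "Aldgate\tTower Hill (Circle; District), "
    "Liverpool Street (Metropolitan; Circle; District)\n"
)
graph = parse_stations(sample)
print(graph["Aldgate"]["Tower Hill"])
```

Keeping the line names per neighbour means the same structure can later be turned into edges with a "line" property.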
Now it's a piece of cake...
... to perform some analysis
Let's go back to our question 1
What is the most "important" station in the London Underground network?
Defining "importance"
Let's talk about betweenness centrality
Betweenness seems like a good metric to measure the "importance" of a station
The higher the betweenness of a station, the more shortest routes between other stations run through it, so the more commuters are likely to pass through it
How can we compute betweenness on our London Underground graph?
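To make the metric concrete before reaching for a library, here is a minimal pure-Python sketch of unnormalised betweenness for an undirected, unweighted graph, following Brandes' algorithm. It is an illustration rather than the code used in the talk: the adjacency-dictionary input format and the toy graph are assumptions, and library implementations may normalise their scores differently.

```python
from collections import deque


def betweenness(adjacency):
    """Unnormalised betweenness for an undirected, unweighted graph.

    adjacency maps each vertex to a list of its neighbours.
    """
    scores = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that also counts shortest paths (sigma).
        order = []
        preds = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {v: s / 2 for v, s in scores.items()}


# Toy graph: A - B - C, with D and E both hanging off C.
toy = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D", "E"],
    "D": ["C"],
    "E": ["C"],
}
toy_scores = betweenness(toy)
print(sorted(toy_scores.items(), key=lambda item: -item[1]))
```

In the toy graph, C lies on the only route between five pairs of other vertices and B on three, so C ranks first, which matches the intuition of an interchange station.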
Graphs and Python: graph-tool
graph-tool is a Python library written by Tiago Peixoto that provides a number of tools for analyzing and plotting graphs.
What can you do with graph-tool?
Create a graph object
In [14]: from graph_tool.all import Graph

         # create a new Graph object
         graph_object = Graph()
Add edges and vertices to the graph
In [15]: # add two vertices
         vertex1 = graph_object.add_vertex()
         vertex2 = graph_object.add_vertex()
In [16]: # add an edge
         edge1 = graph_object.add_edge(vertex1, vertex2)
Create property maps
Helpful for storing information about your nodes and edges
In [17]: # create a property map
         vertex_names = graph_object.new_vertex_property("string")

         # iterate through the vertices in the graph
         for vertex in graph_object.vertices():
             vertex_names[vertex] = "some_name"
Create visualizations
In [10]: from graph_tool.draw import graph_draw
         from graph_tool.all import price_network

         # draw a small graph
         graph_draw(graph_object, output="somefile.png")

         # create a Price network
         price_graph = price_network(5000)
         graph_draw(price_graph, output="price.png")
Out[10]: <PropertyMap object with key type 'Vertex' and value type 'vector<double>', for Graph 0x7f27b0121190, at 0x7f277c05cf10>
A Simple Graph
A Price Network
Filter vertices and edges
A sample visualization of the London Underground
Let's go back to betweenness
Easily calculate betweenness by calling the betweenness function in graph_tool.centrality
Betweenness
We have our answer for question 1...
Let's take our analysis of betweenness one step further... and answer question 2
How do problems on one of these important stations affect the Underground network?
Bokeh: creating interactive data visualizations
A basic visualization of the London Underground
A Basic Visualization of Betweenness
How does the betweenness of each station change when Baker Street is suspended?
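One way to approach this kind of question is to compute betweenness, remove the suspended station together with its edges, and compute it again. The sketch below does this on a made-up five-station ring. The station names, the `suspend` helper and the pair-counting implementation are all hypothetical illustrations, and none of the numbers say anything about the real Underground.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Distances and shortest-path counts from source to reachable vertices."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
    return dist, sigma


def betweenness(adj):
    """Unnormalised betweenness, summed pair by pair (fine for small graphs)."""
    info = {v: bfs_counts(adj, v) for v in adj}
    scores = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue  # s and t are disconnected
        for v in adj:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v lies on a shortest s-t path when the distances add up.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                scores[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return scores


def suspend(adj, station):
    """Return a copy of the graph without the station and its edges."""
    return {
        v: [w for w in neighbours if w != station]
        for v, neighbours in adj.items()
        if v != station
    }


# Hypothetical ring: Hub links A and B directly; X and Y form the long way round.
ring = {
    "A": ["Hub", "X"],
    "Hub": ["A", "B"],
    "B": ["Hub", "Y"],
    "Y": ["B", "X"],
    "X": ["A", "Y"],
}
before = betweenness(ring)
after = betweenness(suspend(ring, "Hub"))
change = {v: after[v] - before[v] for v in after}
print(change)
```

In this toy ring, suspending the hub shifts the through-traffic onto the two bypass stations, which is the kind of redistribution the slides visualize for the real network.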
Bokeh allows us to visualize this interactively in the browser
In [13]: from IPython.display import YouTubeVideo
         YouTubeVideo('VouLqY-Uegs')
Bokeh can do a lot more than this
In fact, we can build "real time" simulations by using the built-in bokeh-server app to stream data to a graph
A Simple Bokeh Simulation of the Underground
1. Each station is assigned a random number of commuters
2. Each commuter is assigned a random destination
3. At each step in the simulation, commuters travel over one edge
4. Bokeh allows us to observe how the number of commuters at each station changes over time
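The simulation loop itself can be sketched without any plotting. The version below is a guess at the mechanics, not the talk's implementation: it assumes a connected network, that commuters follow a shortest path, and that they leave the network on arrival. The names `next_hop` and `simulate` and the four-station line are hypothetical. Instead of streaming to bokeh-server it simply records the per-station counts at each step.

```python
import random
from collections import Counter, deque


def next_hop(adj, start, goal):
    """Neighbour of start that is closest to goal (BFS outward from goal)."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return min(adj[start], key=lambda w: dist[w])


def simulate(adj, steps, max_commuters=5, seed=0):
    """Return a list of Counters: commuters per station at each step."""
    rng = random.Random(seed)
    stations = list(adj)
    commuters = []  # each commuter is [current_station, destination]
    for station in stations:
        # 1. random number of commuters per station
        for _ in range(rng.randint(0, max_commuters)):
            # 2. random destination, different from the origin
            others = [t for t in stations if t != station]
            commuters.append([station, rng.choice(others)])
    history = [Counter(position for position, _ in commuters)]
    for _ in range(steps):
        # 3. every commuter travels over one edge
        for commuter in commuters:
            commuter[0] = next_hop(adj, commuter[0], commuter[1])
        # Assumption: commuters leave the network once they arrive.
        commuters = [c for c in commuters if c[0] != c[1]]
        # 4. record how many commuters are at each station
        history.append(Counter(position for position, _ in commuters))
    return history


# Hypothetical four-station line: A - B - C - D
line = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
history = simulate(line, steps=3)
for step, counts in enumerate(history):
    print(step, dict(counts))
```

Each entry in `history` is the snapshot that a live plot would be redrawn from at that step.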
In [11]: YouTubeVideo('ZKHMtu1eKtc')
Summary
At the beginning, we set off to answer two questions:
1. What are the most important stations on the Underground?
We used graph-tool to calculate betweenness
We determined that Baker Street, King's Cross St. Pancras and Liverpool Street are the most important
2. How does suspending one of the important stations affect the rest of the network?
We used bokeh to create interactive graphics
We saw that removing Baker Street can put more pressure on almost an entire Tube line's worth of stations
Thank you very much!
Questions, comments and critique are very welcome!
Please get in touch at camillamon[at]gmail.com or info[at]winterflower.net