PAKISTAN RAILWAY NETWORK ANALYSIS

Post on 22-Jan-2018

NETWORK SCIENCES REPORT ON PAKISTAN RAILWAY NETWORK

UNIVERSITY OF KARACHI DEPARTMENT OF COMPUTER SCIENCE

GROUP MEMBERS:

Moiz Ahmed Ansari (B11101040)

Hassan Aftab (B11101021)

ABSTRACT:

The railway network of Pakistan is a countrywide service provided by the government that supports the movement of citizens and freight across Pakistan. This report presents an analysis of that railway network.

Submitted to: Dr. Nadeem Mahmood

Using Google Earth, we extracted the major routes of the Pakistan Railways network, which spans the whole country. The image above shows the result.

TOOLS AND TECHNOLOGIES:

Tools used for this project:

Google Earth

R (the R Project)

RStudio

Gephi

DATA COLLECTION:

We collected data on the major railway cities of Pakistan from the Pakistan Railways website. We then marked the data on the Google Earth map of Pakistan to understand the routes between the different cities.

The dataset lists the railway distances between stations in different cities that are connected to each other.

Column A is the source city.

Column B is the destination city.

Column C is the distance between them in kilometres (km).
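As a rough illustration of this three-column layout, the following minimal sketch builds a tiny edge list in R. The column names match those that R reports for our dataset; the city names CityA to CityC and the distances are hypothetical placeholders, not values from our data.

```r
# Hypothetical three-row edge list with the same column layout as the CSV.
# CityA, CityB, CityC and the distances are placeholders, not real data.
ToyRailDist <- data.frame(
  Source        = c("CityA", "CityA", "CityB"),
  Destination   = c("CityB", "CityC", "CityC"),
  Distances.km. = c(120, 400, 200),
  stringsAsFactors = FALSE
)

str(ToyRailDist)    # two character columns and one numeric column
nrow(ToyRailDist)   # number of city pairs (edges) in the toy list
```

Each row is one edge of the network, and the distance column acts as the edge weight.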

ANALYSIS FACTORS:

Betweenness Centrality:

A measure of the degree to which a given node lies on the shortest paths (geodesics) between other nodes in the graph. A node has high betweenness if the shortest paths between many pairs of other nodes in the graph pass through it.
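In symbols, the usual textbook form of this definition can be written as below. The notation is ours, added for illustration: \sigma_{st} is the number of shortest paths between nodes s and t, and \sigma_{st}(v) is the number of those paths that pass through node v. Tools such as Gephi may additionally normalise the value.

```latex
C_B(v) = \sum_{\substack{s \neq v \neq t}} \frac{\sigma_{st}(v)}{\sigma_{st}}
```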

Closeness Centrality:

A measure of the closeness (distance) of a given node to all the other nodes in the network. Closeness centrality decreases if either the number of nodes reachable from the node in question decreases, or the distances between the nodes increase.
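To make the definition concrete, here is a minimal base-R sketch that computes closeness on a toy weighted network. It assumes an undirected network and one common form of the measure, (n - 1) divided by the sum of shortest distances; Gephi may normalise differently. The city names and distances are hypothetical placeholders.

```r
# Toy weighted, undirected network; cities and distances are hypothetical.
edges <- data.frame(
  Source        = c("CityA", "CityA", "CityB"),
  Destination   = c("CityB", "CityC", "CityC"),
  Distances.km. = c(120, 400, 200),
  stringsAsFactors = FALSE
)

cities <- sort(unique(c(edges$Source, edges$Destination)))
n <- length(cities)

# Distance matrix: 0 on the diagonal, Inf where no direct link exists.
d <- matrix(Inf, n, n, dimnames = list(cities, cities))
diag(d) <- 0
for (i in seq_len(nrow(edges))) {
  from <- edges$Source[i]
  to   <- edges$Destination[i]
  d[from, to] <- min(d[from, to], edges$Distances.km.[i])
  d[to, from] <- d[from, to]   # treat each railway link as two-way
}

# Floyd-Warshall: allow each city k in turn as an intermediate stop.
for (k in seq_len(n)) {
  d <- pmin(d, outer(d[, k], d[k, ], "+"))
}

# Closeness = (n - 1) / sum of shortest distances to all other cities.
closeness <- (n - 1) / rowSums(d)
closeness
```

In this toy network the direct CityA to CityC link (400 km) is longer than the route via CityB (320 km), so the shortest-path step matters, and CityB ends up with the highest closeness.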

DATA VISUALIZATION IN R:

We imported the relevant Pakistan Railway Network data into R. The console commands can be seen below:

Importing data from the CSV file:

> PakRailDist <- read.table("C:/Users/moiz/Desktop/RailwayData.csv", header = TRUE, sep = ",")

Plotting:

> names(PakRailDist)

[1] "Source" "Destination" "Distances.km."

> plot(Source ~ Destination , data = PakRailDist)

The image above is the plot produced by this command. It shows which source cities are paired with which destination cities in the dataset.

PAIR PLOTTING

This is the pair plot of the same dataset, which shows the distances between the railway cities in pairs.

The representation works pairwise: it takes the source city from Column A of our dataset and pairs it with the city in Column B. The result shows the distance between each pair of railway cities.

HISTOGRAM:

The histogram above shows the distances in kilometres between the cities that we mentioned earlier and plotted in our graph. It tells us that most pairs of major cities have a source-to-destination railway distance of up to about 500 km, whereas very few pairs have a source-to-destination distance above 2500 km.
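The command used for the histogram is not shown in this report. As a minimal sketch, assuming base R's hist() and 500 km wide bins (both our assumptions), the binning can be reproduced as follows. The distances are hypothetical placeholders, not values from our dataset.

```r
# Hypothetical distances in km; placeholders, not values from the dataset.
toy_km <- c(120, 300, 200, 450, 480, 900, 2600)

# 500 km wide bins from 0 to 3000 km; plot = FALSE returns the counts only.
h <- hist(toy_km, breaks = seq(0, 3000, by = 500), plot = FALSE)

# One row per bin: lower edge, upper edge and number of city pairs.
bins <- data.frame(from  = head(h$breaks, -1),
                   to    = h$breaks[-1],
                   count = h$counts)
bins
```

Dropping plot = FALSE draws the histogram itself; on the real data the vector would be PakRailDist$Distances.km.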

ADVANCED GRAPH PLOTS:

For more advanced graphs we used other plotting techniques in R rather than the basic graphics functions. For this we used ggplot2, which we installed with this command:

> install.packages("ggplot2")

After the installation we ran several commands to plot our graphs.

QPlot:

The picture above shows a quick visualization made with qplot. We used it to plot our distances in km on the x-axis using this command:

> qplot(data=PakRailDist, x=Distances.km.)

Looking at the graph, we see that the density is much higher in the 0-2000 km range, which covers the source-to-destination distances of most city pairs. Beyond that there is only a single isolated bar, which belongs to Abbottabad. In our data its distance to various cities is 2279 km, and that is what appears here.

If we also plot the data against the y-axis, the graph looks like this:

Here you can see a scatter plot instead of a bar chart. The interpretation of this graph is the same as described above.

TreeMap:

The picture above shows the source cities and their distances to destination cities. To visualize this we installed the treemap library and plotted the graph using the command:

> treemap(PakRailDist, index=c('Source','Destination'),vSize='Distances.km.')

In the graph, each large square represents a source city, and the squares inside it represent the cities that it is connected to. A bigger outer square suggests that the city is connected to more cities (strictly, its area is the total of the distances inside it, so it grows with both the number and the length of the connections). The graph shows that Abbottabad has the maximum number of connected nodes, whereas Turbat has the minimum. The area of each inner square shows the source-to-destination distance: the bigger the inner square, the greater the distance.

DATA VISUALIZATION ON GEPHI:

We used Gephi to visualize the betweenness centrality and closeness centrality of the source and destination cities.

The results showed that Bahawalpur has the highest betweenness centrality and Abbottabad has the lowest betweenness centrality.

Hyderabad has the highest closeness centrality, whereas Murree has the lowest closeness centrality.
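The steps used to move the data from R into Gephi are not shown in this report. One possible route, sketched here as an assumption, is to write the edge list to a CSV file with the column names Source, Target, Weight and Type, which Gephi's spreadsheet import recognises for edge tables. The cities, distances and output file name below are hypothetical placeholders.

```r
# Hypothetical edge list in the report's column layout (placeholder data).
edges <- data.frame(
  Source        = c("CityA", "CityA", "CityB"),
  Destination   = c("CityB", "CityC", "CityC"),
  Distances.km. = c(120, 400, 200),
  stringsAsFactors = FALSE
)

# Rename the columns to the headers Gephi expects for an edge table.
gephi_edges <- data.frame(
  Source = edges$Source,
  Target = edges$Destination,
  Weight = edges$Distances.km.,
  Type   = "Undirected",
  stringsAsFactors = FALSE
)

# Hypothetical output location; a temporary directory keeps the sketch tidy.
out_file <- file.path(tempdir(), "gephi_edges.csv")
write.csv(gephi_edges, out_file, row.names = FALSE)
```

How the Weight column enters the centrality calculations depends on the options chosen in Gephi, so the values reported above should not be assumed to match this export exactly.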

Conclusion:

From the above findings and analysis we conclude that this is a small-world network in which most nodes can be reached from the others in a small number of hops. It is a weighted network: the distances between nodes are not always the same, i.e. different source-destination pairs are separated by different distances. The graph is highly connected.

Furthermore, the betweenness centrality results show that Bahawalpur lies on the largest number of shortest routes between other cities, whereas Abbottabad lies on the fewest, which we attribute to its geographical location. The closeness centrality results show that Hyderabad is close to a large number of cities, whereas Murree is far from most of the cities.