8/22/2019 Gephi Icwsm Tutorial 110717064641 Phpapp02
1/43
ICWSM11 Tutorial: Exploratory Network Analysis with Gephi
Instructors: Sébastien Heymann (seb@gephi.org), Julian Bilcke (julian.bilcke@gephi.org)
July 17, 2011 | 1 PM - 4 PM
2/43
Exploratory Network Analysis with Gephi
This tutorial is an introduction to Gephi, the open source graph network visualization and manipulation software.
Gephi aims to fulfill the complete chain from data importing to aesthetic refinements and interaction.
Users interact with the visualization and manipulate structures, shapes, and colors to reveal hidden properties.
The goal is to help data analysts make hypotheses and intuitively discover patterns or errors in large data collections.
By the end, participants will walk away with the practical knowledge they need to use Gephi for their own projects.
3/43
Exploratory Network Analysis with Gephi
It starts with a brief introduction to the network exploration process and a hands-on demonstration of the essential functionalities of Gephi.
Participants are guided step by step through the complete chain of representation, manipulation, layout, analysis, and aesthetic refinements.
Next, teams work on real datasets.
Finally, they present their preliminary results. The tutorial concludes with a general question and answer session.
4/43
Requirements
Bring your own laptop with Java and Gephi installed. Gephi should be updated (menu Help > Check for Updates).
Bring a mouse with a wheel.
Bring a dataset of your own if you want, and verify that it loads well in Gephi. [1]
[1] http://gephi.org/users/supported-graph-formats/
5/43
Workshop Schedule - Part I
Exploratory Network Analysis
- Exploratory Data Analysis
- Exploratory Network Analysis
- Looking for Orderness in Data
- Examples
- Guideline
Introduction to Gephi
- Approach and Community
- Networked Data
- Quick Start Demo
* 30 min break *
6/43
Workshop Schedule - Part II
Hands-On!
- Team Work on a Dataset
- Presentation of Preliminary Results
Q&A
7/43
Exploratory Data Analysis
"The greatest value of a picture is when it forces us to notice what we never expected to see."
Started with John Tukey (1962)
Confirmatory
Exploratory
Serendipity
results
intuition
surprise
8/43
Exploratory Data Analysis
Non-linear processing chain of Ben Fry in Computational Information Design (2004)
9/43
Dummy Example
P2P file size distribution (Latapy et al., 2008)
Observation: visual saliences on specific file sizes
External knowledge: these sizes correspond to films
New hypothesis on data: films are highly exchanged, so the study might dig in this direction
10/43
Exploratory Network Analysis
1. See the network
   1st graph viz tool: Pajek (1996), Vladimir Batagelj, Andrej Mrvar
2. Interact in real time
   Gephi prototype (2008): group, filter, compute metrics...
3. Build a visual language
   size by rank, color by partition, label, curved edges, thickness...
11/43
Looking for a Simple Small Truth?
Drew Conway, What Data Visualization Should Do:
1. Make complex things simple
2. Extract small information from large data
3. Present truth, do not deceive
http://www.dataists.com/2010/10/what-data-visualization-should-do-simple-small-truth/
12/43
Looking for Orderness in Data
Vary three cursors simultaneously to extract meaningful patterns:
- at different levels: MICRO level to MACRO level
- on multiple dimensions: 1 dimension to N dimensions
- at time scale: T+0 to T+N
13/43
Zoom cursor on Quantitative Data
Global
- connectivity
- density
- centralization
Local
- communities
- bridges between communities
- local centers vs periphery
Individual
- centrality
- distances
- neighborhood
- location
- local authority vs hub
MICRO level MACRO level
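To make the zoom levels concrete, here is a minimal sketch in plain Python that computes one global metric (density) and one individual metric (degree centrality). It is not how Gephi computes its statistics; the five-node directed graph and the helper names `density` and `degree_centrality` are assumptions chosen for illustration.

```python
# Illustrative sketch: one global and one individual metric on a tiny directed graph.
# The graph and function names are hypothetical, not part of Gephi.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "d"), ("b", "c"), ("e", "a"), ("c", "e")]


def density(nodes, edges):
    """Global level: fraction of all possible directed edges that exist."""
    n = len(nodes)
    return len(edges) / (n * (n - 1))


def degree_centrality(nodes, edges):
    """Individual level: (in-degree + out-degree) divided by n - 1."""
    degree = {v: 0 for v in nodes}
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    n = len(nodes)
    return {v: d / (n - 1) for v, d in degree.items()}


graph_density = density(nodes, edges)
centrality = degree_centrality(nodes, edges)
most_central = max(centrality, key=centrality.get)
print("density:", graph_density)
print("most central node:", most_central)
```

Local-level measures such as communities need a dedicated algorithm and are left out of this sketch.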
14/43
Crossing cursor on Qualitative Data
Social
- who with whom
- communities
- brokerage
- influence and power
- homophily
Semantic
- topics
- thematic clusters
Geographic
- spatial phenomena
1 dimension N dimensions
15/43
Timeline cursor on Temporal Data
Evolution of social ties
Evolution of communities
Evolution of topics
T+0 T+N
16/43
Mapping an Innovation Center
Collaborations on projects at Images et Réseaux
Themes and content
Actors
Territory
Franck Ghitalla & Ecole de Design de Nantes
17/43
Mapping Scientific Cooperations
18/43
Network Map: a Series of Choices
- corpus
- data
- algorithms
- thresholds
- graphical operations
- communication goals
19/43
Guideline
Guideline by number of nodes (# nodes):
1 - 100
- lists + edges as a bonus, focus on qualitative data
100 - 1,000: How do attributes explain the structure?
- easy to read, obvious patterns
- focus on entities (in context)
- metrics are tools to describe the graph (centrality, bridging...)
- links help to build and interpret categories of entities
- challenge: mix attribute crossing and connectivity
1,000 - 50,000: How does the structure explain attributes?
- hard to read, problem of hidden signals: track patterns with various layouts and filtering
- focus on structures
- metrics are tools to build the graph (cosine similarity...)
- categories help to understand the structure
- challenge: pattern recognition
> 50,000
- require high computational power
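The guideline mentions metrics used to build the graph, with cosine similarity as an example. A minimal sketch of that idea follows: entities described by attribute vectors are linked when their cosine similarity reaches a threshold. The `profiles` data, the threshold value, and the function names are hypothetical and only illustrate the technique.

```python
import math

# Hypothetical attribute vectors, e.g. term counts per document.
profiles = {
    "doc1": [1, 2, 0],
    "doc2": [2, 4, 0],
    "doc3": [0, 0, 3],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_edges(profiles, threshold):
    """Return undirected weighted edges for every pair at or above the threshold."""
    names = sorted(profiles)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            weight = cosine_similarity(profiles[a], profiles[b])
            if weight >= threshold:
                edges.append((a, b, weight))
    return edges


similarity_graph = similarity_edges(profiles, threshold=0.5)
for a, b, weight in similarity_graph:
    print(a, b, round(weight, 3))
```

The threshold is one of the choices that shapes the resulting map: a lower value yields a denser graph, a higher value keeps only the strongest links.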
20/43
21/43
Gephi in a Nutshell
Like Photoshop for graphs.
Helps data analysts reveal patterns and trends, highlight outliers, and tell stories with their data.
Network visualization platform
Open source, supported by a community
Built for performance and usability
Extensible by plug-ins
Windows, MacOS X, Linux
22/43
Gephi Community
Contributors
Mathieu Bastian, Mathieu Jacomy, Eduardo Ramos Ibáñez, Sébastien Heymann, Guillaume Ceccarelli, André Panisson, Antonio Patriarca, Cezary Bartosiak, Martin Škurla, Patrick McSweeney, Yi Du, Hélder Suzuki, Daniel Bernardes, Ernesto Aneiro, Keheliya Gallaba, Luiz Ribeiro, Urban Škudnik, Vojtech Bardiovsky, Yudi Xue
Communities
Nonprofit organization
23/43
Community Mission
Provide sustainable software
Maintain the technical ecosystem
Build a business ecosystem
Face cutting-edge technological challenges with a long-term vision
Distribute the software as open source
24/43
Community Values
Open innovation: ideas and features come from the entire community.
Decisions are taken with transparency.
We consider this technology a public good, and will keep it open source.
25/43
Diversity of Usages
business, leisure :-), communication, academic, art
26/43
Diversity of Network Encoding
Textual
V = { a, b, c, d, e }
E = { (a,b), (a,d), (b,c), (e,a), (c,e) }
Tabular
  a b c d e
a - 1 - 1 -
b - - 1 - -
c - - - - 1
d - - - - -
e 1 - - - -
XML
Graphical
and many others...
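The textual and tabular encodings describe the same graph, and converting between them is mechanical. The sketch below, in plain Python, turns the edge set E into the adjacency matrix shown on this slide and back again; the function names are illustrative assumptions.

```python
# The graph from the slide: V = {a, b, c, d, e}, E = {(a,b), (a,d), (b,c), (e,a), (c,e)}.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "d"), ("b", "c"), ("e", "a"), ("c", "e")]


def adjacency_matrix(nodes, edges):
    """Textual to tabular: rows are sources, columns are targets."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


def edge_list(nodes, matrix):
    """Tabular to textual: read every non-zero cell back as a directed edge."""
    return [
        (nodes[i], nodes[j])
        for i, row in enumerate(matrix)
        for j, cell in enumerate(row)
        if cell
    ]


matrix = adjacency_matrix(nodes, edges)
print(" ", " ".join(nodes))
for name, row in zip(nodes, matrix):
    print(name, " ".join("1" if cell else "-" for cell in row))
```

The matrix form grows with the square of the number of nodes, so the edge list is usually the more compact choice for sparse networks.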
27/43
Software I/O
Sources: files, databases, graph streaming, user input
Import file formats: CSV, Pajek NET, Guess GDF, GEXF, GraphML, Graphviz DOT, UCINet DL, Netdraw VNA, Tulip TLP, Excel Spreadsheet
Databases: MySQL, PostgreSQL, SQL Server, Neo4j
Export file formats: CSV, Pajek NET, Guess GDF, GEXF, GraphML, Excel Spreadsheet, SVG, PDF, PNG
28/43
Choosing a File Format
Table of features supported by Gephi
Features (columns): Edge List/Matrix Structure, XML Structure, Edge Weight, Attributes, Visualization Attributes, Attribute Default Value, Hierarchical Graphs, Dynamics
Formats (rows): CSV, DL Ucinet, DOT Graphviz, GDF, GEXF, GML, GraphML, NET Pajek, TLP Tulip, VNA Netdraw, Spreadsheet*
* spreadsheets can be loaded in the Data Laboratory
29/43
Do you need...
Formats: GEXF, Spreadsheet, GraphML, Guess GDF, GML, UCINet DL, Netdraw VNA, Graphviz DOT, Pajek NET, CSV, Tulip TLP
Axes: many features to few features; file type (XML, tabular, text)
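As a rough illustration of the two ends of this spectrum, the sketch below writes the same small graph as a bare CSV edge list (few features) and as a minimal GEXF document (XML, room for many features). It uses only the Python standard library. The GEXF namespace and attribute names follow the GEXF 1.2 draft as best recalled here, so treat the output as an assumption to check by loading it in Gephi; the function names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical example graph.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "d"), ("b", "c"), ("e", "a"), ("c", "e")]


def to_csv_edge_list(edges):
    """One 'source,target' pair per line: structure only, no attributes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(edges)
    return buffer.getvalue()


def to_gexf(nodes, edges):
    """Minimal static directed graph in GEXF (assumed 1.2 draft layout)."""
    root = ET.Element("gexf", {"xmlns": "http://www.gexf.net/1.2draft", "version": "1.2"})
    graph = ET.SubElement(root, "graph", {"mode": "static", "defaultedgetype": "directed"})
    nodes_el = ET.SubElement(graph, "nodes")
    for name in nodes:
        ET.SubElement(nodes_el, "node", {"id": name, "label": name})
    edges_el = ET.SubElement(graph, "edges")
    for i, (source, target) in enumerate(edges):
        ET.SubElement(edges_el, "edge", {"id": str(i), "source": source, "target": target})
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv_edge_list(edges)
gexf_text = to_gexf(nodes, edges)
print(csv_text)
print(gexf_text)
```

The CSV version is easier to produce by hand, while the XML version leaves room for node labels, attributes, and other features listed in the table of supported formats.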
30/43
Using Gephi
31/43
Team work
1. Create a team of 2~3 people.
2. Choose a dataset.
3. Explore it for 1 hour.
4. Two teams present their preliminary findings.
32/43
Dataset #1: GitHub Software Repository
GitHub is an application used by nearly a million people to store over two million code repositories, making GitHub the largest code host in the world.
Started in 2008, it provides the features of an online social network and a software repository to lower the barriers to collaboration and make it easier to contribute code.
https://github.com
33/43
Dataset #1: GitHub Software Repository
Data extracted by Franck Cuny (franck.cuny@linkfluence.net) at Linkfluence SAS
1st release in March 2010 -> this poster
2nd release in June 2011 -> your data
Network of user profiles
Nodes: people with at least one repository who are followed by at least two other people
Edges: A follows B
Network of repositories
Nodes: repositories
Edges: A shares a developer with B
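The repository network is a projection: two repositories are linked when at least one developer works on both. The sketch below shows one way such a projection could be computed from (developer, repository) pairs. The records are invented, and counting shared developers as an edge weight is an added assumption, not something the dataset description states.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical contribution records: (developer, repository).
contributions = [
    ("alice", "repo1"), ("alice", "repo2"),
    ("bob", "repo2"), ("bob", "repo3"),
    ("carol", "repo4"),
]


def repository_network(contributions):
    """Link two repositories when they share a developer; value = shared developer count."""
    repos_by_developer = defaultdict(set)
    for developer, repository in contributions:
        repos_by_developer[developer].add(repository)
    shared = defaultdict(int)
    for repos in repos_by_developer.values():
        # Every pair of repositories touched by the same developer gets an edge.
        for a, b in combinations(sorted(repos), 2):
            shared[(a, b)] += 1
    return dict(shared)


repo_edges = repository_network(contributions)
for (a, b), count in sorted(repo_edges.items()):
    print(a, b, count)
```

Here `repo4` stays isolated because its only developer contributes to no other repository.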
Very few research publications on this OSN!
34/43
Dataset #1: GitHub Software Repository
Data extracted by a crawl using the GitHub API
Seed: 10 well-known contributors in the Perl community
Networks by country: Japan, France, United States
Networks by language: Perl, PHP, Python, Ruby
Node attributes:
- user country
- number of followers
- main programming language
Edges:
- directed
- weight = number of projects A has forked from B
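The edge definition above can be read as an aggregation: each fork of one of B's projects by A adds 1 to the weight of the directed edge A -> B. A minimal sketch under that reading, with invented fork records and hypothetical function names:

```python
from collections import Counter

# Hypothetical fork records: (user A who forked, user B who owns the original project).
forks = [("ann", "bob"), ("ann", "bob"), ("cid", "bob"), ("bob", "ann")]


def fork_edges(forks):
    """Directed edges A -> B weighted by the number of projects A forked from B."""
    weights = Counter(forks)
    return [(a, b, w) for (a, b), w in sorted(weights.items())]


def weighted_in_degree(edges):
    """Total weight arriving at each user, i.e. how often their projects were forked."""
    totals = Counter()
    for _, target, weight in edges:
        totals[target] += weight
    return dict(totals)


weighted_edges = fork_edges(forks)
forked_counts = weighted_in_degree(weighted_edges)
print(weighted_edges)
print(forked_counts)
```

Because the edges are directed, `ann -> bob` and `bob -> ann` are kept as two separate edges with their own weights.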
35/43
Dataset #1: GitHub Software Repository
Your mission (should you decide to accept it): find research hypotheses based on your exploration
Example question: are the Perl communities based on geography?
36/43
Dataset #2: The Irish Blogosphere
Blogroll Network
Nodes: blogs with more than two blogroll links
Edges: blogroll link (in-link)
Post-link Network
Nodes: blogs with more than two blogroll links
Edges: hyperlink inside a post from one blog to another (post-link)
Identifying Representative Textual Sources in Blog Networks. K. Wade, D. Greene, C. Lee, D. Archambault, P. Cunningham (2011) http://mlg.ucd.ie/blogs
37/43
Dataset #2: The Irish Blogosphere
Data extracted by a crawl at distance 2 from the seed for the in-links and Google Blog Search for the post-links.
Seed: 21 popular blogs, winners of the 2010 Irish Blog Awards
Node attributes:
- post count = total number of posts by blog
- category = from the Irish blog index at www.irishblogdirectory.com, where available
- infomap_comm = community to which a node belongs (Infomap algorithm)
- gce_comms = overlapping communities (GCE algorithm)
- moses_comms = overlapping communities (MOSES algorithm)
Edges:
- directed
- weight = number of hyperlinks in the Post-link network
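A crawl at distance 2 from the seed can be sketched as a breadth-first search that stops expanding once a blog is two steps away from the seed. The example below is only an illustration of that idea: the `links` dictionary stands in for whatever a real crawler would fetch, and the blog names are invented.

```python
from collections import deque

# Hypothetical link structure: blog -> blogs reachable from it in one crawl step.
links = {
    "seed": ["b1", "b2"],
    "b1": ["b3"],
    "b2": ["b3", "b4"],
    "b3": ["b5"],
    "b4": [],
    "b5": ["b6"],
}


def crawl(links, seeds, max_distance):
    """Breadth-first crawl; returns each reached blog with its distance from the seed."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        blog = queue.popleft()
        if distance[blog] == max_distance:
            continue  # do not follow links beyond the distance limit
        for neighbor in links.get(blog, []):
            if neighbor not in distance:
                distance[neighbor] = distance[blog] + 1
                queue.append(neighbor)
    return distance


found = crawl(links, ["seed"], max_distance=2)
print(found)
```

Blogs three or more steps away (`b5`, `b6` here) never enter the dataset, which is worth remembering when interpreting the boundary of the resulting network.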
38/43
Dataset #2: The Irish Blogosphere
Your mission: explore and try to confirm the official results
39/43
Hands-On!
Start:
- Load a graph
- Apply a layout
- Color the nodes by a qualitative variable in the Partition panel
- Size the nodes by a quantitative variable in the Ranking panel
- Start to explore... compute metrics, filter the network
End:
- Export maps to PDF in the Preview tab
- Save
40/43
Presentations
GitHub Repository
Irish Blogosphere
41/43
Gephi Documentation
Web Site: http://gephi.org
Support: http://forum.gephi.org
Wiki: http://wiki.gephi.org
Source code: https://launchpad.net/gephi
Online Tutorials
- http://gephi.org/users/quick-start/
- http://gephi.org/users/tutorial-visualization/
- http://gephi.org/users/tutorial-layouts/
- http://wiki.gephi.org/index.php/Import_CSV_Data
- http://wiki.gephi.org/index.php/Import_Dynamic_Data
Tutorial in Spanish
- https://code.google.com/p/camon/wiki/Taller_Gephi
Supported Graph Formats
- http://gephi.org/users/supported-graph-formats/
42/43
Thank You!
Caspar David Friedrich -
Wanderer Above the Sea of Fog
43/43
Credits
[slide 11] images from Drew Conway
http://www.dataists.com/2010/10/what-data-visualization-should-do-simple-small-truth/
[slide 22 top left] Benoît Vidal at MFG Labs
[slide 22 bottom center] Franck Ghitalla at UTC
[slide 22 right] Studies in MA Digital Fashion at LCF by Peter Jeun Ho Tsang
http://jeunhotsang.com/blog/2010/12/07/prototype/
[slide 27] sketches from Ben Fry, Computational Information Design
Special thanks to Franck Ghitalla and Mathieu Jacomy for their insightful discussions.