Transcript
Page 1: DDAY2014 - Edgesense: Social network analysis per tutti

Edgesense: Social network analysis per tutti (for everyone)

Luca Mearelli - @lmea

Page 2: DDAY2014 - Edgesense: Social network analysis per tutti

Hi, I’m Luca

Page 3: DDAY2014 - Edgesense: Social network analysis per tutti

Collective Intelligence

Page 4: DDAY2014 - Edgesense: Social network analysis per tutti

Emergence

larger entities, patterns, and regularities arise through interactions among smaller or simpler entities that themselves do not exhibit such properties

Page 5: DDAY2014 - Edgesense: Social network analysis per tutti

Online collaboration

it works!

Page 6: DDAY2014 - Edgesense: Social network analysis per tutti

Online communities

• Exhibit emergence

• Strong design properties

• Hackable

Page 7: DDAY2014 - Edgesense: Social network analysis per tutti

The Blueprint

• Map the community social network

• Measure the structural properties

• Visualize the structure & the metrics

• Tweak the interaction

Page 8: DDAY2014 - Edgesense: Social network analysis per tutti

Edgesense

Page 9: DDAY2014 - Edgesense: Social network analysis per tutti

Edgesense Architecture

HTML5 / JavaScript

JSON files

Python

JSON source

Page 10: DDAY2014 - Edgesense: Social network analysis per tutti

Edgesense Source Data

• users.json

• nodes.json

• comments.json

Page 11: DDAY2014 - Edgesense: Social network analysis per tutti

users.json

Page 12: DDAY2014 - Edgesense: Social network analysis per tutti

nodes.json

Page 13: DDAY2014 - Edgesense: Social network analysis per tutti

comments.json

Page 14: DDAY2014 - Edgesense: Social network analysis per tutti

Edgesense Backend

• Python

• NetworkX

Page 15: DDAY2014 - Edgesense: Social network analysis per tutti

Edgesense Parsing Pipeline

• Parse source JSON files

• Build network from interactions

• Extract metrics

• Export network + metrics to JSON files
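The four steps above can be sketched end to end with the standard library alone. This is only an illustration, not the Edgesense implementation: the field names (`author_id`, `recipient_id`, `created_ts`) mirror those used in the code shown in the talk, while the function name `run_pipeline` and the inline sample data are assumptions made for the example.

```python
import json

def run_pipeline(users_json, comments_json):
    """Parse source JSON, build a network, extract metrics, export JSON."""
    # 1. Parse the source JSON documents
    users = json.loads(users_json)
    comments = json.loads(comments_json)

    # 2. Build the network: users are nodes, comments are directed edges
    nodes = [{"id": u["id"], "name": u["name"]} for u in users]
    known = {n["id"] for n in nodes}
    edges = [
        {"source": c["author_id"], "target": c["recipient_id"], "ts": c["created_ts"]}
        for c in comments
        if c["author_id"] in known and c["recipient_id"] in known
    ]

    # 3. Extract a few simple metrics
    metrics = {"nodes_count": len(nodes), "edges_count": len(edges)}

    # 4. Export network + metrics as one JSON document
    return json.dumps({"nodes": nodes, "edges": edges, "metrics": metrics}, sort_keys=True)

# Hypothetical sample input, standing in for the source JSON files
users_src = '[{"id": "1", "name": "Alice"}, {"id": "2", "name": "Bob"}]'
comments_src = '[{"author_id": "2", "recipient_id": "1", "created_ts": 1315491000}]'
exported = run_pipeline(users_src, comments_src)
```

Keeping the export as a plain JSON document is what lets a static HTML5 dashboard consume the result without a server-side API.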

Page 16: DDAY2014 - Edgesense: Social network analysis per tutti

Network construction

• Persons are nodes

Page 17: DDAY2014 - Edgesense: Social network analysis per tutti

Network construction

• Comments make links

Page 18: DDAY2014 - Edgesense: Social network analysis per tutti

Network construction

• Edges are aggregated

• Metadata is added

Page 19: DDAY2014 - Edgesense: Social network analysis per tutti

Network construction

def extract_edges(nodes_map, comments_map):
    # build the list of edges
    edges_list = []
    # a comment is 'valid' if it has a recipient and an author
    valid_comments = [e for e in comments_map.values() if e.get('recipient_id', None) and e.get('author_id', None)]
    logging.info("%(v)i valid comments on %(t)i total" % {'v':len(valid_comments), 't':len(comments_map.values())})

    # build the whole network to use for metrics
    for comment in valid_comments:
        link = {
            'id': "{0}_{1}_{2}".format(comment['author_id'],comment['recipient_id'],comment['created_ts']),
            'source': comment['author_id'],
            'target': comment['recipient_id'],
            'ts': comment['created_ts'],
            'effort': comment['length'],
            'team': comment['team']
        }
        if nodes_map.has_key(comment['author_id']):
            nodes_map[comment['author_id']]['active'] = True
        else:
            logging.info("error: node %(n)s was linked but not found in the nodes_map" % {'n':comment['author_id']})

        if nodes_map.has_key(comment['recipient_id']):
            nodes_map[comment['recipient_id']]['active'] = True
        else:
            logging.info("error: node %(n)s was linked but not found in the nodes_map" % {'n':comment['recipient_id']})
        edges_list.append(link)

    return sorted(edges_list, key=eu.sort_by('ts'))

Page 20: DDAY2014 - Edgesense: Social network analysis per tutti

Network construction

def build_network(network):
    MDG=nx.MultiDiGraph()

    for node in network['nodes']:
        MDG.add_node(node['id'], node)

    for edge in network['edges']:
        MDG.add_edge(edge['source'], edge['target'], attr_dict=edge)

    set_isolated(network['nodes'], MDG)
    return MDG

Page 21: DDAY2014 - Edgesense: Social network analysis per tutti

Network construction

def extract_dpsg(mdg, ts, team=True):
    dg=nx.DiGraph()
    # add all the nodes present at the time ts
    for node in mdg.nodes_iter():
        if mdg.node[node]['created_ts'] <= ts and (team or not mdg.node[node]['team']):
            dg.add_node(node, mdg.node[node])

    for node in mdg.nodes_iter():
        for neighbour in mdg[node].keys():
            count = sum(1 for e in mdg[node][neighbour].values() if e['ts'] <= ts and (team or not e['team']))
            effort = sum(e['effort'] for e in mdg[node][neighbour].values() if e['ts'] <= ts and (team or not e['team']))
            team_edge = sum(1 for e in mdg[node][neighbour].values() if e['ts'] <= ts and e['team'])>0
            if count > 0 and (team or not team_edge):
                dg.add_edge(node, neighbour, {'source': node, 'target': neighbour, 'effort': effort, 'count': count, 'team': team_edge})

    return dg
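The aggregation step (many comments between the same two people collapsing into one weighted edge at a given time) can also be illustrated without NetworkX. The sketch below is a simplified stand-in, not the code from the talk: `aggregate_edges` is a hypothetical name, and the team filter is left out to keep the example short.

```python
from collections import defaultdict

def aggregate_edges(edges, ts):
    """Collapse parallel edges created up to time ts into weighted edges."""
    totals = defaultdict(lambda: {"count": 0, "effort": 0})
    for e in edges:
        # Only interactions that already existed at time ts are counted
        if e["ts"] <= ts:
            key = (e["source"], e["target"])
            totals[key]["count"] += 1
            totals[key]["effort"] += e["effort"]
    # One aggregated edge per (source, target) pair
    return {
        key: {"source": key[0], "target": key[1], **vals}
        for key, vals in totals.items()
    }

# Hypothetical sample: two comments from "2" to "1", one later reply
edges = [
    {"source": "2", "target": "1", "ts": 100, "effort": 4},
    {"source": "2", "target": "1", "ts": 200, "effort": 6},
    {"source": "1", "target": "2", "ts": 300, "effort": 3},
]
snapshot = aggregate_edges(edges, ts=250)
```

With `ts=250` the reply at time 300 is not yet part of the network, so the snapshot holds a single edge from "2" to "1" with `count` 2 and `effort` 10.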

Page 22: DDAY2014 - Edgesense: Social network analysis per tutti

• Content metrics

• Network metrics

Page 23: DDAY2014 - Edgesense: Social network analysis per tutti

• Number of users (active/inactive)

• Number of connections

• Number of community contributions
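A rough sketch of how these counts could be derived from a list of comments follows. The definitions are assumptions made for illustration: a user is treated as active when they authored or received at least one comment (as in the `extract_edges` code shown earlier), a connection is a distinct author/recipient pair, and only comments are counted as contributions. `content_metrics` is a hypothetical name.

```python
def content_metrics(user_ids, comments):
    """Count users, connections and contributions from raw comments."""
    users = set(user_ids)
    # A user is active if they wrote or received at least one comment
    active = set()
    for c in comments:
        active.add(c["author_id"])
        active.add(c["recipient_id"])
    active &= users
    # A connection is a distinct (author, recipient) pair
    connections = {(c["author_id"], c["recipient_id"]) for c in comments}
    return {
        "users": len(users),
        "active_users": len(active),
        "inactive_users": len(users) - len(active),
        "connections": len(connections),
        "contributions": len(comments),
    }

# Hypothetical sample data
sample_comments = [
    {"author_id": "2", "recipient_id": "1"},
    {"author_id": "2", "recipient_id": "1"},
    {"author_id": "1", "recipient_id": "2"},
]
counts = content_metrics(["1", "2", "3"], sample_comments)
```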

Page 24: DDAY2014 - Edgesense: Social network analysis per tutti

• Degree

• Distance

• Centrality

• Modularity

Page 25: DDAY2014 - Edgesense: Social network analysis per tutti

Network Metrics: Degree

• Number of inbound / outbound edges incident on a node
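As a small illustration, in-degree and out-degree can be counted directly from a list of directed edges. The helper name `degrees` and the sample edges are assumptions for the example; Edgesense itself computes these through NetworkX.

```python
from collections import Counter

def degrees(edges):
    """Return (in_degree, out_degree) counters for directed (source, target) edges."""
    in_deg = Counter(target for _, target in edges)
    out_deg = Counter(source for source, _ in edges)
    return in_deg, out_deg

# Hypothetical sample: "2" and "3" both comment on "1", and "1" replies to "2"
sample_edges = [("2", "1"), ("3", "1"), ("1", "2")]
in_deg, out_deg = degrees(sample_edges)
```

Here node "1" has in-degree 2 and out-degree 1, while node "3" has no inbound edges at all.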

Page 26: DDAY2014 - Edgesense: Social network analysis per tutti

Network Metrics: Distance

• The average number of hops needed to go from a randomly chosen node to another.

• A lower distance implies that information spreads more easily across the network.
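The average distance can be sketched with a breadth-first search from every node. This is a simplified illustration under stated assumptions: edges are unweighted, every node appears as a key of the adjacency dictionary, and only pairs that are actually reachable are averaged. `average_distance` is a hypothetical name.

```python
from collections import deque

def average_distance(adjacency):
    """Mean number of hops over all reachable ordered pairs of nodes."""
    total, pairs = 0, 0
    for start in adjacency:
        # Breadth-first search gives hop counts from start
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adjacency.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the start node itself
    return total / pairs if pairs else 0.0

# Four people talking along a line, versus around one hub
line = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
star = {"h": ["a", "b", "c"], "a": ["h"], "b": ["h"], "c": ["h"]}
line_avg = average_distance(line)
star_avg = average_distance(star)
```

The hub layout yields the lower average, which matches the intuition that information spreads more easily when a well-connected member links the others.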

Page 27: DDAY2014 - Edgesense: Social network analysis per tutti

Network Metrics: Centrality

• Refers to indicators which identify the most important vertices within a graph

• Betweenness centrality: for a given node, the number of shortest paths between all pairs of other vertices that pass through that node.
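To make the definition concrete, here is a compact pure-Python version of Brandes' algorithm for unweighted directed graphs. It is a sketch for illustration: the function name `betweenness` is an assumption, every node must appear as a key of the adjacency dictionary, and the scores are not normalized (NetworkX's `betweenness_centrality` normalizes by default, so its values differ by a constant factor).

```python
from collections import deque

def betweenness(adjacency):
    """Unnormalized betweenness for an unweighted directed graph."""
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        # BFS from s, counting shortest paths (sigma) and predecessors
        order = []
        preds = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# In a chain a -> b -> c, only b lies on a shortest path between two others
chain_scores = betweenness({"a": ["b"], "b": ["c"], "c": []})
```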

Page 28: DDAY2014 - Edgesense: Social network analysis per tutti

Network Metrics: Modularity

• The difference between the observed network and a random one with the same degree distribution, on a 0-1 scale.

• Subcommunities are defined such that their members are more connected to each other than to the rest of the network.
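As an illustration, the modularity score of a given partition of an undirected graph can be computed with the standard formula: for each community, the fraction of edges inside it minus the fraction expected from its total degree. This sketch only scores a partition that is already known; it does not search for the best one, which is the job of a community-detection algorithm. The names `modularity` and `community_of` are assumptions, and the graph must have at least one edge.

```python
def modularity(edges, community_of):
    """Modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}  # number of edges with both ends in a community
    degree = {}    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Observed internal fraction minus the degree-preserving random expectation
    return sum(
        internal.get(c, 0) / m - (deg / (2 * m)) ** 2
        for c, deg in degree.items()
    )

# Two triangles joined by a single bridge edge (hypothetical sample)
graph_edges = [("a", "b"), ("b", "c"), ("a", "c"),
               ("d", "e"), ("e", "f"), ("d", "f"),
               ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {n: 0 for n in "abcdef"}
q_split = modularity(graph_edges, two_groups)
q_single = modularity(graph_edges, one_group)
```

Splitting the graph along the bridge scores well above zero, while lumping everyone into a single community scores zero, since that grouping is no better than chance.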

Page 29: DDAY2014 - Edgesense: Social network analysis per tutti

Network Metrics

def extract_network_metrics(mdg, ts, team=True):
    met = {}
    dsg = extract_dpsg(mdg, ts, team)
    if team :
        pre = 'full:'
    else:
        pre = 'user:'

    # avoid trying to compute metrics for
    # the case of empty networks
    if dsg.number_of_nodes()==0:
        return met

    met[pre+'nodes_count'] = dsg.number_of_nodes()
    met[pre+'edges_count'] = dsg.number_of_edges()
    met[pre+'density'] = nx.density(dsg)
    met[pre+'betweenness'] = nx.betweenness_centrality(dsg)
    met[pre+'avg_betweenness'] = float(sum(met[pre+'betweenness'].values()))/float(len(met[pre+'betweenness'].values()))
    met[pre+'betweenness_count'] = nx.betweenness_centrality(dsg, weight='count')
    met[pre+'avg_betweenness_count'] = float(sum(met[pre+'betweenness_count'].values()))/float(len(met[pre+'betweenness_count'].values()))
    met[pre+'betweenness_effort'] = nx.betweenness_centrality(dsg, weight='effort')
    met[pre+'avg_betweenness_effort'] = float(sum(met[pre+'betweenness_effort'].values()))/float(len(met[pre+'betweenness_effort'].values()))
    met[pre+'in_degree'] = dsg.in_degree()
    met[pre+'avg_in_degree'] = float(sum(met[pre+'in_degree'].values()))/float(len(met[pre+'in_degree'].values()))
    met[pre+'out_degree'] = dsg.out_degree()
    met[pre+'avg_out_degree'] = float(sum(met[pre+'out_degree'].values()))/float(len(met[pre+'out_degree'].values()))
    met[pre+'degree'] = dsg.degree()
    met[pre+'avg_degree'] = float(sum(met[pre+'degree'].values()))/float(len(met[pre+'degree'].values()))
    met[pre+'degree_count'] = dsg.degree(weight='count')
    met[pre+'avg_degree_count'] = float(sum(met[pre+'degree_count'].values()))/float(len(met[pre+'degree_count'].values()))
    met[pre+'degree_effort'] = dsg.degree(weight='effort')
    met[pre+'avg_degree_effort'] = float(sum(met[pre+'degree_effort'].values()))/float(len(met[pre+'degree_effort'].values()))

Page 30: DDAY2014 - Edgesense: Social network analysis per tutti

Exported Format

{
  "edges": [
    {
      "effort": 4,
      "id": "2_1_1315491000",
      "source": "2",
      "target": "1",
      "team": false,
      "ts": 1315491000
    },
    ...
  ],
  "meta": {
    "generated": 1415788633
  },
  "metrics": [
    {
      "ts": 1315491000,
      ...
    }
  ],
  "nodes": [
    {
      "active": true,
      "created_on": "2011-09-08",
      "created_ts": 1315483000,
      "id": "1",
      "isolated": false,
      "name": "Alice",
      "team": true,
      "team_on": "2011-09-08",
      "team_ts": 1315483000
    },
    {...}
  ]
}
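A consumer of this format only needs a JSON parser. The sketch below reads a cut-down document shaped like the example above and resolves each edge's endpoints to node names. The second node ("Bob") and the reduced set of fields are assumptions added so the sample is complete enough to run.

```python
import json

# Cut-down sample in the exported shape; "Bob" is a made-up second node
exported = """
{
  "edges": [
    {"effort": 4, "id": "2_1_1315491000", "source": "2", "target": "1",
     "team": false, "ts": 1315491000}
  ],
  "meta": {"generated": 1415788633},
  "metrics": [{"ts": 1315491000}],
  "nodes": [
    {"id": "1", "name": "Alice", "team": true, "active": true},
    {"id": "2", "name": "Bob", "team": false, "active": true}
  ]
}
"""

network = json.loads(exported)
# Index nodes by id so edges can be resolved to readable names
names = {node["id"]: node["name"] for node in network["nodes"]}
links = [
    (names[edge["source"]], names[edge["target"]], edge["effort"])
    for edge in network["edges"]
]
```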

Page 31: DDAY2014 - Edgesense: Social network analysis per tutti

Edgesense Frontend

• Single page application

• D3.js

• Sigma.js

Page 32: DDAY2014 - Edgesense: Social network analysis per tutti

Demo!

Page 33: DDAY2014 - Edgesense: Social network analysis per tutti

Dashboard: Network

• Uses Sigma.js

• ForceAtlas layout *

• Contextual information

Page 34: DDAY2014 - Edgesense: Social network analysis per tutti

Dashboard: Metrics

• Sidebar, Bottom widgets

• Declaratively select metrics to display

<div class="small-box bg-maroon big-metric metric helped"
     data-metric-name="louvain_modularity"
     data-metric-round="3"
     data-help="modularity" >
  <div class="inner">
    <h3 class="value"> </h3>
    <p> Modularity </p>
  </div>
  <div class="minichart"> </div>
</div>

Page 35: DDAY2014 - Edgesense: Social network analysis per tutti

Dashboard: Filters

Page 36: DDAY2014 - Edgesense: Social network analysis per tutti

Extras

• Twitter parser

• GEXF export

Page 37: DDAY2014 - Edgesense: Social network analysis per tutti

Drupal!

• Module to embed Edgesense

• Configurator for the backend processing

• Configurator for the dashboard

Page 38: DDAY2014 - Edgesense: Social network analysis per tutti

Thank you!

P.S. Edgesense is open source:

github.com/Wikitalia/edgesense

