
TeCFlow – A Temporal Communication Flow Visualizer for Social Network Analysis

Peter A. Gloor
MIT Center for Coordination Science
[email protected]

Yan Zhao
Dartmouth College
[email protected]

ABSTRACT

This paper introduces an approach for organizational redesign and optimization of communication flows based on temporal analysis of communication patterns in groups of people. Our Temporal Communication Flow Visualizer automatically generates interactive movies of communication flows among individuals by mining e-mail log files and other communication archives. Combining those movies with measures of social network analysis, such as the change over time in group betweenness centrality (GBC) and group density, leads to deep insights into organizational dynamics. In addition, we have defined a contribution index, which measures the activity of an individual as a sender and receiver of messages relative to a team. Based on these findings we can make predictions on the productivity of teams and suggest interventions for improved performance.

DESIGN AND ARCHITECTURE OF TECFLOW

Our Temporal Communication Flow Visualizer for the temporal analysis of social networks (TeCFlow) addresses three areas of related research: (1) visualization of social networks, (2) temporal analysis of social networks in animated visualizations, and (3) analysis of e-mail networks.

TeCFlow takes e-mail archives as input and automatically generates static and dynamic visualizations of the calculated communication networks. The static visualizations allow users to step through a chosen time period by looking at communication networks at subsequent time intervals. The dynamic visualization consists of an interactive movie showing the evolution over time of the communication network within the group. Active relationships are displayed in a sliding time window, with inactive relationships decaying over time. TeCFlow also calculates and plots the evolution of group betweenness centrality and density over time to discover interesting events in the lifetime of a virtual team. The interactive movie can be stopped at any time to drill down into the messages that are currently exchanged between actors. Multiple e-mail addresses can be combined into an online personality, reflecting the fact that people frequently use different e-mail addresses.

We have implemented an open architecture. E-mail messages are processed locally in three steps (figure 1). In the first step, the e-mail messages and mailing lists are parsed and stored in decomposed format in a MySQL database on the local machine. In the second step, the database can be queried to select messages sent or received by a group in a given time period. In the third step, the selected communication flows can be represented in our visual browser using our own netgraph [6] and static and dynamic views [5].
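The first two steps can be illustrated with a minimal sketch. It is not the TeCFlow implementation: it substitutes Python's built-in sqlite3 module for the MySQL database, and the table name `flows`, its columns, and the helper names `create_schema`, `store_message` and `select_flows` are assumptions made for this example only.

```python
import sqlite3
from email import message_from_string
from email.utils import getaddresses, parseaddr, parsedate_to_datetime

def create_schema(conn):
    # Hypothetical "decomposed" format: one row per sender/recipient pair.
    conn.execute(
        "CREATE TABLE flows (sender TEXT, recipient TEXT, sent_at TEXT, title TEXT)"
    )

def store_message(conn, raw):
    """Step 1: parse one raw message and store its sender/recipient pairs."""
    msg = message_from_string(raw)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    sent_at = parsedate_to_datetime(msg["Date"]).strftime("%Y-%m-%d")
    for _name, address in recipients:
        conn.execute(
            "INSERT INTO flows VALUES (?, ?, ?, ?)",
            (sender, address.lower(), sent_at, msg.get("Subject", "")),
        )

def select_flows(conn, group, start, end):
    """Step 2: select flows sent or received by a group within [start, end]."""
    marks = ",".join("?" * len(group))
    sql = (
        "SELECT sender, recipient, sent_at FROM flows "
        "WHERE sent_at BETWEEN ? AND ? "
        f"AND (sender IN ({marks}) OR recipient IN ({marks})) "
        "ORDER BY sent_at"
    )
    return conn.execute(sql, [start, end, *group, *group]).fetchall()
```

Dates are stored as ISO strings so that a plain `BETWEEN` comparison orders them correctly. The rows returned by `select_flows` are the directed ties that the third step would hand to the visual browser.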

This architecture provides a testbed of high scalability and flexibility. The number of actors, ties, and messages to be analyzed is limited only by the size of the database and the amount of RAM available, and temporal queries can be run in an ad hoc way. We are also able to experiment with different visualizations of the retrieved structure and to easily add other social web applications.

Figure 1. TeCFlow Architecture

We base our algorithm on the Fruchterman-Reingold graph drawing algorithm [4] for force-directed placement, which is commonly used to visualize social networks. This method compares a graph to a mechanical collection of electrically charged rings (the vertices) and connecting springs (the edges). Every two vertices repel each other with a repulsive force, and adjacent vertices (connected by an edge) are pulled together by an attractive force. Over a number of iterations, the forces modeled by the springs are calculated and the nodes are moved in a bid to minimize the forces felt.
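A single iteration of this kind of force-directed placement can be sketched as follows. The sketch uses the force functions of the Fruchterman-Reingold algorithm (repulsion k²/d between every pair of vertices, attraction d²/k along edges) and omits clipping to a drawing frame and the cooling schedule. The function name `fr_step` and its parameters are assumptions for illustration, not the TeCFlow code.

```python
import math

def fr_step(pos, edges, k, temperature):
    """Run one force-directed iteration and return the new positions.

    pos: {node: (x, y)}; edges: iterable of (u, v) pairs;
    k: ideal edge length; temperature: maximum move per iteration.
    """
    disp = {v: [0.0, 0.0] for v in pos}
    nodes = list(pos)

    def apply(u, v, sign, magnitude_of):
        dx = pos[u][0] - pos[v][0]
        dy = pos[u][1] - pos[v][1]
        d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
        f = magnitude_of(d)
        # sign=+1 pushes u and v apart, sign=-1 pulls them together.
        disp[u][0] += sign * dx / d * f
        disp[u][1] += sign * dy / d * f
        disp[v][0] -= sign * dx / d * f
        disp[v][1] -= sign * dy / d * f

    # Repulsive force between every pair of vertices.
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            apply(u, v, +1, lambda d: k * k / d)
    # Attractive force between vertices connected by an edge.
    for u, v in edges:
        apply(u, v, -1, lambda d: d * d / k)

    # Move each vertex along its net force, limited by the temperature.
    new_pos = {}
    for v in nodes:
        dx, dy = disp[v]
        d = math.hypot(dx, dy)
        if d == 0.0:
            new_pos[v] = pos[v]
            continue
        step = min(d, temperature)
        new_pos[v] = (pos[v][0] + dx / d * step, pos[v][1] + dy / d * step)
    return new_pos
```

Calling `fr_step` repeatedly while lowering `temperature` lets the layout settle. Listing an edge once per exchanged message would strengthen the attraction between frequent correspondents, which is one possible way to obtain the "more messages, shorter arc" behavior; whether TeCFlow does it this way is not stated in the source.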

[Figure 1 labels: e-mail archives (mailing lists, flat files, .pst, .mbx) with fields From, To, Title, Timestamp, Contents; parsing; database (SQL); structural queries]


In our algorithm, we treat the exchanges of e-mail between actors as an approximation of social ties. In our visualization a communication initiated by actor A to actor B is represented as a directed edge from A to B, i.e. a message sent from A to B is depicted as an arc. The more interactions between actors A and B occur, the closer the two representing vertices will be placed. The most connected actors are positioned in the center of the graph. This means that the actors who send and receive the largest number of e-mail messages in a given time frame are placed in the center of the graph. Similarly, the more messages A and B exchange, the shorter their connecting arc becomes.

To display the evolution of communication patterns over time, we developed a dynamic visualization algorithm where the layout of the graph is automatically recalculated every day, resulting in an interactive movie. The simplistic approach would have been, for any given day, to base the graph structure on the communications that occurred during this day. However, this approach does not take into account interactions that happened before this particular day, and would result in a jerky animation of low quality. For our dynamic visualization, we therefore propose a new algorithm: the sliding time frame algorithm, where we are always looking at a time interval consisting of a flexibly chosen number of days.

The basic idea of the sliding time frame algorithm is to display active ties between actors in a sliding time frame covering a flexibly selected interval of n days, starting from the current day d shown in the visualization. The window frame moves forward day by day, and new ties (i.e. e-mail messages exchanged) are subsequently added to the graph each day until the desired width of n days of the sliding time frame is reached. This time frame window allows users to see all activities happening inside the time frame after the current day. By default, old communication activities before the current time frame window are included in the layout of the graph. This reflects persistent ties that, once established, stay active for the remainder of the lifetime of the team. Once an e-mail message has been sent, it will influence the positioning of the actors in the graph for the rest of the animation, meaning that a link does not decay. Rather, after it moves out of the n-day-wide time frame, it is displayed in the visualization as a dimmed-out arc.
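The classification of ties for one frame of the movie can be sketched as below. The sketch reads the time frame as the half-open interval from day d up to, but not including, day d + n, which is one possible interpretation of the description above. The function name `frame_edges` and the message tuple layout are assumptions for illustration.

```python
from datetime import date, timedelta

def frame_edges(messages, current_day, width_days):
    """Split ties into active and dimmed sets for one frame of the movie.

    messages: iterable of (sender, recipient, day) with day a datetime.date.
    Ties dated inside [current_day, current_day + width_days) are active;
    ties dated before current_day stay in the layout but are dimmed.
    Later messages are not shown yet.
    """
    frame_end = current_day + timedelta(days=width_days)
    active, dimmed = set(), set()
    for sender, recipient, day in messages:
        if current_day <= day < frame_end:
            active.add((sender, recipient))
        elif day < current_day:
            dimmed.add((sender, recipient))
    # A tie that is re-used inside the frame counts as active, not dimmed.
    return active, dimmed - active

messages = [
    ("a", "b", date(2001, 1, 1)),
    ("b", "c", date(2001, 1, 5)),
    ("c", "a", date(2001, 1, 20)),
]
active, dimmed = frame_edges(messages, date(2001, 1, 3), 7)
```

Because dimmed ties are returned rather than dropped, a layout step can still use them to position actors, in line with the default behavior described above.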

TREATMENT OF INDIVIDUAL ACTORS

In addition, TeCFlow allows the user to define personalities consisting of multiple e-mail addresses, and to identify groups consisting of multiple personalities.

Figure 2. Merging multiple e-mail addresses of persons and groups

In figure 2, actor [email protected] consists of three e-mail addresses. The group [email protected] is composed of three actors. Maintaining the organization and domain parts of the e-mail addresses permits automatic analysis on the domain and organizational level.
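Merging addresses into personalities amounts to a lookup from each alias to one canonical actor. The following is a minimal sketch of that idea. All addresses and the function names `build_alias_map`, `resolve` and `domain_of` are hypothetical and chosen only for this example.

```python
def build_alias_map(personalities):
    """Map every alias address to the canonical actor it belongs to."""
    alias_to_actor = {}
    for actor, addresses in personalities.items():
        for address in addresses:
            alias_to_actor[address.lower()] = actor
    return alias_to_actor

def resolve(address, alias_to_actor):
    """Return the canonical actor, or the address itself if it is unmapped."""
    return alias_to_actor.get(address.lower(), address.lower())

def domain_of(address):
    """Keep the domain part so that ties can also be aggregated per domain."""
    return address.rsplit("@", 1)[-1].lower()

# Hypothetical personality made of three addresses.
personalities = {
    "jamie@example.org": [
        "jamie@example.org",
        "j.doe@example.com",
        "Jamie.Doe@mail.example.net",
    ],
}
aliases = build_alias_map(personalities)
```

Groups could be handled the same way with a second map from canonical actors to group names. Addresses are lower-cased before lookup so that differences in capitalization do not split one person into several actors.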

We also looked at the frequency with which individuals send and receive messages. We have defined a measure, which we call the “contribution index” [5].

The contribution index is +1 if somebody only sends messages and does not receive any message. The contribution index is –1 if somebody only receives messages and never sends any message. The contribution index is 0 if somebody has a totally balanced communication behavior, sending and receiving the same number of messages:

$$CI = \frac{\text{messages\_sent} - \text{messages\_received}}{\text{messages\_sent} + \text{messages\_received}}$$

We then plotted the contribution index against the total number of messages sent and received by each participant.

Figure 3. Contribution Index

In figure 3, actor A only sent n messages, never receiving any; actor B only received p messages, never sending any; and actor C sent m messages and received m messages (C is located on the x-axis, at a total of 2m messages, because C sent and received the same number of messages).
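The three cases can be checked with a direct transcription of the contribution index. This is a sketch; the function names `contribution_index` and `plot_point` are assumptions, and the handling of actors without any messages is a choice made here, since the source does not cover that case.

```python
def contribution_index(sent, received):
    """Contribution index: (sent - received) / (sent + received)."""
    total = sent + received
    if total == 0:
        # Undefined for actors without any messages (not covered in the source).
        raise ValueError("contribution index is undefined without messages")
    return (sent - received) / total

def plot_point(sent, received):
    """Coordinates in the plot: total messages (x), contribution index (y)."""
    return sent + received, contribution_index(sent, received)

# Actors A, B and C with n = 5, p = 3 and m = 4 as example values.
point_a = plot_point(5, 0)
point_b = plot_point(0, 3)
point_c = plot_point(4, 4)
```

With these example values, A lies at the top of the plot, B at the bottom, and C on the x-axis at a total of 8 messages.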

APPLYING TECFLOW STEP BY STEP

The TeCFlow tool is used in three steps:

(1) Watch social interaction pattern movies to find dense clusters indicating potential emergence of collaborating teams.


(2) Look for peaks and troughs in the temporal evolution of group betweenness centrality and density to find the most “interesting” phases of collaboration in the lifetime of the team.

(3) Look at the contribution index to better understand the roles of the individuals in teams.

Connecting the dots by combining steps (1) to (3) gives a thorough understanding of the team dynamics during the chosen time interval and leads to new insights not easily obtainable through other means. In this section we illustrate the use of TeCFlow by analyzing a globally active research and development community of a global management consulting firm.

Our sample data set consists of an e-mail archive of a virtual consulting practice with 200 members of a global consulting firm covering the time period from mid-2000 to early 2002. It is composed of the ego-networks of the practice leader and the practice coordinator (i.e. their e-mailboxes). Those e-mailboxes are taken as an approximation of the organizational memory of the consulting practice, as the practice leader and the coordinator were informed of all major events in the practice. The mailboxes were partitioned manually into mail folders by subject areas. Mail folders included one for each of the eight service offerings, a folder for each project currently active, sales efforts, marketing activities, and the organization of two practice-wide seminars conducted over the Web (“Webinars”). The major advantage of this data set is that one of us was intimately involved in the consulting practice. The disadvantage is that the mailboxes of the practice coordinator and the practice leader do not include the direct one-to-one communication among the practice members bypassing the practice coordinator or the practice leader.

The three-step process of using the TeCFlow tool as outlined above is now applied to this data set. To gain an overview of communication activities in the practice, we start with an analysis of all messages of the practice.

Step 1 – Watch movies to find communities

By watching a movie containing the entire e-mail traffic of the consulting practice we are able to discover core/periphery structures [1,2,3]. This is a strong indicator for the emergence of a collaborative innovation team (top and bottom windows of figure 4).

Step 2 – Find interesting time periods

In the next step we look at the progression of group betweenness centrality (GBC) and group density over time. A rapid change in slope in the graph, i.e. a spike or a trough, indicates an interesting event. In those cases, we go back to the movie and drill down into the graph by clicking on interesting actors and looking at the contents of the e-mail messages exchanged.

The two troughs in the GBC graph in the center left window of figure 4 correspond with periods of high activity of the community.
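The search for spikes and troughs can be sketched for the density curve. The sketch assumes the standard density of a directed graph (distinct ties divided by the n(n-1) possible ties) and flags points where the value jumps by more than a chosen threshold. The source states neither the exact density formula nor a detection rule, so both are assumptions, as are the names `density` and `interesting_points`. Group betweenness centrality is left out because the source does not give its formula.

```python
def density(actors, edges):
    """Density of a directed graph: distinct ties / possible ties."""
    n = len(actors)
    if n < 2:
        return 0.0
    distinct = {(u, v) for u, v in edges if u != v}
    return len(distinct) / (n * (n - 1))

def interesting_points(series, threshold):
    """Indices where the value changes by more than threshold in one step."""
    return [
        i for i in range(1, len(series))
        if abs(series[i] - series[i - 1]) > threshold
    ]

# Hypothetical density values, one per time interval.
density_over_time = [0.1, 0.1, 0.5, 0.5, 0.1]
flagged = interesting_points(density_over_time, threshold=0.2)
```

The flagged indices mark the intervals at which an analyst would return to the movie and drill down into the messages exchanged.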

Step 3 – Identify most active actors

In the third step, plotting the contribution index identifies the most active actors. In figure 4 the contribution index pattern of the team (center right window of figure 4) is largely consistent with our earlier results. In [5] we found that the coordinator sends the most messages, sending more than he receives, while the leader sends fewer messages and receives significantly more than he sends.

In figure 4, the practice leader receives vastly more messages than he sends, and the practice coordinator is the most active sender of messages. Usually the practice coordinator would also be the most active contributor. Surprisingly though, a practice member is the most active participant, making herself the leader at the core of a new innovation community.

Combining the steps – analyzing the birth of a new community

The analysis with TeCFlow allows us an intimate look into the emergence of new teams and online communities not possible by other means.

The creation of the new community as well as the emergent role of the leader of the new community would have remained completely hidden, had we not combined the contribution index plot with the dynamic movie. The group betweenness centrality view allows us to quickly zoom into time periods of particular interest, where we can then use the drill-down features of the dynamic view to look at the contents of the e-mail exchange.

Surprising results of this analysis are:

• The emergence of a new innovation team, coming up with a new and creative consulting service offering.

• The central role of a non-executive member of the consulting practice in creating this new service offering.

• The easy identification of the time period when the new innovation team was most active.

• The easy identification of the core team members of the new innovation team.


Figure 4. 3-Step Analysis with TeCFlow, combining movie, GBC, and Contribution Index

This analysis was done two years after the data had been collected. Had this tool been available to monitor virtual interaction in real time, senior management would have been much better able to adequately support these activities, thereby reducing time to market while also increasing awareness of the team’s output within the consulting firm.

FUTURE WORK AND CONCLUSIONS

We are currently working to develop a multiuser version of our system, where users can upload anonymized communication data sets to a “Global Social Web” under strict privacy and anonymity. We hope that this will encourage users to share their communication data such that we can get a much broader view on social interactions than has been possible until now. We also plan to use our tools for other types of communication activities. Because TeCFlow runs on top of a database, it is straightforward to import, for example, phone logs, instant messaging logs, and blog transcripts into the database instead of e-mail archives.

Our continuing goals are to gain deeper insights into the evolution of online group dynamics and to develop a theory of member roles in virtual communities using more detailed communication pattern analysis.

REFERENCES

1. Borgatti, S.P. & Everett, M.G. (1999). Models of Core/Periphery Structures. Social Networks 21: 375-395.

2. Bulkley, N. & Van Alstyne, M. (2004). Why Information Should Influence Productivity. (forthcoming) In The Network Society: A Cross-Cultural Perspective. Manuel Castells, ed. Edward Elgar Publishing.

3. Cross, R. & Cummings, J. (2003). Relational and Structural Network Correlates of Performance in Knowledge Intensive Work. Academy of Management, Seattle, WA. Paper published in Proceedings.

4. Fruchterman, T.M.J. & Reingold, E.M. (1991). Graph drawing by force directed placement. Software: Practice and Experience, 21(11).

5. Gloor, P., Laubacher, R., Dynes, S. & Zhao, Y. (2003). Visualization of Communication Patterns in Collaborative Innovation Networks: Analysis of some W3C working groups. Proc. ACM CIKM 2003, New Orleans, Nov. 5-6.

6. Varghese, G. & Allen, T. (1993). Relational Data in Organizational Settings: An Introductory Note for Using AGNI and Netgraphs to Analyze Nodes, Relationships, Partitions and Boundaries. Connections, Volume XVI, Number 1 & 2, Spring.
