@margaretstorey #vissoft14 #icsme14
Visualization for Software Analytics
Margaret-Anne (Peggy) Storey
@margaretstorey
#vis4se
Why visualization?
Provide insights
Answer questions
Support wayfinding
Tell stories
Communicate knowledge, awareness
For software… Visualize what?
Design
UML
Algorithms
bost.ocks.org/mike/algorithms/
Code, dependencies http://thechiselgroup.org/2005/07/06/zest/
http://swerl.tudelft.nl/bin/view/Main/ExTraVis
Dynamic behaviour
Cleary, B., Storey, M., Chan, L., Salois, M., Painchaud, F., "ATLANTIS - Assembly Trace Analysis Environment," Working Conference on Reverse Engineering (WCRE), 2012.
http://hapao.dcc.uchile.cl
Architecture
Wettel, R. & Lanza, M. "CodeCity: 3D visualization of large-scale software," (ICSE Companion '08), 2008
Creole: http://thechiselgroup.org/2003/07/06/creole/
Human activities
Gource: visualizing commits
Software ecosystems
ADOPTION
Lessons learned?
THEORIES
METHODS
FLOW
USERS
TASKS
Diver:
Myers, D. & Storey, M. "Focusing on Execution Traces Using Diver," 18th Working Conference on Reverse Engineering (WCRE), 2011, pp.439-440
A theory of cognitive support for Diver…
Dimensions, Characteristics, Elements
Intent
  Role: Team, Developer, Manager, Researcher, Maintainer, Reengineer
  Time: Present, Recent Past, Historical
  Authorship: Authorship, Rationale, Time, Artifacts
Information
  Change management: Local History, Releases, Branching, Revisions
  Defect tracking: Defects, Changes
  Program code: Modules/components, Syntactic units (e.g. files), Semantic analysis
  Documentation: Requirements, Design, Test cases, Architecture
  Informal communication: Email, Instant messages
  Derived/Aggregated: Single source, multiple source
Presentation
  Form: Text, Hypertext, Graphical
  Kinds of views: Annotated views, Statistical views, Graph views, Special views
  Techniques: Visual variables (colour, position etc), Animation, 2D/3D
Interaction
  Batch/Live: Offline, Online, Customizable
  Customization: Level of customization, sharing and saving customizations
  Queries: Query language, Visual queries, Filter widgets
  View navigation: Multiple views, Overview+detail, Zoomable views, Coupled
Effectiveness
  System: Implemented, Availability, Scalability, Interoperability
  Cost: Economic cost, Installation, Learning, Usage
  Evaluation: Adopted, Case study, User study
Storey, M.-A. & Cubranic, D. & German, D. M. "On the use of visualization to support awareness of human activities in software development: a survey and a framework," ACM symposium on Software Visualization (SoftVis), 2005.
Framework…
What’s next?
Three trends to consider…
Developers: solo coder -> social coder
Software development: code centric -> data centric
Visualization: standalone -> ubiquitous
“I know how this was done because I did it”
“I need complete understanding”
Peter Norvig, Coders at Work
“How is this likely done?”
“Can I quickly get an understanding of what I need?”
Peter Norvig, Coders at Work
“Google team?”
PlaceSpace
P. Dourish and V. Bellotti. Awareness and Coordination in Shared Workspaces. Proceedings of the ACM Conference on Computer-Supported Cooperative Work (CSCW'92).
Developer tools…
[Figure: timeline with axis marks at 1968, 1970, 1980, 1990, 2000 and 2010, showing developer tools and communication channels grouped as Nondigital, Digital, and Digital & Socially Enabled. Labels shown: Telephone, Face2Face, Project Workbook, Documents, Email Lists, VisualAge, Visual Studio, NetBeans, Eclipse, IRC, ICQ, Skype, SourceForge, Wikis, Trello, Basecamp, Jazz, Campfire, Google Hangouts, Punchcards, TFS, Books, Usenet, Stack Overflow, Twitter, Google Groups, Podcasts, Blogs, GitHub, Conferences, Societies, LinkedIn, Facebook, Slashdot, HackerNews, Masterbranch, Coderwall, Meetups, Yammer.]
Storey, M.-A., L. Singer, F. Figueira Filho, B. Cleary and A. Zagalsky, The (R)evolutionary Role of Social Media in Software Engineering, ICSE 2014 Future of Software Engineering Track, 36th International Conference on Software Engineering (ICSE 2014) Hyderabad.
Social Media and Participatory Cultures [Jenkins]
Low barriers to artistic expression and engagement
Strong support for sharing one’s creations
Informal mentorship for novices
Members believe their contributions matter
Members care about social connections and what others think about their creations
The Participatory Culture in Software Engineering is not new
Internet and free/open source projects
Linux and the bazaar model of programming
Global software development (GSD)
Historical importance of tools and the social shaping of communities
Code centric -> (Big) Data centric
User feedback -> usage logs, social media
In lab testing -> large scale testing in the wild
Centralized -> distributed development
Long product cycle -> continuous releases
Era of software analytics
Quiz!!! Which code should I test?
1. Which day of the week is likely to produce the buggiest code? Mon-Sun?
2. Who produces more buggy code? Junior or Senior Developers?
3. Which metrics are most useful in predicting defects? a. Lines of code, b. complexity of the code, c. number of developers that worked on the code, d. previous bugs in the code, or e. code churn
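Option (e) in the third question, code churn, is often measured as lines added plus lines deleted per file over some period. As a minimal sketch of how such a metric could be computed (the commit records, file names, and function names below are hypothetical and not from the talk):

```python
from collections import defaultdict

# Hypothetical commit records: (file path, lines added, lines deleted).
commits = [
    ("core/parser.py", 120, 30),
    ("core/parser.py", 15, 40),
    ("ui/view.py", 10, 2),
    ("core/lexer.py", 60, 5),
]


def churn_per_file(records):
    """Sum lines added plus lines deleted for each file."""
    totals = defaultdict(int)
    for path, added, deleted in records:
        totals[path] += added + deleted
    return dict(totals)


def rank_by_churn(records):
    """Return (path, churn) pairs, highest churn first."""
    return sorted(churn_per_file(records).items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for path, churn in rank_by_churn(commits):
        print(f"{path}: {churn}")
```

Whether churn actually predicts defects better than the other options is an empirical question for a given project; the sketch only shows how the number itself might be derived.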
Software Analytics: A definition
Software Analytics is to enable software practitioners to perform data exploration and analysis to obtain insightful and actionable information for data-driven tasks around software and services.
Dongmei Zhang & Tao Xie, http://research.microsoft.com/en-us/groups/sa/softwareanalyticsinpractice_minitutorial_icse2012.pdf
Goals of software analytics?
To improve:
- quality of the software
- experience of the users
- productivity of the developers
Dongmei Zhang & Tao Xie, http://research.microsoft.com/en-us/groups/sa/softwareanalyticsinpractice_minitutorial_icse2012.pdf
Prolific data sources and analysis techniques
Program data: runtime traces, program logs, system events, failure logs, performance…
User data: usage logs, user surveys, user forums, twitter and blogs, …
Development data: versions, bug data, commits, testing, communication
Need for actionable insights
To support decision making
“use data rather than fortune tellers” [A. Hassan]
But need more than data!
http://www.slideshare.net/taoxiease/software-analytics-towards-software-mining-that-matters
The need for visual analytics!
Focus has been on:
- acquiring/cleaning/managing the data
- analytics
- understanding which questions to ask…
One of the key pillars to support software analytics is visualization [Zhang & Xie]
Dongmei Zhang & Tao Xie, http://research.microsoft.com/en-us/groups/sa/softwareanalyticsinpractice_minitutorial_icse2012.pdf
Recap: Why software visualization?
Provide insights
Answer questions
Support wayfinding
Tell stories
Communicate knowledge, awareness
Visualization ubiquity
Visual analytics (gain insights)
Deep integration (cognitive support in context)
Infographics (tell a story)
Dashboards (awareness)
Visual analytics
Information visualization process: overview, filter and zoom, details on demand
vs
Visual analytics process: analyze first, show the important, zoom, filter and analyze further, details on demand
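The visual analytics process above differs from the information visualization process mainly in that computation comes before display. A minimal sketch of that ordering, assuming hypothetical per-module defect counts and a simple z-score as the analysis step (neither comes from the talk):

```python
from statistics import mean, pstdev

# Hypothetical defect counts per module.
modules = {"auth": 3, "billing": 4, "search": 2, "sync": 21, "ui": 5, "export": 3}


def analyze_first(data):
    """Analyze first: score each module by its z-score across all modules."""
    mu = mean(data.values())
    sigma = pstdev(data.values())
    return {name: (count - mu) / sigma if sigma else 0.0 for name, count in data.items()}


def show_the_important(scores, threshold=1.5):
    """Show the important: keep only modules scoring above the threshold."""
    return sorted(name for name, score in scores.items() if score > threshold)


def details_on_demand(data, name):
    """Details on demand: return the raw record for one selected module."""
    return {"module": name, "defects": data[name]}


if __name__ == "__main__":
    important = show_the_important(analyze_first(modules))
    for name in important:
        print(details_on_demand(modules, name))
```

An information visualization process would instead start by drawing every module as an overview and leave it to the viewer to filter and zoom; the threshold of 1.5 here is an arbitrary choice for illustration.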
Visual debugging: Debugger Canvas
http://www.youtube.com/watch?v=3p9XUwIlhJg
Infographics
Tells a story, quickly
Shared socially
Interactive
Popular, accessible: visual.ly, Tableau Public
Examples: New York Times, Tagging, Stack Overflow, Twitter…
http://www.nytimes.com/newsgraphics/2013/07/21/silk-road/
Tagging work items in
C. Treude and M.-A. Storey. Work Item Tagging: Communicating Concerns in Collaborative Software Development. In IEEE Transactions on Software Engineering 38, 1 (January/February 2012). pp. 19-34.
ConcernLines
http://githut.info
Coverage of API documentation: 77% of the Java API classes & 87% of Android API classes
Speed of coverage:
C. Parnin, C. Treude, L. Grammel and M.-A. Storey. “Crowd Documentation: Exploring the Coverage and the Dynamics of API Discussions on Stack Overflow”. http://blog.ninlabs.com/2012/05/crowd-documentation/ May 2012.
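Coverage here can be read as the share of API classes that are discussed in at least one Stack Overflow thread. The sketch below shows only that arithmetic. The class names and thread records are hypothetical, and it assumes threads have already been linked to classes, a step the slide does not describe.

```python
def coverage(api_classes, mentioned_classes):
    """Fraction of API classes that appear in at least one discussion thread."""
    api = set(api_classes)
    if not api:
        return 0.0
    return len(api & set(mentioned_classes)) / len(api)


# Hypothetical API and thread data, for illustration only.
api_classes = ["ArrayList", "HashMap", "Thread", "Socket"]
threads = [
    {"id": 1, "classes": ["ArrayList", "HashMap"]},
    {"id": 2, "classes": ["Thread", "ArrayList"]},
]

# Collect every class mentioned in any thread.
mentioned = {cls for thread in threads for cls in thread["classes"]}

if __name__ == "__main__":
    print(f"Coverage: {coverage(api_classes, mentioned):.0%}")
```

Speed of coverage could be tracked the same way by recomputing the fraction over threads created up to each date.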
Crowd authored API documentation!
http://latest-print.crowd-documentation.appspot.com/?api=android
http://graphoverflow.com
How developers use Twitter
Awareness
Learning
Relationships
Why non-adoption
Strategies
“It was evolving way faster than I was able to keep up with it. And the only way to keep up was to follow some Node.js people on Twitter.”
Leif Singer, Fernando Figueira Filho, Margaret-Anne Storey. Software Engineering at the Speed of Light: How Developers Stay Current Using Twitter. ICSE 2014.
Sentiments on Twitter for: shellshock
http://www.csc.ncsu.edu/faculty/healey/tweet_viz/tweet_app/
Dashboards
Awareness
Making informed decisions
Live data
Business intelligence
Dashboards for developer awareness
Treude, C., and M.-A. Storey, “Awareness 2.0: staying aware of projects, developers and tasks using dashboards and feeds,” in ICSE’10: Proc. of the 32nd ACM/IEEE Int. Conference on Software Engineering, ACM.
Assessing and watching developers
L. Singer, F. F. Filho, B. Cleary, C. Treude, M.-A. Storey, K. Schneider. Mutual Assessment in the Social Programmer Ecosystem: An Empirical Investigation of Developer Profile Aggregators. Blog: http://to.leif.me/devprofiles
Recap…
Developers: solo coder -> social coder
Software development: code centric -> data centric
Visualization: standalone -> ubiquitous
Visualization for software analytics
Opportunities and challenges
TL;DR
Mobile
Scale
Visualizations as social media
Visual software analytics should be actionable!
http://think.withgoogle.com/databoard/
Visualize and share your research results!
Takeaways
Software developers are the prototype knowledge workers of tomorrow
Software visualization has come of age: social coder, software analytics, ubiquitous visualization
Acknowledgements
CHISEL group, UVic, Canada:
– Christoph Treude
– Brendan Cleary
– Alexey Zagalsky
– Peter Rigby
– Lars Grammel
– ……
Chris Parnin, NCSU
Leif Singer, I Done This
Daniel German, UVic
Arie van Deursen, TU Delft
Fernando Figueira Filho, Brazil
Selected additional References
Software visualization:
Stasko, J. T., Brown, M. H. & Price, B. A. (Eds.) Software Visualization, MIT Press, 1997
Petre, M. "UML in practice," Proceedings of the 2013 International Conference on Software Engineering (ICSE), 2013, pp.722-731
Blackwell, A., Britton, C., Cox, A., Green, T., Gurr, C., Kadoda, G., Kutar, M., Loomes, M., Nehaniv, C., Petre, M., Roast, C., Roe, C., Wong, A. & Young, "Cognitive Dimensions of Notations: Design Tools for Cognitive Technology," Cognitive Technology: Instruments of Mind, Springer Berlin Heidelberg, 2001, vol.2117, pp.325-341
DeLine, R., Bragdon, A., Rowan, K., Jacobsen, J., & Reiss, S. "Debugger canvas: industrial experience with the code bubbles paradigm," Proceedings of the 34th International Conference on Software Engineering (ICSE), 2012, pp.1064-1073.
Bull, R. I. & Storey, M.-A. "Towards visualization support for the eclipse modeling framework," A Research-Industry Technology Exchange, 2005
Cleary, B., Gorman, P., Verbeek, E., Storey, M.-A, Salois, M., Painchaud, F., "Reconstructing program memory state from multi-gigabyte instruction traces to support interactive analysis," 20th Working Conference on Reverse Engineering (WCRE), Oct. 2013, pp.42-51
Social coding:
Communities of practice: http://www.ewenger.com/theory/
C. Treude and M.-A. Storey. Effective Communication of Software Development Knowledge Through Community Portals. ESEC/FSE ’11.
M.-A. Storey, C. Treude, A. van Deursen and L.-T. Cheng. The Impact of Social Media on Software Engineering Practices and Tools. In FoSER ’10: Proceedings of the FSE/SDP workshop on Future of software engineering research.
Storey, M.-A., L. Singer, F. Figueira Filho, B. Cleary and A. Zagalsky, The (R)evolutionary Role of Social Media in Software Engineering, ICSE 2014 Future of Software Engineering Track, 36th International Conference on Software Engineering (ICSE 2014) Hyderabad, 2014
Begel, A., J. Bosch, and M.-A. Storey. Social Networking Meets Software Development: Perspectives from GitHub, MSDN, Stack Exchange, and TopCoder. Software, IEEE 30.1 (2013): 52-66.
Software analytics:
IEEE Software: two special issues on Software Analytics, July/August 2013
Tao Xie’s tutorial on software analytics: http://www.slideshare.net/taoxiease/software-analytics-towards-software-mining-that-matters
Research methods:
Easterbrook, S., Singer, J., Storey, M.-A. & Damian, D. "Selecting Empirical Methods for Software Engineering Research," Guide to Advanced Empirical Software Engineering, Springer London, 2008, pp.285-311
Walenstein, A., "Observing and measuring cognitive support: steps toward systematic tool evaluation and engineering," 11th IEEE International Workshop on Program Comprehension (IWPC), 2003, pp.185-194
Begel, A. & Zimmermann, T. "Analyze this! 145 questions for data scientists in software engineering," Proceedings of the 36th International Conference on Software Engineering (ICSE), 2014, pp.12-23.
Visual analytics:
Illuminating the path: http://vis.pnnl.gov/pdf/RD_Agenda_VisualAnalytics.pdf
Mark Smiciklas (2012). The Power of Infographics: Using Pictures to Communicate and Connect with Your Audience.