Software Visualization
CS 4460 - Information Visualization
Many slides courtesy John Stasko, with edits by J. Foley
CS 4460/7450 2
Software Visualization
• Definition
“The use of the crafts of typography, graphic design, animation, and cinematography with modern human-computer interaction and computer graphics technology to facilitate both the human understanding and effective use of computer software.”
Price, Baecker and Small, ‘98
Challenge
• Unlike much information visualization, however, software is often dynamic, thus requiring that our visualizations reflect the time dimension
- History views
- Animation
- ...
Subdomains
• Two main subareas of software visualization
- Program visualization
    Use of visualization to help programmers, coders, developers
    Software engineering focus
- Algorithm visualization
    Use of visualization to help teach algorithms and data structures
    Pedagogy focus
Caveat
• This is a HUGE area
• Presentation goal: provide a flavor of the kinds of techniques and systems that have been created
- Lots of screen shots
- Some videos
Program Visualization
• Can be as simple as enhanced views of program source
• Can be as complex as views of the execution of a highly parallel program, its data structures, run-time heap, etc.
PV is a big research area
PV is a Big Product Category
• Do Google searches on
- Program visualization
- Code visualization
- Software visualization
• Lots of companies/products
What Can We Visualize?
• ??????
What Can We Visualize?
• Structure
- Static code structures
• Behavior
- Dynamic (execution) code behavior
- Test suite results
- Bug issues/relations
• Evolution
- Code repository structure/evolution
- Program team communication patterns
Static Code Structures
• Call graphs
• Object class hierarchy
• Scope of variables
• Code module sizes
Execution Data
Summaries, such as
- Total running time
- Number of times a method was called
- Amount of time CPU was idle
- Bytes read/written
- Memory high-water mark
- Etc.
Details, such as
- Memory allocations
- System calls
- Cache misses
- Page faults
- Pipeline flushes
- Process scheduling
- Completion of disk reads or writes
- Message receipt
- Application phases
- Etc.
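A few of the summary items above can be gathered with the Python standard library alone. The following is a minimal sketch, not any particular tool's method: the `counted` decorator, the `square` workload and the dictionary keys are hypothetical names chosen for illustration.

```python
import time
import tracemalloc
from collections import Counter
from functools import wraps

call_counts = Counter()  # function name -> number of calls


def counted(func):
    """Record how many times the wrapped function is called."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper


@counted
def square(x):  # hypothetical workload
    return x * x


def run_with_summary(n):
    """Run a toy workload and return summary execution data."""
    call_counts.clear()
    tracemalloc.start()
    start = time.perf_counter()
    values = [square(i) for i in range(n)]
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()  # (current, peak)
    tracemalloc.stop()
    return {
        "total_running_time_s": elapsed,
        "calls": dict(call_counts),
        "memory_high_water_mark_bytes": peak_bytes,
        "result_sum": sum(values),
    }


summary = run_with_summary(1000)
```

Each value in `summary` is a single number per run, which is what makes summaries easy to chart; the detail items (system calls, cache misses, page faults) generally need OS or hardware support to capture.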
Test Results
• Hundreds, maybe thousands of tests
• For each test:
- Purpose
- Result (pass or fail)
    Could be per-configuration or per-version
- Relevant parts of the code
Really Detailed Execution Data
• Logging virtual machines can capture everything
- Enough data to replay program execution and recreate the entire machine state at any point in time
- Allows “time-traveling”
- For long running systems, data could span months
• Uses:
- Debugging
- Understanding attacks
SoftVis 2010 Program
• An Interactive Ambient Visualization for Code Smells
• CodePad: Interactive Spaces for Maintaining Concentration in Programming Environments
• User Evaluation of Polymetric Views Using a Large Visualization Wall
• Software Evolution Storylines
• AllocRay: Memory Allocation Visualization for Unmanaged Languages
• Heapviz: Interactive Heap Visualization for Program Understanding and Debugging
• A Map of the Heap: Revealing Design Abstractions in Runtime Structures
• Trevis: A Context Tree Visualization & Analysis Framework and Its Use for Classifying Performance Failure Reports
• Exploring the Inventor's Paradox: Applying Jigsaw to Software Visualization
• Dependence Cluster Visualization
• Towards Anomaly Comprehension: Using Structural Compression to Navigate Profiling Call-Trees
• Embedding Spatial Software Visualization in the IDE: An Exploratory Study
• 3D Kiviat Diagrams for the Interactive Analysis of Software Metric Trends
• Graph Works - Pilot Graph Theory Visualization Tool
• Visualizing Software Entities Using a Matrix Layout
• ImpactViz: Visualizing Class Dependencies and the Impact of Changes in Software Revisions
• VIPERS: Visual Prototyping Environment for Real-Time Imaging Systems
• Towards Automated Analysis and Visualization of Distributed Software Systems
• TIE: An Interactive Visualization of Thread Interleavings
• GEM: Graphical Explorer of MPI Programs
• Fault Forest Visualization
• Visualizing Windows System Traces
• Understanding Complex Multithreaded Software Systems by Using Trace Visualization
• Zinsight: A Visual and Analytic Environment for Exploring Large Event Traces
• Jype - A Program Visualization and Programming Exercise Tool for Python
• Off-Screen Visualization Techniques for Class Diagrams
• An Automatic Layout Algorithm for BPEL Processes
• Visual Comparison of Software Architectures
• Representing Development History in Software Cities
• Frank xDIVA: Automatic Animation Between Debugging Break Points
• Understanding Relaxed Memory Consistency Through Interactive Visualization
the Execution of Object Orientated Concurrent Programs
Commercial System Screen Shots
Dependency Graph
NDepend commercial product; http://www.ndepend.com/SampleReports/OnDb4o/NDependReport.html#/?screen=Main
Graph and Matrix Representation
NDepend
Code Hierarchy Treemap; Lines of Code
NDepend
Open Source System Screen Shots
• From the Source-Navigator™ IDE
- http://sourcenav.sourceforge.net/index.html
Hierarchy Browser Window
Cross-Reference Browser Window
Cross-Reference Browser Window
Open Source System Screen Shots
• From jBixbe
- http://www.jbixbe.com/index.html
Structure Diagram
Message Exchanges
Multi-threading
Research Examples
• The following are discussions of a variety of research prototypes
• Lots and lots of papers in Software Visualization Conference Proceedings
Pretty Printing – Common Example
More Sophisticated
• But not so common to apply more sophisticated graphic design
• Baecker & Marcus, Design Principles for the Enhanced Presentation of Computer Program Source Text, Proceedings CHI ’86
• No user testing
• Looks nice
Baecker & Marcus Pretty Printing
SeeSoft System
• Pulled-back, far away view of source code
• Map one line of source to one line of pixels
- Maintain indentation & length
- Color code lines in meaningful way
• Like taping your source code to the wall, walking far away, then looking back at it
Stephen G. Eick, Joseph L. Steffen and Eric E. Sumner, Jr. “SeeSoft – A Tool for Visualizing Line-Oriented Software Statistics.” IEEE Transactions on Software Engineering, 18(11):957-968, November 1992.
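The line-to-pixel-row mapping can be sketched in a few lines of Python. This is only an illustration of the idea of preserving indentation and length, not SeeSoft's implementation; the function names and the ASCII "rendering" are assumptions made for the example.

```python
def seesoft_rows(source, tab_width=8):
    """Reduce each source line to (indent, length), both in character columns.

    A renderer could draw row i as a one-pixel-high strip that starts at
    x = indent and is `length` pixels long.
    """
    rows = []
    for line in source.splitlines():
        expanded = line.expandtabs(tab_width).rstrip()
        body = expanded.lstrip()
        rows.append((len(expanded) - len(body), len(body)))
    return rows


def render_ascii(rows, mark="#"):
    """Stand-in for pixel drawing: one text row per source line."""
    return "\n".join(" " * indent + mark * length for indent, length in rows)


sample = "def f(x):\n    return x + 1\n\nprint(f(2))"
rows = seesoft_rows(sample)
```

In a real view each row would also carry a colour derived from some per-line statistic (age, author, coverage), which is the subject of the slides that follow.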
SeeSoft View
15,000 lines of code in 52 files
Selected code
Details of selected code
Heat map for color-coded characteristic
SeeSoft
• Code tracking (typically means mapping a data attribute to line color)
- Code modification (when, by whom)
- Location of bug fixes for specific bug report
- Location of code to implement a specific feature
- Code coverage or hotspots
• Interactive
- Change color mappings
- Link from heat map to code overview
- Brush views – back to source code
Tarantula
• Developed at GT
• Utilizes SeeSoft code view methodology
• Takes results of test suite run and helps developer find program faults
• Key is the clever color mapping
Eagan, Harrold, Jones & Stasko, InfoVis ’01
Jones, Harrold & Stasko, ICSE ’02
Tarantula View
Red – code executed by failed tests
Green – executed by passed tests
Yellow – executed by both passed and failed
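The three-colour rule can be written down directly. The sketch below assumes a hypothetical input format, a list of `(passed, covered_lines)` pairs with one pair per test; it is not taken from the Tarantula papers.

```python
from collections import defaultdict


def line_colours(test_runs):
    """Assign the discrete red/green/yellow colours to executed lines.

    test_runs: iterable of (passed, covered_lines) pairs, where `passed` is a
    bool and `covered_lines` is a set of line numbers (assumed format).
    Lines executed by no test are left out of the result.
    """
    passed_hits = defaultdict(int)
    failed_hits = defaultdict(int)
    for passed, covered_lines in test_runs:
        hits = passed_hits if passed else failed_hits
        for line in covered_lines:
            hits[line] += 1

    colours = {}
    for line in sorted(set(passed_hits) | set(failed_hits)):
        if passed_hits[line] and failed_hits[line]:
            colours[line] = "yellow"  # executed by both passed and failed tests
        elif failed_hits[line]:
            colours[line] = "red"     # executed only by failed tests
        else:
            colours[line] = "green"   # executed only by passed tests
    return colours


runs = [
    (True, {1, 2, 3}),
    (True, {1, 2}),
    (False, {1, 4}),
]
colours = line_colours(runs)
```

With these three runs, line 4 is the only line touched exclusively by the failing test, so it is the only red line.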
Tarantula – Selective Display
Failed tests only
Tarantula: Continuous Colour Mapping
• Extend discrete colour mapping by
- Interpolating between red and green
- Adjusting brightness according to number of tests
• Possibilities:
- Number of passed or failed tests
- Ratio of passed to failed tests
- Ratio of % passed to % failed
Tarantula: Continuous Colour Mapping
• For each line L
- Let p and f be the percentages of passed and failed tests that executed L
- If p = f = 0, colour L grey
- Else, colour L according to
    Hue: p / ( p + f ), where 0 is red and 1 is green
    Brightness: max( p, f )
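The rule above translates almost directly into code. This sketch assumes p and f are given on a 0 to 100 scale and uses Python's `colorsys` HSV conversion; the particular grey value and the rescaling of hue onto the HSV red-to-green range are choices made for the example.

```python
import colorsys

GREY = (0.5, 0.5, 0.5)  # assumed grey for lines no test executed


def continuous_colour(p, f):
    """Return an (r, g, b) triple in 0..1 for one line.

    p, f: percentages (0-100) of passed and failed tests that executed the line.
    """
    if p == 0 and f == 0:
        return GREY
    hue = p / (p + f)               # 0 is red, 1 is green (the slide's scale)
    brightness = max(p, f) / 100.0
    # In HSV, hue 0 is red and hue 1/3 is green, so rescale before converting.
    return colorsys.hsv_to_rgb(hue / 3.0, 1.0, brightness)
```

For example, a line run by every failed test and no passed test comes out as full-brightness red, while a line run by half of each group comes out as a half-brightness yellow.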
Tarantula: Future Work
• Does it help find bugs?
- Seems like it should
• Link red lines to the tests
- Static
- Dynamic
Gammatella - Visualization of Program-Execution Data for Deployed Software
Orso, Jones and Harrold, Proc. of ACM Symp. on Software Visualization, June 2003, pp. 67-76.
Gammatella: Tri-Level Representation
• System level
- Treemap of package/class hierarchy
- Size – code lines
- Smaller areas within each block are colored according to how its code lines are colored
• File level:
- SeeSoft-like view of code
• Statement level:
- Source code (colored text)
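To make the system-level idea concrete, here is a sketch of a size-weighted treemap using the simple slice-and-dice layout. This is not Gammatella's own layout algorithm; the tree encoding (a `(name, lines_of_code)` leaf or a `(name, [children])` package) and the file names are hypothetical.

```python
def size(node):
    """Total lines of code under a node."""
    _, payload = node
    if isinstance(payload, list):
        return sum(size(child) for child in payload)
    return payload


def slice_and_dice(node, x, y, w, h, depth=0):
    """Lay out a tree inside the rectangle (x, y, w, h).

    Returns a list of (name, x, y, w, h) rectangles, one per leaf, whose
    areas are proportional to the leaves' lines of code.
    """
    name, payload = node
    if not isinstance(payload, list):
        return [(name, x, y, w, h)]
    rects = []
    total = size(node)
    offset = 0.0
    for child in payload:
        fraction = size(child) / total if total else 0.0
        if depth % 2 == 0:  # even depth: split the width
            rects += slice_and_dice(child, x + offset * w, y, fraction * w, h, depth + 1)
        else:               # odd depth: split the height
            rects += slice_and_dice(child, x, y + offset * h, w, fraction * h, depth + 1)
        offset += fraction
    return rects


tree = ("app", [
    ("ui", [("View.java", 300), ("Menu.java", 100)]),
    ("core", [("Engine.java", 600)]),
])
rects = slice_and_dice(tree, 0, 0, 100, 50)
```

Each leaf rectangle could then be subdivided again by the share of its lines in each colour class, which is how the "smaller areas" in the system-level view can be read.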
Gammatella screen shot
Multiple linked views
Gammatella
• Displays program execution data
- From database of executions
• Visualization of data from many executions
- Code coverage and profiling data
- Execution properties
    OS
    Java version
    Errors
    Etc.
Color Coding
• Two variables
- Hue: Red to yellow to green
- Intensity: Dark to light
• Various attributes can be mapped to color
- Execution profile: red frequent, green infrequent; intensity not used
- Other mappings not discussed
    What could be done???
System-Level View
Execution Bar
Gray: Never executed code
Colored subdivisions within a module block indicate amounts of code with various usages
Execution Bar
• Each small vertical bar represents one program execution (test run)
- Color code – results of test run
• May be hundreds of test runs
• Treemap shows information for selected test run(s)
- Select one or range of runs by pointing
- Select multiple non-contiguous runs via database query on test run properties
- For multiple executions, color codings are averages
Gammatella: Critique
• Complete system – not just a visualization
• Nicely links code to structure
• Trial usage discovered useful but high-level information
- Mainly relied on system view
• Needs more usage to determine utility
• Complex color coding – hue and intensity – hard to decode
CVSscan: Visualization of Code Evolution
“CVSscan: Visualization of Code Evolution”, Voinea, Telea, and van Wijk, SoftVis 2005
Original presentation by Summer Adams
Overview
• Objective - understand code evolution over multiple versions - support maintenance
• Want to answer questions such as
- What code lines were added, removed, or altered? When? By whom?
- Which parts of code are unstable?
- How are changes correlated?
- How are development tasks distributed?
- Context in which code segment appeared?
Screen Shot
View of code, line by line and version by version
Versions
Portion of code from selected version
Approach
• Takes information from CVS (Concurrent Versions System) source code repository
• Uses UNIX’s diff command to compare versions
• Visualizes each version as column
- Lines of code represented as in SeeSoft
    As thin horizontal color-coded strip
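The per-line status that drives the colour coding can be approximated by diffing consecutive versions. The sketch below uses Python's `difflib.SequenceMatcher` as a stand-in for the UNIX diff command; the status names and the return format are assumptions for the example, not CVSscan's data model.

```python
import difflib


def classify_changes(old_lines, new_lines):
    """Compare two versions of a file line by line.

    Returns (statuses, deleted): `statuses[j]` labels line j of the new
    version as "constant", "modified" or "inserted", and `deleted` lists
    the indices of old-version lines that no longer exist.
    """
    statuses = ["constant"] * len(new_lines)
    deleted = []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            for j in range(j1, j2):
                statuses[j] = "modified"
        elif tag == "insert":
            for j in range(j1, j2):
                statuses[j] = "inserted"
        elif tag == "delete":
            deleted.extend(range(i1, i2))
    return statuses, deleted


v1 = ["a", "b", "c", "d"]
v2 = ["a", "B", "c", "d", "e"]
statuses, deleted = classify_changes(v1, v2)
```

Running this over every pair of adjacent versions gives one column of statuses per version, which can then be drawn as SeeSoft-style strips.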
Two Code Visualization Formats
• File-based
• Line-based
Local: like SeeSoft, just current lines
Global: includes future inserts
Code Visualization - Examples
• 65 software versions
• Color code
- Green – constant
- Yellow – modified
- Red – modified by deletion
- Light blue – modified by insertion
- Light gray in global view – inserted and deleted lines
Code clean-up stage
Global: includes future inserts
Local: like SeeSoft, just current lines
Metrics
• Metrics for each version
- Code size
• Metrics for each line of code
- Author
- Lifetime at current line
• Other ideas for metrics??
Per version metric
Per code line metric
Overall View
Preset zoom levels
Use Scenario
Status, Construct, Author
The same code, viewed with three different attributes being color-coded.
The modified piece of code (yellow status) is a comment (green construct) and was modified by the programmer whose color is purple
(would be nice to have a color key)
CVSscan Evaluation
• Modest – two users, each on different code repository
- Perl script, 65 versions, 457 lines
- C program, 60 versions, 2900 lines
• 15 minutes training, 15 minutes use
• Assessment
- General usability – think out loud
- Effectiveness – code understandings after use
CVSscan Evaluation Results
• Perl (65 versions, 457 lines)
- Familiar with overall organization of the file
- Focus of each developer
- Areas of significant modifications
- What modifications referred to
• C (60 versions, 2900 lines)
- Did not have clear image of code’s evolution
- Did conclude that it was legacy code adapted by mainly two users to support IPv6 network protocol
- Pointed out major modification to memory manager
CVSscan Critique
• Very modest evaluation – inconclusive
• Use of metrics not very well developed
• Nicely leverages SeeSoft to visualize multiple versions of same code
Stepping Up
• Next step in program visualization is to help debugging and performance optimization by visualizing program executions
- Data, data structures, run-time heap, memory, control flow, data flow, hot spots, ...
Graph Visualization
• Area that we’ve already studied which is very important to SV
• Graphs pop up everywhere in software
• Common use: Visualize a call graph, visualize a flow chart, ...
Sample Call Graph View
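Before a call graph can be drawn it has to be extracted. As a rough illustration, the sketch below builds a static call graph for Python source with the standard `ast` module. It only sees calls written as plain names (not method calls) made from top-level functions, and the function name `static_call_graph` is made up for this example.

```python
import ast


def static_call_graph(source):
    """Map each top-level function name to the set of plain names it calls."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = graph.setdefault(node.name, set())
            for inner in ast.walk(node):
                # Keep only calls of the form name(...); skip obj.method(...)
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
    return graph


example = """
def helper(x):
    return abs(x)

def main():
    print(helper(-1))
"""
graph = static_call_graph(example)
```

The resulting dictionary is an adjacency list, so any of the graph layout techniques studied earlier can be applied to it.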
Software Projects
• Come up a level, to group of developers working on a code base
• What might you want to understand?
- For instance, in an open source project?
Stargate: An Author-Centric Approach to Software Project Visualization
Ogawa and Ma, “Stargate: An Author-Centric Approach to Software Project Visualization”, Proceedings of IEEE PacificVis 2008, March 2008
Stargate – Start with Modified Sunburst
Stargate – Add Developers
• Dots are developers
- Placed close to files they worked on
- Size => amount of activity
• File colors => File type
Stargate – Add Communications
• Email communications among developers
Stargate – Add File History
Stargate Video
• http://vidi.cs.ucdavis.edu/research/videos/stargate
• (Show from SV folder)
SV Takeaways
• Multiple dimensions to SV
- Static code/program relationships
- Dynamic code/program relationships
- Testing/coverage
- Project evolution
• Many methods we have studied are applicable, such as ways to depict
- Time-varying behaviors
- Object relationships
The End
FIELD
• Program development and analysis environment with a wide assortment of different program views
- Integrated a variety of UNIX tools
- Utilized central message server architecture in which tools communicated through message passing
Reiss, Software Pract & Exp ’90
FIELD Interface
Dynamic Call Graph View
Currently active
On call stack
Class Browser
Heap View
Color can be
- When allocated
- Block size
- Where allocated
3D Call Graph
Selected file
Collapsed file
Groups of calls
Visualizing Application Behavior on Superscalar Processors
Chris Stolte, Robert Bosch, Pat Hanrahan, and Mendel Rosenblum, Proc. InfoVis 1999
Superscalar Processors: Quick Overview
• Pipeline
• Multiple Functional Units
- Instruction-Level Parallelism (ILP)
• Instruction Reordering
• Branch Prediction and Speculation
• Reorder Buffer
- Instructions wait to graduate (exit pipeline)
Critique
• Most code doesn’t need this level of optimization, but
- The visualization is effective, and would be useful for code that does
- May reduce the expertise needed to perform low level optimization
• Might be effective as a teaching tool
• Bad color scheme: black/purple/brown
• Does it scale with processor complexity?
Concurrent Programs
• Understanding parallel programs is even more difficult than serial
• Visualization and animation seem naturals for illustrating concurrency
• Temporal mapping of program execution to animation becomes critical
Kraemer & Stasko JPDC ‘93
POLKA
• We developed a system, POLKA, that was designed to help people build visualizations of concurrent programs
- Used for both program and algorithm visualization
- Used for different programming models: message passing, shared memory threads, ...
Stasko & Kraemer JPDC ‘93
Message Passing Systems
PVM/Conch
Topol, Stasko, Sunderam IJPDSN ‘98
Shared Memory Threads
Pthreads
Zhao & Stasko TR ‘95
Strata Various: Multi Layer Visualization of Dynamics in Software System Behavior
Doug Kimelman, Bryan Rosenburg, Tova Roth, Proc. Fifth IEEE Conf. Visualization ’94, IEEE Computer Society Press, Los Alamitos, Calif., 1994, pp. 172–178.
Strata Various
• Trace-driven program visualization
• Trace: sequence of <time, event> pairs
• Events captured from all layers (strata, hence the name):
- Hardware
- Operating System
- Application
• Replay execution history
• Coordinate navigation of event views
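A trace of <time, event> pairs per layer can be combined into a single replayable history with a time-ordered merge. The sketch below is an illustration only: the layer names, the event strings and the `window` helper are hypothetical, and each per-layer trace is assumed to be already sorted by time.

```python
import heapq


def merge_traces(**layers):
    """Merge per-layer traces into one list of (time, layer, event) records.

    Each keyword argument is a list of (time, event) pairs sorted by time.
    """
    tagged = (
        [(t, name, event) for t, event in trace]
        for name, trace in layers.items()
    )
    return list(heapq.merge(*tagged, key=lambda record: record[0]))


def window(records, start, end):
    """Records with start <= time < end, e.g. for one coordinated view range."""
    return [r for r in records if start <= r[0] < end]


history = merge_traces(
    hardware=[(1, "cache miss"), (5, "page fault")],
    os=[(2, "context switch"), (4, "syscall read")],
    application=[(0, "phase: init"), (3, "phase: compute")],
)
```

Replaying is then a loop over `history`, and coordinating several views amounts to giving every view the same `window` of time.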
Strata Various: Why Multi-Strata?
• Debugging and tuning requires simultaneously analyzing behaviour at multiple layers of the system
Call tree
Accumulated process times
Process run history
Kernel performance stats
Goals
• Facilitate debugging by providing execution time information
• Faster assimilation of execution data
• Facilitate visual correlations, trends and anomalies
• Several layers including user-level libraries, OS and hardware
PV
• Trace-driven
• Trace: Time ordered sequence of events of interest
• Synchronous and asynchronous configurations
• Multiple concurrent views representing the different layers
Views
• Process scheduling and System activity
- Scheduling view
- Activity view
Views
• Memory Activity and Application Progress
- Memory size view
- Source of memory allocation
- State of the page of the data segment
Code views
Memory view
Views
• Hardware Activity and Source Progress
- Active loop view
- Hardware performance statistics
Conclusion
• A more effective and comprehensive visualization of software
• Worthwhile for developers facing performance issues
• Room for improvement in visualization
• More direct traversal
• Presentation of derived data
Strata Various: Critique
• Examples demonstrate usefulness
• Fundamentally, a good idea
- Increasing importance as multi-core machines become standard
• Many windows
- Titles not meaningful
• Dubious claim that tracing does not alter behavior
What is Useful? 2001 Survey
• Very useful
- 14,1,18,32,2,19,5
• Useless
- 22,33,34
• All rest useful but not essential
Software Visualization Tools: Survey and Analysis, Sarita Bassil and Rudolf K. Keller, 2001