1
Designing a Prosthetic Memory for Software Developers
PhD Thesis Proposal in Software Engineering
Uri Dekel
22 January 2008
2
Preview
• Settings: Software implementation and maintenance work in the IDE
• Problem: Managing, using, and sharing knowledge about artifacts AND activities
  - Reduce errors, disorientation, omissions
• Proposed approach: Memory aid in IDE
  - Presents “journal” of development episode
  - Interleaves “Objective” and “Subjective” Knowledge
  - Strength in objective-subjective synergy
  - Offers unified management of knowledge on artifacts and activities
3
Preview
• Thesis statement:
  “A prosthetic memory for software development…
  …inspired by human episodic memory…
  …can successfully augment the organic memories of developers…
  …and help them preserve, manage, and share knowledge”
4
Outline
• Knowledge of activities
  - Problems, techniques, proposed support
• Knowledge of artifacts
• Traceability of decisions
• Thesis and contributions
• Completed work
• Evaluation plans
• Risks (Backup in response to questions)
5
Activity knowledge
Artifacts Traceability Thesis Plan RisksActivities Completed
6
Running Example
• Wally works on multiplayer cellphone chess
• Wally plans game loop
  - Each side maintains copy of board
  - Client gets move from UI, sends to server and waits for opponent response
  - Server always relays moves to other side
  - Client receiving opponent move updates board, waits for UI
• Subtasks for implementing game loop
  - Communication infrastructure
  - Encode moves
  - Handle end of game
7
Running Example: Disorientation
• Created Model classes and UI
• Create ChessClient class
  - Implement onUIMove
  - Add board update call
• Switch to Board class
  - Implement board update
  - Add Communicator calls
• Switch to Move class
  - Implement move encoding
  - Implement move decoding
  - Create Communicator class
• Implement all four methods
  - Investigate networking library docs and samples
  - NEED: Don’t wait if end-of-game from own move
• NEED: Implement end-of-game check on Board class
  - NEED: Implement onOpponentMove
  - NEED: Implement server
  - NEED: Handle end-of-game
8
Running Example: Disorientation
• Need to backtrack to add check for end-of-game, then move on to the server
• Potential for navigation disorientation [DeAlwis ‘06]
  - Too many open windows
  - Location stack in Eclipse limited
  - DOI models not specific enough
• Wally closes all windows to “start fresh”
• Potential for task disorientation
  - What was I doing and what did I accomplish?
  - What remains to be done?
9
Running Example: Omissions
• Wally remembers he needs to implement the server and the client receiving code
  - Begins work on server
• Omission of check for end-of-game before waiting for opponent response
  - System will get stuck if user’s own move ends game
  - Problem will likely pop up while testing something else and make for difficult debugging
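A minimal sketch of the omitted check, for illustration only. The slides name an `onUIMove` handler but show no code, so `ChessClient.on_ui_move`, `FakeBoard`, `FakeCommunicator`, and the string move format are all hypothetical stand-ins.

```python
class FakeBoard:
    """Stand-in board: the game ends once a designated final move is applied."""

    def __init__(self, final_move):
        self.final_move = final_move
        self.moves = []

    def apply(self, move):
        self.moves.append(move)

    def is_game_over(self):
        return bool(self.moves) and self.moves[-1] == self.final_move


class FakeCommunicator:
    """Stand-in for the network layer: records sent moves and wait requests."""

    def __init__(self):
        self.sent = []
        self.waiting = False

    def send(self, move):
        self.sent.append(move)

    def wait_for_opponent(self):
        # A real client would block here until the server relays a move.
        self.waiting = True


class ChessClient:
    def __init__(self, board, comm):
        self.board = board
        self.comm = comm

    def on_ui_move(self, move):
        self.board.apply(move)
        self.comm.send(move)
        # The check that was forgotten: if our own move ended the game,
        # no reply will arrive, so waiting would leave the client stuck.
        if self.board.is_game_over():
            return "game-over"
        self.comm.wait_for_opponent()
        return "waiting"
```

Without the `is_game_over` branch the client would call `wait_for_opponent` after the final move, which is the "stuck" behavior described on this slide.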
10
Keeping Track Of Reminders
• Wally could have used bug DB to record tasks for reminders
  - Appropriate for high-level tasks
    - Self-contained, well-defined, preplanned
  - Less appropriate for personal subtasks
    - Transient, not well-defined, ad-hoc
    - Expensive to use during development
    - Unwanted public archival [Storey ‘08]
    - Does not ensure timely awareness
11
Keeping Track of Reminders
• Wally could have used to-do comments for reminders [Storey ‘08]
  - Cheap and easy to produce
  - Proactive investment in phrasing
  - Potential for placement difficulties
  - Clutter the code and not private if committed
  - Code must be inspected to reveal to-do
  - No clear order of completion
12
Proposed approach
• We propose building a prosthetic memory for software development
  - “Device or software that supplements human memory, to store, maintain and access copies of relevant information”
  - Many existing prostheses focus on physical world
  - Others record knowledge without context
13
Proposed Approach: Journal of Objective Activities
• A “journal”/“memory” of development work
• Formed around “objective observations”
  - Activities with external manifestations
  - Obvious to layperson without interpretation
    - Yes: Added text X to method Y
    - No: Made method Y robust against null parameters
  - Aggregation/abstraction of low level events
    - e.g., Edited method Y from S1 to S2 between T1 and T2
  - Automated capture via instrumentation/monitors
    - Video recording not structured/searchable
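One possible reading of the aggregation step, as a sketch. The event shape (`EditEvent`) and the merge rule (consecutive edits to the same method collapse into one observation) are assumptions made for illustration; the slides do not specify how the tool abstracts its telemetry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditEvent:
    """Low-level event as an IDE monitor might report it (hypothetical shape)."""
    time: float
    method: str
    before: str
    after: str


@dataclass(frozen=True)
class EditObservation:
    """Objective observation: method edited from S1 to S2 between T1 and T2."""
    method: str
    s1: str
    s2: str
    t1: float
    t2: float


def aggregate(events):
    """Merge consecutive edits to the same method into one observation each."""
    observations = []
    for ev in sorted(events, key=lambda e: e.time):
        last = observations[-1] if observations else None
        if last is not None and last.method == ev.method:
            # Extend the running observation: keep S1/T1, advance S2/T2.
            observations[-1] = EditObservation(
                ev.method, last.s1, ev.after, last.t1, ev.time
            )
        else:
            observations.append(
                EditObservation(ev.method, ev.before, ev.after, ev.time, ev.time)
            )
    return observations
```

Many keystroke-level events thus become a single journal line per uninterrupted visit to a method.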
14
Proposed Approach: Journal of Objective Activities
• Chronological view
  - Peripherally shows most recent
  - Maintain orientation
  - Use to find artifacts
• Difficult to understand and visually search
15
Proposed approach: Add Subjective “reminders”
• Developer can provide “subjective” observations
• Interleaved with objective
• Objective provides context to subjective
  - Can infer details from adjacent objective
  - Makes subjective cheaper to produce
• Proposition: reduced costs lead to more knowledge preservation
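A small sketch of how a subjective note might borrow its context from the adjacent objective record. The `Journal` and `Entry` classes and the "most recent objective entry" rule are illustrative assumptions, not the tool's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    kind: str                      # "objective" or "subjective"
    text: str
    method: Optional[str] = None   # artifact context, if known


class Journal:
    """Chronological list that interleaves both kinds of observations."""

    def __init__(self):
        self.entries = []

    def add_objective(self, text, method):
        self.entries.append(Entry("objective", text, method))

    def add_subjective(self, text):
        # Borrow context from the most recent objective entry so the
        # developer does not have to spell out what the note refers to.
        context = next(
            (e.method for e in reversed(self.entries) if e.kind == "objective"),
            None,
        )
        entry = Entry("subjective", text, context)
        self.entries.append(entry)
        return entry
```

Because the context is filled in automatically, the note itself can stay short, which is the cost reduction this slide argues for.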
16
Proposed approach: Add Activity Announcements
• Visual partition if developer declares when starting or switching tasks or subtasks
17
Proposed approach: Backtracking
• Simplified backtracking by removing top
• Can further filter for leftover reminders
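"Removing top" suggests a stack of announced activities. The sketch below assumes that reading; `ActivityStack` and its method names are hypothetical, and reminders are kept in a flat list so they survive when the task that produced them is popped.

```python
class ActivityStack:
    """Announced tasks and subtasks, innermost last, plus recorded reminders."""

    def __init__(self):
        self.tasks = []
        self.reminders = []   # dicts with "task", "text", "done"

    def start(self, task):
        self.tasks.append(task)

    def remind(self, text):
        current = self.tasks[-1] if self.tasks else None
        self.reminders.append({"task": current, "text": text, "done": False})

    def complete(self, text):
        for reminder in self.reminders:
            if reminder["text"] == text:
                reminder["done"] = True

    def backtrack(self):
        """Remove the top activity and return the one being resumed, if any."""
        self.tasks.pop()
        return self.tasks[-1] if self.tasks else None

    def leftover(self):
        """Reminders that have not been marked done yet."""
        return [r["text"] for r in self.reminders if not r["done"]]
```

After finishing a subtask, `backtrack()` names the activity to resume and `leftover()` lists what still needs attention.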
18
Proposed approach: Context View
• Memory view shows limited portion
  - Hides old reminders relevant to current context
• Separate “context view” shows all relevant subjective observations
  - For current resource and method
  - For other visible method definitions
  - Potentially related methods (e.g., invoked)
• Allows caller to know of remaining to-dos in callee
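A sketch of the lookup a context view could perform, under stated assumptions: observations are `(method, text)` pairs, and the call relation is supplied as a plain dictionary. The function name and its parameters are illustrative only.

```python
def context_view(observations, current, visible=(), calls=None):
    """Return subjective notes relevant to the method being edited.

    observations: list of (method, text) pairs
    current:      method under the cursor
    visible:      other methods whose definitions are on screen
    calls:        dict mapping a method to the methods it invokes
    """
    calls = calls or {}
    relevant = [current, *visible, *calls.get(current, ())]
    # Rank by relevance: current method first, then visible, then invoked.
    rank = {}
    for position, method in enumerate(relevant):
        rank.setdefault(method, position)
    hits = [(rank[m], m, text) for m, text in observations if m in rank]
    hits.sort(key=lambda hit: hit[0])
    return [(m, text) for _, m, text in hits]
```

With the call relation included, a note left on a callee shows up while the caller is being edited, which is how a caller could learn of remaining to-dos in the callee.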
19
Proposed approach: Sharing activity
Sharing activities can improve awareness
• Status/Away announcements
• Change/checkout indicators (many tools: Tukan, Palantir, Lotus Jazz)
• Activity indicator (using provided knowledge)
20
Proposition 1
Providing discrete categorized subjective observations about their actions… (What we saw so far)
…will help developers avoid disorientation and omissions…
…and improve their peers’ awareness of their actions
21
Artifact knowledge
22
Knowledge About Artifacts
• Developers generate important knowledge while creating artifacts
  - Some manifested in the artifact
    - e.g., external interfaces, obvious implementations
  - Some only in developer’s mind
    - Usage protocols
    - Assumptions and limitations
    - Differences from expected implementation
    - Underlying rationale
• Potential for costly errors
  - e.g., Not complying with usage protocol
23
Running Example: Need for artifact knowledge
• Asuk has to write the server
• Examines the client sending code
• Wants to use Wally’s Communicator
• What is the usage protocol?
  - e.g., Has to be initialized somewhere?
  - e.g., Can the same communicator handle multiple concurrent connections?
• Without additional information, will have to study source code
24
Running Example: Need for artifact knowledge
Excerpt from ICommunicator
25
Running Example: Need for artifact knowledge
Excerpt from CommunicatorImpl
26
Documentation Types
                     Internal                       External
Amount               Limited                        Elaborate
Structure            Short text                     Elaborate
Artifact connection  Visual to following construct  Explicit (with name or content replication)
Presentation         In artifact, may clutter       Separate from artifacts
Privacy              Public once checked-in         Can be controlled
Awareness            Only when reading source       Only when browsing docs
Multiple artifacts   Single construct/block         Yes (including virtual)
27
Running example: Adequate documentation
28
Documentation Feasibility
• Optimally: comprehensive up-to-date readable documentation
• But: Costs are high
  - Ensuring grammatical/semantic clarity
  - Making elements self-contained
  - Delays from concurrent documentation
• And: No guaranteed returns per element
• Result: Limited documentation
  - Rationale and intent often lost
  - No exhaustive listing of assumptions/limitations
29
Proposed Approach: Observations on Artifacts
• A compromise between internal and external documentation
• User provides discrete subjective observations instead of comments
• Reduced costs
  - Context and selection automatically assigned
  - Can be provided while coding (using voice)
  - Short with no proactive editing
  - Allows use of deictic references
• Reduction in clutter and privacy issues
• Presented in related contexts (e.g., callers)
30
Proposed Approach: Observations on Artifacts
Creating Observations
31
Proposed Approach: Observations on Artifacts
Memory View
32
Proposed Approach: Observations on Artifacts
Context View
33
Proposition 2
Providing discrete categorized subjective observations about artifacts… (What we just saw)
…will help developers track and share knowledge and thus avoid errors
34
Decision Traceability
35
Decision Traceability
• Occasional need to understand “how and why” of current artifact state
  - Infrequent and after-the-fact but critical
• Requires knowledge about:
  - Development process
  - Knowledge from developer’s mind
  - Passively absorbed information
36
Running Example
• Asuk is trying to understand Wally’s implementation of Communicator
  - What is the purpose of Authorization?
  - Why is the request queue initialized here?
37
Proposed Approach: Traceability
• Asuk selects authorization line and queries for all relevant events
  - One hit: Created and not moved or edited
• Asuk asks for all observations from that time
  - Reveals sequence of web browsing
    - Android APIs, various examples
  - Eclipse activity just before implementing method
    - Opened sample Twitter client in Eclipse
    - Copied section from client, pasted in method
    - Re-edited, then copied additional constructs, including queue
• Asuk realizes that Wally blindly copied example
  - Uses referenced materials to learn
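The two queries in this scenario can be sketched as a lookup by artifact element followed by a time-window lookup. The `Record` fields, the source labels, and the window size are assumptions for illustration; the slides do not describe the tool's query interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    time: float          # seconds since the start of the episode
    source: str          # e.g. "ide" or "web" (illustrative labels)
    text: str
    target: str = ""     # artifact element the record touches, if any


def events_for(records, target):
    """All recorded events that touched the selected artifact element."""
    return [r for r in records if r.target == target]


def around(records, when, window):
    """Everything observed within `window` seconds before `when`, oldest first."""
    hits = [r for r in records if when - window <= r.time <= when]
    return sorted(hits, key=lambda r: r.time)
```

A usage pattern matching the slide: call `events_for` on the selected line, then pass the time of its single hit to `around` to see what was browsed and copied just before it was created.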
38
Running Example
• Asuk still doesn’t understand the game loop
• Wally did not document his intentions
  - Well known problem
  - No relevant knowledge or discrete subjective observations when work began
  - Needed knowledge more elaborate than possible with discrete observations
39
Episodic Memory
• Developers avoid cost/distraction
• How to get developers to provide knowledge?
• Humans capable of memorizing details and experiences while working
  - Turned to literature in psychology
  - Encountered Tulving’s Episodic Memory model
40
Episodic vs. Semantic Memory
                       Episodic                                  Semantic
Content and structure  Rich chronological episode recollections  Connected facts w/o learning context
Example                Vivid recollection of road trip vacation  Road routes, vacation dates
Learning process       Immediate with exposure                   Multiple exposures
Access                 Contextual cues                           Association
Scale                  Comprehensive but not accessible          Limited succinct knowledge
41
Episodic Memory in SW Development
• Recollection of development episode in EM
  - Visual context
  - Activities
  - Thoughts during activities, including rationale
• Details and distinguishing factors degrade
• Need to be externalized and persisted
  - Cannot tap organic memory
• We mimic episodic memory
  - Collect objective knowledge with context
  - Combine with record of “thoughts”
42
Obtaining Subjective Knowledge
• Knowledge workers reason and reinforce working memory with “inner voice”
• Researchers glimpse into this voice by asking for continuous verbalization
  - “Think-aloud”, protocol analysis, etc.
  - No interference with end product if avoiding cognitive investment
  - Deictic references to shared context
  - Dependencies between utterances, form a continuous narrative
43
Example of Think-Aloud
44
Obtaining Subjective Knowledge
• Developer “narrates” work for benefit of self and of future users
  - Most likely using speech while coding/thinking
• Does not need to be exhaustive
  - Can apply discretion
  - But must avoid distracting proactive investment
• Broken into uncategorized subjective observations
  - Visually distinct from discrete categories
  - Does not appear in context view
  - Interleaved with objective and context
45
Propositions 3+4
3. Developers can provide, with little effort, a continuous narrative that captures important traceability knowledge
4. Other stakeholders can cost-effectively elicit the narrated traceability knowledge from the prosthetic memory
46
Thesis and Contributions
47
Expected Practical Benefits
• Objective record only:
  - Tracking some recent navigation and changes
  - Tracing passive exposure to information
• Objective + Discrete subjective:
  - Recording reminders to reduce omissions
  - Recording and sharing artifact knowledge
  - Activity observations partition objective record and improve orientation and awareness
• Objective + All subjective narrative:
  - Traceability
  - Search
48
Expected Technical Contributions
• Objective knowledge collection
  - Infrastructure for IDE monitoring
  - Technique for web application monitoring
  - Framework for representing traces with context from multiple sources
49
Thesis Statement
• A prosthetic memory for software development inspired by human episodic memory can successfully augment the organic memories of developers and help them preserve, manage, and share knowledge
50
Expected Research Contributions
• Demonstrating that cost reductions alone yield more knowledge preservation
  - Positive result: leads to more strategies
  - Negative result: highlights another factor?
• Demonstrating the power of an episodic chronological representation for knowledge about a software system
  - Existing tools rely on semantic presentation
51
Potential impact
• Step towards addressing the knowledge problem in SW development
• Proactive knowledge preservation reduces need for retroactive means
• Support for distributed organizations
• May apply to other development phases
• May apply to other knowledge work
52
Major Technical Challenges
• Collecting adequate telemetry
• Eliciting objective observations
• Collecting subjective knowledge with minimal interruption
• Processing spoken narrative
• Effective peripheral presentation
• Effective interactive investigation
53
Completed work
54
Research Evolution
• Early work on interruptions
  - Means for automatically deferring interruptions
  - Gathering and interpreting low-level telemetry to determine activity
  - Similar strategy followed in prototype
55
Research Evolution
• Studies of design
  - Seeking to support distributed design
  - Studied collocated design
  - Existing tools focus on supporting drawing
  - Identified difficulties in managing artifacts
  - Memory and spatial cues frequently used
  - Identified importance of knowledge preservation
  - Passive exposure to artifacts
• Trace changes and impacts
  - Needed for interpretation
  - Effort to preserve knowledge limits creativity
56
Research Evolution
• Knowledge preservation in eMoose
  - Focused on computer-based work
  - More straightforward to monitor
  - Tracking passive exposure
  - Developed technique for web access
  - Support for heterogeneous records with contextual details
  - Abstraction of telemetry
  - Addition of subjective support
57
eMoose Architecture
58
What it looks like
59
Personal experience with tool
• Used for over 6 months
• Typed discrete observations about activities
  - Effective capturing of reminders and bugs
• Task/subtask switching observations help avoid disorientation and omissions
• Effective for finding files and members
  - Recent locations
  - Searching by task indicator
  - Search by free text
60
Validation plans
61
Prototype Deliverables
• Initial prototype ready for use
• Still much work
  - Robustness
  - Correctness of objective observations
  - New features
  - Context view
• Thesis scope:
  - Support for Java work within Eclipse
  - Support for web applications
62
Formative Evaluation
• Goal: General initial evaluation and preparation for prototype release
• Recruit testers within department
• Train in all facets of eMoose
• Freedom in whether and how to use
• Frequently observe and interview
• Short cycles on requests and bug fixes
• Prepare for outside distribution
63
Phase 1
• Goal: Evaluate ability to manage tasks and reminders and avoid disorientation (prop. 1)
• Recruit developers from OSS/Industry
  - Leverage existing contacts (IBM/Accenture, MSE)
  - Preferably recruit entire project team
• Custom version focused on activity knowledge
• Periodic surveys on satisfaction and impact
• Interviews with some participants
• Collect data from memory view client
• Check bug database for impact
64
Phase 2
• Goal: Knowledge preservation and sharing with discrete observations (prop. 2)
• Recruit from phase 1 participants
  - Projects where everyone used eMoose
  - Add knowledge preservation/sharing features
• Periodic surveys on satisfaction and impact
• Collect exposure data from context view
• Compare amount of documentation before/after intervention
65
Phase 3
• Goal: Study knowledge preservation and consumption with continuous narrative
• Recruit users for lab study
• Starting task from scratch
• Asked to narrate for benefit of others
• Qualitative analysis of narrative contents and impact on productivity
• Select work of certain developers
• Have others perform maintenance and explore decisions. Analyze process and use
66
Phase 4 (optional)
• Goal: Study everything together
• Recruit users for lab study
• Sequence of hand-offs from one user to another
• Compensation incorporates group effort and preserved knowledge
• Qualitatively analyze results
67
Timeline
68
Risks (BACKUP)
69
Capture difficulties
• Risk: Not all objective knowledge can be captured
• Existing knowledge sufficient for most purposes
• May affect rare trace investigations
70
Voice Capture
• Risk: Environment not suited for verbal narration
• Risk: Not finding appropriate engine
• Continuous narrative likely to be used less than typed discrete observations
• For our lab studies, we will type in transcript between users.
• Technologies and availability increasing
• Organizations adjusting environment to support collaboration (e.g., pair prog)
71
Voice Capture
• Risk: Speech to text inaccurate
• We do not need absolute accuracy
• Knowledge readable even if disrupted
• May apply context to resolve names
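One way context could help resolve names, sketched with Python's standard `difflib`. Fuzzy string matching is a stand-in chosen for illustration and is not taken from the slides; `resolve_names` and the cutoff value are hypothetical, and the identifier list is assumed to come from whatever code is currently in view.

```python
import difflib


def resolve_names(transcript, identifiers, cutoff=0.8):
    """Replace words that closely match a known identifier with that identifier."""
    lowered = {name.lower(): name for name in identifiers}
    resolved = []
    for word in transcript.split():
        match = difflib.get_close_matches(
            word.lower(), list(lowered), n=1, cutoff=cutoff
        )
        resolved.append(lowered[match[0]] if match else word)
    return " ".join(resolved)
```

Words that do not resemble any identifier in the current context pass through unchanged, so a partly misrecognized transcript stays readable.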
72
Insufficient contributions
• Risk: Users do not provide enough subjective knowledge?
• Likely to see variations between users
• Returns correlated with investment
• Some organizations mandate knowledge preservation
73
Users prefer traditional means?
• Risk: Users prefer traditional means of preserving knowledge
• If our approach is just incorrect, not much we can do…
• Developers may not trust tool with storing knowledge separately
  - We should offer automated means of adding the knowledge to the code or to documents
• Have to improve robustness
74
Discrete knowledge consumption
• Risk: Users do not have space for memory and context views?
  - Tool meant primarily for dual-monitor
  - Eclipse supports quick-view
  - Markers or overlay in lieu of context view
• Risk: Users find them distracting?
  - Unlikely
  - Memory view changes by one line at a time
  - Context view changes only when switching members
75
Discrete observations
• Risk: Developers do not associate types with observations
  - Types can be associated after-the-fact
  - Some can be inferred automatically
• Risk: What if an observation is associated with wrong context?
  - Observations automatically associated with current context
  - User may spot mistake in memory view
  - We may support a degree-of-relevance context model
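A sketch of what automatic type inference might look like, using simple cue words. The categories and patterns in `RULES` are invented for illustration (the slides do not list the tool's observation types); only the `NEED:` prefix is borrowed from the running example.

```python
import re

# Hypothetical categories and cue words; not the tool's actual type list.
RULES = [
    ("reminder", re.compile(r"^\s*(need|todo|to-do)\b", re.IGNORECASE)),
    ("bug", re.compile(r"\b(bug|crash(es)?|fails?)\b", re.IGNORECASE)),
    ("assumption", re.compile(r"\bassum(e|es|ed|ing|ption)\b", re.IGNORECASE)),
]


def infer_type(text):
    """Return the first category whose cue matches, or None if nothing matches."""
    for category, pattern in RULES:
        if pattern.search(text):
            return category
    return None
```

Observations for which `infer_type` returns `None` would simply stay untyped until someone assigns a type after the fact.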
76
Continuous Narrative
• What if it is difficult to elicit knowledge from the continuous narrative?
  - Could explore various techniques
  - Creating visual playback may offload some mental load
  - Collaborative filtering and editing over time
77
BACKUP SLIDES
78
Navigation disorientation
• Questions:
  - How did we arrive at current position?
  - How do we get to next position or back?
• Contributing factors:
  - Complex and unfamiliar code base [de Alwis ‘06]
  - Branching and delocalization
• Manifestations:
  - Display thrashing
  - Excessive navigation
79
Change disorientation
• Questions:
  - What changes took place?
  - What to include in a commit?
  - What to revert when backtracking from a failed attempt?
• Contributing factors:
  - Branching and delocalization
  - Interleaving of concerns within same members
• Manifestations:
  - Dead and irrelevant code in production
  - Accidental deletion
80
• Bob left several “surprises”
  - He didn’t realize there are special moves in chess with different encodings
  - Race conditions (assumed that opponent response takes time, no need for close connection fast)
  - Waiting checks a single mailbox in the communicator
81
Other preserved knowledge
• Knowledge preserved in:
  - Version control comments
  - Bug tracking comments
  - Electronic communications
• Limitations:
  - Sparse
  - Often written after-the-fact
  - Difficult to piece into readable material
82
Others using tools
• No field deployment yet
• Brought one user to lab
  - Subject showed preference for speech
  - Difficulty starting and stopping recording
  - Able to effectively “think-aloud”
  - Narrative contained rationale
  - Narrative too comprehensive
  - Must train for balance
  - Subject did not explicitly associate types