Post on 25-May-2015
IBM Watson Life ©2014 IBM Corporation
New Opportunities for Extracting Insight from Cloud Based IDEs
Yi Wang¹, Patrick Wagstrom²,³, Evelyn Duesterwald³, David Redmiles¹
¹University of California - Irvine, ²IBM Watson Group - Watson Life, ³IBM Research
June 4, 2014
@pridkett http://wagstrom.net/ patrick@wagstrom.net
Spoiler Alert
• Novelty:
  • Identify new opportunities provided by cloud based IDEs
  • Demonstrate how to realize these opportunities with a case study
  • New data analysis technique applied to case study
• Emerging:
  • Insights extracted from data collected in our case study
  • Promising future directions for cloud based IDEs
• Impact:
  • Researchers - shift in focus to distributed environments
  • Practitioners - the future is awesome
  • IDE Developers - rich possibilities for tool enhancement based on findings
Experimental Setup - Flow
• Lab Experiment with 8 IBM Employees
• Intake and Exit Surveys to Understand Development Skill and Experience
• Eight Programming Tasks of Increasing Difficulty
  • Similar to Interview Questions
• Written in JazzHub Orion Code Editor
• Instrumented to Record Audio, Screen, and Network Traffic of All Sessions
• Effort and Difficulty Self-Evaluated After Each Task
• Solutions Were Graded and Subjects Ranked after Experiment
Experimental Setup - Man In the Middle Attack
[Diagram: Participant Computer, Execution Container, MITMProxy, Data Logger, JazzHub, StackOverflow, The Internet]
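A minimal sketch of how a data logger could sit behind an intercepting proxy is given here. It follows the shape of a mitmproxy addon (a class with a `request` hook, exposed through a module-level `addons` list), but the class name `TrafficLogger`, the log file name, and the record fields are assumptions for illustration, not the logger used in the study.

```python
import json
import time


class TrafficLogger:
    """Hypothetical mitmproxy-style addon: appends one JSON record per request."""

    def __init__(self, path="session_log.jsonl"):
        self.path = path  # assumed log location, one JSON object per line

    def request(self, flow):
        # mitmproxy calls request() for each client request passing through the proxy.
        body = flow.request.content or b""
        record = {
            "time": time.time(),
            "method": flow.request.method,
            "url": flow.request.pretty_url,
            "body": body.decode("utf-8", errors="replace"),
        }
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")


# mitmproxy discovers addons through a module-level list named `addons`.
addons = [TrafficLogger()]
```

A script like this would typically be loaded with `mitmdump -s`, with the participant's browser configured to use the proxy.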
Code Growth Extraction
• Captured TCP Session Traces Allow Replay of Code Growth
• Collected 41 Total <subject, task> Traces
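As an illustration of the replay idea, the sketch below turns a sequence of captured editor snapshots into a code growth trace of (elapsed seconds, size in bytes). The snapshot format and the function name `code_growth` are assumptions; the slides do not describe how the traces were actually decoded from the TCP sessions.

```python
def code_growth(snapshots):
    """Turn (timestamp_seconds, source_text) snapshots into (elapsed_seconds, size_bytes) points."""
    ordered = sorted(snapshots, key=lambda s: s[0])
    if not ordered:
        return []
    start = ordered[0][0]
    # Size is measured in bytes of the UTF-8 encoded source text.
    return [(t - start, len(text.encode("utf-8"))) for t, text in ordered]


# Hypothetical snapshots for one <subject, task> session.
trace = code_growth([
    (100.0, ""),
    (112.5, "function f() {}"),
    (130.0, "function f() { return 1; }"),
])
print(trace)
```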
Developer Traces
[Figure: code size (bytes) over time (seconds), two panels: Subject 4 (Worst) and Subject 6 (Best)]
Insight 1: Differing levels of expertise yield dramatically different code growth patterns
Co-Integration Networks
• Cointegration is an Indicator of Shape Similarity of Two Time Series
• We Calculated Pairwise Cointegration for all 41 Traces
• Cointegration Network was Created Using Cases Where p-value < 0.05
[Figure: cointegration network of the 41 traces, nodes labeled by subject from S1.1 through S8.7]
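The network construction can be sketched as follows, under stated assumptions: every pair of traces is tested once, and an edge is kept when the p-value falls below the threshold. The function name `build_network` and the pluggable `pvalue_fn` are hypothetical. One possible choice for `pvalue_fn` is a wrapper around `statsmodels.tsa.stattools.coint`, which returns the p-value as the second element of its result and expects both series to have the same length; how the traces were aligned in the study is not stated in the slides.

```python
from itertools import combinations


def build_network(traces, pvalue_fn, alpha=0.05):
    """Return the undirected edges between traces whose pairwise p-value is below alpha.

    traces: dict mapping a trace label such as "S6.1" to its code-size series.
    pvalue_fn: caller-supplied function (series_a, series_b) -> p-value.
    """
    edges = set()
    for a, b in combinations(sorted(traces), 2):
        if pvalue_fn(traces[a], traces[b]) < alpha:
            edges.add((a, b))  # labels are sorted, so each pair appears once
    return edges
```

Keeping the statistical test behind `pvalue_fn` separates the network logic from the choice of cointegration test.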
Co-Integration Networks
• The “best” developers (subjects 6 and 8) share few intra-subject edges (5.71%), while the other 6 subjects share 112 edges (34.36%).
• The “best” developers' traces are mostly located on the periphery of the network in the Fruchterman-Reingold layout.
[Figure: the cointegration network in Fruchterman-Reingold layout, nodes S1.1 through S8.7]
Insight 2: The “best” developers may use more programming strategies, making their traces more diverse. This diversity may serve as a potential indicator of developers’ skill levels.
Individual Programming Styles
• Question: Are code growth traces more similar for a collection of users doing the same task, or for an individual user doing a collection of tasks?
• Compute edge density in three categories:
  • I (subject-subject): 0.3085
  • II (task-task): 0.2479
  • III (task-subject): 0.2611
• Test the differences using the Kruskal-Wallis test: P(I, II) < 0.01, P(I, III) = 0.03.
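A rough sketch of the edge density computation follows. It assumes that category I covers pairs of traces from the same subject, category II covers pairs from the same task, and category III covers pairs that differ in both; the slides do not define the categories precisely, so this reading is an assumption, as is the function name `edge_density_by_category`. Density is taken as the number of edges present divided by the number of possible pairs in the category.

```python
from itertools import combinations


def edge_density_by_category(nodes, edges):
    """nodes: list of unique (subject, task) pairs; edges: set of frozensets of two nodes."""
    possible = {"I": 0, "II": 0, "III": 0}
    present = {"I": 0, "II": 0, "III": 0}
    for a, b in combinations(nodes, 2):
        if a[0] == b[0]:
            category = "I"    # assumed: same subject, different tasks
        elif a[1] == b[1]:
            category = "II"   # assumed: same task, different subjects
        else:
            category = "III"  # assumed: different subject and different task
        possible[category] += 1
        if frozenset((a, b)) in edges:
            present[category] += 1
    return {c: (present[c] / possible[c] if possible[c] else 0.0) for c in possible}
```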
Insight 3: Individual differences, rather than task differences, contribute more to the variation in code growth traces. Code growth traces may be a kind of “fingerprint” of developers.
Major Impacts
• Shift from desktop to cloud opens up incredible opportunities for studying large numbers of developers in depth
• Centralized instrumentation allows data from one developer to benefit other developers in near real-time
• In-depth knowledge of IDE user behavior from cloud IDEs will enable the next leap in productivity for software development professionals