Post on 30-Dec-2015
TOWARDS A CONTEXT-AWARE META SEARCH ENGINE FOR IDE-BASED RECOMMENDATION ABOUT PROGRAMMING ERRORS & EXCEPTIONS
Mohammad Masudur Rahman, Shamima Yeasmin, and Chanchal K. Roy
Department of Computer Science
University of Saskatchewan
CSMR-18/WCRE-21 Software Evolution Week (SEW 2014), Antwerp, Belgium
Software Research Lab, U of S
EXCEPTION SEARCH QUERY
Class can not access a member of class java.util.HashMap$HashIterator with modifiers "public final"
EXCEPTION HANDLING: WEB SEARCH
Traditional web search
• No ties between the IDE and web browsers
• Does not consider the problem context
• Environment switching is distracting and time-consuming
• Often not very productive (trial-and-error approach)
IDE-BASED WEB SEARCH
• About 80% of effort goes to software maintenance (Ponzanelli et al., ICSE 2013)
• Bug fixing: error and exception handling
• Developers spend about 19% of their time on web search (Brandt et al., SIGCHI 2009)
• IDE-based context-aware web search is the right choice
EXISTING RELATED WORKS
Rahman et al. (WCRE 2013)
• ERA version of this paper
• Outlines basic idea, limited experiments
Cordeiro et al. (RSSE 2012)
• Based on StackOverflow data dump
• Subject to the availability of the dump, not easily updatable
• Uses limited context, only stack trace
• Very limited experiments
Ponzanelli et al. (ICSE 2013)
• Based on StackOverflow data dump
• Uses limited context, only context-code
• Not specialized for exception handling
EXISTING RELATED WORKS
Poshyvanyk et al. (IWICSS 2007)
• Integrates Google Desktop in the IDE
• Not context-aware
Brandt et al. (SIGCHI 2010)
• Integrates Google web search into the IDE
• Not context-aware
• Focused on usability analysis
MOTIVATION EXPERIMENTS
Search Query          Common for All   Google Unique   Yahoo Unique   Bing Unique
Content Only          32               09              16             18
Content and Context   47               09              11             10

• 75 exceptions (details later)
• An individual engine can provide solutions for at most 58 exceptions
• Maximizing total solutions
PROPOSED IDE-BASED META SEARCH MODEL
[Figure: schematic of the proposed model. Surviving labels: Start search, Results, Web page]
PROPOSED IDE-BASED META SEARCH MODEL
Distinguished Features (5)
IDE-Based solution
• Web search, search results, and web browsing all from the IDE
• No context-switching needed
Meta search engine
• Captures data from multiple search engines
• Also applies custom ranking techniques
Context-Aware search
• Uses stack trace information
• Uses context-code (surroundings of exception locations)
Software As A Service (SaaS)
• Search is provided as a web service and can be leveraged by an IDE
• http://srlabg53-2.usask.ca/wssurfclipse/
PROPOSED IDE-BASED META SEARCH MODEL
Two Working Modes
Proactive Mode
• Auto-detects the occurrence of an exception
• The client initiates the search for the exception by itself
• Aligned with Cordeiro et al. (RSSE 2012) & Ponzanelli et al. (ICSE 2013)
Interactive Mode
• Developer starts the search using the context menu
• Also facilitates keyword-based search
• Aligned with traditional web search within the IDE
SEARCH QUERY GENERATION
A search query is required to collect results from the search engine APIs and to develop the corpus.
Query generation
• Uses the stack trace and the context code
• Collects the 5 tokens with the highest degree of interest from the stack trace
• Collects the 5 most frequently invoked methods in the context code
• Combines both token lists to form the recommended keywords for the context
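The query generation steps can be sketched roughly as follows. This is a minimal illustration, not the tool's implementation: the slides do not define how the "degree of interest" of a stack trace token is computed, so the sketch simply takes tokens from the first line of the trace (exception class and message) as a stand-in. The class name `QuerySketch`, its method names, and the regex used to spot method invocations are all hypothetical, and the output will not necessarily match the queries shown on the sample slides.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper; names and heuristics are assumptions for illustration only.
final class QuerySketch {
    // Rough heuristic: ".name(" is treated as a method invocation on an object or class.
    private static final Pattern CALL = Pattern.compile("\\.([a-z][A-Za-z0-9_]*)\\s*\\(");

    /** Returns up to `limit` method names, most frequently invoked first (ties keep source order). */
    static List<String> topInvokedMethods(String contextCode, int limit) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = CALL.matcher(contextCode);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    /** Stand-in for "degree of interest": distinct tokens of the first stack trace line. */
    static List<String> traceTokens(String stackTrace, int limit) {
        String firstLine = stackTrace.split("\\R", 2)[0];
        List<String> tokens = new ArrayList<>();
        for (String t : firstLine.split("[:\\s]+")) {
            if (!t.isEmpty() && !tokens.contains(t) && tokens.size() < limit) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Combines both token lists (5 + 5 at most) into one space-separated query. */
    static String buildQuery(String stackTrace, String contextCode) {
        List<String> keywords = new ArrayList<>(traceTokens(stackTrace, 5));
        for (String method : topInvokedMethods(contextCode, 5)) {
            if (!keywords.contains(method)) {
                keywords.add(method);
            }
        }
        return String.join(" ", keywords);
    }
}
```

Calling `QuerySketch.buildQuery(stackTrace, contextCode)` returns the combined keyword string; a real implementation would replace `traceTokens` with whatever token weighting the tool actually uses.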
RESULT RANKING ASPECTS (4)
Content-Relevance
• Considers page title and body content against the search query
Context-Relevance
• Considers stack traces from the webpage against the target stack trace
• Considers code snippets against the context-code extracted from the IDE
Link Popularity
• Considers the Alexa & Compete site rank
• Estimates a normalized score from those ranks
Search Engine Confidence
• Heuristic measure of confidence for the result
• Considers the frequency of occurrence
• Considers the weight of each search engine
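One way to read the search engine confidence aspect is sketched below. The slides say only that the measure considers how often a result occurs and the weight of each engine; the specific formula here (weight of the engines that returned the link, divided by the total weight) is an assumption, as are the class name `ConfidenceSketch` and the engine weights used in any call.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the actual confidence formula is not given in the slides.
final class ConfidenceSketch {
    /**
     * Returns a value in [0, 1]: the share of total engine weight held by the
     * engines that returned this result. A link returned by more engines, or by
     * more heavily weighted engines, scores higher.
     */
    static double confidence(Set<String> enginesReturningUrl, Map<String, Double> engineWeights) {
        double total = 0.0;
        double hit = 0.0;
        for (Map.Entry<String, Double> e : engineWeights.entrySet()) {
            total += e.getValue();
            if (enginesReturningUrl.contains(e.getKey())) {
                hit += e.getValue();
            }
        }
        return total == 0.0 ? 0.0 : hit / total;
    }
}
```

With illustrative weights of 0.5, 0.25, and 0.25 for three engines, a link returned by the first two would score 0.75 under this reading.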
PROPOSED METRICS & SCORES
Content Matching Score (Scms)
• Cosine similarity based measurement
Stack trace Matching Score (Sstm)
• Structural and lexical similarity measurement of stack traces
Code context Matching Score (Sccx)
• Code snippet similarity (code clones)
StackOverflow Vote Score (Sso)
• Total votes for all posts in the SO result link
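As a concrete illustration of a cosine-similarity-based content score, the sketch below compares two texts using raw term frequencies. The slides do not say how the tool tokenizes or weights terms, so the lowercase alphanumeric tokenization and the absence of TF-IDF weighting are assumptions, and `CosineSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a cosine-similarity content score over term frequencies.
final class CosineSketch {
    /** Splits text on non-alphanumeric characters and counts each lowercase token. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** Cosine of the angle between the two term-frequency vectors; 0 if either text is empty. */
    static double cosine(String a, String b) {
        Map<String, Integer> ta = termFrequencies(a);
        Map<String, Integer> tb = termFrequencies(b);
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : ta.entrySet()) {
            dot += e.getValue() * tb.getOrDefault(e.getKey(), 0);
        }
        double normA = norm(ta);
        double normB = norm(tb);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    private static double norm(Map<String, Integer> tf) {
        double sum = 0.0;
        for (int v : tf.values()) {
            sum += (double) v * v;
        }
        return Math.sqrt(sum);
    }
}
```

In this reading, `CosineSketch.cosine(searchQuery, pageText)` would yield a score between 0 (no shared terms) and 1 (identical term distribution).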
PROPOSED METRICS & SCORES
Site Traffic Rank Score (Sstr)
• Alexa and Compete rank of each link
Search Engine Weight (Ssew)
• Relative reliability or importance of each search engine
• Experiments with 75 programming queries against the search engines
Heuristic weights of the metrics are determined through controlled experiments.
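To make the role of the heuristic weights concrete, the sketch below combines the six scores into a single value and ranks results by it. The slides do not state the combination formula or the weight values, so the normalized weighted sum, the equal placeholder weights, and the name `RankingSketch` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real formula and weights are not given in the slides.
final class RankingSketch {
    // Placeholder weights in the order Scms, Sstm, Sccx, Sso, Sstr, Ssew.
    // These are NOT the experimentally determined values.
    static final double[] HYPOTHETICAL_WEIGHTS = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0};

    /** Weighted average of the per-metric scores (each assumed to lie in [0, 1]). */
    static double finalScore(double[] scores, double[] weights) {
        if (scores.length != weights.length) {
            throw new IllegalArgumentException("scores and weights must have the same length");
        }
        double weighted = 0.0;
        double total = 0.0;
        for (int i = 0; i < scores.length; i++) {
            weighted += weights[i] * scores[i];
            total += weights[i];
        }
        return total == 0.0 ? 0.0 : weighted / total;
    }

    /** Returns result identifiers ordered from highest to lowest final score. */
    static List<String> rank(Map<String, double[]> scoresByResult, double[] weights) {
        List<String> results = new ArrayList<>(scoresByResult.keySet());
        results.sort(Comparator
                .comparingDouble((String r) -> finalScore(scoresByResult.get(r), weights))
                .reversed());
        return results;
    }
}
```

Tuning would then amount to adjusting the weight array and re-running the evaluation, which is one plausible shape for the controlled experiments mentioned above.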
EXPERIMENT OVERVIEW
• 75 exceptions collected from Eclipse IDE workspaces of graduate students of SR Lab, U of S, and from different online sources (StackOverflow, Pastebin)
• Related to the Eclipse plug-in framework and Java application development
• Solutions chosen through exhaustive web search with cross-validation by peers
• Recommended results manually validated
• Results compared against existing approaches and search engines
PERFORMANCE METRICS
• Mean Precision (MP)
• Recall (R)
• Mean First False Positive Position (MFFP)
• Mean Reciprocal Rank (MRR)
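The slides name these metrics without defining them. The sketch below uses the usual information-retrieval readings, which is an assumption: precision is the fraction of the top K results that are relevant, averaged over exceptions; recall is the fraction of exceptions with at least one relevant result in the top K (this agrees with the result tables, where for example 56 of 75 exceptions fixed is reported as 74.66%); and reciprocal rank is 1 divided by the position of the first relevant result. MFFP is left out because its exact definition cannot be recovered from the slides. `MetricsSketch` is a hypothetical name.

```java
import java.util.List;

// Hypothetical sketch using standard IR definitions (an assumption, see lead-in).
// Each inner list holds relevance judgments for one exception's ranked results:
// true means the result at that position contains a solution.
final class MetricsSketch {
    /** Mean over exceptions of (relevant results in top k) / k. */
    static double meanPrecision(List<List<Boolean>> judged, int k) {
        if (judged.isEmpty()) return 0.0;
        double sum = 0.0;
        for (List<Boolean> results : judged) {
            int relevant = 0;
            for (int i = 0; i < Math.min(k, results.size()); i++) {
                if (results.get(i)) relevant++;
            }
            sum += (double) relevant / k;
        }
        return sum / judged.size();
    }

    /** Number of exceptions with at least one relevant result in the top k (TEF). */
    static int totalExceptionsFixed(List<List<Boolean>> judged, int k) {
        int fixed = 0;
        for (List<Boolean> results : judged) {
            if (firstRelevantRank(results, k) > 0) fixed++;
        }
        return fixed;
    }

    /** TEF divided by the total number of exceptions. */
    static double recall(List<List<Boolean>> judged, int k) {
        return judged.isEmpty() ? 0.0 : (double) totalExceptionsFixed(judged, k) / judged.size();
    }

    /** Mean of 1 / rank of the first relevant result; contributes 0 when none is found. */
    static double meanReciprocalRank(List<List<Boolean>> judged, int k) {
        if (judged.isEmpty()) return 0.0;
        double sum = 0.0;
        for (List<Boolean> results : judged) {
            int rank = firstRelevantRank(results, k);
            if (rank > 0) sum += 1.0 / rank;
        }
        return sum / judged.size();
    }

    /** 1-based position of the first relevant result within the top k, or 0 if none. */
    private static int firstRelevantRank(List<Boolean> results, int k) {
        for (int i = 0; i < Math.min(k, results.size()); i++) {
            if (results.get(i)) return i + 1;
        }
        return 0;
    }
}
```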
RESULTS FOR SCORE COMPONENTS
Score Components                              Metric   Proactive Mode (Top 30)   Interactive Mode (Top 30)
Content                                       MP       0.0371                    0.0481
                                              TEF      56 (75)                   65 (75)
                                              R        74.66%                    86.66%
Content + Context                             MP       0.0376                    0.0514
                                              TEF      55 (75)                   66 (75)
                                              R        73.33%                    88.00%
Content + Context + Popularity                MP       0.0381                    0.0519
                                              TEF      56 (75)                   66 (75)
                                              R        74.66%                    88.00%
Content + Context + Popularity + Confidence   MP       0.0380                    0.0538
                                              TEF      56 (75)                   68 (75)
                                              R        74.66%                    90.66%
[MP = Mean Precision, R = Recall, TEF = Total Exceptions Fixed]
RESULTS OF EXISTING APPROACHES
Recommender                             Metric   Top 10    Top 20    Top 30
Cordeiro et al. (only stack traces)     MP       0.0202    0.0128    0.0085
                                        TEF      15 (75)   18 (75)   18 (75)
                                        R        20.00%    24.00%    24.00%
Proposed Method (Proactive Mode)        MP       0.0886    0.0529    0.0380
                                        TEF      51 (75)   55 (75)   56 (75)
                                        R        68.00%    73.33%    74.66%
Ponzanelli et al. (only context-code)   MP       0.0243    0.0135    0.0099
                                        TEF      7 (37)    7 (37)    7 (37)
                                        R        18.92%    18.92%    18.92%
Proposed Method (Proactive Mode)        MP       0.1000    0.0621    0.0450
                                        TEF      30 (37)   32 (37)   32 (37)
                                        R        81.08%    86.48%    86.48%
[MP = Mean Precision, R = Recall, TEF = Total Exceptions Fixed]
RESULTS OF SEARCH ENGINES
Search Engine                        Metric   Top 10    Top 20    Top 30
Google                               MP       0.1571    0.0864    0.0580
                                     TEF      57 (75)   57 (75)   57 (75)
                                     R        76.00%    76.00%    76.00%
Bing                                 MP       0.1013    0.0533    0.0364
                                     TEF      55 (75)   58 (75)   58 (75)
                                     R        73.33%    77.33%    77.33%
Yahoo                                MP       0.0986    0.0539    0.0369
                                     TEF      54 (75)   57 (75)   57 (75)
                                     R        72.00%    76.00%    76.00%
StackOverflow Search                 MP       0.0226    0.0140    0.0097
                                     TEF      14 (75)   17 (75)   17 (75)
                                     R        18.66%    22.66%    22.66%
Proposed Method (Interactive Mode)   MP       0.1229    0.0736    0.0538
                                     TEF      59 (75)   64 (75)   68 (75)
                                     R        78.66%    85.33%    90.66%
THREATS TO VALIDITY
• Search is not yet real-time; it generally takes about 20-25 seconds per search. Multithreading is used, but more extensive parallel processing is needed.
• Search engines are constantly evolving, so the same results may not be produced at a later time.
• Experiments used common exceptions, which are widely discussed and available on the web.
LATEST UPDATES
More extensive experiments with 150 exceptions. Achieved 92% accuracy.
Eclipse plugin release (https://marketplace.eclipse.org/content/surfclipse)
Context-aware Keyword search with automatic query completion feature.
Visual Studio 2012 Plugin under development.
Extensive User Study ongoing.
SURFCLIPSE TOOL DEMONSTRATION
Tool Demo video: https://www.youtube.com/watch?v=hGbyF4YveaI
REFERENCES
[1] M.M. Rahman, S.Y. Mukta, and C.K. Roy. An IDE-Based Context-Aware Meta Search Engine. In Proc. WCRE, pages 467–471, 2013.
[2] J. Cordeiro, B. Antunes, and P. Gomes. Context-based Recommendation to Support Problem Solving in Software Development. In Proc. RSSE, pages 85–89, June 2012.
[3] L. Ponzanelli, A. Bacchelli, and M. Lanza. Seahawk: StackOverflow in the IDE. In Proc. ICSE, pages 1295–1298, 2013.
[4] D. Poshyvanyk, M. Petrenko, and A. Marcus. Integrating COTS Search Engines into Eclipse: Google Desktop Case Study. In Proc. IWICSS, pages 6–, 2007.
[5] J. Brandt, P. J. Guo, J. Lewenstein, M. Dontcheva, and S. R. Klemmer. Two Studies of Opportunistic Programming: Interleaving Web Foraging, Learning, and Writing Code. In Proc. SIGCHI, pages 1589–1598, 2009.
SAMPLE STACK TRACE
java.net.ConnectException: Connection refused: connect
    at java.net.DualStackPlainSocketImpl.connect0(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at test.SockTest.main(SockTest.java:13)
SAMPLE CONTEXT CODE
try {
    Socket client = new Socket("localhost", 4321);
    ObjectOutputStream out = new ObjectOutputStream(client.getOutputStream());
    out.flush();
    ObjectInputStream in = new ObjectInputStream(client.getInputStream());
    System.out.println("Buffer size: " + client.getSendBufferSize());
    for (int i = 0; i < 10; i++) {
        if (i == 3) {
            Thread.currentThread().interrupt();
            System.out.println("Interrupted.");
        }
        out.writeObject("From Client: Hellow." + i);
        out.flush();
        System.out.println(in.readObject());
    }
} catch (Exception e) {
    e.printStackTrace();
}
SEARCH QUERY FOR CORPUS DEVELOPMENT
java.net.ConnectException Connection refused connect currentThread
ITEMS USED FOR RELEVANCE CHECKING
java.net.ConnectException Connection refused connect currentThread
+ Sample Stack Trace
+ Sample Context Code
SAMPLE STACK TRACE (2)
java.lang.ClassNotFoundException: org.sqlite.JDBC
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Unknown Source)
    at core.ANotherTest.main(ANotherTest.java:18)
CONTEXT CODE (2)
try {
    // code for making connection with a sqlite database
    Class.forName("org.sqlite.JDBC");
    Connection connection = null;
    connection = DriverManager.getConnection("jdbc:sqlite:" + "/" + "test.db");
    Statement statement = connection.createStatement();
    String create_query = "create table History ( LinkID INTEGER primary key, Title TEXT not null, LinkURL TEXT not null);";
    boolean created = statement.execute(create_query);
    System.out.println("Succeeded");
} catch (Exception exc) {
    exc.printStackTrace();
}
SEARCH QUERY FOR CORPUS DEVELOPMENT
java.lang.ClassNotFoundException org.sqlite.JDBC db ClassLoader execute