
Department of Informatics

Master Project Market · HS 2017

Nathan Labhart, Academic Coordinator

2017-11-01


Master Project: Rules

• The Master Project is a group project with two or more members.
  → Chance of denial for individual projects: 99%

• The Master Project can only be started after the Master Basismodul has been completed successfully (only for Major).
  → Best time: During semester break. Max. 1 year to complete.

• The Master Project must be done with an IfI professor.

• You will get 18 credit points.
  → Submit a final report that concludes your work.


Master Project Market: Procedure

• Groups at IfI prepared projects for you and published them online: http://www.ifi.uzh.ch/en/teaching/studiengaenge/msc/msc-proj.html

• Projects are presented at the Market → ask representatives

• To form groups, go to OLAT http://tiny.uzh.ch/yi → use discussion boards

• Once a group is complete, hand in the application form.


Master Projects


Generating Smart Diffs Through Deep Learning

Goal of the Master Project

The outcome of this project should be a taxonomy of change types for different programming languages based on empirical evidence contained in open source repositories as well as the implementation of a change classifier generator.

The main tasks of the project are:

● Find a suitable unsupervised machine learning approach to identify common change types for different programming languages.
● Extract thousands of projects from GitHub and apply the learning approach to cluster the change types.
● Implement a generator that will recognize these change types for arbitrary programming languages.

Contacts

Carol Alexandru <[email protected]>
Jürgen Cito <[email protected]>

Number of Students: 2-3

Introduction

While plain-text diffs are a straight-forward way of keeping track of changes in a software project, they are poorly suited for understanding those changes. Different semantic changes might be mixed together in a single diff and it is difficult to further process diffs using automated tools.

Approaches like ChangeDistiller extract changes between two revisions based on abstract syntax trees (ASTs) instead of plain text source code. This allows them to recognize semantic changes, like whether specific elements (if conditions, classes, methods, etc.) have been added, removed, modified or even moved to other locations in the source code.

However, there are two existing problems with this idea:

● The change types identified by tools such as ChangeDistiller have been manually crafted by researchers and can seem arbitrary.
● Tools like this need to be implemented separately for each specific programming language.

Since platforms such as GitHub contain the entire histories of millions of software projects, it should be possible to:

● Cluster different types of changes using unsupervised machine learning for any programming language.
● Generate a classifier that can recognize these change types instead of implementing it manually.
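To make the idea of AST-based change extraction concrete, here is a minimal sketch. It is not how ChangeDistiller or GumTree work internally; it only assumes Python's standard `ast` module and compares how often each node type occurs in two revisions of a snippet. The function names `node_type_counts` and `change_signature` are hypothetical. A signature like this could serve as one possible feature vector for clustering change types.

```python
import ast
from collections import Counter


def node_type_counts(source: str) -> Counter:
    """Count AST node types (If, FunctionDef, ...) in a piece of Python source."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))


def change_signature(before: str, after: str) -> dict:
    """Return the per-node-type count difference between two revisions.

    Only node types whose count changed are included.
    """
    old, new = node_type_counts(before), node_type_counts(after)
    return {
        name: new[name] - old[name]
        for name in sorted(set(old) | set(new))
        if new[name] != old[name]
    }


# Hypothetical example revisions: a guard clause is added to a function.
before = "def f(x):\n    return x\n"
after = "def f(x):\n    if x is None:\n        return 0\n    return x\n"
signature = change_signature(before, after)
```

Unlike a line-level diff, the signature states that an `If` node and an additional `Return` node appeared, while the `FunctionDef` itself is unchanged and therefore absent from the result.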


[Figure: pipeline. Millions of lines of code → Extract raw changes → Cluster change types (unsupervised learning) → Learn/synthesize classifier: learn a program f_ProgLang to classify changes on new code.]

Conventional diffs use line-level differencing.

Research tools for structural differencing: ChangeDistiller, GumTree (for automated reasoning).

Can we use deep learning to synthesize change types from large-scale examples?


MSc Project: Financial Crises and Network Structure

Motivation
• Banks create a complex network of interactions.
• Shocks spread through the network and cause financial crises.
• Portfolio Compression = Eliminate cycles in the network
  • Idea: Reduce interactions → reduce spread
  • Required by law

Research Question
• Does portfolio compression actually make financial networks safer?

Approach
• Large-scale Monte Carlo Simulations
• Optional: Theoretical guarantees
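As an illustration of what eliminating a cycle can mean, the sketch below reduces every liability on a cycle by the smallest liability on that cycle. This is one common reading of portfolio compression and an assumption here, not necessarily the definition used in the project. The data layout (a dict from `(debtor, creditor)` to an amount) and the function names are hypothetical.

```python
def compress_cycle(liabilities, cycle):
    """Reduce every edge on `cycle` by the smallest liability on that cycle.

    `liabilities` maps (debtor, creditor) -> amount owed.
    `cycle` lists the banks in order; the last bank owes the first one.
    """
    edges = list(zip(cycle, cycle[1:] + cycle[:1]))
    reduction = min(liabilities[edge] for edge in edges)
    compressed = dict(liabilities)
    for edge in edges:
        remaining = compressed[edge] - reduction
        if remaining > 0:
            compressed[edge] = remaining
        else:
            del compressed[edge]  # this edge is fully netted out
    return compressed


def net_positions(liabilities):
    """Net amount each bank is owed (positive) or owes (negative)."""
    net = {}
    for (debtor, creditor), amount in liabilities.items():
        net[debtor] = net.get(debtor, 0) - amount
        net[creditor] = net.get(creditor, 0) + amount
    return net


# Hypothetical three-bank cycle A -> B -> C -> A.
liabilities = {("A", "B"): 10, ("B", "C"): 7, ("C", "A"): 5}
compressed = compress_cycle(liabilities, ["A", "B", "C"])
```

In this example the cycle disappears because the edge from C to A is removed, yet every bank's net position is unchanged. Whether the compressed network is actually safer under shocks is the open question the simulations would address.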

Prerequisites
• Required knowledge:
  • Good programming skills (Python or Java or …)
  • Basic graph theory (e.g., flows on networks)
• Helpful knowledge:
  • Solid background in mathematics (linear algebra), reading and writing proofs
  • Scientific computing / numerics
• Participants: 2-3

Should we eliminate this cycle?

Supervisors: Prof. Sven Seuken, Steffen Schuldenzucker.
Contact: [email protected]
Computation and Economics Research Group. www.ifi.uzh.ch/ce


© 2017 UZH, CSG@IfI

Design and Implementation of a Crypto-Currency

• Blockchains (BC)
• Crypto-Currencies (CC)
• Tasks:
  – Design & Implementation of a PoSp-based BC and CC
  – Design & Implementation of an Android-based wallet
  – Test and Evaluation
• Goals:
  – Secure, Reliable, Fast and Energy Efficient CC

Sina Rafati, Bruno Rodrigues, Burkard Stiller
Communication Systems Group CSG, Department of Informatics IfI, University of Zürich UZH

rafati|rodrigues|[email protected]


Implementation of FFT Operator in Apache Flink

Muhammad Saad

Database Technology Group (DBTG)


Stream Processing
• Processing of data in motion
• Computing on data directly as it is produced or received.
• The application logic, analytics, and queries exist continuously, and data flows through them continuously.
• High throughput: high-velocity, high-volume data can be processed with minimal latency.

Apache Flink
• Apache Flink is a powerful, mature, open source stream processing framework

What will you do?
• Fast Fourier Transform (FFT) Query Operator in Apache Flink
• With Aggregate and Group Clause
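To illustrate what an FFT operator combined with a group clause computes, here is a minimal sketch in plain Python. It does not use Apache Flink or its API (the project itself targets Java). It groups `(key, value)` records by key and applies a recursive radix-2 FFT to each group, assuming each group's length is a power of two. The names `fft` and `fft_by_group` and the sensor keys are hypothetical.

```python
import cmath
from collections import defaultdict


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_by_group(records):
    """Group (key, value) records by key, then transform each group's values."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return {key: fft(vals) for key, vals in groups.items()}


# Hypothetical readings from two sensors, interleaved as they might arrive.
records = [("s1", 1), ("s2", 1), ("s1", 0), ("s2", 1),
           ("s1", 0), ("s2", 1), ("s1", 0), ("s2", 1)]
spectra = fft_by_group(records)
```

An impulse (`s1`) spreads evenly over all frequencies, while a constant signal (`s2`) concentrates in the zero-frequency bin. A streaming operator would additionally have to decide how records are windowed before the transform is applied, which this sketch leaves out.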

[Figure: Sample FFT Query, SQL Query Engine]


Development:
• Language: Java
• # of Students: 2

Contact:
• Email: [email protected]
• Office: BIN 2.E.10


UZH Open Research Data with DDIS

Using Wikidata’s Infrastructure at UZH
Contributing to Wikidata (CC0)

- Machine-readable data
- Wiki-style collaboration
- Editors and Roles
- Qualifiers, Sources
- Querying Knowledge Base (SPARQL)
- Citations
- Images
- Research Facts

Community-Oriented Semantic Data Management

- Timelines of Discoveries
- Author Networks
- Topic Evolution

Query Service, Histropedia, http://www.wikidata.org, WikiCite, Scholia, PLAZI


Supporting the feedback communication between end-users and developers for Android apps

● Fact: Feedback is valuable information for software evolution.

● But: Current feedback solutions often do not allow
  ○ Feedback senders to comment on feedback sent by others,
  ○ Feedback receivers to ask questions for clarification,
  ○ Feedback senders to receive information about the feedback status (e.g., issue solved).

● Goal: You develop solutions that
  ○ Support the discussion and negotiation among feedback senders and between feedback senders and receivers,
  ○ Inform the feedback senders about the current status of their requests,
  ○ Encourage end-users to take part in the feedback communication activities (e.g., by using gamification elements).

Norbert Seyff ([email protected])

Feedback senders (end-users)

Feedback receivers (developers)

Discussion • Negotiation • Feedback status • Motivation


FlexiSketch 2.0

Find videos and more on www.flexisketch.org
Contact: Dustin Wüest, [email protected]

Created with the 2D game framework Corona SDK and the scripting language Lua

Cross-platform FlexiSketch
A flexible editor for diagram sketching, runs on Android + iOS + Mac + Windows

Includes multi-user support

Where we are
✓ Done

Where we want to go
➢ Make it pretty
➢ Integration of text documents


Test Case Coverage Visualization
software evolution & architecture lab

• Different coverage metrics (line, statement, branch, path) → how well-tested is your app?
• Part of CI environment
• Detached from source code and given in percent (e.g., 80% of classes)
• No information about how well-tested coherent/coupled parts are
• No support for exploring the parts that need more testing
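The limitation described above, a single percentage that hides how well individual coherent parts are tested, can be shown with a small sketch. It assumes line coverage is available per file as `(covered_lines, total_lines)` and that some clustering step has already assigned each file to a cluster. The file names, the cluster assignment, and the function `coverage_by_cluster` are hypothetical.

```python
from collections import defaultdict


def coverage_by_cluster(files, cluster_of):
    """Aggregate per-file line coverage into a percentage per cluster.

    `files` maps path -> (covered_lines, total_lines).
    `cluster_of` maps path -> cluster name.
    """
    totals = defaultdict(lambda: [0, 0])  # cluster -> [covered, total]
    for path, (covered, total) in files.items():
        bucket = totals[cluster_of[path]]
        bucket[0] += covered
        bucket[1] += total
    return {c: 100.0 * cov / tot for c, (cov, tot) in totals.items() if tot}


# Hypothetical coverage data for three files in two clusters.
files = {
    "billing/Invoice.java": (90, 100),
    "billing/Tax.java": (10, 100),
    "ui/Button.java": (50, 50),
}
cluster_of = {
    "billing/Invoice.java": "billing",
    "billing/Tax.java": "billing",
    "ui/Button.java": "ui",
}
per_cluster = coverage_by_cluster(files, cluster_of)
overall = 100.0 * sum(c for c, _ in files.values()) / sum(t for _, t in files.values())
```

Here the overall figure is 60%, which hides that the `ui` cluster is fully covered while `billing` is only half covered. A map-like visualization could make such differences visible at a glance.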

Hard Facts:
• # of Students: 2
• Contact: Giovanni Grano ([email protected]) and Christoph Laaber ([email protected])

1. State of the Art of Software Visualization
2. Design Map-like Visualization
3. Cluster Software through Static Analysis
5. CI/IDE Integration & Source Code Navigation (jump to file, zoom, etc.)


Runtime Code Injection

Background
Experimentation is about testing new functionality on a small fraction of the user base in production environments (e.g., 1% of users).

Goal
Extract code changes directly in the IDE and inject them into the running application without recompiling and redeploying it.

Test these code changes immediately in production on a small fraction of the user base and provide feedback within the IDE (e.g., visualize metrics).

1 Code changes
2 Extract and classify changes
3 Inject changes into running application as an experiment
4 Provide feedback within IDE

Contact Gerald Schermann: [email protected]

2 - 4 students
Further Information: http://www.ifi.uzh.ch/en/seal/teaching/master/projects/RuntimeCodeInjection.html
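Exposing a change to a small fraction of users, such as the 1% mentioned in the background, is often done by hashing a user identifier into a bucket. The sketch below shows that general technique as an assumption; the slide does not say how the project selects users. The function `in_experiment` and the experiment name are hypothetical.

```python
import hashlib


def in_experiment(user_id: str, experiment: str, fraction: float) -> bool:
    """Deterministically assign a user to an experiment.

    The same (user, experiment) pair always gets the same answer, and roughly
    `fraction` of all users are selected.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < int(fraction * 2**64)


# Hypothetical user base: select about 1% for the injected change.
users = [f"user-{i}" for i in range(10_000)]
exposed = [u for u in users if in_experiment(u, "new-checkout", 0.01)]
```

Because the assignment depends only on the hash, no per-user state has to be stored, and including the experiment name in the hash keeps different experiments from always selecting the same users.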


BenGen: Automatic Performance Test Suite Generation
software evolution & architecture lab

• Hard to write -> requires knowledge of (dynamic) compiler internals/optimizations
• No clear understanding, no best practices
• No (too few) standard libraries -> Java has JMH since v1.7
• Hardly anyone writes benchmarks -> in 2015 ~ 30 GitHub projects w/ at least 1 JMH commit

Hard Facts:
• # of Students: 2
• For Java and/or Go
• Also available as Master Thesis
• Contact: Christoph Laaber ([email protected])

1. Unit Test Suite
2. Static Analysis
3. Select/Filter
4. Generation
5. JMH Performance Test Suite


Perf-CoRe: Performance Code Review

Contact:[email protected]@ifi.uzh.ch

Are Performance Bugs detected during Code Review?

- Code Reviews Mining (GerritHub)

+

- Performance Reviews Validation (Profiler)

HPROF

Does Code Review help developers in fixing Performance Bugs?

- Extraction of Versions affected by Performance Bugs (discovered during Code Review)

- Code Performance metrics BEFORE and AFTER the Code Review

How can we support Developers during code review to fix Performance Bugs?

- Code Review Augmentation with dynamic Performance metrics/information

Master Project Description available at: http://www.ifi.uzh.ch/en/seal/teaching/master/projects/Perf-CoRe.html


Annostand: Understanding how people create and use annotations in reading research articles

Info: zpac.ch/projects
Suitable for 2–3 students (as a master project)
Supervisor: Prof. Dr. Chat Wacharamanotham (Interaction Design)

In this project, you will:
• Analyze types and purposes of annotations from a corpus of annotated papers and videos of annotations
• Design and conduct an eye tracking study to investigate creation and usage of annotations
• Implement and evaluate software to classify and summarize annotations

People highlight and scribble while they read. These annotations are found to aid in interpretation, understanding, and remembering. But too many highlights can be distracting when the reader re-reads the material or looks for important information in it. Can the computer help readers separate important annotations from spurious ones?


Through social networks, individuals participate in the diffusion of today’s news. Yet many share news without fully reading it.

In this project, you will
1) develop a social RSS reader app that:
  ✓ allows users to like, share and save their news within the app,
  ✓ links the app with the user’s Facebook account and shows others who like, share and save this news,
  ✓ tracks the user’s interactions with the app,
2) conduct a small-scale study to test users’ news consumption behavior within this app.

Suitable for 2-3 students (as a master project)
Supervisors: Prof. Dr. Chat Wacharamanotham (Interaction Design), Prof. Dr. Anne Scherer and Andrea Bublitz (Quantitative Marketing)

NewsRoom

Info: zpac.ch/projects


Indivi: A system for providing personalized feedback for a large number of study participants

Ambulatory assessment is a research method that allows for assessing psychological processes like emotions in the context of daily life. Study participants often want to get personalized feedback about their results, which is not possible in most cases due to a lack of automated options.

In this project you will
• gain insight into ambulatory assessment studies of the elderly at the UZH University Research Priority Program “Dynamics of Healthy Aging”
• develop an HTML-based flexible tool for researchers who want to give personalized feedback to study participants

Suitable for 2 students (as a master project)

Indivi

Info: zpac.ch/projects

Supervisors: Prof. Dr. Chat Wacharamanotham (Interaction Design) Dr. Andrea B. Horn (Psychology)


WISE: Web-based Interdisciplinary Symptom Evaluation

Info: zpac.ch/projects

Supervisors: Prof. Dr. Chat Wacharamanotham (IFI), PD Dr. Dr. Dominik Ettlin, and lic. phil. Beat Steiger (Dental Medicine)

• Evaluating the existing WISE questionnaire system

• Designing & implementing the summary report page

• Designing & implementing an Android mobile app or Android Wear smartwatch app for capturing pain over time

Suitable for 3–4 students (as a master project)

WISE


Description
• Tool for sustainability assessment
• Identify challenges and values
• Institutional focus: UZH level
• 10 categories, each with at least 3 indicators
• Rating system for indicators

Requirements
• Intuitive UI
• Upload of data to CMS
• Different user roles
• Different modes (expert/light)
• Dynamic/extendable content
• Currently common browsers
• Visualization (e.g. spider web diagram)
• Export data as file/image

Advisors: Prof. Dr. Lorenz Hilty, Dr. Clemens Mader
Department of Informatics – Informatics and Sustainability Research Group


• Start date: ASAP
• Finish date: 6 months (or less)
• Technologies not determined yet
  • Frameworks: e.g. Django (Python), JavaScript (Angular, Vue), HTML
  • SQL database: MySQL
  • Host: external, e.g. hostpoint.ch

Questions? Interested? Send me an email: [email protected]


Visualization Toolkit Evaluation

• Evaluate several different visualization toolkits for their usability and deployability on a computer graphics cluster and display wall
  ‣ VisIt and ParaView/VTK

• Evaluate and demonstrate the feasibility of integrating custom graphics rendering or data visualization into the visualization toolkit’s output
  ‣ Combine toolkit display with separately generated image data

[email protected]


ReviewNG: The Next Generation code review tool

Project vision
Devise the next code review tool, to really support developers in this hard and time-consuming task

Current code review tools are primitive!

They do not support developers with the most important tasks of a review:
- understanding the code under review
- detecting the most problematic parts, quickly
- providing high quality comments

What are you going to learn and do in this project?
- learn about the problems of code review and elaborate new ideas to solve them
- learn a new exciting programming language and IDE: Smalltalk Pharo
- develop a full-fledged code review tool, with the Pharo community, implementing your ideas
- design and conduct an experiment to validate your work with real-world developers

Suitable for 2-4 students (as a master project)
Supervisor: Prof. Dr. Alberto Bacchelli (Empirical Software Engineering)


CodeCity 2.0

Project vision
Expand CodeCity (the most famous software visualization tool) with information about connections between artifacts

CodeCity is a great software visualization tool, but it is not complete yet!

CodeCity cannot show information about:
- connections between classes
- clusters of artifacts that are related
- dynamic behavior

What are you going to learn and do in this project?
- learn about CodeCity and its implementation
- learn about information visualization techniques to overlay clustering information on maps
- develop an extension to an existing version of CodeCity, or create your own version
- design and conduct an experiment to validate your work with real-world developers

Suitable for 2-4 students (as a master project)
Supervisor: Prof. Dr. Alberto Bacchelli (Empirical Software Engineering)


What was your source code comment or email about?

Project vision
Devise an online service to automatically classify code comments and development emails, to help software developers

Non-relevant

Natural Language

Stack trace

Source code

Patch

(1) Alice wrote:
(2) On Mon 23, Bob wrote:
(3) Dear list,
(4) When starting up ArgoUML on my MacOS X system (Java 2)
(5) it throws a NullPointerException very soon. You'll find the
(6) trace below. I hope someone knows a solution. Thanks a lot!

(7) Exception in thread "main" java.lang.NullPointerException
(8) at
(9) javax.swing.event.SwingSupport.fireChange(SwingChange.java)
(10) at javax.swing.AbstractAction.setEnabled(AbstractAction.java) [...]
(11) at uci.uml.Main.main(Main.java:148)

(12) I'm sorry I can't help you Bob but thanks for sharing the stack...
(13) Alice.
(14) --
(15) "Beware of programmers who carry screwdrivers." --L. Brandwein

(16) Alice, I believe the flawed Explorer.java class generates Bob's issue:
(17) public void setEnclosingFig(Fig each) {
(18) super.setEnclosingFig(each);
(19) if (each != null || (each.getOwner() instanceof MPackage)) {
(20) m = (MPackage) each.getOwner(); }

(21) The problem is in the condition, I attach the diff with this version:
(22) --- src/org/argouml/ui/explorer/Explorer.java (revision 14338)
(23) +++ src/org/argouml/ui/explorer/Explorer.java (working copy)
(24) @@ -147,1 +147,1 @@ [...]
(25) super.setEnclosingFig(each);
(26) - if (each != null || (each.getOwner() instanceof MPackage)) {
(27) + if (each != null && (each.getOwner() instanceof MPackage)) {
(28) m = (MPackage) each.getOwner(); }

(29) Probably ModelTree is also affected, if so, please change it =)
(30) Cheers, Carl.
(31) -- I used to have a sig, but it took up much space so I got rid of it!
(32) ---------------------------------------------------------------------
(33) To unsubscribe, e-mail: [email protected]
(34) For additional commands, e-mail: [email protected]

Development Emails and Code Comments are hard to parse!

They are made of many different languages and have several types of meanings:
- natural language is interleaved with code, patches, stack traces
- there is a lot of noise to be removed

What are you going to learn and do in this project?
- learn about information retrieval and advanced parsing techniques
- learn about machine learning algorithms for text classification and how to mix them with parsing
- develop an online service that developers and researchers can use to automatically classify thousands of emails or code comments in seconds
- design and conduct an experiment to validate your work with real-world developers
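As a rough baseline for the classification task, the sketch below labels each line of an email with a few hand-written regular expressions. It is only an illustration under simple assumptions, not the parsing or machine learning approach the project is meant to develop, and it does not cover the non-relevant category. The rule set and the function `classify_line` are hypothetical.

```python
import re

# Ordered rules: the first pattern that matches decides the label.
RULES = [
    ("patch", re.compile(r"^(---|\+\+\+|@@|[+-]\s)")),
    ("stack trace", re.compile(r"^(Exception in thread|\s*at\s+[\w.$]+\()")),
    ("source code", re.compile(r"[;{}]\s*$")),
]


def classify_line(line: str) -> str:
    """Return a coarse content label for one line of a development email."""
    for label, pattern in RULES:
        if pattern.search(line):
            return label
    return "natural language"


# Lines taken from the example email on this slide.
sample = [
    "I hope someone knows a solution. Thanks a lot!",
    'Exception in thread "main" java.lang.NullPointerException',
    "at uci.uml.Main.main(Main.java:148)",
    "super.setEnclosingFig(each);",
    "+++ src/org/argouml/ui/explorer/Explorer.java (working copy)",
    "- if (each != null || (each.getOwner() instanceof MPackage)) {",
]
labels = [classify_line(line) for line in sample]
```

Rules like these break quickly on real mail. For example, an unchanged context line inside a patch looks like ordinary source code, which is one reason to combine parsing with learned classifiers.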

Suitable for 2-4 students (as a master project)
Supervisor: Prof. Dr. Alberto Bacchelli (Empirical Software Engineering)


Gaztimate

Suitable for 2–4 students (as a master project)Figure 3: E↵ect of KG presence on eye gaze heatmaps. Red hotspots indicate where the user spent mosttime looking. The left panel shows a heatmap of user eye gaze when KG is absent (the shape resembles aGolden Triangle focused on top-left of the page). The right panel shows the corresponding heatmap whenKG is present. The search pages have the page areas of interest (AoIs) outlined, and regions outside AoIsdimmed. For the actual result pages presented to users, the AoIs were not visually marked in this way. Notethe increased activity near KG, suggesting a potential second Golden Triangle focused on KG.

tention, there are significant di↵erences in other measures:irrelevant KGs slow down the user by 3.5-4s on average (timeon page and time on task increase), while relevant KGs speedup the user by around 4s on average (users spend 2-2.5s lesson each of the top and mid as the answer is found in KG).Thus, relevant KGs get a higher fraction of page dwell (18%)than irrelevant KGs (8%), and search terminates faster onaverage after the user visits a relevant KG compared to anirrelevant KG (0.9 vs 2.8s). Clearly, task relevance is animportant factor a↵ecting user attention and task perfor-mance.

We tested whether mouse activity is sensitive to changesin relevance. We observe similar trends as the eye, but to asmaller extent. Like the eye, the mouse shows that relevantKGs get a higher fraction of page dwell (17%) compared toirrelevant KGs (12%), and search terminates faster on aver-age after the user visits a relevant KG compared to an irrele-vant KG (2.9 vs 6.4s). Figure 7a shows sample mouse trackswhen KG is relevant – in this example, the task is “when wasthe sequel to Toy Story released?”, we find that the user findsthe answer in KG, hence search terminates soon after uservisits KG. Figure 7b shows sample mouse tracks when KGis irrelevant – in this example, the task is “find more aboutthe Let’s Move program by Michelle Obama”, we find thatthe user visits KG, and continues searching on the rest ofthe page 6. Thus, mouse activity, like eye gaze, is sensitiveto relevance.

6The figure shows two di↵erent queries to illustrate moreexamples of pages with KGs. However, for analysis, we usedthe same set of queries to compare KG-relevant and KG-irrelevant conditions.

5. PREDICTING EYE FROM MOUSEGiven the observed correlations between eye and mouse ac-tivity in some measures, we are motivated to ask the follow-ing questions:

• How well can we predict eye gaze from mouse activity?• Can we achieve higher accuracy by predicting elements

of interest on the screen rather than estimating theexact eye gaze coordinates?

• To what extent is the relationship between eye gazeand mouse position user-specific and how far can wegeneralize to unseen users?

To answer these questions we developed a set of regressionand classification models to predict the exact coordinatesof the eye gaze and the element of interest on the page,respectively. Before describing these models in detail weneed a formal definition of our learning problem: We dividedthe set of eye-mouse readings into a set of points, where eachpoint d

i

= (yi

, e

i

,vi) represents the eye gaze coordinates yi

,a corresponding element of interest on the page e

i

, and acovariate vector vi comprising of a set of mouse features. Welet D

u = {du1 , · · · , dunu} be the set of nu

points pertainingto user u, and we let D = {D1

, · · · , DU} be the set of alldata points across all U users.

5.1 Regression to predict the eye positionAs a first step consider the problem of estimating the y-

coordinate of eye-gaze directly from mouse activity7. This

7Predicting the y coordinate of eye gaze is more interestingthan the x coordinate, as it can reveal which result element

957

Figure 4: Examples of eye (green) and mouse (blue) tracks when KG is present (left) and absent (right)

Figure 6: E↵ect of KG relevance on eye. Considerthe left panel. The x axis shows the AoIs, and they axis shows, for each AoI, the di↵erence in atten-tion (in seconds) when KG is present and relevantvs. when KG is absent (mean ± standard error).Right panel shows the corresponding plot for irrel-evant KG.

is a regression problem where we seek to find a functionf : v ! y such that the discrepancy between f(v) and theobserved eye position y is minimized.

In the following we use a (generalized) linear model to rep-resent the mapping from attributes to eye positions. Thatis, we seek to learn a regression function

f(v) = hw,�(vi

)i

Here f is parametrized by a weight vector w that we seekto estimate. When �(v

i

) = vi

we end up with a linearregression function in the input covariate space v

i

. When

the user is looking at. Thus we focus on y coordinate in thispaper.

�(vi

) comprises a nonlinear mapping, we obtain a nonlinearfunction in the input covariate space.To assess the impact of a personalized model we compare

the following three models: a global model that estimatesthe parameter w common for all users. Secondly we infera user-specific model that provides an upper bound of howaccurately the model can estimate eye positions from mouseactivity. Finally, we infer a hybrid model that combinesglobal and user-specific components. This allows us to dis-sociate both parts, thus allowing us to generalize to userswhere only mouse movements are available while obtaininga more specific model whenever eye tracking is possible. Wedescribe these three approaches below:

Global model: In this setup we learn a global regression function $f_g$ parametrized by a global weight vector $w_g$. The learning goal is to find $w_g$ that minimizes the average prediction error on the whole dataset. More formally, our learning problem for the y-coordinate is:

$$\min_{w_g} \sum_{d_i \in D} \| y_i - \langle w_g, \phi(v_i) \rangle \|_2^2 + \lambda \| w_g \|_2^2$$

where $\lambda$ is a regularization parameter to prevent overfitting. This model tests the hypothesis that eye-mouse correlation is a global phenomenon and does not depend on the specific user behaviour.

User-specific models: In this setup we learn regression functions $f_u$ independently for each user $u$. The learning problem for the y-coordinate is:

$$\min_{w_u} \sum_{d_i \in D} \| y_i^u - \langle w_u, \phi(v_i^u) \rangle \|_2^2 + \lambda \| w_u \|_2^2$$

This model tests the hypothesis that eye-mouse correlation is NOT a global phenomenon and depends on the specific user behaviour.

Hierarchical model: In this setup we still learn a per-user regression model, however we decompose each user-specific regression weight additively into a user-dependent part $w_u$ and a global part $w_g$. More for-

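The global and user-specific objectives quoted above are both ridge regressions, so they can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the paper's implementation: the record layout `(user, phi_v, eye_y)`, the function names, and the feature map (a bias term plus the mouse y position) are all hypothetical, and the hierarchical model is left out because the excerpt breaks off before its objective.

```python
from collections import defaultdict


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_ridge(features, targets, lam):
    """Minimize sum_i (y_i - <w, phi_i>)^2 + lam * ||w||^2 in closed form."""
    d = len(features[0])
    # Normal equations: (Phi^T Phi + lam * I) w = Phi^T y
    A = [[sum(f[j] * f[k] for f in features) + (lam if j == k else 0.0)
          for k in range(d)] for j in range(d)]
    b = [sum(f[j] * y for f, y in zip(features, targets)) for j in range(d)]
    return solve(A, b)


def fit_global(records, lam):
    """One weight vector w_g shared by all users."""
    return fit_ridge([phi for _, phi, _ in records],
                     [y for _, _, y in records], lam)


def fit_per_user(records, lam):
    """One independent weight vector w_u per user."""
    by_user = defaultdict(list)
    for user, phi, y in records:
        by_user[user].append((phi, y))
    return {user: fit_ridge([p for p, _ in rows], [y for _, y in rows], lam)
            for user, rows in by_user.items()}


def predict(w, phi):
    """Inner product <w, phi(v)>, i.e. the predicted eye y position."""
    return sum(wj * pj for wj, pj in zip(w, phi))
```

Calling `fit_global(records, lam)` and `fit_per_user(records, lam)` on the same records and comparing their prediction errors mirrors the comparison the excerpt describes: if the per-user fits are clearly better, the eye-mouse relation depends on the user.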

Eye and mouse have similar movement patterns when reading results from web search

Figure: Navalpakkam et al. (2013)

How much does this similarity exist in reading research articles and code review?

Supervisors: Prof. Dr. Alberto Bacchelli (Empirical Software Engineering), Prof. Dr. Chat Wacharamanotham (Interaction Design)

In this project, you will:
• learn about eye and computer interaction tracking
• extend applications (in Java and JavaScript) to track computer interactions and include an eye movement data stream
• design and conduct a lab study and analyze data

Using computer interaction tracking to estimate where people look while reading research articles and software code

Info: zpac.ch/projects

Page 30: Master Project Market 2017-11-02 - UZH9475cbcb-e116-4626... · • Basic graph theory (e.g., flows on networks) ... Using Wikidata’s Infrastructure at UZH Contributing to Wikidata


Existing Task Planning Tools

Tasks are characterised by urgency, importance, tags, sub tasks, …

WEEK PLAN, Remember The Milk, Any.do

Happy Planner

• Adds emotions to tasks
• Contains features to tackle annoying tasks

Contact: Manuela Züger, Sebastian Proksch, Giovanni Grano

Page 31: Master Project Market 2017-11-02 - UZH9475cbcb-e116-4626... · • Basic graph theory (e.g., flows on networks) ... Using Wikidata’s Infrastructure at UZH Contributing to Wikidata

Software Ecosystem: “Groups of software projects that are developed and co-evolve in the same environment…”

Popular examples: Eclipse plug-ins, the R ecosystem, Linux distributions, Apache projects

Main Characteristics: “share code, depend on one another, reuse the same code, are built on similar technologies…”

Client Library

Recommender System for Libraries Update of Client Projects

Sebastiano Panichella http://www.ifi.uzh.ch/en/seal/people/panichella.html

Page 32: Master Project Market 2017-11-02 - UZH9475cbcb-e116-4626... · • Basic graph theory (e.g., flows on networks) ... Using Wikidata’s Infrastructure at UZH Contributing to Wikidata

Problem when Managing Software Ecosystems…

Gabriele Bavota, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella: How the Apache Community Upgrades Dependencies: An Evolutionary Study. Empirical Software Engineering (EMSE 2014)

Managing dependencies in large ecosystems is an intricate task…

?

Page 33: Master Project Market 2017-11-02 - UZH9475cbcb-e116-4626... · • Basic graph theory (e.g., flows on networks) ... Using Wikidata’s Infrastructure at UZH Contributing to Wikidata

Goal of the Master Project Software Ecosystems…

Project description: www.ifi.uzh.ch/en/seal/teaching/master/projects/Libraries-upgrade-tool.html
Number of required students for the project: 2-3

Page 34: Master Project Market 2017-11-02 - UZH9475cbcb-e116-4626... · • Basic graph theory (e.g., flows on networks) ... Using Wikidata’s Infrastructure at UZH Contributing to Wikidata


Interested in a project? Talk to representatives and form groups!

http://tiny.uzh.ch/yi

Good luck with your Master Project 😉


