Date post: 12-Apr-2017 | Category: Data & Analytics | Upload: frank-hopfgartner
Setting Up a Living Lab for Information Access Research
Frank Hopfgartner, DAI-Labor, Technische Universität Berlin
In CLEF NEWSREEL, participants can develop news recommendation algorithms and have them tested by millions of users over the period of a few months in a living lab.
Why am I here?
Because I co-organise CLEF NEWSREEL.
Overview
Part 1 (Academic Overview): Living Labs (Introduction), Living Labs for IR Research, CLEF NEWSREEL
Part 2 (Hands-on Experience)
So what are living labs?
Rely on feedback from real users to develop convincing demonstrators that showcase the potential of an idea or a product.
Real-life test and experimentation environment to fill the pre-commercial gap between fundamental research and innovation.
Example: Efficiency House Plus [BMVBS, 2011]
§ National research initiative on energy efficiency in the housing and traffic domains
§ Efficiency House Plus is a small power plant that can export energy surpluses into the local power grid
§ Equipped with 1000 data sources such as movement sensors, weather data, etc.
Source: Werner Sobek
What can be studied?
1000 data points:
§ 205 smart meters
§ 39 heat pumps
§ 74 illumination sensors
§ 38 photovoltaic sensors
§ …
Efficiency House Plus with electro mobility, Berlin. Research initiative of BMVBS.
§ Detection of resident presence in the home environment: energy consumption is an indicator for presence, but some devices continually consume energy
§ Recognition of resident activities: draw conclusions about user activity based on usage of home appliances
§ Recommendation of optimized heating schedules: gradually learn characteristic behavior to create personalized schedules for heating control
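The first bullet (consumption indicates presence, but always-on devices add a baseline load) could be sketched as a simple threshold rule. This is an illustrative assumption, not the project's actual method; the standby wattage, margin, and readings are all invented.

```python
# Illustrative sketch (not the actual detection method used in the
# project): flag resident presence whenever smart-meter consumption
# rises clearly above the standby load drawn by always-on devices.
def presence_from_consumption(readings_watts, standby_watts=150, margin=1.5):
    """Return True for each reading exceeding `margin` times the standby load."""
    return [w > standby_watts * margin for w in readings_watts]

print(presence_from_consumption([120, 140, 400, 90]))  # [False, False, True, False]
```

A real system would need to account for cyclic loads such as fridges, which is exactly why the slide calls continuous consumers out as a complication.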
Innovation
Data Analysis
Questions to be addressed
How will people really use the technology?
Who is interested in my product?
What is the willingness to pay?
Is there a need for my product?
What parameters do I need?
Overview
Part 1 (Academic Overview): Living Labs (Introduction), Living Labs for IR Research, CLEF NEWSREEL
Part 2 (Hands-on Experience)
Why?
Cranfield (1962-1966)
Medlars (1966-1967)
SMART (1961-1995)
TREC (1992-today)
NTCIR / CLEF (1999/2000-today)
Let's have a look at the history of IR evaluation
Develop system/algorithm
Prepare appropriate dataset
Perform user study
Measure performance
Cranfield Evaluation Paradigm
§ Use a standard test collection (e.g., from TREC) with documents, relevance assessments and search tasks
§ Create your own test collection (domain specific)
§ Ask users to perform search tasks in a controlled environment
§ Simulated work task situation
§ Standard IR evaluation metrics
§ Qualitative methods
§ Baseline
§ Fancy improvement that will change the world
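The "standard IR evaluation metrics" mentioned above can be illustrated with a minimal sketch of precision@k and average precision over a ranked result list, given TREC-style relevance assessments. Document IDs and judgments below are invented.

```python
# Sketch of two Cranfield-style metrics computed against a test
# collection's relevance assessments. Illustrative, invented data.

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are judged relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k

def average_precision(ranking, relevant):
    """Mean of precision@k over the ranks at which relevant docs appear."""
    hits, total = 0, 0.0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

ranking = ["d3", "d1", "d7", "d2", "d5"]
relevant = {"d1", "d2"}
print(precision_at_k(ranking, relevant, 2))   # 0.5
print(average_precision(ranking, relevant))   # (1/2 + 2/4) / 2 = 0.5
```

Averaging such scores over all topics of a test collection gives the familiar system-level numbers (e.g., MAP) that a laboratory evaluation reports.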
Laboratory Setting
"Find as many documents as possible for a given search task."
"Act naturally while I watch everything you are doing."
"I tell you what is relevant!"
NOT SUITABLE FOR RESEARCH ON USER-CENTRED IR
Evaluation of User-Centred IR (Personalised Search)
Context:
§ Country
§ Social connection
§ Locality
§ Personal history
§ Mobile search
Evaluation issues:
§ Observer-expectancy effect
§ Atypical search task
§ Missing context/background
§ Missing incentive to satisfy one's own information need
Personalised Search
An alternative setting
"Use our system to find the information you are looking for."
"Use the system whenever you want, for whatever reason."
"You decide what you consider to be relevant."
How to evaluate?
User Simulation
[ECIR’08, ACM TOIS, 2011]
Allows fine-tuning (White et al., 2005), but does not replace a user study.
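User simulation as discussed above can be sketched as a simple probabilistic click model: a simulated searcher scans the ranking top-down and clicks relevant results more often than non-relevant ones. The click probabilities here are illustrative assumptions, not values from the cited work.

```python
import random

# Minimal user-simulation sketch (a simplistic click model, not the
# specific simulation framework cited above): a simulated searcher
# scans the ranking top-down and clicks relevant documents with
# probability p_rel, non-relevant ones with p_nonrel (both invented).
def simulate_session(ranking, relevant, p_rel=0.8, p_nonrel=0.1, rng=None):
    """Return the list of documents the simulated user clicks."""
    rng = rng or random.Random(42)  # fixed seed keeps runs reproducible
    clicks = []
    for doc in ranking:
        if rng.random() < (p_rel if doc in relevant else p_nonrel):
            clicks.append(doc)
    return clicks

# Simulated sessions let you compare two rankings cheaply, e.g. by
# averaging click counts over many runs, before any real user study.
sessions = [simulate_session(["a", "b", "c"], {"a"}) for _ in range(3)]
```

As the slide notes, this kind of fine-tuning is useful but cannot replace observing real users.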
Evaluation campaigns
Crowdsourcing works
Micro-tasks:
§ ImageCLEF (Nowak and Rüger, 2010)
§ INEX (Kazai et al., 2011)
§ TREC Blog (McCreadie et al., 2011)
§ MediaEval (Loni et al., 2013)
§ Data annotation
§ Document annotation
§ Document categorisation
§ Iterative system evaluation
Activating the crowd
§ Users may have an interest in annotating items that they know well
§ Users may be attracted by incentives to annotate items
But…
§ Personalised search needs users who follow their own information needs.
§ Users need to be driven by their own intrinsic motivation.
EXTRINSIC motivation comes from the outside.
INTRINSIC motivation exists within the individual.
Therefore…
“A living laboratory on the Web that brings researchers and searchers together is needed to facilitate ISSS (Information-Seeking Support System) evaluation.” (Kelly et al., 2009)
Living Labs for IR evaluation
Examples: local domain search, Newsreel, product search
Real users interacting with a system following their own information need
Realistic setting where users are not restricted by closed laboratory conditions
Ideally: many users to perform A/B testing
Source (guinea pig): http://living-labs.net/wp-content/uploads/2014/05/livinglab.logo_.textunder.square200.png
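With enough users, A/B testing boils down to comparing the click-through rates of two system variants. A minimal sketch using a two-proportion z-test; the click and impression counts below are invented for illustration.

```python
import math

# Sketch: compare the click-through rates of two recommenders in an
# A/B test with a two-proportion z-test. All counts are made up.
def ztest_two_proportions(clicks_a, n_a, clicks_b, n_b):
    """z-score for the difference between two observed proportions."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)  # pooled CTR under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

z = ztest_two_proportions(420, 10_000, 350, 10_000)
print(round(z, 2))  # |z| > 1.96 -> difference significant at the 5% level
```

This is why "many users" matters: with small traffic, the standard error dominates and even sizeable CTR differences stay inconclusive.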
Challenges
Privacy and security:
§ Hosting data on a secure server
§ Gaining subjects' trust
§ Coping with the need for privacy
§ Alternatives when individuals will not share their data
Legal and ethical issues:
§ User consent
§ Ethics approval
§ Trust between parties
§ Copyright issues
§ Commercial sensitivity of data
Practical challenges:
§ Forming living labs for IR partners within the research community
§ Obtaining commercial partners
§ Defining tasks and scenarios for evaluation purposes
Technical challenges:
§ Designing and implementing a living labs architecture
§ Cost of implementation
§ Maintenance and adoption
§ Managing living labs infrastructure
Source: http://living-labs.net/ll14/call-for-papers/
Overview
Part 1 (Academic Overview): Living Labs (Introduction), Living Labs for IR Research, CLEF NEWSREEL
Part 2 (Hands-on Experience)
In CLEF NEWSREEL, participants can develop news recommendation algorithms and have them tested by millions of users over the period of a few months in a living lab.
Again…
What are recommender systems?
Recommender systems help users to find items that they were not searching for.
Example: News Articles
§ First living lab for the evaluation of news recommendation algorithms in real time
§ Organised as the plista Contest, as a challenge at ACM RecSys'13, and as a campaign-style evaluation lab at CLEF'14
Source (image): T. Brodt of plista.com
Organisation (CLEF NEWSREEL)
Leading provider of a recommendation and advertisement network in Central Europe
Thousands of content providers rely on plista to generate recommendations for their customers (i.e., web users)
Application-oriented research on smart information systems
Steering committee of experts from the fields of IR and RecSys
Central Innovation Programme SME
CLEF NEWSREEL Tasks
Started in November 2013
Task 1, Offline Evaluation: given a dataset, predict news articles a user will click on
Task 2, Online Evaluation: recommend articles in real time over several months
@clefnewsreel http://www.clef-newsreel.org/
Task 1: Offline Evaluation
Predict interactions based on an OFFLINE dataset.
Dataset:
§ Traffic and content updates of 9 German-language news content provider websites
§ Traffic: reading articles, clicking on recommendations
§ Updates: adding and updating news articles
§ Recorded in June 2013
§ 65 GB, 84 million records
§ [Kille et al., 2013]
Evaluation:
§ Dataset split into different time segments
§ Participants have to predict the interactions of these segments
§ Quality measured by the ratio of successful predictions to the total number of predictions
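The quality measure above (successful predictions divided by total predictions) can be sketched in a few lines. The per-user prediction and interaction data below are invented for illustration; the real dataset records the interaction stream described on this slide.

```python
# Sketch of the Task 1 quality measure: ratio of successful predictions
# to the total number of predictions. Users and clicks are invented.
def prediction_accuracy(predicted, observed):
    """predicted/observed map each user to a set of article IDs."""
    correct = total = 0
    for user, preds in predicted.items():
        total += len(preds)
        correct += len(preds & observed.get(user, set()))
    return correct / total if total else 0.0

predicted = {"u1": {"a", "b"}, "u2": {"c"}}
observed = {"u1": {"b"}, "u2": {"c", "d"}}
print(prediction_accuracy(predicted, observed))  # 2 correct out of 3 predictions
```

Computing this per time segment and averaging gives a single offline score per participant.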
Task 2: Online Evaluation
Recommend news articles in REAL TIME.
Living lab:
§ Provide recommendations for visitors of the news portals of plista's customers
§ Ten portals (local news, sports, business, technology)
§ Communication via the Open Recommender Platform (ORP)
§ Provide recommendations in under 100 ms (VM provided if necessary)
Evaluation:
§ Three pre-defined evaluation periods: 5-23 February 2014, 1-14 April 2014, 5-19 May 2014
§ Evaluation criteria: number of clicks, number of requests, click-through rate
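One common way to stay inside a tight latency budget like the 100 ms above is to keep the recommendation list precomputed and do almost no work per request. This is a hedged sketch of a popularity baseline only; the event and request shapes are assumptions, not the actual ORP message format.

```python
from collections import Counter

# Minimal online-baseline sketch for a living-lab setting: track article
# popularity from the incoming event stream, and answer each request
# from the maintained counts so per-request work stays trivially fast.
# Not the actual ORP protocol; identifiers below are invented.
class PopularityRecommender:
    def __init__(self, k=6):
        self.k = k
        self.clicks = Counter()

    def on_click(self, item_id):
        """Update popularity counts as click events stream in."""
        self.clicks[item_id] += 1

    def recommend(self, current_item):
        """Return the k most-clicked articles, excluding the one being read."""
        ranked = [item for item, _ in self.clicks.most_common(self.k + 1)
                  if item != current_item]
        return ranked[:self.k]

rec = PopularityRecommender(k=2)
for item in ["n1", "n2", "n1", "n3", "n1", "n2"]:
    rec.on_click(item)
print(rec.recommend("n1"))  # ['n2', 'n3']
```

A real participant would also expire stale articles and handle the content-update stream, since news popularity decays quickly.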
Living Lab Scenario
Diagram: millions of visitors interact with the publishers (Publisher A … Publisher n); plista's ORP relays recommendation requests between the publishers and the participating teams (Researcher 1 … Researcher n).
Privacy and security
§ Hosting data on a secure server
§ Gaining subjects' trust
§ Coping with the need for privacy
§ Alternatives when individuals will not share their data
How NEWSREEL addresses this:
§ No search queries are provided.
§ The data stream is pseudonymized, i.e., users cannot be identified based on their IP or search queries.
Legal and ethical issues
§ User consent
§ Ethics approval
§ Trust between parties
§ Copyright issues
§ Commercial sensitivity of data
How NEWSREEL addresses this:
§ Researchers do not interact with users.
§ Business relation between plista and their customers.
§ Participants have to agree to terms before participating.
Technical challenges
§ Designing and implementing a living labs architecture
§ Cost of implementation
§ Maintenance and adoption
§ Managing living labs infrastructure
How NEWSREEL addresses this:
§ Infrastructure developed in the context of the research project EPEN.
§ Constantly monitor the system.
Practical challenges
§ Forming living labs for partners within the research community
§ Obtaining commercial partners
§ Defining tasks and scenarios for evaluation purposes
How NEWSREEL addresses this:
§ Always keep in contact with your participants.
§ Advertise.
§ Make sure no one can cheat!
§ It's a win-win-win-win situation. (-> Torben)
Acknowledgements
Co-organisers:
§ Andreas Lommatzsch
§ Benjamin Kille
§ Till Plumbaum
§ Torben Brodt
§ Tobias Heintz
Steering committee:
§ Pablo Castells
§ Paolo Cremonesi
§ Hideo Joho
§ Udo Kruschwitz
§ Joemon M. Jose
§ Mounia Lalmas
§ Martha Larson
§ Jimmy Lin
§ Vivien Petras
§ Domonkos Tikk
Frank Hopfgartner, PhD
Director of Competence Center Information Retrieval and Machine Learning
DAI-Labor (Distributed Artificial Intelligence Laboratory)
Technische Universität Berlin, Fakultät IV – Elektrotechnik & Informatik
Ernst-Reuter-Platz 7, 10587 Berlin, Germany
Fon: +49 (0) 30 / 314 – 74 | Fax: +49 (0) 30 / 314 – 74 003
www.dai-labor.de/~hopfgartner/ | frank.hopfgartner@tu-berlin.de | @OkapiBM25
Thank you