Haystack: Per-User Information Environments

Date post: 06-Jan-2016
Upload: audra
Transcript
Page 1: Haystack: Per-User Information Environments

Haystack: Per-User Information Environments

David Karger

Page 2: Haystack: Per-User Information Environments

Motivation

Page 3: Haystack: Per-User Information Environments

Web Search Tools

Indices search by keyword

Taxonomies classify by subject

Cool site of the day

A lot like libraries:

Library catalogues

Dewey Decimal

New book shelf, suggested reading

Is a universal library enough?

Page 4: Haystack: Per-User Information Environments

Library/web Limitations

Huge: too many answers, mostly irrelevant

Only published material: misses information known to few and leading-edge content

Rigid: everyone gets the same search results, even if they come back and try again

The library is the last place we look

Page 5: Haystack: Per-User Information Environments

Start with Bookshelf

I try solving problems using my data: information gathered personally; high quality and easy for me to understand; not limited to publicly available content

My organization: personal annotations and metadata; my own choice of subject arrangement; optimized for my kind of searching

Adapts to my needs

Page 6: Haystack: Per-User Information Environments

Then Turn to a Friend

Leverage: they organize information for their own use; let them find things for me too

Shared vocabulary: they know me and what I want

Personal expertise: they know things not in any library

Trust: their recommendations are good

Page 7: Haystack: Per-User Information Environments

Last to Library/web

Answer usually there, but hard to find

Wish: rearrange to suit my needs

Wish: help from my friends in looking

E.g. the New York Public Library catalogue

Page 8: Haystack: Per-User Information Environments

Lessons

Individualized access: The best tools adapt to individual ways of organizing and seeking data.

Individualized knowledge: People know much more than they publish. That knowledge is useful to them and others.

End user: understands their data the best, so should control organization and presentation

Page 9: Haystack: Per-User Information Environments

Problems with Current Tools

Applications designed by few for use by many: developers decide what information is important, provide a model to hold that information, and provide interfaces to view/manipulate it

Users discover uses/needs for other information: the tool cannot store it and cannot support interaction with it

Users discover connections between pieces of information: if the connected information is in different applications, neither app can record the connection

People could do a lot more with information if the environment let them record/use what they know

Page 10: Haystack: Per-User Information Environments

Haystack Approach

Data Model: define a rich data model that lets the user represent all interesting info; rich search capabilities; machine readable, so agents can augment/share/exchange info

User Interface: strengthen UI tools to show the rich data model to the user, and let them navigate/manipulate/share it

Adaptation: people are lazy, unwilling to “waste time” telling the system what to do, even if it could help them later; the system must introspect about user actions, deduce user needs and preferences, and self-adjust to provide better behavior

Collaboration: as the system gathers information from one user, share it with others; a rich data model maximizes useful knowledge transfer

Page 11: Haystack: Per-User Information Environments

Data Model

A semantic web of information

Page 12: Haystack: Per-User Information Environments

Motivation

A tremendous amount of information is relational

Named relationships: written by, married to, traveling to, owned by…

Collections: directories, bookmarks, menus, albums; families, workgroups

Web links

People can take huge advantage of navigating relationships

A network of relationships is much more “structured” than a textual description, but much less regular than a spreadsheet/database

Page 13: Haystack: Per-User Information Environments

The Haystack Data Model

W3C RDF/DAML standard

Arbitrary objects, connected by named links

A semantic web: links can be linked

No fixed schema: user extensible; add annotations; create brand-new attributes

[Diagram: example graph. Nodes: Doc, D. Karger, Haystack, Outstanding, HTML. Link labels: title, author, quality, says, type.]
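The data model on this slide can be illustrated with a tiny in-memory store of (subject, link, object) statements. This is only a sketch under stated assumptions, not Haystack's implementation: the `TripleStore` class, its methods, and the `ex:` identifiers are hypothetical, and because the diagram's wiring does not survive in the transcript, the node that "says" the quality statement (`ex:reviewer`) is invented for illustration.

```python
from typing import Hashable, Iterator, Tuple

Triple = Tuple[Hashable, str, Hashable]


class TripleStore:
    """Hypothetical in-memory store of (subject, link, object) statements."""

    def __init__(self) -> None:
        self._triples: set = set()

    def add(self, subject: Hashable, link: str, obj: Hashable) -> Triple:
        """Record a statement and return it, so it can itself be linked."""
        triple = (subject, link, obj)
        self._triples.add(triple)
        return triple

    def match(self, subject=None, link=None, obj=None) -> Iterator[Triple]:
        """Yield statements matching the pattern; None acts as a wildcard."""
        for s, l, o in self._triples:
            if subject is not None and s != subject:
                continue
            if link is not None and l != link:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, l, o)


store = TripleStore()
store.add("ex:doc", "title", "Haystack")
store.add("ex:doc", "author", "D. Karger")
store.add("ex:doc", "type", "HTML")

# "Links can be linked": the quality statement is the object of a "says" link.
quality = store.add("ex:doc", "quality", "Outstanding")
store.add("ex:reviewer", "says", quality)  # ex:reviewer is an assumed node

# No fixed schema: a brand-new attribute needs no prior declaration.
store.add("ex:doc", "readBy", "ex:me")

titles = [o for _, _, o in store.match(subject="ex:doc", link="title")]
```

Returning the statement from `add` is one simple way to let a link be the target of another link; real RDF expresses the same idea through reification.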

Page 14: Haystack: Per-User Information Environments

RDF Lowers Barriers

Location independent: universal locators, even for local data (as it may become non-local)

Application independent: a simple, common language suitable for a variety of information types; enables interlinking and exchange of information from all apps

Extensible: can add attributes as needed, leave them out if unimportant

Enables powerful search: based on a broad variety of attributes

Support for data agents: extract information from raw data; make it available for search and other forms of navigation

Page 15: Haystack: Per-User Information Environments

Where does data come from?

Pull from outside sources: web, databases, news feeds…

Active user input: interfaces let the user add data, note relationships

Mining data from prior data: plug-in agents opportunistically extract data

Passive observation of user: plug-ins to other interfaces record user actions

Other users
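As an illustration of "mining data from prior data", the sketch below shows what a plug-in extraction agent might do with a raw mail message: parse the headers and emit statements that a store could index. The function name, the link names (`title`, `from`, `to`), and the sample message are assumptions made for this example; only Python's standard `email` package is used.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Sample input invented for the example.
RAW_MAIL = """\
From: Alice Example <alice@example.org>
To: Bob Example <bob@example.org>, carol@example.org
Subject: Draft of the paper

Please see the attached draft.
"""


def extract_mail_triples(msg_id, raw):
    """Hypothetical agent: turn raw mail text into (subject, link, object)."""
    msg = message_from_string(raw)
    triples = [(msg_id, "type", "MailMessage")]

    subject = msg.get("Subject")
    if subject:
        triples.append((msg_id, "title", subject))

    _, sender = parseaddr(msg.get("From", ""))
    if sender:
        triples.append((msg_id, "from", sender))

    # A message may list several recipients in one or more To headers.
    for _, address in getaddresses(msg.get_all("To", [])):
        if address:
            triples.append((msg_id, "to", address))
    return triples


triples = extract_mail_triples("ex:mail1", RAW_MAIL)
```

The agent never asks the user for anything; it only derives new statements from data that is already there.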

Page 16: Haystack: Per-User Information Environments

[Architecture diagram. Components: Data Extraction Services, Web Observer, RDF Store, Mail Observer, Machine Learning Services, Web Viewer, Haystack UI, Spider.]

Page 17: Haystack: Per-User Information Environments

User Interface

Uniform Access to All Information

Page 18: Haystack: Per-User Information Environments

Current Barriers to Information Flow

Partitions by location: some data on this computer, some on that; remote access always noticeable, distracting

Partitions by application: mail reader for this, web browser for that, text editor for those; to-do list, but without needed elements

Invisibility: where did I put that file? Tendency for objects to have a single (inappropriate) location (folder)

Missing attributes: too lazy to add keywords that would aid searching later

Page 19: Haystack: Per-User Information Environments

Goal: Task-Based Interface

When working on X, all information relevant to X (and no other) should be at my fingertips

Planning the day: to-do list, news articles, urgent email, seminars

Editing a paper: relevant citations, email from coauthors, prior versions

Hacking: code modules, documentation, working notes, email threads

Location, source and format of data irrelevant

Page 20: Haystack: Per-User Information Environments

Sign of Need: Email Usage

Email as to-do list: anything not yet “done” is kept there; reminder email to ourselves

Single interface containing numerous document types

Overflowing inboxes: navigate only by brute-force scanning; unsafe to file/categorize anything (out of sight, out of mind)

Page 21: Haystack: Per-User Information Environments

Interface Options

Folders: out of sight, out of mind; still need applications to see data; which is the right folder?

Desktops: allow arbitrary data types, but coupling between applications and data types is too tight; a smear of many tasks, so hard to focus; hundreds of icons, tens of windows, huge menus; no partitioning

Databases: OK if you have a degree in database administration; interface is impoverished (long lists of tuples)

Page 22: Haystack: Per-User Information Environments

The Big Picture

Page 23: Haystack: Per-User Information Environments

User Interface Architecture

Views: data about how to display data

Views are persistent, manipulable data

[Diagram. Labels: Data to be displayed, Underlying information, UI data, Mapping, View, Mapping 2, View 2.]

Page 24: Haystack: Per-User Information Environments

Semantic User Interface

Present information by assembling different views together

Information manipulation decoupled from presentation

New views can be added without mucking with data types

New data types can be added without designing new UIs

Uniform support for features like context menus

Actions apply to objects on screen in various “roles”

E.g. as word, as title of mail message, as member of collection

View for Favorites collection

View for cnn.com

View for yahoo.com

View for ~/documents/thesis.pdf
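The role-based context menus described on this slide could be sketched as a registry of actions keyed by role, with the menu for an on-screen object built from every role it currently plays. The registry, function names, and action labels below are hypothetical and chosen only to mirror the slide's three example roles.

```python
from collections import defaultdict

# Hypothetical registry: role name -> action labels offered for that role.
ACTIONS_BY_ROLE = defaultdict(list)


def register_action(role, label):
    """Offer `label` for any on-screen object appearing in `role`."""
    ACTIONS_BY_ROLE[role].append(label)


def context_menu(roles):
    """Collect actions for every role an object plays, without duplicates."""
    menu = []
    for role in roles:
        for label in ACTIONS_BY_ROLE.get(role, []):
            if label not in menu:
                menu.append(label)
    return menu


register_action("word", "Look up in dictionary")
register_action("mail title", "Reply to message")
register_action("collection member", "Remove from collection")

# The same piece of text can be a word, a mail title, and a collection member.
menu = context_menu(["word", "mail title", "collection member"])
```

Because actions attach to roles rather than to applications, a new data type gets a usable menu as soon as it is shown in an existing role.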

Page 25: Haystack: Per-User Information Environments

Persistence of Views

Views are data like all other data: stored persistently, manipulated by the user

User can customize a view: a view for a particular task can be cloned from another; it can evolve over time to the needs of the task, to an extent previously limited to sophisticated UI designers

Views can be shared: once someone determines the “right” way to look at data, others can benefit
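One way to picture "views are data" is to hold a view as an ordinary nested record that can be cloned, customized, and serialized for storage or sharing. The field names (`shows`, `layout`, `columns`, `children`) and the `clone_view` helper are assumptions made for this sketch, and JSON stands in for whatever persistent format the real system uses.

```python
import copy
import json

# A view described purely as data (field names are illustrative).
favorites_view = {
    "shows": "ex:favorites",
    "layout": "list",
    "columns": ["title"],
    "children": [
        {"shows": "cnn.com", "layout": "thumbnail"},
    ],
}


def clone_view(view, **changes):
    """Deep-copy a view so the clone can evolve without touching the original."""
    clone = copy.deepcopy(view)
    clone.update(changes)
    return clone


# Clone a view for a new task, then customize the clone.
thesis_view = clone_view(favorites_view, shows="ex:thesis-sources")
thesis_view["columns"].append("author")

# Persisting or sharing a view is just serializing data.
serialized = json.dumps(thesis_view, sort_keys=True)
restored = json.loads(serialized)
```

The deep copy matters: a shallow copy would share the `columns` list, so customizing the clone would silently change the original view.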

Page 26: Haystack: Per-User Information Environments

Role of Schemata

Benefits: help people look at information the right way; help creators avoid creation mistakes

Risks of enforcement: deters lazy users from entering data; prevents creative users from stretching the boundaries

Is there a middle ground? Can schemata be “advisory”?

One or many? If each user makes their own schema, how do we translate?
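The slide leaves "advisory" schemata as an open question. One possible reading, sketched below, is a checker that never rejects data and only returns hints. The schema contents, the `advise` function, and the hint wording are all hypothetical.

```python
# Hypothetical advisory schema: attributes a type is expected to carry.
ADVISORY_SCHEMA = {
    "MailMessage": {"from", "to", "title"},
}


def advise(obj_type, attributes):
    """Return hints about an object; the object is accepted either way."""
    if obj_type not in ADVISORY_SCHEMA:
        return []  # unknown types are allowed and simply get no advice
    expected = ADVISORY_SCHEMA[obj_type]
    present = set(attributes)
    hints = [f"consider adding '{name}'" for name in sorted(expected - present)]
    hints += [
        f"'{name}' is not in the schema (kept anyway)"
        for name in sorted(present - expected)
    ]
    return hints


hints = advise("MailMessage", {"title": "Lunch?", "mood": "hungry"})
```

Nothing here blocks the lazy user who skips `from`, or the creative user who invents `mood`; both only receive suggestions.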

Page 27: Haystack: Per-User Information Environments

Brief look

Page 28: Haystack: Per-User Information Environments

Adaptation

Learning from the User over Time

Page 29: Haystack: Per-User Information Environments

Approach

Haystack is ideally positioned to adapt to the user

RDF data model provides a rich attribute set for learning; in particular, it can record user actions with information (the flexible UI can capture these easily)

An extensive record can be built up over time

Introspect on that information: make Haystack adapt to the needs, skills, and preferences of that user

Page 30: Haystack: Per-User Information Environments

Observe User

Instrument all interfaces, report user actions to Haystack: mail sent, files edited, web pages browsed

Discover quality: what does the user visit often?

Discover semantic relationships: what gets used at the same time?

Discover search intent: which results were actually used?
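Two of the questions on this slide ("what does the user visit often?" and "what gets used at the same time?") reduce to simple counting over a log of observed actions. The sketch below assumes a log of (session, item) pairs; the log format, the session notion, and the item identifiers are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Assumed log format: (session id, item touched), as reported by observers.
ACTION_LOG = [
    ("session-1", "ex:paper-draft"),
    ("session-1", "ex:mail-from-coauthor"),
    ("session-2", "ex:paper-draft"),
    ("session-2", "ex:mail-from-coauthor"),
    ("session-2", "ex:news-site"),
    ("session-3", "ex:paper-draft"),
]


def visit_counts(log):
    """Quality signal: how often each item is used."""
    return Counter(item for _, item in log)


def co_use_counts(log):
    """Relationship signal: how often two items appear in the same session."""
    by_session = {}
    for session, item in log:
        by_session.setdefault(session, set()).add(item)
    pairs = Counter()
    for items in by_session.values():
        for a, b in combinations(sorted(items), 2):
            pairs[(a, b)] += 1
    return pairs


visits = visit_counts(ACTION_LOG)
pairs = co_use_counts(ACTION_LOG)
```

Here the draft and the coauthor's mail co-occur in two sessions, which is the kind of evidence a system could use to suggest that the two are related.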

Page 31: Haystack: Per-User Information Environments

Learning from Queries

Searching involves a dialogue: the first query doesn’t work, so look at the results, change the query, and iterate until you home in on the desired results

Haystack remembers the dialogue: instead of the first query attempt, use the last one; record items the user picked as good matches; on future, similar searches, have a better query plus examples to compare to candidate results

Use data to modify queries to big search engines, filter results coming back
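A minimal sketch of "remembering the dialogue": store the first and last query of each search session together with the items the user picked, and when a new query resembles a remembered first attempt, offer the refined query and the picked examples instead. The `QueryMemory` class, the word-overlap (Jaccard) similarity, the 0.5 threshold, and the sample queries are all assumptions made for this example.

```python
def tokens(text):
    """Lowercased set of words in a query."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-overlap similarity between two token sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class QueryMemory:
    """Hypothetical store of past search dialogues."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.dialogues = []  # (first query, final query, picked items)

    def remember(self, queries, picked):
        """Record a dialogue: its first attempt, last attempt, and good hits."""
        self.dialogues.append((queries[0], queries[-1], list(picked)))

    def suggest(self, new_query):
        """Return (query to run, example items) for a new search."""
        best, best_score = None, 0.0
        for first, final, picked in self.dialogues:
            score = jaccard(tokens(new_query), tokens(first))
            if score > best_score:
                best, best_score = (final, picked), score
        if best is not None and best_score >= self.threshold:
            return best
        return (new_query, [])


memory = QueryMemory()
memory.remember(
    ["jaguar speed", "jaguar cat speed", "jaguar animal top speed"],
    picked=["ex:big-cats-article"],
)
suggestion = memory.suggest("speed of jaguar")
```

The picked items returned alongside the query are the "examples to compare to candidate results" mentioned on the slide.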

Page 32: Haystack: Per-User Information Environments

Mediation

Haystack can be a lens for viewing data from the rest of the world

Stored content shows what the user knows/finds useful

Selectively spider “good” sites

Filter results coming back: compare to objects the user has found useful in the past; can learn over time

Example: personalized news service
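Filtering incoming results against what the user has found useful can be sketched with a word-count profile and cosine similarity. This is a deliberately simple stand-in, not the method Haystack uses: the function names, the 0.2 threshold, and the sample texts are assumptions.

```python
import math
from collections import Counter


def term_vector(text):
    """Word counts for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_profile(useful_texts):
    """Summarize everything the user found useful as one vector."""
    profile = Counter()
    for text in useful_texts:
        profile.update(term_vector(text))
    return profile


def filter_results(results, profile, threshold=0.2):
    """Keep results similar enough to the profile, best match first."""
    scored = [(cosine(term_vector(r), profile), r) for r in results]
    kept = [(score, r) for score, r in scored if score >= threshold]
    return [r for _, r in sorted(kept, key=lambda pair: pair[0], reverse=True)]


profile = build_profile([
    "bayesian networks for data mining",
    "probabilistic models in data mining",
])
kept = filter_results(
    [
        "bayesian data mining survey",
        "celebrity gossip roundup",
        "data mining with probabilistic models",
    ],
    profile,
)
```

As the user marks more items as useful, the profile is rebuilt and the filter shifts with it, which is one simple sense of "can learn over time".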

Page 33: Haystack: Per-User Information Environments

Collaboration

Haystack’s Ulterior Motive

Page 34: Haystack: Per-User Information Environments

Hidden Knowledge

People know a lot that they are willing to share but too lazy to publish

Haystack passively collects that knowledge, without interfering with the user

Once there, share it! RDF: a uniform language for data exchange

Challenges: as people individualize systems, semantics diverge; who is the “expert” on a topic? (collaborative filtering)

Page 35: Haystack: Per-User Information Environments

Example

I want info on probabilistic models in data mining

My haystack doesn’t know, but “probability” is in lots of email I got from Tommi Jaakkola

Tommi told his haystack that “Bayesian” refers to “probability models”

Tommi has read several papers on Bayesian methods in data mining

Some are by Daphne Koller

I read/liked other work by Koller

My haystack queries “Daphne Koller Bayes” on Yahoo

Tommi’s haystack can rank the results for me…
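The chain of reasoning in this example can be traced in a few lines: translate my phrase through a friend's vocabulary, find authors of papers the friend read on that topic, and keep the ones whose work I already liked. All data structures, identifiers, and the second author below are placeholders invented for the sketch; the query string is built from the friend's own term, so it reads "Bayesian" rather than the slide's "Bayes".

```python
# What the friend's haystack is assumed to expose (all values illustrative).
friend_vocabulary = {"bayesian": "probability models"}
friend_papers = [
    {"id": "ex:paper-1", "topic": "bayesian", "author": "Daphne Koller"},
    {"id": "ex:paper-2", "topic": "bayesian", "author": "ex:other-author"},
    {"id": "ex:paper-3", "topic": "databases", "author": "ex:other-author"},
]

# What my own haystack knows about my tastes.
authors_i_liked = {"Daphne Koller"}


def terms_meaning(vocabulary, phrase):
    """Friend's terms that the friend said refer to `phrase`."""
    return sorted(term for term, meaning in vocabulary.items() if meaning == phrase)


def suggested_queries(vocabulary, papers, liked_authors, phrase):
    """Build web queries from trusted authors on the friend's matching topics."""
    queries = []
    for term in terms_meaning(vocabulary, phrase):
        authors = {p["author"] for p in papers if p["topic"] == term}
        for author in sorted(authors & liked_authors):
            queries.append(f"{author} {term.capitalize()}")
    return queries


queries = suggested_queries(
    friend_vocabulary, friend_papers, authors_i_liked, "probability models"
)
```

The last step on the slide, having the friend's haystack rank the returned results, is not covered by this sketch.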

Page 36: Haystack: Per-User Information Environments

Summary

Rich data model: lets the user represent all interesting info; supports sophisticated searches; accessible to information agents

User interface: extensibly shows the rich data model to the user; lets them navigate/manipulate it

Adaptability: the system may introspect about user actions, deduce user needs and preferences, and self-adjust to provide better behavior

Collaboration: as the system gathers information from one user, share it with others; a rich data model maximizes useful knowledge transfer

Page 37: Haystack: Per-User Information Environments

More Info

http://haystack.lcs.mit.edu/ (initial release available for download)

[email protected]

