Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Date post: 05-Dec-2014
Upload: ed-chi
Description:
This is the slide set that described and summarized the Smart eBook research we published in VAST2006. (yes, 4 years ago).
Page 1: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

2006-11-02 Ed H. Chi VAST 2006 1

Ed H. Chi, Lichan Hong, Julie Heiser, Stu Card

Palo Alto Research Center

The user study portion of this research has been funded in part by ARDA NIMD program.

Page 2: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Ed H. Chi, Lichan Hong, Julie Heiser, Stu Card, Michelle Gumbrecht

Palo Alto Research Center

The user study portion of this research has been funded in part by ARDA NIMD program.

Page 3: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Copyright 2004 PARC 2006-11-02 3 Ed H. Chi VAST 2006

Reading is an essential human activity.
•  Giant leaps forward are marked by new and better ways to find, correlate, and comprehend information.

Page 4: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Many books digitized in the digital library effort.
•  Amazon, Google, and CMU’s Million Book Project.

Intelligence analysts spend an enormous amount of time reading! [Pirolli, Lee, Card]

[Figure: two plots of RECALL against TIME with tick labels 0%, 75%, and 90%; the first shows the (current) curve, the second adds the (goal) curve.]

Page 5: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Subject Indexes as a new reading “device”
•  Invented in the 15th Century [Dewar98]
•  Method of design refined through the centuries

Peter Heylyn's 1652 Cosmographie in Four Bookes

Page 6: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Instead of generating new indexes using IR techniques, why not enhance the existing ones?

Take advantage of centuries of experience in building subject indexes.

Page 8: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Readers need help directing their attention to the passages most relevant to their topic of interest.

Idea: conceptually highlight passages and keywords that are related to user search keywords.

Page 9: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

The user first types search keywords: “anthrax symptoms”

Conceptually highlight any relevant passages and keywords

Draw user attention

Page 10: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Biohazard
•  by Ken Alibek
•  non-fiction retelling of his experiences working on biological weapons in the former Soviet Union.
•  13 index pages in two columns, consisting of 829 entries

Page 11: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Exact Matches in red

Associated Entries underlined in red
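
The two kinds of marking above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `label_index_entries`, the `associations` lookup, the 0.5 threshold, and the toy entries are all hypothetical and are not taken from the ScentIndex system or from the book's actual index.

```python
def label_index_entries(entries, keywords, associations, threshold=0.5):
    """Label each index entry as 'exact', 'associated', or 'none'.

    entries      -- index entry strings from a subject index
    keywords     -- the user's search keywords
    associations -- hypothetical mapping keyword -> {word: strength}
    threshold    -- assumed cut-off for calling a word "associated"
    """
    keywords = [k.lower() for k in keywords]
    labels = {}
    for entry in entries:
        words = entry.lower().replace(",", " ").split()
        if any(k in words for k in keywords):
            labels[entry] = "exact"        # would be shown in red
        elif any(associations.get(k, {}).get(w, 0.0) >= threshold
                 for k in keywords for w in words):
            labels[entry] = "associated"   # would be underlined in red
        else:
            labels[entry] = "none"
    return labels


# Toy data; entries and strengths are invented for illustration only.
toy_associations = {"anthrax": {"spores": 0.9, "vaccine": 0.6}}
toy_entries = ["anthrax, symptoms of", "spores, production of", "Moscow"]

for entry, label in label_index_entries(
        toy_entries, ["anthrax"], toy_associations).items():
    print(f"{label:10s} {entry}")
```

In this sketch an entry is an exact match when it contains a search keyword, and an associated entry when one of its words is strongly linked to a keyword.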

Page 16: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Goal: Compare how users find, compare, and comprehend information using the ScentIndex and 3Book versus the physical book. It’s not clear to us that the ScentIndex would be better, because:

•  Unfamiliarity with 3Book Interface (page turning, clicking on page numbers, use of search box)

•  Inability to grasp the ScentIndex concept (reorganization might be confusing, harder to read the index page on screen)

•  Readability of the screen (hard to read large amounts of text)

•  Users might be very good at using the physical book index.

Page 17: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Study Design:
•  Within-subject variables: interface condition (ScentIndex vs. Physical Book Index) and task type (find, compare, comprehend).
•  Between-subject variables: order of the interface used and expertise level.

Subjects: 16 subjects (8 experts on the content, 8 novices)

Materials: subjects used a PC with two LCD monitors, and the physical Alibek book.

Page 18: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Overall, the ScentIndex eBook performed better than the physical book.

Faster Speed:
•  Subjects using the ScentIndex were faster in completing their tasks, whether they were experts or novices, F(1,12)=12.96, p<.01.

More Accurate:
•  Answers that they provided while using the ScentIndex interface were more accurate, a marginally significant effect, F(1,12)=3.991, p=.06.

Page 20: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

“The difficulty seems to be, not so much that we publish unduly in view of the extent and variety of present-day interests, but rather that publication has been extended far beyond our present ability to make real use of the record.” --- V. Bush

Indeed, our goal is to enhance current Browsing Interfaces for more productive reading.

Page 21: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

The user study portion of this research has been funded in part by contract #MDA904-03-C-0404 to Stuart K. Card and Peter Pirolli from the Advanced Research and Development Activity, Novel Intelligence from Massive Data program. We thank Jock Mackinlay for some fruitful conversation about the interaction of the eBook; Michelle Gumbrecht and Tan Lee for running some of our experiments; Pam Desmond for help in the data analysis, and Brian Tramontana for the video production.

Contact: Ed H. Chi

Page 26: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

[Figure: processing pipeline. Nodes: Page Images, Text, Words + Locations, Sentence Structure, Word Association Matrix, Scent, Highlights, Page Textures, Renderer. Edge labels: scan, OCR, sample, extract, compute, parse.]

Page 28: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Early proposal of an indexing system: Memex [Bush45]

Electronic Books: Rocket eBook, SoftBook Reader, DigiPaper [Huttenlocher00], DjVu [DjVu Zone03], PDF [Adobe03], MS Reader [MS03].

3D Electronic Books: SGI Demo Book [SGI93], WebBook [Card96], British Library Turning the Pages [BL03], 3Book [Card03].

Computer help search systems such as Apple or Microsoft.

Google or AltaVista provide highlighting and searching.

Automatic Indexing in IR: use noun-phrases and parsers to create indexes [Wacholder01, Nevill-Manning99]. Scatter/Gather [Cutting91].

Concept similar to: Word Co-occurrence [Schuetze99], InfoScent and Spreading Activation [Chi01, Chi00].

Page 29: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

HyperText Book Systems: SuperBook [Remde87] provides a dynamic TOC with fisheye DOI.

Page 30: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Two issues:
•  The word association matrix M is calculated using a 40-word window
•  Caveat: Exact word matches do not always show up.
•  Solution: Insert large values onto the diagonal
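
As a rough sketch of the two points above, the following Python builds an association matrix M from co-occurrence counts inside a sliding window and then overwrites the diagonal with a large value so that an exact match always outranks associated words. The function names, the use of raw counts as weights, and the diagonal value are assumptions for illustration; the slides do not say how M is weighted or normalized.

```python
from collections import defaultdict

WINDOW = 40        # window size stated on the slide
DIAGONAL = 1e6     # assumed "large value"; the slides give no number


def association_matrix(words, window=WINDOW, diagonal=DIAGONAL):
    """Return a sparse symmetric matrix M as a dict of dicts.

    M[a][b] counts how often words a and b fall within `window`
    consecutive tokens of each other (raw counts are an assumption).
    """
    m = defaultdict(lambda: defaultdict(float))
    for i, a in enumerate(words):
        # Pair word i with the words that follow it inside the window.
        for b in words[i + 1:i + window]:
            if a != b:
                m[a][b] += 1.0
                m[b][a] += 1.0
    # Force exact matches to the top by overwriting the diagonal.
    for w in set(words):
        m[w][w] = diagonal
    return m


def related(m, keyword, top=3):
    """Rank words by association with `keyword`, strongest first."""
    row = m.get(keyword, {})
    return sorted(row, key=lambda w: (-row[w], w))[:top]


# Toy sentence and a small window so the effect is visible.
text = ("anthrax spores cause fever anthrax symptoms "
        "include fever and cough").split()
M = association_matrix(text, window=4)
print(related(M, "anthrax"))
```

Without the diagonal step, a keyword has no entry for itself in its own row, which is one way exact matches could fail to show up in a ranking built from M.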

Page 31: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

An experimenter without prior knowledge of how the ScentIndex system works devised a total of 12 tasks. The tasks were divided into two groups of six tasks each.

•  Half of the subjects received one set of questions first; the other half received the other set first.

•  Tasks from one group were designed to be one-to-one equivalents of the other group.

•  Of the six tasks in each group, two were Simple Fact Retrieval questions, two were Dispersed Comparison questions, and two were Comprehension questions.

Page 32: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Initial Survey (on computing and search experiences).

4 expert and 4 novice subjects used the Book interface first, and the other eight used the ScentIndex first.

Subjects were trained to use the ScentIndex immediately before they needed to use it.

Each task was given on a separate sheet of paper. Subjects were asked to read and understand each question completely before starting the task.

•  Time limit for each task (simple retrieval=2min, comparison=4min, comprehension=6min) with one-minute warnings.

•  For each interface, subjects performed the simple fact retrievals first, the dispersed comparisons second, and the comprehension questions last.

Post Survey (comments, preferences).
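
The counterbalancing described in this procedure (4 experts and 4 novices starting with the Book, the other eight starting with the ScentIndex) can be sketched as follows. The subject IDs and the simple alternation rule are hypothetical; the slides do not say how individual subjects were assigned to an order.

```python
from itertools import cycle


def assign_orders(subjects):
    """Alternate interface order within each expertise group.

    subjects -- list of (subject_id, expertise) tuples
    Returns {subject_id: (first_interface, second_interface)}.
    """
    orders = {}
    for level in ("expert", "novice"):
        group = [sid for sid, expertise in subjects if expertise == level]
        # Alternation is an assumed assignment rule, not the study's.
        order_cycle = cycle([("Book", "ScentIndex"), ("ScentIndex", "Book")])
        for sid, order in zip(group, order_cycle):
            orders[sid] = order
    return orders


# Hypothetical IDs: E1..E8 are experts, N1..N8 are novices.
subjects = ([(f"E{i}", "expert") for i in range(1, 9)]
            + [(f"N{i}", "novice") for i in range(1, 9)])
orders = assign_orders(subjects)
book_first = [sid for sid, order in orders.items() if order[0] == "Book"]
print(len(book_first), "subjects start with the Book")
```

Balancing the order within each expertise group keeps interface order from being confounded with expertise.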

Page 33: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Simple Fact Retrieval:
•  The last naturally occurring case of WHICH virus occurred in Somalia in 1977?
•  Who received a state award for developing a Q fever weapon?

Dispersed Comparison:
•  What is the death rate of smallpox and tularemia? Which virus has a higher death rate?
•  What year did Russia open negotiations with Iraq for large fermentation vessels? What year did Vladimir Kryuchkov become chairman of the KGB? Which occurred first?

Comprehension:
•  Pasechnik’s defection to the West had grave implications for the Soviet biowarfare program. Match the person with the fact that describes how they’re involved:
•  Persons: Frolov, Chernyayev, Karpov, Vinogradov
•  Facts: A. First told Alibek about Pasechnik’s defection. B. Deputy minister who refused to sign formal diplomatic reply. C. Given demarche that said US have “new information”, presumably given by Pasechnik. D. Told American visitors that Pasechnik’s jetstream milling machine was for “salt”.

Page 34: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Time to Completion

Participants using the ScentIndex interface performed tasks faster than those using the Book, F(1,12)=12.96, p<.01.

Many tasks were not completed in the time allotted using the Book interface.
•  6/7 for simple retrieval,
•  7/8 for comparison,
•  3/5 for comprehension.

[Figure: ln(completion time) in seconds, on an axis from 3 to 6, for Simple, Dispersed, and Comprehension tasks; series: Book and ScentIndex.]

Page 35: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

A natural log transformation was applied to the completion times.

As predicted, experts performed tasks faster than novices overall

•  (Expert Mean=4.85, S.D.=.212, Novice Mean=4.58, S.D.=.212, F(1,12)=17.7, p<.01.)

•  There were no interactions.

Simple Retrieval < Dispersed Comparison < Comprehension, F(2,24)=204, p<.01.

[Figure: ln(completion time) in seconds by task type (Simple, Dispersed, Comprehension), on an axis from 0 to 6; data labels 4.987, 5.435, 3.722.]
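
To make the log scale easier to read, the chart values 3.722, 4.987, and 5.435 can be converted back to seconds. The sketch below assumes, from the stated ordering Simple Retrieval < Dispersed Comparison < Comprehension, that the three values belong to those task types in that order, and that each value is a mean of ln(seconds). Under that assumption, exponentiating gives a geometric mean of the completion times rather than an arithmetic mean.

```python
import math

# ln(completion time in seconds) as shown on the chart. The pairing of
# values with task types is an assumption based on the stated ordering.
log_means = {
    "Simple Retrieval": 3.722,
    "Dispersed Comparison": 4.987,
    "Comprehension": 5.435,
}

# exp() of a mean of logs is the geometric mean of the raw times.
geometric_means = {task: math.exp(v) for task, v in log_means.items()}

for task, seconds in geometric_means.items():
    print(f"{task}: about {seconds:.0f} s")
```

All three back-transformed values fall inside the 2, 4, and 6 minute limits set for the corresponding task types, which is consistent with the assumed pairing.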

Page 36: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Converted the scores for each task to a percentage.

We found that users performed better using the ScentIndex, reaching marginal significance F(1,12)=3.991, p=.06. We found no difference between experts and novices.

(measured in points)

                    Simple Retrieval     Dispersed Comparison   Comprehension
ScentIndex eBook    M=1.88, S.D.=.342    M=1.88, S.D.=.269      M=1.77, S.D.=.284
Book                M=1.75, S.D.=.447    M=1.58, S.D.=.516      M=1.84, S.D.=.259

Page 37: Smart eBooks: ScentIndex and ScentHighlight research published at VAST2006

Users overwhelmingly preferred the ScentIndex interface (15/16)
•  “can search using keyword combinations”
•  “clicking on page number to navigate”
•  “highlighting enables faster scanning and skimming”
•  “easier to compare index entries because it’s all on 1 page.”

Some users mentioned that they prefer paper for extensive reading.

Potential Future work:
•  Compare with search engines (organize results by relevancy).
•  Understand difference between this technique and keyword finding.
•  Limit the page number list of each relevant index entry to only the pages that are relevant to the keywords specified.

