2006-11-02 Ed H. Chi VAST 2006 1
Ed H. Chi, Lichan Hong, Julie Heiser, Stu Card
Palo Alto Research Center
The user study portion of this research has been funded in part by the ARDA NIMD program.
Ed H. Chi, Lichan Hong, Julie Heiser, Stu Card, Michelle Gumbrecht
Palo Alto Research Center
The user study portion of this research has been funded in part by the ARDA NIMD program.
Copyright 2004 PARC 2006-11-02 3 Ed H. Chi VAST 2006
Reading is an essential human activity.
• Giant leaps forward are marked by new and better ways to find, correlate, and comprehend information.
Many books are being digitized in digital library efforts.
• Amazon, Google, and CMU’s Million Book Project.
Intelligence analysts spend an enormous amount of time reading! [Pirolli, Lee, Card]
[Figure: two plots with axes TIME and RECALL and tick marks at 0%, 75%, and 90%; the first shows the (current) curve, the second adds the (goal) curve.]
Subject indexes as a new reading “device”
• Invented in the 15th century [Dewar98]
• Method of design refined through the centuries
Peter Heylyn's 1652 Cosmographie in Four Bookes
Instead of generating new indexes using IR techniques, why not enhance the existing ones?
Take advantage of centuries of experience in building subject indexes.
Readers need help directing their attention to the passages most relevant to their topic of interest.
Idea: conceptually highlight passages and keywords that are related to user search keywords.
The user first types search keywords: “anthrax symptoms”
Conceptually highlight any relevant passages and keywords
Draw user attention
Biohazard
• by Ken Alibek
• non-fiction retelling of his experiences working on biological weapons in the former Soviet Union
• 13 index pages in two columns, consisting of 829 entries
Exact Matches in red
Associated Entries underlined in red
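A minimal sketch of how index entries might be sorted into the two highlight classes above. The association scores, the threshold, and the name classify_entries are illustrative assumptions, not taken from the ScentIndex implementation:

```python
import re

def classify_entries(entries, query_words, assoc, threshold=0.5):
    """Label each index entry as 'exact', 'associated', or None.

    assoc is a hypothetical dict mapping (query_word, entry_word) to an
    association score; threshold is an arbitrary illustrative cutoff.
    """
    query = {w.lower() for w in query_words}
    labels = {}
    for entry in entries:
        words = re.findall(r"[a-z]+", entry.lower())
        if any(w in query for w in words):
            labels[entry] = "exact"        # would be drawn in red
        elif any(assoc.get((q, w), 0.0) >= threshold
                 for q in query for w in words):
            labels[entry] = "associated"   # would be underlined in red
        else:
            labels[entry] = None           # left unhighlighted
    return labels

# Made-up scores for illustration only.
assoc = {("anthrax", "fever"): 0.8, ("symptoms", "fever"): 0.9}
labels = classify_entries(["anthrax", "fever", "Moscow"],
                          ["anthrax", "symptoms"], assoc)
```

With these made-up scores, "anthrax" is labeled as an exact match, "fever" as an associated entry, and "Moscow" is left unhighlighted.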
Goal: Compare how users find, compare, and comprehend information using the ScentIndex and 3Book versus the physical book.
It’s not clear to us that the ScentIndex would be better, because:
• Unfamiliarity with 3Book Interface (page turning, clicking on page numbers, use of search box)
• Inability to grasp the ScentIndex concept (reorganization might be confusing, harder to read the index page on screen)
• Readability of the Screen (hard to read a large number of pieces of text)
• Users might be very good at using the physical book index.
Study Design:
• Within-subject variables: interface condition (ScentIndex vs. Physical Book Index) and task type (find, compare, comprehend).
• Between-subject variables: order of the interface used and expertise level.
Subjects: 16 subjects (8 experts on the content, 8 novices).
Materials: subjects used a PC with two LCD monitors, and the physical Alibek book.
Overall, the ScentIndex eBook outperformed the physical book.
Faster Speed:
• Subjects using the ScentIndex completed their tasks faster, regardless of whether they were experts or novices, F(1,12)=12.96, p<.01.
More Accurate:
• Answers provided while using the ScentIndex interface were more accurate, a marginally significant difference, F(1,12)=3.991, p=.06.
“The difficulty seems to be, not so much that we publish unduly in view of the extent and variety of present-day interests, but rather that publication has been extended far beyond our present ability to make real use of the record.” --- V. Bush
Indeed, our goal is to enhance current Browsing Interfaces for more productive reading.
The user study portion of this research has been funded in part by contract #MDA904-03-C-0404 to Stuart K. Card and Peter Pirolli from the Advanced Research and Development Activity, Novel Intelligence from Massive Data program. We thank Jock Mackinlay for some fruitful conversation about the interaction of the eBook; Michelle Gumbrecht and Tan Lee for running some of our experiments; Pam Desmond for help in the data analysis, and Brian Tramontana for the video production.
Contact: Ed H. Chi ([email protected])
[Figure: system pipeline diagram. Nodes: Page Images, Text, Words + Locations, Word Association Matrix, Sentence Structure, Scent Highlights, Page Textures, Renderer. Edge labels: scan, OCR, sample, extract, compute, parse.]
Early proposal of an indexing system: Memex [Bush45].
Electronic Books: Rocket eBook, SoftBook Reader, DigiPaper [Huttenlocher00], DjVu [DjVu Zone03], PDF [Adobe03], MS Reader [MS03].
3D Electronic Books: SGI Demo Book [SGI93], WebBook [Card96], British Library Turning the Pages [BL03], 3Book [Card03].
Computer help search systems, such as those from Apple or Microsoft.
Google and AltaVista provide highlighting and searching.
Automatic Indexing in IR: use noun phrases and parsers to create indexes [Wacholder01, Nevill-Manning99]. Scatter/Gather [Cutting91].
Concepts similar to: Word Co-occurrence [Schuetze99], InfoScent and Spreading Activation [Chi01, Chi00].
HyperText Book Systems: SuperBook [Remde87] provides a dynamic TOC with fisheye DOI.
Two issues:
• M is calculated using a 40-word window
• Caveat: exact word matches do not always show up.
• Solution: insert large values onto the diagonal
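A minimal sketch of the two points above, assuming M denotes a word association (co-occurrence) matrix built from the book text. The counting scheme, the function name association_matrix, and the choice of "one more than the largest off-diagonal count" as the large diagonal value are illustrative assumptions, not the authors' implementation:

```python
from collections import defaultdict

def association_matrix(tokens, window=40, diagonal=None):
    """Count how often two different words fall within the same window.

    Returns a dict of dicts M, where M[a][b] is the number of times
    words a and b occur fewer than `window` positions apart.
    """
    M = defaultdict(lambda: defaultdict(float))
    for i, a in enumerate(tokens):
        # Words at positions i .. i + window - 1 share one window.
        for b in tokens[i + 1 : i + window]:
            if a != b:
                M[a][b] += 1.0
                M[b][a] += 1.0
    # Assumed fix for the caveat: give every word a large association
    # with itself so that exact matches always rank at the top.
    largest = max((v for row in M.values() for v in row.values()),
                  default=0.0)
    for w in set(tokens):
        M[w][w] = largest + 1.0 if diagonal is None else diagonal
    return M

# Toy example with a 3-word window instead of 40.
tokens = "anthrax causes fever anthrax spores".split()
M = association_matrix(tokens, window=3)
```

Without the diagonal step, a word's association with itself would be zero in this counting scheme, which is one way an exact match could fail to show up.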
An experimenter without prior knowledge of how the ScentIndex system works devised a total of 12 tasks. The tasks were divided into two groups of six tasks each.
• Between the two sets of questions, half of the subjects received one set first; the other half received the other set first.
• Tasks from one group were designed to be one-to-one equivalents of the other group.
• Within each group of six tasks, two were Simple Fact Retrieval questions, two were Dispersed Comparison questions, and two were Comprehension questions.
Initial Survey (on computing and search experiences).
4 expert and 4 novice subjects used the Book interface first, and the other eight used the ScentIndex first.
Subjects were trained to use the ScentIndex immediately before they needed to use it.
Each task was given on a separate sheet of paper.
Subjects were asked to read and understand each question completely before starting the task.
• Time limit for each task (simple retrieval=2min, comparison=4min, comprehension=6min) with one-minute warnings.
• For each interface, subjects performed the simple fact retrievals first, the dispersed comparisons second, and the comprehension questions last.
Post Survey (comments, preferences).
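The counterbalancing described in this procedure can be sketched as follows. Alternating the interface order within each expertise group is an assumption made for illustration (the slides do not say how subjects were assigned), and the subject IDs and the name assign_orders are hypothetical:

```python
from itertools import cycle

# Time limits in minutes and task order, as stated in the procedure.
TIME_LIMITS_MIN = {"simple retrieval": 2, "comparison": 4, "comprehension": 6}
TASK_ORDER = ["simple retrieval", "comparison", "comprehension"]

def assign_orders(subjects):
    """Map each subject ID to an interface order, balanced by expertise.

    subjects is a list of (subject_id, expertise) tuples. Within each
    expertise group the two interface orders simply alternate.
    """
    by_group = {}
    for subject_id, expertise in subjects:
        by_group.setdefault(expertise, []).append(subject_id)
    orders = {}
    for ids in by_group.values():
        alternating = cycle([("Book", "ScentIndex"), ("ScentIndex", "Book")])
        for subject_id, order in zip(ids, alternating):
            orders[subject_id] = order
    return orders

# Hypothetical IDs for 8 experts and 8 novices.
subjects = [(f"E{i}", "expert") for i in range(8)] + \
           [(f"N{i}", "novice") for i in range(8)]
orders = assign_orders(subjects)
```

With 8 subjects per group, this yields 4 experts and 4 novices who start with the Book interface, matching the split reported above.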
Simple Fact Retrieval:
• The last naturally occurring case of WHICH virus occurred in Somalia in 1977?
• Who received a state award for developing a Q fever weapon?
Dispersed Comparison:
• What is the death rate of smallpox and tularemia? Which virus has a higher death rate?
• What year did Russia open negotiations with Iraq for large fermentation vessels? What year did Vladimir Kryuchkov become chairman of the KGB? Which occurred first?
Comprehension:
• Pasechnik’s defection to the West had grave implications for the Soviet biowarfare program. Match the person with the fact that describes how they’re involved:
• Persons: Frolov, Chernyayev, Karpov, Vinogradov
• Facts: A. First told Alibek about Pasechnik’s defection. B. Deputy minister who refused to sign formal diplomatic reply. C. Given demarche that said US have “new information”, presumably given by Pasechnik. D. Told American visitors that Pasechnik’s jetstream milling machine was for “salt”.
Time to Completion
Participants using the ScentIndex interface performed tasks faster than those using the Book, F(1,12)=12.96, p<.01.
Many tasks were not completed in the time allotted using the Book interface:
• 6/7 for simple retrieval,
• 7/8 for comparison,
• 3/5 for comprehension.
[Figure: ln(completion time) in seconds by task type (Simple, Dispersed, Comprehension) for Book and ScentIndex; y-axis from 3 to 6.]
Natural log transformation applied to the completion time.
As predicted, experts performed tasks faster than novices overall
• (Expert Mean=4.85, S.D.=.212, Novice Mean=4.58, S.D.=.212, F(1,12)=17.7, p<.01.)
• There were no interactions.
Simple Retrieval < Dispersed Comparison < Comprehension, F(2,24)=204, p<.01.
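As a rough illustration of the log transformation mentioned above, the sketch below averages ln(seconds) and converts the result back to seconds. The times are made-up values, not study data, and the helper names are hypothetical:

```python
import math
from statistics import mean

def mean_log_time(times_sec):
    """Mean of the natural-log-transformed completion times."""
    return mean(math.log(t) for t in times_sec)

def back_to_seconds(mean_log):
    """Invert the transform; this gives the geometric mean in seconds."""
    return math.exp(mean_log)

# Illustrative completion times in seconds (not from the study).
times = [30.0, 45.0, 60.0, 240.0]
log_mean = mean_log_time(times)
typical = back_to_seconds(log_mean)
```

Averaging on the log scale reduces the influence of a few very slow trials, which is a common reason for transforming completion times before an analysis of variance.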
[Figure: bar chart of ln(completion time) in seconds by task type (Simple, Dispersed, Comprehension); bar labels 3.722, 4.987, 5.435; y-axis from 0 to 6.]
Converted the scores for each task to a percentage.
We found that users performed better using the ScentIndex, reaching marginal significance F(1,12)=3.991, p=.06. We found no difference between experts and novices.
Accuracy (measured in points):
                   Simple Retrieval     Dispersed Comparison   Comprehension
ScentIndex eBook   M=1.88, S.D.=.342    M=1.88, S.D.=.269      M=1.77, S.D.=.284
Book               M=1.75, S.D.=.447    M=1.58, S.D.=.516      M=1.84, S.D.=.259
Users overwhelmingly preferred the ScentIndex interface (15/16)
• “can search using keyword combinations”
• “clicking on page number to navigate”
• “highlighting enables faster scanning and skimming”
• “easier to compare index entries because it’s all on 1 page.”
Some users mentioned that they prefer paper for extensive reading.
Potential Future work:
• Compare with search engines (organize results by relevancy).
• Understand difference between this technique and keyword finding.
• Limit the page number list of each relevant index entry to only the pages that are relevant to the keywords specified.