Date post: 16-Jan-2016
Page 1: INFO624 -- Week 9 Effective Information Retrieval Dr. Xia Lin Assistant Professor College of Information Science and Technology Drexel University.

INFO624 -- Week 9
Effective Information Retrieval

Dr. Xia Lin
Assistant Professor
College of Information Science and Technology
Drexel University

Effective Information Retrieval

System’s perspectives
• Fast indexing and retrieval algorithms
    • Inverted indexing, tree structures, hash tables
• Semantic indexing and mapping
    • Subject indexing
    • Latent semantic indexing
• Intelligent information retrieval
    • Knowledge representation
    • Logical inferences

Effective Information Retrieval

User’s perspectives
• Iteration
• Relevance feedback
• Use of user profiles
• Graphical display of search results
• Browsing/interactive searching

We can’t change the user. We should make the system adapt to the user’s needs.

Iteration

Most searches need to be done iteratively.

From the user’s point of view:
• The first query often does not retrieve what the user wants.
• The user needs to see the output of previous queries to construct the next query.
• The user often needs to reconstruct his/her information needs after reading or browsing search results.

Iteration – User’s strategies

Modify queries repeatedly based on some goals:
• Starting with high precision
    • Use a specific query first
    • Broaden queries to include more relevant documents
    • "pearl growing"
• Starting with high recall
    • Use a very broad query
    • Improve precision gradually
    • "onion peeling"
• Starting with known items
    • Find documents similar to the known items
• Browsing/interactive searching

Iteration – System’s strategies

• If the system can “learn” from the user’s activities, it is likely to retrieve better results that meet the user’s needs.
    • Relevance feedback
    • User profiles
• The system should provide better output representations to help the user
    • Browse
    • Conduct interactive searches

Relevance Feedback

• Feedback: the user provides information that the system can use to modify its next search or next display.
• Relevance feedback: users let the system know
    • which documents are relevant to their information needs
    • which concepts or terms are related to their information needs
    • what weights they would like the system to put on each relevant document or term

Relevance Feedback – System’s Strategy

• The system should invite the user to select relevant documents/terms from the retrieved results before the second retrieval is conducted.
• The system should use information from the user's feedback to conduct the next search.

Design IR Systems with relevance feedback

• Collect relevance feedback through
    • Binary vs. scales
    • Positive and negative feedback
• Apply relevance feedback to
    • Query
    • Profile
    • Document
    • Retrieval algorithm
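
To make "apply relevance feedback to the query" concrete, here is a minimal sketch in the style of the Rocchio formula. The slides do not name a specific method, so the choice of Rocchio, the function name rocchio, the parameter values (alpha, beta, gamma), and the toy term weights are all assumptions for illustration.

```python
from collections import defaultdict


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query (term -> weight) from user feedback.

    query and every document are dicts mapping term -> weight.
    alpha, beta, gamma are illustrative values, not taken from the lecture.
    """
    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += alpha * weight
    # Positive feedback: move the query toward the average relevant document.
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += beta * weight / len(relevant)
    # Negative feedback: move it away from the average non-relevant document.
    for doc in nonrelevant:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight / len(nonrelevant)
    # Terms whose weight is not positive are dropped from the new query.
    return {t: w for t, w in new_query.items() if w > 0}


query = {"jaguar": 1.0}
relevant = [{"jaguar": 1.0, "cat": 1.0}]      # marked relevant by the user
nonrelevant = [{"jaguar": 1.0, "car": 1.0}]   # marked not relevant by the user
new_query = rocchio(query, relevant, nonrelevant)
print(new_query)
```

In this toy run the term "cat" enters the query because it occurs in a document the user marked relevant, while "car" is pushed below zero and dropped. Binary feedback fits directly; scaled feedback could be handled by weighting each document's contribution, which this sketch does not do.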

User Profiles

• User profiles
    • Information about the user’s information needs that the IR system can use to modify its search process.
• Simple user profiles
    • A list of terms that the user selects to represent his/her information needs
    • A list of terms with weights

• Extended user profiles
    • More complex term structures
    • Information use patterns
    • Levels of interest
    • User’s background information
    • User’s browsing behaviors
        • Which pages the user has visited in the last week, last month, …
        • From which page to which page …

Use of user Profiles

• Selective Dissemination of Information (SDI)
    • The system regularly runs the search to get any new information that matches the user’s profiles.
    • The user can set up several profiles.
        • Once they are set up, the queries are always the same.
    • The user can set the frequency of the update searches.
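
The SDI cycle described above can be sketched as a stored set of profile queries that is re-run against whatever has been added since the last run. This is only an illustration: the function name run_sdi, the document fields (title, added, terms), and the sample data are hypothetical, and matching is reduced to simple term overlap.

```python
from datetime import date


def run_sdi(profiles, documents, last_run):
    """Return {profile name: [titles]} for documents added after last_run.

    profiles:  dict mapping a profile name to a set of terms (the static query)
    documents: list of dicts with 'title', 'added' (a date) and 'terms' (a set)
    """
    alerts = {}
    for name, terms in profiles.items():
        alerts[name] = [
            doc["title"]
            for doc in documents
            # Only new items count, and they must share a term with the profile.
            if doc["added"] > last_run and terms & doc["terms"]
        ]
    return alerts


# Hypothetical profiles and documents, for illustration only.
profiles = {"ir": {"retrieval", "indexing"}, "viz": {"visualization"}}
documents = [
    {"title": "Doc A", "added": date(2024, 1, 5), "terms": {"retrieval", "feedback"}},
    {"title": "Doc B", "added": date(2024, 1, 20), "terms": {"visualization"}},
    {"title": "Doc C", "added": date(2024, 1, 25), "terms": {"indexing", "visualization"}},
]
alerts = run_sdi(profiles, documents, last_run=date(2024, 1, 10))
print(alerts)
```

The update frequency the user chooses would decide how often run_sdi is called and what last_run is on each call; the profile queries themselves stay the same between runs.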

SDI

• Advantages of SDI
    • Automatic retrieval of new information for the user
    • Set up a profile once, use the profile for retrieval many times.
    • The user can change the profiles or the search frequency as needed.
• Disadvantages of SDI
    • The query based on the profile is static.
    • Timing problems
        • Information in need is information indeed.
        • Something I am very interested in may not arrive at the time I want to read it.

Use profiles during the search

• Modify the query
    • When the user sends a query, the system automatically adds some terms to the query from the user’s profiles.
    • When the user sends a query, the system checks whether the query terms are in the user’s profile. If they are, it increases the weights of those terms.
• Organize the search results
    • When the user sends a query, the system uses the profile information to organize the search results (such as clustering or ranking).
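
The two query-modification strategies above can be sketched together, assuming a simple profile stored as a list of terms with weights. The function name apply_profile, the boost factor, the number of added terms, and the sample profile are assumptions made for this example.

```python
def apply_profile(query_terms, profile, boost=2.0, expand_top=1):
    """Return a weighted query (term -> weight) adjusted by a user profile.

    query_terms: list of terms typed by the user (each starts at weight 1.0)
    profile:     dict mapping profile term -> weight
    boost and expand_top are illustrative settings, not values from the lecture.
    """
    weights = {term: 1.0 for term in query_terms}
    # Increase the weight of query terms that also appear in the profile.
    for term in query_terms:
        if term in profile:
            weights[term] *= boost
    # Add the highest-weighted profile terms that the query does not contain.
    candidates = [t for t in profile if t not in weights]
    candidates.sort(key=lambda t: profile[t], reverse=True)
    for term in candidates[:expand_top]:
        weights[term] = profile[term]
    return weights


# Hypothetical profile, for illustration only.
profile = {"retrieval": 0.9, "visualization": 0.6, "indexing": 0.3}
weighted_query = apply_profile(["information", "retrieval"], profile)
print(weighted_query)
```

Here "retrieval" is boosted because it is in the profile, and "visualization" is added as the strongest profile term missing from the query. Organizing results by profile (clustering or ranking) would be a separate step that this sketch does not cover.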

Browsing

Browsing is an act of human information seeking:
• a mental process of identifying and choosing information
• a dynamic process that varies in time and depends on intermediate results
• a part of the process of decision making, problem solving, etc.

Browsing for Information Retrieval

• A kind of searching process in which the initial search criteria or goals are only partly defined
    • general-purpose web browsing
• An art of not knowing what one wants until one finds it
    • visual recognition
    • content recognition

Browsing for Information Retrieval

• A learning activity that emphasizes structures and interactive processes
    • exploratory
    • movements based on feedback
• A process of finding and navigating in an unknown or unfamiliar information space
    • becoming aware of new contents
    • finding unexpected results

Search or Browse?

Would you like to search using a search engine, or would you like to browse from page to page (or through a hierarchy)? What does it depend on?

Factors of browsing

• Purposes
    • Fact retrieval
    • Concept formation or interpretation
    • Current awareness
• Tasks
    • Well-defined tasks
    • Ill-defined tasks
    • Number of items to browse

Factors of browsing

• Individual characteristics
    • Motivation
    • Experience and knowledge
    • Cognitive styles
• Context
    • Subject disciplines
    • Organizational schemes
    • Nature of text/information
• Medium
    • Does the system support browsing?

IR Systems that support browsing

• Good navigation tools
    • Easy to move from one item to another
    • Links
        • good structures
        • fast access
    • Easy to backtrack
        • correct any errors
        • make new selections

IR Systems that support browsing

• Good displays
    • easy to read
    • meaningful orders of retrieval results
    • graphical presentation
• Meaningful content organization
    • contextual hierarchical structures
    • grouping of related items
    • contextual landmarks

“why just browse when you can fly?”

HotSauce is an innovative 3D fly-through interface for navigating information spaces. It was developed, largely as a one-man effort, by Ramanathan V. Guha while at Apple Research in the mid-1990s. HotSauce was a specific 3D spatialization of the Meta Content Framework (MCF), also developed by Guha.

HotSauce

Why Surf alone?

What if you had an assistant always looking ahead for you [when browsing the web]….
• The assistant could warn you if the page was irrelevant, could alert you if that link or some other link merited your attention.
• The assistant could save you time and frustration.

CACM, 44(8), p. 71, 2001

Information Agents

• Software that applies user profiles, dynamically and intelligently, to search tasks
• Search distributed, possibly heterogeneous information resources on the user’s behalf.
• Gather and integrate search results by some Artificial Intelligence techniques
• Accept the user’s feedback and use the feedback to modify the user profiles and search strategies

Architecting Browsable Websites

• Design site structures
    • Metaphor exploration
        • Organizational metaphors
        • Functional metaphors
        • Visual metaphors
    • Define navigation
        • Global navigation
        • Local navigation
    • Design document

Interactive Systems

“When an interactive system is well-designed, the interface almost disappears, enabling users to concentrate on their work, exploration, or pleasure.”

– Ben Shneiderman

Design Principles

• Offer informative feedback
    • Relationships between query and documents retrieved
    • Relationships among retrieved documents
    • Relationships between metadata and documents
• Reduce working memory load
    • Keep track of choices made during the search process
    • Allow the user to return to temporarily abandoned strategies or jump from one strategy to another
    • Retain information and context across search sessions

• Provide alternative interfaces for novice and expert users.
    • Simplicity vs. power

Output Presentation for Search engines

• Two major issues
    • What information to present?
    • How to organize the output items?
• Information in the output display
    • Traditional databases
        • Document reference numbers (unique number)
        • Citations (author, title, source)
        • Document surrogate (citation plus abstract and/or indexing terms)
        • Full text

• On the web
    • Title, URL
    • First few sentences/related sentences/summaries
    • Dates / page sizes
    • Degree of relevance
    • Special links
        • “find similar one”
    • Types of links
    • Related categories

• What other information might you wish to have in the retrieval output?
    • Citations (or links from this document)?
    • Critique or evaluation?
    • Access information (how many times it was accessed in the last 6 months)?
    • Links to this document?
    • Author contact information?
    • Why documents were retrieved?

Output organization

• Linear
    • a list of documents
    • listed by
        • best match
        • alphabetical order
        • dates
        • order of selected fields (authors, titles, web sites)
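
The orderings listed above amount to sorting the same result list by different keys. A small sketch, assuming each result is a record with title, author, date (an ISO-format string) and a relevance score; the function name order_results and the placeholder records are hypothetical.

```python
def order_results(results, by="best match"):
    """Return the results as a linear list in the requested order."""
    if by == "best match":
        # Highest relevance score first.
        return sorted(results, key=lambda r: r["score"], reverse=True)
    if by == "date":
        # Newest first; ISO date strings sort chronologically.
        return sorted(results, key=lambda r: r["date"], reverse=True)
    if by in ("title", "author"):
        # Alphabetical order on the selected field, ignoring case.
        return sorted(results, key=lambda r: r[by].lower())
    raise ValueError(f"unknown ordering: {by}")


# Placeholder records, for illustration only.
results = [
    {"title": "Doc B", "author": "Author Y", "date": "2003-04-01", "score": 0.62},
    {"title": "Doc A", "author": "Author Z", "date": "2001-09-15", "score": 0.91},
    {"title": "Doc C", "author": "Author X", "date": "2004-01-30", "score": 0.47},
]
for ordering in ("best match", "date", "title", "author"):
    print(ordering, [r["title"] for r in order_results(results, ordering)])
```

Whatever key is chosen, the output is still a single sequence, so relationships between documents are not visible in it.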

• Linear display
    • Practical and most popular
        • easy to generate
        • users know how to use it
    • Does not show relationships among documents!
        • Document relationships are more complex than a linear one

• Hierarchical display
    • Separate data into different levels or branches
    • Branches can be expanded/collapsed.
    • Show more data in less space
    • Show the organization of the data
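
A hierarchical display with expandable branches can be sketched as a tree that is rendered only down to the branches the user has opened. The (label, children) tuple representation, the function name render, the "[+]"/"[-]" markers, and the sample tree are assumptions made for this example.

```python
def render(node, expanded, depth=0):
    """Return display lines for a tree, showing children of expanded nodes only.

    node:     a (label, children) pair, where children is a list of such pairs
    expanded: set of labels whose branches are currently open
    """
    label, children = node
    if not children:
        marker = ""                      # leaf: nothing to expand
    elif label in expanded:
        marker = "[-] "                  # open branch
    else:
        marker = "[+] "                  # collapsed branch
    lines = ["  " * depth + marker + label]
    if label in expanded:
        for child in children:
            lines.extend(render(child, expanded, depth + 1))
    return lines


# Hypothetical grouping of search results, for illustration only.
tree = ("Results", [
    ("Indexing", [("doc 1", []), ("doc 2", [])]),
    ("Visualization", [("doc 3", [])]),
])
print("\n".join(render(tree, expanded={"Results", "Indexing"})))
```

Collapsed branches take one line each, which is how the display fits more data into less space while still showing how the data is organized.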

• Graphical displays
    • Show more complex relationships
    • Use location, colors, dimensions, etc. to represent documents, terms or concepts.
    • Provide more interactive functions

What is IV?

The use of computer-supported, interactive, visual representations of abstract data
• to assist navigation in large information spaces
• to reveal complex information structures
• to amplify cognition

System-centered View

User-centered

IV and IR

• Both need to process a large amount of information
• Both are tools to assist the cognitive process of finding, learning, and understanding information.
• Both face the challenge of “uncertainty”
    • Not an “exact science”
• Both are subject to human interpretation.

VIRI -- Visual Information Retrieval Interfaces

• 2-dimensional graphical display
    • use graphical objects (icons, dots, etc.) to represent documents
    • use geographical relationships to indicate document relationships
    • use colors to group/differentiate documents
    • use animation to assist interaction

Concept Visualization

• AltaVista LiveTopic
• HiBrowse Interface
• SemioMap
• Hyperbolic Trees
• Visual Thesaurus
• Visual Concept Explorer

Alta Vista’s LiveTopic

ConceptSpace

HiBrowse Interface

SemioMap

Inxight.com

Topic Maps

Highwire: http://www.highwire.org

Visual Thesaurus

Visual Concept Explorer

Concept Mapping

MedLine Search

IBM – Visualization Space

• This information system understands the user.

• It "hears" users' voice commands and "sees" their gestures and body positions. Interactions are natural, more like human-to-human interactions.

Visual Search Engines

• TheBrain
• Mooter
• Kartoo
• MapStan
• Grokker
• TouchGraph
• Starrynight
• NewsLink

WebBrain: http://www.webbrain.com/

Mooter: http://www.mooter.com/

Kartoo: http://www.kartoo.com/

MapStan: http://search.mapstan.com/

Grokker: http://www.groxis.com/service/grok/g_products.html

Touchgraph: http://www.touchgraph.com/

Starrynight from RHIZOME

Galaxy of News (Rennison 95)

Galaxy of News (Rennison 95)

Map of Information Scientists

Author Mapping

AuthorLink

NewsLink

• Integrate and cross mapping
    • Mapping on topics; displaying by people
    • Mapping on people; displaying by organization
    • Etc.
• NewsLink: http://project.cis.drexel.edu/lexislink/

Discussion

- Information Visualization
  - What works and what does not?

Page 70:

VIRI

Advantages
- More representational power
  - show more information in a limited screen space
  - many different ways to group documents
  - can put both keywords and documents in the same 2-dimensional space
- Provide good overview
- Provide more interaction

Page 71:

VIRI

Disadvantages
- Difficult to generate
- Not always easy to understand
- May not be specific enough
- Hard to use

Page 72:

Evaluation of IR Systems

- Using Recall & Precision
  - Conduct query searches
    - Try many different queries
    - Results may depend on sampling queries.
  - Compare results of Precision & Recall
- Recall & Precision need to be considered together.
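
The point that recall and precision need to be considered together can be illustrated with a minimal Python sketch. It uses the standard set-based definitions (precision = relevant retrieved / retrieved, recall = relevant retrieved / relevant). The function name `precision_recall`, the query labels, and the document ids are hypothetical and are not taken from the slides.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids returned by the system
    relevant:  document ids judged relevant for the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up runs for two sample queries: (retrieved ids, relevant ids).
runs = {
    "q1": (["d1", "d2", "d3", "d4"], ["d1", "d3", "d7"]),
    "q2": (["d5"], ["d5", "d6", "d8", "d9"]),
}
results = {q: precision_recall(ret, rel) for q, (ret, rel) in runs.items()}

for query, (p, r) in results.items():
    print(f"{query}: precision={p:.2f} recall={r:.2f}")
```

In this made-up data, q2 has perfect precision but finds only one of four relevant documents, which is why neither number is informative on its own.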

Page 73:

How to calculate Recall

- Determine recall for the whole collection
  - Take a random sample to estimate
  - Use a broad query to select a sample collection for the estimation
  - Use “seed” documents
- Use relative recall
  - Use two or more expert searches as the base.
  - Use one system as the base to estimate recall on other systems
- Use a small test collection
  - Use experts to judge relevance of every document.
  - Prepare special collections.
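
As a rough illustration of the relative-recall option, the sketch below treats the union of relevant documents found by several expert searches as the base, then measures what fraction of that base a system retrieves. This is only one reading of "relative recall"; the function name `relative_recall` and all document ids are assumptions made for the example.

```python
def relative_recall(system_retrieved, expert_relevant_sets):
    """Estimate recall against a base built from expert searches.

    system_retrieved:     document ids returned by the system under test
    expert_relevant_sets: one set of relevant ids per expert search
    """
    base = set()
    for found in expert_relevant_sets:
        base |= set(found)  # pool everything the experts judged relevant
    if not base:
        return 0.0
    return len(set(system_retrieved) & base) / len(base)


# Made-up example: two expert searches form the base.
expert_a = {"d1", "d2", "d3"}
expert_b = {"d2", "d4"}
system = {"d1", "d4", "d5"}

score = relative_recall(system, [expert_a, expert_b])
print(f"relative recall = {score:.2f}")
```

The result is only as good as the base: relevant documents that no expert search found are invisible to this estimate.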

Page 74:

Functionalities

- Precision and Recall are particularly useful for evaluating searching/indexing algorithms and system features.
  - Compare P & R with and without fuzzy search.
  - Compare P & R with different types of indexing options
  - Compare P & R across systems with the same features
- Precision and Recall are query-oriented, not system-oriented.
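
Because precision and recall are computed per query, comparing a feature such as fuzzy search usually means averaging over a set of test queries. The sketch below shows one way to do that, assuming a simple macro-average. The names `average_pr`, `exact_runs`, and `fuzzy_runs` and all the data are invented for illustration.

```python
from statistics import mean


def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_pr(runs, judgments):
    """Macro-average precision and recall over the queries in `judgments`."""
    pairs = [precision_recall(runs[q], judgments[q]) for q in judgments]
    return mean(p for p, _ in pairs), mean(r for _, r in pairs)


# Made-up relevance judgments and two runs of the same queries.
judgments = {"q1": {"d1", "d2"}, "q2": {"d3", "d4"}}
exact_runs = {"q1": ["d1"], "q2": ["d3"]}
fuzzy_runs = {"q1": ["d1", "d2", "d9", "d8"], "q2": ["d3", "d4", "d7", "d6"]}

exact_avg = average_pr(exact_runs, judgments)
fuzzy_avg = average_pr(fuzzy_runs, judgments)
print("exact:", exact_avg)
print("fuzzy:", fuzzy_avg)
```

In this invented data, fuzzy matching raises average recall at the cost of precision. Which setting is better depends on the queries sampled and on what the users need.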

Page 75:

Evaluation without P & R

- The emphasis should be on the user and the interaction.
  - Be specific on data collection
    - How are data collected and indexed?
    - Is there a quality control for the data collection?
  - Be creative on the test questions and methods
    - Not just questionnaires
  - Be selective on subject groups

Page 76:

Quality Evaluation

- Data quality
  - Coverage of database
    - It will not be found if it is not in the database.
  - Completeness and accuracy of data
- Indexing methods and indexing quality
  - It will not be found if it is not indexed.
  - indexing types
  - currency of indexing (Is it updated often?)
  - indexing sizes

Page 77:

Interface Consideration

- User friendly interface
  - How long does it take for a user to learn advanced features?
  - How well can the user explore or interact with the query output?
  - How easy is it to customize output displays?

Page 78:

User Satisfaction

- User satisfaction
  - The final test is the user!
  - User satisfaction is more important than precision and recall
- Measuring user satisfaction
  - Survey
  - Use statistics
  - User experiments

Page 79:

User Experiments

- Observe and collect data on
  - System behaviors
  - User search behaviors
  - User-system interaction
- Interpret experiment results
  - for system comparisons
  - for understanding user’s information seeking behaviors
  - for developing new retrieval systems/interfaces

