
Endeca @ NCSU Libraries Andrew Pace & Emily Lynema NCSU Libraries May 24, 2006.


Endeca @ NCSU Libraries

Andrew Pace & Emily Lynema

NCSU Libraries

May 24, 2006

Technical Overview

Endeca Information Access Platform co-exists with SirsiDynix Unicorn ILS and Web2 online catalog.

Endeca indexes MARC records exported from Unicorn.

Index is refreshed nightly with records added/updated during previous day.

Endeca IAP Overview

[Diagram: data flow. Raw MARC data -> NCSU exports and reformats -> Flat text files -> Data Foundry (parse text files) -> Indices -> MDEX Engine -> (HTTP) -> NCSU Web Application -> (HTTP) -> Client browser. The Data Foundry and MDEX Engine stages are labeled "Endeca Information Access Platform". Successive builds of the slide label one part of the flow "Offline - Nightly" and another part "Always Online".]

Integrating Endeca

Endeca doesn’t understand MARC data / MARC-8 character encoding – translate to UTF-8 text files

Each night a script updates the data indexed by Endeca:
– Exports updated or new MARC records from Unicorn.
– Reformats and merges these records with those already indexed.
– Starts Endeca re-index – completely rebuilding index for the catalog.

Process requires about 7 hours.

Retain Web2 OPAC for some functionality
– Authority searching - known items and cross-references
– Detailed record pages – how to make Endeca -> Web2 link?
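
The "reformats and merges" step of the nightly update can be sketched roughly as below. This is only an illustration, not NCSU's actual script: the tab-delimited record layout, the record IDs, and the function names `parse_flat_records` and `merge_records` are all assumptions made for the example, since the slides do not describe the real flat-file format.

```python
# Hypothetical sketch of the nightly merge step: updated or new records
# replace or extend the set already indexed, keyed by record ID.
# The "id<TAB>field=value" layout is an assumption for illustration only.


def parse_flat_records(text):
    """Map record ID -> raw record line for one flat text export."""
    records = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        record_id, _, _ = line.partition("\t")
        records[record_id] = line
    return records


def merge_records(indexed_text, updated_text):
    """Return the full record set to hand to a complete re-index."""
    merged = parse_flat_records(indexed_text)
    # Later exports win: an updated record overwrites the older copy,
    # and a new record is simply added.
    merged.update(parse_flat_records(updated_text))
    return "\n".join(merged[key] for key in sorted(merged)) + "\n"


# Made-up sample data: one unchanged, one updated, one new record.
indexed = "b1001\ttitle=Cat health\nb1002\ttitle=Gene therapy\n"
updated = "b1002\ttitle=Gene therapy, 2nd ed.\nb1003\ttitle=Faceted search\n"
merged_output = merge_records(indexed, updated)
print(merged_output, end="")
```

Because the whole catalog is re-indexed each night, the sketch always emits the complete merged set rather than a delta.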

Integrating Endeca - Future

MarcAdapter plugin for raw MARC data.
– Create local field mappings and special handlers in Java.
– Eliminate need for external MARC 21 translation and file merging.

Partial Updates
– Update circulation data multiple times throughout the day.

Quick Demo

http://catalog.lib.ncsu.edu

Some Search Statistics (March 2006)

Requests by Search Type (pie chart)
Search: 55%
Navigation: 15%
Search -> Navigation: 30%

Searches by Search Key (bar chart; x-axis: Search Key, y-axis: Requests, 0 to 80000)
Search keys: Keyword, ISBN, Title, Author, Subject, Multi-Field
Bar values: 74971, 32776, 13563, 9872, 5838, 1141

Some Navigation Statistics (March 2006)

[Bar chart: Navigation by Dimensions; axes: Dimension, Requests (0 to 60000). The same request counts are tabulated under Navigation Statistics (II).]

Navigation Statistics (II) (March 2006)

Dimension            Requests   Order (on page)
LC Classification    49931      2
Subject: Topic       44197      3
Library              23291      6
Format               20867      5
Author               17939      10
Subject: Genre       17720      4
Subject: Region      13607      7
Language             8653       9
Subject: Era         7451       8
Availability         6790       1
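
One way to read the table above is to ask how closely usage follows position on the page. The sketch below is not part of the original presentation. It computes a Spearman rank correlation between usage rank and page order, using only the figures from the table; the choice of statistic and the function names are assumptions for illustration.

```python
# Requests per dimension and on-page order, copied from the table above.
navigation = {
    "LC Classification": (49931, 2),
    "Subject: Topic": (44197, 3),
    "Library": (23291, 6),
    "Format": (20867, 5),
    "Author": (17939, 10),
    "Subject: Genre": (17720, 4),
    "Subject: Region": (13607, 7),
    "Language": (8653, 9),
    "Subject: Era": (7451, 8),
    "Availability": (6790, 1),
}


def ranks_by_requests(data):
    """Rank dimensions 1..n by descending request count."""
    ordered = sorted(data, key=lambda name: data[name][0], reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def spearman_rho(data):
    """Spearman rank correlation between usage rank and page order.

    Uses 1 - 6 * sum(d^2) / (n * (n^2 - 1)), which is valid here
    because neither ranking contains ties.
    """
    usage_rank = ranks_by_requests(data)
    n = len(data)
    d_squared = sum((usage_rank[name] - data[name][1]) ** 2 for name in data)
    return 1 - 6 * d_squared / (n * (n * n - 1))


total_requests = sum(requests for requests, _ in navigation.values())
rho = spearman_rho(navigation)
print(f"total navigation requests: {total_requests}")
print(f"Spearman rho (usage rank vs. page order): {rho:.3f}")
```

With the March 2006 figures this gives a total of 210446 requests and rho of about 0.25, a weak association: Availability is listed first on the page but used least, while Author is listed last yet ranks fifth in use.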

Other interesting tidbits… (March 2006)

Authority searching decreased 45%.

Keyword searching increased 230%.
– Caveat: default catalog search changed from title authority to keyword.

~ 6% of keyword searches offered spelling correction or suggestion
– 3.6% - automatic spell correction
– 2.6% - “Did you mean…” suggestion

Usability Testing

10 undergraduate students
– 5 with Endeca catalog
– 5 with old Web2 OPAC

Endeca performed as well as OPAC for known-item searching in usability test
– 89% Endeca tasks completed ‘easily’ (8/9)
– 71% OPAC tasks completed ‘easily’ (15/21)

Endeca performed better than OPAC for topical searching in usability test.

Topical Searching Tasks

Topical Task Success: Web2 (pie chart)
Easy: 36%
Medium: 7%
Hard: 23%
Failed: 34%

Topical Task Success: Endeca (pie chart)
Easy: 58%
Medium: 17%
Hard: 3%
Failed: 22%

[Bar chart: Average Topical Task Duration, Web2 vs. Endeca, for Tasks 5 through 10; time axis from 00:00.0 to 03:36.0]

Usability Testing Trends

Relevance *most* important
– “Once I scroll through a page, I get pretty discouraged about the results...”
  Web2 OPAC participant looking for resources on cat health

‘Keyword’ term less intuitive / trusted than ‘Subject’ and ‘Title’
– “[I used] Keyword in Title because that’s what I want the book to be mainly referring to. But I also could’ve went Keyword in Subject. But if I’d have went Keyword Anywhere it would have had too big of a field to look through.”
  Web2 OPAC participant looking for resources on gene therapy

When found, dimensions seem intuitive and useful

‘Did you mean’ seems intuitive

A study in relevance

Are search results in Endeca more likely to be relevant to a user’s query than search results in Web2 OPAC?

100 topical user searches from 1 month in fall 2005

How many of top 5 results relevant?
– 40% relevant in Web2 OPAC
– 68% relevant in Endeca catalog
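
The measure used in this study, the share of the top five results judged relevant, is commonly called precision at 5. A minimal sketch of the calculation follows. The relevance judgments in it are made up for illustration, because the slides report only the aggregate percentages, and the function names are assumptions.

```python
def precision_at_k(judgments, k=5):
    """Fraction of the first k results judged relevant.

    `judgments` is a list of booleans in result order. Dividing by k
    (not by the number of results returned) counts missing results
    as non-relevant.
    """
    top = judgments[:k]
    return sum(top) / k


def mean_precision_at_k(all_judgments, k=5):
    """Average precision-at-k over a set of queries."""
    return sum(precision_at_k(j, k) for j in all_judgments) / len(all_judgments)


# Made-up judgments for three queries (True = relevant).
sample = [
    [True, True, False, True, False],
    [False, True, False, False, False],
    [True, True, True, True, False],
]
score = mean_precision_at_k(sample)
print(f"mean precision at 5: {score:.0%}")
```

Running the same calculation over the judged results for all 100 queries in each catalog would yield figures comparable to the 40% and 68% reported above.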

Relevance defined

Relevance ranking in Endeca – select from a variety of modules and order them based on importance.

Relevance most important in Keyword Anywhere - searches all fields.

At NCSU…
1. Original query term(s) (no thesaurus, stemming, spell correction)
2. Exact phrase match
3. Field ranking (Title higher than Author higher than Table of Contents)
4. Number of fields that contain term(s)
…
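
Ordering ranking modules by importance behaves like a lexicographic sort: a later module only breaks ties left by the earlier ones. The sketch below illustrates that idea with the four criteria listed above. It is not Endeca's API or configuration; the record fields, the substring matching, and the names `FIELD_PRIORITY`, `relevance_key`, and `rank` are all assumptions chosen for the example.

```python
# Lower number = more important field, following "Title higher than
# Author higher than Table of Contents" on the slide.
FIELD_PRIORITY = {"title": 0, "author": 1, "toc": 2}


def relevance_key(record, query):
    """Build a sort key; earlier tuple elements dominate later ones."""
    phrase = query.lower()
    terms = phrase.split()
    fields = {name: record.get(name, "").lower() for name in FIELD_PRIORITY}
    # Fields containing every literal query term (simple substring test;
    # no thesaurus, stemming, or spell correction is applied here).
    matching = [n for n, text in fields.items() if all(t in text for t in terms)]
    has_original_terms = bool(matching)
    has_exact_phrase = any(phrase in fields[n] for n in matching)
    best_field = min((FIELD_PRIORITY[n] for n in matching),
                     default=len(FIELD_PRIORITY))
    # sorted() is ascending, so "better" must compare as smaller.
    return (not has_original_terms, not has_exact_phrase,
            best_field, -len(matching))


def rank(records, query):
    """Return records ordered from most to least relevant."""
    return sorted(records, key=lambda r: relevance_key(r, query))


# Made-up records for illustration.
records = [
    {"id": "a", "title": "Advances in therapy", "toc": "Gene delivery"},
    {"id": "b", "title": "Gene therapy", "author": "Lee"},
    {"id": "c", "title": "Therapy for gene disorders"},
    {"id": "d", "author": "Gene Therapy Working Group"},
]
ranked_ids = [r["id"] for r in rank(records, "gene therapy")]
print(ranked_ids)
```

In this toy data an exact phrase in the Author field outranks a Title that merely contains both words, because the exact-phrase criterion is consulted before field ranking.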

Future Plans

Ongoing tweaks:
– Continued usability testing
– Relevance ranking algorithms & spell correction thresholds
– Additional browsing options

Endeca 2.0 ideas
– FRBR-ized display
– Discussions with OCLC regarding FAST (Faceted Access to Subject Terms) and FRBR
– Patron-generated refinements (folksonomies?)
– Enrich records with supplemental Web Services content – more usable TOCs, book reviews, etc.
– The death of authority searching (?)
– More integration with QuickSearch, other data repositories, and third-party discovery tools

Thanks

http://www.lib.ncsu.edu/endeca

Andrew Pace, Head, IT
[email protected]

Emily Lynema, Systems Librarian for Digital Projects
[email protected]

