Discovery to Delivery: Solutions to Put Your Content Where the Users Are, November 2-3, 2006
Taking D2D Services to the Users with
OpenURL, RSS, and OAI-PMH
Chuck Koscher
Technology Director, CrossRef
Scholarly Publishing Trends
• Everything is online – if it’s not online, it doesn’t exist
• Everything is interlinked – if it’s not linked, it doesn’t exist
• Breaking of barriers between academic and consumer behavior – user expectations are set by Google, eBay, etc.
• Journal brand strong but moving to article economy
• Economic models changing – Open Access
• Technical reports and other grey literature are now findable
• Books going online
“Find-ability precedes usability, you can not use what you can not find” (STM-TMR 2006, Amanda Spiteri, Marketing Director, Elsevier)
One window: Your Web page
Getting noticed requires a store window
Users must know the URL
Content may be indexed by a search engine
User must read their RSS feeds
User might have brand affinity
… but there are billions of web pages
There are lots of windows
…among others
Metadata distribution via standardized methods is the bridge to these windows for your content

RSS
Strength: Wide adoption, great support, browser integrations, mass-user appeal.
Complexity: Simple to create and distribute. Just create an XML file and put it on your web server.
Targeted use: Distribution of ‘newsy’ data, most often for human consumption.

OpenURL
Strength: All-inclusive specification, well positioned for advanced or diverse applications.
Complexity: Syntax ranges from simple to complex; only the more basic examples are human readable. Software implementation can be complex, with lots of decision paths.
Targeted use: Distribution of metadata or content of individual items, most likely implemented as part of a linking system.

OAI-PMH
Strength: Robust, well-thought-out transaction model. Very extensible and adaptable. Widespread adoption within the industry.
Complexity: Implementation is moderate to complex. Good frameworks (OCLC) available. Requires substantial resources (compute and human) for any non-trivial repository.
Targeted use: Distribution of large volumes of metadata, most likely to automated harvesters.
OpenURL is packaging
OpenURL is a transport syntax (a box), a way to send MetaData
ContextObject is an internal wrapper (box within a box)
Complexity stems from the number of ways you can accomplish the same task: send metadata to a service (a resolver)
[Diagram: several nestings of the same pieces. Labels: OpenURL, ContextObject, MetaData, referent reference, context reference]
http://www.crossref.org/openurl?url_ver=Z39.88-2004&rft_id=info:doi/10.1361/15477020418786&noredirect=true
OpenURL basic example
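A link like the basic example can be assembled in a few lines. The following is a minimal sketch using only Python's standard library; the helper name build_doi_openurl is hypothetical, the parameters are copied from the example URL, and the reading of noredirect=true as "return metadata rather than redirect" is an assumption, not something the slide states.

```python
from urllib.parse import urlencode

# Resolver base URL as shown on the slide.
RESOLVER = "http://www.crossref.org/openurl"


def build_doi_openurl(doi, redirect=False):
    """Build a minimal identifier-based OpenURL for a DOI (hypothetical helper)."""
    params = [
        ("url_ver", "Z39.88-2004"),      # OpenURL version tag
        ("rft_id", "info:doi/" + doi),   # the referent, identified by its DOI
    ]
    if not redirect:
        # Assumption: asks the resolver not to redirect to the publisher page.
        params.append(("noredirect", "true"))
    # Keep ":" and "/" unescaped so the result matches the slide's URL.
    return RESOLVER + "?" + urlencode(params, safe=":/")


print(build_doi_openurl("10.1361/15477020418786"))
```

Leaving ":" and "/" unescaped is a readability choice; a stricter client could percent-encode them.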
http://www.crossref.org/openurl?url_ver=Z39.88-2004&url_tim=2004-01-09&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_id=345871&ctx_tim=2002-03-20T08:55:12Z&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Isolation+of+a+common+receptor+for+coxsackie+B&rft.jtitle=Science&rft.aulast=Bergelson&rft.auinit=J&rft.date=1997&rft.volume=275&rft.spage=1320&rft.epage=1323&rfe_val_fmt=info:ofi/fmt:kev:mtx:journal&rfe.atitle=p27-p16+Chimera:+A+Superior+Antiproliferative&rfe.jtitle=Molecular+Therapy&rfe.aulast=McArthur&rfe.aufirst=James&rfe.date=2001&rfe.volume=3&rfe.issue=1&rfe.spage=8&rfe.epage=13&req_ref_fmt=http://lib.caltech.edu/fmt/ldap-mtx.html&req_ref=http://ldap.caltech.edu/janed/record.txt
OpenURL: In-Line context object example
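The inline example is easier to read once its key/value pairs are grouped by prefix: url (transport), ctx (the ContextObject itself), rft (referent), rfe (referring entity), and req (requester). The sketch below does that grouping with Python's standard library. The function name group_kev is hypothetical, and the sample URL is a shortened copy of the one above.

```python
from urllib.parse import urlsplit, parse_qsl


def group_kev(url):
    """Group the key/value pairs of an inline OpenURL by entity prefix."""
    groups = {}
    for key, value in parse_qsl(urlsplit(url).query):
        # "rft.atitle" and "rft_val_fmt" both belong to the "rft" group.
        prefix = key.replace(".", "_").split("_", 1)[0]
        groups.setdefault(prefix, {})[key] = value
    return groups


# Shortened version of the slide's inline example.
sample = (
    "http://www.crossref.org/openurl?url_ver=Z39.88-2004"
    "&rft_val_fmt=info:ofi/fmt:kev:mtx:journal"
    "&rft.jtitle=Science&rft.aulast=Bergelson&rft.date=1997"
    "&rft.volume=275&rft.spage=1320"
    "&rfe.jtitle=Molecular+Therapy&rfe.aulast=McArthur"
)

groups = group_kev(sample)
print(sorted(groups))                # the prefixes found in the URL
print(groups["rft"]["rft.aulast"])   # author surname of the referent
print(groups["rfe"]["rfe.jtitle"])   # journal of the referring entity
```

A real resolver has many more decision paths (by-reference metadata, other formats); this only illustrates the by-value inline case.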
http://www.crossref.org/openurl
NISO Z39.88-2004 OpenURL is a very comprehensive framework!
CrossRef implemented the San Antonio Profile #1
The basic inline by-value model might address a high percentage of actual needs
By consolidating metadata in one place (CrossRef), publishers have created an ideal circumstance for a single resolver to reach a large amount of content.
An OpenURL ‘solution’ is not embodied in a single place. It is a community of contributors using a common language. OpenURL is the Esperanto of linking.
No CrossRef account needed, available free to the public
Number of resolutions in 2006 => 608,756
OAI-PMH is a set of commands (verbs) used to pull metadata from a compliant repository

Verb: Identify
Use: Ask a repository to tell you about itself.
Example: oai.crossref.org/OAIHandler?verb=Identify

Verb: ListMetadataFormats
Use: Ask a repository which formats (XML schemas) its data is available in. Compliant repositories support Dublin Core.
Example: oai.crossref.org/OAIHandler?verb=ListMetadataFormats

Verb: ListSets
Use: Ask a repository to list the hierarchical structure it uses to organize itself.
Example: oai.crossref.org/OAIHandler?verb=ListSets

Verb: ListIdentifiers
Use: Ask a repository to list the identifiers in the whole repository or in a particular set.
Example: oai.crossref.org/OAIHandler?verb=ListIdentifiers&metadataPrefix=cr_unixml

Verb: ListRecords
Use: Ask a repository to return the metadata for all records in the repository or for those in a given set.
Example: oai.crossref.org/OAIHandler?verb=ListRecords&metadataPrefix=cr_unixml&set=10.1002:300:1999

Verb: GetRecord
Use: Ask the repository for the metadata of a given identifier.
Example: oai.crossref.org/OAIHandler?verb=GetRecord&metadataPrefix=cr_unixml&identifier=info:doi/10.1002/jnr.490010101
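Because every OAI-PMH request is just a verb plus a few arguments, a harvester's request builder can be very small. The following is a rough sketch in standard-library Python; the helper name oai_request and the http:// scheme are assumptions, while the host, path, verbs, and argument values come from the examples above.

```python
from urllib.parse import urlencode

# Host and path from the examples; the http:// scheme is an assumption.
BASE_URL = "http://oai.crossref.org/OAIHandler"

# The six verbs defined by OAI-PMH.
VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request(verb, **arguments):
    """Return the request URL for an OAI-PMH verb (hypothetical helper)."""
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = [("verb", verb)] + list(arguments.items())
    # Keep ":" and "/" readable, as in the slide's examples.
    return BASE_URL + "?" + urlencode(query, safe=":/")


print(oai_request("Identify"))
print(oai_request("ListRecords", metadataPrefix="cr_unixml", set="10.1002:300:1999"))
print(oai_request("GetRecord", metadataPrefix="cr_unixml",
                  identifier="info:doi/10.1002/jnr.490010101"))
```

Rejecting unknown verbs locally is a design choice: it catches typos before a request is sent, instead of relying on the repository's error response.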
OAI-PMH sample responses - Identify
OAI-PMH sample responses
ListSets
ListSets&resumptionToken=1160597811347!698!205002
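Large responses are delivered in pages: when a list is incomplete, the response carries a resumptionToken that the harvester sends back to get the next page. Below is a minimal sketch of that step using Python's standard library. The sample XML is invented for illustration (only the token value is taken from the slide), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented, heavily trimmed ListSets response; only the token is from the slide.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>10.1002:297</setSpec><setName>Example title A</setName></set>
    <set><setSpec>10.1002:300</setSpec><setName>Example title B</setName></set>
    <resumptionToken>1160597811347!698!205002</resumptionToken>
  </ListSets>
</OAI-PMH>"""


def parse_list_sets(xml_text):
    """Return (list of setSpec values, resumption token or None)."""
    root = ET.fromstring(xml_text)
    specs = [el.text for el in root.findall(".//oai:set/oai:setSpec", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty token means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return specs, token


def next_page_query(token):
    """Build the query string for the follow-up request."""
    return urlencode({"verb": "ListSets", "resumptionToken": token}, safe="!")


specs, token = parse_list_sets(SAMPLE)
print(specs)
print(next_page_query(token))
```

A harvester would repeat this until no token comes back; retries and error handling are left out of the sketch.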
OAI-PMH sample response
verb=GetRecord&metadataPrefix=cr_unixml&identifier=info:doi/10.1002/jnr.490010101
OAI-PMH sample response
verb=ListIdentifiers&metadataPrefix=cr_unixml&set=10.1002:297:2004
CrossRef’s OAI-PMH Mission
December 2005: CrossRef announced a Web Services initiative
Provide a central point for the distribution of metadata from 100s of publishers, for millions of identifiers
Utilize common/existing distribution protocols and technology
Targeted at consumers of mass quantities of metadata.
Active: MS Academic Live and Scirus (search engines)
Looking: EBSCO, European Bioinformatics Institute, others…
Is not ‘open’ (i.e. it is not free); uses IP authentication for access control
Recipient identified by 2 IP address ranges
Content can be selectively mapped to a recipient (opt-in/opt-out) at the publisher or title level
RSS
CrossRef is not currently operating any RSS feeds (we have Blogs which are kinda sorta the same thing)
Members view RSS feeds as a way to reach out and touch end users and bring them to the member’s site
For end users:
OpenURL is like plumbing (“Intel inside”), they really don’t care
OAI-PMH is a what?
RSS they’ve probably heard of (blogs) and may even know how to use
CrossRef members have recognized the need to establish guidelines on content composition by feed type.
e.g. a TOC feed should be organized the same way from one publisher to the next in order to avoid end user confusion.
(a NISO initiative?)
RSS syndication
Of course RSS is used for syndication as well
Example:
Syndication feed: Google accepts RSS (Real Simple Syndication) 2.0 and Atom 0.3 feeds. Generally, you would use this format only if your site already has a syndication feed. Note that this method may not let Google know about all the URLs in your site, since the feed may only provide information on recent URLs.
…Google uses the <link> field in your feed to gather URLs from your site and uses the modified date field (the <pubDate> field for RSS feeds and the <modified> date for Atom feeds) to learn when each URL was last modified … Make sure that the feed is located in the highest-level directory you want search engines to crawl
http://www.google.com/support/webmasters/bin/answer.py?answer=34656&ctx=sibling
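Since the crawler reads the <link> and <pubDate> fields, a feed only needs those populated correctly to be useful for discovery. The sketch below writes a minimal RSS 2.0 document with Python's standard library. The function name build_rss and the journal.example.org URLs are placeholders, not real feeds.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(channel_title, channel_link, items):
    """Return a minimal RSS 2.0 document as a string.

    items is a list of (title, link, modified) tuples, where modified is
    a timezone-aware datetime.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_title
    for title, link, modified in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        # pubDate uses the RFC 822 style date format.
        ET.SubElement(item, "pubDate").text = format_datetime(modified)
    return ET.tostring(rss, encoding="unicode")


# Placeholder data; the URLs are hypothetical.
feed = build_rss(
    "Example Journal: latest articles",
    "http://journal.example.org/",
    [("Sample article", "http://journal.example.org/articles/1",
      datetime(2006, 11, 2, 9, 0, tzinfo=timezone.utc))],
)
print(feed)
```

Saving this output as a file on the web server is all the distribution RSS needs, which is why it rates as the simplest of the three methods.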
Conclusion
Bringing users to content requires metadata distribution
Be complete (article title, all authors, citations)
Be accurate (author = given name + surname, not the entire byline)
Use a widely accepted (and expressive) format: NLM, DC, CrossRef
Position metadata for discovery
Aggregated distribution like CrossRef’s PMH service
Register as a PMH data provider (http://www.openarchives.org/data/registerasprovider.html)
Find syndication channels (syndication.iop.org, Feedzilla, MedicineNet)