NISO Webinar: Authority Control: Are You Who We Say You Are?

Post on 15-Jul-2015


NISO Webinar Authority Control:

Are You Who We Say You Are?

Wednesday, February 11, 2015

Speakers:

Simeon Warner, Director of Repository Development, Cornell University Library

Laura Dawson, Product Manager, ProQuest

Thomas Hickey, Chief Scientist, OCLC

http://www.niso.org/news/events/2015/webinars/authority_control/

ORCID identifiers in research workflows

Simeon Warner, Cornell University Library

with thanks to

Laure Haak, ORCID Executive Director and

Josh Brown, ORCID Regional Director, Europe

for slides and comments


“Use ORCID iDs in research workflows to solve name ambiguity and save everyone a bunch of effort!”

ORCID background

• open - anyone can register, any organization with an interest in research and scholarly communications can join, iDs are intended for reuse, software is open source

• non-profit - incorporated in USA, also ORCID EU

• community-driven - where community includes all sectors of the research process, including publishers, funders, universities, and the researchers themselves

two core functions:

1. a registry that provides unique identifiers and manages a record of activities

2. APIs that support system-to-system communication and authentication

see: http://orcid.org/content/initiative

ORCID status and adoption

A little over 2 years since launch: over 1.1M iDs created and over 190 members from all sectors and around the world.

[Chart: ORCID iDs by month, October 2012 to August 2014, by creation method (Creator, Website, Trusted Party); vertical axis from 0 to 900,000]

[Chart: members by sector]

• Publishing: 25%

• Universities & Research Orgs: 45%

• Funders: 7%

• Associations: 12%

• Repositories & Profile Sys: 11%

[Chart: members by region]

• EMEA: 35%

• Americas: 50%

• AsiaPac: 15%

National integrations and membership

http://openaccess.blogg.kb.se/2013/01/30/slutrapport-fran-projekt-forfattarindentifikatorer/

http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/researchinformation/orcid.aspx

http://orcid.org/blog/2014/09/03/denmark-adopts-orcid-consortium-approach-orcid-implementation

http://orcidpilot.jiscinvolve.org/wp/

ORCID Scope

ORCID = Open RESEARCHER AND CONTRIBUTOR Identifier

o Research activities

o Living people

o Researchers are a smaller population than the people and personas covered by ISNI or VIAF

CONTRIBUTOR -- ORCID is intended to be used for the full spectrum of actors in the research process, not just authors, and records roles.

o Already supports roles like translator, principal investigator

o 2012 Harvard Workshop http://projects.iq.harvard.edu/attribution_workshop/home

o 2014 Project CRediT Workshop http://www.eventbrite.ca/e/project-credit-workshop-tickets-10314211083

Researcher driven

Creation methods:

• integrations dominate

• website second

• institutional creation

Researcher must be involved to create or activate the ORCID iD, and can control the privacy settings and/or add information.

Recommend that institutions use the trusted party creation method rather than direct record creation: they need to connect with and educate users anyway, and can pre-populate registration fields.

[Chart repeated: ORCID iDs by month, October 2012 to August 2014, by creation method (Creator, Website, Trusted Party)]

Leveraging ISNI Organization IDs

ORCID uses the Ringgold (an ISNI registrar) organization list to support the connection between individuals and their education and employment affiliations.

Leveraging FundRef identifiers

Funding agency list coordinated with FundRef

Auto-complete based on FundRef data

Integration of ORCID iDs in research workflows

Publication round trip

ORCID iDs are intended to be integrated into research and publication workflows, and become embedded in the metadata. ORCID iDs will thus be associated with new works at the time of publication.

[Diagram: publication round trip. Labels: ORCID record; Manuscript Submission; Review; Publication w DOI & ORCID(s); CrossRef DOI assignment; Verified ORCID, update permission; Readers]

Round trip process and implications

Publisher captures ORCID iD during manuscript submission

o Authenticated process, no mistyping, accurate

o User may grant permission to add works later

Publisher includes ORCID iD in metadata when minting DOI

o Will be available to support discovery

o Available in CrossRef search

Publisher/CrossRef writes metadata back to ORCID record

o Holder notified, can control visibility

o Saves effort updating record

o Information flow to other systems such as local profile (e.g. I've linked my ORCID record with my VIVO profile)

Similar process for datasets, mediated by DataCite

ref: http://orcid.org/blog/2014/11/21/new-functionality-friday-auto-update-your-orcid-record
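Once ORCID iDs travel with the DOI metadata, works can be looked up by iD. What follows is a minimal sketch of building such a lookup, not part of the original slides. It assumes the public Crossref REST API at api.crossref.org and its `orcid` filter; the helper name `crossref_works_url` is hypothetical, and the iD is the one that appears in the curl example later in this deck. The HTTP request itself is left out so the sketch runs offline.

```python
from urllib.parse import urlencode

# Assumption: the public Crossref REST API endpoint; it is not named in the slides.
CROSSREF_WORKS = "https://api.crossref.org/works"


def crossref_works_url(orcid_id: str, rows: int = 5) -> str:
    """Build a query URL for works whose deposited metadata carries this ORCID iD."""
    # urlencode percent-encodes the ":" in the filter value.
    query = urlencode({"filter": f"orcid:{orcid_id}", "rows": rows})
    return f"{CROSSREF_WORKS}?{query}"


if __name__ == "__main__":
    print(crossref_works_url("0000-0002-7970-7855"))
```

Fetching the printed URL should return JSON describing matching works, but only for records where the publisher actually deposited the iD.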

Funder workflow

• Use for applicants and reviewers

• Profile data reduces applicant/grantee form filling burden

• Improve reporting accuracy

• Pull publications, datasets and other works based on ORCID iD

ref: http://support.orcid.org/knowledgebase/articles/426596-orcid-funder-workflow

An ounce of ambiguity avoidance is worth a pound of disambiguation

-- with apologies to Benjamin Franklin

• Workflow integration avoids name ambiguity at source

• Resulting data good for disambiguation of older data

• Resulting data good for compilation of authority records

“How much information should my ORCID record have?”

Minimal record

Registration is really quick and easy, 30 seconds perhaps

1. name

2. email

3. password

4. agree to privacy policy and conditions

A minimal ORCID record that is enough to get an iD and use it in research workflows

Helpful ORCID record

Reasons to add a little more information:

1. Provide enough information so that someone who follows a link to your record, or searches for you, can understand which "John Smith" you are

o alternate names

o education and employment information

o a few works. Everyone likes to show off their best work …

o opens the door for disambiguation of existing data

2. Provide other identifiers so that ORCID can act as a switchboard to connect your identities in different systems.

o local profile id (e.g. my VIVO id at Cornell)

o Scopus Author ID, Researcher ID, ISNI

o (Using the search and link wizards that connect to these other systems is also the easiest way to add works.)

Expansive ORCID record

There are many import wizards which allow not only

o connection of an ORCID record to other identifiers

o but also import of works, grants, etc.

o source is recorded and provides a way to assess trust

The ORCID registry has facilities for users to enter works themselves, specify their roles, etc.

The ORCID UI groups information about the same work from multiple sources

o user may select preferred one to display

You may make your ORCID record a complete picture of your research contributions if you choose. But a complete record isn't necessary for ORCID to work.
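The grouping of one work reported by several sources can be pictured with a small sketch. This is an illustration only, not ORCID's implementation: the field names, the sample entries, the DOIs, and the choice of a lower-cased DOI as the grouping key are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical work entries; field names and DOIs are made up for illustration
# and do not reflect ORCID's actual schema.
works = [
    {"doi": "10.1000/example.1", "title": "An Example Paper", "source": "CrossRef"},
    {"doi": "10.1000/EXAMPLE.1", "title": "An example paper", "source": "Scopus"},
    {"doi": "10.1000/example.2", "title": "Another Paper", "source": "Self-entered"},
]


def group_by_doi(entries):
    """Group entries that describe the same work, keyed by lower-cased DOI."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["doi"].lower()].append(entry)
    return dict(groups)


def preferred(group, preferred_source):
    """Return the entry from the preferred source, else the first entry."""
    for entry in group:
        if entry["source"] == preferred_source:
            return entry
    return group[0]


if __name__ == "__main__":
    for doi, group in group_by_doi(works).items():
        shown = preferred(group, "Scopus")
        print(doi, [e["source"] for e in group], "->", shown["source"])
```

Keeping every entry in the group, rather than merging them, preserves the recorded source of each one, which is what lets a reader assess trust.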

ORCID as a hub identifier

ORCID is a hub

The ORCID identifier connects researchers with their works (papers, grants, datasets, and more), organizations, and other identifiers.

ORCID APIs enable data exchange between research information systems.

[Diagram: hub linking Other Identifiers, Funders, Higher Education and Employers, Professional Associations, Repositories, and Publishers. Identifier labels: DOI, ISBN, Thesis ID, ISNI, Researcher ID, Scopus Author ID, Internal identifiers, Member ID, Abstract ID, FundRef, GrantID]

Hub identifier linking to other identifiers and to profiles in other systems

… and data in machine form too

$ curl -H "Accept: application/orcid+xml" "http://pub.orcid.org/0000-0002-7970-7855/orcid-bio" | grep external-id-url

<external-id-url>http://isni.org/isni/0000000351311901</external-id-url>

<external-id-url>http://vivo.cornell.edu/individual/individual24416</external-id-url>

<external-id-url>http://www.researcherid.com/rid/E-2423-2011</external-id-url>

<external-id-url>http://www.scopus.com/inward/authorDetails.url?authorID=7103063073&amp;partnerID=MN8TOARS</external-id-url>
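The same extraction can be done with an XML parser instead of grep. This is a minimal sketch that works on a saved copy of the response rather than a live request. The sample is built from the output above; the `<external-identifiers>` wrapper element and the function names are assumptions made so the snippet parses as a single document.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Sample built from the grep output above. The wrapper element is an
# assumption; a real response has more structure and may use a namespace.
SAMPLE = """<external-identifiers>
  <external-id-url>http://isni.org/isni/0000000351311901</external-id-url>
  <external-id-url>http://vivo.cornell.edu/individual/individual24416</external-id-url>
  <external-id-url>http://www.researcherid.com/rid/E-2423-2011</external-id-url>
  <external-id-url>http://www.scopus.com/inward/authorDetails.url?authorID=7103063073&amp;partnerID=MN8TOARS</external-id-url>
</external-identifiers>"""


def external_id_urls(xml_text):
    """Return the text of every external-id-url element, ignoring any namespace."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Namespaced tags look like "{namespace}name"; keep only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "external-id-url" and element.text:
            urls.append(element.text.strip())
    return urls


def hosts(urls):
    """Map each URL to its host so the linked systems are easy to see."""
    return [urlparse(url).netloc for url in urls]


if __name__ == "__main__":
    for url, host in zip(external_id_urls(SAMPLE), hosts(external_id_urls(SAMPLE))):
        print(host, url)
```

Unlike grep, the parser decodes `&amp;` to `&`, so the Scopus link comes out as a usable URL.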

Thanks for listening!

Pointers

Register at https://orcid.org/register if you haven’t already!

http://orcid.org/

• Research organizations: http://orcid.org/organizations/institutions

• Publishers: http://orcid.org/organizations/publishers

• Associations: http://orcid.org/organizations/associations

• Funders: http://orcid.org/organizations/funders

• Researchers: http://orcid.org/content/initiative

Membership http://orcid.org/about/membership

• Questions: membership@orcid.org

Blog http://orcid.org/category/newsletter/blog

Slides: http://www.slideshare.net/simeonwarner/orcid-identifiers-in-research-workflows

ISNI

Disambiguating Public Identities

What Is ISNI

• ISO Standard, published in 2012

• International Standard Name Identifier

• Numerical representation of a name

– 16 digits

– Assigned to public figures, contributors of content (researchers, authors, musicians, actors, publishers, research institutions) and subjects of that content (if they are people or institutions).

– Example: 0000 0004 1029 5439
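The 16-character form allows a quick typo check, because the last character is a check character (it can be X). The slides do not describe the calculation; this sketch assumes the ISO 7064 MOD 11-2 scheme that ISNI and ORCID document publicly, and the function names are hypothetical.

```python
def check_character(base15: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base15:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed(identifier: str) -> bool:
    """True if the identifier has 16 characters and a matching check character."""
    compact = identifier.replace(" ", "").replace("-", "").upper()
    if len(compact) != 16:
        return False
    if not all(ch in "0123456789" for ch in compact[:15]):
        return False
    return check_character(compact[:15]) == compact[15]


if __name__ == "__main__":
    print(is_well_formed("0000 0004 1029 5439"))  # the example on this slide
    print(is_well_formed("0000 0004 1029 5438"))  # last character altered
```

A passing check only shows that the string was not mistyped; it does not show that the identifier has been assigned to anyone.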

Who is ISNI

• Founding members

– IFRRO (International Federation of Reproduction Rights Organizations)

– CISAC (International Confederation of Societies of Authors and Composers)

– SCAPR (Societies’ Council for the Collective Management of Performers’ Rights)

– OCLC

– CENL (Conference of European National Librarians), represented by the British Library and the National Library of France

– ProQuest, represented by Bowker

ISNI Organizational Structure

[Diagram: Members; Board of Directors; Quality Team; Registration Agencies; Ongoing assignments / general public]

How Does ISNI Registration Work

• Publisher submits names for assignment through a Registration Agency

• RA works with the publisher to ensure the data feed is well-formatted, and sends that feed to the Assignment Agency

• AA assigns as many ISNIs to the names in the feed as it can, using complex algorithms and business rules that evolve with each feed

• AA returns a file of names with ISNIs attached to them

– This may not be the full file of names

– Ambiguous names are held for review by Quality Team

– QT assignments and other exceptions (assignments as a result of improvements to the algorithm) are returned to RA quarterly

– Process is not instant. Assignment may be immediate if the name and other information are unique, but frequently assignments take a week or two.

Stage One

Customer submits data to Registration Agency

Registration Agency sends file to Assignment Agency

Assignment Agency assigns as many ISNIs to the names as it can

Stage Two

Assignment Agency sends assigned file to Registration Agency

Registration Agency sends assigned file to Customer

Customer reviews, QAs, ingests

Stage Three

Assignment Agency sends updates on a monthly basis

Registration Agency disperses files to appropriate Customers

Customers ingest updates

Customers ingest updates

Display

• Only minimal metadata is displayed

• Not meant as a comprehensive profile

• ISNI is a tool for linking data sets, collocation, and disambiguation

• Enhancements to the record can be made but are not required

Sample Public ISNI Record

Bridge identifier linking disparate data sets

ISNI links


Who is using ISNIs?

• Wikipedia/Wikidata

• VIAF

• Access Copyright

• Scholar Universe

• British Library

• JISC

• MusicBrainz

• Macmillan (Digital Science)

• BookNet Canada (piloting)

• Authors Guild (piloting)

• Books in Print ONIX 2.1 extracts (sent to Google, B&N, Chegg and others)

Einstein’s Wikipedia Page

How many names in the ISNI database?

• Over 8,000,000 assigned

• 10,112,931 provisional (awaiting a match from another data set for corroboration)

• Your author names may well already have ISNIs: http://www.isni.org/search

Use Case: Publisher

Use Case: Research Institution

Use Case: University

Use Case: Cross-Domain Linking


Data Quality

• Based on matching names to existing records in the database (over 17 million names)

• Strict criteria for assigning ISNIs to names

• Quality team oversight (manual edits)

– British Library

– National Library of France

– OCLC


Assignment Criteria

• If on the common surname list:

– Birth date

– Death date

– ISBN(s)

– Title(s)

– Co-authors or institutional affiliation

• If not on the common surname list

– Title(s)

– Birth date

– Death date

– Any other distinguishing factors (“is not”)

• If unique

– Immediate assignment


ISNI and ORCID

• ORCID numbers are a subset of the numbers in ISNI’s database

• Working towards alignment, with ultimate goal of single assignment

• There is ISNI representation on the ORCID Technical Steering Group, and ORCID representation on the ISNI Technical Committee

• A researcher may have both an ORCID and an ISNI


Do You Have An ISNI?


Laura.Dawson@proquest.com

Thomas Hickey

Chief Scientist, OCLC Research

2015 February

NISO Webinar on Authority Control

VIAF Relations

VIAF

Virtual International Authority File

• Grew out of collaboration with national libraries

• Implemented and run by OCLC

• VIAF Council helps oversee it

• ~36 files, mainly from national authority files

• Everything libraries control other than topical subject headings is in scope

– Personals, corporates, families

– Jurisdictionals, geographics

– Works, expressions

– Imaginary characters, etc.


Why multiple files?

• Different

– Information collected

• Private vs. public

• Identification vs. comprehensive

– Technologies and systems

• APIs

– Time scales

• Batch vs. interactive creation

• Historical vs. contemporary

– Business models


VIAF’s characteristics

• Origins: Library authorities

• What is being identified: Entities libraries control

• Who creates it: Library staff

• Range of entities: Very broad

• Priorities and control: Libraries

• What can be shared: Open


Relationship with ISNI

• Both systems run by OCLC

– VIAF helped get ISNI started

• Problems

– Each absorbs the other’s data

– Feedback loops!

• Who’s in charge?

– ISNI now indicates reviewed records

• Relationships treated as though from xA

• Can both merge and split VIAF clusters

Wikipedia & Wikidata


Relationship with Wikipedia

• VIAF harvests Wikipedia dumps monthly

• Pages about people that are in VIAF are added

• VIAFbot back loaded links into Wikipedia

– http://en.wikipedia.org/wiki/User:VIAFbot

Relationship with WorldCat

• One of the main uses of VIAF internally at OCLC is controlling names

• Multilingual Bibliographic Structure project

• Generate ‘xR’ authority records

– Works

– Expressions

[Diagram: OCLC Production Services, External OCLC Research Systems, and Internal OCLC Research Resources. Labels: enhanced WorldCat, Kindred Works, Classify, Identities, FictionFinder, Cookbook Finder, LCSH, FAST, VIAF, GMGPC, Linked Data Entities, WORKS, GSAFD, GTT, DDC, LCTGM, MeSH, xR, Sandbox, Multi-lingual Bib Records, FRBR Clustering]

Unexpected interactions

• Drive towards comprehensiveness

– More information about entities

– More entities

• Importing other files

• Keeping up with updates

• Recognizing source of information

• What to trust

• How to leverage limited staff

Thank you

NISO Webinar • February 11, 2015

Questions? All questions will be posted with presenter answers on the NISO website following the webinar:

http://www.niso.org/news/events/2015/webinars/authority_control/


Thank you for joining us today.

Please take a moment to fill out the brief online survey.

We look forward to hearing from you!

THANK YOU