How to do things with metadata: From rights statements to speech acts.

Post on 17-Jan-2017



HOW TO DO THINGS WITH METADATA

From Rights Statements to Speech Acts.

R. J. URBAN

Metadata Semantics

Knowledge Organization
• Colloquial
• Informal, document-like representation structures.

• Metadata elements conform to a complex set of rules.

• Supports information retrieval.

• Creates descriptions that identify resources.

Knowledge Representation
• Formal
• Grounded in formal theories of First Order Logic

• Metadata assertions are true/false within an interpretation.

• Available to support formal reasoning.

• Relies on names to identify resources.

Metadata Pragmatics?
• Semantics: what do words (signifiers, etc.) mean?
– Colloquial MARC/XML semantics
– Formal semantics for knowledge representation.
  • Resource Description Framework (RDF)
• Pragmatics: how does context contribute to meaning?
– As humans we regularly interpret the meaning of metadata successfully, even though it may not be formally represented or machine understandable.

So what? Dublin Core Rights

Typically, rights information includes a statement about various property rights associated with the resource, including intellectual property rights.

So what? DC Terms class: RightsStatement

A statement about the intellectual property rights (IPR) held in or over a Resource, a legal document giving official permission to do something with a resource, or a statement about access rights.

DPLA Rights Statements


87,000 unique values

Research Questions
• What are organizations trying to do with rights statements?
• How are rights statements a kind of speech act?

Europeana Context

“The [Task Force on Metadata Quality] noted that many data providers approach rights statements as an afterthought and lack sufficient know-how to apply the appropriate statement. They therefore choose a restrictive rights statement as a default.”

http://goo.gl/lHaeMX

Europeana Rights Statements
– Public Domain
– In Copyright
  • Various Creative Commons Licenses
  • Rights Reserved – Free Access
  • Rights Reserved – Paid Access
– Orphan Works
– Unknown
– http://pro.europeana.eu/page/available-rights-statements

International Standardized Rights Statements

• Europeana + Digital Public Library of America (DPLA)

• http://rightsstatements.org/
• SKOS vocabulary representing 10 rights statement classifications.

Problem

Metadata Quality Frameworks

Schlosser’s Memes
• Specific Ownership statements: “copyright [organization]”
• Vague Ownership statements: “copyright retained by the original owner”
• What you can/can’t do: “we encourage fair use of copyrighted material”
• Protecting Ourselves and You: “no information on the rights in the collection, researchers are responsible for determining copyright.”

Schlosser, M. (2009). Unless otherwise indicated: A survey of copyright statements on digital library collections. College & Research Libraries, 70(4), 371–385. http://doi.org/10.5860/crl.70.4.371

Speech Acts

J.L. Austin

• Maybe not all “statements” are true/false

• Performatives
• Questions
• Commands
• Promises
• Oaths
• Declarations

• locutionary vs. illocutionary meaning


Searle’s Speech Act Theory
• Illocutionary force =
– Illocutionary point (purpose of statement) +
– Direction of fit (relation of utterance to the world) +
– Speaker intention (psychological state)

Searle (1979) Speech Act Taxonomy

Category | Description | Direction of fit
Assertive | Utterances that commit the speaker to the truth of the expressed proposition. “The cat is on the mat.” | Words-to-world
Commissive | Utterances that commit the speaker to some future action. “I shall faithfully uphold the office of the president….” | World-to-words
Declarations | Utterances that bring about some change in the world. “I now pronounce you man and wife.” | World-to-words and words-to-world
Directives | Utterances that consist of an attempt by the speaker to get the hearer to do something. “Please pass the salt.” | World-to-words
Expressives | Utterances that express the speaker’s emotional or psychological attitude towards a statement. “I believe that…” | No direction of fit

Method: Sample
• 87,610 unique values found in aggregated DPLA metadata as dc:rights.
– Frequency counts of associated records.
• Drop statements associated with fewer than 100 records. (n=86,482)
– Of these, 78,191 only associated with one record.
• Result: 1295 statements
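The sampling step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling used in the study: the function name `frequent_statements`, the toy `records` list, and the reading of the cutoff as "keep statements with at least 100 records" are all hypothetical.

```python
from collections import Counter

def frequent_statements(rights_values, min_records=100):
    """Count records per distinct dc:rights string and keep the frequent ones.

    rights_values holds one dc:rights string per record (assumed input shape).
    """
    counts = Counter(rights_values)
    kept = {s: n for s, n in counts.items() if n >= min_records}
    dropped = len(counts) - len(kept)                        # statements below the cutoff
    singletons = sum(1 for n in counts.values() if n == 1)   # used by only one record
    return kept, dropped, singletons

# Toy data standing in for the aggregated DPLA records (hypothetical values).
records = (["all rights reserved"] * 150
           + ["no known copyright restrictions"] * 100
           + ["copyright 1999, example press"] * 3
           + ["gift of a. donor"])

kept, dropped, singletons = frequent_statements(records)
print(sorted(kept), dropped, singletons)
```

Counting before filtering keeps the long tail visible, which matters here because most of the distinct values occur on a single record.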

Method: Cleanup
• http://openrefine.org
• Make all statements lowercase.
• Remove extra whitespace.
• Remove uniquing features:
– DOIs
– Copyright [date]
– Gift of/donated by [name]
– Cite as [citation string]
• Result: 488 statements
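The cleanup was done in OpenRefine; the exact transformations are not given on the slide. As a rough equivalent, the same steps might look like this in Python. Every regular expression below is an assumption made for illustration, and real statements would need more careful patterns.

```python
import re

# Hypothetical patterns for the "uniquing features" listed above.
# Each match is replaced by a bracketed placeholder so that otherwise
# identical statements collapse to the same string.
UNIQUING_PATTERNS = [
    (re.compile(r"\b(?:doi:\s*|https?://(?:dx\.)?doi\.org/)?10\.\d{4,9}/\S+"), "[doi]"),
    (re.compile(r"(copyright|©|\(c\))\s*\d{4}(?:\s*-\s*\d{4})?"), r"\1 [date]"),
    (re.compile(r"\b(gift of|donated by)\s+[^.;]+"), r"\1 [name]"),
    (re.compile(r"\bcite as:?\s+.+$"), "cite as [citation string]"),
]

def normalize(statement):
    """Lowercase, collapse whitespace, then mask uniquing features."""
    text = statement.lower()
    text = re.sub(r"\s+", " ", text).strip()
    for pattern, replacement in UNIQUING_PATTERNS:
        text = pattern.sub(replacement, text)
    return text

# Two statements that differ only in date and spacing become one value.
raw = ["  Copyright 1999,   Example Press ", "Copyright 2004, Example Press"]
unique = {normalize(s) for s in raw}
print(unique)
```

Masking with placeholders rather than deleting keeps the shape of the statement ("copyright [date]") available for later coding.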

Method: statement analysis
• 488 statements
– Qualitative coding according to Searle’s Taxonomy of Speech Acts.
• What is the proper unit of analysis?
– 603 coded excerpts. (These are not necessarily sentences, especially for long complex directives.)

Assertives (n=199)
• Schlosser’s ownership statements.
• “all rights reserved”
• “copyright [copyright holder]”
• “copyright [date]”
• “This work is in the public domain”
• “No known copyright restrictions”
• “Purchased with Smithsonian Trust funds”

Directives (n=272)
• Schlosser’s “What you can and can’t do” and “Protecting ourselves and you”.
• “contact the host institution for more information”
• “users may download the images for personal or educational use - students may include images in reports, for instance, and teachers may use the images in the classroom - if the following credit line is included with the image: courtesy of the georgia archives.”
• “to purchase copies of images and/or for copyright information, contact university of [x]”
• “this item may be subject to copyright”
• “this image available for use only with the expressed, written consent of the [x] historical society”

Commissive (n=1)
• i hereby certify that, if appropriate, i have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, and specifically allowing distribution as specified below. i certify that the version i submitted is the same as that approved by my advisory committee. i hereby grant to brigham young university and its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. i retain all other ownership rights to the copyright of the thesis, dissertation, or project report. i also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.;

Expressives (n=1)
• the new york public library is interested in learning more about items you've seen on our websites or elsewhere online. if you have any more information about an item or its copyright status, we want to hear from you.

Rights statement patterns

Assertive + directive.

Copyright [date], [organization]. For more information contact x@x.edu
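To make the assertive + directive pattern concrete, here is a deliberately naive cue-phrase sketch in Python. It is not the coding method used in the study, which was qualitative and done by hand; the cue lists, function names, and the "uncoded" fallback label are all assumptions chosen for illustration.

```python
import re

# Illustrative cue phrases only (assumptions, not the study's codebook).
DIRECTIVE_CUES = re.compile(
    r"\b(contact|please|must|may not|users may|for more information)\b")
ASSERTIVE_CUES = re.compile(
    r"\b(copyright|all rights reserved|public domain)\b")
# A value that is nothing but a URL or an email address.
BARE_ADDRESS = re.compile(r"^(https?://\S+|[\w.+-]+@[\w-]+(?:\.[\w-]+)+)$")

def code_excerpt(excerpt):
    """Assign one label to a single excerpt using the cue lists above."""
    if BARE_ADDRESS.match(excerpt):
        return "non-speech act"
    if DIRECTIVE_CUES.search(excerpt):
        return "directive"
    if ASSERTIVE_CUES.search(excerpt):
        return "assertive"
    return "uncoded"

def code_statement(statement):
    """Split on sentence-final '.' or ';' followed by whitespace, then label each piece."""
    excerpts = [e for e in re.split(r"(?<=[.;])\s+", statement.strip()) if e]
    return [(e, code_excerpt(e)) for e in excerpts]

coded = code_statement(
    "copyright [date], [organization]. for more information contact x@x.edu")
for excerpt, label in coded:
    print(label, "|", excerpt)
```

Cue matching like this will miscode borderline cases, for example a hedge such as "this item may be subject to copyright", so it is best read as a way to surface candidates for human coding.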

Non-Speech Acts (n=130)
• Unexpected discovery!
• “university of utah”
• person@ex.org
• “watt hall 4d, usc, los angeles, ca 90089-0294”
• Are URLs speech acts?
– http://ex.org/rights.html

Non-speech acts in context

<http://ex.org/resource001> dc:rights "University of Utah" .

• Still not really a “statement”
• Not a legal document.
• Not a permission.

So what?
• Europeana rights statements are really about assertives.
– Creative Commons Licenses (maybe more directive, but still represented as an assertion that a license is available)
• Open Digital Rights Language (ODRL) https://www.w3.org/community/odrl/
– Policies
  • Permissions
  • Constraints
– Intended to be actionable by a system, so very tightly defined.
– Mapping statements to ODRL may be difficult. Permissions and constraints are often mixed within a single rights statement sentence.
– Not well supported in cultural heritage digital library software.
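As a rough illustration of why the mapping is hard, the Georgia Archives directive quoted earlier bundles a permission, a purpose constraint, and an attribution duty in one sentence. A hand-built sketch of how that might be separated is shown below as a Python dictionary serialized to JSON. It assumes the ODRL 2.2 JSON-LD shape; the policy and asset identifiers and the purpose values are hypothetical, and the choice of the `reproduce` action is one interpretation of "download".

```python
import json

# Hypothetical identifiers and purpose values; term names follow the
# ODRL 2.2 vocabulary (Set, permission, constraint, duty, attribute).
policy = {
    "@context": "http://www.w3.org/ns/odrl.jsonld",
    "@type": "Set",
    "uid": "http://ex.org/policy/001",
    "permission": [{
        "target": "http://ex.org/resource001",
        "action": "reproduce",                      # "users may download the images"
        "constraint": [{
            "leftOperand": "purpose",               # "for personal or educational use"
            "operator": "isAnyOf",
            "rightOperand": ["personal use", "educational use"],
        }],
        "duty": [{"action": "attribute"}],          # "if the following credit line is included"
    }],
}

print(json.dumps(policy, indent=2))
```

Even in this small case the credit-line wording itself has no obvious home in the structure, which hints at how much of a free-text statement resists tight definition.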

Next steps
• Can text analysis help automatically assign International Standard Rights Metadata?
– Automatically recognize and separate different kinds of speech acts.
– Determine relationship to rights statements.
• Fork Cohen’s Ciranda (detects speech acts in emails)?

Parallel Research
• What do rights statements refer to?
• Same data set.
• Tagged according to indexical/referential statements.
– This collection
– This (digital image)
– The work
– Etc.