Key to the management of intellectual property in digital media
BISG/NISO The Changing Standards Landscape, Washington DC, June 22 2007
Norman Paskin
IDENTIFY AND DESCRIBE
TERTIUS Ltd
• “Key to” not “keys to…”
• Naming = assigning an identifier to a referent
• Some general themes and practical consequences
Naming and meaning
The case of the headless corpse
• Identifier: unique persistent alphanumeric string (“number”, “name”, “lexical token”) specifying a referent
– Unique: one to many: an identifier specifies one and only one referent (but a referent may have more than one identifier)
– Persistent: once assigned, does not change referent
– May be part of an identifier system (other components, technical or social)
• Resolution: process by which an identifier is input to a network service which returns its associated referent and/or descriptive information about it (metadata). “Actionability”
• Referent: the object which is identified by the identifier, whether or not resolution returns that object.
– may be abstract, physical or digital, since all these forms of entity are of relevance in content management (e.g. creations, resources, agreements, people, organisations)
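The uniqueness and persistence rules above can be sketched in a few lines of Python. This is a minimal illustration only: the `IdentifierRegistry` class, its method names, and the `id:0001` style strings are assumptions invented here and do not describe any real identifier system.

```python
class IdentifierRegistry:
    """Toy registry (hypothetical): each identifier names exactly one referent."""

    def __init__(self):
        self._referent_of = {}  # identifier string -> referent label

    def assign(self, identifier, referent):
        # Persistence: once assigned, an identifier never changes referent.
        existing = self._referent_of.get(identifier)
        if existing is not None and existing != referent:
            raise ValueError(f"{identifier!r} already names {existing!r}")
        self._referent_of[identifier] = referent

    def referent_of(self, identifier):
        # Uniqueness: the lookup yields one and only one referent.
        return self._referent_of[identifier]

    def identifiers_for(self, referent):
        # A referent may carry more than one identifier.
        return sorted(i for i, r in self._referent_of.items() if r == referent)


registry = IdentifierRegistry()
# Made-up identifier strings; the referent here is an abstract work.
registry.assign("id:0001", "Robinson Crusoe (work)")
registry.assign("id:0002", "Robinson Crusoe (work)")
```

Note that the referent is only a label in this sketch: the registry records what is named, and says nothing about whether resolution could return the object itself.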
Naming
• My hat is on the shelf: I can see the hat
• My hat is in a box on the shelf: now I can’t see the hat, only boxes
• I put a label on the box: now I can still find the hat.
– “The label identifies the hat”
• The hat box analogy in digital form: “the data” and “the file server page it’s on” (the web)
• I click on the link, and I get “the thing that I want”…?
– “the URL identifies the content”
• …only in a very simple case.
• It may have moved
• It may be in various forms; in multiple places; in different versions; etc.
• You may not have rights to it
• It may not be possible to “get” it
– e.g. if the referent is a person…
What is being named: the (false) hat box analogy
• Granularity: the extent to which a collection of information has been subdivided for purposes of identification (e.g. a collection; a book; tables and figures)
– Functional Granularity: it should be possible to identify an entity whenever it needs to be distinguished
• Your functional granularity may not be my functional granularity:
– A wants to distinguish “this book in any format”, but B wants to distinguish “the pdf version, the html version, etc ….”
Granularity
• Precisely what is being named?
• Suppose I have here a pdf version of Defoe’s “Robinson Crusoe” issued by Norton. I find an identifier – is it of:
– The work “Robinson Crusoe”?
– All works by Daniel Defoe?
– The Norton edition of “Robinson Crusoe”?
– The pdf version of the Norton edition of…. ?
– The pdf version of…held on this server…?
– Which hat is in that box?
• Most digital objects of interest have compound form, simultaneously embodying several referents
• Multiple identifiers may be necessary (cf music CDs)
• Need to say what each identifier describes
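One way to picture a compound object is as a single file carrying several identifiers, each stating what kind of referent it describes. The sketch below is an assumption made for illustration: the `TypedIdentifier` class, the `example:` identifier strings, and the referent type labels are all invented and belong to no real scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedIdentifier:
    value: str          # identifier string (made up for illustration)
    referent_type: str  # what this identifier describes


# Hypothetical identifiers attached to one pdf file of the Norton edition.
pdf_file_identifiers = [
    TypedIdentifier("example:work/robinson-crusoe", "work"),
    TypedIdentifier("example:edition/norton", "edition"),
    TypedIdentifier("example:format/norton-pdf", "format of edition"),
]


def identifiers_describing(identifiers, referent_type):
    """Return the identifier strings that name the given kind of referent."""
    return [i.value for i in identifiers if i.referent_type == referent_type]
```

Asking for `"work"` returns only the work-level identifier; asking for a type nobody assigned returns an empty list, which is the "which hat is in that box?" question made explicit.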
Compound objects
• Persistence: “get me the right thing” (redirect to a valid URL)
• Contextual resolution: “get me the thing that is right for me”
– Appropriate copy resolution (e.g. OpenURL context-sensitive linking): same content in different contexts
– Full contextual resolution (e.g. rights-based): different content in different contexts
• A specific case: location-dependent resolution
– e.g. Crossref / China
• A general mechanism: multiple resolution: returns multiple things in response to a request from one identifier (e.g. a choice, an automated service)
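Multiple resolution and location-dependent resolution can be sketched together. The record layout, the `region` field, the `resolve` function, and the `example:123` identifier below are hypothetical; this is not how any particular resolver stores or selects its values.

```python
def resolve(identifier, records, region=None):
    """Toy resolver (hypothetical record layout).

    With no region, return every value held for the identifier
    (multiple resolution). With a region, return one value per type,
    preferring a region-specific value over a generic one.
    """
    values = records.get(identifier, [])
    if region is None:
        return list(values)
    chosen = {}
    for v in values:
        if v["region"] not in (region, "any"):
            continue
        current = chosen.get(v["type"])
        # A region-specific value takes precedence over a generic one.
        if current is None or (current["region"] == "any" and v["region"] == region):
            chosen[v["type"]] = v
    return list(chosen.values())


# Made-up records: one identifier, three typed values.
records = {
    "example:123": [
        {"type": "URL", "region": "any", "data": "https://example.org/article"},
        {"type": "URL", "region": "cn", "data": "https://mirror.example.cn/article"},
        {"type": "metadata", "region": "any", "data": "title=Example article"},
    ]
}
```

Calling `resolve("example:123", records)` hands back all three values and leaves the choice to the caller or to an automated service; passing `region="cn"` narrows the answer to the mirror URL plus the metadata.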
Resolution
• Assigning metadata to a referent, to enable semantic interoperability – “say what the referent is”
• Semantic:
– Do two identifiers denote the same referent?
– If A says “owner” and B says “owner”, are they referring to the same thing?
– If A says “released” and B says “disseminated”, do they mean different things?
• Interoperability: the ability for identifiers to be used in services outside the direct control of the issuing assigner
– Identifiers assigned in one context may be encountered, and may be re-used, in another place or time, without consulting the assigner. You can’t assume that your assumptions made on assignment will be known to someone else.
• Persistence = interoperability with the future
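The “owner” and “released / disseminated” questions can be made concrete by mapping each party's local term to a concept in a shared dictionary. The mappings and concept names below are invented purely to show both possible outcomes; they are not taken from indecs or from any real data dictionary.

```python
# Hypothetical shared dictionary: (party, local term) -> shared concept.
SHARED_DICTIONARY = {
    ("A", "owner"): "concept:rights-holder",
    ("B", "owner"): "concept:possessor-of-copy",
    ("A", "released"): "concept:made-available",
    ("B", "disseminated"): "concept:made-available",
}


def same_meaning(term_a, term_b):
    """True only if both terms are mapped, and mapped to the same concept."""
    concept_a = SHARED_DICTIONARY.get(term_a)
    concept_b = SHARED_DICTIONARY.get(term_b)
    return concept_a is not None and concept_a == concept_b
```

In this made-up data the same word (“owner”) carries two meanings, while two different words turn out to mean the same thing. An unmapped term never matches anything, which mirrors the point that assumptions made on assignment cannot be relied on elsewhere.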
Meaning
Tools to ensure meaning
• Basis: “Interoperability of Data in E-Commerce Systems” (indecs), 1998-2000: http://www.indecs.org
• Led to Contextual Ontology approach - used in:
• ISO MPEG-21 Rights Data Dictionary (http://iso21000-6.net/)
• DOI Data Dictionary (http://www.doi.org )
• DDEX digital data exchange - music industry (http://ddex.net/ )
• ONIX: Book industry (+) messaging schemas (www.editeur.org )
• Rightscom’s OntologyX - licensee of output, plus own work on tools (www.rightscom.com )
• Digital Library Federation - communication of licence terms (ERMI: ONIX for licensing terms)
• May inform development of ACAP - Content Access (http://www.the-acap.org/ )
• Physical property:
– representations e.g. deeds, mortgages, are traded (not the physical bricks etc.)
• Intellectual property:
– representations e.g. licences, files, are traded (not the abstract Work etc.)
• Representations have value
• Not just an inventory but a structured entity, such as a deed
– "to facilitate the comparison and combination of assets (standard descriptions)"
• We are becoming more used to representations: avatars, licences; in general, digital objects
[See: De Soto: "The Mystery Of Capital"; and Kahn: "Representing Value as Digital Objects" D-Lib magazine, May 01 (www.dlib.org/dlib/may01)]
Representations
• Services using an identifier may be offered by multiple providers
– Some may be more definitive than others
– “Resolution” shades into “query”
– e.g. Worldcat ISBN service
• Each registration authority for an identifier scheme should retain autonomy and precedence in determining rules for usage within its own scheme or community.
• Many early applications will be silos; interoperability is not needed (and may not be desired)
– e.g. Knovel: interactivity within its online book content through e-book components
• New applications will reach across silos (mash ups etc); new silos will appear. As such services grow and become many, a single source of data to power multiple services makes sense
Interoperability and multiple services
• An identifier specifies one and only one referent (but a referent may have more than one identifier)
– Make systems work together: e.g. Bookland DOIs made from ISBNs…?
• Objects may be abstract, physical or digital, since all these forms of entity are of relevance in content management (e.g. creations, resources, agreements, people, organisations)
– Need for many identifiers: ISTC, ISNI, Licences, etc
• Your functional granularity may not be my functional granularity: A wants to distinguish “this book in any format”, but B wants “the pdf version, the html version….”
– Need to enable different identifiers to work together: e.g. ISTC and ISBN
• Assumptions are not sufficient for interoperability
– An identifier is not enough. You need to say what you are identifying.
• Context is vital: “get me the thing that is right for me”
– Simple resolution may not be enough.
Practical consequences
• Multiple services may exist for an identifier
– Don’t assume only monopoly services
– One service may be definitive; some may be better than others
• Digital objects may be representations of something
– Need to distinguish what is a representation
– Note that representations may be compound objects
• Interoperability becomes more important as an economic feature when there are multiple services or multiple uses – which there will be eventually
– Don’t design only for today
• Common frameworks for naming and meaning (to do all this) become important when services cut across silos; across media; from different sources; etc
– e.g. DOI
• Multiple resolution: returns multiple results in response to a request (e.g. a choice, an automated service)
– need some way of grouping and ordering those results, e.g. Handle value typing
• Interoperability of Data in E-Commerce Systems
– Need semantic precision and common framework
Practical consequences (cont.)