Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel
Prototyping Team, Research Library
Los Alamos National Laboratory
http://www.openannotation.org/
http://groups.google.com/group/oac-discuss
OAC is funded by the Andrew W. Mellon Foundation
Outline
• Motivation
• OAC’s Perspective on Annotations
• Design Choices
• Alpha 3 Data Model
Motivation (1)
• Annotations are a core ingredient of scholarship:
• Transcribe and annotate medieval manuscripts;
• Annotate maps with maps;
• Annotate video recording of endangered languages with non-verbal communication events;
• Annotate 3D museum artifacts;
• Your use cases …
Motivation (2)
• Existing solutions for scholarly annotation are repository-centric, tied to silos:
• Annotation in terms of specific repository and/or collection, no global scope;
• Only consumable in client/server combination that created the annotation;
• Annotations are not shareable beyond the original server, so cross-system services based on (enriched & merged) annotations cannot be created.
Open Annotation Collaboration
• Focus on interoperability for annotations in order to allow sharing of annotations across:
• Annotation clients;
• Content collections;
• Services that leverage annotations.
• Focus on annotation for scholarly purposes, but with a desire to make the OAC framework more broadly usable.
• Broader usability helps gain adoption: tools, communities, integration of scholarly communication with other areas of discourse.
OAC’s Perspective on Annotations (1)
• The following characterize an annotation:
• There is an author (human, software agent) of an annotation;
• The annotation occurs at some point in time;
• There is a body of an annotation;
• There is a target of an annotation;
• The body annotates the target, i.e. the body is somehow “about” the target.
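The characteristics listed above can be pictured as a single record. The following is only a sketch in Python: the class name, field names, and all example.org URIs are assumptions made for illustration and are not part of the OAC model itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Annotation:
    """The characteristics of an annotation as one record (names are illustrative)."""
    author: str        # URI or name of the human or software agent
    created: datetime  # the point in time at which the annotation occurs
    body: str          # URI of the resource that says something
    target: str        # URI of the resource the body is "about"

    def statement(self) -> str:
        """Render the 'body annotates target' relationship as a sentence."""
        return f"{self.body} annotates {self.target}"


# Hypothetical example: a video commenting on a Web page.
note = Annotation(
    author="http://example.org/people/alice",
    created=datetime(2010, 6, 14, tzinfo=timezone.utc),
    body="http://example.org/video/commentary.mp4",
    target="http://example.org/pages/manuscript.html",
)
print(note.statement())
```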
OAC’s Perspective on Annotations (2)
• Body and target of an annotation can be of any media type.
• A video can annotate a Web page; a Web page can annotate a video.
• This is contrary to the prevailing view in which the body is textual.
• Annotation, body, and target can have different authors.
• This is contrary to the prevailing view in which annotation and body have same authorship.
OAC’s Perspective on Annotations (3)
• Most (scholarly) annotations involve parts of resources (image regions, slices of a video, dimensions of a dataset).
• The annotation framework should provide support for resource segment identification and description.
• A variety of more complex annotations involve multiple targets (and maybe multiple bodies?).
• The annotation framework should support this.
• So far, no compelling use cases for multiple bodies have been identified.
Design Choices (1)
• Interoperability specified in terms of the Architecture of the Web (URI, link, resource, representation, …), Semantic Technologies, Linked Data.
• Aligned with desire to more tightly embed scholarly communication in overall human discourse;
• Aligned with the trend towards a machine-actionable scholarly communication system;
• Aligned with approach we followed in other efforts (OAI-ORE for Aggregations; Memento for versioning).
Design Choices (2)
• Client autonomy.
• Not an annotation protocol (cf. Annotea) but an annotation model combined with a publish/subscribe mechanism;
• No reliance on a server that helps with generation of annotations. Only a server that supports publish/discovery of the annotation created by the client is assumed.
Protocol
publish, subscribe, consume
Publish/Subscribe
publish subscribe consume
Design Choices (3)
• An annotation is an information resource (aka document).
• Earlier versions of the model regarded annotations as conceptual, non-information resources. This approach was related to:
• Ideas to model annotations as events;
• Ideas to model annotations as OAI-ORE Aggregations.
• Community feedback to this approach was negative:
• Added complexity without added value;
• The use of OAI-ORE, and especially its Proxies, was deemed artificial.
Design Choices (4)
• Significant attention to problems related to the ephemeral nature of the Web.
• Representations of URI-addressable resources change over time, resulting in ambiguous or incorrect annotations, as well as annotations that lack (representations of) body and/or target.
• The approach provides hooks to allow:
• Recognizing ambiguous/incorrect annotations, e.g. via timestamp and fixity for body and target;
• Reconstructing the annotation, e.g. via timestamps, scholarly archives, and Memento.
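One way to picture the fixity hook is a digest recorded when the annotation is made and compared later against the current representation. The sketch below is an assumption for illustration only: the slides do not say which digest algorithm or encoding is used, so SHA-256 and the "sha256:" prefix are stand-ins.

```python
import hashlib


def fixity(representation: bytes) -> str:
    """Return a digest string for the bytes of a representation (SHA-256 is an assumption)."""
    return "sha256:" + hashlib.sha256(representation).hexdigest()


def has_changed(recorded_fixity: str, current_representation: bytes) -> bool:
    """True if the current representation no longer matches the recorded digest."""
    return fixity(current_representation) != recorded_fixity


# Record the digest at annotation time, then re-check after the target was edited.
recorded = fixity(b"original target text")
print(has_changed(recorded, b"original target text"))
print(has_changed(recorded, b"edited target text"))
```

A mismatch only signals that the annotation may now be ambiguous or incorrect; reconstructing the earlier state is a separate step (timestamps, archives, Memento).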
Design Choices (5)
• A model that gracefully builds from simple to complex
• The simple core of the model supports elementary annotation use cases.
• The model becomes gradually more complex to accommodate increasingly complex use cases.
Basic Model
• The basic model has three resources:
• Annotation (an RDF document)
• Default: RDF/XML but others via Content Negotiation
• Body (the comment or text of the annotation)
• Target (the resource the body is about)
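The three resources can be sketched as a handful of RDF triples. The following uses only the Python standard library and writes N-Triples rather than RDF/XML for brevity. The namespace URI, the hasBody predicate name, and all example.org URIs are assumptions made for illustration; only hasTarget is named later in these slides.

```python
# Minimal sketch of the basic model as N-Triples.
# Assumptions: the namespace URI, hasBody, and every example.org URI are illustrative.
OAC = "http://www.openannotation.org/ns/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def basic_annotation(annotation, body, target):
    """Return the (subject, predicate, object) triples of a basic annotation."""
    return [
        (annotation, RDF_TYPE, OAC + "Annotation"),
        (annotation, OAC + "hasBody", body),
        (annotation, OAC + "hasTarget", target),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = basic_annotation(
    "http://example.org/annotations/anno-1",
    "http://example.org/comments/body-1",
    "http://example.org/pages/target-1",
)
print(to_ntriples(triples))
```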
Basic Model Example
Additional Relationships and Properties
• Any of the resources can have additional information attached, such as creator, date of creation, title, etc.
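As an illustration of attaching such properties, the sketch below emits N-Triples statements for a creator, a creation date, and an optional title. The use of Dublin Core Terms is an assumption (the slide does not name a vocabulary), as are the function name and the example.org URIs.

```python
from datetime import datetime, timezone

DCTERMS = "http://purl.org/dc/terms/"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def describe(resource, creator, created, title=None):
    """Return N-Triples lines attaching creator, creation date and optional title."""
    lines = [
        f"<{resource}> <{DCTERMS}creator> <{creator}> .",
        f'<{resource}> <{DCTERMS}created> "{created.isoformat()}"^^<{XSD_DATETIME}> .',
    ]
    if title is not None:
        # Escape backslashes and quotes so the literal stays valid N-Triples.
        escaped = title.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{resource}> <{DCTERMS}title> "{escaped}" .')
    return lines


# The same helper works for the Annotation, the Body, or a Target.
lines = describe(
    "http://example.org/annotations/anno-1",
    "http://example.org/people/alice",
    datetime(2010, 6, 14, 12, 0, tzinfo=timezone.utc),
    title='A "first" note',
)
print("\n".join(lines))
```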
Additional Properties Example
Annotation Types
• There can be further types of Annotation, such as a Reply.
• Example: Replies are Annotations on Annotations.
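A Reply can be sketched as an annotation whose target is itself an annotation. In the Python sketch below, the namespace URI, the class name Reply as a URI, the hasBody predicate, and the example.org URIs are assumptions for illustration.

```python
OAC = "http://www.openannotation.org/ns/"  # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def reply_to(reply, body, annotation):
    """Triples for a Reply: an annotation that targets another annotation."""
    return [
        (reply, RDF_TYPE, OAC + "Annotation"),
        (reply, RDF_TYPE, OAC + "Reply"),
        (reply, OAC + "hasBody", body),
        (reply, OAC + "hasTarget", annotation),
    ]


def replies_to(triples, annotation):
    """Find the resources typed as Reply whose target is the given annotation."""
    typed = {s for s, p, o in triples if p == RDF_TYPE and o == OAC + "Reply"}
    return sorted(
        s for s, p, o in triples
        if p == OAC + "hasTarget" and o == annotation and s in typed
    )


graph = reply_to(
    "http://example.org/annotations/reply-1",
    "http://example.org/comments/body-2",
    "http://example.org/annotations/anno-1",
)
print(replies_to(graph, "http://example.org/annotations/anno-1"))
```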
Annotation Types Example
We'll come back to these …
Inline Information
• It is important to be able to have content contained within the Annotation document for reasons of Client Autonomy:
• Clients may be unable to mint new URIs for every resource• Clients may wish to transmit only a single document
• Third parties can generate new URIs if the client does not
• The W3C has a Content in RDF specification that describes how to do this:
• http://www.w3.org/TR/Content-in-RDF10/
Inline Information: Body
• We introduce a resource identified by a non-resolvable URI, such as a UUID URN, as the Body.
• We then embed the data within the Annotation document using the 'chars' property from the Content in RDF ontology.
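The two steps can be sketched as follows: mint a UUID URN for the Body, then attach the text with the 'chars' property. The `cnt:` and `oac:` namespace URIs, the ContentAsText class, and the hasBody predicate are assumptions here and should be checked against the respective specifications; the annotation URI is hypothetical.

```python
import uuid

OAC = "http://www.openannotation.org/ns/"   # assumed namespace URI
CNT = "http://www.w3.org/2011/content#"     # assumed Content in RDF namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def inline_body(annotation, text):
    """Return (body_urn, N-Triples lines) embedding `text` in the annotation document."""
    body = uuid.uuid4().urn  # non-resolvable identifier of the form urn:uuid:...
    # Escape characters that are not allowed raw inside an N-Triples literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return body, [
        f"<{annotation}> <{OAC}hasBody> <{body}> .",
        f"<{body}> <{RDF_TYPE}> <{CNT}ContentAsText> .",
        f'<{body}> <{CNT}chars> "{escaped}" .',
    ]


body, lines = inline_body(
    "http://example.org/annotations/anno-1",
    'Compare with folio "12r"',
)
print("\n".join(lines))
```

Because the client mints the URN locally, it needs no server to create the Body and can transmit the whole annotation as a single document.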
Inline Body Example
Multiple Targets
• There are many use cases for multiple targets for an Annotation:
• Comparison of two or more resources
• Making a statement that applies to all of the resources
• Showing the provenance of resources
• Making a statement about multiple parts of a resource
• The OAC Data Model allows for multiple targets by simply having more than one hasTarget relationship.
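A short sketch of this, with one hasTarget statement per target. The namespace URI, the hasBody predicate, and the example.org URIs are assumptions for illustration.

```python
OAC = "http://www.openannotation.org/ns/"  # assumed namespace URI


def annotation_with_targets(annotation, body, targets):
    """One hasBody triple plus one hasTarget triple per target, in the given order."""
    triples = [(annotation, OAC + "hasBody", body)]
    triples += [(annotation, OAC + "hasTarget", target) for target in targets]
    return triples


# Hypothetical comparison of two manuscript pages.
comparison = annotation_with_targets(
    "http://example.org/annotations/anno-2",
    "http://example.org/comments/comparison",
    ["http://example.org/manuscripts/a/page-1", "http://example.org/manuscripts/b/page-1"],
)
target_count = sum(1 for _, p, _ in comparison if p == OAC + "hasTarget")
print(target_count)
```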
Multiple Targets Example
Segments of Resources
• Most annotations are about part of a resource
• Different segments for different media types:
• Text: paragraph, arbitrary span of words
• Image: rectangular or arbitrary shaped area
• Audio: start and end time points, track name/number
• Video: area and time points
• Other: slice of a data set, volume in a 3d object, …
Segments of Resources (2)
• Web Architecture Segmentation:
• A URI with a Fragment identifies part of the resource
• Media-specific fragment identifiers, e.g. XPointer for XML
• W3C Media Fragments URI specification for simple segments of media: http://www.w3.org/TR/media-frags/
• We introduce a method of constraining resources:
• Introduce an approach for arbitrarily complex segments that cannot be expressed using Fragments
• Can be applied to Body or Target resource
Segments of Resources: Fragment URIs
• URI Fragments are a syntax for creating subsidiary URIs that identify part of the main resource
• The syntax is defined per media type
• X/HTML: The named anchor or identified element
• http://www.example.net/foo.html#namedSection
• XML: An XPointer to the element(s)
• http://www.example.net/foo.xml#xpointer(/a/b/c)
• PDF: Many options, most relevant two operations:
• http://www.example.net/foo.pdf#page=2&viewrect=20,80,50,60
• Plain Text: Either by character position or line position:
• http://www.example.net/foo.txt#char=0,10
• http://www.example.net/foo.txt#line=1,5
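Separating the main resource from its fragment is generic, while interpreting the fragment depends on the media type. The sketch below uses Python's standard `urllib.parse`; the helper names are hypothetical, and the PDF helper simply treats the fragment as `name=value` pairs joined by `&`, as in the example URI above.

```python
from urllib.parse import parse_qs, urldefrag


def split_fragment(uri):
    """Return (main resource URI, fragment) for any Fragment URI."""
    resource, fragment = urldefrag(uri)
    return resource, fragment


def pdf_parameters(fragment):
    """Interpret a PDF-style fragment such as 'page=2&viewrect=20,80,50,60'."""
    return {name: values[0] for name, values in parse_qs(fragment).items()}


resource, fragment = split_fragment(
    "http://www.example.net/foo.pdf#page=2&viewrect=20,80,50,60"
)
print(resource)
print(pdf_parameters(fragment))
```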
Segments of Resources: Media Fragments
• Media Fragments allow anyone to create URIs that identify part of an image, audio or video resource.
• The most common case is for rectangular areas of images:
• http://www.example.org/image.jpg#xywh=50,100,640,480
• Link to the full resource as well, for all Fragment URIs
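A consumer of such a URI needs both the full resource and the region. The sketch below parses the `xywh` dimension with the Python standard library; it handles the plain form shown above and the optional `pixel:` / `percent:` unit prefix of the Media Fragments specification, with integer values only. The function name is hypothetical.

```python
from urllib.parse import urldefrag


def parse_xywh(uri):
    """Return (full resource URI, unit, (x, y, w, h)), or (uri, None, None) if no xywh."""
    base, fragment = urldefrag(uri)
    for part in fragment.split("&"):
        name, _, value = part.partition("=")
        if name == "xywh":
            unit = "pixel"  # default unit when no prefix is given
            if ":" in value:
                unit, value = value.split(":", 1)
            x, y, w, h = (int(number) for number in value.split(","))
            return base, unit, (x, y, w, h)
    return base, None, None


full, unit, region = parse_xywh("http://www.example.org/image.jpg#xywh=50,100,640,480")
print(full, unit, region)
```

The returned base URI is the full resource that the fragment URI should be linked to; the slide does not name the predicate for that link, so none is shown here.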
Media Fragments Example
Complex Constraints
• The ConstrainedTarget (CT-1) identifies the segment of interest
• The type of description is dependent on the media type and nature of the target resource.
• If a Fragment URI is not possible, we introduce a Constraint to describe the segment of interest
• Media Fragments embed the segment description in the URI
• Constraints are entire resources, so can be more expressive
• Constraints may also describe 'contextual' information
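The relationship between the annotation, the ConstrainedTarget, the full target, and the Constraint can be sketched as four triples. The predicate names constrains and constrainedBy, the namespace URI, and the example.org URIs (including an SVG document standing in for the Constraint) are assumptions for illustration; the slides name only ConstrainedTarget and Constraint.

```python
OAC = "http://www.openannotation.org/ns/"  # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def constrained_target(annotation, constrained, full_target, constraint):
    """Triples linking an annotation to a segment described by a separate Constraint."""
    return [
        (annotation, OAC + "hasTarget", constrained),
        (constrained, RDF_TYPE, OAC + "ConstrainedTarget"),
        (constrained, OAC + "constrains", full_target),      # assumed predicate name
        (constrained, OAC + "constrainedBy", constraint),    # assumed predicate name
    ]


# Hypothetical case: a non-rectangular image region described by an SVG document.
segment = constrained_target(
    "http://example.org/annotations/anno-3",
    "http://example.org/annotations/anno-3/ct-1",
    "http://example.org/images/map.jpg",
    "http://example.org/constraints/region.svg",
)
for triple in segment:
    print(triple)
```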
Constraint Example
Inline Information: Constraints
• We can also use inline information in the same way as for the Body resource to include the Constraint data.
Inline Constraint Example
RDF Constraints
• Instead of having the information in an external document, it could be within the RDF of the Annotation document.
• We attach information to the Constraint node
• The Annotation Ontology models its "Selectors" in this way: http://code.google.com/p/annotation-ontology/
RDF Constraint Example
RDF Constraint plus Media Fragment
Constrained Body
• The body may also be constrained in the same way as Targets
Annotating a Non-Document
• The Target of an Annotation does not have to be a document: can be a Non Information Resource
• Non Information Resource as Target
• Annotations about:
• Concepts
• Physical things
• Locations
• or any other non-document
• Example: A video about a real life painting
• Non Information Resource as Body
• We'll come back to this…
Non Information Resource Example
Annotations and Time
• Web Resources change over time, but keep the same URI
• This is important for linking, but makes Annotation hard. We need to know when the annotation applies to the resource.
• This is true for Body and Target(s).
• The created time is not sufficient, as the Annotation, Body and Target could be created at different times. The Body could be about a previous state of the Target.
Web-Centric Annotation: No Persistence
Google Sidewiki Annotation on http://news.bbc.co.uk/ as of 2010-06-14
Web-Centric Annotation: No Annotations
Archived page from: http://www.dracos.co.uk/work/bbc-news-archive/2010/03/08/07.05.html
Web-Centric Annotation: Desired Cross-Linking
Annotations and Time
• There are three different types of Annotation with respect to Time:
• Timeless Annotations
• These are always relevant, regardless of the current state of the resource.
• Uniform Annotations
• There is a single timestamp at which all of the resources should be considered.
• Varied Annotations
• Each of the resources (Body, Targets) should be considered at a different time.
Timeless Annotations
• The model for a Timeless Annotation is the base model
Uniform Annotations
• If the same time is applicable to all resources, we attach it to the Annotation using the oac:when predicate.
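A minimal sketch of a Uniform Annotation is a single oac:when statement on the Annotation itself. The namespace URI, the xsd:dateTime datatype for the literal, and the example.org URI are assumptions for illustration.

```python
from datetime import datetime, timezone

OAC = "http://www.openannotation.org/ns/"  # assumed namespace URI
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def uniform_when(annotation, when):
    """Return one N-Triples line stating the time that applies to Body and all Targets.

    `when` should be timezone-aware; it is normalized to UTC before formatting.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f'<{annotation}> <{OAC}when> "{stamp}"^^<{XSD_DATETIME}> .'


line = uniform_when(
    "http://example.org/annotations/anno-1",
    datetime(2010, 6, 14, 9, 30, tzinfo=timezone.utc),
)
print(line)
```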
Uniform Annotation Example
Varied Annotations
• If different timestamps are required for each resource, we use oac:when from an oac:TimeConstraint.
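For a Varied Annotation, each resource gets its own oac:TimeConstraint carrying an oac:when. The slide does not show how the TimeConstraint is attached, so the sketch below assumes it hangs off a ConstrainedTarget through constrains and constrainedBy predicates; those names, the namespace URI, the `#ct-N` / `#tc-N` URI pattern, and the example.org URIs are all hypothetical. Only targets are handled, for brevity.

```python
from datetime import datetime, timezone

OAC = "http://www.openannotation.org/ns/"  # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def varied_annotation(annotation, target_times):
    """Return N-Triples lines giving each target its own timestamp.

    target_times maps a target URI to a timezone-aware datetime.
    """
    lines = []
    for index, (target, when) in enumerate(sorted(target_times.items()), start=1):
        constrained = f"{annotation}#ct-{index}"  # hypothetical URI pattern
        constraint = f"{annotation}#tc-{index}"   # hypothetical URI pattern
        stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        lines += [
            f"<{annotation}> <{OAC}hasTarget> <{constrained}> .",
            f"<{constrained}> <{OAC}constrains> <{target}> .",
            f"<{constrained}> <{OAC}constrainedBy> <{constraint}> .",
            f"<{constraint}> <{RDF_TYPE}> <{OAC}TimeConstraint> .",
            f'<{constraint}> <{OAC}when> "{stamp}"^^<{XSD_DATETIME}> .',
        ]
    return lines


lines = varied_annotation(
    "http://example.org/annotations/anno-1",
    {
        "http://example.org/pages/a.html": datetime(2010, 3, 8, 7, 5, tzinfo=timezone.utc),
        "http://example.org/pages/b.html": datetime(2010, 6, 14, 0, 0, tzinfo=timezone.utc),
    },
)
print("\n".join(lines))
```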
Varied Annotation Example
http://www.openannotation.org/
http://groups.google.com/group/oac-discuss