
Post on 19-Jun-2020


Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel


Prototyping Team, Research Library

Los Alamos National Laboratory

http://www.openannotation.org/

http://groups.google.com/group/oac-discuss

OAC is funded by the Andrew W. Mellon Foundation


Outline

• Motivation

• OAC’s Perspective on Annotations

• Design Choices

• Alpha 3 Data Model


Motivation (1)

• Annotations are a core ingredient of scholarship:

• Transcribe and annotate medieval manuscripts;

• Annotate maps with maps;

• Annotate video recordings of endangered languages with non-verbal communication events;

• Annotate 3D museum artifacts;

• Your use cases …


Motivation (2)

• Existing solutions for scholarly annotation are repository-centric, tied to silos:

• Annotation in terms of specific repository and/or collection, no global scope;

• Only consumable in client/server combination that created the annotation;

• Annotations are not shareable beyond the original server, so cross-system services based on (enriched & merged) annotations cannot be created.


Open Annotation Collaboration

• Focus on interoperability for annotations in order to allow sharing of annotations across:

• Annotation clients;

• Content collections;

• Services that leverage annotations.

• Focus on annotation for scholarly purposes, with a desire to make the OAC framework more broadly usable.

• Broader usability is meant to help adoption: tools, communities, and integration of scholarly communication with other areas of discourse.


OAC’s Perspective on Annotations (1)

• The following characterize an annotation:

• There is an author (human, software agent) of an annotation;

• The annotation occurs at some point in time;

• There is a body of an annotation;

• There is a target of an annotation;

• The body annotates the target, i.e. the body is somehow “about” the target.


OAC’s Perspective on Annotations (2)

• Body and target of an annotation can be of any media type.

• A video can annotate a Web page; a Web page can annotate a video.

• This is contrary to the prevailing view in which the body is textual.

• Annotation, body, and target can have different authors.

• This is contrary to the prevailing view in which annotation and body have same authorship.


OAC’s Perspective on Annotations (3)

• Most (scholarly) annotations involve parts of resources (image regions, slices of a video, dimensions of a dataset).

• The annotation framework should provide support for resource segment identification and description.

• A variety of more complex annotations involve multiple targets (and maybe multiple bodies?).

• The annotation framework should support this.

• So far, no compelling use cases for multiple bodies have been identified.


Design Choices (1)

• Interoperability specified in terms of the Architecture of the Web (URI, link, resource, representation, …), Semantic Technologies, Linked Data.

• Aligned with desire to more tightly embed scholarly communication in overall human discourse;

• Aligned with the trend towards a machine-actionable scholarly communication system;

• Aligned with approach we followed in other efforts (OAI-ORE for Aggregations; Memento for versioning).


Design Choices (2)

• Client autonomy.

• Not an annotation protocol (cf. Annotea) but an annotation model combined with a publish/subscribe mechanism;

• No reliance on a server that helps with generation of annotations. Only a server that supports publish/discovery of the annotation created by the client is assumed.


Protocol

publish, subscribe, consume


Publish/Subscribe

publish subscribe consume


Design Choices (3)

• An annotation is an information resource (aka document).

• Earlier versions of the model regarded annotations as conceptual, non-information resources. This approach was related to:

• Ideas to model annotations as events;

• Ideas to model annotations as OAI-ORE Aggregations.

• Community feedback to this approach was negative:

• Added complexity without added value;

• The use of OAI-ORE, and especially its Proxies, was deemed artificial.


Design Choices (4)

• Significant attention to problems related to the ephemeral nature of the Web.

• Representations of URI-addressable resources change over time, resulting in ambiguous or incorrect annotations, as well as annotations that lack (representations of) body and/or target.

• The approach provides hooks to allow:

• Recognizing ambiguous/incorrect annotations, e.g. via timestamp and fixity for body and target;

• Reconstructing the annotation, e.g. via timestamps, scholarly archives, and Memento.
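
The fixity hook can be illustrated with a short sketch. It assumes a SHA-256 digest of the representation is recorded when the annotation is created; the slides do not name a digest algorithm, and the function names `fixity` and `still_matches` are hypothetical.

```python
import hashlib


def fixity(representation: bytes) -> str:
    """Return a hex digest for one representation of a body or target."""
    return hashlib.sha256(representation).hexdigest()


def still_matches(recorded_digest: str, current_representation: bytes) -> bool:
    """True if the resource still serves the representation that was annotated."""
    return fixity(current_representation) == recorded_digest


# Digest recorded at annotation time (sample bytes, not real page content).
recorded = fixity(b"<html>front page, morning edition</html>")

# Later the same URI may serve different content, which makes the annotation suspect.
print(still_matches(recorded, b"<html>front page, morning edition</html>"))
print(still_matches(recorded, b"<html>front page, evening edition</html>"))
```

A mismatch does not say what changed, only that the annotation may no longer apply to the current representation.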


Design Choices (5)

• A model that gracefully builds from simple to complex

• The simple core of the model supports elementary annotation use cases.

• The model becomes gradually more complex to accommodate increasingly complex use cases.


Basic Model

• The basic model has three resources:

• Annotation (an RDF document)

• Default: RDF/XML but others via Content Negotiation

• Body (the comment or text of the annotation)

• Target (the resource the body is about)


Basic Model Example
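
The slide's diagram is not part of this transcript. As a stand-in, here is a minimal sketch of the three-resource model as plain triples. The namespace URI, the `hasBody` predicate name, and all example.org URIs are assumptions for illustration; only `hasTarget` is named in the slides.

```python
# Assumed namespace URI for the OAC vocabulary (illustrative).
OAC = "http://www.openannotation.org/ns/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def basic_annotation(anno_uri, body_uri, target_uri):
    """Return the core triples linking an Annotation to its Body and Target."""
    return [
        (anno_uri, RDF_TYPE, OAC + "Annotation"),
        (anno_uri, OAC + "hasBody", body_uri),      # predicate name assumed
        (anno_uri, OAC + "hasTarget", target_uri),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = basic_annotation(
    "http://example.org/anno/1",            # hypothetical URIs
    "http://example.org/comments/1.html",
    "http://example.org/images/map.jpg",
)
print(to_ntriples(triples))
```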


Additional Relationships and Properties

• Any of the resources can have additional information attached, such as creator, date of creation, title, etc.


Additional Properties Example
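
The diagram for this slide is not in the transcript. The sketch below shows one way to attach creator, creation date, and title to any of the resources. Using Dublin Core Terms for these properties is an assumption, as are the names and example.org URIs.

```python
DCTERMS = "http://purl.org/dc/terms/"


def describe(resource_uri, **properties):
    """Return one (subject, predicate, literal) triple per supplied property."""
    return [
        (resource_uri, DCTERMS + name, value)
        for name, value in properties.items()
    ]


anno = "http://example.org/anno/1"      # hypothetical URIs and values
body = "http://example.org/body/1"

# Annotation and body may have different authors and dates.
triples = (
    describe(anno, creator="Alice", created="2010-06-14")
    + describe(body, creator="Bob", title="A comment on the map")
)
for triple in triples:
    print(triple)
```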


Annotation Types

• There can be further types of Annotation, such as a Reply.

• Example: Replies are Annotations on Annotations.


Annotation Types Example
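
The diagram is not in the transcript. A Reply can be sketched as an annotation whose target is another annotation. The namespace URI, the `hasBody` predicate, the helper names, and the example.org URIs are assumptions.

```python
OAC = "http://www.openannotation.org/ns/"   # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation(uri, body_uri, target_uri, anno_type="Annotation"):
    """Return the triples for an annotation of the given type."""
    return [
        (uri, RDF_TYPE, OAC + anno_type),
        (uri, OAC + "hasBody", body_uri),
        (uri, OAC + "hasTarget", target_uri),
    ]


def replies_to(triples, annotation_uri):
    """Find the Replies whose target is the given annotation."""
    reply_uris = {s for s, p, o in triples if p == RDF_TYPE and o == OAC + "Reply"}
    return sorted(
        s for s, p, o in triples
        if p == OAC + "hasTarget" and o == annotation_uri and s in reply_uris
    )


graph = (
    annotation("http://example.org/anno/1", "http://example.org/body/1",
               "http://example.org/page.html")
    # The Reply targets the first annotation, not the page.
    + annotation("http://example.org/anno/2", "http://example.org/body/2",
                 "http://example.org/anno/1", anno_type="Reply")
)
print(replies_to(graph, "http://example.org/anno/1"))
```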


Annotation Types Example

We'll come back to these …


Inline Information

• It is important to be able to have content contained within the Annotation document for reasons of Client Autonomy:

• Clients may be unable to mint new URIs for every resource

• Clients may wish to transmit only a single document

• Third parties can generate new URIs if the client does not

• The W3C has a Content in RDF specification that describes how to do this:

• http://www.w3.org/TR/Content-in-RDF10/


Inline Information: Body

• We introduce a resource identified by a non-resolvable URI, such as a UUID URN, as the Body.

• We then embed the data within the Annotation document using the 'chars' property from the Content in RDF ontology.


Inline Body Example
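
The diagram is not in the transcript. The sketch below mints a UUID URN for the Body and attaches the text with the 'chars' property. The two namespace URIs, the `hasBody` predicate, and the example.org URIs are assumptions.

```python
import uuid

OAC = "http://www.openannotation.org/ns/"   # assumed namespace URI
CNT = "http://www.w3.org/2011/content#"     # assumed Content in RDF namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def inline_body_annotation(anno_uri, target_uri, text):
    """Return (body_urn, triples) with the body text carried inline."""
    body_urn = uuid.uuid4().urn  # non-resolvable identifier, "urn:uuid:..."
    triples = [
        (anno_uri, RDF_TYPE, OAC + "Annotation"),
        (anno_uri, OAC + "hasBody", body_urn),
        (anno_uri, OAC + "hasTarget", target_uri),
        (body_urn, CNT + "chars", text),    # the embedded content
    ]
    return body_urn, triples


body_urn, triples = inline_body_annotation(
    "http://example.org/anno/1",            # hypothetical URIs
    "http://example.org/page.html",
    "This paragraph needs a citation.",
)
print(body_urn)
```

The client only has to transmit this one document; a third party could later replace the URN with a resolvable URI.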


Multiple Targets

• There are many use cases for multiple targets for an Annotation:

• Comparison of two or more resources

• Making a statement that applies to all of the resources

• Showing the provenance of resources

• Making a statement about multiple parts of a resource

• The OAC Data Model allows for multiple targets by simply having more than one hasTarget relationship.


Multiple Targets Example
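
The diagram is not in the transcript. A comparison of two resources can be sketched by emitting one hasTarget triple per target. The namespace URI, the `hasBody` predicate, and the example.org URIs are assumptions.

```python
OAC = "http://www.openannotation.org/ns/"   # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_with_targets(anno_uri, body_uri, target_uris):
    """Return triples for one annotation with any number of targets."""
    triples = [
        (anno_uri, RDF_TYPE, OAC + "Annotation"),
        (anno_uri, OAC + "hasBody", body_uri),
    ]
    # Multiple targets are simply repeated hasTarget relationships.
    triples += [(anno_uri, OAC + "hasTarget", t) for t in target_uris]
    return triples


triples = annotation_with_targets(
    "http://example.org/anno/1",                    # hypothetical URIs
    "http://example.org/comparison.html",
    ["http://example.org/manuscript-a.jpg", "http://example.org/manuscript-b.jpg"],
)
targets = [o for s, p, o in triples if p == OAC + "hasTarget"]
print(targets)
```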


Segments of Resources

• Most annotations are about part of a resource

• Different segments for different media types:

• Text: paragraph, arbitrary span of words

• Image: rectangular or arbitrary shaped area

• Audio: start and end time points, track name/number

• Video: area and time points

• Other: slice of a data set, volume in a 3D object, …


Segments of Resources (2)

• Web Architecture Segmentation:

• A URI with a Fragment identifies part of the resource

• Media-specific fragment identifiers, e.g. XPointer for XML

• W3C Media Fragments URI specification for simple segments of media: http://www.w3.org/TR/media-frags/

• We introduce a method of constraining resources:

• Introduce an approach for arbitrarily complex segments that cannot be expressed using Fragments

• Can be applied to Body or Target resource


Segments of Resources: Fragment URIs

• URI Fragments are a syntax for creating subsidiary URIs that identify part of the main resource

• The syntax is defined per media type

• X/HTML: The named anchor or identified element

• http://www.example.net/foo.html#namedSection

• XML: An XPointer to the element(s)

• http://www.example.net/foo.xml#xpointer(/a/b/c)

• PDF: Many options; the two most relevant operations are page and viewrect:

• http://www.example.net/foo.pdf#page=2&viewrect=20,80,50,60

• Plain Text: Either by character position or line position:

• http://www.example.net/foo.txt#char=0,10

• http://www.example.net/foo.txt#line=1,5
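
The plain-text case can be sketched in a few lines. The sketch assumes the RFC 5147 reading, in which positions count the gaps between characters or lines, so `char=0,10` selects the first ten characters. It handles only the simple two-number form shown in the URIs above, and the function name `select_text` is hypothetical.

```python
from urllib.parse import urldefrag


def select_text(uri, text):
    """Return the part of `text` identified by a char= or line= fragment."""
    _, fragment = urldefrag(uri)
    scheme, _, positions = fragment.partition("=")
    start, end = (int(n) for n in positions.split(","))
    if scheme == "char":
        return text[start:end]
    if scheme == "line":
        # Keep line endings so the selected lines are returned unchanged.
        return "".join(text.splitlines(keepends=True)[start:end])
    raise ValueError(f"unsupported fragment: {fragment!r}")


sample = "first line\nsecond line\nthird line\n"
print(select_text("http://www.example.net/foo.txt#char=0,10", sample))
print(select_text("http://www.example.net/foo.txt#line=1,5", sample))
```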


Segments of Resources: Media Fragments

• Media Fragments allow anyone to create URIs that identify part of an image, audio or video resource.

• The most common case is for rectangular areas of images:

• http://www.example.org/image.jpg#xywh=50,100,640,480

• Link to the full resource as well, for all Fragment URIs


Media Fragments Example
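
The diagram is not in the transcript. The sketch below splits an `xywh` fragment URI into the full resource URI and the rectangle, then links the fragment back to the full resource. It handles only the plain pixel form, and using `dcterms:isPartOf` for the link is an assumption, as are the function names.

```python
from urllib.parse import urldefrag

DCTERMS = "http://purl.org/dc/terms/"


def parse_xywh(uri):
    """Return (full_resource_uri, (x, y, width, height)) for an xywh fragment URI."""
    base, fragment = urldefrag(uri)
    if not fragment.startswith("xywh="):
        raise ValueError("no xywh fragment in " + uri)
    x, y, w, h = (int(n) for n in fragment[len("xywh="):].split(","))
    return base, (x, y, w, h)


def link_to_full_resource(fragment_uri):
    """Return a triple tying the fragment URI to the resource it is part of."""
    base, _ = parse_xywh(fragment_uri)
    return (fragment_uri, DCTERMS + "isPartOf", base)   # predicate choice assumed


target = "http://www.example.org/image.jpg#xywh=50,100,640,480"
print(parse_xywh(target))
print(link_to_full_resource(target))
```

The explicit link lets a consumer find annotations on the whole image without parsing fragment syntax.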


Media Fragments Example


Complex Constraints

• The ConstrainedTarget (CT-1) identifies the segment of interest

• The type of description is dependent on the media type and nature of the target resource.

• If a Fragment URI is not possible, we introduce a Constraint to describe the segment of interest

• Media Fragments embed the segment description in the URI

• Constraints are entire resources, so can be more expressive

• Constraints may also describe 'contextual' information


Constraint Example
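
The diagram is not in the transcript. One plausible wiring is sketched below: the annotation targets a ConstrainedTarget node, which points at the full resource and at a separate Constraint resource describing the segment. The namespace URI, the `hasBody`, `constrains`, and `constrainedBy` predicate names, and the example.org URIs are assumptions.

```python
OAC = "http://www.openannotation.org/ns/"   # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def constrained_annotation(anno_uri, body_uri, target_uri, ct_uri, constraint_uri):
    """Return triples for an annotation on a segment described by a Constraint."""
    return [
        (anno_uri, RDF_TYPE, OAC + "Annotation"),
        (anno_uri, OAC + "hasBody", body_uri),
        (anno_uri, OAC + "hasTarget", ct_uri),          # not the full resource
        (ct_uri, RDF_TYPE, OAC + "ConstrainedTarget"),
        (ct_uri, OAC + "constrains", target_uri),       # the full resource
        (ct_uri, OAC + "constrainedBy", constraint_uri),
        (constraint_uri, RDF_TYPE, OAC + "Constraint"),
    ]


triples = constrained_annotation(
    "http://example.org/anno/1",                # hypothetical URIs
    "http://example.org/body/1",
    "http://example.org/images/map.jpg",
    "http://example.org/ct/1",
    "http://example.org/regions/coastline.svg", # e.g. an SVG path for the region
)
for triple in triples:
    print(triple)
```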


Inline Information: Constraints

• We can also use inline information in the same way as for the Body resource to include the Constraint data.


Inline Constraint Example


RDF Constraints

• Instead of having the information in an external document, it could be within the RDF of the Annotation document.

• We attach information to the Constraint node

• The Annotation Ontology models its "Selectors" in this way

http://code.google.com/p/annotation-ontology/


RDF Constraint Example


RDF Constraint plus Media Fragment


Constrained Body

• The body may also be constrained in the same way as Targets


Constrained Body


Annotating a Non-Document

• The Target of an Annotation does not have to be a document; it can be a Non-Information Resource

• Non-Information Resource as Target

• Annotations about:

• Concepts

• Physical things

• Locations

• or any other non-document

• Example: A video about a real-life painting

• Non-Information Resource as Body

• We'll come back to this…


Non Information Resource Example


Annotations and Time

• Web Resources change over time, but keep the same URI

• This is important for linking, but makes Annotation hard. We need to know when the annotation applies to the resource.

• This is true for Body and Target(s).

• The created time is not sufficient, as the Annotation, Body and Target could be created at different times. The Body could be about a previous state of the Target.


Web-Centric Annotation: No Persistence

Google Sidewiki Annotation on http://news.bbc.co.uk/ as of 2010-06-14


Web-Centric Annotation: No Annotations

Archived page from: http://www.dracos.co.uk/work/bbc-news-archive/2010/03/08/07.05.html


Web-Centric Annotation: Desired Cross-Linking


Annotations and Time

• There are three different types of Annotation with respect to Time:

• Timeless Annotations

• These are always relevant, regardless of the current state of the resource.

• Uniform Annotations

• There is a single timestamp at which all of the resources should be considered.

• Varied Annotations

• Each of the resources (Body, Targets) should be considered at a different time.


Timeless Annotations

• The model for a Timeless Annotation is the base model


Uniform Annotations

• If the same time is applicable to all resources, we attach it to the Annotation using the oac:when predicate.


Uniform Annotation Example
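
The diagram is not in the transcript. A Uniform Annotation can be sketched as the basic model plus a single oac:when timestamp on the Annotation. The namespace URI, the `hasBody` predicate, the ISO 8601 timestamp format, and the example.org URIs are assumptions.

```python
from datetime import datetime, timezone

OAC = "http://www.openannotation.org/ns/"   # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uniform_annotation(anno_uri, body_uri, target_uri, when):
    """Return triples for an annotation whose resources share one timestamp."""
    stamp = when.astimezone(timezone.utc).isoformat()
    return [
        (anno_uri, RDF_TYPE, OAC + "Annotation"),
        (anno_uri, OAC + "hasBody", body_uri),
        (anno_uri, OAC + "hasTarget", target_uri),
        (anno_uri, OAC + "when", stamp),    # applies to body and target alike
    ]


triples = uniform_annotation(
    "http://example.org/anno/1",            # hypothetical URIs and time
    "http://example.org/body/1",
    "http://news.bbc.co.uk/",
    datetime(2010, 6, 14, 12, 0, tzinfo=timezone.utc),
)
print(triples[-1])
```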


Varied Annotations

• If different timestamps are required for each resource, we use oac:when from an oac:TimeConstraint.


Varied Annotation Example
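
The diagram is not in the transcript. One plausible reading is sketched below: each resource is wrapped in a constrained node whose oac:TimeConstraint carries its own oac:when. The namespace URI, the wrapper classes, the `hasBody`, `constrains`, and `constrainedBy` predicates, the UUID URNs for intermediate nodes, and the sample timestamps are all assumptions.

```python
import uuid

OAC = "http://www.openannotation.org/ns/"   # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def time_constrained(resource_uri, when, wrapper_class):
    """Wrap a resource in a node constrained to one point in time."""
    wrapper = uuid.uuid4().urn
    constraint = uuid.uuid4().urn
    return wrapper, [
        (wrapper, RDF_TYPE, OAC + wrapper_class),
        (wrapper, OAC + "constrains", resource_uri),
        (wrapper, OAC + "constrainedBy", constraint),
        (constraint, RDF_TYPE, OAC + "TimeConstraint"),
        (constraint, OAC + "when", when),
    ]


anno = "http://example.org/anno/1"          # hypothetical URIs
body, body_triples = time_constrained(
    "http://example.org/body/1", "2010-06-14T09:00:00Z", "ConstrainedBody")
target, target_triples = time_constrained(
    "http://news.bbc.co.uk/", "2010-03-08T07:05:00Z", "ConstrainedTarget")

triples = [
    (anno, RDF_TYPE, OAC + "Annotation"),
    (anno, OAC + "hasBody", body),
    (anno, OAC + "hasTarget", target),
] + body_triples + target_triples

whens = sorted(o for s, p, o in triples if p == OAC + "when")
print(whens)
```

Here the body is read as of one date while the target is read as of an earlier state of the page.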


http://www.openannotation.org/

http://groups.google.com/group/oac-discuss
