open annotation: annotating high energy physics on the web

52
Open Annotation AAHEP5, 22 nd of September 2011, Ithaca, NY 1 Open Annotation: Annotating High Energy Physics on the Web Robert Sanderson [email protected] Los Alamos National Laboratory @azaroth42 Herbert Van de Sompel [email protected] Los Alamos National Laboratory @hvdsomp http://www.openannotation.org/ This research is funded, in part, by the Andrew W. Mellon Foundation

Upload: robert-sanderson

Post on 11-May-2015

922 views

Category:

Technology


2 download

DESCRIPTION

Presentation given at AAHEP5 (Cornell University) on Sept 22nd, 2011 about the use of Open Annotation.

TRANSCRIPT

Page 1: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

1

Open Annotation: Annotating High Energy Physics on the Web

Robert Sanderson [email protected] Los Alamos National Laboratory @azaroth42

Herbert Van de Sompel [email protected] Los Alamos National Laboratory @hvdsomp

http://www.openannotation.org/

This research is funded, in part, by the Andrew W. Mellon Foundation

Page 2: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

2

Overview

•  Motivation

•  Open Annotation Model •  Basics •  Segments •  Resources Changing over Time •  Machine Annotations

•  Network Model

•  Quick Demo

Page 3: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

3

Scholarly Communication

Scholarly Communication is increasingly: •  Online •  Open •  Distributed •  Collaborative •  Data-Oriented

Annotation is a scholarly primitive, spanning discipline and level.

Need to ensure that Digital Annotations fall under these headings!

•  Apply the standards and architecture of the World Wide Web to the Annotation use case. •  Even if scholar doesn’t share annotations with others, she will want to access them from different tools and environments.

Page 4: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

4

Open Annotation Collaboration

http://www.openannotation.org/

Focus on interoperable sharing of annotations •  Web-centric and open, not locked down silos •  Create, consume and interact in different environments •  Build from a simple model for simple cases, to more detailed for complex scholarly annotation requirements

The Collaboration: •  Los Alamos National Laboratory •  University of Illinois at Urbana-Champaign •  University of Queensland •  University of Maryland •  George Mason University •  plus many adopters and partners

Page 5: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

5

Open Annotation Data Model

Design Guidelines: •  Based on the Architecture of the World Wide Web •  … and on Linked Open Data •  Should be general enough for ease of adoption •  … and rich enough to cover scholarly use cases

Status: Beta, with 9 ongoing funded experiments to inform 1.0

Hardest part: Define what an Annotation is! •  "Aboutness" is key to distinguish from general metadata

A document that describes how one resource is about one or more other resources, or part(s) thereof.

Page 6: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

6

Basic Model

The basic model has three resources: •  Annotation (an RDF document) •  Body (the ‘comment’ of the annotation) •  Target (the resource the Body is ‘about’)

Page 7: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

7

Basic Model Example

Page 8: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

8

Additional Relationships and Properties

Any of the resources can have additional information attached, such as creator, date of creation, title, etc.

Page 9: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

9

Additional Properties Example

Page 10: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

10

Annotation Types

There can be further types of Annotation, such as a Reply. Example: Replies are Annotations on Annotations.

Page 11: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

11

Annotation Types Example

Page 12: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

12

Inline Information

It is important to be able to have content contained within the Annotation document for Client Autonomy:

•  Clients may be unable to mint new URIs for every resource •  Clients may wish to transmit only a single document •  Third parties can generate new URIs if the client does not

The W3C has a Content in RDF specification: •  http://www.w3.org/TR/Content-in-RDF10/

Page 13: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

13

Inline Information: Body

•  We introduce a resource identified by a non resolvable URI, such as a UUID URN, as the Body. •  We then embed the data within the Annotation document using 'chars’ from Content in RDF.

Page 14: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

14

Inline Body Example

Page 15: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

15

Multiple Targets

There are many use cases for multiple targets for an Annotation: •  Comparison of two or more resources •  Making a statement that applies to all of the resources •  Making a statement about multiple parts of a resource

The OAC Data Model allows for multiple targets by simply having more than one hasTarget relationship.

Page 16: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

16

Multiple Targets Example

Page 17: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

17

Segments of Resources

Most annotations are about part of a resource

Different segments for different media types:

•  Text: paragraph, arbitrary span of words •  Image: rectangular or arbitrary shaped area •  Audio: start and end time points, track name/number •  Video: area and time points •  Other: slice of a data set, volume in a 3d object, …

Page 18: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

18

Segments of Resources

Web Architecture Segmentation:

•  A URI with a Fragment identifies part of the resource: •  IETF Mime-type fragment identifiers; eg xpointer •  W3C Media Fragments URI specification for simple segments of media: http://www.w3.org/TR/media-frags/

We introduce a method of constraining resources:

•  Introduce an approach for arbitrarily complex segments that cannot be expressed using Fragments •  Can be applied to Body or Target resource

Page 19: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

19

Segments of Resources: IETF Fragment URIs

URI Fragments are a syntax for creating subsidiary URIs that identify part of the main resource

The syntax is defined per media type:

•  X/HTML: The named anchor or identified element

•  XML: An XPointer to the element(s)

•  PDF: Many options, especially page and viewrect

•  Plain Text: Either by character position or line position

Page 20: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

20

Segments of Resources: W3C Media Fragments

Media Fragments allow anyone to create URIs that identify part of an image, audio or video resource.

The most common case is for rectangular areas of images: •  http://www.example.org/image.jpg#xywh=50,100,640,480

Link to the full resource as well, for all Fragment URIs

Page 21: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

21

Media Fragments Example

Page 22: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

22

Complex Constraints

Fragment URIs are not always possible •  Introduce a Constraint that describes the segment of interest •  And a ConstrainedTarget that identifies the segment of interest •  Constraints are entire resources, so can be more expressive •  Constraints may also describe 'contextual' information

Page 23: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

23

Constraint Example

Page 24: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

24

RDF Constraints

Instead of having the information in an external document, it could be within the RDF of the Annotation document.

•  We can attach information to the Constraint node

•  Or use the Content in RDF specification to include what would have been in the external document

Page 25: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

25

RDF Constraint Example

Page 26: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

26

Constrained Body

The Body may also be constrained in the same way as Targets

Page 27: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

27

Web-Centric Annotation: No Persistence

Google Sidewiki Annotation on http://news.bbc.co.uk/ as of 2010-06-14

2

Page 28: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

28

Web-Centric Annotation: No Annotations

Archived page from: http://www.dracos.co.uk/work/bbc-news-archive/2010/03/08/07.05.html

2

Page 29: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

29

Web-Centric Annotation: Desired Cross-Linking

2

Page 30: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

30

Annotations and Time

There are three different types of Annotation with respect to Time:

•  Timeless Annotations •  These are always relevant, regardless of the current state of the resources.

•  Uniform Annotations •  There is a single timestamp at which all of the resources should be considered.

•  Varied Annotations •  Each of the resources (Body, Targets) should be considered at a different time.

3

Page 31: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

31

Timeless Annotations

The model for a Timeless Annotation is the base model

3

Page 32: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

32

Uniform Annotations

If the same time is applicable to all resources, we attach it to the Annotation using the oac:when predicate.

3

Page 33: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

33

Uniform Annotation Example

3

Page 34: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

34

Varied Annotations

If different timestamps are required for each resource, we use oac:when from an oac:WebTimeConstraint.

3

Page 35: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

35

Varied Annotation Example

3

Page 36: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

36

Annotations for/by Machines?

•  The Body consists of one or more Statements •  Human understandable: Text, Image, Video, Symbols, … •  Machine understandable: Data

•  Humans can infer relationships and context, Machines cannot •  Need to be as explicit as possible •  Need structured information

•  When would we do this? •  Nano-Publications: publication of data for further processing •  Simple Data associations •  Semantic annotations

3

Page 37: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

37

Human Annotation of Human Annotation

3

Page 38: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

38

Human Annotation of Human Annotation

3

But what about this?

Page 39: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

39

Simple Data Association

3

Equivalent to 12 people saying: "I Like This"

I = ex:User Like = fb:likes

This = ex:Anno

Page 40: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

40

Human Language Annotation

4

Page 41: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

41

Machine Language Annotation?

4

Page 42: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

42

Transcription for Humans: Image to Text

4

Page 43: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

43

Transcription for Machines: Single Named Entity

4

Page 44: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

44

Transcription for Machines: Entity Relationships

4

Page 45: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

45

Advantages of this Approach

4

•  Only uses existing OAC constructions •  No new ‘isTopicOf’ or similar shortcut relationships •  Creator of Body is not confused with Creator of Annotation!

•  Aligns very closely with human annotation practices

•  Consistent model that scales from resource to part of resource •  Can annotate data extracted at most appropriate level •  Could extract from sentence/paragraph/section/entire text

•  Consistent model that allows association of any amount of data: •  From Single Entity •  To scholarly discourse extraction from entire document

Page 46: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

46

Annotation Protocols

Protocol: publish, subscribe, consume tied together

4

Unlike previous systems, Open Annotation does not mandate a protocol.

No reliance on a client/server combination gives the client autonomy.

Instead we promote a publish/subscribe methodology, where annotations may be stored and consumed from anywhere.

Page 47: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

47

Publish/Subscribe Method

publish

4

We don’t specify how this transfer should occur

Page 48: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

48

publish subscribe

4

Publish/Subscribe Method

Nor this.

Page 49: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

49

publish subscribe consume

4

Publish/Subscribe Method

Nor this.

Page 50: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

50

5

Publish/Subscribe Advantages

•  Client can use most appropriate method for transferring annotation to storage service •  May already be specified in certain domains •  Can use existing services without requiring direct adoption

•  Annotations are web resources in their own right •  Can be protected for limited access using existing technology •  Have their own URIs by necessity, not good-will of service

•  Promotes a market-place of services •  Archiving Annotations and resources for preservation •  Enriching with additional metadata and information •  Aggregation and curation to provide trusted annotation feeds

Page 51: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

51

5

Demo!

http://www.shared-canvas.org/impl/demo3/

•  Took the PDF of Dirac’s thesis on Quantum Mechanics and split into individual page images

•  Allow transcription by annotation, and commentary by annotation •  Allows storage at different services, both public and private

Page 52: Open Annotation: Annotating High Energy Physics on the Web

Open Annotation AAHEP5, 22nd of September 2011, Ithaca, NY

52

Thank You

Robert Sanderson [email protected] [email protected] @azaroth42

Web: http://www.openannotation.org/ Paper: http://arxiv.org/abs/1003.2643

Slides: http://slidesha.re/……