ctm document - open-std.org · type value scope? reifier? • can be – either the ”kind” of...

46

Upload: others

Post on 28-May-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

WG3, Montreal, 2006-08-11

CTM document

• Consists of a header and a body

• Header

– required and optional declarations

• Body

– assertion blocks

– verbose associations

• White space

– (mostly) not significant, except inside strings

• Comments

– may occur anywhere white-space is allowed

WG3, Montreal, 2006-08-11

Comments

• One line comments

# This is a comment. It ends at the end of the line

• Multiline comments

/*

This is also a comment.

It continues across

multiple lines.

*/

• Issue: Do we need multiline comments?

WG3, Montreal, 2006-08-11

Topic identifiers

• Can be one of the following:

• Notes:

– An ID is like an XML Name (except that colon is explicitly not allowed)

– HTTP IRIs get special status, in order to encourage their use as identifiers

– A local ID becomes an item identifier; a QName becomes a subject identifier

http://en.wikipedia.org/wiki/PucciniHTTP-IRIsubject identifier

<urn:x-myns:music:puccini>"<" IRI ">"

=http://www.puccini.it/”=” HTTP-IRIsubject locator

=<ftp://example.com/opera/tosca/synopsis.txt>”=” "<" IRI ">"

wikipedia:PucciniID ":" IDQName

PucciniIDlocal ID

ExampleFormName

WG3, Montreal, 2006-08-11

Datatypes (1 of 3)

• All XML Schema data types supported, but some are built-in, i.e. automatically recognized by the CTM parser

• String

– Same as xsd:string

– Delimited by quotes or triple quotes

"A string containing \"quote\" marks"

"""Another string containing "quote" marks"""

"""Quote marks at the end of a string must always

be "escaped\""""

• Issue: Do we need triple quotes?

WG3, Montreal, 2006-08-11

Datatypes (2 of 3)

• IRI

– Same as xsd:anyURI. Delimited by ”<” and ”>” unless HTTP IRI

<http://en.wikipedia.org/wiki/>

http://en.wikipedia.org/wiki/

<urn:x-myns:music:puccini>

<ftp://example.com/opera/tosca/synopsis.txt>

• Date

– Same as xsd:date

1858-12-22

• Integer

– Same as xsd:integer

42

WG3, Montreal, 2006-08-11

Datatypes (3 of 3)

• Decimal

– Same as xsd:decimal

+42.0

• Other datatypes expressed as strings

– Qualified using ^^

"2A" ^^xsd:hexBinary

"P1Y2M3DT10H30M" ^^xsd:duration

"12-22" ^^xsd:gMonthDay

WG3, Montreal, 2006-08-11

Header

• Encoding directive (opt)

– Specifies the character encoding:

%encoding "Shift-JIS"

• Version directive (opt)

– Specifies the notation and version number:

%version ctm 1.0

• Prefix declarations (opt)

– Specify a URI prefix:

%prefix w <http://en.wikipedia.org/wiki/>

• Templates (opt)

• Issue: Should templates be allowed in the body as well?

WG3, Montreal, 2006-08-11

Templates (1 of 3)

• Currently three kinds:

– name, occurrence, association

• Name

– Tells parser to interprete typing topic as a name type and allows

specification of scope

%name foaf:name # foaf:name is a name type

%name country-code @iso639 # all names of this type

# are scoped by iso639

• Issue: Is scope on a name type bogus?

WG3, Montreal, 2006-08-11

Templates (2 of 3)

• Occurrence templates

– Tells parser to interprete typing topic as an occurrence type and

(optionally) the datatype and/or scope

%occur bio:birthYear ^^xsd:gYear

%occur geo:lat_long @deprecated

• Issue: Is scope on an occurrence type bogus?

WG3, Montreal, 2006-08-11

Templates (3 of 3)

• Association templates

– Allows role types to be omitted and scope to be specified globally

# binary (with scope)

%assoc person bio:born-in place @bio

# ternary

%assoc victim bio:killed-by killer how

# unary

%assoc work opera:unfinished

• Issue: Is scope on an association type bogus?

WG3, Montreal, 2006-08-11

Document body

• Consists of

– assertion blocks

– verbose associations

• Corresponds to (topic | association)* in XTM

– Except that associations can also be specified within an assertion

block (see below)

• Verbose associations

– So named because they do not exploit the compactness offered by

templates (see below)

WG3, Montreal, 2006-08-11

Assertion blocks

• Consist of assertions about a single topic (referred to as the subject)

– Much like <topic> elements, except for ability to include

associations

– Assertions are delimited by semicolons

– Assertion blocks are terminated by a period or an empty line

subject # the subject of all assertions in this block

assertion-1 ; # note the semicolon

assertion-2 ;

[...]

assertion-n . # note the period

• Issue: Should we call these topic blocks (or just topics)?

WG3, Montreal, 2006-08-11

Assertion blocks (cont.)

• White space not significant, except

– Empty line terminates a block

– White space required before a period (hmm...)

subject

assertion-1 ; # space before semicolon

assertion-2; # no space before semicolon

assertion-3 # period omitted; replaced by empty line

# multiple assertions on one line:

subject

assertion-1; assertion-2; assertion-3; assertion-4 .

# multiple blocks on one line:

subject assertion-1 . subject2 assertion-2 .

WG3, Montreal, 2006-08-11

Assertions

• Can be either identifier assignments or statements

– assignment of subject identifier

– assignment of subject locator

– name (or list of names of the same type)

– occurrence (or list of occurrences of the same type)

– association (or list of associations of the same type)

• General form of all assertions is

type value scope? reifier?

– However, in order to achieve compactness, some assertions are abbreviated

– Note how subject-type-value corresponds closely to subject-verb-object (SVO)

• Issue: Would an SOV variant syntax be of use in Korea and Japan?

WG3, Montreal, 2006-08-11

type value scope? reifier?

• Can be

– either the ”kind” of identifier (subject identifier or subject locator)

– or the TMDM [type] of the statement

• When omitted implies that the assertion is either

– a subject identifier assignment (if the value is a URI)

tosca w:Tosca .

– a tm:name-type name (if the value is a string)

tosca "Tosca" .

• Otherwise must be a topic identifier or "="

– The latter can be thought to signify a subject locator

tosca premiere-date 1900-01-14 .

csgp-homepage =<http://www.puccini.it/> .

WG3, Montreal, 2006-08-11

type value scope? reifier?

• A topic identifier or a datatyped value

– Can be a single value

tosca premiere-date 1900-01-14 ;

composed-by puccini .

– Or a comma-separated list

tosca m:libretto-by illica, giacosa .

which is equivalent to

tosca m:libretto-by illica ;

m:libretto-by giacosa .

WG3, Montreal, 2006-08-11

type value scope? reifier?

• A @ followed one or more topic identifiers

– Must be separated by white space (not commas)

opera "Opera" ; "Oopera" @fi ;

dc:description """A drama set to music; consists of...

overture and interludes."""

@wordnet ,

"""Opera refers to a dramatic art form...

through the lyrics.""" @wikipedia en ,

"""L'opera lirica è un genere...ed

il canto.""" @wikipedia it .

WG3, Montreal, 2006-08-11

type value scope? reifier?

• A ~ followed by a topic identifier

tosca "Tosca"

takes-place-in rome ~ tosca-in-rome .

tosca-in-rome "The Setting of Tosca in Rome" ;

bibref """Nicassio, Susan Vandiver:

"Tosca's Rome: The Play and the Opera

in Historical Perspective",

University of Chicago Press

(Chicago, 2002)""" .

• Issue: How to reify the topic map itself (see below)?

WG3, Montreal, 2006-08-11

Names

• Default name type inferred if no type component

puccini "Puccini, Giacomo" .

• Specified name type must be declared using a template:

– (otherwise the assertion will be interpreted as an occurrence) ...

%name foaf:name

puccini foaf:name "Giacomo Puccini" .

– ... unless a name flag is present:

puccini %name foaf:name "Giacomo Puccini" .

– Flags allow defaults and template information to be overridden, and

templates to be omitted (e.g. in a simple TMQL update)

• Issue: What would be a good syntax for this flag?

WG3, Montreal, 2006-08-11

Variants

• Specified in parens after the base name

boito

"Boïto, Arrigo" ( "boito, arrigo" @sort ) .

tchaikovsky

"Tchaikovsky"

( "Tsjaikovski" @search )

( "Tschaikovski" @search ) .

WG3, Montreal, 2006-08-11

Occurrences

• Type must be specified

– Value can be any datatyped value

puccini

article http://en.wikipedia.org/wiki/Giacomo_Puccini ; # IRI

description "The greatest of the verismo composers" ; # string

date-of-birth 1858-12-22 ; # date

bibref """Budden, Julian: "Puccini: His Life and Works",

Oxford University Press (Oxford, 2002)""" ; # string

xml-bibref # XML

"""<bibitem id="budden"><bib>Budden</bib>

<pub>Budden, Julian:

<highlight style="ital">Puccini: His Life and Works</highlight>,

Oxford University Press (Oxford, 2006)</pub>

</bibitem>""" ^^xsd:anyType .

• Issue: Can quotes be avoided for XML values?

WG3, Montreal, 2006-08-11

Associations (1 of 3)

• Verbose associations are specified outside assertion blocks

– Correspond to <association> elements in XTM

– Templates are not required (or can be overridden)

– Semicolon delimited role-type/role-player pairs

born-in( person puccini ; place lucca ) .

unfinished( work turandot ) .

killed-by( victim scarpia; killer tosca; how stabbing) .

• Reifying roles

– Roles can only be reified in a verbose association:

pupil-of( pupil puccini ~ puccini-pupil-role ;

teacher angeloni ) .

WG3, Montreal, 2006-08-11

Associations (2 of 3)

• Compact associations are specified inside assertion blocks

– Must have corresponding templates; allow role types to be omitted

%assoc person born-in place # binary

puccini born-in lucca .

= born-in( person puccini ; place lucca ) .

%assoc work unfinished # unary

turandot unfinished .

= unfinished( work turandot ) .

%assoc victim killed-by killer how # ternary

scarpia killed-by tosca stabbing .

= killed-by( victim scarpia; killer tosca; how stabbing) .

WG3, Montreal, 2006-08-11

Associations (3 of 3)

• Omitted roles

– It is possible to omit roles specified by the template in an instance

of an association type

– In the example, there is no role player for the role of "killer"

%assoc victim killed-by killer how

cavaradossi killed-by %role shooting .

• Built-in association types

– type-instance and supertype-subtype have shorthands

puccini ISA composer .

composer AKO person .

• Issues: Is AKO necessary? Are the keywords ISO and AKO acceptable?

Steve PepperConvenor SC34/SG3Coordinator RDFTM Task Force

[email protected]

WG3, Montreal, 2006-08-11

Some additional thoughts

WG3, Montreal, 2006-08-11

CTM as surface syntax?

• bio:killed-by( victim scarpia; killer tosca; how stabbing

%assoc victim is-killed-by killer how

is-killed-by bio:killed-by; "Killed by” .

scarpia is-killed-by tosca stabbing .

WG3, Montreal, 2006-08-11

CTM as surface syntax?

• bio:killed-by( victim scarpia; killer tosca; how stabbing)

%ignore by

%assoc victim is-killed-by killer how

%assoc killer kills victim (by) how

is-killed-by bio:killed-by; "Killed by” .

kills bio:killed-by .

scarpia is-killed-by tosca stabbing .

tosca kills scarpia by stabbing .

WG3, Montreal, 2006-08-11

CTM as surface syntax?

• bio:killed-by( victim scarpia; killer tosca; how stabbing)

%ignore by

%assoc victim is-killed-by killer how

%assoc killer kills victim by how

%assoc killer commits-suicide %self victim by how

is-killed-by bio:killed-by; "Killed by” .

kills bio:killed-by .

commits-suicide bio:killed-by.

scarpia is-killed-by tosca stabbing .

tosca kills scarpia by stabbing .

tosca commits-suicide by stabbing .

WG3, Montreal, 2006-08-11

CTM as surface syntax?

• bio:killed-by( victim scarpia; killer tosca; how stabbing)

%ignore by

%assoc victim is-killed-by killer how

%assoc killer kills victim by how

%assoc killer commits-suicide %self victim by how

is-killed-by bio:killed-by; "Killed by” .

kills bio:killed-by .

commits-suicide bio:killed-by.

scarpia is-killed-by tosca stabbing .

tosca kills scarpia by stabbing .

tosca commits-suicide by stabbing .

# using CSS-style multiple subjects

kills, is-killed-by, commits-suicide { bio:killed-by }

WG3, Montreal, 2006-08-11

SOV example (Korean)

• 우리 는 대한민국 입니다

– uri neun te-han-min-guk imnida

we [sub] Republic of Korea are

%version ctm 1.0 ov

%ignore 는

%assoc subject object 입니다

우리 는 대한민국 입니다 .

입니다 ( subject 우리 ; object 대한민국 ) .

are( subject we ; object RepublicOfKorea ) .

WG3, Montreal, 2006-08-11

Indentation for hierarchies?

[

animal

mammal

canine

dog

wolf

feline

tiger

lion

reptile

snake

neo-con

insect

]

%assoc topic subtopicOf subtopic

%hierarchy subtopicOf [

arts

movies

television

music

...

business

jobs

real-estate

investing

...

computers

internet

software

hardware

...

]

Steve PepperConvenor SC34/SG3Coordinator RDFTM Task Force

[email protected]

WG3, Montreal, 2006-08-11

Discussion

WG3, Montreal, 2006-08-11

type-instance & subtype-supertype

• How to represent type-instance and subtype-of shortcuts?

– ISA and AKO; isa and subtype-of; or delimiters?

• Resolution

– Replace QNames with new concept for abbreviated URIs called

“bangers” that use the bang instead of the colon

foo!bar

– Rule for concatenation of tuples is just concatenate without

inserting any other characters

– Then we use : and :: for instance-of and subtype-of respectively

– Those that prefer a keyword (isa, ako) can define local IDs through

templates, but the WG advises against doing this...

– ...but see examples later that use isa and subtypeOf instead of

: and ::

WG3, Montreal, 2006-08-11

Is the new “QName” syntax sufficient?

– What about query URIs?

%prefix foo <http://psi.example.org?id=>

foo/puccini, foo&puccini, foo+puccini

foo*puccini, foo’puccini, foo`puccini

foo!puccini

http://psi.example.org?id=puccini

– What about local IDs beginning with a number?

banger := URI "!" NMTOKEN

http://psi.oasis-open.org/iso/3166/#004

%prefix ctry http://psi.oasis-open.org/iso/3166/#

ctry!004

WG3, Montreal, 2006-08-11

Should WS+ be required before "."?

• Problem

– Sometimes necessary in order to avoid "." being regarded as part of

the local ID or QName

puccini born-in lucca. # "lucca." taken as local ID

– Inconsistency with ";" and "," unless WS also required here

puccini : composer ; born-in lucca ; died-in brussels .

– Cf Andreas Sewe’s proposal for a CSS inspired, period-free syntax

– Solution: delimit assertion blocks w/ curlies; allow local IDs outside*

puccini { "Giacomo Puccini"; isa: composer; sid: music!puccini; born-in: lucca; died-in: brussels;}

* Local IDs are purely syntactic and

do not become part of the model,

i.e. they explicitly do not result in item identifiers; these have their

own syntax and the keyword iid.

WG3, Montreal, 2006-08-11

Examples

foafName {sid: foaf!name;isa: tm/nametype;

}

puccini {"Giacomo Puccini";isa: composer;sid: music!puccini;born: 1858-12-22;died: 1924-01-14;foafName: "Fred Blogs" @pseudonym;iid: http://www.ontopia.net/tms/opera.ltm#puccini;composed: tosca @foo, butterfly @bar, turandot @baz;pupil-of: ponchielli @bar baz;died-in: brussels;}

sid: subject identifier

slo: subject locator

iid: item identifer

isa: type-instance

NOTE: Willingness to

consider use of colon after type, but no

decision taken. Also

willingness to consider

replacing : and :: with isa and subtypeOf.

WG3, Montreal, 2006-08-11

Examples

tosca { ”Tosca”;consists-of: [act1, act2, act3];} # possible syntax for ordering

csgp-homepage {”Centro Studie home page”;slo: http://www.csgp.it/;}

scarpia { ”Baron Scarpia”; isa: character; # use instead of ":"? appears-in: tosca; killed-by: floria stabbing;}

{ "Some topic"; sid: foo/bar; descr: "This topic has no item identifier. Without the SID (or a SLO or IID) it would be invalid.” ; }

composer { subtypeOf: person } # use instead of "::"?

WG3, Montreal, 2006-08-11

Naming of association concepts

• What (if anything) gets to be called “associations”?

– “associations” and “verbose associations” (as in the text)

– “template expansions” and “associations” (as in the EBNF)

• Not discussed.

WG3, Montreal, 2006-08-11

SOV variant of CTM?

• Do we want to allow an SOV variant of CTM

– http://en.wikipedia.org/wiki/Subject_Object_Verb

%version ctm 1.0 ovpuccini {lucca born-in:}tosca {1900-01-14 premiere-date:}

• Conclusion:

– Sam Oh will consider this from the perspective of the Korean

language and create some examples based on the current

proposal.

WG3, Montreal, 2006-08-11

Templates

• Restrict templates to header or allow anywhere?

– http://www.petesbox.net/pipermail/sc34wg3/2006-August/003258.html

– Consensus: Restrict to header

• What functionality for templates?

– Do we need topic templates? Not discussed.

• What syntax for templates?

– http://www.petesbox.net/pipermail/sc34wg3/2006-August/003260.html

– Some ideas produced, but these are not recommendations

• What form should name and occurrence flags have (related to template syntax)?

– Not discussed.

WG3, Montreal, 2006-08-11

Templating ideas (DB)

foaf-name {sid: foaf!name;isa: tm/nametype;}

composed {isa: ctm!object-property;type: music!composed-by;subject-role: composer;object-role: work;}

composed-by {isa: ctm!object-property;type: music!composed-by;subject-role: work;object-role: composer;}

puccini {foaf-name: ”Fred Blogs”;composed: tosca, butterfly, turandot;}

WG3, Montreal, 2006-08-11

Syntax for reifying the topic map?

– Current proposal

~myMap

– Directive

%topicmap myMap

– Predefined topic

ctm!thisMap

– Issue not discussed.

WG3, Montreal, 2006-08-11

Other issues

• Assertion block with multiple subjects

– The following should be possible:

puccini, verdi, ponchielli { : composer }

• Hierarchies

– Favourable reaction, but no recommendation.

• Proposal for modifiers in parentheses

– Discussed but not adopted in current form

• Strings

– Triple quotes not needed

WG3, Montreal, 2006-08-11

Other ideas

• The following ideas came up and need to be explained and justified by their various proponents

– Named regions of ctm doc (GDM)

– Blocks of “subdocuments” w/ own header/body (LJH)

– Named collections of topics (LJH)

– Named scopes (SP)

– Ordered lists (DB)

– Embedded assertions

May have value for reification. Should probably be avoided

otherwise.

– GDM’s proposal to be able to specify the base URI (for the

construction of item identifiers) is no longer considered relevant