TMF - a tutorial, Part 3: Designing (schemas and) filters

Post on 01-Feb-2016

TMF - Terminological Markup Framework

Laurent Romary - Laboratoire Loria

General principles

Terminological information interchange
– Three components:
    • Source TDB1
    • Target TDB2
    • Terminological interchange format: a specific TML (DXLT, Geneter)

[Diagram: TDB1 and TDB2 exchanging data through a TML]

Important notice

– GMT is not a TML
    • It is too abstract a format
        – Uncontrolled recursion (‘struct’ element)
        – Uncontrolled content (‘feat’ and ‘annot’)
    • A schema is needed to check interchanged data
        – Precise list of data categories
        – Precise definition of the format
– GMT is here to provide conceptual simplicity

Designing filters

TML to GMT

General principles

Just for your information
– The creation of the filters can be automated

Basic processes
– Reduction of expansion trees
– Mapping elements and attributes to the corresponding data categories
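As a rough illustration of how filter creation could be automated, the sketch below generates one XSLT template per entry of a mapping table. It is written in Python for convenience; the `MAPPING` table and the `make_template` and `make_stylesheet` helpers are hypothetical names, not part of TMF, and the sketch assumes plain element names that need no XML escaping.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical mapping table: source element name -> GMT data category.
MAPPING = {
    "definition": "definition",
    "term": "iso12620-term",
}

def make_template(source_name, data_category):
    """Return the text of one XSLT template mapping an element to a 'feat'."""
    return (
        f'  <xsl:template match="{source_name}">\n'
        f'    <feat type="{data_category}">\n'
        f'      <xsl:apply-templates/>\n'
        f'    </feat>\n'
        f'  </xsl:template>\n'
    )

def make_stylesheet(mapping):
    """Assemble a complete XSLT 1.0 stylesheet from the mapping table."""
    templates = "".join(make_template(s, d) for s, d in mapping.items())
    return (
        f'<xsl:stylesheet version="1.0" xmlns:xsl="{XSL_NS}">\n'
        f'{templates}'
        f'</xsl:stylesheet>\n'
    )

stylesheet = make_stylesheet(MAPPING)
# Parsing the result checks that the generated stylesheet is well-formed XML.
root = ET.fromstring(stylesheet)
print(stylesheet)
```

Keeping the mapping in a table means that adding a data category only requires a new table entry, not a hand-written template.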

Reducing expansion trees

Example
• DXLT (Martif) sub-tree

    <ntig>
      <!-- some general information associated with the term -->
      <termGrp>
        <!-- term related information -->
      </termGrp>
    </ntig>

• GMT

    <struct type="TS">
      <!-- some features -->
    </struct>
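The reduction can be sketched outside XSLT as well. The Python fragment below collapses an `<ntig>` and its nested `<termGrp>` into a single `struct` of type TS, lifting the children of `<termGrp>` to the same level as the other children. The sample content and the `reduce_ntig` helper are invented for illustration, and the element names are reused as feature types only to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Hypothetical DXLT fragment; the element content is invented for illustration.
DXLT = """
<ntig>
  <termGrp>
    <term>key money</term>
  </termGrp>
  <note>illustrative note</note>
</ntig>
"""

def reduce_ntig(ntig):
    """Collapse an <ntig> and its nested <termGrp> into one GMT struct."""
    struct = ET.Element("struct", {"type": "TS"})
    for child in ntig:
        # Children of <termGrp> are lifted to the same level as the others.
        items = list(child) if child.tag == "termGrp" else [child]
        for item in items:
            feat = ET.SubElement(struct, "feat", {"type": item.tag})
            feat.text = item.text
    return struct

struct = reduce_ntig(ET.fromstring(DXLT.strip()))
print(ET.tostring(struct, encoding="unicode"))
```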

Element mapping

Example
• DXLT (Martif)

    <definition>Bla, bla, bla etc.</definition>

• GMT

    <feat type="definition">Bla, bla, bla etc.</feat>

Structural elements

Generating a GMT ‘struct’ element

    <xsl:template match="termEntry">
      <xsl:element name="struct">
        <xsl:attribute name="type">TE</xsl:attribute>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:element>
    </xsl:template>

Features

Generating a GMT ‘feat’ element (style=Attribute)

    <xsl:template match="@id">
      <xsl:element name="feat">
        <xsl:attribute name="type">iso12620-identifier</xsl:attribute>
        <xsl:value-of select="."/>
      </xsl:element>
    </xsl:template>

Features

Generating a GMT ‘feat’ element (style=Element)

    <xsl:template match="term">
      <xsl:element name="feat">
        <xsl:attribute name="type">iso12620-term</xsl:attribute>
        <xsl:apply-templates/>
      </xsl:element>
    </xsl:template>

Features

Generating a GMT ‘feat’ element (style=TypedElement)

    <xsl:template match="descrip[@type='subjectField']">
      <xsl:element name="feat">
        <xsl:attribute name="type">SubjectField</xsl:attribute>
        <xsl:apply-templates/>
      </xsl:element>
    </xsl:template>
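The three styles differ only in where the data category is identified: by an attribute name, by an element name, or by the value of a `type` attribute. The Python sketch below makes that difference explicit. The helper names and the sample fragment are hypothetical, and the typed-element case simply reuses the source `type` value as the feature type, which a real filter would map to a registered name.

```python
import xml.etree.ElementTree as ET

def feat(data_category, value):
    """Build a GMT 'feat' element carrying one data category value."""
    element = ET.Element("feat", {"type": data_category})
    element.text = value
    return element

def from_attribute(element, name, data_category):
    """style=Attribute: the value is held by an attribute of the source element."""
    return feat(data_category, element.get(name))

def from_element(element, data_category):
    """style=Element: the element name alone identifies the data category."""
    return feat(data_category, element.text)

def from_typed_element(element):
    """style=TypedElement: the 'type' attribute identifies the data category."""
    return feat(element.get("type"), element.text)

# Hypothetical source fragment, modelled on the templates above.
entry = ET.fromstring(
    '<termEntry id="e1"><term>key money</term>'
    '<descrip type="subjectField">agriculture</descrip></termEntry>'
)

feats = [
    from_attribute(entry, "id", "iso12620-identifier"),
    from_element(entry.find("term"), "iso12620-term"),
    from_typed_element(entry.find("descrip")),
]
for f in feats:
    print(ET.tostring(f, encoding="unicode"))
```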

XML Schemas for TMLs

…work ahead…

Analysing existing TDBs

Towards a generic methodology

General Architecture

[Diagram: TDB → (simple DB dumper) → Flat XML → (format-specific XSL stylesheet) → GMT → (automatic GMT2TML stylesheet) → TML]
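The first step of this architecture, a simple DB dumper, might look like the following Python sketch. It assumes the TDB is reachable as an SQL database (here an in-memory SQLite database) and writes one flat XML element per column. The table name, column names and the `dump_table` helper are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

def dump_table(connection, table):
    """Dump every row of a table as flat XML: one element per column."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    root = ET.Element("dump")
    for row in cursor:
        entry = ET.SubElement(root, "entry")
        for column, value in zip(columns, row):
            if value is not None:
                ET.SubElement(entry, column).text = str(value)
    return root

# Hypothetical in-memory TDB with a single table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE terms (BE TEXT, NI INTEGER, VE TEXT)")
connection.execute("INSERT INTO terms VALUES ('BTB', 398, 'key money')")

flat = dump_table(connection, "terms")
print(ET.tostring(flat, encoding="unicode"))
```

The dumper knows nothing about terminology. All interpretation of the columns is left to the format-specific stylesheet that follows it.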

A two-phase process

List the various data categories used in the TDB
– Relate them to existing registries (e.g. ISO 12620), cf. http://salt.loria.fr/public/salt/DCQuery.html

Identify the underlying organization of the TDB
– Relate it to the meta-model
– Anchor the data categories where they actually occur
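Both phases can be supported by a quick inventory of a flat XML dump. The Python sketch below counts every element name (candidate data categories) and records the parent elements under which each one occurs (a hint for anchoring it in the meta-model). The dump and the `inventory` helper are hypothetical, and relating the names to a registry remains a manual step.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical flat XML dump; tag names and content are for illustration only.
DUMP = """<dump>
  <entry><CM>AG1</CM><EN><VE>key money</VE></EN></entry>
  <entry><CM>JUA</CM><FR><VE>pas-de-porte</VE><DF>definition text</DF></FR></entry>
</dump>"""

def inventory(root):
    """Count each element name and record the parent names under which it occurs."""
    counts = Counter()
    parents = {}
    for parent in root.iter():
        for child in parent:
            counts[child.tag] += 1
            parents.setdefault(child.tag, set()).add(parent.tag)
    return counts, parents

counts, parents = inventory(ET.fromstring(DUMP))
for tag in sorted(counts):
    print(tag, counts[tag], sorted(parents[tag]))
```

An element such as `VE` that only appears under language-specific parents is a candidate for the language or term level, whereas one that only appears directly under `entry` belongs at the entry level.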

Analysis of an existing TDB

Going through an example

Eurodicautom sample

    <entry>
      <BE>BTB</BE>
      <TY>DAG77</TY>
      <NI>398</NI>
      <CF>3</CF>
      <CM>AG1</CM>
      <CM>JUA</CM>
      <EN>
        <VE>key money</VE>
        <RF>CILF,Dict.Agriculture,ACCT,1977</RF>
      </EN>
      <FR>
        <VE>pas-de-porte</VE>
        <DF>prix payé au précédent occupant pour le droit d'entrer dans une exploitation agricole</DF>
        <RF target="DF">TNC(1997)</RF>
        <RF>CILF,Dict.Agriculture,ACCT,1977</RF>
        <NT type="NTE">droit rural;pratique prohibée par la loi</NT>
      </FR>
    </entry>

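Data categories identified in the sample, with the meta-model level noted for each:
• DF: definition-12620A.5.1 (TS)
• VE: term-12620A.1 (TS)
• EN, FR: language-12620A.10.7 (LS)
• NT: note-12620A.8 (TS)
• CM: classificationCode-12620A.4.2 (TE)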

Result in GMT (1/2)

    <tmf>
      <struct type="TE">
        <feat type="entryIdentifier-12620A.10.15">BTB-TY-398</feat>
        <feat type="originatingInstitution-12620A.10.22.2">BTB</feat>
        <feat type="projectSubset">DAG77</feat>
        <feat type="NI">398</feat>
        <feat type="reliabilityCode">3</feat>
        <feat type="classificationCode-12620A.4.2">AG1</feat>
        <feat type="classificationCode-12620A.4.2">JUA</feat>
        <struct type="LS">
          <feat type="language-12620A.10.7">EN</feat>
          <struct type="TS">
            <feat type="term-12620A.1">key money</feat>
          </struct>
          <feat type="sourceIdentifier-12620A.10.20">CILF,Dict.Agriculture,ACCT,1977</feat>
        </struct>

Result in GMT (2/2)

        <struct type="LS">
          <feat type="language-12620A.10.7">fr</feat>
          <struct type="TS">
            <feat type="term-12620A.1">pas-de-porte</feat>
          </struct>
          <brack>
            <feat type="definition-12620A.5.1">prix payé au précédent occupant pour le droit d'entrer dans une exploitation agricole</feat>
            <feat type="sourceIdentifier-12620A.10.20">TNC(1997)</feat>
          </brack>
          <feat type="sourceIdentifier-12620A.10.20">CILF,Dict.Agriculture,ACCT,1977</feat>
          <feat type="note-12620A.8">droit rural;pratique prohibée par la loi</feat>
        </struct>
      </struct>
    </tmf>

Simple rules

Using XSL locality

    <xsl:template match="CM">
      <feat type="classificationCode-12620A.4.2">
        <xsl:apply-templates/>
      </feat>
    </xsl:template>

Introducing specific levels

Necessity to combine structure and content

    <xsl:template match="VE">
      <struct type="TS">
        <feat type="term-12620A.1">
          <xsl:apply-templates/>
        </feat>
      </struct>
    </xsl:template>

Default rule

Useful for keeping track of unmapped data categories

    <xsl:template match="*">
      <feat>
        <xsl:attribute name="type">
          <xsl:value-of select="name()"/>
        </xsl:attribute>
        <xsl:apply-templates/>
      </feat>
    </xsl:template>
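To see how the three kinds of rule interact, the Python sketch below mimics them on a shortened entry: a direct mapping for `CM`, an extra `struct` level for `VE`, and a default rule that keeps the source name and records it as unmapped. The `FEATURES` and `LEVELS` tables and the `convert` helper are hypothetical, and the sketch is an analogue of the stylesheet rules, not a replacement for them.

```python
import xml.etree.ElementTree as ET

# Source element name -> data category, as in the rules above.
FEATURES = {"CM": "classificationCode-12620A.4.2"}
# Source elements that need their own 'struct' level around the feature.
LEVELS = {"VE": ("TS", "term-12620A.1")}

def convert(element, parent, unmapped):
    """Append the GMT counterpart of one source element to 'parent'."""
    if element.tag in LEVELS:
        level, category = LEVELS[element.tag]
        parent = ET.SubElement(parent, "struct", {"type": level})
    elif element.tag in FEATURES:
        category = FEATURES[element.tag]
    else:
        # Default rule: keep the source name so the gap stays visible.
        category = element.tag
        unmapped.add(element.tag)
    feat = ET.SubElement(parent, "feat", {"type": category})
    feat.text = element.text
    for child in element:
        convert(child, feat, unmapped)

source = ET.fromstring("<entry><CM>AG1</CM><NI>398</NI><VE>key money</VE></entry>")
target = ET.Element("struct", {"type": "TE"})
unmapped = set()
for child in source:
    convert(child, target, unmapped)

print(ET.tostring(target, encoding="unicode"))
print("Unmapped:", sorted(unmapped))
```

Here `NI` has no rule of its own, so it ends up as a `feat` typed with its source name and is reported in the unmapped set, which is the tracking role the default rule plays.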

Useful pointers

TMF page: http://www.loria.fr/projets/TMF

HLT/Salt project page: http://www.loria.fr/projets/SALT

Data category query tool: http://salt.loria.fr/public/salt/DCQuery.html
