MESMUSES methodology: Lessons learned and open issues… Alain Michard, Florence, June 2003


MESMUSES methodology

Lessons learned and open issues…

Alain Michard

Florence, June 2003

MESMUSES broad vision

Like several other projects, MESMUSES takes the view that the Semantic Web (SW) is all about semantic interoperability

Sharing machine-readable terminologies and classification schemes

Science and culture are collective and international

Semantic Web methodology should be highly relevant for managing and sharing scientific and cultural information

Some key S&T issues in the Project

Model: is RDFS / OWL-Lite adequate?

Schema authoring: method and tools needed!

Metadata: where does it come from?

Automatic indexing: experiments with a categorizer

The basic SW model

[Figure: the basic SW model, drawn in three layers]

Schema: classes Dwelling, Person, Artefact, with House, Artist, Artwork shown beneath them; properties Lives-in, Owner, Produces, Create

Surrogates: linked by Creates and Lives-in; example of a library catalogue record:

Type: printed text, monograph

Author(s): Zola, Émile (1840-1902)

Title(s): L'assommoir [printed text] / by Emile Zola

Edition: 50th ed.

Publication: Paris: G. Charpentier, 1878

Physical description: 111-569 p.

Record no.: FRBNF35963044

Real-world entities
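The split between schema and surrogates on this slide can be sketched in a few lines of plain Python. This is only an illustration, not the project's implementation: the subclass links, the domain and range chosen for each property, the identifiers `zola` and `assommoir`, and the decision to type Zola as an Artist are all assumptions made for the example.

```python
# Schema layer: class hierarchy (assumed from the slide layout).
SUBCLASS_OF = {"House": "Dwelling", "Artist": "Person", "Artwork": "Artefact"}

# Schema layer: property -> (domain, range). These pairings are assumptions.
PROPERTIES = {
    "Lives-in": ("Person", "Dwelling"),
    "Creates": ("Artist", "Artwork"),
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a direct or indirect subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


# Surrogate layer: machine-readable stand-ins for real-world entities.
# Identifiers and typing are hypothetical; the values come from the record.
surrogates = {
    "zola": {"type": "Artist", "name": "Zola, Émile (1840-1902)"},
    "assommoir": {
        "type": "Artwork",
        "title": "L'assommoir",
        "record": "FRBNF35963044",
    },
}


def check_statement(subject, prop, obj):
    """Check a statement between two surrogates against the schema."""
    domain, range_ = PROPERTIES[prop]
    return (is_a(surrogates[subject]["type"], domain)
            and is_a(surrogates[obj]["type"], range_))
```

With these assumptions, `check_statement("zola", "Creates", "assommoir")` is accepted, while the reversed statement is rejected because an Artwork is not an Artist.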

Model and Schema Language

Typed attributes are needed: XML-Schema types; derived types (e.g. Celsius temperature, Gregorian date, etc.); enumerated types, thesauri

Time-stamping

Cardinality constraints

Explicit transitivity of properties (e.g. geographic inclusion)
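To make the transitivity requirement concrete, here is a minimal sketch of what declaring a property transitive buys: statements that nobody wrote down become derivable. The function name `transitive_closure` and the place names are illustrative assumptions, not part of the project.

```python
def transitive_closure(pairs):
    """Return every (a, c) pair implied by treating `pairs` as transitive."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow inside the loops.
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Geographic inclusion stated only between neighbouring levels.
located_in = {
    ("Florence", "Tuscany"),
    ("Tuscany", "Italy"),
    ("Italy", "Europe"),
}

inferred = transitive_closure(located_in)
```

A query for resources located in Europe can then match a resource indexed only with Florence, because `("Florence", "Europe")` is in `inferred`. Without explicit transitivity in the schema language, every application has to hard-code this step itself.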

Schema authoring issues (1)

Find the right level of abstraction

Is « Glucid » a class or an instance? Or is it sometimes a class and sometimes an instance?

Avoid the « KR » (knowledge representation) attitude and practices! It is all about indexing resources with shared terminologies, not about representing human knowledge!

Schema authoring issues (2)

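[Figure: schema diagram, labels in French]

Classes: Processus (process), Processus élémentaire (elementary process), Processus complexe (complex process), Structure, Cellule (cell), Molécule (molecule), Organisme (organism), Appareil (apparatus), Organe (organ), Tissus (tissue), Système (system), GTANS Grande Thématique (major theme)

Properties: est-régulé-par (is-regulated-by), est-expliquée-par (is-explained-by), est-réalisé-par (is-carried-out-by), nécessite (requires), déclenche (triggers), est-documentée-par (is-documented-by, shown twice), est-constitué-de (is-made-up-of, shown twice), consomme (consumes), transforme (transforms), produit (produces), implique (involves), élimine (eliminates)

Three ISA links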

Schema authoring issues (3)

Schema authoring issues (4)

Authoring tools are badly needed: graphical representation of the schema; zooming on sub-graphs (hierarchies); versioning

Consider using a UML authoring environment?

Established methodology and tutorials are needed

Creating Surrogates

Data extraction and fusion from structured sources: R-DB, XML-DB, LDAP

Updating: when? Should not create duplicates!

Detect cross-references: authority lists, thesauri, lexical distance???
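One way to read the lexical distance idea is as a check performed before a new surrogate is created: normalize the incoming name, compare it with an authority list, and reuse an existing entry when the edit distance is small. The sketch below assumes Levenshtein distance and a threshold of 2; both choices, along with the function names and sample names, are illustrative and not taken from the project.

```python
import unicodedata


def normalize(name):
    """Lowercase, strip accents and punctuation so spelling variants align."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    kept = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(kept.split())


def levenshtein(a, b):
    """Edit distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def find_match(candidate, authority_list, max_distance=2):
    """Return the closest authority entry, or None if nothing is close enough."""
    if not authority_list:
        return None
    key = normalize(candidate)
    best = min(authority_list,
               key=lambda entry: levenshtein(key, normalize(entry)))
    if levenshtein(key, normalize(best)) <= max_distance:
        return best
    return None


# Hypothetical authority list.
authority = ["Zola, Émile", "Hugo, Victor"]
```

Here `find_match("Zolla, Emile", authority)` returns the existing entry instead of creating a duplicate, while an unrelated name returns `None`. The question marks on the slide remain justified: a fixed threshold will merge distinct people with similar names and miss variants that differ by more than a few characters.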

Automatic Categorization

Automatic indexing: by extracting metadata from resources; by automatic categorization

Define hierarchies of « concepts » inside the schema

Seeding with representative documents

Machine learning to create categorizers

Pros: enriched search functionality

Cons: hierarchies of categories are static

Adding a category may change the categorizers of the others
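The seeding approach and its drawback can be illustrated with a deliberately simple categorizer. The transcript does not say which categorizer the project used; the sketch below swaps in a nearest-centroid bag-of-words classifier, and the category names and seed documents are invented for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def train(seeds):
    """Map each category to the summed term counts of its seed documents."""
    return {cat: sum((vectorize(d) for d in docs), Counter())
            for cat, docs in seeds.items()}


def categorize(text, centroids):
    """Assign the category whose centroid is most similar to the text."""
    vec = vectorize(text)
    return max(centroids, key=lambda cat: cosine(vec, centroids[cat]))


# Invented seed documents for two categories.
seeds = {
    "Biology": ["cell membrane protein", "organ tissue cell"],
    "Chemistry": ["molecule reaction acid", "acid base reaction"],
}
doc = "protein molecule reaction cell"
before = categorize(doc, train(seeds))

# Adding a third category changes where the same document lands.
seeds_extended = dict(seeds)
seeds_extended["Biochemistry"] = ["protein molecule enzyme",
                                  "enzyme reaction cell"]
after = categorize(doc, train(seeds_extended))
```

In this toy setting the document moves from Biology to Biochemistry once the new category is seeded, which is one simple form of the side effect noted on the slide: extending the hierarchy is not a local change, so existing assignments have to be recomputed.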

Bottom-line…

RDFS schema authoring may be more difficult than E-R modelling

Debates on syntactic features are irrelevant: discussion should be grounded in real-world implementations and testbeds

A new query language (e.g. RQL) is not a high priority

We have not addressed the « logical rules » layer

Semantic Web vs. Community Webs