SEEK Semantic Mediation
Shawn Bowers, Bertram Ludäscher
e-Science Centre, May 11-14, 2004
Outline
• The Sparrow Toolkit
• Semantic Registration
• Ontology-Driven Structural Transformation
Semantic Mediation in SEEK: Our focus
• Resource Discovery
– Ontology-driven tools to help search for datasets and services using semantic descriptions …
• Data Transformation
– Determine and execute mappings to compose services and bind data to services
• Data Integration
– Provide reconciled, uniform access to multiple datasets
• “Semantic” Workflow Analysis
– Verify semantic correctness, accumulate semantic information, and provide workflow planning/suggestion services … the future
The Sparrow Toolkit: Vision
Lightweight languages and command-line-style services to support mediation
– Syntax and language conversion
• DL, FOL, OWL, RDF, …
– Reasoning
• subsumption, classification, consistency, satisfiability, datatypes, instance classification, …
– Display utilities
• hierarchies, OO/ER-style models, OWL DLs?
– Query
• Query answering, semantic query rewriting, semantic registration, integration, …
Logic-based implementation (Prolog)
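To make the "subsumption" and "classification" services concrete, here is a minimal Python sketch of subsumption over a told concept hierarchy. It is not Sparrow's implementation (the slides say that is logic-based, in Prolog), and the concept names and the `TOLD_PARENTS` table are assumptions chosen purely for illustration.

```python
# Hypothetical mini knowledge base: each concept maps to its directly
# stated ("told") parent concepts. The names are illustrative only.
TOLD_PARENTS = {
    "Species": {"TaxonomicRank"},
    "Genus": {"TaxonomicRank"},
    "TaxonomicRank": {"Thing"},
    "Thing": set(),
}


def ancestors(concept, told=TOLD_PARENTS):
    """Return every concept that strictly subsumes `concept`."""
    seen = set()
    stack = list(told.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(told.get(parent, ()))
    return seen


def subsumes(general, specific, told=TOLD_PARENTS):
    """True if `general` is `specific` itself or one of its ancestors."""
    return general == specific or general in ancestors(specific, told)
```

A real description-logic reasoner also derives subsumptions that are implied by concept definitions; this sketch only follows the stated parent links.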
Some sparrow-dl (Taxon example)
Some more sparrow-dl (“textbook” example)
display_formulas(KB)
display_preclassified_hierarchy(K)
display_classified_hierarchy(K)
display_classified(K)
Outline
• The Sparrow Toolkit
• Semantic Registration
• Ontology-Driven Structural Transformation
Adding semantics to EML: Observations
The finer-grained the annotation, the more opportunities there are for discovery, integration, and transformation …
The coarser-grained the annotation, the harder it is to do useful operations, unless your ontology is very deep
[Figure: annotation granularity (fine to coarse) plotted against ontology depth (shallow to deep), with a region labeled "maximal ontology/annotation leverage"]
Semantic Registration (SSDBM’04)
By annotation granularity, we mean:
– Resource-Level “Metadata”
– Attribute Level (the attribute itself)
– Attribute Level (as a collection-value)
– Attribute Level (as independent values)
– Attribute Groups (as a collection-value or independent values)
– Filtered values (e.g., SQL where-clause)
– Specific value annotations (as a mapping function or stated by hand)
Often, integration and transformation require very detailed annotations
Some Examples (arguments against concepts-as-labels)
r(…, lt, ln, …)
sem(lt) == latitude
sem(ln) == longitude
Question: What do these annotations mean?
1. The name “lt” itself refers to latitude?
2. The set of values in the column, taken as a whole, makes up a latitude (like coverage)
3. Each individual value in the column denotes a separate latitude (Is it a latitude though? Or just a coded rep.?)
We want to avoid these ambiguous annotations … often
Some Examples (still not enough)
r(…, lt, ln, …)
sem(lt) == values represent latitude
sem(ln) == values represent longitude
More problems: How do I know lt and ln go together to form a location, for example, …
[Diagram: Location, with role lat to Latitude and role lon to Longitude]
Some Examples (still not enough)
r(…, lt, ln, lt-end, ln-end, …)
sem(lt) == values represent latitude
sem(ln) == values represent longitude
sem(lt-end) == values represent latitude
sem(ln-end) == values represent longitude
Which lat goes with which lon?
[Diagram: Location, with role lat to Latitude and role lon to Longitude]
Some Examples (still not enough)
r(…, lt, ln, lt-end, ln-end, …)
sem(lt, ln) == values represent location and lat leads to semval(lt) and lon leads to semval(ln) *
sem(lt, ln) == values represent location
sem(lt) == values represent latitude
sem(ln) == values represent longitude
sem(lt, ln) == values represent location and …
sem(lt-end) == values represent latitude
sem(ln-end) == values represent longitude
What if we want to integrate with another dataset with two lat/lons? What do we do?
[Diagram: Location, with role lat to Latitude and role lon to Longitude]
* We could infer the lat and lon roles here; in general, I don’t think we can infer roles as such…
Some Examples (still not enough)
r(…, lt, ln, lt-end, ln-end, …)
sem(lt, ln, lt-end, ln-end) == values represent transect and start leads to semval(lt, ln) and end leads to semval(lt-end, ln-end)
sem(lt, ln) == values represent location and …
sem(lt) == values represent latitude
sem(ln) == values represent longitude
sem(lt, ln) == values represent location and …
sem(lt-end) == values represent latitude
sem(ln-end) == values represent longitude
So, even in very simple cases, annotations can become complex …
[Diagram: Transect, with roles start and end to Location; Location, with role lat to Latitude and role lon to Longitude]
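The transect example can be sketched as a nested annotation in which each role (start, end, lat, lon) points at the column that fills it. This is only an illustration in Python, not the registration language used in SEEK; the class names follow the diagram labels, while `instantiate` and the sample row are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    lat: str  # name of the column that supplies the Latitude value
    lon: str  # name of the column that supplies the Longitude value


@dataclass(frozen=True)
class Transect:
    start: Location
    end: Location


# Annotation for r(…, lt, ln, lt-end, ln-end, …): which column plays which role.
ANNOTATION = Transect(
    start=Location(lat="lt", lon="ln"),
    end=Location(lat="lt-end", lon="ln-end"),
)


def instantiate(annotation, row):
    """Build the nested transect value for one row (a dict keyed by column)."""
    return {
        "start": {"lat": row[annotation.start.lat], "lon": row[annotation.start.lon]},
        "end": {"lat": row[annotation.end.lat], "lon": row[annotation.end.lon]},
    }
```

Because the roles are explicit, there is no ambiguity about which latitude pairs with which longitude, which is the problem the flat `sem(...)` labels could not resolve.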
Executable, Fine-Grain Semantic Registration
genus           species       count  lat      lon
'Acanthomyops'  'latipes'     1      41.6     -119.383
'Acromyrmex'    'versicolor'  1      33.1839  -114.866
'Anergates'     'atratulus'   1      37.9833  -84.5167
'Anergates'     'atratulus'   4      38.8833  -77.1167
Each row represents a RatioMeasurement
RatioMeasurement
Executable, Fine-Grain Semantic Registration (cont.)
For a row, count is the value of the measurement
[Diagram: RatioMeasurement, with dataValue a LocalInteger whose value is 1]
Executable, Fine-Grain Semantic Registration (cont.)
For a row, lat/lon are the location values of the measurement
[Diagram: RatioMeasurement, with dataValue a LocalInteger (value 1) and context a LocationContext whose location is a GeogCoordPoint (latitude 41.6, longitude -119.383)]
Executable, Fine-Grain Semantic Registration (cont.)
For a row, genus/species are mapped to standard values, associated
[Diagram: the RatioMeasurement extended with taxonomic information. Classes shown: Count, Entity, TaxonomicGroup, SimpleTaxonomicId, Genus, Species, … Roles shown: itemMeasured, property, taxonomicID, genus, species, rankName, subCat, superCat. Values shown: taxon:1883/5, taxon:1883/3]
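The row-to-ontology steps above can be sketched as a function that turns one table row into a partial object instance. The class and role names (RatioMeasurement, dataValue, LocalInteger, context, LocationContext, location, GeogCoordPoint) come from the slides, but the nesting is one reading of the diagrams and the dictionary encoding is an assumption. The taxonomic mapping is left out because the slides do not show how genus and species resolve to taxon IDs.

```python
def register_row(row):
    """Instantiate a partial RatioMeasurement object from one table row.

    `row` is a dict keyed by column name (genus, species, count, lat, lon).
    """
    return {
        "type": "RatioMeasurement",
        # count is the value of the measurement
        "dataValue": {"type": "LocalInteger", "value": int(row["count"])},
        # lat/lon are the location values of the measurement
        "context": {
            "type": "LocationContext",
            "location": {
                "type": "GeogCoordPoint",
                "latitude": float(row["lat"]),
                "longitude": float(row["lon"]),
            },
        },
    }


# First row of the example table.
row = {"genus": "Acanthomyops", "species": "latipes",
       "count": "1", "lat": "41.6", "lon": "-119.383"}
instance = register_row(row)
```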
Querying based on Semantic Registrations
[Diagram: the complete registered instance for the first row, combining the measurement value (1), the location context (latitude 41.6, longitude -119.383), and the taxonomic identification (taxon:1883/5, taxon:1883/3) from the preceding slides]
Find all datasets that measure species of ‘Acanthomyops’ in South Africa … and return a set of all lat/lon “points” (demo …)
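A query of this shape can be sketched over already-registered instances. The flat records, the `dataset` field, the function name `query_points`, and the bounding box used in the usage note are all assumptions for illustration. The demo itself queried through the ontology structure, and the slides give no coordinates for South Africa.

```python
# Hypothetical registered instances, flattened for brevity. The values are
# rows from the example table; the "dataset" label is illustrative.
REGISTERED = [
    {"dataset": "antweb", "genus": "Acanthomyops", "species": "latipes",
     "count": 1, "lat": 41.6, "lon": -119.383},
    {"dataset": "antweb", "genus": "Acromyrmex", "species": "versicolor",
     "count": 1, "lat": 33.1839, "lon": -114.866},
    {"dataset": "antweb", "genus": "Anergates", "species": "atratulus",
     "count": 1, "lat": 37.9833, "lon": -84.5167},
]


def query_points(instances, genus, lat_range, lon_range):
    """Return (datasets, points) for measurements of `genus` inside the box."""
    datasets, points = set(), set()
    for inst in instances:
        in_box = (lat_range[0] <= inst["lat"] <= lat_range[1]
                  and lon_range[0] <= inst["lon"] <= lon_range[1])
        if inst["genus"] == genus and in_box:
            datasets.add(inst["dataset"])
            points.add((inst["lat"], inst["lon"]))
    return datasets, points
```

For example, `query_points(REGISTERED, "Acanthomyops", (40, 45), (-125, -115))` selects the single matching measurement and returns its dataset name together with its lat/lon point.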
Architecture
[Diagram: SMS operations (discover_resources, query_resources, integrate_resources) over a heterogeneous dataset repository, semantic annotations, mappings, an ontology repository, and taxon services (synonyms, concept IDs, …); lat/lon and species queries go in and results come out]
Finding user interfaces that are easy-to-use, but provide detailed annotations
[UI mock-up for resource id antweb:040412, with panes <<registration information/properties>>, <<ontology view>>, <<sample instance view>>, and <<annotation, schema, and data>>. Column annotations shown: TaxaConceptID, Value, Value, Value]
lat    lon      count  genus      species
41.6   -119.4   5      ‘Manica’   ‘bradleyi’
34.9   -120.7   2      ‘Formica’  ‘fusca’
A Sparrow Executable Semantic Annotation Registration
A partial object instantiation (of onto classes)
The resource can be queried directly using the object structure (i.e., using the ontology)
Outline
• The Sparrow Toolkit
• Semantic Registration
• Ontology-Driven Structural Transformation
Example Structural Types (XML)
[Diagram: services S1 (life stage property) and S2 (mortality rate for period), with ports P1 to P5]
structType(P2):
root population = (sample)*
elem sample = (meas, lsp)
elem meas = (cnt, acc)
elem cnt = xsd:integer
elem acc = xsd:double
elem lsp = xsd:string

<population>
  <sample>
    <meas>
      <cnt>44,000</cnt>
      <acc>0.95</acc>
    </meas>
    <lsp>Eggs</lsp>
  </sample>
  …
</population>

structType(P3):
root cohortTable = (measurement)*
elem measurement = (phase, obs)
elem phase = xsd:string
elem obs = xsd:integer

<cohortTable>
  <measurement>
    <phase>Eggs</phase>
    <obs>44,000</obs>
  </measurement>
  …
</cohortTable>
Example Semantic Types
Portion of SEEK measurement ontology
[Diagram: classes Observation, MeasContext, Entity, MeasProperty, AccuracyQualifier, EcologicalProperty, AbundanceCount, LifeStageProperty, NumericValue, SpatialLocation; roles hasContext, appliesTo, hasProperty, hasLocation, hasCount, hasValue, itemMeasured, with cardinalities 0:*, 1:1, and 1:*]
Example Semantic Types
Semantic types for P2 and P3
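[Diagram: semType(P3) is an Observation with hasContext to a MeasContext that appliesTo a LifeStageProperty, and with itemMeasured an AbundanceCount that hasCount a NumberValue. semType(P2) additionally has hasProperty to an AccuracyQualifier with hasValue. All roles have cardinality 1:1. semType(P2) ⊑ semType(P3)]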
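The compatibility test semType(P2) ⊑ semType(P3) can be approximated by treating each semantic type as a set of role constraints and checking that the source carries every constraint the target requires. This is a deliberately simplified stand-in for description-logic subsumption, not the reasoning used in SEEK. The triples are one reading of the diagram, and the target class of hasValue in particular is an assumption.

```python
# Each semantic type is a set of (class, role, class) constraints,
# all with cardinality 1:1 in the slides.
SEM_TYPE_P3 = frozenset({
    ("Observation", "hasContext", "MeasContext"),
    ("MeasContext", "appliesTo", "LifeStageProperty"),
    ("Observation", "itemMeasured", "AbundanceCount"),
    ("AbundanceCount", "hasCount", "NumberValue"),
})

# P2 carries everything P3 does, plus an accuracy qualifier.
SEM_TYPE_P2 = SEM_TYPE_P3 | {
    ("Observation", "hasProperty", "AccuracyQualifier"),
    ("AccuracyQualifier", "hasValue", "NumberValue"),  # target class assumed
}


def is_compatible(source_type, target_type):
    """Source is subsumed by target if it satisfies all target constraints."""
    return target_type <= source_type
```

Under this reading, the output of P2 can feed P3 (the extra accuracy information is simply more specific), but not the other way around.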
The Ontology-Driven Framework
[Diagram: a source service with port Ps and a target service with port Pt, joined by a desired connection. Registration mappings (output for Ps, input for Pt) relate each port's structural type to its semantic type, defined over ontologies (OWL). The semantic types of Ps and Pt must be compatible (⊑)]
The Ontology-Driven Framework
[Diagram, continued: the same source service (Ps), target service (Pt), registration mappings, and compatible (⊑) semantic types over ontologies (OWL), now with a Correspondence between the structural types of Ps and Pt]
The Ontology-Driven Framework
[Diagram, continued: the same source service (Ps), target service (Pt), registration mappings, compatible (⊑) semantic types over ontologies (OWL), and Correspondence, now with a Generate step that yields a Transformation applied to Ps]
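As an illustration of the final step, the sketch below applies a hand-written correspondence to turn the structType(P2) example document into the structType(P3) form. In the framework the correspondence is derived from the registration mappings; here the `CORRESPONDENCES` table and the `transform` function are assumptions written by hand.

```python
import xml.etree.ElementTree as ET

# Source document in the population structure, from the earlier example.
SOURCE = """<population>
  <sample>
    <meas><cnt>44,000</cnt><acc>0.95</acc></meas>
    <lsp>Eggs</lsp>
  </sample>
</population>"""

# Hypothetical correspondences: target element -> path inside a source <sample>.
# The accuracy value (acc) has no counterpart in cohortTable and is dropped.
CORRESPONDENCES = [("phase", "lsp"), ("obs", "meas/cnt")]


def transform(population_xml):
    """Rewrite a population document as a cohortTable document."""
    source_root = ET.fromstring(population_xml)
    target_root = ET.Element("cohortTable")
    for sample in source_root.findall("sample"):
        measurement = ET.SubElement(target_root, "measurement")
        for target_tag, source_path in CORRESPONDENCES:
            child = ET.SubElement(measurement, target_tag)
            child.text = sample.findtext(source_path)
    return ET.tostring(target_root, encoding="unicode")
```

Each `sample` becomes one `measurement`, with `lsp` copied to `phase` and `meas/cnt` copied to `obs`. Values are copied as text, so "44,000" is carried over unchanged.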
Datasets used in the Prototype
Antweb:
genus         species       count  lat      lon
'Acromyrmex'  'versicolor'  1      33.1839  -114.866
…

SouthAfrica Museum:
genus         species       cnt  lt     ln
'Camponotus'  'festinatus'  3    30.55  -103.833
…

“faked”:
mbcnt  cfcnt  lat     lon
1      2      -25.35  -77.1167
…

Dulosis (Parasite/Host):
genus1  species1    genus2  species2
Manica  parasitica  Manica  bradleyi
…