Searching within Large Grid Infrastructures. Marios D. Dikaiakos, University of Cyprus & CoreGRID


Upload: barry-ramsey

Post on 11-Jan-2016


Page 1: Searching within Large Grid Infrastructures Marios D. Dikaiakos University of Cyprus & CoreGRID

Searching within Large Grid Infrastructures

Marios D. Dikaiakos, University of Cyprus & CoreGRID

Slide 2

Acknowledgements

• Wei Xing, University of Cyprus
• Rizos Sakellariou, U. Manchester, UK
• Yannis Ioannidis, U. Athens, GR
• Salvatore Orlando, ISTI-CNR, IT
• Domenico Laforenza, ISTI-CNR, IT

Slide 3

Outline

• Context and Motivation
• Limitations of Grid Information Services
• Semantic Grid and Ontologies
• A Core Grid Ontology
• Conclusions and Future Work

Slide 4

The Grid

• A wide-scale, distributed computing infrastructure to support resource sharing and coordinated problem solving in dynamic, multi-institutional Virtual Organizations.
– Computational Grid: Provides the raw computing power, high-speed bandwidth interconnection and associated data storage.
– Data & Information Grid: Allows easily accessible connections to major sources of information and tools for its analysis and visualisation.
– Knowledge & Semantic Grid: Gives added value to the information; provides intelligent guidance for decision-makers; facilitates the generation, diffusion and support of knowledge.

Slide 5

Near-future Scenarios for the Grid

Slide 6

Near-future Scenarios for the Grid

• The Grid as a Wide-Scale Distributed System:
– Millions of resources of different kinds.
– Services and Policies in place.
– Relationships (permanent and transient) between organizations, software, data, services, applications…
– Different middleware platforms.
– Common (?) protocols, standards and APIs.
• The hope is that the Grid will grow larger and become as widely accepted as the Web.


Slide 8

Problem Statement: Searching the Grid

• How are individuals and organizations going to harness the capabilities of a fully deployed Grid, with a massive and ever-expanding base of computing and storage nodes, network resources, and a huge corpus of available programs, services, and data?
• To this end, users need to identify “resources” that are:
– Interesting (discovery)
– Relevant (classification)
– Accessible and available under known policies of use, cost (inquiry)
• Emphasis on “summary” information, in terms of granularity and timing.

Slide 9

Searching the Grid

• Computing, Storage, Network Resources
• Software and Data-sets
• Policies
• Relationships
• Best-practices

Slide 10

Examples of search queries

• Hardware resources on the Grid, their attributes, and applicable policies of their use:
– Find a VO providing exclusive access to a shared-memory multiprocessor system with at least 16 processors, 8 GB of main memory, and a usage charge of not more than 100 euros per CPU time.
• Application services, software, and data-sets:
– Find services running Quantum Chromo-Dynamics (QCD) calculations using F90 and MPI.
• Hardware-software combinations, Grid usage and best-practices:
– Find the pricing and prior clientele of Grid services that provide access to the XYZ workflow for high-performance oil refinery simulations.
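The first query above can be read as a conjunction of attribute constraints over resource descriptions. The sketch below is only an illustration of that reading: the `Resource` record, its field names, the sample catalogue, and the interpretation of the charge as euros per CPU hour are all assumptions made here, not part of any Grid information schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    # Hypothetical record; field names are illustrative, not a real Grid schema.
    vo: str
    architecture: str
    processors: int
    memory_gb: int
    exclusive_access: bool
    euros_per_cpu_hour: float  # assumed unit; the slide only says "per CPU time"


def find_vos(resources, min_processors=16, min_memory_gb=8, max_charge=100.0):
    """Return the sorted names of VOs offering a matching shared-memory system."""
    return sorted({
        r.vo for r in resources
        if r.architecture == "shared-memory"
        and r.exclusive_access
        and r.processors >= min_processors
        and r.memory_gb >= min_memory_gb
        and r.euros_per_cpu_hour <= max_charge
    })


# Made-up catalogue used only to exercise the query.
CATALOGUE = [
    Resource("vo-a", "shared-memory", 32, 16, True, 80.0),
    Resource("vo-b", "shared-memory", 8, 4, True, 20.0),
    Resource("vo-c", "cluster", 64, 128, False, 10.0),
    Resource("vo-d", "shared-memory", 16, 8, True, 120.0),
]

print(find_vos(CATALOGUE))
```

A flat filter like this only works once hardware attributes and usage policies sit in one record, which is exactly what the separate information services discussed later do not provide.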

Slide 11

Outline

• Context and Motivation
• Grid Information Services and Limitations
• Semantic Grid and Ontologies
• A Core Grid Ontology
• Conclusions and Future Work

Slide 12

Grid Information Services

• Established to help users answer questions on the status of individual resources and the Grid.
• Support the discovery and ongoing monitoring of the existence and characteristics of resources, services, computations and other entities of value to the Grid.
• Examples:
– GLOBUS, EDG: Metacomputing Directory Service (MDS)
– UNICORE Gateway and Network Job Supervisor (NJS)
– EGEE: Relational Grid Monitoring Architecture (R-GMA), GridICE
– Condor Matchmaker

Slide 13

MDS: Grid Info Services in Globus

[Figure: MDS architecture. Recoverable labels: Resources, “Info. Providers”, LDIF, GRIS, GRRP, GIIS, GRIP, Users, Info. Retrieval, Discovery/Inquiry/Retrieval.]

Slide 14

Relational Grid Monitoring Architecture

[Figure: R-GMA components. Recoverable labels: Application, Consumer API, Consumer Servlet, Sensor Code, Producer API, Producer Servlet, Registry API, Registry Service.]

Slide 15

What information is out there?

Virtual Organizations:
• Resources
• Policies
• People

Software:
• Codes
• Specs
• Location

Data-sets:
• Data
• Metadata
• Replicas

Services:
• Interface
• Metadata

Applications:
• Descriptions
• I/O requirements
• Metadata
• Workflows

Summary & Statistics:
• Logs
• Associations
• Statistics of use

Resource Specifications:
• Descriptions & Types
• Names
• Capacity
• Configuration

Resource status:
• Resource use
• Availability
• Monitoring data

Slide 16

Resource Specification info. (examples)

Source | Information provided | Schema | System
Info. Provider (Unix sys-call) | Mds-computer-platform, Mds-Cpu-model, Mds-Host-hn | Hierarchical, LDAP | MDS-Globus
Info. Provider (Unix sys-call), static info. | GlueCEName, GlueHostName, GlueHostArchitecture, GlueHostProcessorClockSpeed, GlueSEAccessProtocolType, GlueCESEBindGroup, GlueHostFileLatency | Hierarchical, LDAP | MDS-EDG
Sensors (Unix sys call) | StorageElementProtocol, NetworkTCPThroughput, NetworkRTT | Relational, HTTP | RGMA-EDG

Slide 17

Resource status information (examples)

Source | Information provided | Schema | System
Info. Provider (Unix sys-call) | Mds-Memory-Ram-freeMB, Mds-FS-Total-freeMB, cpuload5 | Hierarchical, LDAP | MDS-Globus
Info. Provider (Unix sys-call) | GlueCEStateRunningJobs, GlueCEJobLocalID, GlueHostProcessorLoadLast1Min | Hierarchical, LDAP | MDS-EDG
Sensors (Unix sys call) | StorageElementStatus, NetworkUDPPacketLoss, NetworkFileTransferThroughput | Relational, HTTP | RGMA-EDG
Condor’s Sensor modules | DiskSpace, MemoryUsed, SystemLoad | ClassAds | Hawkeye, Condor
NWS probes, Traceroute | End-to-end bandwidth, End-to-end latency, End-to-end path | XML, GMA arch. | GridLab’s TopoMon

Slide 18

VO information (examples)

Source | Information provided | Schema | System
Static info. | Cert (info. about local certificate policy), MdsHostContact | Hierarchical, LDAP | MDS-Globus
Static info. | GlueCEPolicyMaxWallClockTime, GlueCEPolicyMaxCPUTime, GlueSAPolicyMaxFileSize | Hierarchical, LDAP | MDS-EDG

Slide 19

Software & Dataset information (examples)

Source | Information provided | Schema | System
Info. Provider | Mds-Application-Group-config, Mds-Application-name, Mds-Application-location, Mds-Application-info | Hierarchical, LDAP | MDS-Globus
Info. Provider | GlueSLFileName, GlueSLFileSize, GlueSLFilePath | Hierarchical, LDAP | MDS-EDG
GDMP producer | ExportCatalogue | RGMA | Replica Catalogue Service, GDMP-EDG

Slide 20

Application & Logging Information

Source | Information provided | Schema | System
TRIANA | Workflow information & Metadata | XML | TRIANA - GridLab
Condor submission | DAGMan input file (DAG specification and metadata) | Condor-specific | Condor meta-scheduler
Workload Management System | BrokerInfo file | Hierarchical, LDAP | Resource Broker (EDG)
LDAP queries to JSS, RB. | Logging information; Bookkeeping information (transient): UserID, JobID, Job State, JobDescription, etc. | Attribute=value; Events, exported API for queries | LB Server (EDG)

Slide 21

Limitations of Current Approaches

• Remarks extracted from the description of a Grid-application development effort:
– “Jobs typically need to access hundreds of files, and each site has a different subset of the files.”
– “Our data system knows what portion of a user's data may be at each site, but does not know how to submit grid jobs.”
– “Our job submission system required users to choose grid sites and gave them no assistance in choosing.”
– “…jobs requesting thousands of files and sites having hundreds of thousands of files are not uncommon in production.”
– “…it would not be scalable to explicitly publish all the properties of jobs and resources in ...”

Slide 22

Limitations and Challenges

• Scalability in the context of Millions of Resources:
– Infrastructure intrusiveness.
– Resource Discovery, Retrieval and Classification.
• Expressiveness of Data Models in terms of:
– Types of captured information.
– Expressing semantic relationships between represented entities.
– Amenability to Indexing, Query Optimization.
• Complexity:
– Different protocols for discovery & inquiry, registration, invocation.
– Lack of interoperability between different platforms.
– Information Standardization.
• Missing Functionalities:
– Transient and Historical information.
– Policies.
– Complex Queries.

Slide 23

Revisiting the problem

• Very large number of sources.
• Independent.
• No common schema.
• Various, partly unknown semantics.
• Subject to change, birth, or silence.

Slide 24

Revisiting the problem

• A federated warehouse approach:
– “Wrap” the various sources to extract their information.
– Store data in a warehouse.
– Monitor sources and propagate updates to the warehouse.
– Ask queries to the warehouse.
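The four warehouse steps can be sketched in a few lines. This is a toy under stated assumptions: the `Warehouse` class, the wrapper functions, the normalized field names `id` and `free_mb`, and the R-GMA row layout are invented for illustration, and "monitoring" is reduced to re-polling each source on `refresh()`. Only the two `Mds-*` attribute names are taken from the examples earlier in the talk.

```python
class Warehouse:
    """Central store that answers queries over records pulled from wrapped sources."""

    def __init__(self):
        self._wrappers = {}  # source name -> zero-argument callable returning records
        self._records = {}   # (source name, entity id) -> normalized record

    def register(self, name, wrapper):
        """Attach a wrapper that extracts normalized records from one source."""
        self._wrappers[name] = wrapper

    def refresh(self):
        """Re-read every source so additions, changes, and removals reach the store."""
        fresh = {}
        for name, wrapper in self._wrappers.items():
            for record in wrapper():
                fresh[(name, record["id"])] = record
        self._records = fresh

    def query(self, predicate):
        """Return every stored record for which predicate(record) is true."""
        return [r for r in self._records.values() if predicate(r)]


# Two hypothetical sources with different native representations.
mds_entries = [{"Mds-Host-hn": "node1", "Mds-Memory-Ram-freeMB": 512}]
rgma_rows = [("node2", 2048)]  # assumed (host, free memory in MB) tuples


def wrap_mds():
    return [{"id": e["Mds-Host-hn"], "free_mb": e["Mds-Memory-Ram-freeMB"]}
            for e in mds_entries]


def wrap_rgma():
    return [{"id": host, "free_mb": free} for host, free in rgma_rows]


warehouse = Warehouse()
warehouse.register("mds", wrap_mds)
warehouse.register("rgma", wrap_rgma)
warehouse.refresh()
print(warehouse.query(lambda r: r["free_mb"] >= 1024))
```

Queries never touch the sources directly, so their cost is independent of how many protocols the sources speak; the price is that answers are only as fresh as the last refresh.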

Slide 25

Requirements for Searching the Grid

• Global/Common naming scheme for Grid entities.

• Resolution mechanism for discovery and retrieval of entity-related information/meta-data.

• Type and representation of retrieved entity-related information.

• Mining and representation of relationships and summary data.

• Complexity of queries and query interpretation.

Slide 26

Research Issues

• Metadata Consolidation: Definition & local creation of metadata about Grid entities.
• Information Source Discovery: Algorithms for Search and Discovery, Management of Updates.
• Metadata Retrieval and Integration: Protocols for retrieval; Data structures and algorithms for integration.
• Management of meta-data: Analysis to build proper indexes; Extrapolation of semantic relationships.
• Query mechanisms and interface: Query language definition. Intelligent-agent interface to help users formulate queries.

Slide 27

Outline

• Context and Motivation
• Limitations of Grid Information Services
• Semantic Grid and Ontologies
• A Core Grid Ontology
• Conclusions and Future Work

Slide 28

Looking for answers: Semantic Grid

An extension of the current Grid in which information and services are given well-defined and explicitly represented meaning, so that it can be shared and used by humans and machines, better enabling them to work in cooperation.

Source: Goble, Bechhofer, DeRoure, Semantic Grid 101, GGF16, Athens, 2/2005

Slide 29

Ontologies and the Semantic Grid

• Ontologies are among the key building blocks of the Semantic Grid.
– The concepts/terms of Grid entities, resources, capabilities and the relationships between them.
• We develop Grid ontologies to:
– Merge the information from different sources;
– Build a knowledge base for Grid infrastructures;
– Construct a Grid information system;
– Support co-operation with semantics-aware Grid services, such as Resource Broker, Information Service, etc.

Slide 30

Ontologies in Computer Science

• An ontology is an engineering artifact:
– It is constituted by a specific vocabulary used to describe a certain reality, plus
– a set of explicit assumptions regarding the intended meaning of the vocabulary.
• Almost always including how concepts should be classified
• Thus, an ontology describes a formal specification of a certain domain:
– Shared understanding of a domain of interest
– Formal and machine manipulable model of a domain of interest

Source: Goble, Bechhofer, DeRoure, Semantic Grid 101, GGF16, Athens, 2/2005

Slide 31

Languages

• Work on Semantic Web has concentrated on the definition of a collection or “stack” of languages.
– These languages are then used to support the representation and use of metadata.
• The languages provide basic machinery that can be used to represent the extra semantic information needed for the Semantic Web
– XML
– RDF
– RDF(S)
– OWL
– …

[Figure: language stack. Recoverable labels: XML, RDF, RDF(S), OWL, Annotation, Integration, Inference.]

Source: Goble, Bechhofer, DeRoure, Semantic Grid 101, GGF16, Athens, 2/2005

Slide 32

“W3C” Stack

• XML provides a surface syntax for structured documents.
• XML Schema is a language for restricting the structure of XML documents.
• RDF is a data-model for objects ("resources") and relations between them, and provides a simple semantics for this data-model.
• RDF Schema is a vocabulary for describing properties and classes of RDF resources, with semantics for generalization and hierarchies of such properties and classes.
• OWL adds more vocabulary for describing properties and classes.

Slide 33

Outline

• Context and Motivation
• Limitations of Grid Information Services
• Semantic Grid and Ontologies
• A Core Grid Ontology
• Conclusions and Future Work

Slide 34

Towards a general Ontology for Grids

• Currently, there are several Grid architectures and Grid implementations.
• Different views of Grid entities and their properties.
• It is practically impossible that one ontology can include all aspects of Grids or of many types of Grid entities.
• A Core Grid Ontology (CGO):
– A core “framework” for representing a Grid.
– Open and extensible for all kinds of Grid architectures and Grid implementations.

Slide 35

Building a Core Ontology

• The most difficult task in developing an ontology:
– Capture a “right” model for the Grid;
• Our view of a Grid:
– Users&Applications+{Middleware/Services}+Resources within VOs;
• A layer-structured model consisting of three layers:
– Users/Applications
– Middleware/services
– Resources.

GGF 16, 2/2006

Slide 36

A Grid Model

Slide 37

CGO Classes Overview

Slide 38

Defining properties

Based on the Constraints of the CGO Classes.

Slide 39

Representing a Grid Entity

Slide 40

Representing a Grid Entity using OWL

<owl:Class rdf:ID="ComputingElement">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:someValuesFrom>
        <owl:Class>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="#Jobmanager"/>
            <owl:Class rdf:about="#JobScheduler"/>
          </owl:unionOf>
        </owl:Class>
      </owl:someValuesFrom>
      <owl:onProperty rdf:resource="#runningSevice"/>
    </owl:Restriction>
  </rdfs:subClassOf>
……
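To make the restriction above concrete, the following sketch reads it back with Python's standard `xml.etree.ElementTree`. It is an illustration only: the fragment is wrapped in an `rdf:RDF` root with the standard namespace declarations and closed with `</owl:Class>` where the slide elides the rest, and the helper name `existential_restrictions` is made up here. Identifiers are copied from the slide unchanged.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# The slide's fragment, completed into a well-formed document for parsing.
DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="ComputingElement">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:someValuesFrom>
          <owl:Class>
            <owl:unionOf rdf:parseType="Collection">
              <owl:Class rdf:about="#Jobmanager"/>
              <owl:Class rdf:about="#JobScheduler"/>
            </owl:unionOf>
          </owl:Class>
        </owl:someValuesFrom>
        <owl:onProperty rdf:resource="#runningSevice"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
"""


def existential_restrictions(xml_text):
    """Map each named class to a list of (property, allowed filler classes) pairs."""
    root = ET.fromstring(xml_text)
    fillers_path = (f"{{{OWL}}}someValuesFrom/{{{OWL}}}Class/"
                    f"{{{OWL}}}unionOf/{{{OWL}}}Class")
    result = {}
    for cls in root.findall(f"{{{OWL}}}Class"):
        pairs = []
        for restriction in cls.findall(f"{{{RDFS}}}subClassOf/{{{OWL}}}Restriction"):
            prop = restriction.find(f"{{{OWL}}}onProperty").get(f"{{{RDF}}}resource")
            fillers = [c.get(f"{{{RDF}}}about") for c in restriction.findall(fillers_path)]
            pairs.append((prop, fillers))
        result[cls.get(f"{{{RDF}}}ID")] = pairs
    return result


print(existential_restrictions(DOC))
```

Read in plain words, the extracted pair says that every ComputingElement has at least one value of the named property that is a Jobmanager or a JobScheduler. A full OWL toolkit would be needed for anything beyond this one pattern.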

Slide 41

Generating Instances

Slide 42

Conclusions

• The CGO can be used as a common, extensible language for:
– Expressing the basic concepts of a Grid infrastructure and the relationships thereof.
– Encoding and storing Grid metadata.
– Integrating grid-related information extracted from different sources.
– Expressing queries.

Slide 43

Next steps

• Automate the knowledge-base construction and maintenance process:
– Information-source discovery
– Metadata wrapping
– Metadata integration
– Consistency updates

• Investigate mechanisms for efficient knowledge-base query implementation.

Slide 44

Thank you for your attention!

• Questions?
• Comments?

Slide 45

References

• "A Core Grid Ontology for the Semantic Grid." Wei Xing, M. D. Dikaiakos, and R. Sakellariou. 6th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), Singapore, May 2006 (to appear).
• "Information Services for Large-scale Grids: A Case for a Grid Search Engine." M. D. Dikaiakos, R. Sakellariou, and Y. Ioannidis. In Engineering the Grid: status and perspectives, Jack Dongarra, Hans Zima, Adolfy Hoisie, Laurence Yang, Beniamino DiMartino (Editors), American Scientific Publishers, January 2006, ISBN: 1-58883-038-1.
• "Building a Distributed Digital Library for Natural Disasters Metadata with Grid Services and RDF." W. Xing, M. D. Dikaiakos, Hua Yang, A. Sphyris, G. Eftychidis. Library Management Journal (Special Issue on Digital Libraries in the Knowledge Era: Knowledge Management and Semantic Web Technology). Vol. 26, No. 4-5, May 2005.
• "Search Engines for the Grid: A Research Agenda." M. D. Dikaiakos, Y. Ioannidis, R. Sakellariou. In Grid Computing. First European AcrossGrids Conference, Santiago de Compostela, Spain, February 2003, Revised Papers, Lecture Notes in Computer Science series, vol. 2970, pages 49-58, Springer, 2004.

Slide 46

The RDF Data Model

• Statements are <subject, predicate, object> triples:
– <Sean,hasColleague,Ian>
• Can be represented as a graph:

[Figure: Sean --hasColleague--> Ian]

• Statements describe properties of resources
• A resource is any object that can be pointed to by a URI:
– The generic set of all names/addresses that are short strings that refer to resources
– a document, a picture, a paragraph on the Web, http://www.cs.man.ac.uk/index.html, a book in the library, a real person (?), isbn://0141184280
• Properties themselves are also resources (URIs)

Source: Goble, Bechhofer, DeRoure, Semantic Grid 101, GGF16, Athens, 2/2005

Slide 47

Linking Statements

• The subject of one statement can be the object of another
• Such collections of statements form a directed, labeled graph
• The object of a triple can also be a “literal” (a string)

[Figure: graph with edges Sean --hasColleague--> Ian; Carole --hasColleague--> Ian; Ian --hasHomePage--> http://www.cs.man.ac.uk/~horrocks; Sean --hasName--> “Sean K. Bechhofer”]
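The statements in this example can be held as plain tuples, which makes the "directed, labeled graph" reading easy to try out. The sketch below is illustrative: short names stand in for full URIs, and the helpers `match` and `colleagues_home_pages` are hypothetical names introduced here, not part of any RDF library.

```python
# Each statement is a (subject, predicate, object) tuple; the last object is a literal.
TRIPLES = {
    ("Sean", "hasColleague", "Ian"),
    ("Carole", "hasColleague", "Ian"),
    ("Ian", "hasHomePage", "http://www.cs.man.ac.uk/~horrocks"),
    ("Sean", "hasName", "Sean K. Bechhofer"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None position, sorted."""
    return sorted(
        t for t in triples
        if subject in (None, t[0])
        and predicate in (None, t[1])
        and obj in (None, t[2])
    )


def colleagues_home_pages(triples, person):
    """Follow hasColleague, then hasHomePage: a two-edge path through the graph."""
    pages = []
    for _, _, colleague in match(triples, subject=person, predicate="hasColleague"):
        pages += [o for _, _, o in match(triples, subject=colleague, predicate="hasHomePage")]
    return pages


print(colleagues_home_pages(TRIPLES, "Carole"))
```

The two-step lookup works because Ian is the object of one statement and the subject of another, which is exactly the linking the slide describes.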

Slide 48

RDF Syntax

• RDF has an XML syntax that has a specific meaning:
• Every Description element describes a resource
• Every attribute or nested element inside a Description is a property of that Resource
• We can refer to resources by URIs

<rdf:Description rdf:about="some.uri/person/sean_bechhofer">
  <o:hasColleague rdf:resource="some.uri/person/ian_horrocks"/>
  <o:hasName rdf:datatype="&xsd;string">Sean K. Bechhofer</o:hasName>
</rdf:Description>

<rdf:Description rdf:about="some.uri/person/ian_horrocks">
  <o:hasHomePage>http://www.cs.man.ac.uk/~horrocks</o:hasHomePage>
</rdf:Description>

<rdf:Description rdf:about="some.uri/person/carole_goble">
  <o:hasColleague rdf:resource="some.uri/person/ian_horrocks"/>
</rdf:Description>

Slide 49

What does RDF give us?

• A mechanism for annotating data and resources.
• Single (simple) data model.
• Syntactic consistency between names (URIs).
• Low level integration of data.

Source: Goble, Bechhofer, DeRoure, Semantic Grid 101, GGF16, Athens, 2/2005

Slide 50

RDF(S): RDF Schema

• RDF gives a formalism for meta data annotation, and a way to write it down in XML, but it does not give any special meaning to vocabulary such as subClassOf or type (supporting OO-style modelling)
– Interpretation is an arbitrary binary relation
• RDF Schema extends RDF with a schema vocabulary that allows you to define basic vocabulary terms and the relations between those terms
– Class, type, subClassOf,
– Property, subPropertyOf, range, domain
– it gives “extra meaning” to particular RDF predicates and resources
– this “extra meaning”, or semantics, specifies how a term should be interpreted

Source: Goble, Bechhofer, DeRoure, Semantic Grid 101, GGF16, Athens, 2/2005
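One way to see what the "extra meaning" of subClassOf and type amounts to is to compute what follows from them. The sketch below is a minimal illustration over tuples, not an RDFS reasoner: it applies a single rule (an instance of a class is also an instance of every superclass) until nothing new appears. The class and instance names are hypothetical.

```python
def rdfs_types(triples):
    """Return all (instance, class) pairs entailed by type plus subClassOf."""
    sub = {(s, o) for s, p, o in triples if p == "subClassOf"}
    types = {(s, o) for s, p, o in triples if p == "type"}
    changed = True
    while changed:
        # If x has type C and C is a subclass of D, then x also has type D.
        derived = {(x, sup) for x, cls in types for c, sup in sub if c == cls}
        changed = not derived <= types
        types |= derived
    return types


# Hypothetical class names, loosely in the spirit of Grid entities.
FACTS = {
    ("ComputingElement", "subClassOf", "GridResource"),
    ("GridResource", "subClassOf", "GridEntity"),
    ("ce01", "type", "ComputingElement"),
}

print(sorted(rdfs_types(FACTS)))
```

In plain RDF the same three statements would license none of the derived pairs, because subClassOf would be just another uninterpreted predicate.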

Slide 51

Problems with RDFS

• RDFS is too weak to describe resources in sufficient detail
– No localised range and domain constraints
• Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants
– No existence/cardinality constraints
• Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents
– No transitive, inverse or symmetrical properties
• Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical
• It can be difficult to provide reasoning support
– No “native” reasoners for non-standard semantics
– May be possible to reason via FO axiomatisation

Source: Goble, Bechhofer, DeRoure, Semantic Grid 101, GGF16, Athens, 2/2005

Slide 52

Web Ontology Language Requirements

• Desirable features identified for Web Ontology Language:
• Extends existing Web standards
– Such as XML, RDF, RDFS
• Easy to understand and use
– Should be based on familiar KR idioms (e.g. OO-style, frames etc).
• Formally specified
• Of “adequate” expressive power
• Possible to provide automated reasoning support

Slide 53

OWL

• W3C Recommendation (February 2004)
• Well defined RDF/XML serializations
• A family of Languages
– OWL Full
– OWL DL
– OWL Lite
• Formal semantics
– First Order (DL/Lite)
– Relationship with RDF
• Comprehensive test cases for tools/implementations
• Growing industrial takeup.

Slide 54

OWL Basics

• Set of constructors for concept expressions
– Booleans: and/or/not
– Quantification: some/all
• Axioms for expressing constraints
– Necessary and Sufficient conditions on classes
– Disjointness
– Property characteristics: transitivity, inverse
• Facts
– Assertions about individuals
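Property characteristics are easiest to appreciate by looking at the facts they add. The sketch below is a toy forward-chaining loop, not an OWL reasoner: the function `saturate` and the individuals are invented for illustration, while the property names isPartOf, hasPart and touches come from the RDFS discussion in this talk.

```python
def saturate(facts, transitive=(), inverse_of=None, symmetric=()):
    """Apply transitivity, inverse, and symmetry rules until no new fact appears."""
    inverse_of = dict(inverse_of or {})
    inverse_of.update({v: k for k, v in list(inverse_of.items())})  # inverses hold both ways
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                # Chain s -p-> o with every o -p-> o2 already known.
                new |= {(s, p, o2) for s2, p2, o2 in facts if p2 == p and s2 == o}
        if new <= facts:
            return facts
        facts |= new


# Hypothetical individuals.
ASSERTED = {
    ("cpu", "isPartOf", "node"),
    ("node", "isPartOf", "cluster"),
    ("node", "touches", "rack"),
}

closure = saturate(
    ASSERTED,
    transitive={"isPartOf"},
    inverse_of={"hasPart": "isPartOf"},
    symmetric={"touches"},
)
print(sorted(closure))
```

Three asserted facts plus three property axioms yield facts such as `("cluster", "hasPart", "cpu")` that nobody stated explicitly, which is the practical payoff of moving from RDFS to OWL.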

Slide 55

Metacomputing Directory Service (MDS)

• Distributed Directory approach: collection of LDAP servers.
• Simple LDAP Information Schemas describe resource information.
• Servers:
– Grid Resource Information Server (GRIS): Runs on each resource and supplies information about it. Supports multiple resources as well.
– Grid Index Information Server (GIIS): Collects information from multiple GRIS servers. Supports particular queries for information spread across multiple GRIS servers.
• Protocols (LDAP based) for:
– Discovery and Inquiry (GRIP).
– “Soft-state” Registration (GRRP).
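"Soft-state" registration generally means that a registration lapses unless it is renewed. The sketch below illustrates only that idea with a toy index server. It does not reproduce GRRP messages or any Globus API: the class `IndexServer`, the time-to-live value, and the explicit `now` arguments are assumptions made here so that the example is deterministic.

```python
class IndexServer:
    """Toy stand-in for a GIIS: remembers which resource servers registered recently."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expires_at = {}  # resource server name -> expiry time

    def register(self, name, now):
        """Record or renew a registration; it lapses ttl seconds after `now`."""
        self._expires_at[name] = now + self.ttl

    def live(self, now):
        """Drop lapsed registrations and return the names still registered."""
        self._expires_at = {n: t for n, t in self._expires_at.items() if t > now}
        return sorted(self._expires_at)


giis = IndexServer(ttl_seconds=30)
giis.register("gris-a", now=0)
giis.register("gris-b", now=0)
giis.register("gris-a", now=20)  # gris-a renews; gris-b stays silent
print(giis.live(now=40))
```

Because silence is enough to remove an entry, a resource that crashes or leaves the Grid needs no explicit deregistration step.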