An Introduction to Track 4: SOA and Metadata (Semantics)

Chuck Mosher

Senior Enterprise Architect

cmosher @ metamatrix.com

2nd SOA for E-Government Conference, 30-31 October 2006

2

Agenda

• The drivers for data (& metadata) integration
• Metadata in an SOA
• Data services: using active metadata to drive data integration
• Beyond metadata: dictionaries, vocabularies, domain models, ontologies (semantics)
• Why ontologies?
• Overview of Track 4 Presentations
• Q & A

3

Acknowledgements

• Dave McComb*, Semantic Arts

• Atif Kureishy*, Booz | Allen | Hamilton

• John Salasin*, NIST

• Jeff Pollock, Oracle

• Brand Niemann, EPA

• Andy Evans, Revelytix

* Track 4 Speaker, 2:45-4:15 pm tomorrow

4

One of the three enablers which drives domain-wide visibility: “… is a standard enterprise data architecture — the foundation for effective and rapid data transfer and the fundamental building block to enable a common logistical picture.”

Army Lt. Gen. Claude Christianson

“If you look at all the trends in the IT arena over the past 30 to 40 years, we’ve moved into an environment where we’ve got faster networks, more powerful processors, but it really comes down to the data”

Michael Todd, DOD CIO office

Data Interoperability Lies At The Very Core of DoD Transformation

5

Dr. Linton Wells, as quoted in September’s NDIA Magazine, “…data compatibility may be an issue. Enabling digital interaction with nontraditional partners may require middleware or other programs that convert data from totally different formats …”

6

Problem Scope

• Incompatible data meanings are the largest, most expensive, and most time-consuming portion of IT visibility and IT interoperability projects:
– Gartner… Forrester… NIST…
– IDC… CIO Magazine…
• The classic “n-squared” problem of interfaces is even more severe at the data layer:
– Data-to-data interfaces outnumber “pipes”
– Tight coupling is brittle, and requires code
• Information growth is accelerating – FAST!
– 2002-2005 – more new data than all of history
– 5 exabytes of new digital data created in 2002 – enough for 0.5 million new Libraries of Congress

Jeff Pollock – 2004 White House Conference on Semantic Technology
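
The “n-squared” point can be made concrete with a little arithmetic. The sketch below is only an illustration: the slide gives no formula, so counting one directed mapping per ordered pair of systems, and two mappings per system when a shared model is used, are assumptions.

```python
def point_to_point_mappings(n: int) -> int:
    """Directed mappings needed when every system maps directly to every other."""
    return n * (n - 1)


def shared_model_mappings(n: int) -> int:
    """Mappings needed when each system maps once to and once from a shared model."""
    return 2 * n


# Compare the two growth rates for a few system counts.
for n in (5, 20, 100):
    print(n, point_to_point_mappings(n), shared_model_mappings(n))
```

Under these assumptions the direct approach grows quadratically while the shared-model approach grows linearly, which is the motivation for the vocabulary and ontology material later in the deck.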


8

Why Does SOA Need Metadata?

• An architectural style enabling loose coupling
• Cornerstone of E-Government reengineering
• Web Services and their related standards (SOAP, WSDL, UDDI) provide an implementation framework for several key features of SOA
• BUT: Web Service technologies do not provide all the requirements for Dynamic USE of Discoverable Services
• Discovery – Yes – UDDI/ebXML
• Use – No – requires service consumers and providers to agree on a pre-defined standard interface for the service

9

SOA is Easy, It’s Metadata That’s Hard

• SOA focuses on the interoperability between application interfaces & protocols

• Data (and service) meaning, integrity, and transformation have to be addressed elsewhere

• This information is found in the metadata

• SOA makes getting control over the metadata critical to success
– Or you will end up with SOA silos!

10

Metadata Is Everywhere

• Integration
– Syntactic
– Semantic
– Application
– Process
• Accessibility
• Visibility
• Discoverability
• Management
– Governance
– Auditing
– Lineage
– Quality
– Compliance
– Change Management
– Impact Analysis
– Performance

Many of the problems & issues around SOA implementations & governance boil down to getting a solid handle on all of the types & forms of metadata involved

11

What Are Semantic Conflicts?

Data Type: Different primitive or abstract types for same information
Labeling: Synonyms/antonyms have different text labels
Aggregation / Structure / Cardinality: Different conceptions about the relationships among concepts in similar data sets. Collections or constraints have been modeled differently for same information
Generalization: Different abstractions are used to model same domain
Value Representation: Different choices are made about what concepts are made explicit
Impedance Mismatch: Fundamentally different data representations are used
Naming: Synonyms/antonyms exist in same/similar concept instance values
Scaling and Unit: Different units of measures with incompatible scales
Confounding: Similar concepts with different definitions
Domain: Fundamental incompatibilities in underlying domains
Integrity: Disparity among the integrity constraints

Jeff Pollock – 2004 White House Conference on Semantic Technology
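
A few of these conflict types are easy to see in a small example. The sketch below is hypothetical: the two records, the field names `height_ft` and `height_m`, and the target names `location_id` and `height_m` are invented for illustration. It shows a data type conflict (string versus number), a labeling conflict (different field names), and a scaling and unit conflict (feet versus metres) being mediated into one form.

```python
METRES_PER_FOOT = 0.3048  # exact by international definition

# Two hypothetical source records describing the same building.
source_a = {"bldg_id": "0042", "height_ft": "120"}   # strings, feet
source_b = {"Facility_ID": 42, "height_m": 36.576}   # numbers, metres


def normalize_a(record: dict) -> dict:
    """Resolve type, label, and unit conflicts for source A."""
    return {
        "location_id": int(record["bldg_id"]),
        "height_m": round(float(record["height_ft"]) * METRES_PER_FOOT, 3),
    }


def normalize_b(record: dict) -> dict:
    """Source B only needs relabeling; its types and units already match."""
    return {
        "location_id": int(record["Facility_ID"]),
        "height_m": round(float(record["height_m"]), 3),
    }


print(normalize_a(source_a))
print(normalize_b(source_b))
```

Once both records are expressed in the same types, labels, and units, they can be compared directly; the harder conflict types in the list (confounding, domain) do not reduce to a conversion function like this.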

12

Metadata Management Maturity

• Level 1: Inventory of information assets
– Necessary 1st step – what data do we have
– Typically stored in repositories, registries, spreadsheets, implicit in data itself (relational DBs)
• Level 2: Impact analysis
– Develop domain vocabularies and data models
– Discover or create relationships between system artifacts
• Level 3: Metadata-driven integration
– Design-time metadata repository + run-time integration
– Example of Model-Driven Architecture
• Level 4: Semantic Web
– Dynamic, machine-based inferencing at the concept level
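
Level 2 impact analysis can be pictured as a walk over recorded relationships between artifacts. The following is a minimal sketch under assumed data: the artifact names and the `consumers` relationship table are hypothetical, and a real metadata repository would hold far richer relationship types.

```python
from collections import deque

# Hypothetical metadata: artifact -> artifacts that consume it directly.
consumers = {
    "db.facility.bldg_id": ["view.Location"],
    "view.Location": ["service.LocationLookup", "report.SiteSummary"],
    "service.LocationLookup": ["app.Logistics"],
}


def impacted_by(artifact: str, graph: dict) -> set:
    """Return every artifact reachable downstream of `artifact` (breadth-first)."""
    seen = set()
    queue = deque([artifact])
    while queue:
        current = queue.popleft()
        for downstream in graph.get(current, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return seen


print(sorted(impacted_by("db.facility.bldg_id", consumers)))
```

The walk answers the question "what breaks if this column changes?", which is only possible once Level 1 has produced the inventory the graph is built from.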

13

Data Evolution Timeline

[Figure: timeline of five ages of data]
• Age of Programs (1945-1970): Program-Data
• Age of Proprietary Data (1970-1994): Text, Office Docs, Databases (proprietary schema)
• Age of Open Data (1994-2000): HTML, XML (open schema)
• Age of Open Metadata (2000-2003): Namespaces, Taxonomies, RDF
• Age of Semantic Models (2003- ): Ontologies & Inference

Milestones shown along the timeline: GIGO/minis/micros, www / Netscape, Web services, OWL

Programming styles shown along the timeline: Procedural Programming, Object-Oriented Programming, Model-Driven Programming

Changing attitudes shown along the timeline: “Data is less important than code”, “Data is as important as code”, “Data is more important than code”

Michael Daconta, Creating Relevance and Reuse with Targeted Semantics, XML 2004 Conference Keynote, November 16, 2004.


15

Information Challenges

Program Challenges
• Multiple sources
• Different interfaces/drivers
• Different physical structures
• Different semantics
• Single interface to data desired
• Real-time access to data
• Performance
• Maintainability as data changes
• Maintainability as apps change

Mission Challenges
• Time-to-deploy
• Agility – Responsiveness to change
• Automation – Reduce cost of new development and operations
• ROI of enterprise information

Agency Challenges
• 100s/1000s of data sources
• 100s/1000s of applications
• Multiple access points/modes for apps
• Understanding relationships/semantics
• Data consistency
• Data reuse – bridging data silos
• Support for Web Services & SQL
• Control & manageability, compliance
• Security & auditing

[Diagram: Information Resources – ? – Communities of Interest]

16

Information Virtualization

Information Resources

Communities of Interest

Information Virtualization Layer

17

Information Virtualization

Information Virtualization Layer
• Unified Semantic Layer: unification of different concepts across systems
• Data Federation Layer: single-query access to heterogeneous systems
• Data Access/Connectivity Layer: uniform, standardized access to any system

Enterprise Data Sources

18

Metadata-Based Data Service

[Diagram labels: Master Data, Operational Data Store, Agency Application, Data Service; connections labeled SQL, SQL, API Call, XML/SOAP, SQL; caption: Bridge the Gap]

• Decouple data sources from application
– Data implementation shielded from application
• Semantic/Format Mediation
– Standard vocabulary
• Single access point
– Web Service/XML
– SQL
• Federation
– Single source or multi-source
• Scalability
– Security, performance
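
The decoupling, single access point, and federation ideas on this slide can be sketched in a few lines. This is a toy, not the product's design: the two in-memory SQLite databases, the table and column names, the `SOURCE_METADATA` structure, and the `ItemService` class are all hypothetical stand-ins.

```python
import sqlite3

# Two hypothetical sources with different physical structures.
master = sqlite3.connect(":memory:")
master.execute("CREATE TABLE parts (part_no TEXT, descr TEXT)")
master.execute("INSERT INTO parts VALUES ('P-1', 'valve')")

ods = sqlite3.connect(":memory:")
ods.execute("CREATE TABLE stock (item_code TEXT, item_name TEXT)")
ods.execute("INSERT INTO stock VALUES ('P-2', 'gasket')")

# Metadata that drives the service: where each source lives and how to read it.
SOURCE_METADATA = [
    {"connection": master, "query": "SELECT part_no, descr FROM parts"},
    {"connection": ods, "query": "SELECT item_code, item_name FROM stock"},
]


class ItemService:
    """Single access point; callers never see source tables or column names."""

    VOCABULARY = ("item_id", "item_name")  # the standard vocabulary exposed to apps

    def __init__(self, metadata):
        self._metadata = metadata

    def find_all(self):
        """Federate all sources and return rows in the standard vocabulary."""
        items = []
        for source in self._metadata:
            for row in source["connection"].execute(source["query"]):
                items.append(dict(zip(self.VOCABULARY, row)))
        return sorted(items, key=lambda item: item["item_id"])


print(ItemService(SOURCE_METADATA).find_all())
```

Adding or replacing a source means editing `SOURCE_METADATA`, not the calling application, which is the sense in which the data implementation is shielded.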

19

FEA DRM View on Data Services

DRM Version 2 Data Access Services
• Context Awareness Services
• Structural Awareness Services
• Transactional Services
• Data Query Services
• Content Search and Discovery Services
• Retrieval Services
• Subscription Services
• Notification Services

Service Types include:
• Metadata / Data
• Structured / Unstructured
• Read / Write
• Push / Pull

20

Designing data services

Modeling Information Services for SOA

[Diagram, from sources to consumers]
• Enterprise Information Sources (EIS): xml, databases, warehouses, spreadsheets, services, geo-spatial, rich media, …
• Reusable, Integrated Data Objects (labels shown: Logistics, Intelligence)
• Exposed Data Services, each with a <WSDL> (contract), reached via ODBC, JDBC, SOAP
• Information Consumers: Custom Apps; Web Services, Business Processes; Packaged Apps; Reporting, Analytics; EAI, Data warehouses

21

Data Service Abstraction Layers

• Transformations from one or more sources
• Transformations defined with:
– Joins/unions
– Criteria
– Functions
• Elements mapped to dictionary
• Business definitions captured
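
A transformation built from a join, a criterion, and functions can be illustrated with an ordinary SQL view. The sketch below uses Python's standard `sqlite3` module; the tables, columns, and the `ELEMENT_DICTIONARY` entries are hypothetical and stand in for whatever modeling tool actually holds the definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, cust_id INTEGER, amount_cents INTEGER);
CREATE TABLE customers (cust_id INTEGER, name TEXT, active INTEGER);
INSERT INTO orders VALUES (1, 10, 2500), (2, 11, 990), (3, 10, 100);
INSERT INTO customers VALUES (10, 'acme', 1), (11, 'globex', 0);

-- The abstraction layer: a view defined with a join, a criterion, and functions.
CREATE VIEW customer_order AS
SELECT UPPER(c.name)          AS customer_name,    -- function
       o.amount_cents / 100.0 AS amount_dollars    -- function
FROM orders o
JOIN customers c ON o.cust_id = c.cust_id          -- join
WHERE c.active = 1;                                -- criterion
""")

# Elements of the view mapped to dictionary entries with business definitions.
ELEMENT_DICTIONARY = {
    "customer_name": "Legal name of an active customer, upper-cased",
    "amount_dollars": "Order value in US dollars",
}

rows = conn.execute(
    "SELECT customer_name, amount_dollars FROM customer_order ORDER BY amount_dollars"
).fetchall()
print(rows)
```

Consumers query `customer_order` and never touch the base tables, so the join or the unit conversion can change without changing the consumer.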

22

Data Service Layer in SOA

[Diagram: layered stack, top to bottom]
• Client Process & Applications (App, App, App, …)
• Business Process Services
• Business Services
• Message Services (ESB)
• Data Services Layer (Data Service, Data Service, …)
• Data Sources

23

Data Services Approaches

[Diagram: Data, Content Sources feed, through transformations (T), a Model-Driven Integration Layer with a Logical Data Model (Org, Person, Image, Location; in a second view: Organization, Customer, Imagery, Location). The model can be materialized (Materialized Logical Model) into an Enriched Data/Content Store and exposed as Agile Information Services.]

Data Services for Multiple Purposes:
• Simplified access to value-added (tagged) data in real-time
• Value-added (tagged) data materialized & staged
• Phased-in migration from legacy to new
• Managed archiving via classification, retention tags
• Enhanced search via consistent content tags

24

Leveraging COI Data Dictionaries

[Diagram: sources are transformed (T) into a logical data model, which is transformed (T) into application views]

Multiple Internal/External Information Sources

Authoritative Sources:
• Mapped to logical

Logical Data Model:
• Agency or COI-specific
• Rationalize, harmonize, mediate
• C2, Logistics, Intelligence, …

Application views of information:
• Relational, XML (XML Document: <a> </a> <b> </b> …)
• Access via ODBC/JDBC, JDBC, SOAP
• Consumers: Web Services, Search Applications, Business Intelligence Applications

Example element mappings:
• bldg_id, SITENUM, Facility_ID → Location_ID
• bldg_type, Depot_Number → Location_Type
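
One way to picture a COI data dictionary at work is as a lookup from source element names to common terms. The sketch below assumes the field names on this slide pair up as shown in `DATA_DICTIONARY` (that pairing is a reading of the diagram, not something the slide states), and the sample records are invented.

```python
# Source element -> common vocabulary term (pairing assumed from the slide's labels).
DATA_DICTIONARY = {
    "bldg_id": "Location_ID",
    "SITENUM": "Location_ID",
    "Facility_ID": "Location_ID",
    "bldg_type": "Location_Type",
    "Depot_Number": "Location_Type",
}


def to_common_vocabulary(record: dict) -> dict:
    """Rename known source elements; leave unknown elements untouched."""
    return {DATA_DICTIONARY.get(name, name): value for name, value in record.items()}


# Hypothetical records from two different systems.
print(to_common_vocabulary({"bldg_id": "B-17", "bldg_type": "depot"}))
print(to_common_vocabulary({"SITENUM": 9, "note": "unmapped field"}))
```

Renaming is only the simplest case; differences in datatype or meaning between the source elements need the transformation logic discussed in the vocabulary slides.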


26

Beyond Mere Metadata

• Vocabularies/lexicons, Domain Models, Taxonomies, Ontologies

• All are means of beginning to define the context and scope of the domain of interest

• All specify artifacts in some way

• The word “semantics” often means that the relationships between artifacts are also specified

27

Semantics = Meaning = Relationships

• Humans (and therefore our machines) only ever understand anything in so far as it is related to other things

Example: the term “ID” is ambiguous on its own; its meaning depends on the terms it appears with:
• ID (alone): ambiguous
• ID with VA, NY, MD: a U.S. state (Idaho)
• ID with SUPEREGO, EGO, ANALYSIS: the Freudian id
• ID with LICENSE, CARD, BADGE: identification

31

Data Dictionary -> Vocabulary

• The data alone does not have sufficient context
• Using metadata is not enough - you must be able to leverage domain concepts and terminologies
• Example problem – potentially similar data elements, but dissimilar constructs/datatypes/descriptions
– How do we relate common constructs with uncommon datatypes?
– Solution requires that vocabulary relate those constructs across models with transformation relationships, logic
• Define business use/semantics of similar information
– Datatypes describe a set of values
– Defines the technical constraints on values
– Enables integrating information, as datatypes can be referenced by any models (relational, XML, object, …)

32

Benefits of Building a Vocabulary

• Develop reusable information models and schemas

• Capture business and technology requirements in a single vocabulary

• Capture institutional knowledge

• Enable semantic mining techniques for deeper data discovery and information sharing

• Accelerate interoperability, web services and SOA development and deployment

• Establish and maintain a common relationship across data sources

• Establish and maintain compliance with industry exchange models

• Reduce IT expenses by leveraging data in its native source

• Reduce IT expenses associated with building and maintaining partner integration

• Improved information sharing directly enhances decision making

33

Example Vocabulary Development Process

[Diagram (marked UNCLASSIFIED); process steps shown: Determine Pilot Demonstration; Develop UML Use-Case; Class Relationship Diagram; Auto Generate XSD - XML; Vocabulary Handbook]

MDA DS COI Pilot - John Shea PEO C4I, PMW180 ISR/IO NMCI


35

“Ideal” Semantics

• Formal definition of meaning
– Unambiguous
– Machine process-able
– Decidable
• Automated classification
– Membership based on properties
• Inference
– Can increase what you know based on classification
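
Classification by properties, followed by inference from the resulting class, can be shown with plain sets. This is a toy sketch, not a description logic reasoner: the class definitions, the `ENTAILED` facts, and the sample individual are all hypothetical.

```python
# Class definitions: an individual belongs to a class if it has all required properties.
CLASS_DEFINITIONS = {
    "Employee": {"employer"},
    "Manager": {"employer", "direct_reports"},
}

# Hypothetical facts that follow from class membership.
ENTAILED = {"Manager": {"can_approve_timesheets"}}


def classify(individual: dict) -> set:
    """Return every class whose required properties are present and non-empty."""
    present = {name for name, value in individual.items() if value}
    return {cls for cls, required in CLASS_DEFINITIONS.items() if required <= present}


def infer(individual: dict) -> set:
    """Derive new facts from the classes the individual was classified into."""
    facts = set()
    for cls in classify(individual):
        facts |= ENTAILED.get(cls, set())
    return facts


pat = {"employer": "Agency X", "direct_reports": ["lee"]}
print(sorted(classify(pat)), sorted(infer(pat)))
```

Nobody asserted that `pat` is a Manager or may approve timesheets; both follow from the recorded properties, which is the sense in which classification increases what you know.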

36

Ontologies

• An ontology is an explicit formal specification of the terms in a domain and the relationships between them
– Others are special cases
– Formal conceptual model
– W3C standard (OWL/RDF) implementation

• Concepts, definitions, properties, relationships

• Machines can draw inferences from the properties and relationships captured in the model

37

Ontologies

• Ontologies bring rigorous definitions of meaning to (meta)data

• More abstraction from lower levels of detail

• Key to loose-coupling

• With OWL/RDF, part of the W3C Semantic Web vision

38

W3C Semantic Web Stack

39

RDF

• Resource Description Framework

• A mechanism to make assertions about things

• In the form of a triple:

subject -> predicate -> object

Resource (URI) -> Property (URI) -> Resource (URI or literal)

• URIs establish a unique namespace; they do not have to be addressable

40

RDF Examples

[Diagram: three triples]

Business345 -> closestTo -> Airport123

Airport123 -> name -> “ORD”

Airport123 -> locatedIn -> “Chicago, IL”
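
The subject, predicate, object form is simple enough to model with tuples. The sketch below reads this slide's labels as three triples (which arrow connects which node is an assumption about the diagram), uses a made-up `http://example.org/` namespace, and uses plain Python rather than an RDF library.

```python
EX = "http://example.org/"  # hypothetical namespace; it does not need to resolve

# Each triple is (subject, predicate, object); objects may be URIs or literals.
triples = {
    (EX + "Business345", EX + "closestTo", EX + "Airport123"),
    (EX + "Airport123", EX + "name", "ORD"),
    (EX + "Airport123", EX + "locatedIn", "Chicago, IL"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Two-step lookup: which airport is closest to the business, and what is its name?
airport = match(s=EX + "Business345", p=EX + "closestTo")[0][2]
print(match(s=airport, p=EX + "name")[0][2])
```

Because every statement has the same shape, new facts about the airport or the business can be added without changing any schema.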

41

OWL

• OWL extends RDF by allowing us to create and make assertions about classes of things

[Diagram: class-level assertions]

Feline -> is a -> Mammal

Mammal -> has -> Hair

Feline -> has -> Retractable Claws
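
What class-level assertions buy you can be sketched without any OWL tooling. The example below assumes the diagram says Feline is a Mammal, Mammal has Hair, and Feline has Retractable Claws (the assignment of each "has" is a reading of the figure). It is a plain Python stand-in, not OWL syntax or an OWL reasoner.

```python
# "is a" assertions between classes (child -> parent).
SUBCLASS_OF = {"Feline": "Mammal"}

# "has" assertions made directly on each class.
HAS = {
    "Mammal": {"Hair"},
    "Feline": {"Retractable Claws"},
}


def superclasses(cls: str) -> list:
    """Follow "is a" links upward and return the ancestors in order."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def inferred_has(cls: str) -> set:
    """Combine a class's own features with those inherited from its ancestors."""
    features = set(HAS.get(cls, set()))
    for parent in superclasses(cls):
        features |= HAS.get(parent, set())
    return features


print(sorted(inferred_has("Feline")))
```

Nothing states directly that a Feline has Hair; it follows from the class relationship, which is the kind of inference the slide attributes to OWL.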

42

Semantic Mapping Challenge

[Diagram: sources are transformed (T) into a logical data model, which is transformed (T) into application views]

Multiple Internal/External Information Sources

Authoritative Sources:
• Mapped to logical

Logical Data Model:
• Agency or COI-specific
• Rationalize, harmonize, mediate
• C2, Logistics, Intelligence, …

Application views of information:
• Relational, XML (XML Document: <a> </a> <b> </b> …)
• Access via ODBC/JDBC, JDBC, SOAP
• Consumers: Web Services, Search Applications, Business Intelligence Applications

Elements to be mapped:
• bldg_id, SITENUM, Facility_ID → Location_ID
• bldg_type, Depot_Number → Location_Type

43

Contextualize (Interpret)

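• Automated term tokenization

• Automated semantic linking using the default knowledge-base contained within MatchIT

[Diagram: the term ArticleAmount is tokenized into Article and Amount; knowledge-base terms shown: Sum, Assets, Creation; relationship types shown: Synonym, Type-of]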

44

Semantic Matching (Mediate)

• With relationships pre-established within the knowledge-base…

• Identify the Target and the Source(s) and run the match.

ArticleAmount

ProductShares

Automatically linked by a specific % distance

45

Facilitate Decision Making (Mediate)

Helps facilitate rapid decision making

Target element for matching

Automatically calculated semantic distance between terms

Source candidate for matching
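
The slides do not describe how the tool computes semantic distance, so the following is only a toy illustration of the general idea: tokenize compound terms, then measure how many relationship hops separate their tokens in a knowledge base. The `EDGES` knowledge base, the hop-count metric, and the averaging rule are all assumptions made for this sketch.

```python
import re
from collections import deque

# Hypothetical knowledge base: undirected links between related terms.
EDGES = [("article", "item"), ("item", "product"),
         ("amount", "quantity"), ("quantity", "shares")]

GRAPH = {}
for a, b in EDGES:
    GRAPH.setdefault(a, set()).add(b)
    GRAPH.setdefault(b, set()).add(a)


def tokenize(term: str) -> list:
    """Split a CamelCase or snake_case term into its word tokens."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]+|[a-z]+|\d+", term)


def hops(start: str, goal: str) -> float:
    """Shortest number of links between two tokens; infinity if unconnected."""
    if start == goal:
        return 0
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in GRAPH.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return float("inf")


def term_distance(source: str, target: str) -> float:
    """Average, over source tokens, of the distance to the nearest target token."""
    src = [t.lower() for t in tokenize(source)]
    tgt = [t.lower() for t in tokenize(target)]
    if not src or not tgt:
        return float("inf")
    return sum(min(hops(s, t) for t in tgt) for s in src) / len(src)


print(tokenize("ArticleAmount"), term_distance("ArticleAmount", "ProductShares"))
```

A smaller distance suggests a better candidate match; a real matcher would still leave the final decision to a person, as the slide's emphasis on facilitating decisions implies.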

46

Integration Driven By Semantics

[Diagram: layers of models]
• Ontology Models (e.g. OWL, RDF)
• Enterprise Model (UML)
• Data Models (Relational, XML)
• Physical Sources

Captions:
• Model & relate information within any domain
• Relate information in different domains/models
• Search within and across domains for related information

47

Ontology-Driven Integration Example

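[Diagram: three columns labeled Ontology, Logical Views, Physical Sources]
• Ontology (class hierarchy): Transportation > Land > 4 Wheel, 2 Wheel; 4 Wheel > Truck, Bus, Car; Truck > Fuel Truck, Cargo Truck
• Links labeled “equivalence” connect ontology classes to logical views
• Nodes marked T sit between logical views and physical sources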
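
Ontology-driven integration can be sketched as a query that expands a class into its subclasses and then follows equivalence links to data. The class hierarchy below is a reading of this slide's diagram; the view names, the `EQUIVALENT_VIEW` links, and the rows are hypothetical.

```python
# Class hierarchy as read from the diagram (parent -> children).
SUBCLASSES = {
    "Transportation": ["Land"],
    "Land": ["4 Wheel", "2 Wheel"],
    "4 Wheel": ["Truck", "Bus", "Car"],
    "Truck": ["Fuel Truck", "Cargo Truck"],
}

# Hypothetical equivalence links from ontology classes to logical views.
EQUIVALENT_VIEW = {
    "Fuel Truck": "v_fuel_trucks",
    "Cargo Truck": "v_cargo_trucks",
    "Bus": "v_buses",
}

# Hypothetical logical views, already transformed from physical sources.
VIEWS = {
    "v_fuel_trucks": [{"id": "FT-1"}],
    "v_cargo_trucks": [{"id": "CT-7"}],
    "v_buses": [{"id": "B-3"}],
}


def descendants(cls: str) -> list:
    """Return the class followed by all of its subclasses, depth-first."""
    result = [cls]
    for child in SUBCLASSES.get(cls, []):
        result.extend(descendants(child))
    return result


def query(cls: str) -> list:
    """Collect rows from every view equivalent to the class or a subclass."""
    rows = []
    for name in descendants(cls):
        view = EQUIVALENT_VIEW.get(name)
        if view is not None:
            rows.extend(VIEWS[view])
    return rows


print([row["id"] for row in query("Truck")])
```

A caller asking for "Truck" does not need to know that trucks live in two separate views; the hierarchy and the equivalence links supply that knowledge.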


49

Track 4 Talks Tomorrow: 2:45-4:15pm

• Predictive Metrics To Guide SOA-Based System Development
– John Salasin, NIST
• Integrating SOA and Ontologies for Information Sharing
– Atif Kureishy, BAH
• SOA & Semantics
– Dave McComb, Semantic Arts

50

Predictive Metrics To Guide SOA Development

John Salasin, NIST

• Will propose a set of metrics (vocabulary) to characterize SOA-based systems
• These metrics can be assessed at different points in the development lifecycle
– Early stage (concept development)
– Architecture/Construction (system characteristics)
– Operations (robustness, performance, usage, governance)
– Evolution (extensibility, change management)
• Analysis can lead to ongoing refinement at every stage
• Quantitative, incremental Verification & Validation

51

Integrating SOA and Ontologies for Information Sharing

Atif Kureishy, BAH

• Will discuss approaches for dynamic use of discoverable services

• Leverage semantic understanding/ definition of application domain

• Ontology-driven application case study

52

SOA & Semantics – Dave McComb

Dave McComb, Semantic Arts

• How firms are using semantic web standards & technology to assist their SOA efforts

• Semantics for service discovery

• Enterprise message modeling

• Dynamic classification of messages
