data services: addressing the challenges of transformation to a knowledge-driven enterprise

Data Services: Addressing the challenges of transformation to a knowledge-driven enterprise Sri Gopalan Booz Allen Hamilton

Upload: justine-quentin

Post on 03-Jan-2016

Page 1: Data Services: Addressing the challenges of transformation to a knowledge-driven enterprise

Data Services: Addressing the challenges of transformation to a knowledge-driven enterprise

Sri Gopalan, Booz Allen Hamilton

Page 2

Agenda

Challenges of transitioning to a knowledge-driven enterprise

Facets of an effective Data Services solution

An approach to realizing Data Services

The Way Ahead

Questions and Comments

Page 3

Challenges of transitioning to a knowledge-driven enterprise

Page 4

The current production rate of digital information exceeds the ability to process it

Technology research firm IDC determined that the world generated 161 billion gigabytes of digital information last year

Data is contained in a multitude of unstructured (images, video, free text) and structured (RDBMS, XML, etc.) formats

Policy requirements are growing, driven both by regulatory concerns (e.g. Sarbanes-Oxley, HIPAA) and by enterprise interests (e.g. security constraints)

Organizations are struggling to get a handle on what information they have, how to search for it, and how to protect it

[Figure: Volume of Data Creation plotted against Time]

Page 5

Within many enterprises, there is no consistent way to discover, access, or share data

[Figure labels: Dept. A, Dept. B, Dept. C, Dept. D; DB, XML, Proprietary; JDBC, HTTP, Email, ERP; Portals, Web Application, Stand-alone Apps]

Without a priori knowledge of where systems are, how to access them, and how to query them, users find it difficult to get all the information that they need

Page 6

Providing Business Context to Search

The key element to search is to provide search results relevant to the given business context

While a consumer might make a request in his/her business context, the data providers may interpret that request in their own divergent business context

[Figure: Org. A and Org. B, each shown with its own Apps, Data Format, Data, and Service Interface. The request “I need a tank…” is answered with “I have scuba tanks” and with “I have gas tanks”.]

Page 7

Facets of an effective Data Services solution

Page 8

“Web 2.0” technologies provide enhanced collaboration and spark community-building activities

HousingMaps = Google Maps + Craigslist.com

JobMaps = Google Maps + Indeed Job Search

Mashups are a great example of re-purposing data, but they are still point-to-point and require a lot of redundant developer effort to create each one

Page 9

Lessons Learned from Social Software techniques

Leverage industry strengths: Use technologies and standards that are well supported by commercial and open-source tools in order to facilitate greater adoption

Greatest common factor approach: Develop solutions that meet the requirements of the widest base of users, including those that may be technologically limited or resource constrained

Evolve with the community: Develop solutions that are flexible and adaptable enough to change over time and incorporate community feedback and contributions

Keep it simple: While Data Services solutions may perform very complicated processes in the back end, try to keep the front-end interfaces as simple and easy to work with as possible

Page 10

The importance of Metadata

The main purpose of metadata, or data about data, is to speed up and enrich searching for resources

“What data services have information on recent financial filings?”

“Which data services are associated with HR data within an enterprise taxonomy?”

Types of Metadata (rows ordered along the slide's Expressiveness axis)

Metadata Type: Syntactic
Description: Describes the physical, syntactic markup of individual data elements (formatting, field markers)
Examples: Datatype, Field Length, Field Name, Tag Names, Flat File Markers

Metadata Type: Structural
Description: Describes the logical grouping of individual data elements (e.g. entity-attribute groupings)
Examples: Logical schema definitions (PersonRecord: PersonName, PersonSSN, PersonDOB)

Metadata Type: Semantic
Description: Describes the codified meaning of data elements and their relationships, including any rules or constraints on those relationships
Examples: Person was-born on PersonDOB, and was-born once and only once
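The three metadata types can be illustrated with the slide's own PersonRecord example. The sketch below is only one possible encoding: the class name `SyntacticField`, the datatypes and field lengths, the tuple layout of `SEMANTIC_RULES`, and the helper `cardinality_violations` are all assumptions made for illustration, not part of any specification mentioned in the talk.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SyntacticField:
    """Syntactic metadata: the physical markup of one data element."""
    name: str
    datatype: str   # assumed datatype label
    length: int     # assumed field length


# Structural metadata: the logical grouping of fields into an entity.
PERSON_RECORD: Dict[str, List[SyntacticField]] = {
    "PersonRecord": [
        SyntacticField("PersonName", "string", 64),
        SyntacticField("PersonSSN", "string", 11),
        SyntacticField("PersonDOB", "date", 10),
    ]
}

# Semantic metadata: (subject type, relationship, object, cardinality rule).
SEMANTIC_RULES: List[Tuple[str, str, str, str]] = [
    ("Person", "was-born", "PersonDOB", "exactly-one"),
]


def cardinality_violations(facts, rules):
    """Return the subject ids that break an 'exactly-one' rule.

    facts is a list of (subject_id, relationship, value) tuples. Only
    subjects that appear in facts are checked, so a subject with no
    fact at all is not reported by this simplified check.
    """
    violations = []
    for _subject_type, relationship, _obj, cardinality in rules:
        if cardinality != "exactly-one":
            continue
        counts = {}
        for subject_id, rel, _value in facts:
            if rel == relationship:
                counts[subject_id] = counts.get(subject_id, 0) + 1
        violations.extend(sorted(s for s, n in counts.items() if n != 1))
    return violations
```

Only the semantic layer carries enough meaning to detect that a person recorded with two birth dates is inconsistent; the syntactic and structural layers would accept both values as well-formed.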

Page 11

The Need for Data Discovery

Data Discovery provides service consumer agents with a common facility to distribute a search for relevant information across data assets within the enterprise, including those that are known a priori and those that are unexpected

Data Discovery exposes the essential metadata of a data resource (e.g. id, title, summary), not the data resource itself

Potential usage scenarios:

A consumer can “subscribe” to a Data Discovery service to automatically receive streams of information about topics he/she is interested in from a variety of data providers he/she may or may not know about

Data providers, both small and large, can more directly advertise their information to interested service consumer agents that they may or may not know about

An analyst may request more metadata about a data resource before accessing it

Page 12

Example Data Discovery Scenario

[Figure: a Search Aggregator, a Service Discovery registry (UDDI), and Search Services #1 (DB), #2 (XML), #3 (Video), and #4 (Images), with arrows numbered 1 through 5 matching the steps listed here]

1. Consumer makes discovery request

2. Search Aggregator queries Service Discovery for relevant Search Services

3. Search Aggregator distributes request to relevant Search Services

4. Search Aggregator aggregates search results

5. Search Aggregator returns all search results
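The five steps of this scenario can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions: `ServiceDiscovery` merely stands in for a registry such as UDDI, search services are modeled as plain callables, and every name here (`ResourceSummary`, `register`, `find`, `discover`) is hypothetical rather than taken from a real API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ResourceSummary:
    """Essential metadata only (id, title, summary), not the resource itself."""
    id: str
    title: str
    summary: str


# A search service takes a query string and returns metadata summaries.
SearchService = Callable[[str], List[ResourceSummary]]


class ServiceDiscovery:
    """Hypothetical stand-in for a service registry such as UDDI."""

    def __init__(self) -> None:
        self._by_topic: Dict[str, List[SearchService]] = {}

    def register(self, topic: str, service: SearchService) -> None:
        self._by_topic.setdefault(topic, []).append(service)

    def find(self, topic: str) -> List[SearchService]:
        return list(self._by_topic.get(topic, []))


class SearchAggregator:
    def __init__(self, registry: ServiceDiscovery) -> None:
        self._registry = registry

    def discover(self, topic: str, query: str) -> List[ResourceSummary]:
        # Step 1: the consumer's discovery request arrives here.
        services = self._registry.find(topic)   # Step 2: ask the registry
        results: List[ResourceSummary] = []
        for service in services:                # Step 3: distribute the request
            results.extend(service(query))      # Step 4: aggregate the results
        return results                          # Step 5: return all results
```

The consumer only ever talks to the aggregator, so providers can be added to or removed from the registry without the consumer knowing where they are or how they are queried.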

Page 13

The need for Data Access and Delivery

Once a data resource of interest has been identified via Data Discovery, a service consumer might want to “access” or “deliver” that data resource for further processing

Data Access and Delivery capabilities provide service consumer agents with a common facility to synchronously fetch a data resource or asynchronously route it to a pre-determined endpoint

Potential usage scenarios:

A user at his/her workstation can directly “access” a data resource for detailed inspection

A field technician on the job site can use his/her mobile device to “deliver” a data resource to his/her computer at work to analyze later

Data providers can lower the cost of integration by supporting a common data retrieval interface that is well-understood throughout the local enterprise and industry

Page 14

Example Data Access & Delivery Scenario

[Figure: Retrieve Service #1 (DB), the Messaging Infrastructure, and a Callback Interface, with arrows labeled to match the steps listed here]

1. Consumer makes data access request

2a. Retrieve Service returns requested information

2b. Retrieve Service forwards requested information to the Messaging Infrastructure

3a. Messaging Infrastructure routes requested information to service consumer

3b. Messaging Infrastructure routes requested information to service consumer receiver agent implementing a Callback Interface
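The difference between synchronous access (steps 1 and 2a) and asynchronous delivery (steps 1, 2b, 3b) can be sketched as follows. This is an in-process illustration only: the class and method names (`access`, `deliver`, `enqueue`, `route_pending`) are assumptions, and a real messaging infrastructure would route over a network rather than through an in-memory queue.

```python
from collections import deque
from typing import Callable, Deque, Dict, Tuple

# A callback interface is modeled as a callable that receives the payload.
Callback = Callable[[bytes], None]


class MessagingInfrastructure:
    """Hypothetical in-memory stand-in for the messaging infrastructure."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, Callback] = {}
        self._pending: Deque[Tuple[str, bytes]] = deque()

    def register(self, endpoint: str, callback: Callback) -> None:
        self._callbacks[endpoint] = callback

    def enqueue(self, endpoint: str, payload: bytes) -> None:
        self._pending.append((endpoint, payload))

    def route_pending(self) -> int:
        """Steps 3a/3b: route queued payloads to their registered endpoints."""
        delivered = 0
        while self._pending:
            endpoint, payload = self._pending.popleft()
            # Raises KeyError if the endpoint was never registered.
            self._callbacks[endpoint](payload)
            delivered += 1
        return delivered


class RetrieveService:
    def __init__(self, store: Dict[str, bytes], messaging: MessagingInfrastructure) -> None:
        self._store = store
        self._messaging = messaging

    def access(self, resource_id: str) -> bytes:
        """Steps 1 and 2a: synchronously return the requested resource."""
        return self._store[resource_id]

    def deliver(self, resource_id: str, endpoint: str) -> None:
        """Steps 1 and 2b: forward the resource for later routing."""
        self._messaging.enqueue(endpoint, self._store[resource_id])
```

In the field technician scenario, `deliver` would be called from the mobile device with the work computer as the endpoint, and the payload would arrive whenever the infrastructure next routes its queue.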

Page 15

Major issues facing distributed information sharing

Must support a number of interaction models: request-response, subscribe-push, probe and match, authenticated and/or single use of data, etc.

Must support a variety of metadata and content formats: Atom, Dublin Core, images, video, PDF, Open Document, etc.

Different types of data lend themselves to being queried by different mechanisms: XML can be natively searched with XQuery; images cannot be natively searched with XQuery

Must be designed for controlled evolution: do not want the addition of new features to alienate current users through constant upgrades or revisions; discourage specification “lip service” by avoiding unbounded fields

Page 16

An approach to realizing Data Services

Page 17

Data Service Objectives

Address the need to enable enterprise-wide data discovery and aggregation across any number of service implementations while offering end users relevant information

Enable horizontal discovery, access, and consumption of data of relevance, regardless of physical location, data type, and/or technical implementation

Support a variety of messaging patterns, security and policy requirements, and data needs

Page 18

Profile-Based Approach to achieving Data Services

Data Services specifications should focus on capturing the high-level process and use-case requirements (e.g. the need to search against metadata and content), rather than the low-level realizations of those features (e.g. XQuery vs. keyword search)

The abstract Data Services interface focuses on defining a high-level construct to capture intended behaviors that will be implemented by pluggable profiles

Inspired by token profiles within WS-Security

Loosely coupled specification that enables service providers to add new capabilities without having to change the WSDL

Enables service providers to implement only those profiles that satisfy their specific requirements

Page 19

What are the profiles we need to consider?

Context – What is the business context of the data service operation (search, retrieve)?

Ex. A set of taxonomy key-value pairs to search against a UDDI registry

Metadata – What are the metadata formats that I would like to interact with?

Ex. Dublin Core Metadata Element Set, Atom 1.0, RSS

Content – What are the content types that I would like to interact with?

Ex. PDF, Open Document, Open XML, JPEG, MPEG2

Query – Given the type of metadata and/or content, how would I like to query for information?

Ex. Keyword search, XQuery request, SPARQL requests
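The idea of pluggable profiles behind a fixed abstract interface can be sketched as a small registry. This is a minimal illustration, not the interface any specification defines: `DataService`, `add_query_profile`, `search`, and `keyword_profile` are hypothetical names, and only the Query profile is modeled.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
# A query profile turns an expression into matching records.
QueryProfile = Callable[[str, List[Record]], List[Record]]


class DataService:
    """Abstract interface: `search` never changes when profiles are added."""

    def __init__(self, records: List[Record]) -> None:
        self._records = records
        self._query_profiles: Dict[str, QueryProfile] = {}

    def add_query_profile(self, name: str, handler: QueryProfile) -> None:
        self._query_profiles[name] = handler

    def supported_query_profiles(self) -> List[str]:
        return sorted(self._query_profiles)

    def search(self, query_profile: str, expression: str) -> List[Record]:
        try:
            handler = self._query_profiles[query_profile]
        except KeyError:
            raise ValueError(f"unsupported query profile: {query_profile}") from None
        return handler(expression, self._records)


def keyword_profile(expression: str, records: List[Record]) -> List[Record]:
    """A simple keyword query profile: case-insensitive substring match."""
    needle = expression.lower()
    return [r for r in records if any(needle in v.lower() for v in r.values())]
```

A provider that only needs keyword search registers just that profile; an XQuery or SPARQL profile could be added later through `add_query_profile` without changing the signature of `search`, which is the property the slides attribute to the profile-based approach.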

Page 20

Data Services Request

The combination of different “profiles” can have measurable impact

While “CriminalMetadata”, “MugShotContent”, “CriminalQL” and “ImageMatch” do not exist today, if they are introduced in the future it should not significantly alter the way we process requests for information

Metadata Profile: CriminalMetadata

Query Profile: CriminalQL

Find Where sex = “male” and race = “white” and height >= “5-09” and height <= “5-10”

Content Profile: MugShotContent

Query Profile: ImageMatch
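A request of this shape could be represented as data and evaluated roughly as follows. Since the slide itself notes that “CriminalQL” and the other profiles do not exist, everything here is hypothetical: the `ProfiledQuery` structure, the criteria keys, the helper names, and the assumption that heights such as “5-09” are feet-inches strings.

```python
from dataclasses import dataclass
from typing import Dict, List


def height_to_inches(text: str) -> int:
    """Convert an assumed feet-inches string such as "5-09" to total inches."""
    feet, inches = text.split("-")
    return int(feet) * 12 + int(inches)


@dataclass(frozen=True)
class ProfiledQuery:
    target_profile: str        # e.g. "CriminalMetadata" or "MugShotContent"
    query_profile: str         # e.g. "CriminalQL" or "ImageMatch"
    criteria: Dict[str, str]


def run_criminal_ql(query: ProfiledQuery, records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Evaluate the slide's example predicate against in-memory records."""
    c = query.criteria
    low = height_to_inches(c["min_height"])
    high = height_to_inches(c["max_height"])
    return [
        r for r in records
        if r["sex"] == c["sex"]
        and r["race"] == c["race"]
        and low <= height_to_inches(r["height"]) <= high
    ]
```

Heights are converted to inches before comparison so that the range check does not depend on how the strings happen to sort. The second half of the slide's request (MugShotContent with ImageMatch) would be another `ProfiledQuery` handled by a different evaluator, leaving the request structure unchanged.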

Page 21

Encouraging collaboration with REST and/or SOAP

SOAP is a protocol specification that defines a uniform way of passing XML-encoded data that abstracts the physical transport layer.

Representational State Transfer (REST) is a set of architectural principles; the term is loosely used to describe any simple interface that uses XML over HTTP without an additional messaging layer such as SOAP

SOAP and REST are two different approaches that serve different needs

In many areas the provided functionality overlaps and causes a bit of contention

The two approaches, if used properly, can be complementary and will help to meet the overall data services needs
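The “additional messaging layer” can be made concrete by building the same payload twice with Python's standard `xml.etree.ElementTree`: once as plain XML, as a REST-style interface might send it over HTTP, and once wrapped in a SOAP 1.1 envelope. The `SearchRequest` and `Keyword` element names are invented for illustration; only the SOAP 1.1 envelope namespace is a real identifier.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_payload(keyword: str) -> ET.Element:
    """Hypothetical application payload shared by both styles."""
    request = ET.Element("SearchRequest")
    ET.SubElement(request, "Keyword").text = keyword
    return request


def as_plain_xml(keyword: str) -> bytes:
    """REST-style: the payload itself is the HTTP message body."""
    return ET.tostring(build_payload(keyword))


def as_soap_envelope(keyword: str) -> bytes:
    """SOAP-style: the payload is wrapped in an Envelope with Header and Body."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    # The Header is where extensions such as security tokens would be carried.
    ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    body.append(build_payload(keyword))
    return ET.tostring(envelope)
```

The payload is identical in both cases; what SOAP adds is the envelope and a header that other specifications can extend, which is one reason the two approaches can coexist in the same Data Services solution.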

Page 22

RESTful feeds may be appropriate for disparate content subscriptions

Source: RSS--Promising Technology for Building Customer Relationships (http://www.mediathink.com/rss/rss_marketers2.asp)

Page 23

SOAP-based messages are better suited for complex requests and messaging patterns

[Figure: Retrieve Services #1 (DB), #2 (XML), #3 (Video), and #4 (Images); a Subscribe Service; a Callback Interface. Arrows are labeled Subscribe, Notify, and Scheduled Pull.]

Page 24

Supporting standards that may help to advance Data Services initiatives

There is no existing set of standards that fully supports the functionality of a complete Data Services solution

Need: Standard(s)

Service Registry: UDDI v3, ebXML Registry

Security/Policy Concerns: WS-Security, SAML 2.0, XACML, WS-Policy

Notifications and Eventing: WS-Notification, WS-Eventing, WS-EventNotification

Asynchronous Behavior: WS-Addressing

Reliable Messaging: WS-ReliableMessaging

Query Languages: XQuery 1.0, XPath, SPARQL

Metadata Formats: Dublin Core, Atom 1.0

Search Functionality: Z39.50

Page 25

The Way Ahead

Page 26

OASIS Data Services Framework Technical Committee (OASIS DSF TC)

Goals and objectives for the TC include:

Collect, analyze and document the requirements for data management and sharing in a networked environment where data services lie under different domains of ownership and stewardship

Aid architects in understanding the conceptual patterns of interaction pertaining to data oriented operations

Create an abstract specification normatively describing a framework of operations to manage and retrieve data in a services environment, across ownership and stewardship boundaries

Describe service patterns and interactions between a provider, consumer, and other resources and entities

Page 27

OASIS Data Services Framework Technical Committee (OASIS DSF TC)

Out of Scope Items:

Define a mapping of the functions and elements described in the specifications to any programming language, to any particular messaging middleware, or to specific network transports

Define new key query algorithms, metadata specifications, or content specifications

Define concepts or renderings for functions that are of wider applicability, including but not limited to: addressing, query frameworks, routing, reliable message exchange

Page 28

Summary

The need for a distributed discovery, aggregation, and access mechanism is becoming more and more important

Any Data Services solution must account for a growing number of metadata specifications, content formats, and query mechanisms

WS-Security demonstrates that a profile-based solution can meet the diverse needs of a community

OASIS Data Services Framework TC will identify and fill the gaps to achieve a complete Data Services solution

Page 29

Questions and Comments