Data Management, Information Management, Knowledge Management for Network Centric Operations. Dr. Bhavani Thuraisingham, The University of Texas at Dallas. October 2005



Data Management, Information Management,

Knowledge Management for Network Centric Operations

Dr. Bhavani Thuraisingham
The University of Texas at Dallas

October 2005


Data, Information and Knowledge Management: Definitions

• Data Management:
  - Data administration
  - Database management
• Information Management:
  - Extracting information from the data
  - Visualizing the data
• Knowledge Management:
  - Acquiring knowledge
  - Collaboration and sharing
  - Managing the processes
  - Disseminating the knowledge
  - Taking action


What is data management?

• One proposal: Data Management = Database System Management + Data Administration
• Includes data analysis, data administration, database administration, auditing, data modeling, database system development, database application development


Data Administration

• Identifying the data
  - Data may be in files, paper, databases, etc.
• Analyzing the data
  - Is the data of good quality?
  - Is the data complete?
• Data standardization
  - Should one standardize all the data elements and metadata?
  - Repositories for handling semantic heterogeneity?
• Data security
  - How should data be secured?
• Data modeling
  - Structure the data, model the data and the processes


Data Administration (Continued)

• Data quality provides some measure for determining the accuracy of the data
  - Is the data current? Can we trust the source?
  - Data quality parameters can be passed from source to source
    - E.g., Trust A 50% and Trust B 30%
• Data may have different semantics
  - E.g., Bank A may send out statements on the 20th day of each month and Bank B may send out statements on the 5th day of each month
  - Fighter jet and passenger plane may be considered to be one and the same


Data Administration (Concluded)

• Data standards
  - Standards for data semantics and administration
  - E.g., XML (eXtensible Markup Language) for document interchange
• Data security includes data confidentiality and integrity
  - Confidentiality is about preventing unauthorized access to the data
  - Integrity is about preventing malicious corruption of the data


An Example Database System

[Figure: users and application programs access the database through the database management system.]


Metadata

• Metadata describes the data in the database
  - Example: Database D consists of a relation EMP with attributes SS#, Name, and Salary
• Metadatabase stores the metadata
  - Could be physically stored with the database
• Metadatabase may also store constraints and administrative information
• Metadata is also referred to as the schema or data dictionary
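
The EMP example can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as the DBMS (the slides do not name one), uses the column name SSN in place of SS#, and picks column types purely for illustration. The catalog query returns the metadata that SQLite keeps in the same file as the data.

```python
import sqlite3

def describe_emp():
    """Create relation EMP and return its metadata as (column, type) pairs."""
    conn = sqlite3.connect(":memory:")
    # SSN stands in for the slide's SS# attribute; the types are assumed.
    conn.execute(
        "CREATE TABLE EMP (SSN TEXT PRIMARY KEY, Name TEXT, Salary REAL)"
    )
    # PRAGMA table_info reads the schema (metadata) that SQLite stores
    # together with the data.
    rows = conn.execute("PRAGMA table_info(EMP)").fetchall()
    conn.close()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]

print(describe_emp())
```

Here the metadatabase is physically stored with the database, which is one of the options the slide mentions.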


Three-level Schema Architecture: Details

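[Figure: three-level schema architecture. Users A1, A2 and A3 and users B1 and B2 work with External Model A (External Schema A) and External Model B (External Schema B). External/Conceptual Mappings A and B connect the external models to the Conceptual Model (Conceptual Schema), and the Conceptual/Internal Mapping connects the conceptual model to the Internal Model of the stored database (Internal Schema).]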


Functional Architecture

[Figure: functional architecture of a database system.]

• Data Management: User Interface Manager, Query Manager, Transaction Manager, Schema (Data Dictionary) Manager (metadata), Security/Integrity Manager
• Storage Management: File Manager, Disk Manager


Types of Database Systems

• Relational Database Systems
• Distributed and Federated Database Systems
• Object Database Systems
• Deductive Database Systems
• Other
  - Real-time, Secure, Parallel, Scientific, Temporal, Wireless, Functional, Entity-Relationship, Sensor/Stream Database Systems, etc.


Relational Database: Example

Relation S:

S#   SNAME   STATUS   CITY
S1   Smith   20       London
S2   Jones   10       Paris
S3   Blake   30       Paris
S4   Clark   20       London
S5   Adams   30       Athens

Relation P:

P#   PNAME   COLOR   WEIGHT   CITY
P1   Nut     Red     12       London
P2   Bolt    Green   17       Paris
P3   Screw   Blue    17       Rome
P4   Screw   Red     14       London
P5   Cam     Blue    12       Paris
P6   Cog     Red     19       London

Relation SP:

S#   P#   QTY
S1   P1   300
S1   P2   200
S1   P3   400
S1   P4   200
S1   P5   100
S1   P6   100
S2   P1   300
S2   P2   400
S3   P2   200
S4   P2   200
S4   P4   300
S4   P5   400
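
As an illustration of how the three relations are used together, the sketch below loads them into an in-memory SQLite database and joins S with SP. The use of sqlite3, the column names SNO and PNO (standing in for S# and P#), and the helper names build_db and suppliers_of are assumptions made for this example; the rows are the ones on the slide.

```python
import sqlite3

SUPPLIERS = [("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris"),
             ("S3", "Blake", 30, "Paris"), ("S4", "Clark", 20, "London"),
             ("S5", "Adams", 30, "Athens")]
PARTS = [("P1", "Nut", "Red", 12, "London"), ("P2", "Bolt", "Green", 17, "Paris"),
         ("P3", "Screw", "Blue", 17, "Rome"), ("P4", "Screw", "Red", 14, "London"),
         ("P5", "Cam", "Blue", 12, "Paris"), ("P6", "Cog", "Red", 19, "London")]
SHIPMENTS = [("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400),
             ("S1", "P4", 200), ("S1", "P5", 100), ("S1", "P6", 100),
             ("S2", "P1", 300), ("S2", "P2", 400), ("S3", "P2", 200),
             ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400)]

def build_db():
    """Create relations S, P and SP and load the rows from the slide."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE S (SNO TEXT PRIMARY KEY, SNAME TEXT, "
                 "STATUS INTEGER, CITY TEXT)")
    conn.execute("CREATE TABLE P (PNO TEXT PRIMARY KEY, PNAME TEXT, "
                 "COLOR TEXT, WEIGHT INTEGER, CITY TEXT)")
    conn.execute("CREATE TABLE SP (SNO TEXT, PNO TEXT, QTY INTEGER, "
                 "PRIMARY KEY (SNO, PNO))")
    conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", SUPPLIERS)
    conn.executemany("INSERT INTO P VALUES (?, ?, ?, ?, ?)", PARTS)
    conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", SHIPMENTS)
    return conn

def suppliers_of(conn, pno):
    """Names and quantities of the suppliers that ship part pno."""
    query = ("SELECT S.SNAME, SP.QTY FROM S JOIN SP ON S.SNO = SP.SNO "
             "WHERE SP.PNO = ? ORDER BY S.SNO")
    return conn.execute(query, (pno,)).fetchall()

conn = build_db()
print(suppliers_of(conn, "P2"))
```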


Example Object

[Figure: a Composite Document Object together with its component objects: Section 1 Object, Section 2 Object, Paragraph 1 Object and Paragraph 2 Object.]


Distributed Database System

[Figure: Sites 1, 2 and 3 are connected by a communication network. Each site has a distributed processor (Distributed Processor 1, 2, 3), a DBMS (DBMS 1, 2, 3) and a database (Database 1, 2, 3).]


Query Processing Example

[Figure: three sites connected by a network, each with its own DBMS and DQP (Distributed Query Processor). Fragments shown, with the slide's figures in parentheses: EMP1 (20); EMP2 (30) and DEPT2 (20); EMP1 (20), EMP3 (50) and DEPT3 (30).]

• Query at site 1: Join EMP and DEPT on D#
  - Move EMP2 to site 3; merge EMP1, EMP2, EMP3 to form EMP
  - Move DEPT2 to site 3; merge DEPT2 and DEPT3 to form DEPT
  - Join EMP and DEPT; move result to site 1
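
The three steps of the plan can be sketched in a few lines. This is only an illustration: the placement of fragments (EMP1 at sites 1 and 3, EMP2 and DEPT2 at site 2, EMP3 and DEPT3 at site 3) is one reading of the figure, and the rows, the dictionary layout and the function names are invented for the example.

```python
def merge(*fragments):
    """Union horizontal fragments, skipping rows already seen."""
    seen, merged = set(), []
    for fragment in fragments:
        for row in fragment:
            if row not in seen:
                seen.add(row)
                merged.append(row)
    return merged

def join_on_dept(emp, dept):
    """Nested-loop join of EMP (name, d#) with DEPT (d#, dname) on D#."""
    return [(name, dno, dname)
            for (name, dno) in emp
            for (d, dname) in dept
            if d == dno]

def run_query(sites):
    """Follow the plan: ship EMP2 and DEPT2 to site 3, merge, join there."""
    site2, site3 = sites[2], sites[3]
    emp = merge(site3["EMP1"], site2["EMP2"], site3["EMP3"])
    dept = merge(site2["DEPT2"], site3["DEPT3"])
    # The join result is what gets moved back to site 1.
    return join_on_dept(emp, dept)

# Hypothetical fragment contents; only the fragment names come from the slide.
SITES = {
    1: {"EMP1": [("Ann", 10)]},
    2: {"EMP2": [("Bob", 20)], "DEPT2": [(10, "Sales")]},
    3: {"EMP1": [("Ann", 10)], "EMP3": [("Cy", 20)],
        "DEPT3": [(20, "Research")]},
}

print(run_query(SITES))
```

Shipping the smaller fragments to the site that already holds most of the data is what keeps the amount of data moved across the network low in this plan.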


Transaction Processing Example

[Figure: Site 1 is the coordinator for transaction Tj. Sites 2, 3 and 4 are participants that execute subtransactions Tj2, Tj3 and Tj4.]

• Issues:
  - Concurrency control
  - Recovery
  - Data replication
• Two-phase commit:
  - Coordinator queries participants whether they are ready to commit
  - If all participants agree, then coordinator sends request for the participants to commit
• DTM (Distributed Transaction Manager) is responsible for executing the distributed transaction
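
A minimal in-memory sketch of the two phases follows. The class and function names are hypothetical, there is no networking, logging or timeout handling, and the abort branch is not on the slide; it is filled in from the standard protocol.

```python
class Participant:
    """A site executing one subtransaction of the distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        """Phase 1: vote on whether this site is ready to commit."""
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1: ask every participant whether it is ready to commit.
    votes = [p.prepare() for p in participants]
    # Phase 2: send the global decision to every participant.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


sites = [Participant("Site 2"), Participant("Site 3"), Participant("Site 4")]
print(two_phase_commit(sites))
```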


Interoperability of Heterogeneous Database Systems

[Figure: Database System A (Relational), Database System B (Object-Oriented) and Database System C (Legacy) connected by a network.]

• Transparent access to heterogeneous databases for both users and application programs; query and transaction processing


Technical Issues on the Interoperability of Heterogeneous Database Systems

• Heterogeneity with respect to data models, schema, query processing, query languages, transaction management, semantics, integrity, and security policies
• Interoperability based on client-server architectures
• Federated database management
  - Collection of cooperating, autonomous, and possibly heterogeneous component database systems, each belonging to one or more federations


Different Data Models

[Figure: four nodes connected by a network, each with its own database: Node A (relational model), Node B (network model), Node C (object-oriented model) and Node D (hierarchical model).]

• Developments: Tools for interoperability; commercial products
• Challenges: Global data model


Schema Integration and Transformation: An approach

[Figure: the schemas describing the relational, network, hierarchical and object-oriented databases are each transformed into a generic schema. The generic schemas are integrated into a global schema, which supports External Schemas I, II and III.]

• Challenges: Selecting appropriate generic representation; maintaining consistency during transformations


Semantic Heterogeneity

• Semantic heterogeneity occurs when there is a disagreement about the meaning or interpretation of the same data, that is, when the same data is interpreted differently

[Figure: the databases at Node A and Node B both hold Object O. One node interprets Object O as a passenger ship, the other as a submarine.]

• Challenges: Standard definitions; repositories


Federated Database Management

[Figure: Database Systems A, B and C grouped into Federation F1 and Federation F2.]

• Cooperating database systems, yet maintaining some degree of autonomy


Autonomy

[Figure: Components A, B and C. Component A receives a local request and a request from another component; communication between components goes through the federation.]

• Component A does not communicate with component C
• Component A honors the local request first
• Challenges: Adapt techniques to handle autonomy, e.g., transaction processing, schema integration; transition research to products


Federated Data and Policy Management

[Figure: Agencies A, B and C each maintain their own component data/policy and an export data/policy. The export data/policies feed the data/policy for the federation.]


What is Information Management?

• Information management essentially analyzes the data and makes sense out of the data
• Several technologies have to work together for effective information management
  - Data Warehousing: Extracting relevant data and putting this data into a repository for analysis
  - Data Mining: Extracting previously unknown information from the data
  - Multimedia: Managing different media including text, images, video and audio
  - Web: Managing the databases and libraries on the web


Data Warehouse

[Figure: an Oracle DBMS for Employees, a Sybase DBMS for Projects and an Informix DBMS for Medical feed the data warehouse; users query the warehouse.]

• Data Warehouse: Data correlating employees with medical benefits and projects
• Could be any DBMS; usually based on the relational data model


What is Data Mining?

• Related terms: Data Mining, Knowledge Mining, Knowledge Discovery in Databases, Data Archaeology, Data Dredging, Database Mining, Knowledge Extraction, Data Pattern Processing, Information Harvesting, Siftware
• The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data, often previously unknown, using pattern recognition technologies and statistical and mathematical techniques (Thuraisingham 1998)


Steps to Data Mining

Data sources -> Integrate data sources -> Clean/modify data sources -> Mine the data -> Examine results/Prune results -> Report final results/Take actions
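
The flow can be mirrored by a small pipeline. The sketch below is only an illustration: counting item pairs that occur together stands in for the mining step, and the data, function names and min_support threshold are all invented for the example.

```python
from collections import Counter
from itertools import combinations

def integrate(*sources):
    """Integrate data sources: pool the records from every source."""
    return [basket for source in sources for basket in source]

def clean(baskets):
    """Clean/modify: drop empty baskets, missing items and duplicates."""
    return [sorted(set(item for item in basket if item))
            for basket in baskets if basket]

def mine(baskets):
    """Mine the data: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(basket, 2))
    return counts

def prune(counts, min_support):
    """Examine/prune results: keep only pairs seen at least min_support times."""
    return {pair: n for pair, n in counts.items() if n >= min_support}

def run_pipeline(sources, min_support=2):
    """Chain the steps; the returned dictionary is the final report."""
    return prune(mine(clean(integrate(*sources))), min_support)

source_a = [["nut", "bolt"], ["nut", "bolt", "screw"]]
source_b = [["bolt", "nut", None], ["cam"]]
print(run_pipeline([source_a, source_b]))
```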


Data Mining Needs for Counterterrorism: Non-real-time Data Mining

• Gather data from multiple sources
  - Information on terrorist attacks: who, what, where, when, how
  - Personal and business data: place of birth, ethnic origin, religion, education, work history, finances, criminal record, relatives, friends and associates, travel history, ...
  - Unstructured data: newspaper articles, video clips, speeches, emails, phone records, ...
• Integrate the data, build warehouses and federations
• Develop profiles of terrorists, activities/threats
• Mine the data to extract patterns of potential terrorists and predict future activities and targets
• Find the “needle in the haystack” - suspicious needles?
• Data integrity is important
• Techniques have to SCALE


Data Mining Needs for Counterterrorism: Real-time Data Mining

• Nature of data
  - Data arriving from sensors and other devices
    - Continuous data streams
  - Breaking news, video releases, satellite images
  - Some critical data may also reside in caches
• Rapidly sift through the data and set aside data that is not needed immediately for later use and analysis (non-real-time data mining)
• Data mining techniques need to meet timing constraints
• Quality of service (QoS) tradeoffs among timeliness, precision and accuracy
• Presentation of results, visualization, real-time alerts and triggers


Data Mining as a Threat to Privacy

• Data mining gives us “facts” that are not obvious to human analysts of the data
• Can general trends across individuals be determined without revealing information about individuals?
• Possible threats:
  - Combine collections of data and infer information that is private
    - Disease information from prescription data
    - Military action from pizza delivery to the Pentagon
• Need to protect the associations and correlations between the data that are sensitive or private
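
One simple way to approach the question about general trends is to release only group-level statistics and to suppress groups that are too small to hide an individual. The sketch below assumes this small-group suppression idea, which is not taken from the slides; the records, the function name and the min_group_size threshold are invented, and suppression alone does not guarantee privacy.

```python
from collections import defaultdict

def safe_group_averages(records, min_group_size=3):
    """Average value per group, omitting groups below min_group_size."""
    groups = defaultdict(list)
    for group, value in records:
        groups[group].append(value)
    # A group with very few members would reveal (almost) individual values,
    # so it is left out of the released result.
    return {group: sum(values) / len(values)
            for group, values in groups.items()
            if len(values) >= min_group_size}

# Hypothetical (city, value) records.
records = [("London", 40), ("London", 50), ("London", 60), ("Paris", 70)]
print(safe_group_averages(records))
```

The single Paris record is withheld, while the London trend is still reported.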


Privacy Preserving Data Mining

[Figure: architecture for privacy-preserving data mining, with the following components.]

• User Interface Manager
• Constraint Manager: Privacy constraints
• Query Processor: Constraints during query and release operations
• Data Miner: Makes correlations; ensures privacy
• Database Design Tool: Structures the database
• DBMS and database


Current Status, Challenges and Directions

• Status
  - Data mining is now a technology
  - Several prototypes and tools exist; many or almost all of them work on relational databases
• Challenges
  - Mining large quantities of data; dealing with noise and uncertainty; reasoning with incomplete data; eliminating false positives and false negatives
• Directions
  - Mining multimedia and text databases, Web mining (structure, usage and content), mining metadata, real-time data mining, privacy


Semantic Web: Overview

• According to Tim Berners-Lee, the Semantic Web supports
  - Machine readable and understandable web pages
  - Enterprise application integration
  - Nodes and links that essentially form a very large database
• Premise:
  - Semantic Web Applications: Web Database Management + Web Services + Information Integration + ...
  - Semantic Web Technologies: XML, RDF, Ontologies, Rules-ML


Layered Architecture for Dependable Semantic Web

• Some Challenges: Interoperability between layers; security and privacy cut across all layers; integration of services; composability

[Figure: layered architecture. Layers shown: URI, UNICODE; XML, XML Schemas; RDF, Ontologies; Rules/Query; Logic, Proof and Trust. Other Services, TRUST and PRIVACY are drawn alongside the layers.]

• Adapted from Tim Berners-Lee’s description of the Semantic Web


What is XML all about?

• XML is needed due to the limitations of HTML and complexities of SGML
• It is an extensible markup language specified by the W3C (World Wide Web Consortium)
• Designed to make the interchange of structured documents over the web easier
• Key to XML are Document Type Definitions (DTDs) and XML Schemas
• Allows users to bring multiple files together to form compound documents
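
As a small illustration of a structured document with a DTD, the sketch below parses an invented report with Python's standard xml.etree.ElementTree module. The element names, the attribute and the content are made up for the example. ElementTree reads the DTD but does not validate against it; a validating parser would be needed to enforce the declared structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: an internal DTD followed by the content it describes.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE report [
  <!ELEMENT report (title, section+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT section (#PCDATA)>
  <!ATTLIST section id ID #REQUIRED>
]>
<report>
  <title>Status</title>
  <section id="s1">Data management</section>
  <section id="s2">Knowledge management</section>
</report>
"""

def read_report(xml_text):
    """Return the report title and its (id, text) section pairs."""
    root = ET.fromstring(xml_text)
    sections = [(s.get("id"), s.text) for s in root.findall("section")]
    return root.findtext("title"), sections

print(read_report(DOCUMENT))
```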


What is Knowledge Management?

• Knowledge management, or KM, is the process through which organizations generate value from their intellectual property and knowledge-based assets
• Gartner Group: KM is a discipline that promotes an integrated approach to identifying and sharing all of an enterprise's information assets, including databases, documents, policies and procedures as well as unarticulated expertise and experience resident in individual workers
• Peter Senge: Knowledge is the capacity for effective action; this distinguishes knowledge from data and information. KM is just another term in the ongoing continuum of business management evolution


Knowledge Management Components

[Figure: components of knowledge management: components, cycle and technologies.]

• Components: Strategies, processes, metrics
• Cycle: Knowledge creation, sharing, measurement and improvement
• Technologies: Expert systems, collaboration, training, Web


KM: Strategy, Process and Metrics

• Strategy
  - Motivation for KM and how to structure a KM program
• Process
  - Use of KM to make existing practice more effective
• Metrics
  - Measure the impact of KM on an organization


Strategy: Building Learning Organizations

• Adaptive learning and generative learning
  - Need to adapt to the changing environment
  - Total quality movement (TQM) in Japan has migrated to a generative learning model
    - Look at the world in a new way
• Changing roles of the leader
  - Migrating from decision makers to designers, teachers and stewards
• Building a shared vision
  - Encouraging ideas, requesting support, moving beyond blame, effective communication
• Learning tools
  - Learning laboratory


Knowledge Management in Process Management

• Types of processes
  - Simple processes: Low-level operation
  - Complex and nonadaptive processes: Systems that use the same rules
  - Complex and adaptive processes: Agents carrying out the processes are intelligent and adaptive
• Linking knowledge management with processes
  - Knowledge management is needed for all processes; critical for complex and adaptive processes
  - Learn from experience and use the experience in unknown situations


Metrics: The Balanced Scorecard

• Employee capabilities: Measuring the following
  - Employee satisfaction
  - Employee retention
  - Employee productivity
• Information system capabilities: Measuring the following
  - Whether each employee segment has information to carry out its operations
• Motivation and empowerment: Measuring the following
  - Suggestions made and implemented
  - Improvement
  - Team performance


Knowledge Management Architecture

• Knowledge Creation and Acquisition Manager
• Knowledge Representation Manager
• Knowledge Manipulation Manager
• Knowledge Dissemination and Sharing Manager


Secure Knowledge Management

• Protecting the intellectual property of an organization
• Access control including role-based access control
• Security for process/activity management and workflow
  - Users must have certain credentials to carry out an activity
• Composing multiple security policies across organizations
• Security for knowledge management strategies and processes
• Risk management and economic tradeoffs
• Digital rights management and trust negotiation
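
Role-based access control for activities can be sketched with two lookup tables. The users, roles and activities below are hypothetical, and a real system would also need policy composition, auditing and credential checks, which this sketch leaves out.

```python
# Hypothetical policy: which activities each role may carry out.
ROLE_PERMISSIONS = {
    "analyst": {"read_report"},
    "manager": {"read_report", "approve_report"},
}

# Hypothetical assignment of users to roles.
USER_ROLES = {
    "alice": {"analyst"},
    "bob": {"analyst", "manager"},
}

def can_perform(user, activity):
    """True if any role held by the user permits the activity."""
    roles = USER_ROLES.get(user, set())
    return any(activity in ROLE_PERMISSIONS.get(role, set()) for role in roles)

print(can_perform("alice", "approve_report"))
print(can_perform("bob", "approve_report"))
```

Permissions attach to roles rather than to individual users, so changing what a manager may do means editing one entry instead of every manager's record.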


Status and Directions

• Knowledge management has exploded due to the web
• Knowledge management has different dimensions
  - Technology, business
  - Goal is to take advantage of knowledge in a corporation for reuse
• Tools are emerging
• Need effective partnerships between business leaders, technologists and policy makers
• Knowledge management may subsume information management and data management
  - Vague boundaries


Other Ideas and Directions?

Prof. Bhavani Thuraisingham

- Director Cyber Security Center

- Department of Computer Science

- Erik Jonsson School of Engineering and Computer Science

- The University of Texas at Dallas

- Richardson, Texas

- [email protected]

http://www.utdallas.edu/~bxt043000/

President

Dr-Bhavani Security Consulting

Dallas, TX

www.dr-bhavani.org