Information Technology Foundations-BIT 112
CHAPTER 4
Data and Knowledge Management
Chapter Outline
• 4.1 Managing Data
• 4.2 The Database Approach
• 4.3 Database Management Systems
• 4.4 Data Warehousing
• 4.5 Data Governance
• 4.6 Knowledge Management
Learning Objectives
• Recognize the importance of data, the issues involved in managing data, and the data life cycle.
• Describe the sources of data and explain how data are collected.
• Explain the advantages of the database approach.
• Explain the operation of data warehousing and its role in decision support.
• Explain data governance and how it helps to produce high-quality data.
• Define knowledge, and describe different types of knowledge.
Examples of Data Sources
E-mails
Credit card swipes
RFID tags
Digital video surveillance
Radiology scans
Blogs
Chapter Opening Case
Push Model
Products
Chapter Opening Case
Pull Model
Orders
4.1 Managing Data
• Difficulties in Managing Data
  – Amount of data increases exponentially.
  – Data are scattered and collected by many individuals using various methods and devices.
  – Data come from many sources.
  – Data security, quality, and integrity are critical.
Difficulties in Managing Data
• An ever-increasing amount of data needs to be considered in making organizational decisions.
The Data Deluge
http://www.applimation.com/
Data Life Cycle (Figure 4.1)
• Businesses run on data that have been processed or transformed into information and knowledge.
• Figure 4.1 illustrates the processing of data into information and ultimately knowledge.
Data, Information, Knowledge, Wisdom
• Putting data, information, knowledge, and wisdom into perspective.
What Do Data, Information, Knowledge, and Wisdom Mean?
• At your tables, take a few minutes and try to define these terms.
What Do Data, Information, Knowledge, and Wisdom Mean?
• Data Item
  – An elementary description of things, events, activities, and transactions that are recorded, classified, and stored but are not organized to convey any specific meaning.
• Information
  – Data organized so that they have meaning and value to the recipient.
• Knowledge
  – Data and/or information organized and processed to convey understanding, experience, accumulated learning, and expertise as they apply to a current problem or activity.
• Wisdom
  – The quality or state of being wise; knowledge of what is true or right coupled with just judgment as to action.
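The first three definitions above can be illustrated with a small sketch. The GPA values, student names, and the recruiting cutoff are all hypothetical and chosen only to show the progression from data items to information to knowledge.

```python
# Data items: recorded values with no context, so they convey no specific meaning.
data_items = [3.11, 2.96, 3.95, 1.99]

# Information: the same values organized so they have meaning to a recipient.
# The student names are hypothetical.
information = dict(zip(["Ana", "Ben", "Chloe", "Dev"], data_items))

# Knowledge: information applied to a current problem. Here a hypothetical
# recruiting rule interviews only students with a GPA of 3.0 or higher.
GPA_CUTOFF = 3.0
to_interview = sorted(name for name, gpa in information.items() if gpa >= GPA_CUTOFF)

print(to_interview)  # ['Ana', 'Chloe']
```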
4.2 The Database Approach
• A database management system (DBMS) provides all users with access to all the data.
• DBMSs minimize the following data management problems:
  – Data redundancy:
    • The same data are stored in many places.
  – Data isolation:
    • Applications cannot access data associated with other applications.
  – Data inconsistency:
    • Various copies of the data do not agree.
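As a minimal sketch of how redundancy leads to inconsistency, assume two applications that each keep their own copy of a customer's address. The application names, the customer record, and the new address are hypothetical.

```python
# Redundancy: two applications store the same customer data separately.
billing_app = {"123-345": {"name": "Tom Jones", "address": "12 Oak St"}}
shipping_app = {"123-345": {"name": "Tom Jones", "address": "12 Oak St"}}

# The customer moves, but only the billing application is updated.
billing_app["123-345"]["address"] = "98 Elm St"  # hypothetical new address

# Inconsistency: the copies of the data no longer agree.
inconsistent = [
    cust_id
    for cust_id in billing_app
    if billing_app[cust_id]["address"] != shipping_app.get(cust_id, {}).get("address")
]

print(inconsistent)  # ['123-345']
```

Storing the address once, in a shared database that both applications query, would remove the second copy and with it the chance of disagreement.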
Database Approach (continued)
• DBMSs maximize the following:
  – Data security:
    • Keeping the organization’s data safe from theft, modification, and/or destruction.
  – Data integrity:
    • Data must meet constraints (e.g., student grade point averages cannot be negative).
  – Data independence:
    • Applications and data are independent of one another. Because they are not linked to each other, application logic can be changed without modifying the database, and the database can be changed without modifying the applications.
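The data integrity example above (a grade point average cannot be negative) can be sketched with Python's built-in sqlite3 module. The student table and its rows are hypothetical; the CHECK clause is one common way a DBMS can enforce such a constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical STUDENT table. The CHECK clause encodes the rule that a
# grade point average cannot be negative.
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        gpa        REAL CHECK (gpa >= 0)
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'Ana', 3.4)")

# The DBMS, not the application, rejects data that violates the constraint.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Ben', -1.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)  # True 1
```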
Database Management Systems
Data Hierarchy (some DBMS Terminology)
• A bit
  – A binary digit: a “0” or a “1”.
• A byte
  – A group of eight bits that represents a single character (e.g., a letter, number, or symbol).
• A field
  – A group of logically related characters (e.g., a word, small group of words, or identification number).
• A record
  – A group of logically related fields (e.g., a student in a university database).
• A file
  – A group of logically related records.
• A database
  – A group of logically related files.
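The hierarchy above can be walked through in a few lines of Python. This is only an illustration: the student fields and values are hypothetical, and a real DBMS stores records very differently from Python objects.

```python
from dataclasses import dataclass

# Bit and byte: one ASCII character occupies one byte, which is eight bits.
char = "A"
byte_value = char.encode("ascii")[0]   # 65
bits = format(byte_value, "08b")       # the eight bits of that byte

# Field -> record -> file -> database, using hypothetical student data.
@dataclass
class StudentRecord:      # a record is a group of related fields
    student_id: str       # each attribute here is one field
    name: str
    major: str

student_file = [          # a file is a group of related records
    StudentRecord("S001", "Ana Lopez", "BIT"),
    StudentRecord("S002", "Ben Katz", "Finance"),
]

# A database is a group of related files.
database = {"students": student_file, "courses": []}

print(bits, len(bits), len(database["students"]))  # 01000001 8 2
```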
Hierarchy of Data for a Computer-Based File
Data Hierarchy (continued)
Bit (binary digit)
Byte (eight bits)
See Digital Data Representation Handout
• Review Digital Data Representation Handout
Data Hierarchy (continued)
• Example of Field and Record
Data Hierarchy (continued)
Example of a Database Form.
Designing the Database
• Data Model
  – A diagram that represents the entities in the database and their relationships.
• Data Model Components
  – Entity
    • An entity is a person, place, thing, or event about which information is maintained.
    • A record is a database instance of an entity.
  – Attribute
    • A particular characteristic or quality of a particular entity.
  – Primary Key
    • A field that uniquely identifies a record.
  – Non-key Attributes
    • A property or characteristic of an entity that is not part of the key.
Entity Example
Entity: MOVIE
Attributes:  Movie Number   Name           Rating   Rental Rate
Instances:   12345345       Die Hard       PG13     $3
             23456781       Wings          PG       $2
             65656565       Black Beauty   G        $2

Entity: CUSTOMER
Attributes:  Cust Number   Name            Address           Status Code
Instances:   123-345       Tom Jones       12 Oak St         OK
             789-789       Mary Sullivan   456 Hill Ave      Pend
             567-342       Bob Waters      7676 Scutter Rd   OK
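A rough sketch of the MOVIE entity as a relational table, using Python's built-in sqlite3 module. The column names and types are assumptions made for illustration; the three instances come from the slide. The duplicate insert shows what "uniquely identifies a record" means in practice for a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Entity MOVIE as a table: one column per attribute (names are assumed).
conn.execute("""
    CREATE TABLE movie (
        movie_number TEXT PRIMARY KEY,  -- primary key: unique per record
        name         TEXT,              -- non-key attributes
        rating       TEXT,
        rental_rate  REAL
    )
""")

# Each row is one instance of the entity.
movies = [
    ("12345345", "Die Hard", "PG13", 3.0),
    ("23456781", "Wings", "PG", 2.0),
    ("65656565", "Black Beauty", "G", 2.0),
]
conn.executemany("INSERT INTO movie VALUES (?, ?, ?, ?)", movies)

# A second record with an existing primary key value is rejected.
try:
    conn.execute("INSERT INTO movie VALUES ('12345345', 'Duplicate', 'G', 1.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

instance_count = conn.execute("SELECT COUNT(*) FROM movie").fetchone()[0]
print(duplicate_rejected, instance_count)  # True 3
```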
Entity Attribute Try it …
• Copy #: The sequence number of the item available for rent. Used to differentiate multiple copies of a Movie.
• Customer # (Fk2): Unique identifier of an individual authorized to rent a Movie.
• Late Status: A status code identifying if the rental item has not been returned by the Return Date.
• Length: The running time in minutes of the item available for rent.
• Movie #: Unique identifier of the item available for rent.
• Movie Rental: An instance of a Movie being rented by a customer.
• Movie Type: The genre or classification associated with the items available for rent.
• Movie: An item that is available to rent, a motion picture or television production.
• MPAA Rating: Motion Picture Association of America evaluation. Valid values are: G, PG, PG-13, R, and NC-17.
• Rent Date: The date a Movie is rented by a Customer.
• Return Date: The date a rented Movie is to be returned to the store for restocking.
• Title: The name of the item available for rent.
Entity-Relationship Modeling
• Database designers plan the database design in a process called entity-relationship (ER) modeling.
• ER diagrams consist of entities, attributes, and relationships.
• Other concepts
  – Entity classes
    • Groups of entities of a certain type.
  – Instance
    • The representation of a particular entity.
  – Identifiers
    • Attributes that are unique to that entity instance.
Sample Information Model (Relational - IDEF 1X)
Entity-Relationship Diagram Model
4.3 Database Management Systems
Key Definitions
• Database management system (DBMS)
  – A set of programs that provide users with tools to add, delete, access, and analyze data stored in one location.
• Relational database model
  – A popular database model, used by many DBMSs, that is based on the concept of two-dimensional tables.
• Structured Query Language (SQL)
  – SQL is a standard interactive and programming language for querying and modifying data and managing databases.
  – The core of SQL is a command language that allows the retrieval, insertion, updating, and deletion of data, as well as management and administrative functions.
• Query by Example (QBE)
  – Allows users to fill out a grid or template to construct a filter or description of the data they want.
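The four core SQL operations named above (insertion, updating, deletion, retrieval) can be tried with Python's built-in sqlite3 module. The movie table, its column names, and the new rental rate are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE movie (
        movie_number TEXT PRIMARY KEY,
        name         TEXT,
        rating       TEXT,
        rental_rate  REAL
    )
""")

# Insertion
conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?, ?)",
    [
        ("12345345", "Die Hard", "PG13", 3.0),
        ("23456781", "Wings", "PG", 2.0),
        ("65656565", "Black Beauty", "G", 2.0),
    ],
)

# Updating: change the rate for all G-rated movies (hypothetical new rate).
conn.execute("UPDATE movie SET rental_rate = 2.5 WHERE rating = 'G'")

# Deletion
conn.execute("DELETE FROM movie WHERE name = 'Wings'")

# Retrieval
rows = conn.execute("SELECT name, rental_rate FROM movie ORDER BY name").fetchall()
print(rows)  # [('Black Beauty', 2.5), ('Die Hard', 3.0)]
```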
Example of a Relational Database Table
Normalization
• A set of rules for analyzing the attributes of an information model
  – Eliminate model redundancy
  – Ensure model consistency
  – Verify structural correctness
  – Maximize stability
• However, normalization cannot validate a model's accuracy in reflecting the business meaning of the information
Normal Forms
• Sequential steps for achieving an optimized and logically desirable information model
• Provides a common foundation from which an efficient physical database design can be created
• There are six degrees of normal form - the first three are usually sufficient for most modeling applications
• First normal form
• Second normal form
• Third normal form
• Boyce/Codd normal form
• Fourth normal form
• Fifth normal form
First Normal Form - (1NF)
• Every key and non-key attribute of an entity must be single valued
• No entity instance can have multiple values for a given attribute
• i.e., The No Repeat Rule
• A violating entity is corrected by removing repeating or multivalued attributes to another, dependent (child) entity
First Normal Form - Example

Before:

RESTAURANT
REST NAME      ADDRESS           PHONE #    EMPLOYEE NAME
BURGER KING    123 NORTH ST      123-2345   JOHN, SUE, LISA
TACO HOUSE     345 126TH PLACE   765-8907   MARY, BILL
FISH COMPANY   77 SUNSET AVE     395-5682   ED, SAM, JOSE, RICK

After:

RESTAURANT: REST NAME, ADDRESS, PHONE #
EMPLOYEE: EMPLOYEE NAME, REST NAME, POSITION
Relationship: RESTAURANT employs EMPLOYEE
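The first normal form correction in this example can be sketched in plain Python. The rows are taken from the slide; the variable names are illustrative, and the POSITION attribute is left out because the slide gives no values for it.

```python
# Unnormalized RESTAURANT rows: EMPLOYEE NAME holds several values per row,
# which violates the "no repeat" rule of first normal form.
restaurants_unnormalized = [
    ("BURGER KING", "123 NORTH ST", "123-2345", "JOHN, SUE, LISA"),
    ("TACO HOUSE", "345 126TH PLACE", "765-8907", "MARY, BILL"),
    ("FISH COMPANY", "77 SUNSET AVE", "395-5682", "ED, SAM, JOSE, RICK"),
]

# RESTAURANT keeps only its single-valued attributes.
restaurant = [
    (name, address, phone)
    for name, address, phone, _ in restaurants_unnormalized
]

# Each employee becomes one row in the dependent (child) EMPLOYEE entity,
# carrying REST NAME as the link back to its parent restaurant.
employee = [
    (emp.strip(), name)
    for name, _, _, employees in restaurants_unnormalized
    for emp in employees.split(",")
]

print(len(restaurant), len(employee))  # 3 9
print(employee[0])                     # ('JOHN', 'BURGER KING')
```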
Second Normal Form - (2NF)
• An entity is in second normal form if it is in first normal form and each non-key attribute is dependent on the entire primary key
• No non-key attribute instance can be determined by knowing just part of an entity instance's key
• A violating entity is corrected by moving to a parent entity any attributes that depend on only a subset of the primary key
Second Normal Form - Example

Before:

RESTAURANT ORDER
REST NAME      SUPPLIER NAME   ORDER ITEM   SUPPLIER PHONE #
BURGER KING    SAM'S PRODUCE   BEEF         123-2345
TACO HOUSE     SALSA INC.      PEPPERS      765-8907
FISH COMPANY   SAM'S PRODUCE   SNAPPER      123-2345

After:

SUPPLIER: SUPPLIER NAME, PHONE #
RESTAURANT ORDER: REST NAME, ORDER ITEM, SUPPLIER NAME (FK1)
Relationship: SUPPLIER fills RESTAURANT ORDER
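A sketch of the second normal form correction in plain Python, using the rows from the slide. It assumes the key of RESTAURANT ORDER is the combination of REST NAME, SUPPLIER NAME, and ORDER ITEM, so SUPPLIER PHONE # depends on only part of the key. Variable names are illustrative.

```python
# (REST NAME, SUPPLIER NAME, ORDER ITEM, SUPPLIER PHONE #)
orders_1nf = [
    ("BURGER KING", "SAM'S PRODUCE", "BEEF", "123-2345"),
    ("TACO HOUSE", "SALSA INC.", "PEPPERS", "765-8907"),
    ("FISH COMPANY", "SAM'S PRODUCE", "SNAPPER", "123-2345"),
]

# SUPPLIER PHONE # depends only on SUPPLIER NAME, so it moves to a parent
# SUPPLIER entity where each phone number is stored once.
supplier = {sup: phone for _, sup, _, phone in orders_1nf}

# RESTAURANT ORDER keeps SUPPLIER NAME as a foreign key to SUPPLIER.
restaurant_order = [(rest, item, sup) for rest, sup, item, _ in orders_1nf]

# The phone number is still recoverable by following the foreign key.
order_with_phone = [
    (rest, item, supplier[sup]) for rest, item, sup in restaurant_order
]

print(len(supplier))        # 2
print(order_with_phone[0])  # ('BURGER KING', 'BEEF', '123-2345')
```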
Third Normal Form - (3NF)
• An entity is in third normal form if it is in second normal form and each non-key attribute depends on the entire primary key and on nothing other than the key
• No non-key attribute instance can be determined by knowing the value of another non-key attribute for the same instance
• A violating entity is corrected by moving to a parent entity any attributes exhibiting transitive dependencies (non-key attributes that depend not only on the whole key but also on other non-key attributes)
Third Normal Form - Example

Before:

RESTAURANT RESERVATION
REST NAME      RESERVATION #   CUSTOMER NAME   CUSTOMER PHONE #   TIME       # IN PARTY
BURGER KING    12              F. JONES        123-2345           11:00 AM   4
TACO HOUSE     234             R. SMITH        765-8907           2:30 PM    4
FISH COMPANY   88              F. JONES        123-2345           8:15 PM    6

After:

CUSTOMER: CUSTOMER NAME, PHONE #
RESTAURANT RESERVATION: REST NAME, RESERVATION #, CUSTOMER NAME (FK1), TIME, # IN PARTY
Relationship: CUSTOMER makes RESTAURANT RESERVATION
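The transitive dependency in this example can be checked and removed with a short sketch. The rows come from the slide; the `determines` helper is a hypothetical function written for this illustration. Note that sample data can only suggest a dependency, since whether one attribute truly determines another is a business rule.

```python
# (REST NAME, RESERVATION #, CUSTOMER NAME, CUSTOMER PHONE #, TIME, # IN PARTY)
reservations = [
    ("BURGER KING", 12, "F. JONES", "123-2345", "11:00 AM", 4),
    ("TACO HOUSE", 234, "R. SMITH", "765-8907", "2:30 PM", 4),
    ("FISH COMPANY", 88, "F. JONES", "123-2345", "8:15 PM", 6),
]

def determines(rows, a, b):
    """Return True if equal values in column a always come with equal values in column b."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True

# CUSTOMER PHONE # follows from CUSTOMER NAME, a non-key attribute.
phone_follows_name = determines(reservations, 2, 3)
# # IN PARTY does not: F. JONES books parties of 4 and of 6.
party_follows_name = determines(reservations, 2, 5)

# Correction: move the dependent attribute to a parent CUSTOMER entity.
customer = {row[2]: row[3] for row in reservations}
restaurant_reservation = [(r[0], r[1], r[2], r[4], r[5]) for r in reservations]

print(phone_follows_name, party_follows_name, len(customer))  # True False 2
```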
Example #2
Non-Normalized Relation
Normalizing the Database (part A)
Normalizing the Database (part B)
Summary: Normalization Produces Order
Database that Catches Plagiarists P116
A Turnitin originality report
http://www.turnitin.com
4.4 Data Warehousing
• Data warehouse
  – A repository of historical data organized by subject to support decision makers in an organization.
  – Organized by business dimension or subject.
  – Data warehouses are multidimensional.
A Data Cube with three dimensions:
• customer,
• product, and
• time.
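A data cube with these three dimensions can be approximated as a dictionary keyed by (customer, product, time). The customers, products, years, and unit counts below are hypothetical, and `roll_up` is an illustrative helper, not a function from any library.

```python
from collections import defaultdict

# Hypothetical sales facts: (customer, product, time period) -> units sold.
cube = {
    ("Acme", "Nuts", "2023"): 50,
    ("Acme", "Bolts", "2023"): 40,
    ("Acme", "Nuts", "2024"): 70,
    ("Zenith", "Nuts", "2023"): 30,
    ("Zenith", "Bolts", "2024"): 90,
}

def roll_up(cube, axis):
    """Total the units along one dimension (0=customer, 1=product, 2=time)."""
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[axis]] += units
    return dict(totals)

by_product = roll_up(cube, 1)
by_time = roll_up(cube, 2)

# A "slice" fixes one dimension value and keeps the other two dimensions.
acme_slice = {key[1:]: units for key, units in cube.items() if key[0] == "Acme"}

print(by_product)       # {'Nuts': 150, 'Bolts': 130}
print(by_time)          # {'2023': 120, '2024': 160}
print(len(acme_slice))  # 3
```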
Data Warehousing (continued)
• Data warehouses are historical.
  – Historical data in data warehouses can be used for identifying trends, forecasting, and making comparisons over time.
• Data warehouses use Online Analytical Processing (OLAP).
  – OLAP involves the analysis of accumulated data by end users (usually in a data warehouse).
  – In contrast, Online Transaction Processing (OLTP) typically involves a database, where data from business transactions are processed online and as soon as they occur.
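The contrast between OLTP and OLAP can be sketched with Python's built-in sqlite3 module. For brevity both kinds of work run against one in-memory database here, whereas the slides place OLAP in a separate data warehouse. The sale table, the `record_sale` helper, and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (sale_date TEXT, product TEXT, amount REAL)")

def record_sale(sale_date, product, amount):
    """OLTP-style work: record one business transaction as soon as it occurs."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sale VALUES (?, ?, ?)", (sale_date, product, amount)
        )

record_sale("2024-01-15", "Nuts", 20.0)
record_sale("2024-01-20", "Bolts", 35.0)
record_sale("2024-02-03", "Nuts", 25.0)

# OLAP-style work: analyze the accumulated history, here totals per month.
monthly = conn.execute(
    "SELECT substr(sale_date, 1, 7) AS month, SUM(amount) "
    "FROM sale GROUP BY month ORDER BY month"
).fetchall()

print(monthly)  # [('2024-01', 55.0), ('2024-02', 25.0)]
```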
Data Warehouse Framework & Views
• Process of building and using a data warehouse.
Relational Databases
• First slide of five showing the relationship between relational databases and a multidimensional data structure (or data cube).
Multidimensional Database View
Equivalence Between Relational and Multidimensional Databases
Benefits of Data Warehousing
• End users can access data quickly and easily via Web browsers because the data are located in one place.
• End users can conduct extensive analysis with data in ways that may not have been possible before.
• End users have a consolidated view of organizational data.
Data Marts
• A data mart is a small data warehouse, designed for the end-user needs in a strategic business unit (SBU) or a department.
• Data marts are far less costly than an enterprise data warehouse, typically by at least an order of magnitude.
4.5 Data Governance – An enterprise-wide approach to managing data
• Data governance definition
  – An approach to managing data and information across an entire organization.
• Master Data Management
  – A method that organizations use in data governance.
  – Comprises a set of processes and tools for collecting, aggregating, matching, consolidating, quality-assuring, persisting, and distributing data throughout an organization in such a way as to ensure consistency and control in the ongoing maintenance and application use of this information.
• Master data
  – A set of core, nontransactional data, such as customer, product, employee, and location data, that spans all enterprise information systems.
Relationship Among Executive Management, IT Governance, and Data Governance
• Shows the relationship between data governance and data management.
Master Data Management
Data Governance (continued)
4.6 Knowledge Management
• Knowledge management (KM)
  – A process that helps organizations manipulate important knowledge that is part of the organization’s memory, usually in an unstructured format.
• Knowledge
  – Is something that is contextual, relevant, and actionable.
  – Also known as intellectual capital (or intellectual assets).
Knowledge Management (continued)
Tacit Knowledge (below the waterline)
• Subjective or experiential learning.
• Examples: experiences, insights, expertise, know-how, trade secrets, understanding, skill sets, and learning.
Explicit Knowledge (above the waterline)
• Objective, rational, technical knowledge that has been documented.
• Examples: policies, procedural guides, reports, products, strategies, goals, core competencies.
Knowledge Management (continued)
• Knowledge management systems (KMSs)
  – Systems that use information technologies to systematize, enhance, and expedite intra- and inter-organization knowledge management.
• Best practices
  – The most effective and efficient ways/processes of doing things.
Knowledge Management System Life Cycle
Six steps
1. Create knowledge
2. Capture knowledge
3. Refine knowledge
4. Store knowledge
5. Manage knowledge
6. Disseminate knowledge
Chapter Closing Case P. 131
High CVM passengers travel in style