McGraw-Hill/Irwin ©2009 The McGraw-Hill Companies, All Rights Reserved
CHAPTER 6
DATABASES AND DATA WAREHOUSES
Business Driven Information Systems 2e
6-2
Chapter Six Overview
• SECTION 6.1 – DATABASE FUNDAMENTALS
– Organizational Information
– Storing Organizational Information
– Relational Database Fundamentals
– Relational Database Advantages
– Database Management Systems
– Integrating Data Among Multiple Databases
• SECTION 6.2 – DATA WAREHOUSE FUNDAMENTALS
– Accessing Organizational Information
– History of Data Warehousing
– Data Warehouse Fundamentals
– Data Mining and Business Intelligence
SECTION 6.1
DATABASE FUNDAMENTALS
6-4
LEARNING OUTCOMES
1. List, describe, and provide an example of each of the five characteristics of high quality information
2. Define the relationship between a database and a database management system
3. Describe the advantages an organization can gain by using a database.
6-5
LEARNING OUTCOMES
4. Define the fundamental concepts of the relational database model
5. Describe the two primary methods for integrating information across multiple databases
6. Compare relational integrity constraints and business-critical integrity constraints
7. Describe the benefits of a data-driven website
6-6
Organizational Information
• Information is everywhere in an organization
• Employees must be able to obtain and analyze the many different levels, formats, and granularities of organizational information to make decisions
• Successfully collecting, compiling, sorting, and analyzing information can provide tremendous insight into how an organization is performing
6-7
Organizational Information
• Levels, formats, and granularities of organizational information
6-8
The Value of Transactional and Analytical Information
6-9
The Value of Timely Information
• Timeliness is an aspect of information that depends on the situation
– Real-time information – immediate, up-to-date information
– Real-time system – provides real-time information in response to query requests
6-10
The Value of Quality Information
• Business decisions are only as good as the quality of the information used to make the decisions
• You never want to find yourself using technology to help you make a bad decision faster
6-11
The Value of Quality Information
• Characteristics of high-quality information include:
– Accuracy
– Completeness
– Consistency
– Uniqueness
– Timeliness
6-12
The Value of Quality Information
• Low quality information example
6-13
Understanding the Costs of Poor Information
• The four primary sources of low-quality information include:
1. Customers intentionally enter inaccurate information to protect their privacy
2. Different entry standards and formats
3. Operators enter abbreviated or erroneous information by accident or to save time
4. Third-party and external information contains inconsistencies, inaccuracies, and errors
6-14
Understanding the Costs of Poor Information
• Potential business effects resulting from low-quality information include:
– Inability to accurately track customers
– Difficulty identifying valuable customers
– Inability to identify selling opportunities
– Marketing to nonexistent customers
– Difficulty tracking revenue
– Inability to build strong customer relationships
6-15
Understanding the Benefits of Good Information
• High-quality information can significantly improve the chances of making a good decision
• Good decisions can directly impact an organization's bottom line
6-16
Relational Database Fundamentals
• Information is everywhere in an organization
• Information is stored in databases
– Database – maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
6-17
Relational Database Fundamentals
• Database models include:
– Hierarchical database model
– Network database model
– Relational database model – stores information in the form of logically related two-dimensional tables
6-18
Entities and Attributes
• Entity – a person, place, thing, transaction, or event about which information is stored
– The rows in each table contain the entities
– In Figure 6.5, CUSTOMER includes the Dave’s Sub Shop and Pizza Palace entities
• Attribute (field, column) – characteristics or properties of an entity class
– The columns in each table contain the attributes
– In Figure 6.5, attributes for CUSTOMER include Customer ID, Customer Name, and Contact Name
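The row-and-column idea above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names follow the CUSTOMER example, while the ID values and contact names are invented placeholders, since Figure 6.5 is not reproduced here.

```python
import sqlite3

# In-memory database. Column names follow the CUSTOMER example;
# the IDs and contact names are invented placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER, customer_name TEXT, contact_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, "Dave's Sub Shop", "Contact A"),  # one entity (row)
        (2, "Pizza Palace", "Contact B"),     # another entity (row)
    ],
)

cursor = conn.execute("SELECT * FROM customer ORDER BY customer_id")
attributes = [column[0] for column in cursor.description]  # column names
entities = cursor.fetchall()                                # rows

print(attributes)
print(entities)
```

Each tuple in `entities` is one entity, and each name in `attributes` is one attribute of the entity class.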
6-19
Keys and Relationships
• Primary keys and foreign keys identify the various entities (tables) in the database
– Primary key – a field (or group of fields) that uniquely identifies a given entity in a table
– Foreign key – a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
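A minimal sketch of primary and foreign keys, again using sqlite3. The `customer` and `orders` tables and their contents are hypothetical. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, so the sketch enables it first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"  # primary key: uniquely identifies each customer
    " customer_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"  # foreign key
)

conn.execute("INSERT INTO customer VALUES (1, 'Pizza Palace')")
conn.execute("INSERT INTO orders VALUES (100, 1)")  # valid: customer 1 exists

try:
    conn.execute("INSERT INTO orders VALUES (101, 99)")  # there is no customer 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key lets the two tables be joined into one logical result.
joined = conn.execute(
    "SELECT o.order_id, c.customer_name FROM orders o"
    " JOIN customer c ON c.customer_id = o.customer_id"
).fetchall()
print(orphan_rejected, joined)
```

The rejected insert shows the foreign key doing its job: an order cannot point at a customer that does not exist.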
6-20
• Potential relational database for Coca-Cola
6-21
Relational Database Advantages
• Database advantages from a business perspective include:
– Increased flexibility
– Increased scalability and performance
– Reduced information redundancy
– Increased information integrity (quality)
– Increased information security
6-22
Increased Flexibility
• A well-designed database should:
– Handle changes quickly and easily
– Provide users with different views
– Have only one physical view
  • Physical view – deals with the physical storage of information on a storage device
– Have multiple logical views
  • Logical view – focuses on how users logically access information
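One common way to offer several logical views over a single stored table is a SQL view. The sketch below assumes a hypothetical `employee` table, which stands in for the one physical store, and defines two views that present the same rows differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One stored table (the single physical store in this sketch).
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Employee A", "Sales", 50000.0), (2, "Employee B", "Support", 45000.0)],
)

# Two logical views over the same stored rows.
conn.execute("CREATE VIEW directory AS SELECT name, department FROM employee")
conn.execute(
    "CREATE VIEW payroll_summary AS"
    " SELECT department, SUM(salary) AS total_salary FROM employee GROUP BY department"
)

directory = conn.execute("SELECT * FROM directory ORDER BY name").fetchall()
payroll = conn.execute("SELECT * FROM payroll_summary ORDER BY department").fetchall()
print(directory)
print(payroll)
```

A staff directory user never sees salaries, and a payroll user sees only totals, yet both read the same underlying rows.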
6-23
Increased Scalability and Performance
• A database must scale to meet increased demand, while maintaining acceptable performance levels
– Scalability – refers to how well a system can adapt to increased demands
– Performance – measures how quickly a system performs a certain process or transaction
6-24
Reduced Information Redundancy
• Databases reduce information redundancy
– Redundancy – the duplication of information or storing the same information in multiple places
• Inconsistency is one of the primary problems with redundant information
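The inconsistency problem can be shown with a small sketch. All records and phone numbers below are made up for illustration.

```python
# Redundant design: the customer's phone number is repeated on every order.
redundant_orders = [
    {"order_id": 1, "customer": "Pizza Palace", "phone": "555-0100"},
    {"order_id": 2, "customer": "Pizza Palace", "phone": "555-0100"},
]
# An update that reaches only one copy leaves the copies inconsistent.
redundant_orders[0]["phone"] = "555-0199"
phones = {o["phone"] for o in redundant_orders if o["customer"] == "Pizza Palace"}
inconsistent = len(phones) > 1

# Reduced redundancy: the phone number is stored once and referenced by key.
customers = {1: {"name": "Pizza Palace", "phone": "555-0100"}}
orders = [{"order_id": 1, "customer_id": 1}, {"order_id": 2, "customer_id": 1}]
customers[1]["phone"] = "555-0199"  # a single update reaches every order
phones_seen = {customers[o["customer_id"]]["phone"] for o in orders}

print(inconsistent, phones_seen)
```

In the second design there is only one copy to change, so the orders cannot disagree about the phone number.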
6-25
Increased Information Integrity (Quality)
• Information integrity – measures the quality of information
• Integrity constraint – rules that help ensure the quality of information
– Relational integrity constraint
– Business-critical integrity constraint
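A sketch of both kinds of constraint, under one assumed reading: relational integrity constraints are basic structural rules (unique keys, required values), and business-critical integrity constraints encode a rule specific to the business. The `order_line` table and the 30 percent discount cap are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_line ("
    " order_line_id INTEGER PRIMARY KEY,"        # relational: unique identifier
    " product_name TEXT NOT NULL,"               # relational: value must be present
    " discount_pct REAL NOT NULL"
    "   CHECK (discount_pct BETWEEN 0 AND 30))"  # hypothetical business rule
)

def try_insert(row):
    """Return True if the row satisfies every constraint, False otherwise."""
    try:
        conn.execute("INSERT INTO order_line VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((1, "Cola", 10.0)),  # satisfies all constraints
    try_insert((1, "Cola", 10.0)),  # duplicate primary key
    try_insert((2, None, 10.0)),    # missing product name
    try_insert((3, "Cola", 80.0)),  # discount above the assumed 30 percent cap
]
print(results)
```

Putting the rules in the table definition means every application that writes to the database is held to them.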
6-26
Increased Information Security
• Information is an organizational asset and must be protected
• Databases offer several security features including:
– Password – provides authentication of the user
– Access level – determines who has access to the different types of information
– Access control – determines types of user access, such as read-only access
6-27
Database Management Systems
• Database management systems (DBMS) – software through which users and application programs interact with a database
6-28
Data-Driven Websites
• Data-driven website – an interactive website kept constantly updated and relevant to the needs of its customers through the use of a database
6-29
Data-Driven Websites
6-30
Data-Driven Website Business Advantages
• Development
• Content Management
• Future Expandability
• Minimizing Human Error
• Cutting Production and Update Costs
• More Efficient
• Improved Stability
6-31
Data-Driven Business Intelligence
6-32
Integrating Information among Multiple Databases
• Integration – allows separate systems to communicate directly with each other
– Forward integration – takes information entered into a given system and sends it automatically to all downstream systems and processes
– Backward integration – takes information entered into a given system and sends it automatically to all upstream systems and processes
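The two directions can be sketched as follows. The four system names and their upstream-to-downstream order are assumptions made for illustration, and each system's store is modeled as a plain dictionary.

```python
# Hypothetical systems, ordered from upstream to downstream.
SYSTEMS = ["sales", "order_entry", "fulfillment", "billing"]

def forward_targets(source):
    """Systems that receive an update under forward integration."""
    return SYSTEMS[SYSTEMS.index(source) + 1:]

def backward_targets(source):
    """Systems that receive an update under backward integration."""
    return SYSTEMS[:SYSTEMS.index(source)]

def propagate(record, source, targets, stores):
    """Write the record to the source system and copy it to each target."""
    for system in [source, *targets]:
        stores[system][record["customer_id"]] = dict(record)

stores = {name: {} for name in SYSTEMS}
update = {"customer_id": 7, "address": "12 Example Street"}

# Forward integration from order entry: only downstream systems are updated.
propagate(update, "order_entry", forward_targets("order_entry"), stores)
print({name: sorted(store) for name, store in stores.items()})
```

With forward integration alone, the upstream `sales` system never sees the change. Sending the update to `backward_targets("order_entry")` as well would close that gap.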
6-33
Integrating Information among Multiple Databases
• Forward integration
6-34
Integrating Information among Multiple Databases
• Backward integration
6-35
Integrating Information among Multiple Databases
• Building a central repository specifically for integrated information
6-36
OPENING CASE STUDY QUESTIONS
It Takes A Village to Write an Encyclopedia
1. Determine if an entry in Wikipedia is an example of transactional information or analytical information
2. What is the impact to Wikipedia if the information contained in its database is of low quality?
3. Review the five common characteristics of high quality information and rank them in order of importance to Wikipedia
6-37
OPENING CASE STUDY QUESTIONS
It Takes A Village to Write an Encyclopedia
4. How is Wikipedia resolving the issue of poor information?
5. Identify the different types of entities that might be stored in Wikipedia’s database
6. Why is database technology so important to Wikipedia’s business model?
SECTION 6.2
DATA WAREHOUSE FUNDAMENTALS
6-39
LEARNING OUTCOMES
8. Describe the roles and purposes of data warehouses and data marts in an organization
9. Compare the multidimensional nature of data warehouses (and data marts) with the two-dimensional nature of databases
6-40
LEARNING OUTCOMES
10. Identify the importance of ensuring the cleanliness of information throughout an organization
11. Explain the relationship between business intelligence and a data warehouse
6-41
HISTORY OF DATA WAREHOUSING
• Data warehouses extend the transformation of data into information
• In the 1990s, executives became less concerned with the day-to-day business operations and more concerned with overall business functions
• The data warehouse provided the ability to support decision making without disrupting the day-to-day operations
6-42
DATA WAREHOUSE FUNDAMENTALS
• Data warehouse – a logical collection of information – gathered from many different operational databases – that supports business analysis activities and decision-making tasks
• The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes
6-43
DATA WAREHOUSE FUNDAMENTALS
• Extraction, transformation, and loading (ETL) – a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse
• Data mart – contains a subset of data warehouse information
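The ETL steps can be sketched in a few lines. The two source systems, their field names, and their date and currency conventions are all hypothetical. The "common set of enterprise definitions" is assumed here to be one customer name, an amount in dollars, and an ISO date.

```python
# Extract: rows as they arrive from two hypothetical operational sources.
sales_db = [{"cust": "pizza palace", "amount_usd": "120.50", "date": "2009-01-15"}]
web_db = [{"customer_name": "Dave's Sub Shop", "total_cents": 9900, "day": "15/01/2009"}]

def transform_sales(row):
    """Map a sales-system row onto the common definitions."""
    return {
        "customer": row["cust"].title(),
        "amount": float(row["amount_usd"]),
        "date": row["date"],
    }

def transform_web(row):
    """Map a web-system row onto the common definitions."""
    day, month, year = row["day"].split("/")
    return {
        "customer": row["customer_name"],
        "amount": row["total_cents"] / 100,
        "date": f"{year}-{month}-{day}",
    }

# Load: every transformed row lands in one repository.
warehouse = []
warehouse.extend(transform_sales(r) for r in sales_db)
warehouse.extend(transform_web(r) for r in web_db)
print(warehouse)
```

After loading, both rows share the same field names and units, so they can be analyzed together.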
6-44
DATA WAREHOUSE FUNDAMENTALS
6-45
Multidimensional Analysis
• Databases contain information in a series of two-dimensional tables
• In a data warehouse and data mart, information is multidimensional: it contains layers of columns and rows
– Dimension – a particular attribute of information
6-46
Multidimensional Analysis
• Cube – common term for the representation of multidimensional information
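A cube can be sketched as a mapping from dimension values to a measure. The three dimensions (product, region, month) and the unit counts below are invented for illustration, and `roll_up` and `slice_cube` are hypothetical helper names.

```python
from collections import defaultdict

# Each fact is keyed by three dimensions: (product, region, month).
DIMENSIONS = ("product", "region", "month")
cube = {
    ("Cola", "East", "Jan"): 100,
    ("Cola", "West", "Jan"): 80,
    ("Cola", "East", "Feb"): 120,
    ("Juice", "East", "Jan"): 40,
}

def roll_up(cube, keep):
    """Total the measure over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[tuple(key[i] for i in positions)] += units
    return dict(totals)

def slice_cube(cube, dimension, value):
    """Keep only the cells where one dimension has the given value."""
    i = DIMENSIONS.index(dimension)
    return {key: units for key, units in cube.items() if key[i] == value}

by_product = roll_up(cube, ["product"])
by_region_month = roll_up(cube, ["region", "month"])
east = slice_cube(cube, "region", "East")
print(by_product, by_region_month, east)
```

The same facts answer different questions depending on which dimensions are kept, which a single two-dimensional table does not express directly.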
6-47
Information Cleansing or Scrubbing
• An organization must maintain high-quality data in the data warehouse
• Information cleansing or scrubbing – a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
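A minimal cleansing sketch, assuming two rules: discard records missing a required field, and standardize known name variants before dropping duplicates. The records, the required fields, and the `STANDARD_NAMES` mapping are hypothetical.

```python
REQUIRED = ("name", "phone")
# Hypothetical mapping from known variants to one standard name.
STANDARD_NAMES = {"pizza palace inc": "Pizza Palace", "pizza palace": "Pizza Palace"}

def cleanse(records):
    """Fix what can be fixed; discard incomplete or duplicate records."""
    cleaned, discarded, seen = [], [], set()
    for record in records:
        if any(not record.get(field) for field in REQUIRED):
            discarded.append(record)  # incomplete
            continue
        # Normalize case, punctuation, and spacing before the lookup.
        key = " ".join(record["name"].lower().replace(".", "").replace(",", "").split())
        name = STANDARD_NAMES.get(key, record["name"].strip())
        if (name, record["phone"]) in seen:
            discarded.append(record)  # duplicate once standardized
            continue
        seen.add((name, record["phone"]))
        cleaned.append({"name": name, "phone": record["phone"]})
    return cleaned, discarded

raw = [
    {"name": "Pizza Palace, Inc.", "phone": "555-0100"},
    {"name": "  pizza palace ", "phone": "555-0100"},
    {"name": "Dave's Sub Shop", "phone": ""},
]
cleaned, discarded = cleanse(raw)
print(cleaned, len(discarded))
```

Keeping the discarded records lets someone review them later instead of losing them silently.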
6-48
Information Cleansing or Scrubbing
• Contact information in an operational system
6-49
Information Cleansing or Scrubbing
• Standardizing Customer name from Operational Systems
6-50
Information Cleansing or Scrubbing
6-51
Information Cleansing or Scrubbing
• Accurate and complete information
6-52
Data Mining and Business Intelligence
• Data mining – the process of analyzing data to extract information not offered by the raw data alone
• To perform data mining users need data-mining tools
• Data-mining tools help users uncover business intelligence (BI)
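As a toy illustration of information "not offered by the raw data alone", the sketch below counts which items appear together on the same receipt. The receipts are invented, and simple pair counting is only a stand-in for what real data-mining tools do.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each is the set of items on one receipt.
transactions = [
    {"pizza", "cola"},
    {"pizza", "cola", "salad"},
    {"sub", "cola"},
    {"pizza", "cola"},
]

# Count every pair of items that appears on the same receipt.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

No single receipt says that pizza and cola sell together. The pattern shows up only when the receipts are analyzed as a group.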
6-53
OPENING CASE STUDY QUESTIONS
It Takes A Village to Write an Encyclopedia
7. How could Wikipedia use a data warehouse to improve its business operations?
8. Why must Wikipedia cleanse or scrub the information in its data warehouse?
9. How could a company use information from Wikipedia to gain business intelligence?
6-54
CLOSING CASE ONE
Google
1. How did the website RateMyProfessors.com solve its problem of low-quality information?
2. Review the five common characteristics of high-quality information and rank them in order of importance to Google’s business
3. What would be the ramifications to Google’s business if the search information it presented to its customers was of low quality?
6-55
CLOSING CASE ONE
Google
4. Describe the different types of databases. Why should Google use a relational database?
5. Identify the different types of entities, attributes, keys, and relationships that might be stored in Google’s AdWords relational database
6-56
CLOSING CASE ONE
Google
6. How could Google use a data warehouse to improve its business operations?
7. Why would Google need to scrub and cleanse the information in its data warehouse?
8. Identify a data mart that Google’s marketing and sales department might use to track and analyze its AdWords revenue
6-57
CLOSING CASE TWO
Mining the Data Warehouse
1. How is Ben & Jerry’s using business intelligence to remain successful and competitive in a saturated market?
2. Why is information cleansing and scrubbing critical to California Pizza Kitchen’s success?
6-58
CLOSING CASE TWO
Mining the Data Warehouse
3. Why is 100 percent accurate and complete information impossible for Noodles & Company to obtain?
4. Describe how each of the companies above is using BI from their data warehouse to gain a competitive advantage
6-59
CLOSING CASE THREE Harrah’s
1. Identify the effects poor information might have on Harrah’s service-oriented business strategy
2. How does Harrah’s use database technologies to implement its service-oriented strategy?
3. Harrah’s was one of the first casino companies to find value in offering rewards to customers who visit multiple Harrah’s locations. Describe the effects on the company if it did not build any integrations among the databases located at each of its casinos. How could Harrah’s use a data warehouse to synchronize customer information?
6-60
CLOSING CASE THREE Harrah’s
4. Estimate the potential impact to Harrah’s business if there is a security breach in its customer information
5. Identify three different types of data marts Harrah’s might want to build to help it analyze its operational performance
6-61
CLOSING CASE THREE Harrah’s
6. What might occur if Harrah’s fails to clean or scrub its information before loading it into its data warehouse?
6-62
BUSINESS DRIVEN BEST SELLERS
• Business @ The Speed of Thought, by Bill Gates
6-63
BUSINESS DRIVEN BEST SELLERS
• Why Smart Executives Fail, by Sydney Finkelstein