McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, All Rights Reserved. Chapter 3: Databases and Data Warehouses

Upload: databaseguys

Post on 24-Dec-2014

Page 1: CC03

McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, All Rights Reserved

Chapter 3

Databases and Data Warehouses

Page 2: CC03

3-2

STUDENT LEARNING OUTCOMES

1. Describe business intelligence and its role in an organization.

2. Differentiate between databases and data warehouses with respect to their focus on OLTP and OLAP.

3. List and describe the key characteristics of a relational database.

Page 3: CC03

3-3

STUDENT LEARNING OUTCOMES

4. Define the five software components of a database management system.

5. List and describe the key characteristics of a data warehouse.

6. Define the four major types of data-mining tools in a data warehouse environment.

7. List key considerations in information ownership in an organization.

Page 4: CC03

3-4

Can Companies Keep Your Personal Information Secure and Private?

• Databases and data warehouses are organizational repositories of information

• Much of the information is personal

• It must be secure

• If hackers get your personal information, you can suffer from identity theft

Page 5: CC03

3-5

Can Companies Keep Your Personal Information Secure and Private?

• Top-10 incidents of personal information loss by organizations

• Could affect over 53 million people

• CardSystems lost information on 40 million customers

• Many others

Page 6: CC03

3-6

Can Companies Keep Your Personal Information Secure and Private?

• Have you been a victim of identity theft?
  – What happened?
  – What did you do to recover?
  – How long did it take?

Page 7: CC03

3-7

INTRODUCTION

• Businesses need business intelligence (BI)

• Business intelligence – knowledge about your customers, competitors, business partners, environment, and internal operations
  – Enables effective decision making
  – Information on steroids

Page 8: CC03

3-8

INTRODUCTION

• IT tools help process information to create business intelligence according to…
  – OLTP (online transaction processing)
  – OLAP (online analytical processing)

Page 9: CC03

3-9

INTRODUCTION

• OLTP – gathering and processing transaction information and updating existing information to reflect the transaction
  – Databases support OLTP
  – Operational database – database that supports OLTP

Page 10: CC03

3-10

INTRODUCTION

• OLAP – manipulation of information to support decision making
  – Databases can help some
  – Data warehouses support only OLAP, not OLTP
  – Data warehouses – special forms of databases that support decision making
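The OLTP/OLAP distinction can be made concrete with a small sketch. It uses Python's built-in `sqlite3` module; the `Inventory` and `Sale` tables, their columns, and the sample figures are assumptions made up for illustration and do not come from the chapter. For brevity both kinds of work run against one in-memory database, whereas in practice the analytical query would more likely run against a data warehouse.

```python
import sqlite3

# Hypothetical operational database (table and column names are illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Inventory (item TEXT PRIMARY KEY, on_hand INTEGER NOT NULL);
    CREATE TABLE Sale (
        sale_id  INTEGER PRIMARY KEY,
        item     TEXT NOT NULL REFERENCES Inventory(item),
        quantity INTEGER NOT NULL,
        month    TEXT NOT NULL
    );
    INSERT INTO Inventory VALUES ('widget', 100);
""")


def record_sale(item, quantity, month):
    """OLTP: capture one transaction and update existing information to reflect it."""
    with conn:  # commit both statements together, or roll both back on error
        conn.execute(
            "INSERT INTO Sale (item, quantity, month) VALUES (?, ?, ?)",
            (item, quantity, month),
        )
        conn.execute(
            "UPDATE Inventory SET on_hand = on_hand - ? WHERE item = ?",
            (quantity, item),
        )


record_sale("widget", 5, "2008-01")
record_sale("widget", 7, "2008-01")
record_sale("widget", 3, "2008-02")

# OLAP: manipulate the accumulated information to support a decision,
# here by summarizing units sold per month.
monthly_totals = conn.execute(
    "SELECT month, SUM(quantity) FROM Sale GROUP BY month ORDER BY month"
).fetchall()
on_hand = conn.execute(
    "SELECT on_hand FROM Inventory WHERE item = 'widget'"
).fetchone()[0]
```

The transaction function changes the stored information; the summary query only reads it, which is why the two workloads are often given separate systems.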

Page 11: CC03

3-11

INTRODUCTION

Page 12: CC03

3-12

INTRODUCTION

• This chapter – database and data warehouse concepts

• Along with some privacy and security considerations

Page 13: CC03

3-13

RELATIONAL DATABASE MODEL

• Database – logical collection of information you organize and access according to the logical structure of the information

• Relational database – uses a series of two-dimensional tables or files to store information in the form of a database

Page 14: CC03

3-14

Databases Are…

• Collections of information

• Created with logical structures

• With logical ties within the information

• With built-in integrity constraints

Page 15: CC03

3-15

Databases – Collections of Information

• Databases have many tables

• Example: Solomon Enterprises, a concrete provider. Tables include:
  – Order
  – Customer
  – Concrete Type
  – Employee
  – Truck

Page 16: CC03

3-16

Databases – Collections of Information

Page 17: CC03

3-17

Databases – Created with Logical Structures

• In databases, row numbers are irrelevant

• In databases, columns have logical names such as Order Date and Customer Name

• Data dictionary – contains the logical structure of the information in a database

Page 18: CC03

3-18

Databases – Logical Ties within the Information

• Logical ties must exist between the tables

• Logical ties are created with primary and foreign keys

• Primary key – field (or group of fields in some cases) that uniquely describes each record

Page 19: CC03

3-19

Databases – Logical Ties within the Information

• Foreign key – primary key of one file that appears in another file

• Foreign keys help create relationships among tables

• Table = file = relation (don’t confuse yourself)
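As a minimal sketch of how primary and foreign keys tie tables together, the following uses Python's built-in `sqlite3` module. The Customer and Order table names come from the Solomon Enterprises example; the column names, order numbers, dates, and the customer "Example Builders" are assumptions invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- customer_number is the primary key: it uniquely describes each customer.
    CREATE TABLE Customer (
        customer_number INTEGER PRIMARY KEY,
        customer_name   TEXT NOT NULL
    );
    -- "Order" is quoted because ORDER is a reserved word in SQL.
    -- Its customer_number column is a foreign key: the primary key of
    -- Customer appearing in another table.
    CREATE TABLE "Order" (
        order_number    INTEGER PRIMARY KEY,
        order_date      TEXT NOT NULL,
        customer_number INTEGER NOT NULL REFERENCES Customer(customer_number)
    );
    INSERT INTO Customer VALUES (1234, 'Example Builders'), (3456, 'Triple A Homes');
    INSERT INTO "Order" VALUES (100000, '2008-01-02', 1234),
                               (100001, '2008-01-02', 3456),
                               (100002, '2008-01-03', 1234);
""")

# Following the foreign key from each order to its customer record.
orders_with_names = conn.execute("""
    SELECT o.order_number, c.customer_name
    FROM "Order" AS o
    JOIN Customer AS c ON c.customer_number = o.customer_number
    ORDER BY o.order_number
""").fetchall()
```

Storing only the customer number in each order, rather than repeating the name, is what lets the two tables stay logically tied without duplicating information.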

Page 20: CC03

3-20

Databases – Logical Ties within the Information

Page 21: CC03

3-21

Databases – Built-in Integrity Constraints

• Integrity constraint – rule that helps ensure the quality of information

• Examples
  – Primary keys must be unique
  – Foreign keys cannot be blank
  – Sales price cannot be negative
  – Phone numbers must have an area code
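The four example constraints can be sketched with Python's built-in `sqlite3` module. The table layout, the `try_insert` helper, and the `NNN-NNN-NNNN` phone format are assumptions chosen for illustration; a real DBMS may express these rules differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    CREATE TABLE Customer (
        customer_number INTEGER PRIMARY KEY,           -- primary keys must be unique
        phone TEXT NOT NULL                            -- phone must include an area code
            CHECK (phone GLOB '[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]')
    );
    CREATE TABLE Sale (
        sale_id         INTEGER PRIMARY KEY,
        customer_number INTEGER NOT NULL               -- foreign keys cannot be blank
            REFERENCES Customer(customer_number),
        sale_price      REAL NOT NULL CHECK (sale_price >= 0)  -- price cannot be negative
    );
    INSERT INTO Customer VALUES (1, '303-555-0100');
""")


def try_insert(sql, params):
    """Hypothetical helper: run one insert and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


sale_sql = "INSERT INTO Sale (customer_number, sale_price) VALUES (?, ?)"
customer_sql = "INSERT INTO Customer VALUES (?, ?)"
results = {
    "duplicate primary key": try_insert(customer_sql, (1, "303-555-0199")),
    "blank foreign key": try_insert(sale_sql, (None, 10.0)),
    "unknown foreign key": try_insert(sale_sql, (99, 10.0)),
    "negative price": try_insert(sale_sql, (1, -5.0)),
    "no area code": try_insert(customer_sql, (2, "555-0100")),
    "valid sale": try_insert(sale_sql, (1, 25.0)),
}
```

Only the last insert should be accepted; the others are refused by the database itself, so the rules hold no matter which application tries to enter the information.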

Page 22: CC03

3-22

DBMS TOOLS

• Database management system (DBMS) – helps you specify the logical organization for a database and access and use the information within a database
  – Word processing software = document
  – Spreadsheet software = workbook
  – DBMS software = database

Page 23: CC03

3-23

DBMS TOOLS

• 5 software components
  1. DBMS engine
  2. Data definition subsystem
  3. Data manipulation subsystem
  4. Application generation subsystem
  5. Data administration subsystem

Page 24: CC03

3-24

DBMS TOOLS

Page 25: CC03

3-25

DBMS Engine

• DBMS engine – accepts logical requests, converts them into their physical equivalent, and accesses the database and data dictionary

• DBMS engine separates the logical from the physical

Page 26: CC03

3-26

DBMS Engine

• Physical view – how information is arranged, stored, and accessed on a storage device

• Logical view – how you (knowledge worker) need to arrange and access information

• Databases – you work only with logical views

Page 27: CC03

3-27

Data Definition Subsystem

• Data definition subsystem – helps you create and maintain the data dictionary and define the structure of the files in a database

• Must create data dictionary for a database before entering any information
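A rough sketch of the idea, using Python's built-in `sqlite3` module: defining a table's structure is what creates its data dictionary entries, and those entries can then be read back. The `Customer` table and its columns are assumptions for illustration; `PRAGMA table_info` is SQLite's way of exposing this information, and other DBMSs expose it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure before entering any information.
conn.execute("""
    CREATE TABLE Customer (
        customer_number INTEGER PRIMARY KEY,
        customer_name   TEXT NOT NULL,
        phone           TEXT
    )
""")

# Read the logical structure back. Each row of table_info is
# (column id, name, declared type, not-null flag, default value, primary-key flag).
columns = conn.execute("PRAGMA table_info(Customer)").fetchall()
dictionary = [
    {
        "name": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in columns
]
```

The table holds no rows yet, but its logical structure is already recorded and queryable.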

Page 28: CC03

3-28

Data Manipulation Subsystem

• Data manipulation subsystem – helps you add, change, and delete information

• Primary interface between you and a database
  – Views
  – Report generators
  – QBE tools
  – SQL

Page 29: CC03

3-29

Views

• View – allows you to see the contents of a database file

• Similar to a spreadsheet view
  – Make changes
  – Sort
  – Query

Page 30: CC03

3-30

Views

Page 31: CC03

3-31

Report Generators

• Report generator – helps you quickly define formats of reports and what information you want to see in a report

• Save report formats to use later

• Uses a wizard interface

Page 32: CC03

3-32

Report Generators

Specify the fields you want in a report

Specify the layout of the report

Page 33: CC03

3-33

Report Generators

Page 34: CC03

3-34

QBE Tools

• Query-by-example (QBE) tool – helps you graphically design the answer to a question

• “What driver most often delivers concrete to Triple A Homes?”

Page 35: CC03

3-35

QBE Tools

Page 36: CC03

3-36

SQL

• Structured query language (SQL) – standardized fourth-generation language found in most DBMSs

• Performs same task as QBE

• Uses sentence structure instead

• Mostly used by IT people
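To show what "sentence structure" looks like, here is one possible SQL statement for the QBE question "What driver most often delivers concrete to Triple A Homes?", run through Python's built-in `sqlite3` module. The schema (a `driver_id` column on Order pointing at Employee), the driver names, and the sample orders are assumptions for illustration, not the actual Solomon Enterprises database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (customer_number INTEGER PRIMARY KEY, customer_name TEXT NOT NULL);
    CREATE TABLE Employee (employee_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL);
    CREATE TABLE "Order" (
        order_number    INTEGER PRIMARY KEY,
        customer_number INTEGER NOT NULL REFERENCES Customer(customer_number),
        driver_id       INTEGER NOT NULL REFERENCES Employee(employee_id)
    );
    INSERT INTO Customer VALUES (1, 'Triple A Homes'), (2, 'Example Builders');
    INSERT INTO Employee VALUES (10, 'Driver A'), (20, 'Driver B');
    INSERT INTO "Order" (customer_number, driver_id) VALUES
        (1, 10), (1, 10), (1, 20),   -- deliveries to Triple A Homes
        (2, 20), (2, 20);            -- deliveries to another customer
""")

# The question written as an SQL "sentence": join the tables through their
# keys, keep one customer, count deliveries per driver, and take the top one.
query = """
    SELECT e.last_name, COUNT(*) AS deliveries
    FROM "Order" AS o
    JOIN Customer AS c ON c.customer_number = o.customer_number
    JOIN Employee AS e ON e.employee_id = o.driver_id
    WHERE c.customer_name = ?
    GROUP BY e.employee_id, e.last_name
    ORDER BY deliveries DESC
    LIMIT 1
"""
top_driver = conn.execute(query, ("Triple A Homes",)).fetchone()
```

A QBE tool would build an equivalent request from a grid of selected fields and criteria; SQL states the same request in text.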

Page 37: CC03

3-37

Application Generation Subsystem

• Application generation subsystem – contains facilities to help you develop transaction-intensive applications
  – Data entry screens (called forms in Access)
  – Programming languages

• Mostly used by IT people

Page 38: CC03

3-38

Data Administration Subsystem

• Data administration subsystem – helps you manage the overall database environment
  – Backup and recovery
  – Security management
  – Query optimization
  – Concurrency control
  – Change management

Page 39: CC03

3-39

Data Administration Subsystem

• Backup and recovery
  – Periodically back up information
  – Recover a database after a failure

• Security management
  – Who has access to what information
  – Who can perform CRUD tasks on information

Page 40: CC03

3-40

Data Administration Subsystem

• Query optimization
  – Restructure physical view to optimize response times to queries

• Concurrency control
  – What happens if two people simultaneously try to change the same information?
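One possible answer to the concurrency question is a version check, sometimes called optimistic concurrency control. The sketch below is only one of several strategies a DBMS or application might use, and the slides do not prescribe it. The `Truck` table, its `version` column, and the helper functions are assumptions for illustration, and a single connection stands in for two separate users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Truck (
        truck_number INTEGER PRIMARY KEY,
        status       TEXT NOT NULL,
        version      INTEGER NOT NULL DEFAULT 1   -- bumped on every change
    );
    INSERT INTO Truck (truck_number, status) VALUES (111, 'available');
""")


def read_truck(truck_number):
    """Return (status, version) as the user saw it at read time."""
    return conn.execute(
        "SELECT status, version FROM Truck WHERE truck_number = ?",
        (truck_number,),
    ).fetchone()


def update_status(truck_number, new_status, version_seen):
    """Apply the change only if nobody else has changed the row since it was read."""
    with conn:
        cur = conn.execute(
            "UPDATE Truck SET status = ?, version = version + 1 "
            "WHERE truck_number = ? AND version = ?",
            (new_status, truck_number, version_seen),
        )
    return cur.rowcount == 1


# Two users read the same record at about the same time...
_, version_a = read_truck(111)
_, version_b = read_truck(111)

# ...and both try to change it. The second change is refused instead of
# silently overwriting the first.
first_ok = update_status(111, "in repair", version_a)
second_ok = update_status(111, "on delivery", version_b)
final_state = read_truck(111)
```

The refused user would re-read the record and decide whether the change still makes sense, which avoids a "lost update".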

Page 41: CC03

3-41

Data Administration Subsystem

• Change management
  – What is the effect of structural changes to a database?
  – What if you add a new column?
  – What happens if you delete a column?
  – What happens if you change a column’s attributes?

Page 42: CC03

3-42

DATA WAREHOUSES & DATA MINING

• Data warehouses support OLAP and decision making

• Data warehouses do not support OLTP

• Data-mining tools are tools for working with data warehouse information
  – DBMS software = database
  – Data-mining tools = data warehouse

Page 43: CC03

3-43

What Is a Data Warehouse?

• Data warehouse – logical collection of information – gathered from operational databases – used to create business intelligence that supports business analysis activities and decision-making tasks

Page 44: CC03

3-44

What Is a Data Warehouse?

Page 45: CC03

3-45

What Is a Data Warehouse?

• Multidimensional

• Rows and columns

• Also layers

• Many times called hypercubes

• What are the dimensions in Figure 3.8 on page 97?

Page 46: CC03

3-46

What Are Data-Mining Tools?

• Data-mining tools – software tools that you use to query information in a data warehouse
  – Query-and-reporting tools
  – Intelligent agents
  – Multidimensional analysis tools
  – Statistical tools

Page 47: CC03

3-47

What Are Data-Mining Tools?

Page 48: CC03

3-48

Query-and-Reporting Tools

• Query-and-reporting tools – similar to QBE tools, SQL, and report generators in the typical database environment
  – Also similar to pivot tables in Excel

Page 49: CC03

3-49

Intelligent Agents

• Use various AI tools such as neural networks and fuzzy logic to form the basis for “information discovery” and building BI

• Help you find hidden patterns in information

• Chapter 4 focuses on these

Page 50: CC03

3-50

Multidimensional Analysis Tools

• Multidimensional analysis (MDA) tools – slice-and-dice techniques that allow you to view multidimensional information from different perspectives
  – Bring new layers to the front
  – Reorganize rows and columns
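Slice-and-dice can be sketched in plain Python by treating a small hypercube as a dictionary keyed by its dimensions. The three dimensions (product, region, quarter), the sales figures, and the `slice_cube` and `pivot` helpers are all assumptions made up for illustration; real MDA tools work on far larger cubes with dedicated storage.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "quarter")

# Hypothetical cube: each cell holds units sold for one combination of dimensions.
sales = {
    ("Concrete A", "North", "Q1"): 10,
    ("Concrete A", "South", "Q1"): 20,
    ("Concrete B", "North", "Q1"): 5,
    ("Concrete B", "South", "Q1"): 15,
    ("Concrete A", "North", "Q2"): 12,
    ("Concrete B", "South", "Q2"): 8,
}


def slice_cube(cube, **fixed):
    """Bring one layer to the front: keep only cells matching the fixed dimension values."""
    wanted = {DIMENSIONS.index(dim): value for dim, value in fixed.items()}
    return {
        key: units
        for key, units in cube.items()
        if all(key[i] == value for i, value in wanted.items())
    }


def pivot(cube, row_dim, col_dim):
    """Reorganize rows and columns: total the cells by the two chosen dimensions."""
    r, c = DIMENSIONS.index(row_dim), DIMENSIONS.index(col_dim)
    table = defaultdict(lambda: defaultdict(int))
    for key, units in cube.items():
        table[key[r]][key[c]] += units
    return {row: dict(cols) for row, cols in table.items()}


q1_layer = slice_cube(sales, quarter="Q1")
q1_by_product_and_region = pivot(q1_layer, "product", "region")
all_by_region_and_quarter = pivot(sales, "region", "quarter")
```

The same six cells yield different two-dimensional tables depending on which dimensions are chosen as rows and columns, which is the "different perspectives" idea.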

Page 51: CC03

3-51

Statistical Tools

• Help you apply various mathematical models to the information stored in a data warehouse to discover new information
  – Regression
  – Analysis of variance
  – And so on
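As a small illustration of the regression item, the following fits a straight line by ordinary least squares in plain Python. The `fit_line` helper and the advertising and sales figures are assumptions invented for the example; statistical tools in a data warehouse environment offer many more models than this.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical warehouse extract: advertising spend vs. units sold per period.
advertising = [1, 2, 3, 4]
units_sold = [3, 5, 7, 9]

slope, intercept = fit_line(advertising, units_sold)

# "New information": a prediction for a spending level not yet observed.
predicted_at_5 = slope * 5 + intercept
```

The fitted line turns stored history into an estimate for a case that is not in the data, which is the sense in which such tools "discover new information".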

Page 52: CC03

3-52

Data Marts

• Data warehouses are organizationwide

• Data marts have subsets of an organizationwide data warehouse

• Data mart – subset of a data warehouse in which only a focused portion of the data warehouse information is kept

Page 53: CC03

3-53

Data Marts

Page 54: CC03

3-54

Data Mining as a Career Opportunity

• Knowledge of data mining can be a substantial career opportunity for you
  – Business Objects
  – SAS
  – Cognos
  – Informatica
  – Many others

Page 55: CC03

3-55

Considerations in Using a Data Warehouse

• Do you need a data warehouse?
  – DBMS may offer all you need

• Do all employees need the entire data warehouse?
  – Consider a data mart

• How up-to-date must information be?
  – “Snapshot” concept

• What data-mining tools do you need?
  – Training can be expensive

Page 56: CC03

3-56

INFORMATION OWNERSHIP

• Strategic management support

• The sharing of information with responsibility

• Information cleanliness

Page 57: CC03

3-57

Strategic Management Support

• Chief privacy officer (CPO) – ensures that information is used in an ethical way

• Chief security officer (CSO) – ensures the security of information (e.g., firewalls)

• Chief information officer (CIO) – oversees every aspect of an organization’s information resource

Page 58: CC03

3-58

Strategic Management Support

• Data administration – plans for, oversees the development of, and monitors the information resource

• Database administration – responsible for the more technical aspects and operational aspects of managing information

• Both often report to the CIO

Page 59: CC03

3-59

The Sharing of Information with Responsibility

• If you create it, you “own” it

• You will also share it with others

• Because you “own” it, you are responsible for its quality

Page 60: CC03

3-60

Information Cleanliness

• Database and data warehouse information must be “clean”
  – No errors
  – No duplicates

Page 61: CC03

3-61

Information Cleanliness

• Extraction, transformation, and loading (ETL) – what information you want from each database, how the information is associated, and what rules to follow in consolidating the information to ensure its cleanliness in a data warehouse
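The three ETL steps can be sketched in plain Python. The two source "databases", the field names, and the cleaning rules (normalize names and phone digits, drop records without a customer number, keep the first record per customer) are assumptions chosen for illustration; real ETL rules depend on the organization's own data.

```python
def extract():
    """Extraction: pull the wanted records from each operational source."""
    sales_db = [
        {"customer_number": 1234, "name": "  triple a homes ", "phone": "(303) 555-0100"},
        {"customer_number": 3456, "name": "Example Builders", "phone": "303-555-0101"},
    ]
    billing_db = [
        {"customer_number": 1234, "name": "Triple A Homes", "phone": "303.555.0100"},
        {"customer_number": None, "name": "Unknown", "phone": ""},
    ]
    return sales_db + billing_db


def transform(records):
    """Transformation: associate records by customer number and apply cleaning rules."""
    clean = {}
    for rec in records:
        if rec["customer_number"] is None:
            continue  # rule: no errors, so drop records missing their key
        tidy = {
            "customer_number": rec["customer_number"],
            "name": " ".join(rec["name"].split()).title(),
            "phone": "".join(ch for ch in rec["phone"] if ch.isdigit()),
        }
        # rule: no duplicates, so keep the first record seen for each customer
        clean.setdefault(rec["customer_number"], tidy)
    return list(clean.values())


def load(warehouse, records):
    """Loading: place the consolidated records in the warehouse."""
    warehouse.extend(records)
    return warehouse


warehouse = load([], transform(extract()))
```

Four source records become two warehouse records: the duplicate customer is consolidated and the record with no customer number is rejected.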

Page 62: CC03

3-62

CAN YOU…

1. Describe business intelligence and its role in an organization.

2. Differentiate between databases and data warehouses with respect to their focus on OLTP and OLAP.

3. List and describe the key characteristics of a relational database.

Page 63: CC03

3-63

CAN YOU…

4. Define the five software components of a database management system.

5. List and describe the key characteristics of a data warehouse.

6. Define the four major types of data-mining tools in a data warehouse environment.

7. List key considerations in information ownership in an organization.