![Page 1: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/1.jpg)
Paul K Chen
Introduction to Data Warehouse
Chapter 1
Data Warehouse Fundamentals
![Page 2: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/2.jpg)
Introduction to Data Warehouse
Portions of the materials at this website on the subject Data Warehouse Fundamentals are drawn from the textbooks below:
Data Warehouse Fundamentals. Author: Paulraj Ponniah. Publisher: John Wiley & Sons, Inc., 2001.
Database Systems. Authors: Thomas Connolly and Carolyn Begg. Publisher: Addison Wesley Longman, Inc., Second Edition.
![Page 3: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/3.jpg)
Road Map for Learning by Subject
[Figure: road map pairing course subjects with chapters. Subjects shown: DW Overview; DW Architecture/Components/Building Blocks; Relational & Dimensional Modeling (DW DB Design); Analyzing DW Business Requirements; Trends; DW Information Delivery/Data Retrieval by OLAP and Data Mining via Web; DW Project Planning and Management; Physical Design Process and Data Quality. Chapter labels shown: Chapter 1; Chapter 2; Chapter 3; Chapter 4; Chapter 5; Chapters 6, 7; Chapters 8, 9, 10; Chapter 11.]
![Page 4: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/4.jpg)
Chapter 1 - Objectives
Understand the differences between data and information and the information crisis
Recognize the information crisis at every enterprise
Understand the various ways of organizing and managing information for decision-making use
Review the history of decision support systems
Learn briefly what a data warehouse is and see why data warehousing is the viable solution
![Page 5: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/5.jpg)
Data and Information
We’re told we live in the “information age”.
People often talk about data and information as if they were the same. They are, in many regards, opposite.
A datum is just a fact: your name is a fact, your phone number is a fact.
Information is data that is presented in a meaningful, understandable and beneficial format. Information is data that has been organized, sequenced, correlated and summarized, such as a phone book.
![Page 6: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/6.jpg)
Data and Information
A phone book is information. It not only contains names and phone numbers, but it correctly associates each person’s phone number with their name. It presents this list of correlated names and phone numbers in alphabetical sequence, so that we can find the phone number from the name. In addition, it divides the phone numbers into two types: personal and business.
It is the function of the computer to convert data to information.
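The phone book conversion can be sketched in a few lines of Python. This is only an illustration: the names, numbers, and the `build_phone_book` helper are made up for the example and do not come from the slides. It shows the three steps named above: correlating each name with its number, dividing entries by type, and sequencing them alphabetically.

```python
# Raw data: unordered facts (all names and numbers are hypothetical).
raw_entries = [
    ("Lee, Ann", "555-0142", "personal"),
    ("Acme Hardware", "555-0100", "business"),
    ("Chen, Bo", "555-0177", "personal"),
    ("Zenith Travel", "555-0121", "business"),
]


def build_phone_book(entries):
    """Organize raw (name, number, kind) facts into a phone book."""
    book = {"personal": [], "business": []}
    for name, number, kind in entries:
        # Correlate each name with its number and divide by type.
        book[kind].append((name, number))
    for kind in book:
        # Alphabetical sequence, so a number can be found from a name.
        book[kind].sort()
    return book


phone_book = build_phone_book(raw_entries)
for kind, listings in phone_book.items():
    print(kind.upper())
    for name, number in listings:
        print(f"  {name:<15} {number}")
```

The input facts are unchanged; only their organization differs, which is the sense in which the slide calls the phone book information rather than data.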
![Page 7: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/7.jpg)
Definitions
Database: The database is a place where you put your data; data that you wish to convert to information at some future time.
Database Management System: A DBMS is the software that converts the data in your database to information. It is the DBMS that provides you the capability for cross-referencing, correlating, sorting, summarizing, etc.
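As a rough illustration of a DBMS doing the cross-referencing, sorting, and summarizing, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customer` and `orders` tables and their rows are assumptions made for the example, not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Lee'), (2, 'Chen');
    INSERT INTO orders VALUES (1, 20.0), (2, 35.0), (1, 5.0);
""")

# JOIN cross-references the two tables, GROUP BY summarizes,
# and ORDER BY sorts: the DBMS turns stored data into information.
summary = conn.execute("""
    SELECT c.name, COUNT(*) AS n_orders, SUM(o.amount) AS total
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(summary)
conn.close()
```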
![Page 8: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/8.jpg)
Information as A Competitive Weapon
Information technology and quality information are not goals in themselves;
they merely support organizations in reaching the goals of:
Superior products and services
Greater productivity
Eventually success
![Page 9: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/9.jpg)
Data, Information, and Decision
Data
Information (Data + Process)
Knowledge
Decision (Information + Knowledge)
Data/Information/Decision
Data Resource Management (DRM)
MIS (OLTP) & OOAD
KM (Knowledge Mgt), KWS (Knowledge Work Systems)
DSS; ESS, EIS (Executive Information Systems)
Data Warehousing/Data Mart/Data Mining/OLAP (Executive, Collaborative and individual levels)
![Page 10: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/10.jpg)
Data, Information, and Decision by Subject
Data: Data processing
+ Processing: System Analysis/Design
Information: MIS, Database Systems
Object (Data + Processing): Object-Oriented SD/DA
Knowledge: Artificial Intelligence
+ Information: Expert system
Decision (executive level): DSS, EIS
Decision (all levels, sophisticated): Data warehousing, Data Mining
![Page 11: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/11.jpg)
The Information Crisis
Integrated: Must have a single, enterprise-wide view.
Data Integrity: Information must be accurate and must conform to business rules.
Accessible: Easily accessible with intuitive access paths, and responsive for analysis.
Credible: Every business factor must have one and only one value.
Timely: Information must be available within the stipulated time frame.
![Page 12: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/12.jpg)
The Era of Information-Based Management—Five Themes
A Single Information Source (E-Business)
Distributed Information Availability (XML)
Information In A Business Context (Decision Support Systems)
Automated Information Delivery (for ex., Trigger)
Information Quality and Ownership (for ex., DRM)
![Page 13: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/13.jpg)
Complete E-Business Suite
One Database
Marketing
Sales
Order Mgt
Procurement
Supply Chain (SCM)
Manufacturing
Financial Services
Human Resources
Projects
Customer Relationship (CRM)
ERP, EAI
![Page 14: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/14.jpg)
What is EAI?
EAI refers to Enterprise Application Integration.
EAI is the merging of applications and data from various new and legacy systems within a business. Various means are employed to accomplish EAI, including middleware, in order to unify IT resources, maximize new ERP investments, diminish errors and get everyone on the same page. EAI enables companies to link their existing software applications with each other and with portals. EAI provides the ability to get their applications to exchange critical data. EAI is usually close to the top of any CIO's list of concerns. There are different approaches to EAI. Some rely on linking specific applications with tailored code, but most rely on generic solutions, typically called middleware. XML, combined with SOAP and UDDI, is a kind of middleware.
![Page 15: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/15.jpg)
Data Warehouse & ERP
– ERP = Enterprise Resource Planning
– A software solution that addresses enterprise needs, taking the process view of an organization to meet the organization's goals.
– It integrates all the departments and functions across a company into a single computer system that can serve all those different departments’ particular needs.
![Page 16: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/16.jpg)
Information System Categories
![Page 17: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/17.jpg)
Information System Categories
![Page 18: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/18.jpg)
DATA RESOURCE MANAGEMENT (DRM)
DEFINITION
DATA RESOURCE MANAGEMENT (DRM) IS THE
BUSINESS DISCIPLINE WHICH FOCUSES ON HOW
DATA CAN BE MANAGED TO MOST EFFICIENTLY
SUPPORT THE BUSINESS ENTERPRISE. DRM
ADDRESSES THE MANAGEMENT OF ALL
ENTERPRISE DATA. WHEN COMBINED WITH OTHER
ENTERPRISE PROCESSES, DRM PROVIDES
INFORMATION WHEN NEEDED, WHERE NEEDED, IN
THE FORM NEEDED, WITH DESIRED ACCURACY
AND AT MINIMUM COST FOR BUSINESS
ENTERPRISE.
![Page 19: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/19.jpg)
DATA RESOURCE MANAGEMENT (DRM)
DATA RESOURCE MANAGEMENT BECOMES INCREASINGLY CRITICAL TO THE SUCCESS OF THE CORPORATION IN THE MARKETPLACE DUE TO THESE NEW REALITIES:
THE COMPETITIVE, GLOBAL ENVIRONMENT THAT BUSINESS IS FACING
EXPLOSIVE GROWTH OF THE WEB OVER THE INTERNET
INCREASING USE OF DATA WAREHOUSE SYSTEMS TO MAKE BETTER DECISIONS
![Page 20: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/20.jpg)
DATA RESOURCE MANAGEMENT (DRM)
WHAT IT IS:
PROVIDING A UNIFIED AND INTEGRATED APPROACH FOR PLANNING, CONTROL AND INTEGRATION OF OUR DATA ASSETS IN SUPPORT OF ENTERPRISE’S BUSINESS
ENCOURAGING THE REDUCTION OF UNNECESSARY DATA DUPLICATION
ENCOURAGING THE REUSE AND SHARING OF HIGH QUALITY DATA
DONE RIGHT, THE INVESTMENT CAN BE PAID BACK MANY TIMES OVER.
![Page 21: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/21.jpg)
DRM PRINCIPLES
THE FOLLOWING PRINCIPLES SERVE AS
GUIDELINES FOR MANAGING DATA AS AN
ENTERPRISE ASSET:
STRATEGICALLY AND TECHNICALLY DRIVEN:
THE EXISTENCE OF EACH DATA ITEM MUST BE JUSTIFIED BY A BUSINESS PROCESS REQUIRED BY EITHER SHORT-TERM OR LONG-TERM GOALS.
![Page 22: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/22.jpg)
DRM PRINCIPLES (Continued)
DATA LIFE CYCLE ASSESSMENT
DATA LIFE CYCLE FROM ACQUISITION OR CREATION TO PRODUCTION OR DELETION MUST BE PERIODICALLY ASSESSED BASED ON BUSINESS NEEDS AND CLIMATES.
![Page 23: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/23.jpg)
DRM PRINCIPLES (Continued)
DATA DEFINED
DATA MUST BE UNIQUELY DEFINED AND ASSIGNED PRECISE MEANING PER ORGANIZATION VOCABULARY.
![Page 24: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/24.jpg)
DRM PRINCIPLES (Continued)
INTEGRITY
DATA INTEGRITY RULES MUST BE MAINTAINED TO ASSURE CONSISTENCY AND TO CONTROL REDUNDANCY.
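One way to picture integrity rules is as constraints the DBMS itself enforces. The sketch below uses Python's built-in `sqlite3` module; the `customer` and `invoice` tables, their columns, and the `try_insert` helper are hypothetical names chosen for the example. A UNIQUE constraint controls redundancy, while the foreign key and CHECK constraints keep the data consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE invoice (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount      REAL NOT NULL CHECK (amount >= 0)
    );
    INSERT INTO customer VALUES (1, 'ann@example.com');
""")


def try_insert(sql, params):
    """Return True if the DBMS accepts the row, False if a rule rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid invoice": try_insert("INSERT INTO invoice VALUES (?, ?, ?)", (1, 1, 99.5)),
    "duplicate email": try_insert("INSERT INTO customer VALUES (?, ?)", (2, "ann@example.com")),
    "unknown customer": try_insert("INSERT INTO invoice VALUES (?, ?, ?)", (2, 42, 10.0)),
    "negative amount": try_insert("INSERT INTO invoice VALUES (?, ?, ?)", (3, 1, -5.0)),
}
print(results)
```

Only the first insert is accepted; the other three are rejected by the rules declared in the schema rather than by application code.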
![Page 25: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/25.jpg)
DRM PRINCIPLES (Continued)
SECURITY/CONFIDENTIALITY
DATA MUST BE PROTECTED FROM UNAUTHORIZED AND INADVERTENT ACCESS, MODIFICATION, DESTRUCTION AND DISCLOSURE.
![Page 26: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/26.jpg)
DRM PRINCIPLES (Continued)
ACCESSIBILITY
DATA MUST BE MADE AVAILABLE WHEN AND WHERE NEEDED FOR SHARING AND REUSE.
![Page 27: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/27.jpg)
DRM PRINCIPLES (Continued)
DATA STEWARDSHIP
DATA SUBJECT AREAS WILL BE MANAGED BY A TEAM OF PEOPLE KNOWN AS DATA OWNERS AND CUSTODIANS. THE GROUP IS RESPONSIBLE FOR ASSURING THAT DATA STRUCTURE REFLECTS BUSINESS POLICIES AND RULES.
![Page 28: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/28.jpg)
DRM PRINCIPLES (Continued)
COST/BENEFIT OPTIMIZATION
DATA MUST BE UTILIZED TO MAXIMIZE BUSINESS
BENEFITS AT A MINIMUM COST.
![Page 29: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/29.jpg)
Knowledge Management (KM) – Side Benefits of DRM
It is a systematic process for capturing, integrating, organizing, and communicating knowledge accumulated by employees.
It is a vehicle to share corporate knowledge so that employees may be more effective and productive in their work.
A knowledge management system must store all such knowledge in a knowledge repository.
![Page 30: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/30.jpg)
What is AI?
What is intelligence?
– The ways humans think...
– The ways humans behave...
– The ways rational/intelligent things think...
– The ways rational/intelligent things behave...
AI is the science of understanding intelligence and the art of making intelligent things
![Page 31: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/31.jpg)
What does AI do?
Automation of problem solving
– Learning
– Memory (Knowledge Representation)
– Reasoning
– Acting
Study of mental faculty through computational models
Making computers do what people do better now (or did better at some point!)
![Page 32: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/32.jpg)
History of Decision-Support Systems
Ad Hoc Reports
Special Extract Programs
Small Applications
Information Centers
Decision-Support Systems
Executive Information Systems
![Page 33: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/33.jpg)
Four Levels of Analytical Processing
In a modern organization, at least four levels of analytical processing should be supported by information systems:
– First level: Consists of simple queries and reports against current and historical data
– Second level: Goes deeper and requires the ability to do “what if” processing across data store dimensions
![Page 34: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/34.jpg)
Four Levels of Analytical Processing
– Third level: Needs to step back and analyze what has previously occurred to bring about the current state of the data
– Fourth level: Analyzes what has happened in the past and what needs to be done in the future in order to bring about some specific change
![Page 35: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/35.jpg)
The Evolution of Data Warehousing
Since the 1970s, organizations have gained competitive advantage through systems that automate business processes to offer more efficient and cost-effective services to the customer.
This resulted in the accumulation of growing amounts of data in operational databases.
![Page 36: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/36.jpg)
The Evolution of Data Warehousing
Organizations now focus on ways to use operational data to support decision-making, as a means of gaining competitive advantage.
However, operational systems were never designed to support such business activities.
Businesses typically have numerous operational systems with overlapping and sometimes contradictory definitions.
![Page 37: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/37.jpg)
The Evolution of Data Warehousing
Organizations need to turn their archives of data into a source of knowledge, so that a single integrated / consolidated view of the organization’s data is presented to the user.
A data warehouse was deemed the solution to meet the requirements of a system capable of supporting decision-making, receiving data from multiple operational data sources.
![Page 38: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/38.jpg)
Objectives of Today’s Businesses
Access and combine data from a variety of data stores
Perform complex data analysis across these data stores
Create multidimensional views of data and its metadata
Easily summarize and roll up the information across subject areas and business dimensions
![Page 39: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/39.jpg)
These objectives cannot be met easily
Data is scattered in many types of incompatible structures.
Lack of documentation has prevented the integration of older legacy systems with newer systems
Internet software such as search engines needs to be improved
Accurate and accessible metadata across multiple organizations is hard to get
![Page 40: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/40.jpg)
A New Type of System Environment
Data is designed for analytical tasks
Data from multiple applications
Easy to use and conducive to long interactive sessions by users
Read-intensive data usage
Direct interaction with the system by the users without IT assistance
Content updated periodically and stable
Content to include current and historical data
Ability for users to run queries and get results online
Ability for users to initiate reports
![Page 41: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/41.jpg)
What is a Data Warehouse?
Data warehousing is a decision support system. It has the following characteristics:
1. A central database that is loaded from multiple operational databases for the purpose of end-user access and decision support.
![Page 42: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/42.jpg)
What is a Data Warehouse? - Continued
2. A data warehouse differs from an operational system in that the data it contains is normally static and updated in a scheduled manner through massive loading procedures.
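A minimal sketch of a scheduled load, assuming a hypothetical `sales_fact` table and made-up extract rows. Each run of `bulk_load` appends one batch in a single transaction and never updates earlier rows, so the warehouse content stays static between loads.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_fact (load_date TEXT, order_id INTEGER, amount REAL)"
)


def bulk_load(conn, load_date, extracted_rows):
    """Append one scheduled extract in a single transaction.

    Existing rows are never updated; each load only adds a new batch.
    """
    with conn:
        conn.executemany(
            "INSERT INTO sales_fact VALUES (?, ?, ?)",
            [(load_date, order_id, amount) for order_id, amount in extracted_rows],
        )


# Two nightly loads (dates and rows are hypothetical).
bulk_load(warehouse, "2001-03-01", [(1, 20.0), (2, 35.0)])
bulk_load(warehouse, "2001-03-02", [(3, 12.5)])

rows_per_load = warehouse.execute(
    "SELECT load_date, COUNT(*) FROM sales_fact "
    "GROUP BY load_date ORDER BY load_date"
).fetchall()
print(rows_per_load)
```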
![Page 43: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/43.jpg)
What is a Data Warehouse? - Continued
3. A data warehouse is developed to accommodate random, ad hoc queries and to allow users to ‘drill down’ to minute levels of detail.
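Drill-down can be pictured as repeating the same total at successively finer levels. The sketch below is an assumption-laden illustration: the year/quarter/store hierarchy, the sample rows, and the `drill` helper are invented for the example.

```python
from collections import Counter

# Hypothetical detail rows: (year, quarter, store, amount).
transactions = [
    (2001, "Q1", "Store A", 120),
    (2001, "Q1", "Store B", 80),
    (2001, "Q2", "Store A", 60),
    (2001, "Q2", "Store B", 40),
]
LEVELS = ("year", "quarter", "store")


def drill(rows, depth, **fixed):
    """Total amounts at the first `depth` levels, keeping rows that match `fixed`."""
    totals = Counter()
    for *keys, amount in rows:
        record = dict(zip(LEVELS, keys))
        if all(record[name] == value for name, value in fixed.items()):
            totals[tuple(keys[:depth])] += amount
    return dict(totals)


by_year = drill(transactions, 1)                                # coarsest view
by_quarter = drill(transactions, 2, year=2001)                  # drill into 2001
q1_by_store = drill(transactions, 3, year=2001, quarter="Q1")   # drill into Q1
print(by_year, by_quarter, q1_by_store, sep="\n")
```

Each call answers a narrower question than the one before it, which is the pattern an ad hoc user follows when moving from a summary figure to the detail behind it.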
![Page 44: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/44.jpg)
Definition
Bill Inmon defines a central data warehouse as a database that is:
1. Subject Oriented
Data naturally congregates around major categories within any corporation. These categories are called subject areas. Examples of subject areas are bill of material, customer, product, and criminal profile. The subject area will be designed to contain only the data appropriate for decision support analysis.
![Page 45: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/45.jpg)
Definition (Continued)
2. Integrated
Data integration is displayed by consistency in the measurement of variables, naming conventions, and physical data definitions across the data. There will be only one definition, identifier, etc., for each subject area.
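A common way to illustrate integration is a single attribute that each source application codes differently. In the sketch below, the three source systems (`billing`, `crm`, `web`), their codes, and the `integrate` helper are all hypothetical; the point is only that every record leaves with one naming and coding convention.

```python
# Hypothetical source encodings for the same attribute.
GENDER_MAPS = {
    "billing": {"M": "M", "F": "F"},
    "crm": {"1": "M", "0": "F"},
    "web": {"male": "M", "female": "F"},
}


def integrate(source, record):
    """Translate one source record into the warehouse's single convention."""
    return {
        "customer_id": int(record["id"]),
        "gender": GENDER_MAPS[source][str(record["gender"]).strip()],
    }


integrated = [
    integrate("billing", {"id": "7", "gender": "F"}),
    integrate("crm", {"id": 8, "gender": 1}),
    integrate("web", {"id": "9", "gender": "female"}),
]
print(integrated)
```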
![Page 46: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/46.jpg)
Definition (Continued)
3. Time Variant
Data in the DW is historical and accurate as of some point in time. Since DW data is extracted from operational systems, it must have an element of time as part of its key structure.
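The role of time in the key can be sketched with a plain dictionary whose key pairs a business identifier with a snapshot date. The customer numbers, dates, cities, and the `as_of` helper are assumptions made for this example.

```python
# Hypothetical snapshots keyed by (customer_id, snapshot_date).
customer_snapshots = {
    (101, "2001-01-31"): {"city": "Boston"},
    (101, "2001-02-28"): {"city": "Chicago"},
    (102, "2001-01-31"): {"city": "Denver"},
}


def as_of(snapshots, customer_id, date):
    """Return the latest snapshot taken on or before `date`, or None.

    ISO-formatted date strings compare in chronological order.
    """
    dates = [d for (cid, d) in snapshots if cid == customer_id and d <= date]
    if not dates:
        return None
    return snapshots[(customer_id, max(dates))]


print(as_of(customer_snapshots, 101, "2001-02-15"))
print(as_of(customer_snapshots, 101, "2001-03-01"))
print(as_of(customer_snapshots, 102, "2001-01-01"))
```

Because the date is part of the key, the February snapshot does not overwrite the January one, and a query can ask what was true at either point in time.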
![Page 47: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/47.jpg)
Definition (Continued)
4. Static
Since the data in the DW is a snapshot extracted from operational systems, it must be static or non-updateable.
![Page 48: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/48.jpg)
Definition (Continued)
5. Data Granularity
Data in the warehouse is summarized at different levels.
Granularity levels are based on the data types and the expected system performance for queries.
![Page 49: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/49.jpg)
The Benefits of Data Warehouse
Enable workers to make better and wiser decisions
A data warehouse is specifically developed to give users the ability to explore data in an unlimited number of ways, accommodating essentially any query a manager could dream up and providing access to the data sources that are behind the results. For example, information gleaned from a data warehouse can change pricing information.
![Page 50: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/50.jpg)
The Benefits of Data Warehouse
Identify hidden business opportunities
A data warehouse performs a second, and very valuable function by searching data for trends and abnormalities which users may not know to look for.
For example: Assisting companies in spotting sales trends, and detecting erroneous or fraudulent billings.
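As a very rough sketch of searching for abnormalities, the code below flags billings that are far above the typical amount. The invoice data, the `flag_outliers` helper, and the rule "more than three times the median" are all assumptions chosen for illustration; real warehouse and data mining tools use far more sophisticated methods.

```python
from statistics import median

# Hypothetical billing amounts keyed by invoice id.
billings = {
    "INV-1": 100,
    "INV-2": 105,
    "INV-3": 98,
    "INV-4": 102,
    "INV-5": 101,
    "INV-6": 400,
}


def flag_outliers(amounts, factor=3.0):
    """Return ids whose amount exceeds `factor` times the median amount."""
    typical = median(amounts.values())
    return sorted(key for key, value in amounts.items() if value > factor * typical)


suspicious = flag_outliers(billings)
print(suspicious)
```

The median is used instead of the mean so that a single extreme billing does not distort the baseline it is compared against.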
![Page 51: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/51.jpg)
The Benefits of Data Warehouse
Bonding with the customer
A data warehouse can help companies really understand who their customers are and what services they are using.
For example, by collecting and analyzing internet portal click stream data, companies are able to build extensive user profiles to boost profits through the sales channel.
![Page 52: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/52.jpg)
The Benefits of Data Warehouse
Precision Marketing
A data warehouse can aid in detecting segments of the marketplace (geographically and demographically) which remain untapped, and help show the best way to reach out to these potential customers (rapid response to market and technology trends).
![Page 53: Paul K Chen](https://reader035.vdocument.in/reader035/viewer/2022062408/56813273550346895d990eb0/html5/thumbnails/53.jpg)
Assignment
What is meant by a data warehouse?
Why is a data warehouse needed in a business environment?
Explain the benefits of having a data warehouse.
How will data warehousing develop in the future?
Give example cases of data warehouse use.