![Page 1: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/1.jpg)
Data Mining: Data Warehouses
![Page 2: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/2.jpg)
Data Warehousing and OLAP Technology: An Overview
• What is a data warehouse?
• A multi-dimensional data model
• Data warehouse architecture
• From data warehousing to data mining
![Page 3: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/3.jpg)
What is a Data Warehouse?
• Defined in many different ways, but not rigorously.
  ◦ A decision support database that is maintained separately from the organization's operational database
  ◦ Supports information processing by providing a solid platform of consolidated, historical data for analysis
• "A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process." (W. H. Inmon)
• Data warehousing:
  ◦ The process of constructing and using data warehouses
![Page 4: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/4.jpg)
Data Warehouse: Subject-Oriented
• Organized around major subjects, such as customer, product, sales
• Focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing
• Provides a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process
![Page 5: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/5.jpg)
Data Warehouse: Integrated
• Constructed by integrating multiple, heterogeneous data sources
  ◦ Relational databases, flat files, on-line transaction records
• Data cleaning and data integration techniques are applied
  ◦ Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources
    E.g., hotel price: currency, tax, breakfast covered, etc.
  ◦ When data is moved to the warehouse, it is converted
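
The hotel price example can be sketched in Python. The two source record layouts, the exchange rate, the tax rate, and the breakfast surcharge below are all made-up assumptions for illustration. The only point is that each source is converted to one common convention (here: EUR, tax included, breakfast included) when it is moved to the warehouse.

```python
from dataclasses import dataclass

# Hypothetical conversion constants; real values would come from reference data.
USD_TO_EUR = 0.5       # assumed exchange rate, chosen for easy arithmetic
TAX_RATE = 0.10        # assumed tax rate for sources that quote net prices
BREAKFAST_EUR = 10.0   # assumed surcharge when breakfast is not covered


@dataclass
class WarehousePrice:
    hotel: str
    price_eur: float   # always EUR, tax included, breakfast included


def from_source_a(rec: dict) -> WarehousePrice:
    """Source A (hypothetical): USD, tax excluded, breakfast excluded."""
    eur = rec["price_usd"] * USD_TO_EUR
    eur = eur * (1 + TAX_RATE) + BREAKFAST_EUR
    return WarehousePrice(rec["name"], round(eur, 2))


def from_source_b(rec: dict) -> WarehousePrice:
    """Source B (hypothetical): EUR, tax included, breakfast given as a flag."""
    eur = rec["eur_gross"]
    if not rec["breakfast"]:
        eur += BREAKFAST_EUR
    return WarehousePrice(rec["hotel_name"], round(eur, 2))


# Two differently encoded source records end up directly comparable.
rows = [
    from_source_a({"name": "Hotel A", "price_usd": 200.0}),
    from_source_b({"hotel_name": "Hotel B", "eur_gross": 110.0, "breakfast": False}),
]
for row in rows:
    print(row)
```

After conversion, both rows carry a price with the same meaning, so they can be compared and aggregated without knowing which source they came from.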
![Page 6: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/6.jpg)
Data Warehouse: Time-Variant
• The time horizon for the data warehouse is significantly longer than that of operational systems
  ◦ Operational database: current value data
  ◦ Data warehouse data: provides information from a historical perspective (e.g., past 5-10 years)
• Every key structure in the data warehouse
  ◦ Contains an element of time, explicitly or implicitly
  ◦ But the key of operational data may or may not contain a "time element"
![Page 7: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/7.jpg)
Data Warehouse: Nonvolatile
• A physically separate store of data transformed from the operational environment
• Operational update of data does not occur in the data warehouse environment
  ◦ Does not require transaction processing, recovery, and concurrency control mechanisms
  ◦ Requires only two operations in data accessing: initial loading of data and access of data
![Page 8: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/8.jpg)
Data Warehouse vs. Operational DBMS
• OLTP (on-line transaction processing)
  ◦ Major task of traditional relational DBMS
  ◦ Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
• OLAP (on-line analytical processing)
  ◦ Major task of data warehouse system
  ◦ Data analysis and decision making
• Distinct features (OLTP vs. OLAP):
  ◦ User and system orientation: customer vs. market
  ◦ Data contents: current, detailed vs. historical, consolidated
  ◦ Database design: ER + application vs. star + subject
  ◦ View: current, local vs. evolutionary, integrated
  ◦ Access patterns: update vs. read-only but complex queries
![Page 9: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/9.jpg)
OLTP vs. OLAP

| | OLTP | OLAP |
|---|---|---|
| function | day-to-day operations | decision support |
| DB design | application-oriented | subject-oriented |
| data | current, up-to-date, detailed, flat relational, isolated | historical, summarized, multidimensional, integrated, consolidated |
| access | read/write, index/hash on primary key | lots of scans |
| unit of work | short, simple transaction | complex query |
| # records accessed | tens | millions |
| # users | thousands | hundreds |
| DB size | 100MB-GB | 100GB-TB |
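
The "access" and "unit of work" rows can be illustrated with a small sketch using Python's built-in `sqlite3` module. The `account` table and its values are invented for this example, and a real deployment would normally run the two workloads on separate systems. The sketch only contrasts the shape of the two kinds of work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, region TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(1, "east", 100.0), (2, "east", 250.0), (3, "west", 75.0)],
)

# OLTP-style unit of work: a short read/write transaction that touches
# one record, located through the primary key.
with conn:
    conn.execute("UPDATE account SET balance = balance - 20.0 WHERE id = ?", (1,))

# OLAP-style unit of work: a read-only query that scans the rows and
# returns a summary rather than individual records.
totals = conn.execute(
    "SELECT region, SUM(balance) FROM account GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```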
![Page 10: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/10.jpg)
![Page 11: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/11.jpg)
From Tables and Spreadsheets to Data Cubes
• A data warehouse is based on a multidimensional data model, which views data in the form of a data cube
• A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions
  ◦ Dimension tables, such as item (item_name, brand, type) or time (day, week, month, quarter, year)
  ◦ The fact table contains measures (such as dollars_sold) and keys to each of the related dimension tables
![Page 12: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/12.jpg)
Conceptual Modeling of Data Warehouses
• Modeling data warehouses: dimensions & measures
  ◦ Star schema: a fact table in the middle connected to a set of dimension tables
  ◦ Snowflake schema: a refinement of the star schema in which some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake
![Page 13: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/13.jpg)
Example of Star Schema
• Sales Fact Table
  ◦ Keys: time_key, item_key, branch_key, location_key
  ◦ Measures: units_sold, dollars_sold, avg_sales
• Dimension tables
  ◦ time (time_key, day, day_of_the_week, month, quarter, year)
  ◦ item (item_key, item_name, brand, type, supplier_type)
  ◦ branch (branch_key, branch_name, branch_type)
  ◦ location (location_key, street, city, state_or_province, country)
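
A minimal sketch of this star schema, using Python's built-in `sqlite3` module. To keep it short, only the time and item dimensions and a few of their columns are included. The table names `time_dim` and `item_dim`, the brand names, and all row values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
-- The fact table holds the measures plus one key per dimension table.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    units_sold   INTEGER,
    dollars_sold REAL
);
""")
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)",
                 [(1, "Q1", 2020), (2, "Q2", 2020)])
conn.executemany("INSERT INTO item_dim VALUES (?, ?, ?)",
                 [(1, "TV", "BrandX"), (2, "PC", "BrandY")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
                 [(1, 1, 2, 1000.0), (1, 2, 1, 850.0),
                  (2, 1, 3, 1352.0), (1, 1, 1, 500.0)])

# A typical star-schema query: join the fact table to its dimensions,
# then aggregate a measure by dimension attributes.
result = conn.execute("""
    SELECT t.quarter, i.item_name, SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON t.time_key = f.time_key
    JOIN item_dim AS i ON i.item_key = f.item_key
    GROUP BY t.quarter, i.item_name
    ORDER BY t.quarter, i.item_name
""").fetchall()
print(result)
```

Every dimension is one join away from the fact table, which is what gives the schema its star shape.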
![Page 14: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/14.jpg)
Example of Star Schema
![Page 15: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/15.jpg)
Example of Snowflake Schema
• Sales Fact Table
  ◦ Keys: time_key, item_key, branch_key, location_key
  ◦ Measures: units_sold, dollars_sold, avg_sales
• Dimension tables
  ◦ time (time_key, day, day_of_the_week, month, quarter, year)
  ◦ item (item_key, item_name, brand, type, supplier_key)
  ◦ supplier (supplier_key, supplier_type)
  ◦ branch (branch_key, branch_name, branch_type)
  ◦ location (location_key, street, city_key)
  ◦ city (city_key, city, state_or_province, country)
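
The effect of normalizing location into location and city can be sketched with `sqlite3`. Only the location branch of the snowflake is modeled here. The street names and sale amounts are invented, and the fact table is cut down to a single measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- In the snowflake schema, city attributes move out of location
-- into their own table, referenced through city_key.
CREATE TABLE city (
    city_key INTEGER PRIMARY KEY,
    city TEXT, state_or_province TEXT, country TEXT
);
CREATE TABLE location (
    location_key INTEGER PRIMARY KEY,
    street TEXT,
    city_key INTEGER REFERENCES city(city_key)
);
CREATE TABLE sales_fact (
    location_key INTEGER REFERENCES location(location_key),
    dollars_sold REAL
);
""")
conn.executemany("INSERT INTO city VALUES (?, ?, ?, ?)", [
    (1, "Vancouver", "British Columbia", "Canada"),
    (2, "Toronto", "Ontario", "Canada"),
    (3, "Frankfurt", "Hesse", "Germany"),
])
conn.executemany("INSERT INTO location VALUES (?, ?, ?)", [
    (10, "1 Main St", 1), (11, "2 King St", 2), (12, "3 Zeil", 3),
])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?)", [
    (10, 100.0), (11, 200.0), (12, 50.0), (10, 25.0),
])

# Reaching the country attribute now takes two joins instead of one.
by_country = conn.execute("""
    SELECT c.country, SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN location AS l ON l.location_key = f.location_key
    JOIN city AS c ON c.city_key = l.city_key
    GROUP BY c.country
    ORDER BY c.country
""").fetchall()
print(by_country)
```

The normalized form stores each city's attributes once, at the cost of an extra join whenever a query needs them.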
![Page 16: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/16.jpg)
Example of Snowflake Schema
![Page 17: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/17.jpg)
Data Cubes
• Allow data to be modeled and viewed in multiple dimensions
• Another representation, like the star schema
• Dimensions are the entities about which you want to keep records
  ◦ Time, item, branch, location, ...
• Each dimension has a table, called a dimension table
![Page 18: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/18.jpg)
3-D Data Cube
• 4-D cubes can be visualized as a series of 3-D cubes (in the figure: Supplier 1, Supplier 2, Supplier 3)
![Page 19: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/19.jpg)
Cuboid
• A data cube is often referred to as a cuboid
• A cuboid can be generated for every possible subset of the dimensions
  ◦ Each provides a different level of summarization
• N-D cube: base cuboid
  ◦ Lowest level of summary
• 0-D cube: apex cuboid
  ◦ Highest level of summary
  ◦ Summary over all dimensions
![Page 20: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/20.jpg)
Cube: A Lattice of Cuboids
• 0-D (apex) cuboid: all
• 1-D cuboids: (time), (item), (location), (supplier)
• 2-D cuboids: (time, item), (time, location), (time, supplier), (item, location), (item, supplier), (location, supplier)
• 3-D cuboids: (time, item, location), (time, item, supplier), (time, location, supplier), (item, location, supplier)
• 4-D (base) cuboid: (time, item, location, supplier)
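
The lattice can be enumerated mechanically: each cuboid corresponds to one subset of the dimensions. The sketch below uses `itertools.combinations` and ignores concept hierarchies within a dimension, so n dimensions give 2^n cuboids.

```python
from itertools import combinations

dimensions = ("time", "item", "location", "supplier")

# Group every subset of the dimensions by its size:
# size 0 is the apex cuboid, size len(dimensions) is the base cuboid.
lattice = {
    k: list(combinations(dimensions, k))
    for k in range(len(dimensions) + 1)
}

for k, cuboids in lattice.items():
    print(f"{k}-D cuboids ({len(cuboids)}):", cuboids)

total = sum(len(cuboids) for cuboids in lattice.values())
print("total cuboids:", total)  # 2 ** 4 = 16
```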
![Page 21: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/21.jpg)
Ex. Data Cube Gen
• Start with a time-product (2-D) table that shows sale amounts (USA)

| | TV | PC | VCR |
|---|---|---|---|
| 1st Qtr | 1000 | 850 | 350 |
| 2nd Qtr | 1352 | 940 | 298 |
| 3rd Qtr | 1450 | 658 | 314 |
| 4th Qtr | 1500 | 965 | 365 |
![Page 22: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/22.jpg)
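Ex. Data Cube Gen (3D)

| | USA TV | USA PC | USA VCR | Canada TV | Canada PC | Canada VCR | Mexico TV | Mexico PC | Mexico VCR |
|---|---|---|---|---|---|---|---|---|---|
| 1st Qtr | 1000 | 850 | 350 | 2600 | 750 | 425 | 1300 | 850 | 350 |
| 2nd Qtr | 1352 | 940 | 298 | 1752 | 860 | 236 | 1200 | 1000 | 400 |
| 3rd Qtr | 1450 | 658 | 314 | 1055 | 458 | 520 | 1150 | 555 | 510 |
| 4th Qtr | 1500 | 965 | 365 | 1350 | 1065 | 390 | 900 | 750 | 425 |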
![Page 23: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/23.jpg)
A Sample Data Cube
[Figure: 3-D data cube of sales. Axes: Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, VCR, PC, sum), Country (U.S.A, Canada, Mexico, sum). The highlighted cell is the total annual sales of TV in U.S.A.]
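
The "sum" cells of the sample cube can be computed from the quarterly figures in the data cube generation example. The sketch below stores the base cuboid as a dictionary keyed by (date, product, country) and derives any coarser cuboid by summing over the dimensions that are dropped. The dictionary layout and the `cuboid` helper are illustrative choices, not part of the slides.

```python
from collections import defaultdict

quarters = ["Q1", "Q2", "Q3", "Q4"]
products = ["TV", "PC", "VCR"]
# Sale amounts from the 3-D example: country -> one (TV, PC, VCR) row per quarter.
table = {
    "USA":    [(1000, 850, 350), (1352, 940, 298), (1450, 658, 314), (1500, 965, 365)],
    "Canada": [(2600, 750, 425), (1752, 860, 236), (1055, 458, 520), (1350, 1065, 390)],
    "Mexico": [(1300, 850, 350), (1200, 1000, 400), (1150, 555, 510), (900, 750, 425)],
}

# Base (3-D) cuboid: one cell per (date, product, country).
base = {
    (q, p, c): rows[qi][pi]
    for c, rows in table.items()
    for qi, q in enumerate(quarters)
    for pi, p in enumerate(products)
}


def cuboid(keep):
    """Sum the base cells over every dimension whose index is not in `keep`.

    Index 0 is date, 1 is product, 2 is country.
    """
    out = defaultdict(int)
    for cell, amount in base.items():
        out[tuple(cell[i] for i in keep)] += amount
    return dict(out)


# 2-D cuboid (product, country): the date dimension is summed away.
tv_usa_annual = cuboid((1, 2))[("TV", "USA")]
# 0-D apex cuboid: a single cell summarizing all dimensions.
grand_total = cuboid(())[()]

print("Total annual sales of TV in U.S.A.:", tv_usa_annual)  # 5302
print("Apex cuboid (all):", grand_total)
```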
![Page 24: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/24.jpg)
Cuboids Corresponding to the Cube
• 0-D (apex) cuboid: all
• 1-D cuboids: (product), (date), (country)
• 2-D cuboids: (product, date), (product, country), (date, country)
• 3-D (base) cuboid: (product, date, country)
![Page 25: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/25.jpg)
Multidimensional Data
• Sales volume as a function of product, month, and region
[Figure: cube with axes Product, Region, Month]
• Dimensions: Product, Location, Time
• Hierarchical summarization paths
  ◦ Product: Industry → Category → Product
  ◦ Location: Region → Country → City → Office
  ◦ Time: Year → Quarter → Month or Week → Day
![Page 26: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/26.jpg)
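A Concept Hierarchy: Dimension (location)
• Levels, from most general to most specific: all → region → country → city → office
  ◦ all: all
  ◦ region: Europe, ..., North_America
  ◦ country: Germany, ..., Spain (Europe); Canada, ..., Mexico (North_America)
  ◦ city: Frankfurt, ... (Germany); Vancouver, ..., Toronto (Canada)
  ◦ office: L. Chan, ..., M. Wind (Vancouver)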
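
A concept hierarchy of this kind can be stored as child-to-parent links, which is enough to map a value to any coarser level. The sketch below encodes only the members named in the figure. Placing both offices under Vancouver is how the figure is read here, and the `ancestor` helper is a hypothetical name.

```python
# Child -> parent links for the location dimension
# (office -> city -> country -> region -> all).
PARENT = {
    "L. Chan": "Vancouver", "M. Wind": "Vancouver",
    "Vancouver": "Canada", "Toronto": "Canada", "Frankfurt": "Germany",
    "Canada": "North_America", "Mexico": "North_America",
    "Germany": "Europe", "Spain": "Europe",
    "North_America": "all", "Europe": "all",
}
# Levels ordered from most specific to most general.
LEVELS = ["office", "city", "country", "region", "all"]


def ancestor(value: str, from_level: str, to_level: str) -> str:
    """Climb the hierarchy from `from_level` to the coarser `to_level`."""
    steps = LEVELS.index(to_level) - LEVELS.index(from_level)
    if steps < 0:
        raise ValueError("to_level must not be finer than from_level")
    for _ in range(steps):
        value = PARENT[value]
    return value


print(ancestor("Toronto", "city", "region"))     # North_America
print(ancestor("L. Chan", "office", "country"))  # Canada
```

Rolling sales up from city to country then amounts to grouping the cells by `ancestor(city, "city", "country")` and summing each group.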
![Page 27: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/27.jpg)
Typical OLAP Operations
• Roll up (drill-up): summarize data
  ◦ By climbing up a hierarchy or by dimension reduction
• Drill down (roll down): reverse of roll-up
  ◦ From higher-level summary to lower-level summary or detailed data, or by introducing new dimensions
• Slice and dice:
  ◦ Project and select
• Pivot (rotate):
  ◦ Reorient the cube for visualization, e.g., 3-D to a series of 2-D planes
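
These operations can be sketched on a tiny cube held as a dictionary from (quarter, product, country) to a sales amount. The cell values are a few figures from the sales example. The function names and the dictionary representation are assumptions made for illustration, not a real OLAP engine's API.

```python
from collections import defaultdict

DIMS = ("quarter", "product", "country")
CELLS = {
    ("Q1", "TV", "USA"): 1000, ("Q1", "PC", "USA"): 850,
    ("Q2", "TV", "USA"): 1352, ("Q2", "PC", "USA"): 940,
    ("Q1", "TV", "Canada"): 2600, ("Q1", "PC", "Canada"): 750,
    ("Q2", "TV", "Canada"): 1752, ("Q2", "PC", "Canada"): 860,
}


def roll_up(dims, cells, drop):
    """Roll up by dimension reduction: sum the measure over `drop`."""
    i = dims.index(drop)
    out = defaultdict(int)
    for key, value in cells.items():
        out[key[:i] + key[i + 1:]] += value
    return dims[:i] + dims[i + 1:], dict(out)


def slice_(dims, cells, dim, value):
    """Slice: fix one dimension to a single value, giving a smaller cube."""
    i = dims.index(dim)
    kept = {k[:i] + k[i + 1:]: v for k, v in cells.items() if k[i] == value}
    return dims[:i] + dims[i + 1:], kept


def dice(dims, cells, allowed):
    """Dice: keep cells whose value on each listed dimension is allowed."""
    idx = {dims.index(d): vals for d, vals in allowed.items()}
    kept = {k: v for k, v in cells.items()
            if all(k[i] in vals for i, vals in idx.items())}
    return dims, kept


def pivot(dims, cells, order):
    """Pivot: reorder the axes without changing any value."""
    perm = [dims.index(d) for d in order]
    return tuple(order), {tuple(k[i] for i in perm): v for k, v in cells.items()}


qp_dims, by_quarter_product = roll_up(DIMS, CELLS, "country")
usa_dims, usa = slice_(DIMS, CELLS, "country", "USA")
_, q1_tv = dice(DIMS, CELLS, {"quarter": {"Q1"}, "product": {"TV"}})
pv_dims, usa_pivoted = pivot(usa_dims, usa, ("product", "quarter"))

print(by_quarter_product[("Q1", "TV")])  # 1000 + 2600 = 3600
print(usa_pivoted[("TV", "Q2")])         # 1352
```

Drill-down has no function here because it cannot be computed from the summary alone: it means going back to the more detailed cells (here, `CELLS`) from which the rolled-up cube was derived.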
![Page 28: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/28.jpg)
![Page 29: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/29.jpg)
![Page 30: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/30.jpg)
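Data Warehouse: A Multi-Tiered Architecture
[Figure: multi-tiered data warehouse architecture, with the following labeled components]
• Data Sources: operational DBs, other sources
• Extract, Transform, Load, Refresh; Monitor & Integrator; Metadata
• Data Storage: data warehouse, data marts
• OLAP Engine: OLAP server, served from the data storage tier
• Front-End Tools: analysis, query, reports, data mining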
![Page 31: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/31.jpg)
Warehouse Architecture
• Data mart
  ◦ A subset of corporate-wide data that is of value to a specific group of users. Its scope is confined to specific, selected groups, such as a marketing data mart
• Metadata is the data defining warehouse objects. It stores:
  ◦ Description of the structure of the data warehouse: schema, views, dimensions, hierarchies, data mart locations and contents
  ◦ Monitoring information: warehouse usage statistics, error reports
  ◦ Algorithms used for summarization
![Page 32: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/32.jpg)
![Page 33: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/33.jpg)
Data Warehouse Usage
• Three kinds of data warehouse applications
  ◦ Information processing: supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts and graphs
  ◦ Analytical processing: multidimensional analysis of data warehouse data; supports basic OLAP operations such as slice-dice, drilling, pivoting
  ◦ Data mining: knowledge discovery from hidden patterns; supports associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools
![Page 34: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/34.jpg)
Chapter 3: Data Warehousing and OLAP Technology: An Overview
• What is a data warehouse?
• A multi-dimensional data model
• Data warehouse architecture
• From data warehousing to data mining
• Summary
![Page 35: Data Warehouses 1. What is a data warehouse? A multi-dimensional data model Data warehouse architecture From data warehousing to data mining 2](https://reader035.vdocument.in/reader035/viewer/2022062305/5697bfaf1a28abf838c9d27f/html5/thumbnails/35.jpg)
Summary: Data Warehouse and OLAP Technology
• Why data warehousing?
• A multi-dimensional model of a data warehouse
  ◦ Star schema, snowflake schema
  ◦ A data cube consists of dimensions & measures
• OLAP operations: drilling, rolling, slicing, dicing and pivoting
• Data warehouse architecture