data ware house kt
TRANSCRIPT
-
8/9/2019 Data ware House KT
1/14
Agenda
Data Warehousing Concepts
Hanmath Singuluri
-
Data Warehousing - Architecture
Source Systems
– Execution Systems: CRM, ERP, Legacy, e-Commerce
– External Data: Purchased Market Data, Spreadsheets
Extract, Transformation, and Load (ETL) Layer
– Cleanse Data
– Filter Records
– Standardize Values
– Decode Values
– Apply Business Rules
– Householding
– Dedupe Records
– Merge Records
Data and Metadata Repository Layer
– ODS
– Enterprise Data Warehouse
– Data Marts
– Metadata Repository
Sample Technologies:
– Source Systems: PeopleSoft, SAP, Siebel, Oracle Applications, Manugistics, Custom Systems
– ETL Tools: Informatica PowerMart, ETI, Oracle Warehouse Builder, Custom programs, SQL scripts
– Data and Metadata Repository: Oracle, SQL Server, Teradata, DB2
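A few of the ETL-layer steps named on this slide (cleanse, filter, standardize, decode, dedupe) can be sketched in plain Python. This is only an illustration under stated assumptions: the record fields, the `GENDER_CODES` lookup, and the `transform` function are hypothetical and are not taken from the deck or from any of the ETL tools listed.

```python
# Hypothetical lookup table used for the "decode values" step.
GENDER_CODES = {"M": "Male", "F": "Female"}


def transform(records):
    """Apply a handful of ETL-layer steps to a list of dict records."""
    seen_ids = set()
    output = []
    for rec in records:
        # Cleanse: strip stray whitespace from every string field.
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        # Filter: drop records that have no customer id.
        if not rec.get("customer_id"):
            continue
        # Standardize: state codes are stored in upper case.
        rec["state"] = rec.get("state", "").upper()
        # Decode: replace source codes with readable values.
        rec["gender"] = GENDER_CODES.get(rec.get("gender", "").upper(), "Unknown")
        # Dedupe: keep only the first record seen for each customer id.
        if rec["customer_id"] in seen_ids:
            continue
        seen_ids.add(rec["customer_id"])
        output.append(rec)
    return output


raw = [
    {"customer_id": "C1", "state": " tx ", "gender": "m"},
    {"customer_id": "C1", "state": "TX", "gender": "M"},   # duplicate of C1
    {"customer_id": "", "state": "ca", "gender": "F"},     # no id, filtered out
    {"customer_id": "C2", "state": "ca", "gender": "x"},   # unknown gender code
]
clean = transform(raw)
for rec in clean:
    print(rec)
```

Householding, merging, and business rules are left out because they depend on source-specific matching logic that the slide does not describe.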
-
OLTP vs DW
OLTP                                DW
Data dependencies (E-R) model       Dimensional model
Microscopic data consistency        Global data consistency
Millions of transactions per day    One transaction per day
Mostly does not keep history        Keeping history is necessary
Gets loaded in the day              Gets loaded in the night
-
Dimensional Data Modeling
E-R model
– Symmetric
– Divides data into many entities
– Describes entities and relationships
– Seeks to eliminate data redundancy
– Good for high transaction performance
Dimensional model
– Asymmetric
– Divides data into dimensions and facts
– Describes dimensions and measures
– Encourages data redundancy
– Good for high query performance
-
Facts/Dimensions
Fact
– Central, dominant table
– Multi-part primary key
– Holds millions/billions of records
– Links directly to dimensions
– Stores business measures
– Constantly varying data
-
Facts/Dimensions (contd.)
Dimensions
– Single join to the fact table (single primary key)
– Stores business attributes
– Attributes are textual in nature
– Organized into hierarchies
– More or less constant data
– E.g. Time, Product, Customer, Store, etc.
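The fact and dimension properties above can be illustrated with a small in-memory SQLite database. This is a minimal sketch, not a schema from the deck: the table names (`sales_fact`, `product_dim`, `store_dim`), their columns, and the sample rows are all assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: a single-column primary key plus textual attributes.
CREATE TABLE product_dim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE store_dim (
    store_key  INTEGER PRIMARY KEY,
    store_name TEXT,
    city       TEXT
);
-- Fact: multi-part primary key made of dimension keys, plus numeric measures.
CREATE TABLE sales_fact (
    product_key INTEGER REFERENCES product_dim(product_key),
    store_key   INTEGER REFERENCES store_dim(store_key),
    period_key  INTEGER,
    dollars     REAL,
    units       INTEGER,
    PRIMARY KEY (product_key, store_key, period_key)
);
INSERT INTO product_dim VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO store_dim VALUES (10, 'Downtown', 'Austin'), (20, 'Mall', 'Dallas');
INSERT INTO sales_fact VALUES
    (1, 10, 20190801, 100.0, 4),
    (2, 10, 20190801,  50.0, 1),
    (1, 20, 20190801,  75.0, 3);
""")

# Each dimension is reached from the fact table with a single join.
rows = conn.execute("""
    SELECT s.city, SUM(f.dollars), SUM(f.units)
    FROM sales_fact AS f
    JOIN store_dim AS s ON s.store_key = f.store_key
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()
print(rows)
```

The measures (`dollars`, `units`) live only in the fact table, while the descriptive text used for grouping (`city`) lives only in the dimension.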
-
Star/Snowflake schema
Star schema
– Fact surrounded by 8-19 dimensions
– Dimensions are de-normalized
Snowflake schema
– Star schema with secondary dimensions
– Don't snowflake for saving space
– Snowflake if secondary dimensions have many attributes
-
Star schema
-
Star schema example
-
Snowflake schema example
Store Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
Store Dimension: STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr.
Secondary dimension (name not recoverable from the diagram): District_ID, District Desc., Region_ID
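The snowflake layout in this example can be sketched with SQLite using column names from the diagram. Several things here are assumptions: the secondary dimension is called `district` (the diagram's label did not survive), the store dimension keeps only a `district_id` reference rather than every attribute shown, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Secondary dimension (table name assumed).
CREATE TABLE district (
    district_id   INTEGER PRIMARY KEY,
    district_desc TEXT,
    region_id     INTEGER
);
-- Primary dimension, pointing at the secondary dimension.
CREATE TABLE store_dimension (
    store_key         INTEGER PRIMARY KEY,
    store_description TEXT,
    city              TEXT,
    state             TEXT,
    district_id       INTEGER REFERENCES district(district_id)
);
-- Fact table with the keys and measures listed in the diagram.
CREATE TABLE store_fact (
    store_key   INTEGER REFERENCES store_dimension(store_key),
    product_key INTEGER,
    period_key  INTEGER,
    dollars     REAL,
    units       INTEGER,
    price       REAL,
    PRIMARY KEY (store_key, product_key, period_key)
);
INSERT INTO district VALUES (1, 'North', 100), (2, 'South', 100);
INSERT INTO store_dimension VALUES
    (10, 'Store A', 'Austin',  'TX', 1),
    (11, 'Store B', 'Dallas',  'TX', 1),
    (12, 'Store C', 'Houston', 'TX', 2);
INSERT INTO store_fact VALUES
    (10, 1, 201908, 200.0, 10, 20.0),
    (11, 1, 201908, 100.0,  5, 20.0),
    (12, 1, 201908,  60.0,  3, 20.0);
""")

# Reaching a district attribute takes two joins: fact -> store -> district.
by_district = conn.execute("""
    SELECT d.district_desc, SUM(f.dollars)
    FROM store_fact AS f
    JOIN store_dimension AS s ON s.store_key = f.store_key
    JOIN district AS d ON d.district_id = s.district_id
    GROUP BY d.district_desc
    ORDER BY d.district_desc
""").fetchall()
print(by_district)
```

In a pure star schema the district description would sit directly in the store dimension and the second join would disappear, at the cost of repeating the description on every store row.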
-
DM, DW, ODS
DM
– Organized around a single business process
– Represents small part of the organization's business
– Logical subset of the complete data warehouse
– Faster roll out, but complex integration in the long run
-
DM, DW, ODS (contd.)
DW
– Union of its constituent data marts
– Queryable source of data in the organization
– Requires extensive business modeling (may take years to build)
ODS
– Point of integration for operational systems
– Low-level decision support
– Can store integrated data, but at detailed level
-
OLAP
Element of decision support systems (DSS)
Supports (almost) ad-hoc querying for business analysts
Helps the knowledge worker (executive, manager, analyst) make better decisions
ROLAP - extended RDBMS that maps operations on multidimensional data to standard relational operators
MOLAP - Special-purpose server that directly implements multidimensional data and operations
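The ROLAP idea of mapping multidimensional operations onto relational operators can be sketched with SQLite: a roll-up becomes a GROUP BY and a slice becomes a WHERE clause. The `sales` table, its columns, and the helper names `roll_up` and `slice_cube` are hypothetical and used only for illustration; this is not how any particular ROLAP product is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, dollars REAL);
INSERT INTO sales VALUES
    (2018, 'East', 'Widget', 10.0),
    (2018, 'West', 'Widget', 20.0),
    (2019, 'East', 'Widget', 30.0),
    (2019, 'East', 'Gadget', 40.0);
""")

DIMENSIONS = {"year", "region", "product"}


def roll_up(conn, dimension):
    """Roll-up: aggregate the dollars measure along one dimension (GROUP BY)."""
    if dimension not in DIMENSIONS:
        # Column names cannot be bound as parameters, so whitelist them.
        raise ValueError(f"unknown dimension: {dimension}")
    sql = (
        f"SELECT {dimension}, SUM(dollars) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(sql).fetchall()


def slice_cube(conn, year):
    """Slice: fix one dimension to a single value (WHERE)."""
    sql = (
        "SELECT region, product, dollars FROM sales "
        "WHERE year = ? ORDER BY region, product"
    )
    return conn.execute(sql, (year,)).fetchall()


by_year = roll_up(conn, "year")
by_region = roll_up(conn, "region")
slice_2019 = slice_cube(conn, 2019)
print(by_year, by_region, slice_2019)
```

A MOLAP server would instead hold the data in a dedicated multidimensional structure and answer these operations without translating them to SQL.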
-