![Page 1: Data Warehouse design models in higher education courses Patrizia Poščić, Associate Professor patrizia@inf.uniri.hr Danijela Subotić, Teaching Assistant](https://reader035.vdocument.in/reader035/viewer/2022062517/56649edb5503460f94beb01d/html5/thumbnails/1.jpg)
Data Warehouse design models in higher education courses
Patrizia Poščić, Associate Professor patrizia@inf.uniri.hr
Danijela Subotić, Teaching Assistant [email protected]
Department of Informatics, University of Rijeka
Radmile Matejčić 2, 51000 Rijeka, Croatia
http://www.inf.uniri.hr
Overview
• Introduction
• DW architecture
• Modeling practices
– Entity-relationship model
– Data Vault model
– Dimensional model
• Conclusion
Introduction
• Selected Topics in Databases
• Graduate study, 1st year
• Data warehouse (DW) design as a topic
• Integrating several data modeling practices for complete DW design
• Practical assignment at the end of the semester
DW architecture
Modeling practices
• Modeling of existing database (DB) sources
– Entity-relationship model
– Relational model
• Modeling the enterprise data warehouse (EDW) as a system of record
– Data Vault model
• Modeling data marts (DM)
– Dimensional model
Business case
• We use a business case of a DW for a company that sells outdoor and adventure equipment
• All data model examples on the following slides were made in ERwin 9.5 and use IDEF1X notation
Entity-Relationship (ER) model
Sales DB
Marketing DB
Data Vault model
• A data modeling method that supports design of data warehouses for long-term storage of historical data collected from various data sources
• Based on the assumption that the DW environment is in constant change
• It highlights the need to track the origin of the data in the database through an empirically defined set of metadata
• Enables tracking the value back to the source and tracking the history of changes
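The tracking idea in the last two bullets can be sketched in plain Python. This is only an illustration, not the authors' implementation: the class names, the business key "C-1" and the sample values are hypothetical, and the source names "SalesDB" and "MarketingDB" are borrowed loosely from the two source databases in the business case. Each stored version carries a load timestamp and a record source, and versions are only ever appended.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class VersionedValue:
    """One stored version of an attribute plus its tracking metadata."""
    business_key: str
    value: str
    load_dts: datetime  # when this version was loaded into the DW
    rec_source: str     # which source system supplied it


class History:
    """Append-only store: versions are added, never changed or removed."""

    def __init__(self) -> None:
        self._rows: list[VersionedValue] = []

    def load(self, row: VersionedValue) -> None:
        self._rows.append(row)

    def versions(self, business_key: str) -> list[VersionedValue]:
        """All versions for one key, oldest first."""
        rows = [r for r in self._rows if r.business_key == business_key]
        return sorted(rows, key=lambda r: r.load_dts)

    def current(self, business_key: str) -> Optional[VersionedValue]:
        """The most recently loaded version, or None if the key is unknown."""
        rows = self.versions(business_key)
        return rows[-1] if rows else None


history = History()
history.load(VersionedValue("C-1", "Alpine Sports", datetime(2015, 1, 1), "SalesDB"))
history.load(VersionedValue("C-1", "Alpine Sports Ltd", datetime(2015, 6, 1), "MarketingDB"))

latest = history.current("C-1")
print(latest.value, "from", latest.rec_source)
for version in history.versions("C-1"):
    print(version.load_dts.date(), version.value, version.rec_source)
```

Because nothing is overwritten, the current value can be traced to its source and every earlier value remains available.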
Data Vault model
• There is no distinction between good and bad data: all data is stored at all times, whether or not it conforms to business rules, which avoids loss of information
• Structural data is explicitly separated from descriptive attributes, regardless of whether they come from the same source
• The model is flexible to changes in the business environment
• Allows for gap analysis and trend projections
Data Vault model
• Any change is implemented in the model as an independent extension of the existing model:
– the changes do not affect current applications
– all versions of the application can be based on the same, evolving DB
– all versions of the model are a subset of the DV model
• Enables fast parallel loading which reduces the overall costs
• Aiming at flexibility and performance
Data Vault model
• Hub
• Link
• Satellite
H_CUSTOMER
– key: customerID
– attributes: loadDTS, recSource
S_CUST_NAME
– key: customerID (FK), loadDTS
– attributes: custName, custCreditLimit, loadEndDTS, recSource
H_ORDER
– key: orderID
– attributes: loadDTS, recSource
L_CST_ORD
– key: customerID (FK), orderID (FK)
– attributes: loadDTS, recSource
S_CUST_CONTACT
– key: customerID (FK), loadDTS
– attributes: custAddress, custZipCode, custCity, custProvince, custCountry, custPhone, custEmail, loadEndDTS, recSource
S_ORDER
– key: orderID (FK), loadDTS
– attributes: orderDate, deliveryDate, paymentDiscount, orderTotal, loadEndDTS, recSource
H_CLASS
– key: classID
– attributes: loadDTS, recSource
L_ORD_ORD_CLS
– key: orderID (FK), classID (FK)
– attributes: loadDTS, recSource
S_CLASS
– key: classID (FK), loadDTS
– attributes: classType, loadEndDTS, recSource
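As a rough sketch of how part of this diagram might map to relational tables, the following uses Python's built-in sqlite3 module. The table and column names come from the slide; the column types, the sample rows and the convention that an open satellite row has a NULL loadEndDTS are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs: one row per business key.
CREATE TABLE H_CUSTOMER (
    customerID INTEGER PRIMARY KEY,
    loadDTS    TEXT NOT NULL,
    recSource  TEXT NOT NULL
);
CREATE TABLE H_ORDER (
    orderID    INTEGER PRIMARY KEY,
    loadDTS    TEXT NOT NULL,
    recSource  TEXT NOT NULL
);
-- Link: a relationship between two hubs.
CREATE TABLE L_CST_ORD (
    customerID INTEGER NOT NULL REFERENCES H_CUSTOMER(customerID),
    orderID    INTEGER NOT NULL REFERENCES H_ORDER(orderID),
    loadDTS    TEXT NOT NULL,
    recSource  TEXT NOT NULL,
    PRIMARY KEY (customerID, orderID)
);
-- Satellite: descriptive attributes, versioned by load timestamp.
CREATE TABLE S_CUST_NAME (
    customerID      INTEGER NOT NULL REFERENCES H_CUSTOMER(customerID),
    loadDTS         TEXT NOT NULL,
    custName        TEXT,
    custCreditLimit REAL,
    loadEndDTS      TEXT,           -- assumed: NULL marks the open version
    recSource       TEXT NOT NULL,
    PRIMARY KEY (customerID, loadDTS)
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO H_CUSTOMER VALUES (1, '2015-01-01', 'SalesDB')")
conn.execute("INSERT INTO H_ORDER VALUES (100, '2015-01-02', 'SalesDB')")
conn.execute("INSERT INTO L_CST_ORD VALUES (1, 100, '2015-01-02', 'SalesDB')")
conn.execute(
    "INSERT INTO S_CUST_NAME VALUES "
    "(1, '2015-01-01', 'Alpine Sports', 5000.0, NULL, 'SalesDB')"
)

# Which customer name belongs to which order: link joined to the open satellite row.
rows = conn.execute("""
    SELECT s.custName, l.orderID
    FROM L_CST_ORD l
    JOIN S_CUST_NAME s ON s.customerID = l.customerID
    WHERE s.loadEndDTS IS NULL
""").fetchall()
print(rows)  # [('Alpine Sports', 100)]
```

The hub holds only the key, the link holds only the relationship, and the descriptive columns live in the satellite, which mirrors the separation of structural data from descriptive attributes described earlier.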
Data Vault model
Data Vault model (main advantages)
• Inserts, deletes, or updates of rows are implemented only as additions (nothing ever gets lost or overwritten)
• Structural changes to or within data sources result in model expansion, principally through new links, without structural reconstruction of existing DW elements (architectural stability)
• Enables rapid parallel data loads
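The first two advantages can be illustrated with a small sqlite3 sketch. It is a simplification under stated assumptions: the satellites omit loadEndDTS so that every change is a pure insert, the expansion is shown with a new satellite rather than a new link, and the helper functions and sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE H_CUSTOMER (
    customerID INTEGER PRIMARY KEY,
    loadDTS    TEXT NOT NULL,
    recSource  TEXT NOT NULL
);
CREATE TABLE S_CUST_NAME (
    customerID INTEGER NOT NULL REFERENCES H_CUSTOMER(customerID),
    loadDTS    TEXT NOT NULL,
    custName   TEXT,
    recSource  TEXT NOT NULL,
    PRIMARY KEY (customerID, loadDTS)
);
INSERT INTO H_CUSTOMER VALUES (1, '2015-01-01', 'SalesDB');
INSERT INTO S_CUST_NAME VALUES (1, '2015-01-01', 'Alpine Sports', 'SalesDB');
""")


def table_definitions(connection):
    """Return {table name: CREATE statement} for the current schema."""
    return dict(connection.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall())


def current_names(connection):
    """Latest name per customer: the row with the greatest loadDTS."""
    return connection.execute("""
        SELECT s.customerID, s.custName
        FROM S_CUST_NAME s
        WHERE s.loadDTS = (SELECT MAX(loadDTS) FROM S_CUST_NAME
                           WHERE customerID = s.customerID)
        ORDER BY s.customerID
    """).fetchall()


schema_v1 = table_definitions(conn)

# A source "update" arrives: it is stored as an additional satellite row.
conn.execute(
    "INSERT INTO S_CUST_NAME VALUES (1, '2015-06-01', 'Alpine Sports Ltd', 'SalesDB')"
)

# A new source adds contact data: the model grows by one table,
# and no existing table is altered.
conn.executescript("""
CREATE TABLE S_CUST_CONTACT (
    customerID INTEGER NOT NULL REFERENCES H_CUSTOMER(customerID),
    loadDTS    TEXT NOT NULL,
    custEmail  TEXT,
    recSource  TEXT NOT NULL,
    PRIMARY KEY (customerID, loadDTS)
);
INSERT INTO S_CUST_CONTACT VALUES (1, '2015-06-02', 'info@example.com', 'MarketingDB');
""")

schema_v2 = table_definitions(conn)
unchanged = all(schema_v2[name] == sql for name, sql in schema_v1.items())
history_rows = conn.execute("SELECT COUNT(*) FROM S_CUST_NAME").fetchone()[0]
print(unchanged, history_rows, current_names(conn))
```

The definitions of the original tables are identical before and after the extension, and both versions of the customer name remain in the satellite.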
Dimensional model
• Almost universally used for DM design and presentation
• Distinguished by star schema design
– a centralized fact table, which contains a multi-part key and one or more numerical business measures
– a fact (a set of measurements) must be tracked at the lowest granularity of the data
– each fact is surrounded by a rich context of dimensions
– dimension tables are denormalized, have a simple key, and store business attributes as textual information
Dimensional model
F_ORDER
– key: customerID (FK), timeID (FK), agentID (FK), paymentDue, DeliveryDue, subsegmentID (FK)
– measures: quantity, discount, returns, cancellations, totalNetOrd, price
F_INVOICE
– key: customerID (FK), timeID (FK), agentID (FK), subsegmentID (FK)
– measures: quantity, discount, price
D_TIME
– key: timeID
– attributes: date, dayOfWeek, week, month, quartal, campSeason, year, campYear
D_CUSTOMER
– key: customerID
– attributes: nameCust, sexCust, classCust, branchCust, city, region
D_AGENT
– key: agentID
– attributes: nameAg, typeAg, agency, masterAg
D_PRODUCT
– key: subsegmentID
– attributes: subsegment, segment, subcategory, category
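A typical star-schema query joins the fact table to its dimensions and aggregates a measure. The sketch below, again using sqlite3, keeps only a subset of the F_INVOICE, D_TIME and D_CUSTOMER columns from the diagram. The column types, the reduced fact key and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: simple key plus descriptive attributes.
CREATE TABLE D_TIME (
    timeID INTEGER PRIMARY KEY,
    date   TEXT,
    month  INTEGER,
    year   INTEGER
);
CREATE TABLE D_CUSTOMER (
    customerID INTEGER PRIMARY KEY,
    nameCust   TEXT,
    city       TEXT,
    region     TEXT
);
-- Fact: multi-part key made of dimension keys, plus numeric measures.
-- agentID and subsegmentID are left out to keep the sketch short.
CREATE TABLE F_INVOICE (
    customerID INTEGER NOT NULL REFERENCES D_CUSTOMER(customerID),
    timeID     INTEGER NOT NULL REFERENCES D_TIME(timeID),
    quantity   INTEGER,
    discount   REAL,
    price      REAL,
    PRIMARY KEY (customerID, timeID)
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO D_TIME VALUES (?, ?, ?, ?)", [
    (1, "2015-01-15", 1, 2015),
    (2, "2015-02-10", 2, 2015),
])
conn.executemany("INSERT INTO D_CUSTOMER VALUES (?, ?, ?, ?)", [
    (1, "Alpine Sports", "Rijeka", "West"),
    (2, "Trail Gear", "Osijek", "East"),
])
conn.executemany("INSERT INTO F_INVOICE VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 2, 0.0, 100.0),
    (1, 2, 1, 0.0, 50.0),
    (2, 1, 5, 0.0, 20.0),
])

# Invoiced quantity per region and year.
totals = conn.execute("""
    SELECT c.region, t.year, SUM(f.quantity)
    FROM F_INVOICE f
    JOIN D_CUSTOMER c ON c.customerID = f.customerID
    JOIN D_TIME t ON t.timeID = f.timeID
    GROUP BY c.region, t.year
    ORDER BY c.region
""").fetchall()
print(totals)  # [('East', 2015, 5), ('West', 2015, 3)]
```

Because the dimensions are denormalized, grouping by region or year needs only one join per dimension.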
Conclusion
• We presented a set of complementary data warehouse design models which may enable well-integrated DW solutions for relational DB implementations
• The models are based on a common notation (IDEF1X) and built in a single design tool (ERwin)
• Our goal is to present students with a compact set of modelling knowledge in the field of DB and DW
• Students upgrade and further develop their theoretical knowledge and practical modelling skills through the educational process
Thank you for your attention!