Data Warehousing -Kalyani

Post on 21-Dec-2015

Page 1: Data Warehousing -Kalyani. Topics Definition Types Components Architecture Database Design OLAP Metadata repository

Data Warehousing

-Kalyani

Topics

• Definition

• Types

• Components

• Architecture

• Database Design

• OLAP

• Metadata repository

OLTP vs. Warehousing

• Organized by transactions vs. Organized by particular subject

• More users vs. fewer users

• Accesses few records vs. entire table

• Smaller database vs. Large database

• Normalized data structure vs. denormalized

• Continuous update vs. periodic update
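
The difference in access pattern can be sketched with an in-memory SQLite database. This is only an illustration: the `orders` table and its rows are made up, and one small table stands in for both kinds of system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "East", 100.0), (2, "West", 250.0), (3, "East", 50.0)],
)

# OLTP-style access: touch one record, identified by its key.
one_order = conn.execute(
    "SELECT amount FROM orders WHERE order_id = ?", (2,)
).fetchone()

# Warehouse-style access: scan the whole table and summarize by subject.
totals_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region").fetchall()
)
```

The first query reads a single row; the second reads every row to produce one figure per region.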

Definition

• A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.

• Data warehousing is the process whereby organizations extract value from their informational assets through the use of special stores called data warehouses.

Types

• Operational Data Store: Operational data mirror, e.g. items in stock.

• Enterprise data warehouse: Historical analysis, Complex pattern analysis.

• Data Marts

Uses of a data warehouse

• Presentation of standard reports and graphs

• For dimensional analysis

• Data mining

Advantages

• Lowers cost of information access

• Improves customer responsiveness

• Identifies hidden business opportunities

• Strategic decision making

Roadmap to Data Warehousing

• Data extracted, transformed and cleaned

• Stored in a database - RDBMS or MDD (multidimensional database)

• Query and Reporting systems

• Executive Information System and Decision Support System

Data Extraction and Load

• Find sources of data: Tables, files, documents, commercial databases, emails, Internet

• Poor data quality: The same name used for different things, different units

• Tool to clean data - Apertus

• Tool to convert codes, aggregate and calculate derived values - SAS

• Data Reengineering tools
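
A cleaning and transformation step of the kind listed above can be sketched in plain Python. The rows, the unit table and the country-code table are all hypothetical, and this is not how Apertus or SAS work internally; it only shows unit reconciliation, code conversion and a derived value.

```python
# Hypothetical source rows: two systems report weight in different units
# and use different codes for the same country.
source_rows = [
    {"item": "bolt", "weight": 2.0, "unit": "kg", "country": "US"},
    {"item": "nut", "weight": 500.0, "unit": "g", "country": "USA"},
]

TO_KG = {"kg": 1.0, "g": 0.001}             # unit conversion factors
COUNTRY_CODES = {"US": "US", "USA": "US"}   # code conversion table


def clean(row):
    """Return a copy of row with weight in kilograms and a standard country code."""
    return {
        "item": row["item"],
        "weight_kg": row["weight"] * TO_KG[row["unit"]],
        "country": COUNTRY_CODES[row["country"]],
    }


cleaned = [clean(r) for r in source_rows]

# A derived, aggregated value computed after cleaning.
total_weight_kg = sum(r["weight_kg"] for r in cleaned)
```

Summing the raw `weight` column would mix grams and kilograms; the total is only meaningful once the units agree.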

Metadata

• Database that describes various aspects of data in the warehouse

• Administrative Metadata: Source database and contents, Transformations required, History of Migrated data

• End User Metadata: Definition of warehouse data, Descriptions of it, Consolidation Hierarchy

Storage

• Relational databases

• MDD: Measurements are numbers that quantify the business process. Dimensions are attributes that describe measurements.
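
The relationship between measurements and dimensions can be sketched as a tiny cube held in a dictionary. The dimension names (period, market, product) and the figures are invented for illustration; a real MDD uses specialized array storage.

```python
from collections import defaultdict

# Each cell is addressed by its dimension values (period, market, product)
# and holds one measurement (units sold). The figures are made up.
cube = {
    ("1996-Q1", "East", "Widget"): 120,
    ("1996-Q1", "West", "Widget"): 80,
    ("1996-Q2", "East", "Gadget"): 45,
}


def total_by(cube, axis):
    """Sum the measurement along one dimension (0=period, 1=market, 2=product)."""
    totals = defaultdict(int)
    for coords, value in cube.items():
        totals[coords[axis]] += value
    return dict(totals)


sales_by_market = total_by(cube, 1)
```

The numbers are the measurements; the tuple that locates each number is made of dimension values.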

Information Analysis & Delivery

• Speed up retrieval using query optimizers and bitmap indices

• Ad hoc query - Simple query and analysis functions

• Managed Query - Business layer between end users and database

• Multidimensional - OLAP - support complex analysis of dimensional data
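
The idea behind the bitmap indices mentioned above can be sketched with Python integers used as bit strings. This is a teaching sketch under simplifying assumptions (no compression, rows numbered from 0), not the implementation used by any particular database.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def matching_rows(bitmap):
    """Decode a bitmap back into a list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Two low-cardinality columns of a hypothetical four-row table.
region = ["East", "West", "East", "East"]
product = ["Widget", "Widget", "Gadget", "Widget"]

region_idx = build_bitmap_index(region)
product_idx = build_bitmap_index(product)

# Rows where region = 'East' AND product = 'Widget': a single bitwise AND.
hits = matching_rows(region_idx["East"] & product_idx["Widget"])
```

Combining conditions becomes bitwise arithmetic on the bitmaps, which is why this kind of index suits columns with few distinct values.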

Information Analysis & Delivery

• EIS/DSS: Packaged queries and reports, Preplanned analytical functions, Answer specific questions

• Alerts: Specific indicators

Managing the Data Warehouse

• Data: Size, storage needs, Security, Backups, Tracking

• Process: Monitoring the update process (changes in source, quality of data), keeping data accurate and up to date

Tools

• Data Extraction - SAS

• Data Cleaning - Apertus, Trillium

• Data Storage - ORACLE, SYBASE

• Optimizers - Advanced Parallel Optimizer, Bitmap Indices, Star Index

Tools

• Development tools to create applications: IBM Visualizer, ORACLE CDE

• Relational OLAP: Informix Metacube

Architecture

• Rehosting Mainframe Applications: Moving to lower-cost microprocessors; Tools - Micro Focus COBOL; Lowers cost; No transparent access to data

Architecture

• Mainframe as server: 2-tier approach; Front-end client & back-end server; Front-end tools - PowerBuilder, VB; Minimal investment in extra hardware; Data inconsistency hidden; Fat client; Cannot be used if the number of end users increases

Architecture

• Enterprise Information Architecture: 3-tier; Source data on host computer; Database servers like ORACLE, Essbase (MDD); Front-end tools - DSS/EIS

RDBMS

• RDBMS provide rapid response to queries: Bitmap index, Index structures

• Functionality added to conventional RDBMS like data extraction and replication

MDD

• Decision support environment

• Supports iterative queries

• Extensions to SQL - for high performance data warehousing

• Performance degrades as size increases

• Inability to incrementally load

• Loading is slow

• No agreed upon model

MDD

• No standard access method like SQL

• Minor changes require complete reorganization

Data Access Tools

• Simple relational query tools - Esperent

• DSS/EIS - EXPRESS used by financial specialists

Database Design

• Simple

• Data must be clean

• Query processing must be fast

• Fast loading

Star Schema

• Consists of a group of tables that describe the dimensions of the business arranged logically around a huge central table that contains all the accumulated facts and figures of the business.

• The smaller, outer tables are points of the star, the larger table the center from which the points radiate.

Star Schema

• Fact Table - Sales, Orders, Budget, Shipment: Real values (numeric)

• Dimension Table - Period, Market, Product: Character data

• Summary/Aggregate data
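
A star schema of this shape can be sketched with Python's built-in `sqlite3` module. The table and column names (`sales`, `period`, `market`, `product`, `units`, `revenue`) and the rows are assumptions chosen to match the examples on the slide, not a schema taken from a real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: small, descriptive, mostly character data.
CREATE TABLE period  (period_id  INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE market  (market_id  INTEGER PRIMARY KEY, region  TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name    TEXT);

-- Central fact table: numeric measures plus one foreign key per dimension.
CREATE TABLE sales (
    period_id  INTEGER REFERENCES period(period_id),
    market_id  INTEGER REFERENCES market(market_id),
    product_id INTEGER REFERENCES product(product_id),
    units      INTEGER,
    revenue    REAL,
    PRIMARY KEY (period_id, market_id, product_id)
);

INSERT INTO period  VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO market  VALUES (1, 'East'), (2, 'West');
INSERT INTO product VALUES (1, 'Widget');
INSERT INTO sales   VALUES (1, 1, 1, 10, 100.0), (1, 2, 1, 5, 50.0), (2, 1, 1, 7, 70.0);
""")

# A star join: filter and group on dimension attributes, sum the facts.
revenue_by_region = dict(conn.execute("""
    SELECT m.region, SUM(s.revenue)
    FROM sales s
    JOIN market m ON m.market_id = s.market_id
    JOIN period p ON p.period_id = s.period_id
    WHERE p.quarter = 'Q1'
    GROUP BY m.region
""").fetchall())
```

Every query follows the same pattern: join the fact table to the dimensions it needs, restrict on dimension attributes, and aggregate the numeric facts.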

Star Schema

• Data you can trust: Referential Integrity

• Query Speed: Fact table - Primary key; Dimension table - all columns; Query optimizer which understands star schema
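
Referential integrity between a fact table and a dimension table can be demonstrated with SQLite, which enforces foreign keys only when the `foreign_keys` pragma is switched on. The `product` and `sales` tables here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE sales ("
    " product_id INTEGER NOT NULL REFERENCES product(product_id),"
    " units INTEGER)"
)
conn.execute("INSERT INTO product VALUES (1, 'Widget')")
conn.execute("INSERT INTO sales VALUES (1, 10)")  # accepted: product 1 exists

try:
    conn.execute("INSERT INTO sales VALUES (99, 3)")  # no such product
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

A fact row that points at a missing dimension row is refused, so every fact can always be described by its dimensions.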

Star Schema

• Load Processing: Must be done offline; Issue if aggregate data is stored
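
One reading of the aggregate issue is that a stored summary goes stale as soon as new facts are loaded. The sketch below assumes that reading; the rows and the `build_summary` helper are hypothetical.

```python
facts = [("East", 100.0), ("West", 50.0)]


def build_summary(rows):
    """Recompute the stored aggregate (total per region) from the fact rows."""
    summary = {}
    for region, amount in rows:
        summary[region] = summary.get(region, 0.0) + amount
    return summary


summary = build_summary(facts)

# A load appends new fact rows; the stored summary is now out of date.
facts.append(("East", 25.0))
stale = summary["East"] != sum(a for r, a in facts if r == "East")

# Rebuilding after the load brings the aggregate back in step.
summary = build_summary(facts)
```

Until the rebuild finishes, queries against the summary and queries against the facts disagree, which is one reason to load while users are offline.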

Variations of Star Schema

• Outboard tables

• Fact table families

• Multistar fact table

OLAP

• Front end tool for MDD

• Slice Report

• Pivot Report

• Alert-reporting

• Time-based

• Exception reporting
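
Slice and pivot reports can be sketched over a small dictionary-based cube. The dimensions (period, market, product) and the figures are assumptions for illustration only.

```python
# Hypothetical cube cells: (period, market, product) -> units sold.
cells = {
    ("Q1", "East", "Widget"): 10,
    ("Q1", "West", "Widget"): 5,
    ("Q2", "East", "Widget"): 7,
    ("Q2", "East", "Gadget"): 3,
}


def slice_cube(cells, period):
    """Slice: fix one dimension (period) and keep the remaining two."""
    return {(m, p): v for (t, m, p), v in cells.items() if t == period}


def pivot(cells):
    """Pivot: markets as rows, periods as columns, summed over products."""
    table = {}
    for (t, m, p), v in cells.items():
        row = table.setdefault(m, {})
        row[t] = row.get(t, 0) + v
    return table


q1 = slice_cube(cells, "Q1")
by_market = pivot(cells)
```

A slice narrows the cube to one value of a dimension; a pivot chooses which dimensions become rows and columns of the report.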

Wide OLAP

• Generating (synthesizing) information as well as using it, and storing this additional information by updating the data source

• Modeling capabilities, including a calculation engine for deriving results and creating aggregations, consolidations and complex calculations

• Forecasting, trend analysis, optimization, statistical analysis

Relational OLAP

• Has a powerful SQL-generator

• Generates SQL optimized for the target database

• Rapidly changing dimensions
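
The SQL-generator idea can be sketched as a function that turns a declarative request into a query string. The function name, its parameters and the table and column names are all hypothetical, and the sketch does no dialect-specific optimization for a target database.

```python
def generate_sql(fact, measure, dimensions, filters=None):
    """Build a GROUP BY query from a declarative request.

    All table and column names come from the caller; a real tool would
    validate them against its metadata before emitting SQL.
    """
    select = ", ".join(dimensions + [f"SUM({measure}) AS total"])
    sql = f"SELECT {select} FROM {fact}"
    params = []
    if filters:
        clauses = []
        for column, value in filters.items():
            clauses.append(f"{column} = ?")
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    sql += " GROUP BY " + ", ".join(dimensions)
    return sql, params


sql, params = generate_sql("sales", "revenue", ["region"], {"quarter": "Q1"})
```

The user states what to measure and how to group it; the tool writes the SQL and passes filter values as bound parameters.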

MDD OLAP

• Row level calculations

• Financial functions, currency conversions, interest calculations

Metadata

• User oriented: Definition of attributes

• System oriented: Record and field edit procedure names

Uses of Metadata

• Map source system data to data warehouse tables

• Generate data extract, transform, and load procedures for import jobs

• Help users discover what data are in the data warehouse

• Help users structure queries to access data they need
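
Two of these uses, mapping source data to warehouse columns and helping users discover what is stored, can be sketched with a small metadata table. The field names (`CUST_NM`, `REV_CENTS`) and the mapping layout are hypothetical.

```python
# Hypothetical administrative metadata: one entry per warehouse column,
# naming its source field and the transformation to apply.
MAPPING = [
    {"target": "customer_name", "source": "CUST_NM", "transform": str.strip},
    {"target": "revenue_usd", "source": "REV_CENTS", "transform": lambda c: c / 100},
]


def apply_mapping(source_row, mapping):
    """Build one warehouse row from a source row by following the metadata."""
    return {m["target"]: m["transform"](source_row[m["source"]]) for m in mapping}


def describe(mapping):
    """Let users discover which columns exist and where they come from."""
    return {m["target"]: m["source"] for m in mapping}


row = apply_mapping({"CUST_NM": "  Acme ", "REV_CENTS": 12550}, MAPPING)
catalog = describe(MAPPING)
```

The load logic lives in data rather than in code, so the same table drives the import job and answers user questions about lineage.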

Describing the data warehouse

• I/P - O/P object: File/Table, Archive Period

• Relationship

• Data element - Name, Defn., Type

• Relationship Member - Role, Participation Constraint

• Field Assignment

Extract Jobs

• Wholesale replace

• Wholesale append

• Update replace

• Update append
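
The slide only names the four job types, so the sketch below rests on an assumed interpretation: "wholesale" jobs carry a full extract, "update" jobs carry only changed rows, "replace" overwrites and "append" keeps what is already there. The function names and the `id` key are hypothetical.

```python
def wholesale_replace(table, extract):
    """Discard the current contents and load the full extract."""
    return list(extract)


def wholesale_append(table, extract):
    """Keep the current contents and add the full extract after them."""
    return table + list(extract)


def update_replace(table, changes, key="id"):
    """Overwrite rows whose key appears in the changes; add the rest as new."""
    merged = {row[key]: row for row in table}
    merged.update({row[key]: row for row in changes})
    return list(merged.values())


def update_append(table, changes):
    """Add only the changed rows, keeping older versions as history."""
    return table + list(changes)


table = [{"id": 1, "qty": 5}, {"id": 2, "qty": 8}]
changes = [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}]

replaced = update_replace(table, changes)
appended = update_append(table, changes)
```

Under this reading the two append jobs run the same code and differ only in what the extract contains: every source row, or only the rows that changed.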

Data Quality

• Target and Actual Quality Characteristic
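
Comparing a target against an actual quality characteristic can be sketched with one measurable property. Completeness of a field is used here as an assumed example of such a characteristic; the rows and the 95% target are made up.

```python
def completeness(rows, field):
    """Fraction of rows in which the field is present and non-empty."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


rows = [
    {"email": "a@example.com"},
    {"email": ""},
    {"email": "c@example.com"},
    {"email": None},
]

target = 0.95                           # the quality level the warehouse aims for
actual = completeness(rows, "email")    # the level measured in the data
meets_target = actual >= target
```

The gap between `target` and `actual` is what a data quality report would track over successive loads.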

Planning

• Interviews

• Data quality

• Data Access

• Timeliness and history

• Data sources

• Decide on Architecture

Development Process

• Project Initiation

• Develop Enterprise Info. Architecture

• Design Data Warehouse Database

• Transform data

• Manage Metadata

• Develop User-Interface

• Manage Production

Evolution

• Support the current DW baseline

• Enhance current baseline capabilities

• Define new business requirements

• Implement new baseline

Mistakes

• Starting with the wrong sponsorship chain

• Setting expectations that cannot be met

• Believing that DW design is the same as Transactional Database Design

• Believing the Performance, Capacity Promises

• Believing that once the Data Warehouse is up and running, problems are finished

Page 43: Data Warehousing -Kalyani. Topics Definition Types Components Architecture Database Design OLAP Metadata repository

• NSWCDD - ORACLE on UNIX

• Harris Semiconductor IYM with Alarms, INGRES