data warehouse by piyush

BY: PIYUSH JAIN, ROLL NUMBER: 05, SECTION: C2703, TRADE: BTECH(IT)-MBA, DATA WAREHOUSING

Upload: astronish

Post on 06-May-2015


Category:

Education



DESCRIPTION

This PPT gives an overview of the topic.

TRANSCRIPT

Page 1: Data Warehouse By Piyush

BY: PIYUSH JAIN

ROLL NUMBER: 05

SECTION: C2703

TRADE: BTECH(IT)-MBA

DATA WAREHOUSING

Page 2: Data Warehouse By Piyush

CONTENTS OF THE TOPIC:

WHAT IS A DATA WAREHOUSE (DW)?

WHY WE USE DATA WAREHOUSING

BASIC ARCHITECTURE OF A DATA WAREHOUSE

COMPONENTS OF THE WAREHOUSE

Extracting and validating the data

Transforming the data

Loading the data into the DW

TYPES OF LAYERS IN THE DATA WAREHOUSE

APPROACHES FOR DATA STORAGE

STAGES OF USE OF THE DATA WAREHOUSE

ADVANTAGES AND DISADVANTAGES OF THE DATA WAREHOUSE

SOME APPLICATIONS

Page 3: Data Warehouse By Piyush

What is a DW?

A data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis.

Page 4: Data Warehouse By Piyush

Why do we need a DW?

Organizations are getting larger and amassing ever-increasing amounts of data.

Historic data encodes useful information about the working of an organization.

However, the data is scattered across multiple sources, in multiple formats.

That's why we need a data warehouse, where the data can be easily stored and easily accessed.

Page 5: Data Warehouse By Piyush

BASIC ARCHITECTURE OF DW:

[Diagram labels: Relational Databases, Purchased Data, Extraction / Cleansing, Optimized Loader, Data Warehouse Engine, Metadata Repository, Analyze / Query]

Page 6: Data Warehouse By Piyush

TYPES OF LAYERS IN DW :

1. OPERATIONAL DATABASE LAYER: The source data for the data warehouse.

2. DATA ACCESS LAYER: The interface between the operational and information access layers. Tools to extract, transform and load data into the warehouse fall into this layer.

3. METADATA LAYER: The data directory. This is usually more detailed than an operational system data directory.

4. INFORMATION ACCESS LAYER: The data accessed for reporting and analysis, and the tools for reporting on and analyzing that data.

Page 7: Data Warehouse By Piyush

DATA WAREHOUSING

[Diagram: OLTP systems (Order Processing, Inventory, Sales), Data Extraction, Data Warehouse (OLAP)]

Page 8: Data Warehouse By Piyush

Components of the Warehouse

Data Extraction and Validation

Transforming the data

Analyze and Query - OLAP Tools (Loading the data)

Page 9: Data Warehouse By Piyush

Extracting data from External Source

The first part of an ETL process involves extracting the data from the source systems. Most data warehousing projects consolidate data from different source systems. Each separate system may also use a different data organization or format. Common data sources are relational databases, but they may also include non-relational structures such as ISAM and VSAM files.

VALIDATING DATA: An intrinsic part of the extraction involves parsing the extracted data to check whether it meets an expected pattern or structure. If not, the data may be rejected entirely or in part.
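The validation step described above can be sketched in a few lines of Python. This is only an illustration: the record layout (customer id, email, amount), the `EMAIL_PATTERN` regex, and the function names `is_valid` and `validate` are assumptions made for the example and do not come from the slides.

```python
import re

# Hypothetical expected structure: (customer_id, email, amount), all strings.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid(record):
    """Return True if the record matches the expected pattern."""
    if len(record) != 3:
        return False
    customer_id, email, amount = record
    if not customer_id.isdigit():          # id must be numeric
        return False
    if not EMAIL_PATTERN.match(email):     # email must look like an address
        return False
    try:
        float(amount)                      # amount must parse as a number
    except ValueError:
        return False
    return True


def validate(records):
    """Split extracted records into accepted and rejected lists."""
    accepted, rejected = [], []
    for record in records:
        (accepted if is_valid(record) else rejected).append(record)
    return accepted, rejected


extracted = [
    ("101", "a@example.com", "25.50"),
    ("abc", "b@example.com", "10.00"),   # rejected: non-numeric id
    ("102", "not-an-email", "5.00"),     # rejected: malformed email
    ("103", "c@example.com", "n/a"),     # rejected: amount is not a number
]
accepted, rejected = validate(extracted)
```

Here whole records are rejected; a real pipeline might instead reject only the failing fields, which is the "in part" case mentioned above.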

Page 10: Data Warehouse By Piyush

Data Transformation

Selecting only certain columns to load

Translating coded values

Encoding free-form values

Deriving a new calculated value

Filtering

Sorting

Joining data from multiple sources (e.g., lookup, merge)

Aggregation

Transposing or pivoting

Splitting a column into multiple columns

Disaggregation of repeating columns into a separate detail table
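A few of these transformations can be combined in a short Python sketch. The row layout, the `REGION_NAMES` lookup table, and the `transform` function are hypothetical and chosen only to illustrate column selection, translating coded values, deriving a value, filtering, aggregation and sorting.

```python
from collections import defaultdict

# Hypothetical source rows:
# (order_id, region_code, quantity, unit_price, internal_note)
rows = [
    (1, "N", 2, 10.0, "x"),
    (2, "S", 1, 20.0, "y"),
    (3, "N", 0, 15.0, "z"),
    (4, "S", 3, 5.0, "w"),
]

# Lookup table used to translate coded values.
REGION_NAMES = {"N": "North", "S": "South"}


def transform(rows):
    """Return total revenue per region, sorted by region name."""
    totals = defaultdict(float)
    # Selecting only certain columns: the note column is ignored.
    for _order_id, region_code, quantity, unit_price, _note in rows:
        if quantity == 0:                    # filtering: skip empty orders
            continue
        region = REGION_NAMES[region_code]   # translating a coded value
        revenue = quantity * unit_price      # deriving a calculated value
        totals[region] += revenue            # aggregation
    return sorted(totals.items())            # sorting


result = transform(rows)
print(result)
```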

Page 11: Data Warehouse By Piyush

Loading the data

The load phase loads the data into the end target, usually the data warehouse (DW). Depending on the requirements of the organization, this process varies widely. Some data warehouses may overwrite existing information with cumulative, updated data every week, while other DWs may add new data in a historicized form, for example, hourly. The data is generally loaded into the DW using SQL queries.
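The two loading styles mentioned above, overwriting versus adding data in historicized form, can be sketched with Python's built-in `sqlite3` module. The table names `sales_current` and `sales_history`, the column names, and the sample batches are assumptions for illustration; a real warehouse would use its own schema and database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_current (product TEXT PRIMARY KEY, total REAL)")
conn.execute("CREATE TABLE sales_history (loaded_at TEXT, product TEXT, total REAL)")


def load_overwrite(conn, batch):
    """Overwrite existing rows with the updated cumulative values."""
    conn.executemany(
        "INSERT OR REPLACE INTO sales_current (product, total) VALUES (?, ?)",
        batch,
    )
    conn.commit()


def load_history(conn, loaded_at, batch):
    """Append the batch with a load timestamp, keeping earlier loads."""
    conn.executemany(
        "INSERT INTO sales_history (loaded_at, product, total) VALUES (?, ?, ?)",
        [(loaded_at, product, total) for product, total in batch],
    )
    conn.commit()


# Two hypothetical hourly loads of the same product.
loads = [
    ("2015-05-01T10:00", [("pen", 10.0)]),
    ("2015-05-01T11:00", [("pen", 15.0)]),
]
for loaded_at, batch in loads:
    load_overwrite(conn, batch)
    load_history(conn, loaded_at, batch)

current_rows = conn.execute("SELECT product, total FROM sales_current").fetchall()
history_count = conn.execute("SELECT COUNT(*) FROM sales_history").fetchone()[0]
```

After both loads, the overwrite table holds only the latest value per product, while the history table keeps one row per load.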

Page 12: Data Warehouse By Piyush

APPROACHES FOR DATA STORAGE :

There are two leading approaches to data storage:

1. Normalized Approach: Data in the data warehouse are stored following, to a degree, database normalization rules. Tables are grouped together by subject areas that reflect general data categories (e.g., data on customers, products, finance, etc.).

2. Dimensional Approach: Transaction data are partitioned into either "facts", which are generally numeric transaction data, or "dimensions", which are the reference information that gives context to the facts.
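As a minimal sketch of the dimensional approach, the example below builds one fact table and one dimension table in an in-memory SQLite database and joins them. The table names `fact_sales` and `dim_product`, their columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension: reference information that gives context to the facts.
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    category   TEXT
);
-- Fact: numeric transaction data, keyed to the dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity   INTEGER,
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Furniture');
INSERT INTO fact_sales VALUES (1, 2, 20.0), (1, 1, 10.0), (2, 1, 50.0);
""")

# Aggregate the facts, using the dimension to group by category.
by_category = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
```

In a normalized design the same data would be spread over more tables grouped by subject area, so a report like this would typically need more joins.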

Page 13: Data Warehouse By Piyush

STAGES IN DATA WAREHOUSE :

Four main stages of use of the data warehouse can be distinguished:

1. OFFLINE OPERATIONAL DATABASE

2. OFFLINE DATA WAREHOUSE

3. REAL TIME DATA WAREHOUSE

4. INTEGRATED DATA WAREHOUSE

Page 14: Data Warehouse By Piyush

BENEFITS OF DW :

A data warehouse provides a common data model for all data of interest regardless of the data's source.

Prior to loading data into the data warehouse, inconsistencies are identified and resolved.

Information in the data warehouse is under the control of data warehouse users so that, even if the source system data is purged over time, the information in the warehouse can be stored safely for extended periods of time.

Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.

Page 15: Data Warehouse By Piyush

DISADVANTAGES ALSO…

Data warehouses are not the optimal environment for unstructured data.

Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.

Over their life, data warehouses can have high costs. Maintenance costs are high.

Data warehouses can get outdated relatively quickly.

Page 16: Data Warehouse By Piyush

SOME APPLICATIONS ALSO…

Credit card churn analysis

Insurance fraud analysis

Call record analysis

Logistics management

Page 17: Data Warehouse By Piyush

THANKS FOR YOUR ATTENTION

ANY QUERIES?