Data warehouse architecture

DATA WAREHOUSE AND ITS ARCHITECTURE. PRESENTED BY: DEEPAK CHAURASIA, M.TECH (DELHI COLLEGE OF ENGINEERING)

Upload: deepak-chaurasia

Post on 22-Nov-2014


DESCRIPTION

These are the slides I made for a class presentation. They are now public.

TRANSCRIPT

Page 1: Data ware house architecture

DATA WAREHOUSE AND ITS ARCHITECTURE

PRESENTED BY: DEEPAK CHAURASIA

M.TECH (DELHI COLLEGE OF ENGINEERING)

Page 2: Data ware house architecture

The issues I will focus on:

• What is a data warehouse?
• The architecture of a data warehouse
• OLAP servers, their various types, and how they work
• Data marts

Page 3: Data ware house architecture

What is a data warehouse all about?

Page 4: Data ware house architecture

A data warehouse is a

• Subject-oriented -> a database and a data warehouse are two different things, so they take different approaches to storing data
• Integrated -> data is brought into a common format
• Time-variant -> historical data; every record is associated with a time
• Non-volatile -> data is stored in a form that is neither deleted nor updated

collection of data that is used primarily in organizational decision making.

Page 5: Data ware house architecture

Subject-oriented?

Figure: application orientation (operational database) vs. subject orientation (data warehouse).

Business: order processing, stock management, billing -> subject: sales
Bank: savings account, loan account, current account -> subject: account

Page 6: Data ware house architecture

Explanation

As we can see in both the business and the bank example, the operational databases store data application-wise. This simply means that for every operational application of the organization there is an associated store that holds the data specific to that application. These stores are called databases.

In the organization's data warehouse, by contrast, the data is stored subject-wise. The subject is the most important aspect of the organization: for a bank the account is important, for a business the sale is important.

Page 7: Data ware house architecture

Integrated?

• Data in the DW comes from several operational systems.

• Different datasets in these operational systems have different file formats.

• Example: data for the subject Account comes from 3 different data sources (as shown in the figure).

Figure: in the operational environment, the savings, current, and loan systems all feed the subject Account.

Page 8: Data ware house architecture

So there can be variations between the sources, such as:

1. Naming conventions and field formats can differ. Example: a savings account number could be 8 bytes long, while a checking account number is only 6 bytes long.

2. The total number of attributes of a data item can differ. Example: a savings account can have 5 attributes, while a checking account can have 7 attributes associated with it.
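The integration step can be sketched in Python as follows. This is only an illustration under stated assumptions: the source field names (`sav_acc_no`, `chk_no`, `bal`), the padding rule, and the common `account` layout are hypothetical and do not come from the slides.

```python
# Hypothetical records from two operational systems with different conventions.
savings_rows = [{"sav_acc_no": "12345678", "balance": 500.0}]
checking_rows = [{"chk_no": "654321", "bal": 120.5, "overdraft_limit": 1000.0}]


def to_common_format(row, source):
    """Map a source-specific record onto one common 'account' layout."""
    if source == "savings":
        number, balance = row["sav_acc_no"], row["balance"]
    elif source == "checking":
        number, balance = row["chk_no"], row["bal"]
    else:
        raise ValueError(f"unknown source: {source}")
    return {
        "account_no": number.zfill(8),  # pad every key to a common 8-character width
        "account_type": source,
        "balance": float(balance),
    }


accounts = [to_common_format(r, "savings") for r in savings_rows]
accounts += [to_common_format(r, "checking") for r in checking_rows]
```

After this step every record has the same attribute names and the same key length, whichever system it came from. Attributes that exist in only one source (here `overdraft_limit`) are simply dropped in this sketch; a real design would have to decide how to carry them.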

Page 9: Data ware house architecture

Time-variant?

• The operational database stores only current data, but the data warehouse stores present as well as past data in order to fulfill its purpose.

• Data is stored as a series of snapshots, each representing a period of time.

• Data is tagged with some element of time: creation date, as-of date, etc.

• Data is available on-line for long periods of time (for example, five or more years) for trend analysis and forecasting.
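A minimal Python sketch of snapshot storage, assuming a hypothetical in-memory list as the store; the names `record_snapshot` and `balance_as_of` are illustrative and not taken from the slides. New values are appended with an as-of date instead of overwriting old ones, so past states stay queryable.

```python
from datetime import date

# Hypothetical append-only snapshot store: one row per account per as-of date.
snapshots = []


def record_snapshot(account_no, balance, as_of):
    """Append a new snapshot instead of overwriting the previous value."""
    snapshots.append({"account_no": account_no, "balance": balance, "as_of": as_of})


def balance_as_of(account_no, when):
    """Return the most recent balance recorded on or before `when`, or None."""
    rows = [s for s in snapshots
            if s["account_no"] == account_no and s["as_of"] <= when]
    if not rows:
        return None
    return max(rows, key=lambda s: s["as_of"])["balance"]


record_snapshot("A1", 100.0, date(2013, 12, 31))
record_snapshot("A1", 250.0, date(2014, 12, 31))
```

Asking for the balance in mid-2014 returns the 2013 snapshot, while asking in 2015 returns the 2014 one; an operational database that keeps only the current value could answer only the second question.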

Page 10: Data ware house architecture

Non-volatile?

• Data from the operational systems is moved into the DW at specific intervals (this process is called refreshing).

• Business transactions do not update data in the data warehouse.

• Data is not deleted from the data warehouse.

Page 11: Data ware house architecture

The 3-tier architecture of a data warehouse

• When all the components of a system are combined to form the complete system, the style in which they are combined is known as the architecture of the system (for example, the architecture of a school building).

• In a data warehouse the components are:
1. Data acquisition
2. Data storage
3. Data processing
4. Data delivery

• Layers (for example, the OSI reference model in computer networks) mean that the system is made of logically separated components, whereas tiers mean that the system is made of physically separated components.

Page 12: Data ware house architecture

The various possible architectures when dealing with a database:

• 1-tier: here the database (in the form of files) is stored on the client computer itself.

• 2-tier: here the database server is at a remote location, and the client machine and the database are connected via a network.

Page 13: Data ware house architecture

• 3-tier: here an application server is placed between the client machine and the database server. It sits mainly on the server side, does the processing, and returns the results to the client machine.

Page 14: Data ware house architecture

Conclusions

Figure: the number of tiers compared against security, maintainability, number of users, speed, and cost.

Page 15: Data ware house architecture

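The architecture of a data warehouse

Figure: three-tier data warehouse architecture.

• Information sources: operational DBs and external sources, which feed the warehouse through extract, transform, load.
• Tier 1 (data tier), data warehouse server: the data warehouse and the data marts, which serve Tier 2.
• Tier 2 (logic tier), OLAP servers: MOLAP and ROLAP, which serve Tier 3.
• Tier 3 (presentation tier), clients: OLAP, query/reporting, and data mining.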

Page 16: Data ware house architecture

The bottommost layer:

• Operational databases: the application-specific databases that store all the day-to-day transactional data of the organization.

• External sources: the databases that store all important external information.

Page 17: Data ware house architecture

Database vs. data warehouse

OLTP (on-line transaction processing)
• Major task of a traditional relational DBMS.
• Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.

OLAP (on-line analytical processing)
• Major task of a data warehouse system.
• Data analysis and decision making.
• Forecasting and monitoring of the business.

Page 18: Data ware house architecture

How is the warehouse loaded?

Loading is done using back-end tools, which are described on the next page.

Page 19: Data ware house architecture

• Data extraction: get data from multiple, heterogeneous, and external sources.
• Data cleaning: correct erroneous values.
• Data transformation: convert data from one format to another (e.g. pounds -> kg, age -> date of birth).
• Load: summarized tables are loaded into the data warehouse.
• Refresh: propagate the updates from the data sources to the warehouse.
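The back-end steps above can be sketched as a small Python pipeline. Everything here is an assumption made for illustration: the in-memory sources, the field names, and the cleaning rule (which merely drops rows with a missing or negative amount rather than correcting them) are hypothetical.

```python
def extract(sources):
    """Extract: gather rows from several (here in-memory, hypothetical) sources."""
    return [row for source in sources for row in source]


def clean(rows):
    """Clean: drop rows with a missing or negative amount."""
    return [r for r in rows if r.get("amt") is not None and r["amt"] >= 0]


def transform(rows):
    """Transform: convert weights given in pounds to kilograms."""
    out = []
    for r in rows:
        r = dict(r)  # copy, so the source rows are left unchanged
        if r.get("unit") == "lb":
            r["weight"] = round(r["weight"] * 0.45359237, 3)
            r["unit"] = "kg"
        out.append(r)
    return out


def load(rows, warehouse):
    """Load: add the rows to a summarized table (total amount per date)."""
    for r in rows:
        warehouse[r["date"]] = warehouse.get(r["date"], 0) + r["amt"]
    return warehouse


store_a = [{"date": 1, "amt": 12, "weight": 2.0, "unit": "lb"}]
store_b = [{"date": 1, "amt": 11, "weight": 1.0, "unit": "kg"},
           {"date": 2, "amt": None, "weight": 3.0, "unit": "kg"}]

staged = transform(clean(extract([store_a, store_b])))
warehouse = load(staged, {})
```

A refresh corresponds to running the same pipeline again on the rows that arrived since the last run and calling `load` with the existing `warehouse`, so the new amounts are added to the stored summaries.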

Page 20: Data ware house architecture

Tier 1: data warehouse

This is the data warehouse itself, loaded with the information used for strategy making.

This tier also contains the data marts.

Page 21: Data ware house architecture

Tier 2

This tier consists of the OLAP servers, which are used for processing. The following issues are also handled here:

• Security of data (the user is not allowed to communicate directly with the database).

• Business logic (here you can decide what kind of information is shown for a particular kind of query).

• Translation (the user's high-level queries are converted into low-level SQL queries).

• Intermediate calculations (this removes the burden from the user interface and the database).

Page 22: Data ware house architecture

OLAP server

• ROLAP server: choose this if space is important for you.

• MOLAP server: choose this if time is important for you.

Page 23: Data ware house architecture

HOW DOES ROLAP WORK??

Page 24: Data ware house architecture

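Figure (ROLAP): the user sends a request from the desktop client to the ROLAP server. The ROLAP server issues a complex query to the RDBMS server that holds the data warehouse, creates the data cube dynamically (on the fly), and returns the results to the client as a multidimensional view.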

Page 25: Data ware house architecture

DETAILS

Relational online analytical processing (ROLAP) is a form of online analytical processing (OLAP) that performs multidimensional analysis of data stored in a relational database rather than in a multidimensional database.

In a three-tiered architecture, the user submits a request for multidimensional analysis and the ROLAP engine converts the request to SQL for submission to the relational database. Then the operation is performed in reverse: the engine converts the resulting data from SQL to a multidimensional format (on the fly) before it is returned to the client for viewing.

Page 26: Data ware house architecture

QUERY: Add up total sale amount by day

In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

ans
date  sum
1     81
2     48
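The same query can be tried out with a small ROLAP-style sketch in Python. It assumes SQLite (through the standard `sqlite3` module) as a stand-in for the RDBMS server; the function name `rolap_query` and the dict-based "cube" are illustrative assumptions, not part of the slides. A request (dimensions plus a measure) is translated to a GROUP BY statement, and the SQL result is turned back into a structure keyed by dimension values.

```python
import sqlite3


def rolap_query(conn, table, dimensions, measure):
    """Translate a multidimensional request into SQL and return a cube-like dict.

    Identifiers are interpolated directly into the statement, so they must
    come from trusted code, never from user input.
    """
    dims = ", ".join(dimensions)
    sql = f"SELECT {dims}, SUM({measure}) FROM {table} GROUP BY {dims}"
    cube = {}
    for row in conn.execute(sql):
        cube[tuple(row[:-1])] = row[-1]  # key: dimension values, value: aggregate
    return cube


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8), ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
])

total_by_day = rolap_query(conn, "sale", ["date"], "amt")
total_by_product_and_day = rolap_query(conn, "sale", ["prodId", "date"], "amt")
```

With the single dimension `date` the function generates the query shown on the slide and reproduces its answer: 81 for day 1 and 48 for day 2. Nothing is pre-computed; every request goes back to the relational tables, which is the trade-off ROLAP makes.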

Page 27: Data ware house architecture

HOW DOES MOLAP WORK??

Page 28: Data ware house architecture

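Figure (MOLAP): summary data cubes are created from the data warehouse on the RDBMS server and stored in a multidimensional database. The user sends a request from the desktop client to the MOLAP server, which answers it from the multidimensional database and returns the results as a multidimensional view.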

Page 29: Data ware house architecture

POINTS ABOUT MOLAP:

• A multidimensional database is used to fetch the data when the user submits an analytical query.

• Facts (fact table) are stored in multi-dimensional arrays.

• Dimensions (dimension tables) are used to index the arrays.

• One of the major distinctions between a MOLAP and a ROLAP tool is that in MOLAP the data is pre-summarized, pre-calculated, and stored in an optimized format in a multidimensional cube instead of in a relational database, in accordance with the client's reporting requirements.

Page 30: Data ware house architecture

MOLAP is more optimized for fast query performance and retrieval of summarized information.

A MOLAP system also has limitations. Its primary weakness is that a MOLAP tool is less scalable than a ROLAP tool, because it can handle only a limited amount of data.

Pre-calculating or pre-consolidating transactional data improves speed.

Page 31: Data ware house architecture

The MOLAP Cube

dimensions = 2

Fact table view:

sale
prodId  storeId  amt
p1      s1       12
p2      s1       11
p1      s3       50
p2      s2       8

Multi-dimensional cube:

      s1   s2   s3
p1    12    -   50
p2    11    8    -

Add up total sale amount by day

Page 32: Data ware house architecture

dimensions = 3

Fact table view:

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

Multi-dimensional cube:

day 1
      s1   s2   s3
p1    12    -   50
p2    11    8    -

day 2
      s1   s2   s3
p1    44    4    -
p2     -    -    -

Add up total sale amount by day
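The three-dimensional cube can be sketched in plain Python as a nested array indexed by the dimensions. This is an illustration only: the variable names are hypothetical, empty cells are stored as 0, and a real MOLAP engine would use an optimized storage format rather than nested lists.

```python
# Fact table from the slide: (prodId, storeId, date, amt).
facts = [("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
         ("p2", "s2", 1, 8), ("p1", "s1", 2, 44), ("p1", "s2", 2, 4)]

# Each dimension maps its members to an array position.
products = {"p1": 0, "p2": 1}
stores = {"s1": 0, "s2": 1, "s3": 2}
days = {1: 0, 2: 1}

# cube[day][product][store] holds the stored amount (0 where no sale exists).
cube = [[[0] * len(stores) for _ in products] for _ in days]
for prod, store, day, amt in facts:
    cube[days[day]][products[prod]][stores[store]] += amt

# Pre-compute a summary once, at load time: total sale amount per day.
total_by_day = {day: sum(sum(row) for row in cube[i]) for day, i in days.items()}


def lookup(prod, store, day):
    """Answer a cell query by direct array indexing, with no scan of the facts."""
    return cube[days[day]][products[prod]][stores[store]]
```

Here `total_by_day` is computed when the cube is built, so the "total sale amount by day" question is answered by a dictionary lookup (81 for day 1, 48 for day 2) instead of a query over the fact table. The cost is the space taken by the cube, including its empty cells.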

Page 33: Data ware house architecture

The total sale of computers in the year 2008 at the location Asia is 200 units.

The total sale of books in the year 2008 at the location Europe is 200.

Page 34: Data ware house architecture

Hybrid OLAP (HOLAP)

HOLAP = Hybrid OLAP:

Best of both worlds

Storing detailed data in RDBMS

Storing aggregated data in MDBMS

User access via MOLAP tools
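A toy Python sketch of the HOLAP split, assuming SQLite (standard `sqlite3` module) as the RDBMS and a plain dict as a stand-in for the MDBMS. The class name `HolapServer` and its methods are hypothetical and only illustrate the idea: aggregated questions are answered from the pre-computed store, detail questions are passed through to the relational tables.

```python
import sqlite3


class HolapServer:
    """Toy HOLAP: aggregates live in an in-memory store, details stay in SQL."""

    def __init__(self, conn):
        self.conn = conn
        # Build the aggregate store once from the relational detail data.
        self.totals_by_day = dict(
            conn.execute("SELECT date, SUM(amt) FROM sale GROUP BY date"))

    def total_for_day(self, day):
        """Aggregated query: answered from the pre-computed store."""
        return self.totals_by_day.get(day, 0)

    def sales_for_day(self, day):
        """Detail query: passed through to the relational database."""
        return self.conn.execute(
            "SELECT prodId, storeId, amt FROM sale WHERE date = ? ORDER BY rowid",
            (day,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)",
                 [("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s1", 2, 44)])
server = HolapServer(conn)
```

The caller sees one interface, while the server decides per request whether the multidimensional store or the relational database is the right source.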

Page 35: Data ware house architecture

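Data Flow in HOLAP

Figure: the client has a multidimensional viewer and a relational viewer. The multidimensional viewer has multi-dimensional access to the MDBMS server, which holds the multi-dimensional data. The relational viewer reads via SQL from the RDBMS server, which holds user data, meta data, and derived data. The MDBMS server reads from the RDBMS server via SQL and can reach through to it (SQL reach-through).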

Page 36: Data ware house architecture

Front end tools

Figure: front-end tools present query results as reports, graphs, pie charts, and bar charts, on a computer or a mobile phone.

Page 37: Data ware house architecture

Data mart


Page 38: Data ware house architecture
Page 39: Data ware house architecture

THANK YOU