
Data Warehousing (The Need, Importance & the Big Picture)

Naveed Iqbal, Assistant Professor

NUCES, Islamabad Campus (Lecture Slides Week # 1)

NUCES, Islamabad Campus Data Warehousing - Fall 2012 2

Why this Course?

The world is changing (in fact, it has already changed).

Either change or be left behind.

Missing opportunities or going in the wrong direction has prevented us from growing.

What is the right direction?

Harnessing data in the knowledge-driven economy.

Doing what cannot be automated or is difficult to automate.


The Need of the Time

Drowning in data AND/BUT starving for information.

Knowledge is power BUT Intelligence is absolute/super power.


The Need of the Time

Data → Information → Knowledge → Intelligence → POWER ($/£)

Evolution of Information Systems


Business Intelligence


Visualization


Data Warehousing – the big picture

[Figure: data warehouse architecture in four tiers]

Data sources (Tier 0): operational databases, semistructured sources, www data, archived data

Extract, Transform, Load (ETL)

Data Warehouse Server (Tier 1): data warehouse, data marts, metadata

OLAP Servers (Tier 2): MOLAP, ROLAP

Clients (Tier 3): query/reporting, analysis and data mining tools

Users: IT users, business users


Approach of the Course

Develop an understanding of the underlying RDBMS concepts.

Apply these concepts to VLDB / DSS environments and understand where and why they break down.

Expose the differences between an RDBMS and a Data Warehouse in the context of VLDB.

Provide the basics of DSS tools such as OLAP and Data Mining, and demonstrate their applications.

Demonstrate the application of DSS concepts and the limitations of OLTP concepts through lab exercises.


Summary of the Course

Introduction & Background

Extract-Transform-Load (ETL)

Normalization & De-Normalization

Dimensional Modeling

Online Analytical Processing (OLAP)

Data Quality Management (DQM)

Need for Speed (Parallelism, Join and Indexing Techniques)

DWH Implementation Steps

Complete Implementation Case Study

Lab and Tool Usage


Books

Reference Books

Golfarelli & Rizzi, Data Warehouse Design – Modern Principles and Methodologies, McGraw-Hill

W. H. Inmon, Building the Data Warehouse, John Wiley & Sons Inc., NY

R. Kimball, The Data Warehouse Toolkit, John Wiley & Sons Inc., NY

A. Abdullah, “Data Warehousing for Beginners: Concepts & Issues”.

Paulraj Ponniah, Data Warehousing Fundamentals, John Wiley & Sons Inc., NY

. . .


Course Execution Plan

Lecturing / Discussions

Lab Work + Tutorials

Assignments / Case Studies

Projects

Marks Breakup:

Mid-I: 12%

Mid-II: 13%

Final*: 40%

Quizzes: 6%

Assignments/Case Study: 9%

Projects*: 20%

* Mandatory (Missing means F)


Code of Conduct

Regularity: attendance criteria as per university policy

Punctuality: no entry after 5 minutes from class start time (N/A for habitual latecomers)

Discipline: ABSOLUTELY NO COMPROMISE

Positive Attitude

High Level of Class Participation

No Plagiarism, Cheating …

No Change in Deadlines

No Usage of Mobile / Other Devices


Scenario 1

ABC Pvt Ltd is a company with branches at Karachi, Quetta, Peshawar and Lahore. The Sales Manager wants a quarterly sales report. Each branch has a separate operational system.


Scenario 1: ABC Pvt Ltd.

[Figure: branches at Karachi, Quetta, Peshawar and Lahore; the Sales Manager requests sales per item type per branch for the first quarter.]


Solution 1: ABC Pvt Ltd.

Extract sales information from each database.

Store the information in a common repository at a single site.
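
The two steps above can be sketched as follows. This is only an illustration under stated assumptions: in-memory lists stand in for the four branch systems, and the row layout, the sample figures, the item types and the helper names `extract` and `quarterly_report` are hypothetical, not taken from the slides.

```python
from collections import defaultdict
from datetime import date

# Hypothetical stand-ins for the four separate branch systems.
# Each row is (item_type, sale_date, amount).
BRANCH_SYSTEMS = {
    "Karachi": [("Dairy", date(2012, 1, 15), 120.0), ("Bread", date(2012, 2, 3), 80.0)],
    "Quetta": [("Dairy", date(2012, 3, 9), 60.0)],
    "Peshawar": [("Bread", date(2012, 4, 2), 75.0)],  # falls in the second quarter
    "Lahore": [("Dairy", date(2012, 1, 20), 90.0), ("Dairy", date(2012, 3, 30), 10.0)],
}


def extract(branch_systems):
    """Copy sales rows from every branch into one common repository."""
    repository = []
    for branch, rows in branch_systems.items():
        for item_type, sale_date, amount in rows:
            repository.append(
                {"branch": branch, "item_type": item_type,
                 "sale_date": sale_date, "amount": amount}
            )
    return repository


def quarterly_report(repository, year, quarter):
    """Total sales per (branch, item type) for one quarter of one year."""
    first_month = 3 * (quarter - 1) + 1
    months = range(first_month, first_month + 3)
    totals = defaultdict(float)
    for row in repository:
        sold = row["sale_date"]
        if sold.year == year and sold.month in months:
            totals[(row["branch"], row["item_type"])] += row["amount"]
    return dict(totals)


warehouse = extract(BRANCH_SYSTEMS)
report = quarterly_report(warehouse, 2012, 1)
for (branch, item_type), total in sorted(report.items()):
    print(f"{branch:10s} {item_type:8s} {total:8.2f}")
```

Because every row carries its branch name once it is in the common repository, the Sales Manager's report becomes a single grouping over one data set instead of four separate queries.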


Solution 1: ABC Pvt Ltd.

[Figure: data from the Karachi, Quetta, Peshawar and Lahore branches is loaded into the Data Warehouse; the Sales Manager obtains the report through query and analysis tools.]


Scenario 2

One Stop Shopping Super Market has a huge operational database. Whenever executives want a report, the OLTP system becomes slow and data entry operators have to wait for some time.


Scenario 2: One Stop Shopping

[Figure: data entry operators and management share the operational database; the operators wait while a management report runs.]


Solution 2

Extract the data needed for analysis from the operational database.

Store it in the warehouse.

Refresh the warehouse at regular intervals so that it contains up-to-date information for analysis.

The warehouse will contain data with a historical perspective.
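
A minimal sketch of such a periodic refresh, assuming the operational table has an ever-increasing integer key that can serve as a "last loaded" marker. The table names `sales` and `sales_history`, the columns and the `refresh` helper are hypothetical; two in-memory SQLite databases stand in for the OLTP system and the warehouse.

```python
import sqlite3

# Two separate in-memory databases stand in for the OLTP system and the warehouse.
oltp = sqlite3.connect(":memory:")
dwh = sqlite3.connect(":memory:")

oltp.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER, sold_on TEXT)"
)
dwh.execute(
    "CREATE TABLE sales_history (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER, sold_on TEXT)"
)


def refresh(oltp_conn, dwh_conn):
    """Copy rows added since the last refresh; return how many were loaded."""
    last_id = dwh_conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM sales_history"
    ).fetchone()[0]
    new_rows = oltp_conn.execute(
        "SELECT id, item, qty, sold_on FROM sales WHERE id > ?", (last_id,)
    ).fetchall()
    dwh_conn.executemany("INSERT INTO sales_history VALUES (?, ?, ?, ?)", new_rows)
    dwh_conn.commit()
    return len(new_rows)


# Data entry operators keep inserting transactions into the OLTP system.
oltp.executemany(
    "INSERT INTO sales (item, qty, sold_on) VALUES (?, ?, ?)",
    [("milk", 2, "2012-09-01"), ("bread", 1, "2012-09-01")],
)
first_load = refresh(oltp, dwh)

oltp.execute(
    "INSERT INTO sales (item, qty, sold_on) VALUES (?, ?, ?)", ("milk", 5, "2012-09-02")
)
second_load = refresh(oltp, dwh)

# Management reports read the warehouse only, so they do not compete with data entry.
report = dwh.execute(
    "SELECT item, SUM(qty) FROM sales_history GROUP BY item ORDER BY item"
).fetchall()
print(first_load, second_load, report)
```

This key-based marker only picks up newly inserted rows. Capturing updates or deletions in the operational database would need a different change-detection scheme, which is outside this sketch.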


Solution 2

[Figure: data entry operators run transactions against the operational database; data is extracted into the Data Warehouse, from which the manager gets the report.]


Scenario 3

Cakes & Cookies is a small, new company. The President of the company wants the company to grow. He needs information so that he can make correct decisions.


Solution 3

Improve the quality of data before loading it into the warehouse.

Perform data cleaning and transformation before loading the data.

Use query analysis tools to support ad hoc queries.
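
The cleaning and transformation step can be illustrated roughly as below. Everything specific here is an assumption made for the example: the raw rows, the field names, the two accepted date formats and the rule that exact duplicates are dropped. Real cleaning rules depend on the source systems.

```python
from datetime import datetime

# Hypothetical raw rows as they might arrive from a source system.
RAW_ROWS = [
    {"product": " cheese ", "city": "LAHORE", "units": "8", "sold_on": "2012-04-03"},
    {"product": "Cheese", "city": "lahore", "units": "8", "sold_on": "2012-04-03"},  # duplicate
    {"product": "Swiss Rolls", "city": "Karachi", "units": "", "sold_on": "2012-02-10"},  # no units
    {"product": "wheat bread", "city": "Karachi", "units": "10", "sold_on": "03/04/2012"},
]

# Assumed input formats: ISO dates and day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known format; return None if none of them matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def clean(rows):
    """Standardize, validate and de-duplicate rows; return (cleaned, rejected)."""
    seen, cleaned, rejected = set(), [], []
    for row in rows:
        sold_on = parse_date(row["sold_on"].strip())
        units = row["units"].strip()
        if sold_on is None or not units.isdigit():
            rejected.append(row)  # keep bad rows aside for inspection
            continue
        record = (
            row["product"].strip().title(),
            row["city"].strip().title(),
            int(units),
            sold_on,
        )
        if record in seen:
            continue  # drop exact duplicates after standardization
        seen.add(record)
        cleaned.append(record)
    return cleaned, rejected


cleaned, rejected = clean(RAW_ROWS)
print(cleaned)
print(rejected)
```

Rejected rows are returned rather than silently discarded, so that data quality problems can be traced back to their source before the next load.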


Solution 3

[Figure: the President uses a query and analysis tool on the Data Warehouse; a chart of sales over time supports expansion and improvement decisions.]


Case Study

AFCO Foods & Beverages is a new company which produces dairy, bread and meat products, with its production unit located at Gujranwala.

Their products are sold in all regions of Pakistan.

They have sales units at the provincial headquarters.

The President of the company wants sales information.


Sales Information

Report: The number of units sold.

113

Report: The number of units sold over time

January February March April

14 41 33 25


Sales Information

Report: The number of items sold for each product over time

Product       Jan   Feb   Mar   Apr
Wheat Bread     -     -     6    17
Cheese          6    16     6     8
Swiss Rolls     8    25    21     -


Sales Information

Report: The number of items sold in each city for each product over time

City      Product       Jan   Feb   Mar   Apr
Karachi   Wheat Bread     -     -     3    10
          Cheese          3    16     6     -
          Swiss Rolls     4    16     6     -
Lahore    Wheat Bread     -     -     3     7
          Cheese          3     -     -     8
          Swiss Rolls     4     9    15     -


Sales Information

Report: The number of items sold and income in each region for each product over time (Rs = income in rupees, U = units sold).

City      Product       Jan Rs  Jan U  Feb Rs  Feb U  Mar Rs  Mar U  Apr Rs  Apr U
Karachi   Wheat Bread        -      -       -      -    7.44      3   24.80     10
          Cheese          7.95      3   42.40     16   15.90      6       -      -
          Swiss Rolls     7.32      4   29.98     16   10.98      6       -      -
Lahore    Wheat Bread        -      -       -      -    7.44      3   17.36      7
          Cheese          7.95      3       -      -       -      -   21.20      8
          Swiss Rolls     7.32      4   16.47      9   27.45     15       -      -
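
The sales reports in this case study are the same facts summarized at different levels of detail. The sketch below, offered only as an illustration, stores the unit counts at the finest level (city, product, month) and derives the coarser reports by summing over the dimensions that are left out. The `roll_up` helper is hypothetical, and the month assigned to each sparse cell is inferred from the monthly totals (14, 41, 33, 25) rather than stated explicitly in the slides.

```python
from collections import defaultdict

# Units sold per (city, product, month), transcribed from the case study.
# Month placement of the sparse cells is inferred from the monthly totals.
FACTS = [
    ("Karachi", "Wheat Bread", "Mar", 3), ("Karachi", "Wheat Bread", "Apr", 10),
    ("Karachi", "Cheese", "Jan", 3), ("Karachi", "Cheese", "Feb", 16),
    ("Karachi", "Cheese", "Mar", 6),
    ("Karachi", "Swiss Rolls", "Jan", 4), ("Karachi", "Swiss Rolls", "Feb", 16),
    ("Karachi", "Swiss Rolls", "Mar", 6),
    ("Lahore", "Wheat Bread", "Mar", 3), ("Lahore", "Wheat Bread", "Apr", 7),
    ("Lahore", "Cheese", "Jan", 3), ("Lahore", "Cheese", "Apr", 8),
    ("Lahore", "Swiss Rolls", "Jan", 4), ("Lahore", "Swiss Rolls", "Feb", 9),
    ("Lahore", "Swiss Rolls", "Mar", 15),
]

DIMENSIONS = ("city", "product", "month")


def roll_up(facts, keep):
    """Sum units over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[i] for i in positions)
        totals[key] += fact[3]
    return dict(totals)


total_units = roll_up(FACTS, ())[()]                     # one grand total
by_month = roll_up(FACTS, ("month",))                    # units over time
by_product_month = roll_up(FACTS, ("product", "month"))  # per product over time

print(total_units)
print(by_month)
print(by_product_month[("Cheese", "Jan")])
```

Keeping the data at the finest grain and aggregating on demand is what lets one warehouse answer all of these reports without storing each one separately.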


Data Warehousing includes

Building Data Warehouse

Online Analysis/Analytical Processing (OLAP)

Presentation

[Figure: RDBMS and flat-file sources pass through cleaning, selection and integration into the warehouse and OLAP server, which serves presentation on the client.]
