ist722 data warehousing project management & requirements gathering michael a. fudge, jr

35
IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr.

Upload: barrie-palmer

Post on 24-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

IST722 Data

WarehousingProject Management &

Requirements Gathering

Michael A. Fudge, Jr.

Page 2: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Please arrange into your

project teams.

Page 3: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Todays’ Agenda:Learn how to get started with a data warehousing initiative….We’ll use the Kimball Approach….

Page 4: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Recall: Kimball Lifecycle

Page 5: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Some Kimball Terminology

• Program – Collection of coordinated projects• Several Data marts will be

created.• Ex. Sales BI program: Build

data marts for internet sales, store sales, and partner sales.

• Project – Single iteration of the entire cycle. • Encompasses a business

process which results in a data mart.• Ex. Build a data mart for

internet sales

Page 6: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

The Relationships between:Programs, Projects, and Data Marts.

Program Project Data Mart

Contains one or more…

Implemented as one or more…

Each project team will work on one project within the same program.

Page 7: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Is Your Organization Prepared To Take This On?• First you must assess your

organization’s readiness:Do you have strong support

from upper management?Is there a compelling business

motivation behind the initiative ?

It is technically feasible with the resources and data you’re given?

Your answer to these questions

should beYES

Page 8: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Group Activity: Choose a projectGroup Project

A

B

C

D

E

F

G

H

TODO:Remember: you’re all

working on the same program.

Each of you needs to identify a project you will work on, based on the data in ExternalSources

Each group should select a different project.

Page 9: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Planning Activities• The Charter• Define the project background• Set project scope and boundaries (what’s excluded).• Identify success criteria for the project• State the business justification

• Assemble the Project Team (at minimum)• Business Lead - In charge of initiative• Project Manager – Manages project• Business Analyst – Collects requirements• Data Architect – Dimensional Modeling / Implementation• ETL Architect – ETL Design / Implementation• BI Architect - BI Design / Implementation

Page 10: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Group Activity: Charter and team

TODO:Write up your charter

for the chosen group project.

Select primary roles for each team member.

Each team member should have 2 roles.

• The Charter• Define the project background• Set project scope and boundaries (what’s

excluded).• Identify success criteria for the project• State the business justification

• The Project Team• Business Lead - In charge of initiative• Project Manager – Manages project• Business Analyst – Collects requirements• Data Architect – Dimensional Modeling /

Implementation• ETL Architect – ETL Design / Implementation• BI Architect - BI Design / Implementation

Page 11: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Planning Activities

• Establish a Communication Plan• How will you keep stakeholders informed?• How often and in what form will you meet?• Who needs to be present at which meetings?

• Create your Project Plan and Task List• Track issues using a change log or issue tracking system.• Hold a Kickoff meeting to get everyone on the same page.

Page 12: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Requirements Gathering

Page 13: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Key Activities of Requirements Gathering• Interviews with business users.• Data Audits – data profiling to assess capabilities of data

sources.• Documentation, including

Interview Write-upsIdentify Business ProcessesEnterprise Bus MatrixPrioritization GridIssues List

Page 14: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Sample Interview Questions

•What type of routine analysis do you perform? What data is used and where do you get it? What do you do with the data once you get it?•Which reports do you use? Which data on the report is

important? If the report were dynamic what would it do differently?• Describe your products. How do you distinguish different

products? How are they categorized? Do categories change over time?

Page 15: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Data Profiling• One important activity is to explore your existing data to get a

sense of • Technical feasibility of the project• Structure and condition of data• Availability of Data Sources

• We call this Data Profiling• If the source data is in a relational database, you can use the

SQL SELECT statement to profile data.• For other sources, Microsoft Excel makes for a good data

profiling tool.

Page 16: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Data Profiling Example• You can use tools like Excel to profile your data,

searching for useful dimensions and facts.

TotalSales Column LabelsRow Labels 2006 2007 2008 Grand TotalUnited States 15,535,349 20,563,610 25,963,721 62,062,681United Kingdom 550,507 5,083,062 6,257,260 11,890,829Canada 3,591,852 2,763,865 5,147,140 11,502,857Australia 2,568,701 2,099,585 5,805,290 10,473,577France 414,245 2,021,672 4,714,497 7,150,415Germany 513,353 593,247 3,574,747 4,681,348NA 186,518 558,762 1,251,447 1,996,727Grand Total 23,360,526 33,683,805 52,714,103 109,758,434

By C

ount

ry

By Year

Total Sales

Page 17: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Critical Skill: Turn business processes into dimensional models!Here’s the process for building a dimensional model from a business process. This dimensional model will eventually become a star schema in your enterprise data warehouse.1. Identify the business process and business process type.2. Identify the facts of the business processes.3. Identify the dimensions of the business process.

Page 18: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Business User Says:

Demo of my amazing modeling skills!!!!

• Facts are the business process measurement events• Dimensions provide the context for that event.

“I need to know:How many sneakers did we sell last week?”

Quantity (Fact)

ProductType

(Attribute ofa Product

Dimension)

Duration of Time

(Attribute ofa Sales Date Dimension)

Business Process(Sales)

What I hear

Page 19: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

#1: Identifying Business Processes 3 types:1. Events or

Transactions2. Workflows a.k.a.

Accumulating Snapshots3. Points in time a.k.a

Periodic Snapshots

Business processes contain facts which we use end up being the fact tables in our ROLAP star schemas.

Transaction

Accumulating Snapshot

PeriodicSnapshot

Page 20: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Transaction Fact

• The most basic fact grain• One row per line in a transaction• Corresponds to a point in space and time• Once inserted, it is not revisited for update• Rows inserted into fact table when transaction or event

occurs• Examples:• Sales, Returns, Telemarketing, Registration Events

Page 21: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Accumulating Snapshot Fact

• Less frequently used, application specific.• Used to capture a business process workflow.• Fact row is initially inserted, then updated as milestones occur • Fact table has multiple date FK that correspond to each milestone • Special facts: milestone counters and lag facts for length of time

between milestones• Examples:• Order fulfillment, Job Applicant tracking, Rental Cars

Page 22: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Periodic Snapshot Fact

• At predetermined intervals snapshots of the same level of details are taken and stacked consecutively in the fact table• Snapshots can be taken daily, weekly, monthly, hourly, etc…• Complements detailed transaction facts but does not replace them• Share the same conformed dimensions but has less dimensions• Examples: • Financial reports, Bank account values, Semester class schedules,

Daily classroom Lab Logins, Student GPAs

Page 23: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Group Activity: Which Fact Table Grain?1. Concert ticket purchases?2. Voter exit polls in an election?3. Mortgage loan application and approval?4. Auditing software use in a computer lab?5. Daily summaries of visitors to websites?6. Tracking Law School applications?7. Attendance at sporting events?8. Admissions to sporting events at 15 minute

intervals?

Transaction

Accumulating Snapshot

Periodic Snapshot

Page 24: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Answers: Which Fact Table Grain?1. Concert ticket purchases? T2. Voter exit polls in an election? T3. Mortgage loan application and approval? AS4. Auditing software use in a computer lab? T5. Daily summaries of visitors to websites? PS6. Tracking Law School applications? AS7. Attendance at sporting events? T8. Admissions to sporting events at 15 minute

intervals? PS

Transaction

Accumulating Snapshot

Periodic Snapshot

Page 25: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

#2 Identify the facts of the business process.• Facts are quantifiable numerical values associated with the business

process.• How much?• How many?• How long?• How often?

• If its not tied to the business process, its not a fact.• For example:• Points Scored == Fact, Player He

Page 26: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

3 Types of Facts•Additive - Fact can be summed across all dimensions. • The most useful kind of fact.• Quantity sold, hours billed.

•Semi-Additive - Cannot be summed across all dimensions, such as time periods.• Sometime these are averaged across the time dimension.• Quantity on Hand, Time logged on to computer.

•Non-Additive - Cannot be summed across any dimension.• These do not belong in the fact table, but with the dimension.• Basketball player height, Retail Price

Page 27: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Group Activity: Facts or Not??Additive? Semi? Non?1. Number of page views on a website?2. The amount of taxes withheld on an employee’s weekly

paycheck?3. Credit card balance.4. Pants waist size? 32, 34, etc…5. Tracking when a student attends class?6. Product Retail Price?7. Vehicle’s MPG rating?8. The number of minutes late employees arrive to work each

day.

Page 28: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Answers: Facts or Not?Additive? Semi? Non?

1. Number of page views on a website? F/A2. The amount of taxes withheld on an employee’s weekly

paycheck? F/A3. Credit card balance. F/S4. Pants waist size? 32, 34, etc… N/A5. Tracking when a student attends class? F/A6. Product Retail Price? N/A7. Vehicle’s MPG rating? N/A8. The number of minutes late employees arrive to work each

day. F/A

Page 29: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

#3: Identify the Dimensions

• Dimensions provide context for our facts.•We can easily identify dimensions because of the “by”

and/or “for” words.• Ex. Total accounts receivables for the IT Department by

Month.• Dimensions have attributes which describe and categorize

their values.• Ex. Student: Major, Year, Dormitory, Gender.

• The attributes help constrain and summarize facts.

Page 30: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Group Activity: Try these

• Identify: Business Process, Fact and Dimensions

1. What’s the total amount of product shipped by sales region for 2010-2014?

2. What’s the average time in days for a student’s application to be processed?

3. How many employees wait more than 15 minutes for a bus to the Manley parking lot?

Page 31: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Q:How do you document this?A: Enterprise Bus Matrix• A key deliverable from requirements gathering, the bus

matrix documents your business processes, facts and dimensions across all projects in your program.

Page 32: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Group Activity: Bus Matrix

TODO:Identify the business

processes, facts and dimensions for your group’s business process.

Your prof will create an enterprise bus matrix based on the entire program.

• Identify Business Processes• Transaction• Periodic Snapshot• Accumulating Snapshot

• Identify Facts of the business process• Should be Additive, or at least Semi-Additive

• Identify the dimensions used by the business process

Page 33: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

Prioritization Grid

Page 34: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

In Summary

• The initial phases of the Kimball Lifecycle are project planning and requirements analysis.• First your organization must assess its readiness to take on the project.• In the project planning phase you should establish the project charter

and assemble the team.• In the requirements phase you should interview business users,

conduct data audits, and write up documentation.• You documentation should include an enterprise bus matrix to layout

business processes & conformed dimensions and a prioritization grid high-value targets.

Page 35: IST722 Data Warehousing Project Management & Requirements Gathering Michael A. Fudge, Jr

IST722 Data

WarehousingProject Management &

Requirements Gathering

Michael A. Fudge, Jr.