Paul K Chen
Chapter 4
Data Warehouse Project Planning & Management
Data Warehouse Fundamentals
Chapter 4 - Objectives
– Review types of development models
– Review the essentials of the system development life cycle and project management functions
– Discuss project team organization, roles, and responsibilities
– Review the data warehouse project scope document
– Consider the warning signs and success factors
– Distinguish between data warehouse projects and OLTP system projects
– Discuss data warehouse deployment
Types of Development Models
The Waterfall Development Model
The Spiral Model
The Iterative Development Model
The Waterfall Development Model
Characteristics:
Encourages the team to gather and define system requirements.
Breaks the complex mission of development into several logical steps (analysis, design, code, test, and so forth): a divide-and-conquer approach.
Ensures each step is executed properly, with good-quality deliverables, validation, and entry and exit criteria for each step.
The Waterfall Development Model
Advantages:
Enables more accurate tracking of project progress and uncovers possible slippages early.
Forces the organization that develops the software system to be more structured and manageable.
Disadvantages:
The process could become too rigid to be efficient and effective.
The Spiral Model
Developed by Boehm in 1988
Characteristics:
Relies heavily on prototyping and risk management, in contrast to the document-driven waterfall approach.
For each portion of the project and for each of its levels of elaboration, the same sequence of steps (a cycle) is involved. For instance, the concept, the software requirements, the design, and the implementation each involve a spiral cycle.
The Spiral Model
Approach:
The first step of each cycle is to identify the objectives of the portion of the product being elaborated, the alternative means of implementing that portion, and the constraints imposed on the application of the alternatives.
The next step is to evaluate the alternatives relative to the objectives and constraints, and to identify and resolve the associated risks.
In addition to prototyping for risk analysis, the spiral model also uses simulations, models, and benchmarks in order to reach the best alternative.
Waterfall Approach vs. Spiral Approach
Structured development (Waterfall Approach): Analysis, design, and coding take place in the traditional waterfall way. Each step is isolated from the others.
[Figure: the steps A (analysis), D (design), and P (programming), drawn once for the waterfall approach and once for the spiral approach.]
Object-oriented development (Spiral Approach): One multifaceted model is used from concept to code. Because one underlying model is used, teams apply analysis, design, and programming concurrently.
The Iterative Development Model
Characteristics:
Begin with a subset of the requirements and develop a subset of the product that satisfies the essential needs of the users.
Based on the analysis of each intermediate product, the requirements and design are modified over a series of iterations, so that the user receives a system that meets evolving needs, with a design improved by feedback and testing.
Combines prototyping with the strengths of the classical waterfall model.
Iterative development is supported by a small-team approach in which each team assumes full responsibility for the system.
System Development Life Cycle – A brief overview
It is a systematic approach to solving business problems. It is divided into seven phases:
– Identifying problems, opportunities, and objectives
– Determining system requirements
– Analyzing system needs
– Designing the recommended systems
– Developing and documenting software
– Testing and maintaining the system
– Implementing and evaluating the systems
System Development Life Cycle – A Brief Overview
Why should a system development project be segmented in phases?
Project Management – Easier to understand and manage its deliverables and track its progress
Resources – Better utilize the resources related to technology, skills, and time
Risk – Minimize commitment and cost in case the project restarts
Project Management Functions
Initiate project – Start the project by assessing the opportunity
Project planning – Determining tasks, schedule, and allocating resources
Establishing project – Defining the project charter and issuing the statement of work
Organization – Organizing staff by function, tools, and environment
Project Management Functions (cont’d)
Administration – Ongoing project reporting and administrative work
Evaluation and control – Monitor project progress by cost, product, and schedule
Termination – Wrapping up the task by doing the project summary and archives
Five Major Project Fundamentals For System Analysts
The five project fundamentals the system analysts must handle are:
Project initiation
Determining project feasibility
Project scheduling
Activity planning and control
Managing system analysis team members
Project Initiation
Projects are initiated for two broad reasons:
– Problems that lend themselves to systems solutions
– Opportunities for improvement through
» Upgrading systems
» Altering systems
» Installing new systems
Project Feasibility
A feasibility study assesses the operational, technical, and economic merits of the proposed project
There are three types of feasibility:
– Technical feasibility
– Economic feasibility
– Operational feasibility
Technical Feasibility
Technical feasibility assesses whether the current technical resources are sufficient for the new system
If they are not sufficient, can they be upgraded to provide the level of technology necessary for the new system?
Economic Feasibility
Economic feasibility determines whether the time and money are available to develop the system
Includes the purchase of
– New equipment
– Hardware
– Software
Operational Feasibility
Operational feasibility determines if the human resources are available to operate the system once it has been installed
Users that do not want a new system may prevent it from becoming operationally feasible
Determining Project Feasibility (Key Issues)
Value and Expectations
Risk Assessment
Top-down or Bottom-up
Build or Buy
Single Vendor or Best-of-Breed
Tools for Planning & Scheduling Activities
Gantt chart and PERT (Program Evaluation and Review Technique) diagram; spreadsheet
Computer-based project scheduling, such as Microsoft Project or Computer Associates' CA-SuperProject
Gantt Chart
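[Figure: a Gantt chart with activities A through E plotted against a time axis, and a PERT diagram with events 10, 20, 30, 40, and 50 and activity labels A, 4; B, 2; C, 5; D, 3; E, 6.]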
Gantt vs. PERT Diagram
In a PERT diagram, the circles are called events.
The longest path through the diagram is called the critical path.
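A minimal Python sketch of finding the longest (critical) path may make the idea concrete. The durations are the activity labels from the figure (A, 4; B, 2; C, 5; D, 3; E, 6). The predecessor lists are an assumption made only for illustration, since the arrows of the diagram are not recoverable from the transcript, and the function name `critical_path` is hypothetical.

```python
from functools import lru_cache

# Durations taken from the figure's activity labels.
DURATIONS = {"A": 4, "B": 2, "C": 5, "D": 3, "E": 6}

# ASSUMPTION: these dependencies are hypothetical; the transcript does not
# preserve the arrows of the PERT diagram.
PREDECESSORS = {"A": [], "B": [], "C": ["A"], "D": ["B"], "E": ["C", "D"]}


def critical_path(durations, predecessors):
    """Return (total_duration, activities) for the longest path.

    Assumes the activity network has no cycles.
    """

    @lru_cache(maxsize=None)
    def longest_to(activity):
        # Longest chain of work that ends with this activity.
        best_len, best_path = 0, ()
        for pred in predecessors[activity]:
            length, path = longest_to(pred)
            if length > best_len:
                best_len, best_path = length, path
        return best_len + durations[activity], best_path + (activity,)

    length, path = max(longest_to(a) for a in durations)
    return length, list(path)


if __name__ == "__main__":
    print(critical_path(DURATIONS, PREDECESSORS))
```

Any delay to an activity on the returned path delays the whole project, which is why the critical path is the one a project manager watches most closely.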
Activity Planning and Control
Begin planning a project by breaking it into these three major activities:
– Analysis
– Design
– Implementation
Activity Planning and Control
Refine the planning and scheduling of analysis activities by adding detailed tasks and establishing the following milestones:
– Data gathering
– Data flow and decision analysis
– Proposal preparation
Data Warehouse Project Team: Roles and Responsibilities
Executive Sponsor – Direction, support, arbitration
Project Manager – Assignment, monitoring, control
User Liaison Manager – Coordination with user group
Lead Architect – Architecture design
Business Analyst – Requirement definition
Data Modeler – Relational and dimensional modeling
Data Warehouse Administrator – DBA
Quality Assurance Analyst – Quality control for warehouse data
Testing Coordinator – Program, system, and tool testing
End-user Application Specialist – Confirmation of data meanings/relationships
Development Programmer – In-house programming and scripts
Steps in Ascertaining Hardware and Software Needs
– Inventory computer hardware already in the organization
– Estimate both current and projected workload for the system
– Evaluate the performance of hardware and software using some predetermined criteria
– Choose the vendor according to the evaluation
– Acquire the hardware and software from the selected vendor
– Acquire the hardware and software in conformance with your enterprise architecture
– The acquisition of the hardware and software must be justified by a business process that supports either short-term (tactical) or long-term (strategic) goals.
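One common way to evaluate against predetermined criteria and then choose a vendor is a weighted scoring matrix. The sketch below assumes that technique. The slides do not prescribe a scoring method, and the criteria names, weights, ratings, and vendor names are all hypothetical.

```python
def score_vendors(weights, ratings):
    """Rank vendors by weighted score, best first.

    weights: {criterion: weight}
    ratings: {vendor: {criterion: rating}}
    """
    totals = {
        vendor: sum(weights[c] * vendor_ratings[c] for c in weights)
        for vendor, vendor_ratings in ratings.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical criteria and ratings, for illustration only.
    weights = {"performance": 5, "cost": 3, "support": 2}
    ratings = {
        "Vendor X": {"performance": 8, "cost": 6, "support": 7},
        "Vendor Y": {"performance": 6, "cost": 9, "support": 8},
    }
    for vendor, total in score_vendors(weights, ratings):
        print(vendor, total)
```

Fixing the weights before looking at any vendor keeps the criteria predetermined, so the evaluation cannot be tuned after the fact to favor a preferred product.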
Data Warehouse Project Scope Document
I Executive Summary
-- Business needs
II Project Background
-- How did the project start?
-- Who is the sponsor?
III Project Definition
-- Project Objectives
-- Project Organization
-- Project Critical Success Factors
-- Measurements of Success
Data Warehouse Project Scope Document
IV Project Scope
-- What's in the Data Warehouse?
-- What's not in the Data Warehouse?
-- Samples of Queries & Reports
V Methodology and Approach
-- Methodology Employed
-- Techniques Employed
Data Warehouse Project Scope Document
VI Project Cost/Benefits
VII Project Schedule, Budget and Resources
-- The plan should include the following milestones:
Logical Data Modeling
Data Warehouse Data Modeling
Data Warehouse Physical Model
Source System of Record
Extraction/Transformation Program
Populated Data Warehouse
Populated Metadata
End User Access Application
End User Training
Ongoing Support Plan
Data Warehouse Project Scope Document
VIII Project Planning Assumptions and Issues
-- Project Assumptions
-- Project Risks
-- Project Contingencies
IX Expected Follow-on Projects
Summary
Project management consists of these four essential elements:
Planning (an iterative process)
– Determining the deliverables
– Estimating efforts and cost
– Projecting the resources
Organizing
– Assembling the team
– Defining and establishing the structure of the team
– Creating a productive environment
Summary
Controlling the project
– Monitoring the progress
– Reporting performance and variances
– Adjusting resources
Leading the project
– Emphasizing human factors: motivation, team spirit, delegation
Consider the warning signs and success factors
Warning Sign: The data requirements definition phase is well past the target date.
Indication: Suffering from “analysis paralysis”.
Action: Stop the capturing of unwanted information. Remove any problems by meeting with users. Set a firm final target date.
Warning Sign: Need to write too many in-house programs.
Indication: Selected third-party tools running out of steam.
Action: If there is time and budget, get different tools. Otherwise increase programming staff.
Warning Sign: Users not cooperating to provide details of data.
Indication: Possible turf concerns over data ownership.
Action: Work with executive sponsor to resolve the issue.
Consider the warning signs and success factors (cont’d)
Warning Sign: Users not comfortable with the query tools.
Indication: Users not trained adequately.
Action: First ensure that the selected query tool is appropriate. Then provide additional training.
Warning Sign: Continuing problems with data brought over to the staging area.
Indication: Data transformation and mapping not complete.
Action: Revise all data transformation and integration routines. Ensure that no data is missing. Include the user representative in the verification process.
Data Warehouse Project Different From OLTP System Project
Data Acquisition:
– Large number of sources
– Many disparate sources
– Different computing platforms
– Outside sources
– High initial load
– Ongoing data feeds
Data Storage:
– Storage of large data volumes
– Rapid growth
– Need for parallel processing
– Data storage in staging area
– Multiple index types
– Several index files
Information Delivery:
– Several user types
– Queries stretched to limits
– Multiple query types
– Web-enabled
– Multidimensional analysis
– OLAP functionality
– Metadata management
Data Warehouse Project Different From OLTP System Project
Data Acquisition:
– Data replication considerations
– Difficult data integration
– Complex data transformations
– Data cleaning
Data Storage:
– Storage of newer data types
– Archival of old data
– Compatibility with tools
– RDBMS & MDDBMS
Information Delivery:
– Interfaces to DSS applications
– Feed into data mining
– Multi-vendor tools
Major Deployment Activities
Complete User Acceptance
Finish final testing of all aspects of user interface including system performance.
Perform Initial Loads
Load dimension tables followed by the fact tables. Create aggregate tables.
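The load order matters because fact rows reference dimension keys. The sketch below illustrates that sequence with Python's built-in `sqlite3` module. The table names (`dim_product`, `fact_sales`, `agg_sales_by_product`), the columns, and the function `initial_load` are hypothetical, and a real warehouse load would normally go through an ETL tool or a bulk loader rather than row inserts.

```python
import sqlite3


def initial_load(conn, products, sales):
    """Load a dimension table, then the fact table, then build an aggregate."""
    # Enforce references from facts to dimensions (off by default in SQLite).
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE dim_product ("
        " product_key INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE fact_sales ("
        " product_key INTEGER NOT NULL REFERENCES dim_product(product_key),"
        " sale_date TEXT NOT NULL,"
        " amount REAL NOT NULL)"
    )
    # Step 1: dimension rows first, so every fact row can find its key.
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)", products)
    # Step 2: fact rows, which reference the dimension keys loaded above.
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", sales)
    # Step 3: aggregate table derived from the loaded facts.
    conn.execute(
        "CREATE TABLE agg_sales_by_product AS"
        " SELECT product_key, SUM(amount) AS total_amount"
        " FROM fact_sales GROUP BY product_key"
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample rows, for illustration only.
    initial_load(
        conn,
        products=[(1, "Widget"), (2, "Gadget")],
        sales=[(1, "2024-01-01", 10.0), (1, "2024-01-02", 5.0), (2, "2024-01-01", 7.5)],
    )
    print(conn.execute("SELECT * FROM agg_sales_by_product ORDER BY product_key").fetchall())
```

With foreign keys enforced, a fact row whose key is missing from the dimension table is rejected, which is the failure that loading dimensions first avoids.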
Major Deployment Activities (cont’d)
Get User Desktops Ready
Install all the needed desktop tools. Test each client machine.
Complete Initial User Training
Train the users on data warehouse concepts, relevant contents, and data access tools.
Major Deployment Activities (cont’d)
Institute Initial User Support
Set up support to assist the users in basic usage, answer questions, and provide hand-holding.
Deploy in Stages
Divide the deployment into manageable stages in agreement with users.
Deploy in Stages
Top-down approach: Deploy the overall enterprise data warehouse (E-R model) followed by the dependent data marts, one by one.
Bottom-up approach: Gather departmental requirements, then plan and deploy the independent data marts, one by one.
Practical approach: Deploy the subject data marts (dimensional model), one by one, with fully conformed dimensions and facts, according to a preplanned sequence.
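For data marts deployed one by one to fit together later, the dimensions they share need to agree from mart to mart (conformed dimensions, in dimensional-modeling terms). The following sketch shows one simple check under a deliberately narrow assumption: a dimension counts as conformed when every mart defines it with the same set of attributes. The function name `find_nonconformed` and the sample marts are hypothetical.

```python
def find_nonconformed(marts):
    """Return {dimension: {mart: attributes}} for dimensions that differ.

    marts: {mart_name: {dimension_name: [attribute, ...]}}
    """
    seen = {}
    for mart, dimensions in marts.items():
        for dimension, attributes in dimensions.items():
            seen.setdefault(dimension, {})[mart] = frozenset(attributes)
    # A dimension is flagged when marts disagree on its attribute set.
    return {
        dimension: definitions
        for dimension, definitions in seen.items()
        if len(set(definitions.values())) > 1
    }


if __name__ == "__main__":
    # Hypothetical mart definitions, for illustration only.
    marts = {
        "sales": {
            "date": ["date_key", "day", "month"],
            "product": ["product_key", "name"],
        },
        "inventory": {
            "date": ["date_key", "day", "month"],
            "product": ["product_key", "sku"],
        },
    }
    print(sorted(find_nonconformed(marts)))
```

Running a check like this before each stage of the preplanned sequence would flag a mart whose shared dimension has drifted from the others.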
Considerations for A Pilot
Types of pilot deployment:
Proof-of-technology: Intended only to prove new technology for IT.
Comprehensive test: Only intended for IT to test all infrastructure/architecture.
Proof-of-concept: Small-scale, works with limited data, not suitable for integration.
Considerations for A Pilot (cont’d)
User tool appreciation: Only intended for users to test and become familiar with tools.
Broad business: Early deliverable with broader scope, may be integrated.
Expandable seed: Manageable and simple, but designed for integration.