Copyright © 2013, SAS Institute Inc. All rights reserved.


PROVEN PRACTICES FOR PREDICTIVE MODELING

MARY-ELIZABETH (“M-E”) EDDLESTONE
PRINCIPAL SYSTEMS ENGINEER, ANALYTICS

CONTRIBUTIONS FROM:
DARIUS BAER
DAVID OGDEN
DOUG WIELENGA

BROUGHT TO YOU BY SAS CUSTOMER LOYALTY


BEST PRACTICES DISCLAIMERS

• The choice of “Best Practices” is highly subjective.

• Certain suggested practices may not be suitable for a particular situation.

• It is the responsibility of a data mining practitioner to critically evaluate methods and select the best method for a particular situation.

• This presentation represents the opinions of the contributors.


BEST PRACTICES TO HELP YOU MEET AND EXCEED YOUR GOALS

• Faster model development
• More useful models
• Superior models


BEST PRACTICES FORMAT OF PRESENTATION

• Background & General Guidance
• Developing the Data
• Developing & Delivering the Model


BACKGROUND & GENERAL GUIDANCE
ANALYTICS CYCLE AND THE MODELING PROCESS


ANALYTICS LIFECYCLE

1. Formulate Problem
2. Data Preparation
3. Data Exploration
4. Transform & Select
5. Develop Models
6. Validate Models
7. Deploy Model
8. Evaluate & Monitor Model


BEST PRACTICE IT’S ALL ABOUT BALANCE

• Many factors need to be considered and optimized:
  • Time
  • People
  • Money
  • IT Resources

[Diagram: People, Technology, Business Process]


ANALYTICS INFRASTRUCTURE

DECISION ANALYTICS

Fact-based decision making requires the right technology, talent, processes and culture.

TECHNOLOGY
• BI reporting
• Web portals / dashboards
• Information management
• Problem-specific business solutions
• Predictive analytics
• Hardware

BUSINESS PROCESS
• Continuous Process Improvement
• Planning
• Project methodology
• Standards

PEOPLE
• Vision & Leadership
• Team composition
• Enterprise authority


LIFECYCLE BEST PRACTICE INVOLVE ALL THE RELEVANT PEOPLE/ROLES

BUSINESS MANAGER
• Domain Expert
• Makes Decisions
• Evaluates Processes & ROI

BUSINESS ANALYST
• Data Exploration
• Data Visualization
• Report Creation

DATA MINER
• Exploratory Analysis
• Descriptive Segmentation
• Predictive Modeling
• Model Validation & Registration

IT/SYSTEMS MANAGEMENT
• Model Validation
• Model Deployment
• Model Monitoring
• Data Preparation

[Diagram: the roles above shown alongside the analytics lifecycle (Formulate Problem, Data Preparation, Data Exploration, Transform & Select, Develop Models, Validate Models, Deploy Model, Evaluate & Monitor Model)]


BEST PRACTICE WEAR MANY HATS

Have a passion to understand not just analytics, but the business and technology.


BEST PRACTICE USE THE TECHNOLOGY AND METHOD THAT FITS THE JOB

Every tool and method has advantages and disadvantages. Whenever possible, select the tool or method that balances long-term goals for the entire process.


BEST PRACTICE BEGIN WITH THE END IN MIND

• What?
• How?
• Who?
• When?


• What is the overarching strategic objective/initiative?

• How will the model be used?

• How will it be put into production?


• Who will be affected by the use of the model?

• Who needs to be convinced of the value of the model?

• When will the model be used?


BEST PRACTICES BUSINESS CONSIDERATIONS BEFORE YOU MODEL

• Thoroughly understand the business/marketing objectives
• Detail the precise (planned) usage for the output
• Define the target variable (the outcome being modeled/predicted)
• Formulate a theoretical model, Y = f(X1, X2, …), and fill in the likely X’s


BEST PRACTICE

SEMMA (Sample, Explore, Modify, Model, Assess) Process for Model Development


BEST PRACTICE MODELING APPROACH

1. Sample: training set(s), validation set(s), holdout test set

2. Explore: min, max, mean, median, missing values, levels (categorical cardinality)

3. Modify: filtering outliers, reducing cardinality, correcting multicollinearity, imputations, non-linear transformations


4. Model: variable selection, various model formulations, iterative cycle, insights & client reviews

5. Assess: performance criteria and review


7. Final Assessment & Testing
8. Profile characteristics & indicators
9. Document results
10. Prepare (production-ready) data collection and score code
11. Monitor model performance
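
The Explore step in the approach above (min, max, mean, median, missing values, levels) can be sketched in a few lines. This is a minimal illustration in Python rather than in SAS, and the column names and values are hypothetical, made up only to show the shape of the summary.

```python
import statistics


def explore_numeric(values):
    """Summarize one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "missing": len(values) - len(present),
    }


def explore_categorical(values):
    """Count distinct levels (cardinality) and missing values."""
    present = [v for v in values if v is not None]
    return {"levels": len(set(present)), "missing": len(values) - len(present)}


# Hypothetical columns, invented for illustration only.
income = [42.0, 55.5, None, 61.0, 38.5]
region = ["N", "S", "S", None, "E"]

income_summary = explore_numeric(income)
region_summary = explore_categorical(region)
```

A categorical input with a very large number of levels, or a numeric input with many missing values, is a candidate for the Modify step.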


DEVELOPING THE DATA


BEST PRACTICES OPTIMIZING DATA

• Determining Data
• Selecting Target
• Preparing Variables


DETERMINING DATA


BEST PRACTICES TECHNICAL CONSIDERATIONS BEFORE MODELING

• Brainstorm all potential input data elements

• Identify source systems, specific data fields, availability/priority/level-of-effort of data

• Finalize data to be collected


• Formulate structure and layout of modeling dataset to be built

• Devil-in-the-details: filters, timeframe of history, etc…

• Build modeling dataset


BEST PRACTICE ALLOW SUFFICIENT TIME FOR ALL ASPECTS



SAMPLE

• (Over) Sampling
• Decisioning
• Partitioning


SAMPLING


SAMPLE TO SAMPLE OR NOT?

• Sampling is a valuable tool that can be used to great effect.

• If computing resources are no object, it’s possible to use all data.

• When resource constrained, try increasing sample sizes as model development progresses.

• When model is nearly finalized, try different seeds for samples to ensure model stability.

[Diagram: two SAMPLE subsets drawn from ALL DATA]
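
The advice to try different seeds can be illustrated with a small sketch. It assumes a hypothetical population of 10,000 rows with a 20% event rate, and it uses the sampled event rate as a stand-in for whatever model statistic you would actually compare across samples.

```python
import random


def draw_sample(rows, fraction, seed):
    """Simple random sample without replacement, reproducible via the seed."""
    rng = random.Random(seed)
    k = int(len(rows) * fraction)
    return rng.sample(rows, k)


def event_rate(rows):
    """Stand-in for a fitted-model summary: share of rows with target == 1."""
    return sum(r["target"] for r in rows) / len(rows)


# Hypothetical population: 10,000 rows, 20% events.
population = [{"id": i, "target": 1 if i % 5 == 0 else 0} for i in range(10_000)]

# Repeat the same 10% sample with five different seeds and compare the results.
rates = [event_rate(draw_sample(population, 0.10, seed)) for seed in (1, 2, 3, 4, 5)]
spread = max(rates) - min(rates)
```

A small spread suggests the sample size is adequate for that statistic; a large spread is a hint to increase the sample size before finalizing the model.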


SAMPLE WHAT ABOUT OVERSAMPLING?

• It depends.
• Frequently one needs to oversample in order to allow algorithm(s) to identify effect, especially with rare targets.
• Only oversample as much as you need to in order to obtain a model that makes sense from a business perspective. This is highly subjective.
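
One common way to oversample a rare target is to keep every event and draw only as many non-events as are needed to reach a chosen event share. The sketch below assumes that interpretation; the data, the 20% target share, and the function name are hypothetical.

```python
import random


def oversample_rare(rows, target_share, seed=0):
    """Keep every event and sample just enough non-events so that
    events make up roughly target_share of the result."""
    rng = random.Random(seed)
    events = [r for r in rows if r["target"] == 1]
    non_events = [r for r in rows if r["target"] == 0]
    # Number of non-events needed for the requested event share.
    n_non = round(len(events) * (1 - target_share) / target_share)
    n_non = min(n_non, len(non_events))
    return events + rng.sample(non_events, n_non)


# Hypothetical data: 5,000 rows with a 2% event rate (100 events).
data = [{"id": i, "target": 1 if i % 50 == 0 else 0} for i in range(5_000)]

balanced = oversample_rare(data, target_share=0.20)
share = sum(r["target"] for r in balanced) / len(balanced)
```

Because the sample no longer reflects the population event rate, predicted probabilities from a model fit on it need to be adjusted back using the original priors.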


SELECTING TARGET


CHOOSING YOUR TARGET

• Choosing the Target
• Response vs. Propensity
• Number of Models


DECISIONING


WEIGHTING YOUR DECISIONS

• Expected Profit
• Decision Boundaries


UNDERSTANDING EXPECTED PROFIT

• Consider this game:
  • Flip a fair coin one time
  • If it is heads, you win $10.00
  • Cost of playing one time is $1.00

Do you want to play this game?


E(Profit) = 0.5 * (10 - 1) + 0.5 * (-1) = 4.50 + (-0.50) = 4.00
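
The same calculation can be written as a small function, which makes it easy to try other prizes, costs, or probabilities. The function name and the outcome-list layout are illustrative choices, not part of the original slides.

```python
def expected_profit(outcomes):
    """Sum of probability * profit over mutually exclusive outcomes."""
    total_prob = sum(p for p, _ in outcomes)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * profit for p, profit in outcomes)


prize, cost = 10.00, 1.00
coin_game = [
    (0.5, prize - cost),  # heads: win the prize, net of the cost to play
    (0.5, -cost),         # tails: lose the cost to play
]
profit = expected_profit(coin_game)  # 4.0, matching the calculation above
```

Since the expected profit is positive, playing is worthwhile on average, which is the reasoning behind weighting model decisions by profit rather than by accuracy alone.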


DECISIONS COMBINING THE WEIGHTS WITH PRIORS


DECISIONS DETERMINING DECISION WEIGHTS

• To determine the amount of weight to assign to the rare event in a binary target, calculate this ratio:

Prob(common event) / Prob(rare event)

• Specify the weight of the rare event to be this ratio
• http://support.sas.com/kb/47/965.html


DECISIONS DETERMINING DECISION WEIGHTS: EXAMPLE

• Consider a binary event where Prob(Yes) = 0.1 and Prob(No) = 0.9

• To determine the amount of weight to assign to the rare event in a binary target, calculate this ratio:

Prob(No) / Prob(Yes) = 0.9 / 0.1 = 9

• Specify the weight of the rare event to be this ratio
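
The sketch below computes the weight from the rare-event prior and shows what it implies for decisions. The profit structure is an assumption made for illustration: a correct "event" decision earns the weight, a correct "non-event" decision earns 1, and wrong decisions earn 0. Under that assumption the decision flips to "event" once the predicted probability reaches 1 / (1 + weight), which for a weight of 9 is 0.1, the prior itself.

```python
def rare_event_weight(p_rare):
    """Ratio of the common-event prior to the rare-event prior."""
    if not 0 < p_rare < 1:
        raise ValueError("p_rare must be strictly between 0 and 1")
    return (1 - p_rare) / p_rare


def decide_event(p_event, weight):
    """Assumed profit structure: a correct 'event' decision earns `weight`,
    a correct 'non-event' decision earns 1, wrong decisions earn 0.
    Choose 'event' when its expected profit is at least as large."""
    return p_event * weight >= (1 - p_event) * 1.0


weight = rare_event_weight(0.1)            # about 9 for Prob(Yes) = 0.1
high_score = decide_event(0.15, weight)    # above the 0.1 cutoff
low_score = decide_event(0.05, weight)     # below the 0.1 cutoff
```

Without the weight, a model would need to predict a probability above 0.5 before choosing the rare event, which seldom happens when the event occurs only 10% of the time.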


DECISIONS INCORPORATING PRIORS

• Before fitting model
  • Decision Profile

• After fitting model
  • Decision Node


PARTITIONING


SAMPLE DATA PARTITIONING

PARTITION     ROLE
Training      Used to fit the model
Validation    Used to validate the model and prevent over-fitting
Test          Used to provide unbiased estimate of model performance


SAMPLE SAMPLE: DATA PARTITIONING

WHAT IS OPTIMAL PARTITION?

[Pie chart: Training 40%, Validation 30%, Test 30%]


BEST PRACTICE SAMPLE: DATA PARTITIONING

WHAT IS OPTIMAL PARTITION?

[Pie chart: Training 60%, Validation 40%, Test 0%]


BEST PRACTICE SAMPLE: DATA PARTITIONING

WHAT IS OPTIMAL PARTITION?

[Pie chart: Training 70%, Validation 30%, Test 0%]

It depends!


SAMPLE DATA PARTITIONING CONSIDERATIONS

• How much data is available?
• Is an unbiased measure of model performance required?
• Should test data be in-sample or out-of-sample?
• How many test samples are needed? (e.g. different time periods, different geographies, etc.)
• When should test data be used in the process?


BEST PRACTICE DATA PARTITIONING

• Percentages: frequently used percentages are 50/50/0, 60/40/0 and 70/30/0 with a completely separate Test partition.

• Do not bring Test data into the process until the model is complete. It should not influence the modeling process; it is used only to report performance.

• Multiple Test data sets can be used: consider how the model will be deployed and create representative samples.
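
A 70/30/0 split with a completely separate Test partition might look like the sketch below. The row ids, the choice to hold out the last 200 rows as Test, and the function name are all hypothetical.

```python
import random


def partition(rows, train_share, seed=0):
    """Shuffle once, then split into Training and Validation."""
    rng = random.Random(seed)
    shuffled = rows[:]  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical row ids. The Test partition is set aside first and is
# not touched again until the model is complete.
all_rows = list(range(1_000))
test_rows = all_rows[800:]       # for example, a later time period
modeling_rows = all_rows[:800]

train_rows, valid_rows = partition(modeling_rows, train_share=0.70)
```

Setting Test aside before any exploration or modeling keeps it from influencing decisions, so the performance it reports is an honest estimate.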


PREPARING DATA


EXPLORE & MODIFY ITERATIVE RELATIONSHIP WITH DATA PREPARATION

Data Prep

Data Exploration

Data Modification


BEST PRACTICE EXPLORE & MODIFY: GETTING THE MOST OUT OF DATA

• Once you have an analytics-ready table:
  • Examine Categorical Variables
  • Examine Continuous Variables
  • Explore Missing Values
  • Cluster Variables


EXPLORE & MODIFY CATEGORICAL VARIABLES


EXPLORE & MODIFY CONTINUOUS VARIABLES


EXPLORE & MODIFY MISSING DATA

• Why is data missing?

• Are there patterns to the missing data within or across variables?

• Imputation methods to consider
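
The slide does not list specific imputation methods. As one possible option, the sketch below counts missing values per variable and then fills a numeric column with its median, adding a 0/1 indicator so that the fact of missingness is not lost. The records and column names are hypothetical.

```python
import statistics


def missing_counts(rows, columns):
    """Number of missing (None) values per column."""
    return {c: sum(1 for r in rows if r[c] is None) for c in columns}


def impute_median(rows, column):
    """Return copies of rows with None replaced by the column median,
    plus a 0/1 flag recording which values were imputed."""
    median = statistics.median(r[column] for r in rows if r[column] is not None)
    out = []
    for r in rows:
        was_missing = r[column] is None
        new_row = dict(r)
        new_row[column] = median if was_missing else r[column]
        new_row[column + "_missing"] = int(was_missing)
        out.append(new_row)
    return out


# Hypothetical records, invented for illustration only.
records = [
    {"age": 34, "income": 52.0},
    {"age": None, "income": 47.5},
    {"age": 51, "income": None},
    {"age": 29, "income": None},
]

counts = missing_counts(records, ["age", "income"])
imputed = impute_median(records, "income")
```

Keeping the indicator column lets a model pick up cases where the reason a value is missing is itself predictive.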


EXPLORE & MODIFY CLUSTER VARIABLES

• There is no single answer for clusters

• Design clusters and profiles around themes using smaller set of related variables


SELECTING VARIABLES


EXPLORE & MODIFY VARIABLE SELECTION/REDUCTION TECHNIQUES

• Stepwise Regression
• Variable Selection Node
• Decision Tree Node
• Variable Clustering
• Combined Approach


• Multicollinearity


• Interactions
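
Multicollinearity can be screened for with pairwise correlations before choosing which inputs to keep. This is a rough first pass only, since it will not detect an input that is a combination of several others. The 0.9 threshold and the input names below are assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(columns, threshold=0.9):
    """Pairs of inputs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical inputs: spend_usd is an exact rescaling of spend_eur.
inputs = {
    "spend_eur": [10.0, 20.0, 30.0, 40.0, 50.0],
    "spend_usd": [11.0, 22.0, 33.0, 44.0, 55.0],
    "tenure": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged = correlated_pairs(inputs)
```

For each flagged pair, keep the input that is easier to explain or cheaper to collect, or replace the pair with a single cluster representative.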


BEST PRACTICES OPTIMIZING DATA

• Selecting Target
• Determining Data
• Preparing Variables


DEVELOPING & DELIVERING THE MODEL


MODEL & ASSESS DELIVERING THE MODEL

• Developing Your Model
• Choosing a Model
• Deploying the Model


DEVELOPING THE MODEL


MODEL MODEL DEVELOPMENT

• Regression
• Decision Trees
• Neural Networks
• Ensemble
• Rule Induction
• Something Else?


BEST PRACTICE MODEL DEVELOPMENT

• Try various techniques and combinations of techniques.


CHOOSING A MODEL


BEST PRACTICES MODEL SELECTION

• Evaluate model metrics
• Consider business knowledge
• Recognize constraints


ASSESS CUMULATIVE CHARTS


ASSESS NON-CUMULATIVE CHARTS
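
The chart images did not survive in this transcript, so the slides do not show which charts were used. Assuming they were lift charts, the sketch below shows the difference between the two views: non-cumulative lift looks at each bin of scored records on its own, while cumulative lift looks at everything from the top bin down to the current one. The scored data is hypothetical.

```python
def lift_by_bin(scored, n_bins=10):
    """scored: list of (predicted_probability, actual 0/1 target).
    Returns per-bin (non-cumulative) and cumulative lift, best scores first.
    Assumes len(scored) is a multiple of n_bins; leftover rows are ignored."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    overall_rate = sum(t for _, t in ranked) / len(ranked)
    size = len(ranked) // n_bins
    bin_lift, cum_lift = [], []
    events_so_far = 0
    for b in range(n_bins):
        chunk = ranked[b * size:(b + 1) * size]
        events = sum(t for _, t in chunk)
        events_so_far += events
        bin_lift.append(events / size / overall_rate)
        cum_lift.append(events_so_far / ((b + 1) * size) / overall_rate)
    return bin_lift, cum_lift


# Hypothetical scored records: (predicted probability, actual target).
scored = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1),
          (0.5, 0), (0.4, 0), (0.3, 1), (0.2, 0)]

bin_lift, cum_lift = lift_by_bin(scored, n_bins=4)
# bin_lift -> [2.0, 1.0, 0.0, 1.0]
# cum_lift -> [2.0, 1.5, 1.0, 1.0]
```

Cumulative lift always ends at 1.0 and answers "how deep should we go?", while the non-cumulative view exposes bins where the model ranks poorly, such as the third bin here.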


DEPLOYING THE MODEL


BEST PRACTICES MODEL DEPLOYMENT

• Reporting Results
• Clean up and back up
• Monitor performance


• Incorporate and share knowledge
• Automate ETL (Extract, Transform, Load)
• Automate process


ULTIMATE GOAL

SAS MODEL FACTORY

[Diagram components: Source / Operational Systems, Data Preparation, Model Development, Model Deployment, Model Management]


MODEL & ASSESS DELIVERING THE MODEL

• Developing Your Model
• Choosing a Model
• Deploying the Model


BEST PRACTICES FORMAT OF PRESENTATION

• Background & General Guidance
• Developing the Data
• Developing & Delivering the Model


BEST PRACTICE BE ANALYTICALLY SAVVY AND CREATIVE

[Graphic labels: analytical, creative]

It’s both science and art!

www.SAS.com

THANK YOU FOR USING SAS!