math 6330: statistical consulting class 8 - cox associatescox-associates.com/6330/class8.pdf ·...

Math 6330: Statistical Consulting Class 8 Tony Cox [email protected] University of Colorado at Denver Course web site: http://cox-associates.com/6330/


Page 1: Math 6330: Statistical Consulting Class 8 - Cox Associatescox-associates.com/6330/Class8.pdf · –Validating causal mechanisms –Using for inference and recommendations •Simulation-optimization

Math 6330: Statistical Consulting

Class 8

Tony Cox

[email protected]

University of Colorado at Denver

Course web site: http://cox-associates.com/6330/


Agenda

• Projects and schedule

• Prescriptive (decision) analytics (cont.)

– Decision trees

– Simulation-optimization

– Newsvendor problem and applications

– Decision rules, optimal statistical decisions

– Quality control, SPRT

• Evaluation analytics

• Learning analytics

• Decision psychology

– Heuristics and biases


Recommended readings

• Charniak (1991) (rest of paper), www.aaai.org/ojs/index.php/aimagazine/article/viewFile/918/836

– Build the network in Figure 2

• Pearl (2009) http://ftp.cs.ucla.edu/pub/stat_ser/r350.pdf

• Methods to Accelerate the Learning of Bayesian Network Structures, Daly and Shen (2007) https://pdfs.semanticscholar.org/e7d3/029e84a1775bb12e7e67541beaf2367f7a88.pdf

• Distinguishing cause from effect using observational data (Mooij et al., 2016), www.jmlr.org/papers/volume17/14-518/14-518.pdf

• Probabilistic computational causal discovery for systems biology (Lagani et al., 2016) www.mensxmachina.org/files/publications/Probabilistic%20Causal%20Discovery%20for%20Systems%20Biology_prePrint.pdf


Projects


Papers and projects: 3 types

• Applied: Analyze an application (description, prediction, causal analysis, decision, evaluation, learning) using high-value statistical consulting methods

• Research/develop software

– R packages, algorithms, CAT modules, etc.

• Research/review book or papers (3-5 articles)

– Explain a topic within statistical consulting

– Examples: Netica’s Bayesian inference algorithms, multicriteria decision-making, machine learning algorithms, etc.


Projects (cont.)

• A typical report is about 10-20 pages in 12-point font with 1.5 line spacing (this is typical, not required)

• Content matters; length does not

• Typical in-class presentation is 20-30 minutes

– Can run longer if needed

• Purposes:

1. Learn something interesting and useful

2. Either explain/show what you learned, or show how to use it in practice (or both)


Project proposals due March 17

• If you have not yet done so, please send me a succinct description of what you want to do (and perhaps what you hope to learn by doing it).

– Problem to be addressed

– Methods to be researched/applied

– Hoped-for results

• Due by end of day on Friday, March 17th (though sooner is welcome)

• Key dates:

– April 14 for rough draft (or very good outline)

– Start in-class presentations/discussions April 18

– May 4, 8:00 PM for final


Course schedule

• March 14: No class. (Work on project idea)

• March 17: Project/paper proposals due

• March 21: No class (Spring break)

• April 14: Draft of project/term paper due

• April 18, 25, May 2, (May 9): In-class presentations

• May 4: Final project/paper due by 8:00 PM


Prescriptive analytics (cont.)


Algorithms for optimizing actions

• Decision analysis framework: Choose act a from choice set A to maximize expected utility of consequence c, given a causal model c(a, s), Pr(s) or Pr(c | a, s), Pr(s)

– s = state = random variable = things that affect c other than the choice of act a

• Influence diagram algorithms

– Learning ID structure from data

– Validating causal mechanisms

– Using for inference and recommendations

• Simulation-optimization

• Robust optimization

• Adaptive optimization/learning algorithms


Prescriptive analytics methods

• Optimization

– Decision trees

– Stochastic dynamic programming, optimal control

– Gittins indices

– Reinforcement learning (RL) algorithms

• Influence diagram solution algorithms

• Simulation-optimization

• Adaptive learning and optimization

– EVOP (Evolutionary operations)

– Multi-arm bandit problems, UCL strategies


Decision tree ingredients

• Three types of nodes

– Choice nodes (squares)

– Chance nodes (circles)

– Terminal nodes / value nodes

• Arcs show how decisions and chance events can unfold over time

– Uncertainties are resolved as time passes and choices are made


Solving decision trees

• “Backward induction”

• “Stochastic dynamic programming”

– “Average out and roll back”

– Implicitly, the tree determines Pr(c | a)

• Procedure:

– Start at tips of tree, work backward

– Compute expected value at each chance node

• “Averaging out”

– Choose maximum expected value at each choice node
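The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method of any particular tool: the class names `Terminal`, `Chance`, and `Choice` and the functions `rollback` and `best_option` are hypothetical. The 70%/30% probabilities and the $172,000 and −$500,000 values are taken from the Develop branch of the tutorial example used in the following slides (the $172,000 is treated as an already rolled-back value), while the payoff of 0 for "Do not develop" is an assumption made here so that the example is complete.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Terminal:
    value: float  # payoff at a tip of the tree


@dataclass
class Chance:
    branches: List[Tuple[float, "Node"]]  # (probability, child) pairs


@dataclass
class Choice:
    options: List[Tuple[str, "Node"]]  # (label, child) pairs


Node = Union[Terminal, Chance, Choice]


def rollback(node: Node) -> float:
    """Value of a node by backward induction (average out and roll back)."""
    if isinstance(node, Terminal):
        return node.value
    if isinstance(node, Chance):
        # Averaging out: probability-weighted mean of the child values.
        return sum(p * rollback(child) for p, child in node.branches)
    # Choice node: take the option with the largest rolled-back value.
    return max(rollback(child) for _, child in node.options)


def best_option(node: Choice) -> Tuple[str, float]:
    """Label and value of the option that maximizes expected value."""
    label, child = max(node.options, key=lambda opt: rollback(opt[1]))
    return label, rollback(child)


# Develop branch from the tutorial example; 172_000 is an already
# rolled-back value of the later decision in that tree.
develop = Chance([(0.7, Terminal(172_000.0)), (0.3, Terminal(-500_000.0))])

# ASSUMPTION: "Do not develop" is worth 0 (not stated in the slides).
root = Choice([("Develop", develop), ("Do not develop", Terminal(0.0))])

label, value = best_option(root)
print(label, value)
```

The recursion visits the tips first and works backward, so each chance node is averaged out before the choice node above it is optimized.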


Obtaining Pr(s) from decision trees

http://www.eogogics.com/talkgogics/tutorials/decision-tree

Decision 1: Develop or Do Not Develop

Expected value of Develop = Development Successful + Development Unsuccessful

= (70% × $172,000) + (30% × (−$500,000))

= $120,400 + (−$150,000)

= −$29,600



What happened to act a and state s?

http://www.eogogics.com/talkgogics/tutorials/decision-tree


What are the 3 possible acts in this tree?


(a) Don’t develop; (b) Develop, then rebuild if successful; (c) Develop, then new line if successful.


Optimize decisions!


Key points

• Solving decision trees (with decisions) requires embedded optimization

– Make future decisions optimally, given the information available when they are made

• Event trees = decision trees with no decisions

– Can be solved, to find outcome probabilities, by forward Monte-Carlo simulation, or by multiplication and addition

• In general, sequential decision-making cannot be modeled well using event trees.

– Must include (optimal choice | information)
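As a rough illustration of the two ways to solve an event tree, the sketch below computes outcome probabilities exactly (multiply along each path, add across paths) and then approximates them by forward Monte Carlo simulation. The nested-list tree encoding and the function names are hypothetical, and the 0.6/0.4 demand probabilities are made-up numbers; only the 0.7/0.3 split comes from the tutorial example.

```python
import random
from collections import defaultdict

# An event tree node is either an outcome label (str) or a list of
# (probability, subtree) branches. Demand probabilities are HYPOTHETICAL.
tree = [
    (0.7, [(0.6, "success, high demand"), (0.4, "success, low demand")]),
    (0.3, "failure"),
]


def exact_probs(node, path_prob=1.0, out=None):
    """Multiply probabilities along each path; add over paths per outcome."""
    if out is None:
        out = defaultdict(float)
    if isinstance(node, str):
        out[node] += path_prob
    else:
        for p, child in node:
            exact_probs(child, path_prob * p, out)
    return out


def simulate_once(node, rng):
    """Walk forward from the root, sampling one branch at each chance node."""
    while not isinstance(node, str):
        u, cumulative = rng.random(), 0.0
        for p, child in node:
            cumulative += p
            if u < cumulative:
                node = child
                break
        else:
            node = node[-1][1]  # guard against floating-point round-off
    return node


def monte_carlo_probs(node, n, seed=0):
    """Estimate outcome probabilities from n simulated paths."""
    rng = random.Random(seed)
    counts = defaultdict(int)
    for _ in range(n):
        counts[simulate_once(node, rng)] += 1
    return {outcome: k / n for outcome, k in counts.items()}


exact = exact_probs(tree)
approx = monte_carlo_probs(tree, n=20_000)
for outcome, p in exact.items():
    print(f"{outcome}: exact={p:.3f}, simulated={approx.get(outcome, 0.0):.3f}")
```

Neither function makes any choice along the way, which is why this approach does not carry over to trees that contain decision nodes.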


What happened to state s?

http://www.eogogics.com/talkgogics/tutorials/decision-tree

What are the 4 possible states?


C1 can succeed or not; C2 can be high or low demand


Key theoretical insight

• A complex decision model can be viewed as a (possibly large) simple Pr(c | a) model.

– s = selection of branch at each chance node

– a = selection of branch at each choice node

– c = outcome at terminal node for (a, s)

– $\Pr(c \mid a) = \sum_s \Pr(c \mid a, s)\,\Pr(s)$

• Other complex decision models can also be interpreted as c(a, s), Pr(c | a, s), or Pr(c | a) models

– s = system state & information signal

– a = decision rule (information act)

– c may include changes in s and in possible a.
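A small sketch may help make the reduction to a Pr(c | a) model concrete. It enumerates the four states and three acts of the tree discussed above and sums Pr(s) over the states that lead to each consequence. The act and consequence labels, the independence of development outcome and demand, and the 0.6/0.4 demand probabilities are all assumptions made for illustration; only the 0.7/0.3 split comes from the tutorial example.

```python
from collections import defaultdict
from itertools import product

p_dev = {"success": 0.7, "failure": 0.3}  # from the tutorial example
p_demand = {"high": 0.6, "low": 0.4}      # HYPOTHETICAL values

# ASSUMPTION: development outcome and demand are independent, so
# Pr(s) is the product of the two branch probabilities.
pr_s = {(d, m): p_dev[d] * p_demand[m] for d, m in product(p_dev, p_demand)}

ACTS = ["dont_develop", "develop_rebuild", "develop_new_line"]


def consequence(act, state):
    """Deterministic consequence model c(a, s); labels are illustrative."""
    dev, demand = state
    if act == "dont_develop":
        return "status quo"
    if dev == "failure":
        return "development cost lost"
    plan = "rebuild" if act == "develop_rebuild" else "new line"
    return f"{plan}, {demand} demand"


def pr_c_given_a(act):
    """Pr(c | a) = sum over s of Pr(c | a, s) * Pr(s)."""
    out = defaultdict(float)
    for state, prob in pr_s.items():
        # With a deterministic c(a, s), Pr(c | a, s) is 1 for one c, else 0.
        out[consequence(act, state)] += prob
    return dict(out)


for act in ACTS:
    print(act, pr_c_given_a(act))
```

Each act here is a complete plan (a branch selection at every choice node), which is what lets the whole tree collapse into one table of Pr(c | a).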


Real decision trees can quickly become “bushy messes” (Raiffa, 1968) with many duplicated sub-trees


[Figure: decision tree for BSE import tracking and testing. Decision D1: Track Imports or Don’t Track Imports. Chance node Y1|d1 with branches No BSE, BSE in CA, BSE in US, BSE in US from US, BSE in US from CA. Decision D2|d1: Test All, Test CA only, or Repeat Test. Chance node Y2|d1,d2 with branches No BSE, BSE in US, BSE in CA. Duplicated sub-trees are labeled A, B, and C.]


Influence diagrams help to avoid large trees

http://en.wikipedia.org/wiki/Decision_tree

Often much more compact than decision trees


Limitations of decision trees

• Combinatorial explosion

– Example: Searching for a prize in one of N boxes or locations involves building a tree with N! = N × (N – 1) × … × 2 × 1 possible inspection orders.

• Infinite trees

– Continuous variables

– When to stop growing a tree?

• How to evaluate utilities and probabilities?


Optimization formulations of decision problems

• Example: Prize is in location j with prior probability p(j), j = 1, 2, …, N

• It costs c(j) to inspect location j

• What search strategy minimizes expected cost of finding prize?

– What is a strategy? Order in which to inspect

– How many are there? N!


With two locations, 1 and 2

Strategy 1: Inspect 1, then 2 if needed:

– Expected cost: $c_1 + (1 - p_1)c_2 = c_1 + c_2 - p_1 c_2$

Strategy 2: Inspect 2, then 1 if needed:

– Expected cost: $c_2 + (1 - p_2)c_1 = c_1 + c_2 - p_2 c_1$

Strategy 1 has lower expected cost if:

• $p_1 c_2 > p_2 c_1$, or equivalently $p_1/c_1 > p_2/c_2$

• So, look first at location with highest success probability per unit cost


With N locations

• Optimal decision rule: Always inspect next the (as-yet uninspected) location with the greatest success probability-to-cost ratio

– Example of an “index policy,” “Gittins index”

– If M players take turns, competing to find prize, each should still use this rule.

• A decision table or tree can be unwieldy even for such simple optimization problems
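The index rule can be checked against brute-force enumeration on a small instance. The sketch below assumes that the prize is certainly in exactly one location (the p(j) sum to 1), that costs are positive, and that an inspection of the correct location always finds the prize; the numbers in `p` and `c` and the function names are hypothetical.

```python
from itertools import permutations


def expected_cost(order, p, c):
    """Expected total cost of inspecting locations in `order` until found."""
    total, prob_not_found_yet = 0.0, 1.0
    for j in order:
        # The cost c[j] is paid only if all earlier inspections failed.
        total += prob_not_found_yet * c[j]
        prob_not_found_yet -= p[j]
    return total


def index_policy(p, c):
    """Inspect in decreasing order of success probability per unit cost."""
    return sorted(range(len(p)), key=lambda j: p[j] / c[j], reverse=True)


def brute_force(p, c):
    """Search all N! inspection orders for the cheapest one."""
    return min(permutations(range(len(p))),
               key=lambda order: expected_cost(order, p, c))


# HYPOTHETICAL instance with N = 3 locations.
p = [0.1, 0.5, 0.4]   # prior probabilities, summing to 1
c = [1.0, 4.0, 2.0]   # inspection costs

greedy = index_policy(p, c)
optimal = list(brute_force(p, c))
print(greedy, expected_cost(greedy, p, c))
print(optimal, expected_cost(optimal, p, c))
```

Sorting takes O(N log N) time, while the enumeration grows as N!, which is the practical argument for index policies over explicit trees or tables.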


Other optimization formulations

• $\max_{a \in A} EU(a)$

– Typically, a is a vector and A is the feasible set

– More generally, a is a strategy/policy/decision rule, and A is the choice set of feasible strategies

– In the previous example, A = set of permutations

• $\max_{a \in A} EU(a)$ subject to:

– $EU(a) = \sum_c \Pr(c \mid a)\,u(c)$

– $\Pr(c \mid a) = \sum_s \Pr(c \mid a, s)\,p(s)$

– $g(a) \le 0$ (feasible set, A)
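For a finite choice set, this formulation can be evaluated directly. The following sketch is a hypothetical stocking example, not something taken from the slides: the act a is a quantity to stock, the state s is demand, the consequence c is profit, utility is taken to be risk-neutral (u(c) = c), and g(a) = a − capacity. All numbers and function names are assumptions.

```python
def pr_c_given_a(a, p_s, pr_c_given_a_s):
    """Pr(c | a) = sum over s of Pr(c | a, s) * p(s)."""
    out = {}
    for s, ps in p_s.items():
        for c, pc in pr_c_given_a_s(a, s).items():
            out[c] = out.get(c, 0.0) + pc * ps
    return out


def expected_utility(a, p_s, pr_c_given_a_s, u):
    """EU(a) = sum over c of Pr(c | a) * u(c)."""
    dist = pr_c_given_a(a, p_s, pr_c_given_a_s)
    return sum(pc * u(c) for c, pc in dist.items())


def best_act(acts, g, p_s, pr_c_given_a_s, u):
    """Maximize EU(a) over the feasible acts, i.e. those with g(a) <= 0."""
    feasible = [a for a in acts if g(a) <= 0]
    return max(feasible,
               key=lambda a: expected_utility(a, p_s, pr_c_given_a_s, u))


# HYPOTHETICAL data: demand distribution, unit price 5, unit cost 3.
p_s = {1: 0.3, 2: 0.4, 3: 0.3}


def profit_model(a, s):
    """Deterministic consequence: Pr(c | a, s) puts all mass on one profit."""
    return {5 * min(a, s) - 3 * a: 1.0}


def utility(c):
    return c  # ASSUMPTION: risk-neutral decision-maker


acts = range(0, 5)
print(best_act(acts, lambda a: a - 3, p_s, profit_model, utility))
print(best_act(acts, lambda a: a - 1, p_s, profit_model, utility))
```

Tightening the constraint from a ≤ 3 to a ≤ 1 changes the chosen act, which shows how the feasible set A enters the problem separately from the utility and probability models.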


Introduction to evaluation analytics


Evaluation analytics: How well are policies working?

• Algorithms for evaluating effects of actions, events, conditions

– Intervention analysis/interrupted time series

• Key idea: Compare predicted outcomes with no action to observed outcomes with it

– Counterfactual causal analysis

– Google’s new CausalImpact algorithm

• Quasi-experimental designs and analysis

– Refute non-causal explanations for data

– Compare to control groups to estimate effects
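The key idea of comparing predicted no-action outcomes with observed outcomes can be sketched with a deliberately simple counterfactual model: fit a straight-line trend to the pre-intervention data, extrapolate it past the intervention date, and subtract. This is only an illustration under a linear-trend assumption; it is not the CausalImpact algorithm, which is based on Bayesian structural time-series models. The data and function names are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimated_effects(series, t0):
    """Observed minus predicted-with-no-action, for each period t >= t0.

    `series` holds outcomes for periods 0, 1, 2, ...; the intervention
    takes effect at index t0. ASSUMPTION: the pre-intervention linear
    trend would have continued unchanged without the intervention.
    """
    intercept, slope = fit_line(list(range(t0)), series[:t0])
    return [series[t] - (intercept + slope * t)
            for t in range(t0, len(series))]


# SYNTHETIC data: trend 10 + 2t, with a shift of +3 from period 6 onward.
observed = [10, 12, 14, 16, 18, 20, 25, 27, 29]
effects = estimated_effects(observed, t0=6)
print(effects)
```

With real data the extrapolated trend is uncertain, so an estimate like this is only credible alongside prediction intervals and checks against non-causal explanations such as seasonality or concurrent events.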


How did the U.K. National Institute for Health and Clinical Excellence (NICE) recommendation, in March 2008, of complete cessation of antibiotic prophylaxis for prevention of infective endocarditis affect the incidence of infective endocarditis?

www.thelancet.com/journals/lancet/article/PIIS0140-6736(14)62007-9/fulltext?rss=yes


Nonlinear models complicate inference of intervention effects


Solution: Non-parametric models, gradient boosting


Algorithms for evaluating effects of combinations of factors

• Classification trees

– Boosted trees, Random Forest, MARS

• Bayesian network algorithms

– Discovery (conditional independence tests)

– Validation

– Inference and explanation

• Response surface algorithms

– Adaptive learning, design of experiments


Learning analytics

• Learn to predict better

– Create ensemble of models, algorithms

• Use multiple machine learning algorithms

– Logistic regression, Random Forest, SVM, ANN, deep learning, gradient boosting, KNN, lasso, etc.

– “Stack” models (hybridize multiple predictions)

• Cross-validation assesses model performance

– Meta-learner combines performance-weighted predictors to produce an improved predictor

• Theoretical guarantees, practical successes (Kaggle competitions)

• Learn to decide better

– Low-regret learning of decision rules

• Theoretical guarantees (MDPs)

• Practical performance
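One way to see what low-regret learning of a decision rule looks like is a multi-armed bandit simulation. The sketch below uses the standard UCB1 index rule, chosen here purely for illustration since the slides do not name a specific algorithm. The Bernoulli reward probabilities in `arm_probs` are hypothetical and, as in any bandit problem, unknown to the learner.

```python
import math
import random


def ucb1(arm_probs, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return pull counts and total reward."""
    rng = random.Random(seed)
    k = len(arm_probs)
    counts = [0] * k    # number of times each arm has been played
    sums = [0.0] * k    # total reward collected from each arm
    total_reward = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: play every arm once
        else:
            # Index = sample mean + exploration bonus that shrinks
            # as an arm accumulates plays.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward
    return counts, total_reward


# HYPOTHETICAL success probabilities, hidden from the learner.
arm_probs = [0.2, 0.5, 0.8]
counts, total_reward = ucb1(arm_probs, horizon=5000)
regret = 5000 * max(arm_probs) - total_reward
print(counts, total_reward, regret)
```

The realized regret (reward lost relative to always playing the best arm) should grow much more slowly than the horizon, because the rule concentrates its plays on the best arm while still sampling the others occasionally.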

http://www2.hawaii.edu/~chenx/ics699rl/grid/rl.html


Collaborative risk analytics: Multiple interacting learning agents

http://groups.inf.ed.ac.uk/agents/index.php/Main/Projects


Collaborative risk analytics

• Global performance metrics

• Local information, control, tasks, priorities, rewards

– Hierarchical distributed control

– Collaborative sensing, filtering, deliberation, and decision-control networks of agents

• Mixed human and machine agents

• Autonomous agents vs. intelligent assistants

http://www.cities.io/news/page/3/


Collaborative risk analytics: Games as labs for distributed AI

• Local information, control, tasks, priorities

– Hierarchical distributed control

– Collaborative sensing, deliberation, control networks

• From decentralized agents to effective risk analytics teams and HCI support

– Trust, reputation, performance

– Sharing information, attention, control, evaluation, learning

http://people.idsia.ch/~juergen/learningrobots.html


Risk analytics toolkit: Summary

1. Descriptive analytics

– Change-point analysis, likelihood ratio (CPA)

– Machine learning, response surfaces (ML: LR, RF, GBM, SVM, ANN, KNN, etc.)

2. Predictive analytics

– Bayesian networks, dynamic BN (BN, DBN)

– Bayesian model averaging (BMA, ML)

3. Causal analytics & principles

– Causal BNs, systems dynamics (SD) (DAGs, SD simulation)

– Time series causation

4. Prescriptive analytics: IDs, simulation-optimization, robust optimization

5. Evaluation analytics: QE, credit assignment, attribution

6. Learning analytics

– Machine learning, superlearning (ML)

– Low-regret learning of decision rules (collaborative learning)


Applied risk analytics toolkit: Toward more practical analytics

Reorientation: From solving well-posed problems to discovering how to act more effectively

1. Descriptive analytics: What’s happening?

2. Predictive analytics: What’s coming next?

3. Causal analytics: What can we do about it?

4. Prescriptive analytics: What should we do?

5. Evaluation analytics: How well is it working?

6. Learning analytics: How to do better?

7. Collaboration: How to do better together?
