expanding predictive analytics through the use of machine ......expanding predictive analytics...

24
Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey, FCAS, MAAA Chief Actuary EagleEye Analytics Columbia, S.C. Christopher Cooksey, is a senior actuarial consultant with EagleEye Analytics. In this role, Chris is actively involved with the ongoing development of EagleEye's cutting-edge predictive analytics solutions. He also plays a key role in assisting EagleEye's clients in finding optimum solutions to their rating and profitability challenges. Most recently, Chris served as research director at Nationwide Insurance, where he was responsible for the development GLM's and other sophisticated modeling techniques primarily in the non- standard automobile line of business. Chris also spent several years teaching statistics at Ohio Dominican University. Chris achieved summa cum laude honors with a bachelor's degree in physics and mathematics from Valparaiso University and master’s degree in physics from Ohio State University. Session Description: This session will introduce the concepts of Machine Learning and how they differ from more traditionally used techniques, with an emphasis on how the resulting information can be used in commercial insurance. Specific examples of rating and non-rating applications will be discussed. Creativity in the use of model output will be used to show the wide range of problems to which predictive models and machine learning techniques can be applied. The participants in this session will leave with: An understanding of what the label "machine learning" refers to and how these techniques compare to more traditional actuarial methods. A description of one specific machine learning technique. An appreciation of the breadth of application for machine learning techniques. An ability to identify opportunities where predictive analytics can enhance insurance operations.

Upload: others

Post on 23-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m.

Chris Cooksey, FCAS, MAAA Chief Actuary EagleEye Analytics Columbia, S.C. Christopher Cooksey, is a senior actuarial consultant with EagleEye Analytics. In this role, Chris is actively involved with the ongoing development of EagleEye's cutting-edge predictive analytics solutions. He also plays a key role in assisting EagleEye's clients in finding optimum solutions to their rating and profitability challenges. Most recently, Chris served as research director at Nationwide Insurance, where he was responsible for the development GLM's and other sophisticated modeling techniques primarily in the non-standard automobile line of business. Chris also spent several years teaching statistics at Ohio Dominican University. Chris achieved summa cum laude honors with a bachelor's degree in physics and mathematics from Valparaiso University and master’s degree in physics from Ohio State University. Session Description: This session will introduce the concepts of Machine Learning and how they differ from more traditionally used techniques, with an emphasis on how the resulting information can be used in commercial insurance. Specific examples of rating and non-rating applications will be discussed. Creativity in the use of model output will be used to show the wide range of problems to which predictive models and machine learning techniques can be applied. The participants in this session will leave with:

• An understanding of what the label "machine learning" refers to and how these techniques compare to more traditional actuarial methods.

• A description of one specific machine learning technique. • An appreciation of the breadth of application for machine learning techniques. • An ability to identify opportunities where predictive analytics can enhance insurance operations.

Page 2: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

Top Three Session Ideas Tools or tips you learned from this session and can apply back at the office.

1. ______________________________________________________________________

2. _______________________________________________________________________

3. ________________________________________________________________________

Page 3: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

Expanding Predictive Analytics Through the Use of Machine Learning Session Outline

Overview What is Machine Learning?

• Predictive Modeling o Examples o What Makes a Quality Predictive Model?

• How Does One Know Now What the Accuracy will be? • Hold-out Datasets

o Two General Methods Out of Sample Out of Time

• How Does One Build a Quality Predictive Model? • Definition of Machine Learning • Applications of Machine Learning

How can Machine Learning Apply to Insurance?

• Machine Learning Approaches • Basic Approach of Decision Trees

o Data Split Based on Some Target and Criterion o Each Path is Split Again Until Some Ending Criterion is Met o The Tree May Include Some Pruning Criteria

• Customer Frequency • This Approach Can Be Used:

o On Different Types of Data o To Target Different Criteria o At Different Levels

Non-rating Uses for Machine Learning

• Underwriting Tiers and Company Placement • Straight-thru Versus Expert Underwriter • Re-underwrite or Re-inspect • Profitability – Reduce the Bad • Profitability – Increase the Good (Target Marketing) • Quality of Business • Agent/Broker Relationship

o More Profitable than Expected o Less Profitable than Expected o Agents with the Most Red and Green Business

• Retention Analyses Rating Applications of Machine Learning

• The Quick Fix • Schedule Rating • Creating a Class Plan from Scratch

Summary Questions & Answers

Page 4: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

EXPANDING ANALYTICSTHROUGH THE USE OF

MACHINE LEARNING

2013 NAMIC COMMERCIALLINES SEMINAR

28 February 2013

Christopher Cooksey, FCAS, MAAA

2

Agenda…

1. What is Machine Learning?

2. How can Machine Learning apply to insurance?

3. Non‐rating Uses for Machine Learning

4. Rating Applications of Machine Learning

2013 NAMIC Commercial Lines Seminar - Cooksey Page 1 of 21

Page 5: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

1

WHAT IS MACHINE LEARNING?

4

What is Machine Learning?

Predictive Modeling – The process of using information you have to predict something you don’t know yet.

Examples of predictive models…• Econometric models – if unemployment and interest 

rates, etc., then inflation is expected to be Y.

• Personal expectations – if vehicle age, and mileage, and price, etc., then I should be happy with the purchase.

• Pricing algorithms – if location, and SIC, and # of employees, then losses are expected to be Y and therefore premium needs to be Y + expected expenses.

Note: Predictive models are predictive, not determinative.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 2 of 21

Page 6: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

5

What is Machine Learning?

What makes a quality Predictive Model?

• Accuracy – does the predicted result match the actual result?

• Simplicity/Cost – how difficult is it to gather the information the model needs as input?

• Stability – how long will the model be accurate?

In the end, some models are more useful than others, and much depends on the purpose of the model.

Example…does a model which uses credit information justify the cost of acquiring the data?

6

What is Machine Learning?

How does one know now what the accuracy will be?

Use the data you have and balance between under‐ and overfitting.

To estimate generalization power, set aside data purely for testing.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 3 of 21

Page 7: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

7

What is Machine Learning?

Hold‐out datasets

Two general methods –• Out of sample:  train on a random 70% of data; 

validate against the remaining 30% of data.

1 2 3 4

5 6 7 8

9 10 11 12

13 14 15 16

17 18 19 20

1 3 4

5 6 7 10

12 13 14 16

17 18 19

2 8 9

11 15 20

Training Data Validation Data

8

What is Machine Learning?

Hold‐out datasets

Two general methods –• Out of sample:  train on a random 70% of data; 

validate against the remaining 30% of data.• Out of time:  train against older years of data; 

validate against newest years of data.

2005

2006

2007

2008

2009

Training Data Validation Data

2005

2006

2007

2008

2009

2013 NAMIC Commercial Lines Seminar - Cooksey Page 4 of 21

Page 8: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

9

What is Machine Learning?

How does one build a quality Predictive Model?

• Univariate models – how does one predictor relate to the target?  Can be explored through trial and error.

• Multivariate models – how do multiple predictors relate to the target?  Comes in many a various forms. 

Currently in use in the insurance industry are Generalized Linear Models.  These are multivariate, but come with certain assumptions.  For example, GLMs produce models where the predictors relate linearly to the target.

Machine Learning includes a number of different modeling techniques which can produce predictive models.

10

What is Machine Learning?

Machine Learning is a broad field concerned with the study of computer algorithms that automatically improve with experience.

A computer is said to “learn” from experience if…

…its performance on some set of tasks improves as experience increases.

This definition is from Machine Learning, Tom M. Mitchell, McGraw‐Hill, 1997.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 5 of 21

Page 9: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

11

What is Machine Learning?

Applications of Machine Learning include…

• Recognizing speech• Driving an autonomous vehicle• Predicting recovery rates of pneumonia patients• Playing world‐class backgammon• Extracting valuable knowledge from large commercial 

databases• Many, many, others…

12

What is Machine Learning?

Machine Learning

Probability and 

Statistics

Actuaries

2013 NAMIC Commercial Lines Seminar - Cooksey Page 6 of 21

Page 10: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

2

HOW CAN MACHINE LEARNING

APPLY TO INSURANCE?

14

How can Machine Learning apply to insurance?

Machine Learning includes many different approaches…• Neural networks• Decision trees• Genetic algorithms• Instance‐based learning• Others

…and many different approaches for improving results• Ensembles• Boosting• Bagging• Bayesian learning• Others

Focus here on decision trees – applicable to insurance & accessible

2013 NAMIC Commercial Lines Seminar - Cooksey Page 7 of 21

Page 11: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

15

How can Machine Learning apply to insurance?

Basic Approach of Decision Trees• Data split based on some target and criterion

• Target: entropy, frequency, severity, loss ratio, loss cost, etc.

• Criteria: maximize the difference, maximize the Gini coefficient, minimize the entropy, etc.

• Each path is split again until some ending criterion is met• Statistical tests on the utility of further splitting• No further improvement possible• Others

• The tree may include some pruning criteria• Performance on a validation set of data (i.e. 

reduced error pruning)• Rule post‐pruning• Others

Number of Units

CovLimit

Number of 

Insured

1 >1

>10k<=10k

1,2 >2

16

How can Machine Learning apply to insurance?

All Data

Number of Units = 1

Any Cov Limit

Any Number of Insured

Number of Units > 1

Cov Limit > 10k

Any Number of Insured

Cov Limit <=10k

Number of Insured = 1,2

Number of Insured > 2

Leaf Node 1 Leaf Node 2 Leaf Node 3 Leaf Node 4

• In decision trees all the data is assigned to one leaf node only• Not all attributes are used in each path –

for example, Leaf Node 2 does not use Number of Insured

2013 NAMIC Commercial Lines Seminar - Cooksey Page 8 of 21

Page 12: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

17

How can Machine Learning apply to insurance?

All Data

Number of Units = 1

Any Cov Limit

Any Number of Insured

Number of Units > 1

Cov Limit > 10k

Any Number of Insured

Cov Limit <=10k

Number of Insured = 1,2

Number of Insured > 2

Freq = 0.022 Freq = 0.037 Freq = 0.012 Freq = 0.024Segment 1 Segment 2 Segment 3 Segment 4

• Decision trees are easily expressed as lift curves• Segments are relatively easily described

18

How can Machine Learning apply to insurance?

Who are my highest frequency customers?

• Policies with higher coverage limits (>10k) and multiple units (>1)

Who are my lowest frequency customers?

• Policies with lower coverage limts (<=10k), multiple units (>1), but lower numbers of insureds (1 or 2)

2013 NAMIC Commercial Lines Seminar - Cooksey Page 9 of 21

Page 13: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

19

How can Machine Learning apply to insurance?

This approach can be used on different types of data• Pricing• Underwriting• Claims• Marketing• Etc.

This approach can be used to target different criteria• Frequency• Severity• Loss Ratio• Retention• Etc.

This approach can be used at different levels• Vehicle/Coverage• Vehicle• Unit/building/location• Policy• Etc.

3

NON‐RATING USES FOR MACHINE LEARNING

2013 NAMIC Commercial Lines Seminar - Cooksey Page 10 of 21

Page 14: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

21

Non‐rating Uses for Machine Learning

Underwriting Tiers and Company Placement

Target frequency at the policy level

Define tiers based on similar frequency characteristics.

Note that a project like this would need to be done in conjunction with pricing.  This sorting of data occurs prior to rating and would need to be accounted for.

Tier 1

Tier 2

Tier 3

22

Non‐rating Uses for Machine Learning

Straight‐thru versus

Expert UW

Target frequency or loss ratio at the policy level

Consider policy performance versus current level of UW scrutiny.

Do not forget that current practices affect the frequency and loss ratio of your historical business.  Results like this may indicate modifications to current practices.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 11 of 21

Page 15: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

23

Non‐rating Uses for Machine Learning

“I have the budget to re‐underwrite 10% of my book.  I just need to know which 10% to look at!”

With any project of this sort, the level of the analysis should reflect the level at which the decision is made, and the target should reflect the basis of your decision.

In this case, we are making the decision to re‐underwrite a given POLICY.  Do the analysis at the policy level.  (Re‐inspection of buildings may be done at the unit level.)

To re‐underwrite unprofitable policies, use loss ratio as the target.

Note:  when using loss ratio, be sure to current‐level premium at the policy level (not in aggregate).

24

Non‐rating Uses for Machine Learning

Re‐underwrite or

Re‐inspect

Target loss ratio at the policy level

Depending on the size of the program, target segments 7 & 9 as unprofitable.

If the analysis data is current enough, and if in‐force policies can be identified, this kind of analysis can result in a list of policies to target rather than just the attributes that correspond with unprofitable policies (segments 7 & 9).

2013 NAMIC Commercial Lines Seminar - Cooksey Page 12 of 21

Page 16: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

25

Non‐rating Uses for Machine Learning

Profitability –reduce the bad

Target loss ratio at the policy level

Reduce the size of segment 7 –consider non‐renewals and/or the amount of new business.

There is a range of aggressiveness here which may also be affected by the regulatory environment.

26

Non‐rating Uses for Machine Learning

Profitability –increase the good (target marketing)

Target loss ratio at the policy level

If the attributes of segment 5 define profit‐able business, get more of it.

This kind of analysis defines the kind of business you write profitably.  This needs to be combined with marketing/demographic data to identify areas rich in this kind of business.  Results may drive agent placement or marketing.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 13 of 21

Page 17: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

27

Non‐rating Uses for Machine Learning

Quality of Business

Target loss ratio at the policy level

Knowing who you write at a profit and loss, you can monitor new business as it comes in.

Monitor trends over time to assess the adverse selection against your company.  Estimate the effectiveness of underwriting actions to change your mix of business.

28

Non‐rating Uses for Machine Learning

Quality of Business

Here you can see adverse selection occurring through March 2009.

Company action at that point reversed the trend. This looks at the total business of the book.  Can also focus 

exclusively on new business. 

2013 NAMIC Commercial Lines Seminar - Cooksey Page 14 of 21

Page 18: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

29

Non‐rating Uses for Machine Learning

Agent/broker Relationship

Target loss ratio at the policy level

Use this analysis to inform your understanding of agent performance.

Actual agent loss ratios are often volatile due to smaller volume.  How can you reward or limit agents based on this?  A loss ratio analysis can help you understand EXPECTED performance as well as actual.

GreenYellow

Red30.9% LR

41.3% LR

66.1% LR

30

Non‐rating Uses for Machine Learning

Agent/broker Relationship

More profitable than expected…

This agent writes yellow and red business better than expected.

Best practices – is there something this agent does that others should be doing?

Getting lucky – is this agent living on borrowed time?  Have the conversation to share this info with the agent.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 15 of 21

Page 19: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

31

Non‐rating Uses for Machine Learning

Agent/broker Relationship

Less profitable than expected…

This agent writes all business worse than expected.

Worst practices – is this agent skipping inspections or not following UW rules?

Getting unlucky – This agent doesn’t write much red business.  Maybe they are given more time because their mix of business should give good results over time.

32

Non‐rating Uses for Machine Learning

Agent/broker Relationship

Agents with the most Green BusinessSome of these agents who write large amounts of low‐risk business get unlucky, but the odds are good that they’ll be profitable.

Agents with the most Red BusinessNot only is the underlying loss ratio higher, but the odds of that big loss is much higher too.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 16 of 21

Page 20: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

33

Non‐rating Uses for Machine Learning

Retention Analyses

Target retention at the policy level

What are the common characteristics of those with high retention (segment 7)?

This information can be used in a variety of ways…

• Guide marketing & sales towardscustomers with higher retention

• Form the basis of a more formal lifetime value analysis

• Cross‐reference retention and loss ratio to get a more useful look…

34

Non‐rating Uses for Machine Learning

Retention Analyses

Simple looks at retention can be even more useful when cross‐referenced with loss ratio.

Is a segment of business above or below average retention?  Above or below the target lossratio?

Note:  retention is essentially a static look at your book.  What kinds of customers retained?  What kinds didn’t?  There is no consideration of the choice customers had at renewal.  Were they facing a rate changeand renewed anyway?

2013 NAMIC Commercial Lines Seminar - Cooksey Page 17 of 21

Page 21: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

4

RATING APPLICATIONS OF MACHINE LEARNING

36

Rating Applications of Machine Learning

The Quick Fix

Target loss ratio at the coverage level

The lift curve is easily translated into relativities which can even out your rating.

Note that the quickest fix to profitability is taking underwriting action.  But the quickest fix for rating is to add a correction to existing rates.  This can be done because loss ratio shows results given the current rating plan.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 18 of 21

Page 22: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

37

Rating Applications of Machine Learning

The Quick Fix

First determine relativities based on the analysis loss ratios.

Then create a table which assigns relativities.

Note that this can be one table as shown, or it can be two tables: one which assigns the segments and one which connects segments to relativities.  The exact form will depend on your system.

38

Rating Applications of Machine Learning

Schedule Rating

Target loss ratio at the coverage level

Use this information to guide schedule rating decisions.

This doesn’t have to be overly prescriptive; these models can be used to set guidelines.  For example, set a rule that segments 5 & 3 may be credited but not debited, while segments 9 & 7 may be debited but not credited.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 19 of 21

Page 23: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

39

Rating Applications of Machine Learning

Creating a class plan from scratch

Using Machine Learning and GLMs together…

Run a GLM and calculate the residual signal

Use the residual from GLM to run a Decision 

Tree

Use the segments from the Decision Tree as predictors in 

the GLM

40

Expanding Analytics Through the Use of Machine Learning

Summary

• The more accessible Machine Learning techniques, such as decision trees, can be used today to enhance insurance operations.

• Machine Learning results are not too complicated to use in insurance.

• Non‐rating applications of Machine Learning span underwriting, marketing, product management, and executive‐level functions.

• Actuaries should pursue the business goal most beneficial to the company – this may include some of these non‐rating applications.

• Rating applications of Machine Learning include both quick fixes and fundamental restructuring of rating algorithms.

2013 NAMIC Commercial Lines Seminar - Cooksey Page 20 of 21

Page 24: Expanding Predictive Analytics Through the Use of Machine ......Expanding Predictive Analytics Through the Use of Machine Learning Thursday, February 28, 2013, 11:10 a.m. Chris Cooksey,

41

Expanding Analytics Through the Use of Machine Learning

Questions?

Contact InfoChristopher Cooksey, FCAS, MAAA

EagleEye [email protected]

www.eeanalytics.com

2013 NAMIC Commercial Lines Seminar - Cooksey Page 21 of 21