generalised linear modelling for improved...

Pricing Advantage

David Kirk, 2008

Generalised Linear Modelling for improved ratemaking

How to achieve Pricing Advantage

Competition & Market Dynamics

Sharper ratemaking can provide a competitive advantage for increased market share and higher margins

Rating uncovers hidden risk characteristics

So that we can group risks and price effectively

[Figure: Claim Size (vertical axis) against Claim Frequency (horizontal axis)]

Why we need to rate more accurately

[Figure: Current Premium Subsidies, on a scale from -50% to +50%. Labels: Subsidised, Subsidising, Unprofitable, Risk being undercut]

Modelling the potential benefits

Policyholder behaviour and market dynamic model

Needs plenty of assumptions

• Price sensitivity
• Information accessibility
• Loyalty

And estimates

• Distribution of policyholders by risk
• Competitive response
• Pricing accuracy

“all models are wrong… but some are useful.”

Price Sensitivity

• If Price Sensitivity is 100%, policyholders always choose the cheapest alternative
• If Price Sensitivity is 0%, policyholders ignore price in their decision

Can be estimated

• Market Research and Surveys
• Analysis of offer accept and decline statistics

Affects speed of change in market share and equilibrium level

Information Accessibility

• If Information Accessibility is 0%, then Price Sensitivity is irrelevant
• If Information Accessibility is less than 100%, then the effect of Price Sensitivity is dampened

Can be changed through advertising

Affects speed of change in market share and equilibrium level

Customer Loyalty

• If Loyalty is 100%, no existing clients ever change insurer
• If Loyalty is 0%, then at renewal the current insurer has no advantage

Analysis of offer accept and decline statistics

Affects speed of change in market share but not the eventual equilibrium level
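The way the three behavioural assumptions interact can be illustrated with a toy projection. This is only a sketch under assumptions that are not in the presentation: two insurers, one of which is always cheaper, and a function name (`project_market_share`) and update rule invented for illustration. It is not the model behind the projections shown in these slides.

```python
def project_market_share(initial_share, price_sensitivity, info_access,
                         loyalty, years=10):
    """Toy projection of the cheaper insurer's market share (two insurers).

    All four inputs are fractions between 0 and 1. The update rule is an
    illustrative assumption, not the model used in the presentation.
    """
    # A shopper is price-driven only if they can see prices AND care about them.
    price_driven = price_sensitivity * info_access
    # Price-driven shoppers pick the cheaper insurer; the rest split evenly.
    share_of_shoppers = price_driven + (1.0 - price_driven) * 0.5
    shares = [initial_share]
    for _ in range(years):
        # Loyal clients stay with their insurer; everyone else shops at renewal.
        stay = loyalty * shares[-1]
        shares.append(stay + (1.0 - loyalty) * share_of_shoppers)
    return shares


# Hypothetical inputs: same sensitivity and accessibility, two loyalty levels.
high_loyalty_path = project_market_share(0.05, 0.6, 0.5, loyalty=0.9)
low_loyalty_path = project_market_share(0.05, 0.6, 0.5, loyalty=0.5)
```

In this sketch loyalty changes only how quickly the share moves, while price sensitivity and information accessibility set the level it settles at, which mirrors the qualitative statements on the slides.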

Competitive Response

No competitive response assumed for following projections

• Not realistic
• Overstates the results

Difficult to quantitatively assess competitive response

• Prefer to model knowing it’s incorrect
• Rather than model something which we don’t understand

Some actual results

[Figure: Projected Market Share (0% to 100%) against Years in projection (1 to 10)]

[Figure: Projected Profitability (assuming no competitive reaction). Operational Result (0% to 16%) against Years in projection (1 to 10)]

Seeing the light

A South African insurer that started in 1998 now has over 10% market share and higher-than-average profitability, using advanced ratemaking

Top 20 US insurers all using multivariate techniques

Even more prevalent in UK

What makes good rates?

A good ratemaking process will generate rates that achieve specific objectives

Choosing Rating Objectives

Accuracy

Ease and cost of administration

Flexibility of approach

Statistical efficiency of estimates

Consistency over time

Objectivity and freedom from manipulation

Transparency

Consistent with other corporate objectives

Correlation, causation and predictive value

Correlation does not imply causation

• A vehicle accident or restaurant fire last year does not cause the accident or fire this year
• A prior accident or fire claim is a useful predictor of future claims

Correlation can have predictive value even if there is no causation

• Rating factors versus risk factors
• Understanding the cause is preferable, but not required in order to use the factor

Introduction to statistical ratemaking

Scientific ratemaking doesn’t have to be complex

Univariate Analysis

Consider one rating factor at a time

• Age
• Gender
• Vehicle Make
• Property type

Simple analysis can be performed in a spreadsheet

Can be performed on

• Loss Ratios
• Pure Premiums

Loss Ratio Analysis

Consider differences in Loss Ratios between different rating factors or groups

Data requirements

• Premium and claim amounts for each policy
• Premiums for prior periods must be re-rated to be consistent with current rating standards
• May require re-underwriting if underwriting approach has changed
• Rating factors associated with each policy

New rates are an adjustment to old rates

Loss Ratio Analysis example
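A loss ratio analysis of this kind might be sketched as follows. The policy records, the group names, the 60% target loss ratio and the function `loss_ratio_adjustments` are all hypothetical, and the premiums are assumed to have already been re-rated to the current rate level.

```python
from collections import defaultdict


def loss_ratio_adjustments(policies, target_loss_ratio):
    """Return {group: (loss_ratio, rate_adjustment)} from policy records."""
    premium = defaultdict(float)
    claims = defaultdict(float)
    for policy in policies:
        premium[policy["group"]] += policy["on_level_premium"]
        claims[policy["group"]] += policy["claims"]
    result = {}
    for group in premium:
        loss_ratio = claims[group] / premium[group]
        # New rate = old rate x (observed loss ratio / target loss ratio).
        result[group] = (loss_ratio, loss_ratio / target_loss_ratio)
    return result


# Hypothetical data; premiums are assumed to be on the current rate level.
policies = [
    {"group": "under 25", "on_level_premium": 1000.0, "claims": 600.0},
    {"group": "under 25", "on_level_premium": 1000.0, "claims": 800.0},
    {"group": "25 and over", "on_level_premium": 2000.0, "claims": 1000.0},
]
adjustments = loss_ratio_adjustments(policies, target_loss_ratio=0.6)
for group, (loss_ratio, factor) in adjustments.items():
    print(f"{group}: loss ratio {loss_ratio:.0%}, multiply old rate by {factor:.3f}")
```

The output is an adjustment factor per group rather than an absolute rate, which is why this approach needs an existing rating basis.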

Pure Premium Analysis

The Pure Premium is the absolute premium rate applied to the exposure

Frequency & Severity estimated separately

$$
\text{Pure Premium}
= \frac{\text{Total Claim Amount}}{\text{Exposure}}
= \frac{\text{Number of Claims}}{\text{Exposure}} \times \frac{\text{Total Claim Amount}}{\text{Number of Claims}}
= \text{Claim Frequency} \times \text{Claim Severity}
$$

Pure Premium example
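As a small worked illustration of splitting the pure premium into frequency and severity, the sketch below uses a single hypothetical rating cell. The figures and the function name `pure_premium` are invented for the example.

```python
def pure_premium(exposure, claim_count, total_claim_amount):
    """Return (frequency, severity, pure premium) for one rating cell."""
    frequency = claim_count / exposure              # claims per unit of exposure
    # Average cost per claim; zero if the cell has no claims.
    severity = total_claim_amount / claim_count if claim_count else 0.0
    return frequency, severity, frequency * severity


# Hypothetical cell: 500 vehicle-years, 40 claims, 320,000 total claim cost.
frequency, severity, premium = pure_premium(500.0, 40, 320_000.0)
print(f"frequency {frequency:.2%}, severity {severity:,.0f}, pure premium {premium:,.0f}")
```

The product of the two components equals total claims divided by exposure, so nothing is lost by estimating them separately.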

Pure Premium Analysis

Based on exposures rather than premiums

• Doesn’t require re-rating of past premiums to current on-level premiums
• Analysis isn’t affected by special or negotiated rates
• Can be used for a new line without an existing basis

Requires a coherent exposure measure

• Unique or idiosyncratic risks not well suited

Advanced Scientific Ratemaking

Multivariate, robust, accurate, flexible, understood and competitive ratemaking using GLM

Revisiting correlation, causation and predictive value

Correlation does not imply causation

Correlation can have predictive value even if there is no causation

Correlation is a measure of linear relationships

• Not all relationships of interest are linear

Correlation doesn’t guarantee low variance of prediction

Correlation is neither sufficient nor necessary to be predictive

Univariate analysis is flawed

[Figure: Comparison of GLM and Univariate Rates. Standardised Premium Rates (0.00 to 3.00) for 18 rating cells, MVMSP to MVFRM. Series: GLM, Univariate]

Multiple Regression

[Figure: Multiple Regression Example. Pure Premium Indicated (-10 to 60) against Rating Factor Level (0 to 120). Series: Expected Cost, Category 1, Category 2]

$$
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2)
$$

Problems with Multiple Regression

Theoretical and practical problems

• Multiple Regression assumes normal distribution of residuals
• Common actual distributions are Poisson for frequency and Gamma for severity
• Multiple Regression can lead to frequency and severity estimates less than 0

Can still be useful

• Widely understood technique
• Can even do analysis in a spreadsheet
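The point about estimates falling below zero can be seen with very little data. The sketch below fits an ordinary least squares line with a single rating factor (the simplest case of multiple regression) to hypothetical claim frequencies by age band. The data and the helper `fit_line` are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical observed claim frequencies by age band (1 = youngest).
age_band = [1, 2, 3, 4]
observed_frequency = [0.30, 0.15, 0.08, 0.02]

intercept, slope = fit_line(age_band, observed_frequency)
# Extrapolating the straight line to an older band gives a negative frequency.
predicted_band_5 = intercept + slope * 5
print(f"predicted frequency for band 5: {predicted_band_5:.3f}")
```

A straight line has no way of knowing that a frequency cannot be negative, which is one motivation for modelling on a transformed scale.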

GLM: Some evidence

GLM is used widely and successfully in insurance around the world

Part of ultra-competitive UK non-life industry for many years

Growing quickly in the US since 1990s introduction

• As pricing restrictions are reduced, competition has increased
• Top 20 insurers all use GLM now
• Casualty Actuarial Society heavily involved

Expensive software available, and in demand

GLM: Basic concepts

GLM is still a form of linear modelling

Generalised to allow for different distributions

• And technically, we need iterative estimation techniques
• But practically this has little effect given current computing power

And making use of transformations of the dependent variable

GLM: The Link Function

g is known as the Link Function and links the expected value of the dependent variable to a linear combination of risk factors or exposure measures

Can choose any monotonic link function

But natural or canonical links exist for each distribution of the dependent variable

$$
\mathrm{E}[Y] = \mu, \qquad g(\mu) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n
$$
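To make the role of the link function g concrete, the sketch below defines four common monotonic links together with their inverses and applies the inverse link to a linear combination of factors. The dictionary `LINKS`, the function `expected_value` and the coefficients are hypothetical names and values chosen for illustration.

```python
import math

# Each entry is (link g, inverse link g^-1).
LINKS = {
    "log": (math.log, math.exp),
    "inverse": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "identity": (lambda mu: mu, lambda eta: eta),
}


def expected_value(link_name, coefficients, factors):
    """Apply the inverse link to the linear predictor b0 + b1*x1 + ..."""
    _, inverse_link = LINKS[link_name]
    eta = coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], factors))
    return inverse_link(eta)


# Hypothetical log-link model: base frequency 10%, young drivers 1.5 times that.
coefficients = [math.log(0.10), math.log(1.5)]
young_driver_frequency = expected_value("log", coefficients, [1])
print(f"{young_driver_frequency:.3f}")
```

With a log link the coefficients act multiplicatively on the expected value, which is why log-link models lend themselves to tables of rating relativities.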

[Figure: Log Link, Inverse Link, Logit Link and Unit Link (vertical axis 0.00 to 35.00, horizontal axis -4.00 to 4.00)]

Modelling Claim Frequency with GLM

[Figure: Poisson Probability for claim frequency, for claim counts 0 to 29 (probabilities 0.000 to 0.160)]

Canonical Link is Log Link

• Scales mean and variance to be constant
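A minimal sketch of fitting a Poisson claim frequency model with a log link is given below, using Newton-Raphson iterations (which coincide with iteratively reweighted least squares for a canonical link). It assumes one rating factor plus an intercept and uses exposure as an offset. The data and the function name `fit_poisson_log_link` are hypothetical, and a real analysis would normally rely on established statistical software.

```python
import math


def fit_poisson_log_link(x, claims, exposure, iterations=25):
    """Fit log(E[claims]) = log(exposure) + b0 + b1 * x by Newton-Raphson."""
    # Start from the overall claim frequency with no effect for x.
    b0, b1 = math.log(sum(claims) / sum(exposure)), 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ei in zip(x, claims, exposure):
            mu = ei * math.exp(b0 + b1 * xi)   # fitted claim count
            g0 += yi - mu                      # gradient of the log-likelihood
            g1 += (yi - mu) * xi
            h00 += mu                          # Fisher information (2 x 2)
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        # Solve the 2 x 2 system: information * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: indicator for drivers under 25, claim counts, vehicle-years.
age_under_25 = [0, 0, 1, 1]
claims = [10, 12, 30, 28]
exposure = [100.0, 120.0, 100.0, 100.0]

b0, b1 = fit_poisson_log_link(age_under_25, claims, exposure)
base_frequency = math.exp(b0)   # fitted frequency for drivers aged 25 and over
relativity = math.exp(b1)       # multiplicative loading for drivers under 25
print(f"base frequency {base_frequency:.3f}, relativity {relativity:.2f}")
```

With a single two-level factor the fitted values simply reproduce the observed frequency in each group, which gives a convenient check on the iterations.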

Modelling Claim Severity with GLM

[Figure: Gamma Probability for claim severity (probabilities 0.000 to 0.080)]

Canonical Link is Inverse Link
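For severity, a comparable sketch with a Gamma distribution and the inverse link is shown below, fitted by iteratively reweighted least squares. The claim amounts, the single two-level factor and the function name `fit_gamma_inverse_link` are assumptions for illustration only.

```python
def fit_gamma_inverse_link(x, severity, iterations=25):
    """Fit 1 / E[severity] = b0 + b1 * x by iteratively reweighted least squares."""
    # Start from the overall mean severity with no effect for x.
    b0, b1 = len(severity) / sum(severity), 0.0
    for _ in range(iterations):
        sw = swx = swxx = swz = swxz = 0.0
        for xi, yi in zip(x, severity):
            eta = b0 + b1 * xi                   # linear predictor
            mu = 1.0 / eta                       # fitted mean severity
            weight = mu * mu                     # IRLS weight for Gamma, inverse link
            z = eta - (yi - mu) / (mu * mu)      # working response
            sw += weight
            swx += weight * xi
            swxx += weight * xi * xi
            swz += weight * z
            swxz += weight * xi * z
        det = sw * swxx - swx * swx
        # Weighted least squares solution for the two coefficients.
        b0, b1 = (swxx * swz - swx * swxz) / det, (sw * swxz - swx * swz) / det
    return b0, b1


# Hypothetical claim amounts for two vehicle groups (0 = standard, 1 = luxury).
vehicle_group = [0, 0, 1, 1]
claim_amounts = [1000.0, 1400.0, 2000.0, 2600.0]

b0, b1 = fit_gamma_inverse_link(vehicle_group, claim_amounts)
standard_severity = 1.0 / b0
luxury_severity = 1.0 / (b0 + b1)
print(f"standard {standard_severity:,.0f}, luxury {luxury_severity:,.0f}")
```

In practice a log link is also often used with Gamma severities so that the factors combine multiplicatively; the inverse link is used here only because it is the canonical choice named on the slide.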

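Example Frequency function

Vehicle accident damage

Candidate models for the expected claim frequency $\mu$:

1. $g(\mu) = f(\text{Driver Age})$
2. $g(\mu) = f(\text{Gender})$
3. $g(\mu) = f(\text{Driver Age}) + f(\text{Gender})$
4. $g(\mu) = f(\text{Driver Age}) + f(\text{Gender}) + f(\text{Driver Age} \times \text{Gender})$
5. ?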
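The possible models discussed in this example (Age; Gender; Age and Gender; Age, Gender and Interaction; Age, Gender, Age-squared plus interaction terms) differ only in which terms enter the linear predictor. The sketch below builds those terms for one driver. The function `design_row` is hypothetical, and reading "all interaction terms" as the gender interactions with Age and Age-squared is an assumption.

```python
def design_row(age, is_male, model):
    """Return the linear-predictor terms for one driver under models 1 to 5."""
    gender = 1.0 if is_male else 0.0
    terms = {
        1: [1.0, age],                                   # Age
        2: [1.0, gender],                                # Gender
        3: [1.0, age, gender],                           # Age and Gender
        4: [1.0, age, gender, age * gender],             # plus interaction
        5: [1.0, age, gender, age ** 2,                  # plus Age-squared and
            age * gender, age ** 2 * gender],            # its gender interaction
    }
    return terms[model]


# A 30-year-old male driver under each candidate model.
for model in range(1, 6):
    print(model, design_row(30, True, model))
```

Each extra term adds one coefficient to estimate, so the richer models need to earn their place through better fit.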

[Figure: Vehicle Accident Frequencies per Mille. Series: Male Observed, Female Observed]

Plot of Frequency Function

Observed Data

[Figure: Vehicle Accident Frequencies per Mille (0 to 180), horizontal axis 18 to 74. Series: Frequency]

Possible Model 1: Age

[Figure: Vehicle Accident Frequencies per Mille. Series: Male Observed, Female Observed]

[Figure: Vehicle Accident Frequencies per Mille. Series: Male, Female]

Possible Model 2: Gender

[Figure: Vehicle Accident Frequencies per Mille (0 to 200), horizontal axis 18 to 74. Series: Male, Female]

Possible Model 3: Age and Gender

[Figure: Vehicle Accident Frequencies per Mille (0 to 250), horizontal axis 18 to 74. Series: Male, Female]

Possible Model 4: Age, Gender, and Interaction

[Figure: Vehicle Accident Frequencies per Mille. Series: Male, Female]

Selected Model 5: Age, Gender, Age-squared + all interaction terms

[Figure: Vehicle Accident Frequencies per Mille. Series: Male Predicted, Female Predicted, Male Observed, Female Observed]

Final fitted model 5: Age, Gender, Age-squared + all interaction terms

Plot of Frequency Function: Age, Gender, Age-squared + all interaction terms

[Figure: Vehicle Accident Frequencies per Mille (0 to 180), horizontal axis 18 to 74. Series: Frequency]

[Figures: four further panels of Vehicle Accident Frequencies per Mille, each with Male and Female series]

Now we need to add the other 15 rating factors!

The role of skill, experience and judgement

Critically important in model selection and validation

No time to cover in detail here

• Model Selection: Forward, Backwards, Hierarchical and All Subsets
• Model criteria: Likelihood, Deviance, Akaike Information Criterion
• Diagnostic tests: Residual plots, serial correlation, constant variance
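The model criteria named above can be computed directly once fitted values are available. The sketch below does so for a Poisson frequency model. The observed counts and the two sets of fitted values are hypothetical, and the function names are chosen for illustration.

```python
import math


def poisson_log_likelihood(observed, fitted):
    """Log-likelihood of observed claim counts given fitted Poisson means."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1)
               for y, mu in zip(observed, fitted))


def poisson_deviance(observed, fitted):
    """Twice the log-likelihood gap between the saturated and fitted models."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (term - (y - mu))
    return total


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: lower is better."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


# Hypothetical claim counts and fitted values from two competing models.
observed = [10, 12, 30, 28]
fitted_one_parameter = [20.0, 20.0, 20.0, 20.0]    # intercept only
fitted_two_parameters = [11.0, 11.0, 29.0, 29.0]   # intercept plus one factor

aic_one = aic(poisson_log_likelihood(observed, fitted_one_parameter), 1)
aic_two = aic(poisson_log_likelihood(observed, fitted_two_parameters), 2)
print(f"AIC one parameter {aic_one:.2f}, two parameters {aic_two:.2f}")
```

AIC penalises each additional parameter, so the richer model is preferred only when its improvement in likelihood outweighs that penalty.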

Deriving premium tables

[Figure: Poisson Probability for claim frequency]

[Figure: Gamma Probability for claim severity]

Combine estimates of frequency and severity into modelled pure premium for each rating group

Make adjustments if necessary

• Legal minima, maxima
• Restrict large changes
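Combining the two model outputs and applying the adjustments might look like the sketch below. The rating groups, the limits, the 25% cap on changes and the function `premium_table` are hypothetical, and applying the change cap before the minimum and maximum is one possible ordering rather than a prescribed one.

```python
def premium_table(frequency, severity, current_premium,
                  minimum=None, maximum=None, max_change=None):
    """Combine modelled frequency and severity into adjusted premiums per group."""
    table = {}
    for group in frequency:
        premium = frequency[group] * severity[group]   # modelled pure premium
        if max_change is not None:
            # Restrict large changes relative to the premium currently charged.
            low = current_premium[group] * (1.0 - max_change)
            high = current_premium[group] * (1.0 + max_change)
            premium = min(max(premium, low), high)
        if minimum is not None:
            premium = max(premium, minimum)            # legal or commercial floor
        if maximum is not None:
            premium = min(premium, maximum)            # legal or commercial ceiling
        table[group] = premium
    return table


# Hypothetical modelled values for two rating groups.
frequency = {"A": 0.10, "B": 0.25}
severity = {"A": 8000.0, "B": 10000.0}
current_premium = {"A": 700.0, "B": 1500.0}

table = premium_table(frequency, severity, current_premium,
                      minimum=500.0, maximum=5000.0, max_change=0.25)
print(table)
```

Group B's modelled premium of 2,500 is held to 1,875 by the cap on changes, so the full indicated rate would be reached over more than one renewal.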

Adjusting rates for risk

Once we have derived pure premiums based on detailed analysis of multivariate effects on severity and frequency, we need a method to allow for systematic risk

Problems of variability

Accepted that greater risk requires greater reward

Not all lines of business are equal in risk to insurer

Diversification is imperfect

• non-homogeneous claims
• systematic estimation error
• large claims
• correlations and concentrations of risk

Pricing for variability

Low reinsurance retentions can reduce these problems

• But unless reinsured 100%, some risk remains

Historically, insurers have priced using:

• judgement, heuristics and rules of thumb
• proportional to portfolio variance

Current best practice is pricing relative to:

• Systematic contribution to risk
• Economic Capital requirements

Not just ratemaking

Similar techniques can be used in other business-critical areas of general insurance, life insurance and banking

Credit Scoring Models

Credit Scoring in banking analogous to ratemaking in non-life insurance

• Fundamental to business
• Source of significant competitive advantage

Generalised Linear Models

• Can perform independent check on current approach
• Replace existing methods
• Early warning for deteriorating credit
• Easier interpretation than discriminant models

Cross-selling Opportunities

Financial Services Groups can increase market share and profitability through increased cross-selling

Typically follow:

• a scatter approach; or
• time-intensive, inaccurate judgement approach

Little analysis on effectiveness of tactics

GLM can:

• improve hit rate, limit customer annoyance and lower costs
• provide feedback to improve future targeting

Renewal and Acquisition Modelling

Applications of GLM to new sales and renewal pricing decisions

Interacts with the market dynamics models introduced earlier

Understand what factors drive renewal probabilities and acceptance of offers for new policies

Optimise pricing:

• increase margins with limited impact on market share
• profitably increase market share

Renewal and Acquisition Rating Factors

Can use existing rating factors as a starting point

• E.g. age, age of vehicle, make and model of vehicle, vehicle use, marital status, vehicle ownership status

But also additional factors

• Absolute size of current premium
• Time since policy written / number of times renewed
• Competitors’ premiums
• Actual vs indicated premium
• Level of No Claims Discount / Bonus Malus
• Distribution channel
• Economic conditions (inflation, interest rates, GDP)

Renewal and Acquisition Price Sensitivity / Elasticity

Most importantly, proposed increase in premium

In original policyholder behaviour model we had to assume a level of price sensitivity

Now we can estimate this, scientifically, for different types of policyholders

• Measure what premium can be tolerated with a given probability of cancellation
• And differentiate between different types of policyholders
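One way to picture this is a logit-link model of cancellation probability as a function of the proposed premium increase, inverted to find the increase that keeps cancellations within a chosen limit. The sketch below assumes such a model has already been fitted. The coefficients, the policyholder segments and the function names are hypothetical.

```python
import math


def cancellation_probability(premium_increase, intercept, slope):
    """Probability of cancellation under a logit-link model of the increase."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * premium_increase)))


def tolerable_increase(max_cancellation_probability, intercept, slope):
    """Largest premium increase that keeps cancellations at or below the limit."""
    p = max_cancellation_probability
    logit = math.log(p / (1.0 - p))
    return (logit - intercept) / slope


# Hypothetical fitted coefficients for two types of policyholder.
segments = {
    "price-sensitive": {"intercept": -2.0, "slope": 8.0},
    "less sensitive": {"intercept": -2.5, "slope": 3.0},
}
limits = {name: tolerable_increase(0.20, **coefs) for name, coefs in segments.items()}
for name, increase in limits.items():
    print(f"{name}: premium increase of up to {increase:.1%} for 20% cancellations")
```

The same 20% cancellation limit translates into very different tolerable increases for the two segments, which is the differentiation between types of policyholder described above.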

What about Neural Networks?

Artificial Neural Networks are mathematical models

• Can be used to build predictive models
• Based on many interconnected artificial neurons
• Use training algorithms to teach the network how to predict claims

Less widely known and understood than GLM

• Are in fairly wide use in predictive modelling outside insurance
• Black-box limitations

Making it happen

Theory is only useful when it is implemented effectively and efficiently.

That means Systems & Data.

Steps to Make It Happen

1. Commit to investigate advanced ratemaking
2. Select a Champion to drive the project
   • Sufficiently senior to ensure progress
3. Assemble project team
   • Including marketing, underwriting, actuarial, IT, legal
4. Generate possible rating factor ideas
5. Select or build database to store relevant data
6. Build GLM rating systems and select models
7. Test results in the market

The Need for Systems

Analysis of offer accept and decline statistics

Store accurate exposures and claims

Start storing information for future analysis

• Need data to assess rating factors
• Can’t only store for factors we know are important

Need quick, reliable analysis

Build near-permanent competitive advantage

Typical analysis requirements

Some large companies will interrogate 200 or 300 possible variables

• Large data sets
• Robust systems to store data and manage model selection

Need to separate exposures and claims between:

• Perils
• Rating factors (used and possible)
• With accurate dates

Database requires robust controls to maintain data integrity and ease of extraction

Available GLM Software

Three primary choices:

1. Purpose-built GLM software for insurance applications
2. Leverage proprietary statistical software
3. Leverage open-source statistical software

Purpose-built insurance GLM software

• Supports key applications to insurance out of the box
• Need to import data in correct format
• Expensive licences
• Reliant on service provider
• But service provider can provide useful, directly relevant support
• Requires specialised, mobile skills

Proprietary statistical software

• Requires more initial effort to access required functionality
• Expensive licences
• Reliant on service provider
• Service provider can provide generic support
• Requires fairly widespread skills

Open-source statistical software

• Requires more initial effort to access required functionality
• No licence fees
• Codebase and algorithms peer reviewed and extensively used in academia and business
• Plenty of resources available, but limited paid support
• Requires fairly widespread skills

Common Pitfalls

Every technique can be misused.

People Problems

Failing to get full buy-in from key stakeholders

• Ordinary change management problems
• Can develop prototype project
• Specific Objectives
• Specific Resources

Need vision to recognise long-term imperative

Past success can be an inhibitor to success

Relying too heavily on pre-analysis

• Multivariate techniques give different results!

[Figures repeated from earlier slides: Current Premium Subsidies; Projected Profitability (assuming no competitive reaction); Comparison of GLM and Univariate Rates]

Application Problems 1

Not performing sufficient pre-analysis

Using loss ratio analysis

• Very common mistake, even with large companies
• Don’t have standard distributions for loss ratios
• Need to adjust data to current rate-level
• Difficult to use industry experience to perform sense checks

Application Problems 2

Modelling raw pure premiums for all coverages directly rather than modelling at the component level

• Theoretical problems
• Awkward, bi-modal distributions
• Inferior practical results

Not performing sufficient diagnostic testing on the fitted models while selecting the best model

Treating the predictive model as a black box

Not Going Far Enough

Restricting analysis to variables and groupings in the current rating algorithm

• Convenient because likely to have data available
• Significant advantages to casting net wider
• New rating factors

Limiting the use of GLMs to risk models

• Same technology, skills can be used for other predictive modelling
• Renewal modelling can be as valuable as risk models
• Extend to credit scoring in affiliated banks?

Achieving Pricing Advantage
