BIG DATA: An actuarial perspective

Information Paper

November 2015


Table of Contents

1 Introduction
2 Introduction to Big Data
  2.1 Introduction and characteristics
  2.2 Big Data techniques and tools
  2.3 Big Data applications
  2.4 Data driven business
3 Big Data in the insurance value chain
  3.1 Insurance underwriting
  3.2 Insurance pricing
  3.3 Insurance reserving
  3.4 Claims management
4 Legal aspects of Big Data
  4.1 Introduction
  4.2 Data processing
  4.3 Discrimination
5 New frontiers
  5.1 Risk pooling vs. personalization
  5.2 Personalised premium
  5.3 From insurance to prevention
  5.4 The all-seeing insurer
  5.5 Change in insurance business
6 Actuarial sciences and the role of actuaries
  6.1 What is Big Data bringing for the actuary?
  6.2 What is the actuary bringing to Big Data?
7 Conclusions
8 References


1 Introduction

The Internet started in 1984, linking about 1,000 university and corporate labs. By 1998 it had grown to 50 million users, and in 2015 it reached 3.2 billion people (44% of the global population). This enormous growth in users has been accompanied by an explosion of the data we all produce. Every day we create around 2.5 quintillion bytes of data, coming from sources as varied as social media sites, gadgets, smartphones, intelligent homes and cars, and industrial sensors, to name a few. Any company that can combine various datasets and apply effective data analytics will be able to become more profitable and successful. According to a recent report¹, 400 large companies that adopted Big Data analytics "have gained a significant lead over the rest of the corporate world." Big Data offers big business gains, but it also has hidden costs and complexity that companies will have to struggle with. Semi-structured and unstructured big data require new skills, and there is a shortage of people who have mastered data science: people who can handle mathematics, statistics and programming, and who possess substantive domain knowledge.

What will be the impact on the insurance sector and the actuarial profession? The concepts of Big Data and predictive modelling are not new to insurers, who have long been storing and analysing large quantities of data to achieve deeper insights into customers' behaviour or to set insurance premiums. Moreover, actuaries are the data scientists of insurance: they have the statistical training and analytical thinking needed to understand the complexity of data, combined with business insight. We look closely at the insurance value chain and assess the impact of Big Data on underwriting, pricing and claims reserving. We examine the ethics of Big Data, including data privacy, customer identification, data ownership and the legal aspects. We also discuss new frontiers for insurance and the impact on the actuarial profession. Will actuaries be able to leverage Big Data, create sophisticated risk models and more personalized insurance offers, and bring a new wave of innovation to the market?

2 Introduction to Big Data

2.1 Introduction and characteristics

Big Data broadly refers to data sets so large and complex that they cannot be handled by traditional data processing software. It can be characterized by the following attributes:

a. Volume: in 2012 it was estimated that 2.5 × 10¹⁸ bytes of data were created worldwide every day, equivalent to a stack of books from the Sun to Pluto and back again. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, software logs, GPS signals from mobile devices, among others.

b. Variety and Variability: the challenges of Big Data do not only arise from the sheer volume of data but also from the fact that data is generated in multiple forms, as a mix of unstructured and structured data, and as a mix of data at rest and data in motion (i.e. static and real-time data). Furthermore, the meaning of data can change over time or depend on the context. Structured data is organized in a way that both computers and humans can read, for example information stored in traditional databases. Unstructured data refers to data types such as images, audio, video, social media and other information that is not organized or easily interpreted by traditional databases. It includes data generated by machines such as sensors, web feeds, networks or service platforms.

c. Visualization: the insights gained by a company from analysing data must be shared in a way that is efficient and understandable to the company's stakeholders.

d. Velocity: data is created, saved, analysed and visualized at an increasing speed, making it possible to analyse and visualize high volumes of data in real time.

e. Veracity: it is essential that the data is accurate in order to generate value.

f. Value: the insights gleaned from Big Data can help organizations deepen customer engagement, optimize operations, prevent threats and fraud, and capitalize on new sources of revenue.

¹ http://www.bain.com/publications/articles/big_data_the_organizational_challenge.aspx


2.2 Big Data techniques and tools

The Big Data industry has been supported by the following technologies:

a. The Apache Hadoop software library reached its 1.0 release in December 2011 and is an open source framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from a single machine to thousands of machines, each offering local computation and storage. The library is designed under the fundamental assumption that hardware failures are common: it automatically detects and handles such failures so that the services provided by a computer cluster stay available even when the cluster is affected by hardware failures. A wide variety of companies and organizations use Hadoop for both research and production: web-based companies that own some of the world's biggest data warehouses (Amazon, Facebook, Google, Twitter, Yahoo!, ...), media groups and universities, among others. A list of Hadoop users and systems is available at http://wiki.apache.org/hadoop/PoweredBy.

b. Non-relational databases have existed since the late 1960s but resurfaced around 2009 (under the moniker of Not Only SQL, or NoSQL) as it became clear that they are especially well suited to handling the Big Data challenges of volume and variety, and as they fit neatly within the Apache Hadoop ecosystem.

c. Cloud Computing is a kind of internet-based computing where shared resources and information are provided to computers and other devices on demand (Wikipedia). A service provider offers computing resources for a fixed price, available online and in general with a high degree of flexibility and reliability. These technologies were created by major online actors (Amazon, Google), followed by other technology providers (IBM, Microsoft, Red Hat). There is a wide variety of architectures (public, private and hybrid cloud), all with the objective of making computing infrastructure a commodity asset with the best quality/total-cost-of-ownership ratio. Having a nearly infinite amount of computing power at hand with high flexibility is a key success factor for Big Data initiatives.

d. Mining Massive Datasets is a set of methods, algorithms and techniques that can be used to deal with Big Data problems, in particular with volume, variety and velocity issues. PageRank can be seen as a major step (see http://infolab.stanford.edu/pub/papers/google.pdf) and its evolution towards the MapReduce approach (https://en.wikipedia.org/wiki/MapReduce) is definitely a breakthrough; a minimal single-machine illustration of the map-reduce idea follows this list. Social Network Analysis is becoming an area of research in itself that aims to extract useful information from the massive amount of data that social networks provide. These methods are very well suited to running on software such as Hadoop in a Cloud Computing environment.

e. Social Networks are a source of Big Data that provides a stream of data with huge value for almost all economic (and even non-economic) actors. For most companies, it is the very first time in history that they are capable of interacting directly with their customers. Many Big Data applications make use of these data to provide enhanced services and products and to increase customer satisfaction.
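To make the MapReduce idea mentioned in point d. concrete, below is a minimal, single-process sketch in Python of the classic word-count example; in a real deployment a framework such as Hadoop would distribute the map and reduce phases across many machines, whereas here they simply run in one loop over a tiny, made-up document list.

```python
from collections import defaultdict
from itertools import chain

documents = [
    "big data needs new tools",
    "actuaries analyse big data",
]

# Map phase: each document is turned into (word, 1) pairs independently,
# which is what makes the work easy to spread over many machines.
def map_phase(doc):
    return [(word, 1) for word in doc.split()]

mapped = list(chain.from_iterable(map_phase(d) for d in documents))

# Shuffle phase: group all pairs by key (the word).
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: aggregate the values for each key.
word_counts = {word: sum(counts) for word, counts in groups.items()}

print(word_counts)  # e.g. {'big': 2, 'data': 2, ...}
```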

2.3 Big Data applications

Big Data has the potential to change the way academic institutions, corporations and other organizations conduct business, and to change our daily life. Notable examples of Big Data applications include:

a. Healthcare: Big Data technologies will have a major impact in healthcare. IBM estimates that 80% of medical data is unstructured yet clinically relevant. Furthermore, medical data resides in multiple places such as individual medical files, lab and imaging systems, physician notes and medical correspondence. Big Data technologies allow healthcare organizations to bring all the information about an individual together to gain insights on how to manage care coordination, outcomes-based reimbursement models, patient engagement and outreach programs.

b. Retail: retailers can get insights for personalizing marketing and improving the effectiveness of marketing campaigns, for optimizing assortment and merchandising decisions, and for removing inefficiencies in distribution and operations. For instance, several retailers now incorporate Twitter streams into their analysis of loyalty-program data. The insights gained make it possible to plan for surges in demand for certain items and to create mobile marketing campaigns targeting specific customers with offers at the times of day they would be most receptive to them.²

c. Politics: Big Data technologies will improve efficiency and effectiveness across the broad range of government responsibilities. A well-known example of Big Data use in politics was Barack Obama's analytics- and metrics-driven 2012 presidential campaign [1]. Other examples include:

i. threat and crime prediction and prevention: for instance, the Detroit Crime Commission has turned to Big Data in its effort to assist the government and citizens of southeast Michigan in the prevention, investigation and prosecution of neighbourhood crime;³

ii. detection of fraud, waste and errors in social programs;

iii. detection of tax fraud and abuse.

d. Cyber risk prevention: companies can analyse data traffic in their computer networks in real time to detect anomalies that may indicate the early stages of a cyber attack. Research firm Gartner estimates that by 2016 more than 25% of global firms will have adopted big data analytics for at least one security or fraud detection use case, up from 8% in 2014.⁴

e. Insurance fraud detection: insurance companies can determine a score for each claim in order to target for fraud investigation the claims with the highest scores, i.e. the ones that are most likely to be fraudulent. Fraud detection is treated in section 3.4.

f. Usage-Based Insurance: an insurance scheme in which car insurance premiums are calculated based on dynamic, causal data, including actual usage and driving behaviour. Telematics data transmitted from a vehicle, combined with Big Data analytics, enables insurers to distinguish cautious drivers from aggressive drivers and to match the insurance rate to the actual risk incurred.

2.4 Data driven business

The quantity of data in the world is increasing steeply month after month. Some argue it is time to organize and use this information: data must now be viewed as a corporate asset. In order to respond to this transformation of business culture, two specific C-level roles have appeared in the past years in the banking and insurance industries.

2.4.1 The Chief Data Officer

The Chief Data Officer (CDO) is the first architect of this "data-driven business". In a coordinating role, the CDO is in charge of the data that drive the company, by:

• defining and setting up a strategy to guarantee their quality, reliability and coherency;
• organizing and classifying them;
• making them accessible to the right person at the right moment, for the pertinent need and in the right format.

The Chief Data Officer therefore needs a strong business background to understand how the business runs. The following question then emerges: to whom should the CDO report? In some firms, the CDO is considered part of IT and reports to the CTO (Chief Technology Officer); in others, he or she holds more of a business role, reporting to the CEO. It is therefore up to the company to decide, as no two companies are exactly alike from a structural point of view.

Which companies already have a CDO? Generali Group appointed someone to this newly created position in June 2015. Other companies, such as HSBC, Wells Fargo and QBE, had already appointed someone to the position in 2013 or 2014. Even Barack Obama appointed a Chief Data Officer/Scientist during his 2012 campaign, and the metrics-driven decision making of that campaign played a big role in his re-election. In the beginning, most of the professionals holding the actual job title "Chief Data Officer" were located in the United States. After a while, Europe followed the move. Also, many people did the job in their day-to-day work without necessarily holding the title. Many analysts in the financial sector believe that more insurance and banking companies will have to make the move in the coming years if they want to remain attractive.

² http://asmarterplanet.com/blog/2015/03/surprising-insights-ibmtwitter-alliance.html#more-33140
³ http://www.datameer.com/company/news/press-releases/detroit-crime-commission-combats-crime-with-datameer-big-data-analytics.html
⁴ http://www.gartner.com/newsroom/id/2663015

2.4.2 The Chief Analytics Officer

Another C-level position has arisen in the past months: the Chief Analytics Officer (CAO). Are there differences between a CAO and a CDO? Theoretically, a CDO focuses on tactical data management, while the CAO concentrates on the strategic deployment of analytics. The latter's focus is on data analysis to find hidden but valuable patterns. These will result in operational decisions that make the company more competitive, more efficient and more attractive to its potential and current clients. The CAO is therefore a natural extension of the data-driven business: the more analytics are embedded in the organization, the more you need an executive-level person to manage that function and communicate the results in an understandable way. The CAO usually reports to the CEO.

In practice, some companies fold the CAO responsibilities into the CDO's tasks, while others keep the two positions distinct. Currently it is quite rare to find an explicit "Chief Analytics Officer" position in the banking and insurance sector because of this overlap, but in other fields the distinction is often made.

3 Big Data in the insurance value chain

Big Data provides new insights from social networks, telematics sensors and other new information channels. As a result, it allows insurers to understand customer preferences better, enables new business approaches and products, and enhances existing internal models, processes and services. With the rise of Big Data the insurance world could fundamentally change, and the entire insurance value chain could be impacted, from underwriting to claims management.

3.1 Insurance underwriting

3.1.1 Introduction

In traditional insurance underwriting and actuarial analysis, we have for years observed a never-ending search for more meaningful insight into individual policyholder risk characteristics, in order to distinguish good risks from bad and to price each risk accurately. The analytics performed by actuaries, based on advanced mathematical and financial theories, have always been critically important to an insurer's profitability. Over the last decade, however, revolutionary advances in computing technology and the explosion of new digital data sources have expanded and reinvented the core disciplines of insurers. Today's advanced analytics in insurance go much further than traditional underwriting and actuarial science. Data mining and predictive modelling are today the way forward for insurers to improve pricing and segmentation and to increase profitability.

3.1.2 What is predictive modelling?

Predictive modelling can be defined as the analysis of large historical data sets to identify correlations and interactions, and the use of this knowledge to predict future events. For actuaries, the concepts behind predictive modelling are not new. The use of mortality tables to price life insurance products is an example of predictive modelling: the Belgian MK, FK and MR, FR tables show the relationship between the probability of death and the explanatory variables age, sex and product type (in this case life insurance or annuity).
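As a toy illustration of table-based prediction, the sketch below looks up a one-year death probability from a small mortality table and turns it into a pure one-year risk premium for a term cover; the table values, discount rate and sum insured are hypothetical and are not taken from the Belgian MK/FK/MR/FR tables.

```python
# Hypothetical one-year death probabilities q_x by (sex, age); illustrative only.
MORTALITY_TABLE = {
    ("M", 40): 0.0015,
    ("M", 41): 0.0017,
    ("F", 40): 0.0010,
    ("F", 41): 0.0011,
}

def one_year_risk_premium(sex: str, age: int, sum_insured: float,
                          discount_rate: float = 0.02) -> float:
    """Expected discounted one-year death benefit for a term life cover."""
    qx = MORTALITY_TABLE[(sex, age)]           # predicted claim probability
    discount_factor = 1.0 / (1.0 + discount_rate)
    return qx * sum_insured * discount_factor  # pure premium, no loadings

print(round(one_year_risk_premium("M", 40, 100_000.0), 2))  # -> 147.06
```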

Predictive models have been around for a long time in sales and marketing environments, for example to predict the probability that a customer will buy a new product. Bringing together expertise from the actuarial profession and marketing analytics can lead to innovative initiatives where predictive models guide expert decisions in areas such as claims management, fraud detection and underwriting.

3.1.3 From small over medium to Big Data

Insurers collect a wealth of information on their customers: in the first place during the underwriting process, for example by asking about the claims history of a customer for car and home insurance, and secondly through the history of the relationship the customer has with the insurance company. While in the past this data was kept in silos by product, the key challenge now lies in gathering all this information in one place where the customer dimension is central. This transversal approach to the database also reflects the recent evolution in marketing: going from the 4 Ps (product, price, place, promotion) to the 4 Cs⁵ (customer, costs, convenience, communication).

On top of unleashing the value of internal data, new data sources are becoming available, such as wearables and social networks, to name a few. Because Big Data can be overwhelming to start with, medium data should be considered first. In Belgium, the strong bancassurance tradition offers interesting opportunities for combining insurance and bank data to create powerful predictive models.

3.1.4 Examples of predictive modelling for underwriting

1° Use the 360° view of the customer and predictive models to maximize profitability and gain more business

By thoroughly analysing data from different sources and applying analytics to gain insight, insurance companies should strive to develop a comprehensive 360-degree customer view. The gains of this complete and accurate view of the customer are twofold:

• Maximizing the profitability of the current customer portfolio through:
  o detecting cross-sell and up-sell opportunities;
  o customer satisfaction and loyalty actions;
  o effective targeting of products and services (e.g. customers that are most likely to be in good health, or customers that are less likely to have a car accident).
• Acquiring more profitable new customers at a reduced marketing cost: modelling the existing customers will yield useful information for focusing marketing campaigns on the most interesting prospects.

By combining data mining and analytics, insurance companies can better understand which customers are most likely to buy, and discover who their most profitable customers are and how to attract or retain more of them. Another use case is the evaluation of the underwriting process itself, to improve the customer experience during on-boarding.

2° Predictive underwriting for life insurance⁶

Using predictive models, it is in theory possible to predict the death probability of a customer. However, the low frequency of life insurance claims presents a challenge to modellers. While for car insurance the probability of a customer having a claim can be around 10%, for life insurance it is around 0.1% in the first year. This means not only that a significant in-force book is needed to have confidence in the results, but also that sufficient history must be available to observe mortality experience over time. For this reason, using the underwriting decision as the variable to predict is a more common choice.

All life insurance companies hold historical data on medical underwriting decisions that can be leveraged to build models that predict underwriting decisions. Depending on how the model is used, the outcome can be a reduction in the cost of medical examinations, more customer-friendly processes that avoid asking numerous invasive personal questions, or a reduction in the time needed to assess risks by automatically approving good risks and focusing underwriting efforts on more complex cases. For example, if the predictive model indicates that a new customer is highly similar to customers who passed the medical examination, the medical examination could be waived for this customer.
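A minimal sketch of such a predictive-underwriting model, built here on simulated application data so that it runs end to end; the field names (age, bmi, is_smoker, conditions), the target definition and the 0.95 waiver threshold are all illustrative assumptions rather than a recommended design.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 4000

# Hypothetical historical applications: a few simplified fields and the past
# underwriting outcome (1 = accepted without medical exam, 0 = referred).
apps = pd.DataFrame({
    "age":        rng.integers(20, 65, size=n),
    "bmi":        rng.normal(25, 4, size=n),
    "is_smoker":  rng.integers(0, 2, size=n),
    "conditions": rng.poisson(0.3, size=n),
})
risk = (0.04 * (apps["age"] - 40) + 0.15 * (apps["bmi"] - 25)
        + 1.2 * apps["is_smoker"] + 0.9 * apps["conditions"])
apps["accepted"] = (risk + rng.normal(0, 1, size=n) < 1.0).astype(int)

X = apps.drop(columns="accepted")
y = apps["accepted"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# A transparent baseline; more flexible learners could be benchmarked against it later.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("AUC:", round(roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]), 3))

# Waive the medical exam only where the model is very confident (threshold is an assumption).
waive_exam = model.predict_proba(X_test)[:, 1] > 0.95
print("Share of new cases that could skip the exam:", round(waive_exam.mean(), 3))
```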

If this sounds scary to risk professionals, a softer approach can be tested first, for instance improving marketing actions by targeting only those individuals that have a high likelihood of being in good health. This not only decreases the cost of the campaign, but also avoids disappointing a potential customer who would be refused during the medical screening process.

⁵ http://www.customfitonline.com/news/2012/10/19/4-cs-versus-the-4-ps-of-marketing/
⁶ Predictive modeling for life insurance, Deloitte, April 2010


3.1.5 Challenges of predictive modelling in underwriting⁷

Predictive models can only be as good as the input used to calibrate them. The first challenge in every predictive modelling project is to collect relevant, high-quality data for which a history is present. As many insurers are currently replacing legacy systems to reduce maintenance costs, this can come at the expense of that history. Actuaries are uniquely placed to prevent the history from being lost, since adequate risk management requires that a portfolio's history be kept. The trend of moving all policies from several legacy systems into a single modern policy administration system is an opportunity that must be seized, so that data collection will be easier in the future.

Once the necessary data are collected, legal and compliance concerns need to be addressed, as there may be boundaries to using certain variables in the underwriting process. In Europe, if the model will influence the price of the insurance, gender is no longer allowed as an explanatory variable, and this is only one example. It is important that the purpose of the model and the possible inputs are discussed with the legal department before modelling starts.

Once the model is built, it is important that users realize that no model is perfect. Residual risks will remain, and these should be weighed against the gains that the use of the model can bring.

Finally, once a predictive model has been set up, a continuous review cycle must be put in place that collects feedback from the underwriting and sales teams and gathers data to improve and refine the model. Building a predictive model is a continuous improvement process, not a one-off project.

⁷ Predictive modelling in insurance: key issues to consider throughout the lifecycle of a model

3.2 Insurance pricing

3.2.1 Overview of existing pricing techniques

The first rate-making techniques were based on rudimentary methods such as univariate analysis and, later, iterative standardized univariate methods such as the minimum bias procedure. These look at how changes in one characteristic result in differences in loss frequency or severity.

Later on, insurance companies moved to multivariate methods, a move made possible by further developments in computing power and data capabilities. These techniques are now being adopted by more and more insurers and are becoming part of everyday business practice. Multivariate analytical techniques focus on individual-level data and take into account the effects (interactions) that the many different characteristics of a risk have on one another. As explained in the previous section, many companies use predictive modelling (a form of multivariate analysis) to create measures of the likelihood that a customer will purchase a particular product. Banks use these tools to create measures (e.g. credit scores) of whether a client will be able to meet the lending obligations of a loan or mortgage. Similarly, P&C insurers can use predictive models to predict claim behaviour. Multivariate methods provide valuable diagnostics that aid in understanding the certainty and reasonableness of results.

Generalized Linear Models (GLMs) are a generalized form of linear models. This family encompasses normal-error linear regression as well as nonlinear exponential, logistic and Poisson regression models, and many other models, such as log-linear models for categorical data. GLMs have become the standard for classification rate-making in most developed insurance markets, particularly because of the benefit of transparency. Understanding the mathematical underpinnings is an important responsibility of the rate-making actuary who intends to use such a method, and visualizing the GLM results is an intuitive way to connect the theory with its practical use (a minimal claim-frequency GLM sketch follows the list below). GLMs do not stand alone as the only multivariate classification method: other methods such as CART, factor analysis and neural networks are often used to augment a GLM analysis.

In general, the data mining techniques listed above can enhance a rate-making exercise by:

• whittling down a long list of potential explanatory variables to a more manageable list for use within a GLM;

• providing guidance on how to categorize discrete variables;

• reducing the dimension of multi-level discrete variables (i.e. condensing 100 levels, many of which have few or no claims, into 20 homogeneous levels);

• identifying candidates for interaction variables within GLMs by detecting patterns of interdependency between variables.
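As a concrete illustration of the classification rate-making described above, a minimal Poisson claim-frequency GLM with an exposure offset could be fitted with the statsmodels formula API as sketched below; the tiny policy extract and the rating factors (age_band, vehicle_group) are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical policy-level extract: claim counts, exposure in years, rating factors.
policies = pd.DataFrame({
    "claims":        [0, 1, 0, 2, 0, 1, 0, 0],
    "exposure":      [1.0, 1.0, 0.5, 1.0, 1.0, 0.8, 1.0, 0.5],
    "age_band":      ["18-25", "26-40", "26-40", "18-25", "41-65", "41-65", "26-40", "18-25"],
    "vehicle_group": ["A", "A", "B", "B", "A", "B", "A", "B"],
})

# Poisson frequency GLM with log link; log(exposure) enters as an offset so the
# fitted coefficients can be read as multiplicative relativities per rating factor.
freq_model = smf.glm(
    formula="claims ~ C(age_band) + C(vehicle_group)",
    data=policies,
    family=sm.families.Poisson(),
    offset=np.log(policies["exposure"]),
).fit()

print(freq_model.summary())
print("Relativities:", np.exp(freq_model.params).round(3))
```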

3.2.2 Old versus new modelling techniques

The adoption of GLMs led many companies to seek external data sources to augment what had already been collected and analysed about their own policies. This includes, but is not limited to, geo-demographics, sensor data, social media information, weather, property characteristics and information about insured individuals or businesses. This additional data helps actuaries further improve the granularity and accuracy of classification rate-making. Unfortunately, this new data is often unstructured and massive, and the traditional generalized linear model (GLM) workflow then becomes difficult to apply directly.

With so many unique new variables in play, it can become very difficult to identify and take advantage of the most meaningful correlations. In many cases, GLM techniques are simply unable to penetrate deeply into these giant data stores. Even when they can, the time required to uncover the critical correlations tends to be onerous, taking days, weeks or even months of analysis. Only with advanced techniques, and specifically machine learning, can companies generate predictive models that take advantage of all the data they are capturing.

Machine learning is the modern science of finding patterns and making predictions from data, building on work in multivariate statistics, data mining, pattern recognition and advanced/predictive analytics. Machine learning methods are particularly effective in situations where deep, predictive insights need to be uncovered from data sets that are large, diverse and fast changing, i.e. Big Data. Across these types of data, machine learning often outperforms traditional methods on accuracy, scale and speed.
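To illustrate the kind of machine-learning alternative referred to here, the sketch below fits a gradient-boosted claim-frequency model on simulated policy data that mixes traditional factors with a telematics-style feature; HistGradientBoostingRegressor with a Poisson loss is one possible choice (available in recent scikit-learn releases), and the simulated features and coefficients are assumptions.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

# Synthetic policy data: two traditional factors plus a telematics-style indicator.
age = rng.integers(18, 80, size=n)
vehicle_power = rng.integers(40, 200, size=n)
harsh_braking_rate = rng.gamma(shape=2.0, scale=1.0, size=n)

# Hypothetical "true" expected claim frequency, used only to simulate claim counts.
true_freq = 0.05 * np.exp(0.01 * (60 - age).clip(0)
                          + 0.2 * harsh_braking_rate
                          + 0.002 * vehicle_power)
claims = rng.poisson(true_freq)

X = np.column_stack([age, vehicle_power, harsh_braking_rate])
X_train, X_test, y_train, y_test = train_test_split(X, claims, random_state=42)

# The Poisson deviance loss keeps the model on the same footing as a frequency GLM,
# while the boosted trees capture non-linearities and interactions automatically.
gbm = HistGradientBoostingRegressor(loss="poisson", max_iter=200, random_state=42)
gbm.fit(X_train, y_train)

print("Mean predicted frequency:", round(float(gbm.predict(X_test).mean()), 4))
```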

3.2.3 Personalized and real-time pricing: motor insurance

In order to price risk more accurately, insurance companies are now combining analytical applications (e.g. behavioural models based on customer profile data) with a continuous stream of real-time data (e.g. satellite data, weather reports, vehicle sensors) to create detailed and personalized assessments of risk.

Usage-based insurance (UBI) has been around for a while: it began with Pay-As-You-Drive programs that gave drivers discounts on their insurance premiums for driving under a set number of miles. These soon developed into Pay-How-You-Drive programs, which track driving habits and give discounts for 'safe' driving.

UBI allows a firm to take a snapshot of an individual's specific risk profile, based on that individual's actual driving habits. UBI condenses the period under inspection to a few months, yielding a much more relevant pool of information. With all this data available, the pricing scheme for UBI deviates greatly from that of traditional auto insurance. Traditional auto insurance relies on actuarial studies of aggregated historical data to produce rating factors that include driving record, credit-based insurance score, personal characteristics (age, gender and marital status), vehicle type, living location, vehicle use, previous claims, liability limits and deductibles.

Policyholders tend to think of traditional auto insurance as a fixed cost, assessed annually and usually paid in lump sums on an annual, semi-annual or quarterly basis. However, studies show that there is a strong correlation between claim and loss costs and mileage driven, particularly within existing price rating factors (such as class and territory). For this reason, many UBI programs seek to convert the fixed costs associated with mileage driven into variable costs that can be used in conjunction with other rating factors in the premium calculation. UBI has the advantage of using individual, current driving behaviour rather than relying on aggregated statistics and driving records based on past trends and events, making premium pricing more individualized and precise.
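A deliberately simplified sketch of how such a mileage-variable premium could be assembled; the base rate, per-mile rate and behaviour weights below are hypothetical and do not come from any actual UBI tariff.

```python
def ubi_monthly_premium(miles_driven: float,
                        harsh_braking_per_100mi: float,
                        night_share: float) -> float:
    """Toy usage-based premium: a small fixed part plus a behaviour-adjusted
    variable part that scales with the miles actually driven."""
    base_rate = 12.00          # fixed monthly component (hypothetical)
    per_mile_rate = 0.045      # variable component per mile (hypothetical)

    # Behaviour multiplier built from telematics indicators (hypothetical weights).
    behaviour_factor = 1.0 + 0.05 * harsh_braking_per_100mi + 0.20 * night_share

    return round(base_rate + per_mile_rate * miles_driven * behaviour_factor, 2)

# A cautious low-mileage driver vs. a high-mileage driver with harsher braking.
print(ubi_monthly_premium(300, harsh_braking_per_100mi=0.5, night_share=0.05))
print(ubi_monthly_premium(1200, harsh_braking_per_100mi=3.0, night_share=0.30))
```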

3.2.4 Advantages

UBI programs offer many advantages to insurers, consumers and society. Linking insurance premiums more closely to actual individual vehicle or fleet performance allows insurers to price premiums more accurately. This increases affordability for lower-risk drivers, many of whom are also lower-income drivers. It also gives consumers the ability to control their premium costs by encouraging them to reduce the miles they drive and adopt safer driving habits. The use of telematics helps insurers estimate accident damages more accurately and reduce fraud by enabling them to analyse driving data (such as hard braking, speed and time of day) during an accident. This additional data can also be used by insurers to refine or differentiate their UBI products.

3.2.5 Shortcomings/challenges

3.2.5.1 Organization and resources

Taking advantage of the potential of Big Data requires different approaches to organization, resources and technology. As with many new technologies that offer promise, there are challenges to successful implementation and to producing meaningful business results. The number one organizational challenge is determining the business value, with financing a close second. Talent is the other big issue: identifying the business and technology experts inside the enterprise, recruiting new employees, training and mentoring individuals, and partnering with outside resources is clearly a critical success factor for Big Data. Implementing the new technology and organizing the data are listed by insurers as lesser challenges, although these are still areas that require attention.

3.2.5.2 Technology challenges

The biggest technology challenge in the Big Data world is framed in the context of the different Big Data "V" characteristics: the standard three Vs of volume, velocity and variety, plus two more, veracity and value. The variety and veracity of the data present the biggest challenges. As insurers venture beyond the analysis of structured transaction data to incorporate external and unstructured data of all sorts, combining and feeding the data into an analysis may be complicated. On one hand, variety expresses the promise of Big Data; on the other hand, the technical challenges are significant. The veracity of the data is also a challenge. It is true that some Big Data analyses do not require the data to be as clean and organized as in traditional approaches; however, the data must still reflect the underlying reality of the domain.

3.2.5.3 Technology approaches

Technology should not be the first focus area when evaluating the potential of Big Data in an organization. However, choosing the best technology platform for your organization and business problems does become an important consideration for success. Cloud computing will play a very important role in Big Data. Although Big Data requires new approaches and brings its own challenges, there is a growing body of experience, expertise and best practice to assist in successful Big Data implementations.

3.3 Insurance reserving

Loss reserving is a classic actuarial problem encountered extensively in motor, property and casualty as well as in health insurance. It arises because insurers need to set reserves to cover future liabilities related to their book of contracts: the insurer has to hold funds aside to meet future liabilities attached to incurred claims.

In non-life insurance, most policies run for a period of 12 months, but the claims payment process can take years or even decades. In particular, losses arising from casualty insurance can take a long time to settle, and even when the claims are acknowledged it may take time to establish the extent of the claims settlement costs. A well-known and costly example is provided by claims from asbestos liabilities. It is thus no surprise that the biggest item on the liabilities side of an insurer's balance sheet is often the provision for future claims payments. It is the job of the reserving actuary to predict, with maximum accuracy, the total amount necessary to pay the claims that the insurer has legally committed to cover.

Historically, reserving was based on deterministic calculations with pen and paper, combined with expert judgement. Since the 1980s, the arrival of personal computers and spreadsheet software has induced a real change for reserving actuaries. The use of spreadsheets not only saves calculation time but also allows testing different scenarios and the sensitivity of the forecasts. The first simple models used by actuaries started to evolve towards more developed ideas as IT resources evolved. Moreover, recent changes in regulatory requirements, such as Solvency II in Europe, have shown the need for stochastic models and more precise statistical techniques.
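The classical methods described in the next subsection all start from a run-off triangle of aggregated claims. As a point of reference, here is a minimal deterministic chain-ladder sketch on a small, entirely hypothetical triangle of cumulative paid amounts; the stochastic models mentioned above build on the same structure.

```python
import numpy as np

# Hypothetical cumulative paid claims: rows = origin years, columns = development
# years; np.nan marks the future, not-yet-observed part of the triangle.
triangle = np.array([
    [1000., 1800., 2100., 2200.],
    [1100., 2000., 2300., np.nan],
    [1200., 2150., np.nan, np.nan],
    [1300., np.nan, np.nan, np.nan],
])

n = triangle.shape[1]

# Volume-weighted development (link) factors from one development year to the next.
factors = []
for j in range(n - 1):
    observed = ~np.isnan(triangle[:, j + 1])
    factors.append(triangle[observed, j + 1].sum() / triangle[observed, j].sum())

# Project each origin year to ultimate by applying the remaining factors.
completed = triangle.copy()
for j in range(n - 1):
    missing = np.isnan(completed[:, j + 1])
    completed[missing, j + 1] = completed[missing, j] * factors[j]

ultimates = completed[:, -1]
latest_paid = np.array([row[~np.isnan(row)][-1] for row in triangle])
print("Development factors:", np.round(factors, 3))
print("Estimated reserves per origin year:", np.round(ultimates - latest_paid, 1))
```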


3.3.1 Classical methods

Reserving actuaries use many different frameworks and models to compute technical provisions, and it is not the goal of this paper to review them exhaustively, but rather to note that they share the central notion of the triangle. A triangle is a way of presenting data in a triangular structure showing the development of claims over time for each origin period. An origin period can be the year the policy was written or earned, or the loss occurrence period.

After having used deterministic models, reserving generally switches to stochastic models, which allow reserve risk to be quantified.

The use of models based on aggregated data was convenient in the past, when IT resources were limited, but is increasingly questionable now that huge computational power is available at an affordable price. There is therefore a need to move to models that fully use the data available in insurers' data warehouses.

3.3.2 Micro-level reserving methods

Unlike aggregate (or macro-level) models, micro-level reserving methods (also called individual claim level models) use individual claims data as inputs and estimate outstanding liabilities for each individual claim. Unlike the models described in the previous section, they model very precisely the lifetime development process of each individual claim, including events such as claim occurrence, reporting, payments and settlement. Moreover, they can include micro-level covariates such as information about the policy, the policyholder, the claim, the claimant and the transactions.

When well specified, such models are expected to generate reliable reserve estimates. The ability to model claims development at the individual level and to incorporate micro-level covariate information allows micro-level models to handle heterogeneity in claims data efficiently. Moreover, the large amount of data used in modelling can help to avoid issues of over-parameterization and lack of robustness. As a consequence, micro-level models are especially valuable under changing environments, as these changes can be captured by appropriate covariates.

3.4 Claims management

Big Data can play a tremendous role in improving claims management. It provides access to data that was not available before and makes claims processing faster. It therefore enables improved risk management, reduces loss adjustment expenses and enhances quality of service, resulting in increased customer retention. Below we present details of how Big Data analytics improves the fraud detection process.

3.4.1 Fraud detection

It is estimated that a typical organization loses 5% of its revenues to fraud each year⁸. The total cost of insurance fraud (excluding health insurance) in the US is estimated at more than $40 billion per year⁹. The advent of Big Data and analytics has provided new and powerful tools to fight fraud.

⁸ www.acfe.com
⁹ www.fbi.gov

3.4.2 What are the current challenges in fraud detection?

The first challenge is finding the right data. Analytical models need data, and in a fraud detection setting this is not always evident. Collected fraud data are often very skewed, with typically fewer than 1% fraudsters, which seriously complicates the detection task. The asymmetric costs of missing fraud versus harassing non-fraudulent customers also create important modelling difficulties. Furthermore, fraudsters constantly try to outsmart the analytical models, so these models must be permanently monitored and re-configured on an ongoing basis.

3.4.3 What analytical approaches are being used to tackle fraud?

Most of the fraud detection models in use today are expert-based models. When data becomes available, one can start doing analytics. A first approach is supervised learning, which analyses a labelled data set of historically observed fraud behaviour; it can be used to predict both the occurrence of fraud and the amount involved. Unsupervised learning starts from an unlabelled data set and performs anomaly detection. Finally, social network learning analyses fraud behaviour in networks of linked entities; throughout our research, this last approach has been found to outperform the others.
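As a small illustration of the unsupervised route mentioned above, the sketch below scores claims for anomaly with an Isolation Forest on a few simulated claim-level features; the feature names and the 1% contamination level are assumptions, and in practice the flagged claims would be routed to investigators rather than rejected automatically.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
n = 2000

# Hypothetical claim-level features; a handful of claims are made deliberately extreme.
claims = pd.DataFrame({
    "claim_amount":        rng.lognormal(mean=7.5, sigma=0.6, size=n),
    "days_to_report":      rng.integers(0, 30, size=n),
    "prior_claims_3y":     rng.poisson(0.4, size=n),
    "months_since_policy": rng.integers(1, 120, size=n),
})
claims.loc[:10, "claim_amount"] = 60_000.0
claims.loc[:10, "days_to_report"] = 90

# Isolation Forest isolates observations that are easy to separate from the bulk;
# contamination sets the expected share of anomalies to flag (an assumption).
detector = IsolationForest(contamination=0.01, random_state=42)
labels = detector.fit_predict(claims)            # -1 = anomalous, 1 = normal
claims["anomaly_score"] = -detector.score_samples(claims)

suspicious = claims[labels == -1].sort_values("anomaly_score", ascending=False)
print(suspicious.head())
```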


3.4.4 What are the key characteristics of successful analytical models for fraud detection?

Successful fraud analytical models should satisfy various requirements. First, they should achieve good statistical performance in terms of recall (or hit rate), the percentage of fraudsters labelled by the analytical model as suspicious, and precision, the percentage of actual fraudsters among the ones labelled as suspicious. Next, the analytical models should not be black boxes based on complex mathematical formulas (such as neural networks, support vector machines, ...) but should provide clear insight into the fraud mechanisms at work. This is particularly important since the insights gained will be used to develop new fraud prevention strategies. The operational efficiency of the fraud analytical model also needs to be evaluated; this refers to the amount of resources needed to calculate the fraud score and act upon it adequately. In a credit card fraud environment, for example, a decision needs to be made within a few seconds after the transaction is initiated.
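The two statistical measures mentioned above are straightforward to compute once the model's flags are compared with confirmed investigation outcomes; a minimal sketch with made-up labels:

```python
from sklearn.metrics import precision_score, recall_score

# 1 = confirmed fraud, 0 = legitimate (hypothetical investigation outcomes).
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]
# 1 = flagged as suspicious by the model, 0 = not flagged.
y_flag = [1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1]

# Recall (hit rate): share of the real fraudsters that the model flagged.
print("recall:   ", round(recall_score(y_true, y_flag), 2))     # 3 of 4 -> 0.75
# Precision: share of flagged cases that really were fraud.
print("precision:", round(precision_score(y_true, y_flag), 2))  # 3 of 5 -> 0.6
```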

3.4.5 Use of social network analytics to detect fraud¹⁰

Research has shown that network models can significantly outperform non-network models in terms of accuracy, precision and recall, so network analytics can help improve fraud detection techniques. Fraud is present in many critical human processes such as credit card transactions, insurance claims, opinion fraud and social security fraud. Fraud can be characterized by five features: it is an uncommon, well-considered, imperceptibly concealed, time-evolving and often carefully organized crime, which appears in many types and forms. Before applying fraud detection techniques, these five issues should be addressed or counterbalanced.

Fraud is an uncommon crime, which means the class distribution is extremely skewed. Rebalancing techniques such as SMOTE (Synthetic Minority Over-sampling Technique) can be used to counterbalance this effect: the minority class is over-sampled by creating artificial fraud cases (or duplicating existing ones), typically combined with under-sampling the majority class (reducing the number of legitimate cases).
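A minimal sketch of this rebalancing step, assuming the imbalanced-learn package is available; the simulated features, the 1% fraud rate and the resampling ratios are made up, and in a real workflow the resampling would be applied to the training partition only.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

rng = np.random.default_rng(7)
n = 10_000

# Hypothetical claim features with roughly 1% fraud, mimicking the skew described above.
X = rng.normal(size=(n, 4))
y = (rng.random(n) < 0.01).astype(int)
print("Before:", np.bincount(y))

# Over-sample fraud cases with synthetic examples up to 10% of the legitimate class,
# then under-sample the legitimate class down to twice the size of the fraud class.
over = SMOTE(sampling_strategy=0.1, random_state=42)
under = RandomUnderSampler(sampling_strategy=0.5, random_state=42)
X_res, y_res = over.fit_resample(X, y)
X_res, y_res = under.fit_resample(X_res, y_res)
print("After:", np.bincount(y_res))
```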

Complex fraud structures are well considered; this implies that there will be changes in behaviour over time, so not every time period has the same importance. A temporal weighting adjustment should put emphasis on the more important (more recent) periods that could be explanatory of the fraudulent behaviour.

Fraud is imperceptibly concealed, meaning that it is difficult to identify. One can leverage expert knowledge to create features that help identify fraud.

Fraud is time-evolving. The period of study should be selected carefully, taking into consideration that fraud evolves over time: how much of the previous time periods could explain or affect the present? The model should incorporate these changes over time. Another question to raise is in what time window the model should be able to detect fraud: short, medium or long term.

The last characteristic of fraud is that it is most of the time carefully organized. Fraud is often not an individual phenomenon; in fact, there are many interactions between fraudsters, and fraud sub-networks often develop within a bigger network. Social network analysis can be used to detect these networks.

Social network analysis helps derive useful patterns and insights by exploiting the relational structure between objects.

A network consists of two sets of elements: the objects of the network, which are called nodes, and the relationships between nodes, which are called links. A link connects two or more nodes. A weight can be assigned to nodes and links to measure the magnitude of the crime or the intensity of the relationship. When constructing such networks, the focus is on the neighbourhood of a node, i.e. the subgraph of the network around the node of interest (the fraudster).

Once a network has been constructed, how can it be used as an indicator of fraudulent activity? Fraud can be detected by answering the following question: does the network contain statistically significant patterns of homophily? Detection of fraud relies on a concept often used in sociology called homophily: people have a strong tendency to associate with others whom they perceive as being similar to themselves in some way. Translated to fraud networks, this means that fraudulent people are more likely to be connected to other fraudulent people. Clustering techniques can be used to detect significant patterns of homophily and thus to spot fraudsters.

¹⁰ Based on the research of Véronique Van Vlasselaer (KU Leuven)

Given a homophilic network with evidence of fraud clusters, it is possible to extract features from the network around the node(s) of interest (the fraudulent activity), also called the neighbourhood of the node. This is the featurization process: extracting features for each network object based on its neighbourhood. The focus is on the first-order neighbourhood (first-degree links), also known as the "egonet" (the ego is the node of interest, surrounded by its direct associates, known as alters). Feature extraction happens at two levels: egonet generic features (how many fraudulent resources are associated with the company, whether there are relationships between resources, ...) and alter-specific features (how similar the alters are to the ego, whether an alter is involved in many fraud cases or not).
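A small sketch of this egonet featurization using networkx; the toy claims graph, the 'fraud' node attribute and the two features extracted are illustrative assumptions.

```python
import networkx as nx

# Toy claims network: nodes are parties (claimants, garages, brokers, ...);
# the 'fraud' attribute marks parties already confirmed as fraudulent.
G = nx.Graph()
G.add_nodes_from([
    ("claimant_1", {"fraud": False}),
    ("garage_A",   {"fraud": True}),
    ("broker_X",   {"fraud": True}),
    ("claimant_2", {"fraud": False}),
    ("garage_B",   {"fraud": False}),
])
G.add_edges_from([
    ("claimant_1", "garage_A"), ("claimant_1", "broker_X"),
    ("claimant_2", "garage_B"), ("garage_A", "broker_X"),
])

def egonet_features(graph, ego):
    """First-order neighbourhood features for one node of interest (the 'ego')."""
    egonet = nx.ego_graph(graph, ego, radius=1)
    alters = [n for n in egonet.nodes if n != ego]
    n_fraud_alters = sum(graph.nodes[a]["fraud"] for a in alters)
    return {
        "degree": len(alters),
        "fraud_alter_share": n_fraud_alters / len(alters) if alters else 0.0,
    }

print(egonet_features(G, "claimant_1"))  # {'degree': 2, 'fraud_alter_share': 1.0}
print(egonet_features(G, "claimant_2"))  # {'degree': 1, 'fraud_alter_share': 0.0}
```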

Once these first-order neighbourhood features have been extracted for each subject of interest (for example the number and weight of fraudulent resources linked to a company), it becomes straightforward to derive the propagation effect of these fraudulent influences through the network.

To conclude, in this research network models consistently outperform non-network models, as they are better able to distinguish fraudsters from non-fraudsters. They also produce shorter and more precise lists of high-risk companies and detect more fraudulent corporates.

3.4.6 Fraud detection in motor insurance – Usage-Based Insurance example
In 2014, the Coalition Against Insurance Fraud11, with the assistance of the business analytics company SAS, published a report stressing that technology plays a growing role in fighting fraud. "Insurers are investing in different technologies to combat fraud, but a common component to all these solutions is data," said Stuart Rose, Global Insurance Marketing Principal at SAS. "The ability to aggregate and easily visualize data is essential to identify specific fraud patterns." "Technology is playing a larger and more trusted role with insurers in countering growing fraud threats. Software tools provide the efficiency insurers need to thwart more scams and impose downward pressure on premiums for policyholders," said Dennis Jay, the Coalition's executive director.

In motor insurance, a good example is Usage-Based Insurance (UBI), where insurers can benefit from the superior fraud detection that telematics can provide. It equips an insurer with driving behaviour and driving exposure patterns, including information about speeding, driving dynamics, trips, day and night driving patterns, garaging address or mileage. In some sense UBI can become a "lie detector" and can help companies to detect falsification of the garaging address, annual mileage or driving behaviour. By recording the vehicle's geographical location and detecting sharp braking and harsh acceleration during an accident, an insurer can analyse accident details and estimate accident damages. The telematics devices used in UBI can support first notice of loss (FNOL) services, providing very valuable information for insurers. Analytics performed on this data provide additional evidence to consider when investigating a claim, and can help to reduce fraud and claims disputes.
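As a small, hedged illustration of the kind of behavioural indicators telematics can provide, the snippet below derives a speeding share and a harsh-braking count from a synthetic speed trace. The thresholds and data are assumptions made for this example; real UBI programmes use much richer signals (GPS, accelerometer, trip context).

```python
# Derive simple driving-behaviour indicators from a (synthetic) telematics
# speed trace: share of readings above the limit and harsh braking events.
import numpy as np

speed_kmh = np.array([45, 52, 60, 95, 120, 118, 80, 30, 0])  # one reading per second (toy trip)
speed_limit_kmh = 90.0
harsh_brake_kmh_per_s = -25.0  # assumed threshold for a harsh braking event

speeding_share = float(np.mean(speed_kmh > speed_limit_kmh))
deceleration = np.diff(speed_kmh)
harsh_brakes = int(np.sum(deceleration < harsh_brake_kmh_per_s))

print(f"share of readings speeding: {speeding_share:.0%}, harsh braking events: {harsh_brakes}")
```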

4 Legal aspects of Big Data
4.1 Introduction
Data processing lies at the very heart of insurance activities. Insurers and intermediaries collect and process vast amounts of personal data about their customers. At the same time they are dealing with a particular type of 'discrimination' among their insureds. Like all businesses operating in Europe, insurers are subject to European and national data protection laws and anti-discrimination rules. Fast technological evolution and globalization have triggered a comprehensive reform of the current data protection laws. The EU hopes to complete a new General Data Protection Regulation at the end of this year. Insurers are concerned that this new Regulation could introduce unintended consequences for the insurance industry.

11 http://www.insurancefraud.org/about-us.htm


4.2 Data processing
4.2.1 Legislation: an overview
Insurers collect and process data to analyse the risks that individuals wish to cover, to tailor products accordingly, to valuate and pay claims and benefits, and to detect and prevent insurance fraud. The rise of Big Data presents opportunities to offer more creative, competitive pricing and, importantly, to predict customers' behaviour. As insurers continue to explore this relatively untapped resource, evolutions in data processing legislation need to be followed very closely.

The protection of personal data was guaranteed for the first time, as a separate right granted to an individual, in the Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data (Convention 108), adopted by the Council of Europe in 1981.

The current, principal EU legal instrument establishing rules for fair personal data processing is the Data Protection Directive (95/46/EC) of 1995, which regulates the protection of individuals with regard to the processing of personal data and the free movement of such data. As a framework law, the Directive had to be implemented in EU Member States through national laws. This Directive has set a standard for the legal definition of personal data and for regulatory responses to the use of personal data. Its provisions include principles related to data quality, criteria for making data processing legitimate and the essential right not to be subject to automated individual decisions.

The Data Protection Directive was complemented by other legal instruments, such as the E-Privacy Directive (2002/58/EC), part of a package of five new Directives that aim to reform the legal and regulatory framework of electronic communications services in the EU. Personal data and individuals' fundamental right to privacy need to be protected, but at the same time the legislator must take into account the legitimate interests of governments and businesses. One of the innovative provisions of this Directive was the introduction of a legal framework for the use of devices for storing or retrieving information, such as cookies. Companies must also inform customers of the data processing to which their data will be subject and obtain subscriber consent before using traffic data for marketing or before offering value-added services based on traffic or location data. The EU Cookie Directive (2009/136/EC), an amendment of the E-Privacy Directive, aims to increase consumer protection and requires websites to obtain informed consent from visitors before they store information on a computer or any web-connected device.

In  2006  the  EU  Data  Retention  Directive  (2006/24/EC)  was  adopted  as  an  anti-­‐terrorism  measure  after  the   terrorist   attacks   in   Madrid   and   London.   However   on   8   April   2014,   the   European   Court   of  Justice  declared   this   Directive   invalid.   The   Court   took   the   view   that   the   Directive   does   not   meet   the  principle  of  proportionality  and  should  have  provided  more  safeguards  to  protect  the  fundamental  rights  with  respect  to  private  life  and  to  the  protection  of  personal  data.  

Belgium established a Privacy Act (Data Protection Act) in 1992. Since the introduction of the EU Data Protection Directive (1995), the principles of that directive have been transposed into Belgian law. The Privacy Act consequently underwent significant changes introduced by the Act of 11 December 1998. Further modifications have been made in the meantime, including those of the Act of 26 February 2006. The Belgian Privacy Commission is part of a European task force, which includes data protection authorities from the Netherlands, Belgium, Germany, France and Spain. In October 2014, a new Privacy Bill was introduced in the Belgian Federal Parliament. The Bill mainly aims at providing the Belgian Data Protection Authority (DPA) with stronger enforcement capabilities and ensuring that Belgian citizens regain control over their personal data. To achieve this, certain new measures are proposed for inclusion in the existing legislation of 1992, inspired by the proposed European data protection Regulation.

At this moment the current data processing legislation needs an urgent update. Rapid technological developments, the increasingly globalized nature of data flows and the arrival of cloud computing pose new challenges for data protection authorities. In order to ensure continuity of a high level of data protection, the rules need to be brought in line with technological developments. The Directive of 1995 has also not prevented fragmentation in the way data protection is implemented across the Union.

In 2012 the European Commission proposed a comprehensive, pan-European reform of the data protection rules to strengthen online privacy rights and boost Europe's digital economy. On 15 June 2015, the Council reached a 'general approach' on a General Data Protection Regulation (GDPR) that establishes rules adapted to the digital era. The European Commission is pushing for a complete agreement between the Council and the European Parliament before the end of this year. The twofold aim of the Regulation is to enhance the data protection rights of individuals and to improve business opportunities by facilitating the free flow of personal data in the digital single market. The Regulation must be appropriately balanced in order to guarantee a high level of protection of individuals and allow companies to preserve innovation and competitiveness. In parallel with the proposal for a GDPR, the Commission adopted a Directive on data processing for law enforcement purposes (5833/12).

4.2.2 Some concerns of the insurance industry
The European insurance and reinsurance federation, Insurance Europe, is concerned that the proposed Regulation could introduce unintended consequences for the insurance industry and its policyholders. The new legislation must correctly balance an individual's right to privacy against the needs of businesses. The way insurers process data must be taken into account appropriately so that they can perform their contractual obligations, assess consumers' needs and risks, innovate, and also combat fraud. There is also a clear tension between Big Data, the privacy of the insured's personal data and its availability to business and the State.

An important concern is that the proposed rules concerning profiling do not take into consideration the way that insurance works. The Directive of 1995 contains rules on 'automated processing' but there is not a single mention of 'profiling' in the text. The new GDPR aims to provide more legal certainty and more protection for individuals with respect to data processing in the context of profiling. Insurers need to profile potential policyholders to measure risk; any restrictions on profiling could therefore translate not only into higher insurance prices and less insurance coverage, but also into an inability to provide consumers with appropriate insurance. Insurance Europe recommends that the new EU Regulation should allow insurance-related profiling at the pre-contractual stage and during the performance of the contract. There is also still some confusion in defining profiling: in the Council approach profiling means solely automated processing, while Article 20(5) as proposed by the European Parliament could, according to Insurance Europe, be interpreted as prohibiting fully automated processing, requiring human intervention for every single insurance contract offered to consumers.

The proposal of the EU Council (June 2015) stipulates that the controller should use adequate mathematical or statistical procedures for profiling. The controller must secure personal data in a way which takes account of the potential risks involved for the interests and rights of the data subject and which prevents, inter alia, discriminatory effects against individuals on the basis of race or ethnic origin, political opinions, religion or beliefs, trade union membership, genetic or health status or sexual orientation, or measures having such an effect. Automated decision-making and profiling based on special categories of personal data should only be allowed under specific conditions.

According to the Article 29 Working Party12, the Council's proposals on profiling are still unclear and do not foresee the sufficient safeguards that should be put in place. In June 2015 it renewed its call for provisions giving the data subject a maximum of control and autonomy when personal data are processed for profiling. The provisions should clearly define the purposes for which profiles may be created and used, including specific obligations on controllers to inform the data subject, in particular on his or her right to object to the creation and use of profiles. The academic research group IRISS remarks that the GDPR does not clarify whether or not there is an obligation on data controllers to disclose information about the algorithm involved in profiling practices, and suggests clarification on this point.

Insurance Europe also requests that the GDPR should explicitly recognise insurers' need to process and share data for fraud prevention and detection. According to the Council and the Article 29 Working Party, fraud prevention may fall under the non-exhaustive list of 'legitimate interests' in Article 6(1)(f), which would provide the necessary legal basis for processing aimed at combating insurance fraud.

The new Regulation also proposes a new right to data portability, enabling easier transmission of personal data from one service provider to another. This would allow policyholders to obtain a copy of any of their data being processed by an insurer, and insurers could be forced to disclose confidential and commercially sensitive information. Insurance Europe believes that the scope of the right to data portability should be narrowed down, to make sure that insurers would not be forced to disclose actuarial information to competitors.

12 The Article 29 Working Party is an independent advisory body on data protection and privacy, set up under the Data Protection Directive of 1995. It is composed of representatives from the national data protection authorities of the EU Member States, the European Data Protection Supervisor and the European Commission.

Insurers also need to retain policyholder information. The Regulation should clearly state that the right to be forgotten does not apply where there is a contractual relationship between an organisation and an individual, where a data controller is required to comply with regulatory obligations to retain data, or where the data is processed to detect and prevent fraudulent activities.

The   implementation   of   more   stringent,   complex   rules   will   require   insurance   firms   to   review   their  compliance  programmes.  They  will  have  to  take  account  of  increased  data  handling  formalities,  profiling,  consent   and   processing   requirements   and   the   responsibilities   and   obligations   of   controllers   and  processors.  

4.3 Discrimination
4.3.1 Legislation: an overview
In 2000 two important EU directives provided a comprehensive framework for European anti-discrimination law. The Employment Equality Directive (2000/78/EC) prohibits discrimination on the basis of sexual orientation, religion or belief, age and disability in the area of employment, while the Racial Equality Directive (2000/43/EC) combats discrimination on the grounds of race or ethnicity in the context of employment, the welfare system, social security, and goods and services.

The Gender Goods and Services Directive (2004/113/EC) expanded the scope of sex discrimination law to the provision of goods and services and requires that differences in treatment may be accepted only if they are justified by a legitimate aim. Any limitation should nevertheless be appropriate and necessary in accordance with the criteria derived from the case law of the ECJ. As regards the insurance sector, the Directive, in principle, imposes 'unisex' premiums and benefits for contracts concluded after 21 December 2007. However, it provides for an exception to this principle in Article 5(2), with the possibility to permit differences in treatment between women and men after this date, based on actuarial data and reliable statistics. In its Test-Achats judgment, the ECJ invalidated this exception because it was incompatible with Articles 21 and 23 of the EU's Charter of Fundamental Rights.

A proposal for a Council Directive (COM 2008 426-(15)) stipulates that actuarial and risk factors related to disability and to age can be used in the provision of insurance. These should not be regarded as constituting discrimination where the factors are shown to be key factors for the assessment of risk.

The recent proposal of the Council on the new General Data Protection Regulation (June 2015) states that the processing of special categories of personal (sensitive) data, revealing racial or ethnic origin, political opinions, religious or philosophical beliefs or trade-union membership, and the processing of genetic data or data concerning health or sex life, shall be prohibited. Derogations from this general prohibition should be explicitly provided, inter alia where the data subject gives explicit consent or in respect of specific needs, in particular where the processing is carried out in the course of legitimate activities by certain associations or foundations whose purpose is to permit the exercise of fundamental freedoms.

In Belgium the EU Directive 2000/78/EC is transposed into national legislation by the anti-discrimination Law of 10 May 2007 (BS 30.V.2007). This law has been amended by the law of 30 December 2009 (BS 31.XII.2009) and by the law of 17 August 2013 (BS 5.III.2014).
Due to the federal organization of Belgium, laws prohibiting discrimination are complex and fragmented because they are made and implemented by six different legislative bodies, each within its own sphere of competence.

4.3.2 Tension between insurance and anti-discrimination law
Insurance companies are dealing with a particular type of 'discrimination' among their insureds. They attempt to segregate insureds into separate risk pools based on their differences in risk profiles, first, so that they can charge different premiums to the different groups based on their risk and, second, to incentivize risk reduction by insureds. They openly 'discriminate' among individuals based on observable characteristics. Accurate risk classification and incentivizing risk reduction provide the primary justifications for why we let insurers 'discriminate'. [30]


Regulatory restrictions on insurers' risk classifications can produce moral hazard and generate adverse selection. Davey [31] remarks that insurance and anti-discrimination law defend fundamentally different perspectives on risk assessment. Insurance has often defended its practices as 'fair discrimination': insurers assert that they are not discriminating in the legal sense by treating similar cases differently; rather, they are treating different cases differently. This clash between the principle of insurance and anti-discrimination law is fundamental: is differential treatment based on actuarial experience 'discrimination' in law, or justified differential treatment? This tension is felt at both the national and supranational levels as governments and the EU seek to regulate underwriting practices. A good, illustrative example is the already mentioned Test-Achats case.

Tension between insurance and the Charter of Fundamental Rights is also clearly felt in the debate on genetic discrimination in the context of life insurance. Insurers might wish to use genetic test results for underwriting, just as they use other medical or family history data. The disclosure of genetic data for insurance risk analysis presents complex issues that overlap with those related to sensitive data in general. Canada, the US, Russia and Japan have chosen not to adopt laws specifically prohibiting access to genetic data for underwriting by life insurers. In these countries, insurers treat genetic data like other types of medical or lifestyle data [32]. Belgium, France and Norway have chosen to adopt laws to prevent or limit insurers' access to genetic data for life insurance underwriting. The Belgian Parliament has incorporated in the Law of 25 June 1992 provisions that prohibit the use of genetic testing to predict the future health status of applicants for (life) insurance.

Since EU member states have adopted different approaches on the use of genetic data, a pan-European regulation is needed. The recent proposal of the Council on a new General Data Protection Regulation (June 2015) does not solve this problem. It prohibits the processing of genetic data but recognises explicit consent as a valid legal basis for such processing and leaves to Member States (Article 9(2)(a)) the decision on whether or not to admit consent as a legitimate basis for the processing of genetic data.

5 New Frontiers

5.1 Risk pooling vs. personalization
With the introduction of Big Data, the insurance sector is opening up to new possibilities, new innovative offers and personalized services for its customers. As a result we might see the end of risk pooling and the rise of individual risk assessment. It is said that these personalized services will provide new premiums that will be "fairer" for the policyholder. Is it indeed true that the imprudence of others will have less impact on your own insurance premium? This way of thinking holds only as long as the policyholder does not have any claim. In a world of totally individualised premiums, the event of a claim would increase the premium of that policyholder enormously. That seems to contradict the way we think about insurance, i.e. that in the event of a claim, your claim is paid by the excess premium of the other policyholders. It seems that with the introduction of Big Data, the social aspect of insurance is gone.

However, which customer would like to subscribe to such an insurance offer? One could then argue that it is better to keep the insurance premium yourself and put it aside for a possible future claim. So, for insurance to remain insurance, risk pooling will always be necessary. Big Data is just changing the way we pool the risks.

For example, until recently, the premium for car insurance was only dependent on a handful of indicators (personal, demographic and car data). Therefore, an insurance portfolio needed to be big enough to have risk pools with enough diversification on the other indicators that could not be measured.

In recent years more and more indicators can be measured and used as data. This means that risk pools do not have to be as big as before, because the behaviour of each individual in the risk pool is becoming more and more predictable. Somebody who speeds all the time is more likely to have an accident. Previously this was assumed to be people with high-horsepower cars; nowadays this behaviour can be measured directly, removing the need for such assumptions.


However,  as  long  as  there  is  a  future  event  that  is  uncertain,  risk  pooling  still  makes  sense.  The  risk  pools  are  just  becoming  smaller  and  more  predictable.  In  the  example  given,  even  a  driver  who  does  not  speed  can  still  be  involved  in  an  accident.  
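One classical way to see how pooling and personalization coexist is credibility weighting, sketched below: the premium blends the policyholder's own observed experience with the pool average, and the weight on the individual grows as more individual data become available. This is a standard actuarial illustration under assumed numbers, not a method put forward by this paper.

```python
# Classical credibility weighting: Z = n / (n + k) puts more weight on the
# individual's own experience as the volume of individual data grows.
# The credibility constant k and all amounts are illustrative assumptions.
def credibility_premium(individual_mean, pool_mean, n_obs, k=50.0):
    z = n_obs / (n_obs + k)
    return z * individual_mean + (1.0 - z) * pool_mean

# A driver with 1 year of telematics data vs. one with 10 years of data:
print(credibility_premium(individual_mean=300.0, pool_mean=500.0, n_obs=1))   # stays close to the pool
print(credibility_premium(individual_mean=300.0, pool_mean=500.0, n_obs=10))  # moves towards the individual
```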

5.2 Personalised Premium
Personalisation of risk pricing relies upon an insurer having the capacity to handle a vast amount of data. A big challenge is linked to data collection: making sure the data are reliable and can in fact be used for insurance pricing. Insurers will have to be careful not to be overwhelmed by Big Data.

We stated above that the use of Big Data will make insurance pricing fairer. In this case, fair is defined as taking into account all members of society. However, this does not mean that everyone in society should be treated in exactly the same way. Every individual should have an equal opportunity to access what is on offer. It can, however, turn out that the offer does not meet the requirements of the customer, or vice versa. In that case, insurance cover will not be possible.

5.3 From Insurance to Prevention
One of the big advantages of the gathering of Big Data by insurance companies or other companies is that this data can, in a certain way, be shared with their customers. In that way, a constant interaction can arise between the insurer and the policyholder. When consumers better understand how their behaviour can impact their insurance premium, they can make changes in their lives that benefit both parties.

A typical example of this is the use of telematics in car insurance. A box in the insured car automatically saves and transmits all driving information of the vehicle. The insurance company uses this data to analyse the risk the policyholder faces while driving. When, for example, the driver constantly speeds and brakes heavily, the insurance company can take this as an indication to increase the premium. On the other hand, someone who drives calmly, outside busy hours and outside the city will be rewarded with a lower premium.
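A purely illustrative sketch of such a mechanism is shown below: behavioural indicators are combined into a bounded premium adjustment factor. The loadings, bounds and base premium are assumptions made up for this example, not an actual insurer's tariff.

```python
# Map driving-behaviour indicators to a bounded premium adjustment factor.
# All loadings and bounds are illustrative assumptions.
def telematics_factor(speeding_share, harsh_brakes_per_100km, night_share):
    factor = 0.85                              # assumed base discount for sharing data
    factor += 0.50 * speeding_share            # loading for time spent speeding
    factor += 0.02 * harsh_brakes_per_100km    # loading per harsh braking event
    factor += 0.20 * night_share               # loading for night-time driving
    return min(max(factor, 0.70), 1.50)        # keep the adjustment within bounds

base_premium = 400.0
print(base_premium * telematics_factor(0.02, 1.0, 0.05))   # calm profile -> discount
print(base_premium * telematics_factor(0.30, 12.0, 0.40))  # risky profile -> surcharge
```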

In this way insurers will have an impact on people's driving behaviour. Once this communication between policyholder and insurer is transparent, the policyholder will act in a way that decreases his premium. The insurer then plays the role of a prevention officer.

Another example is "e-Health". As health costs are rising rapidly, insurers are trying to lower claim costs. Everyday living habits, for example eating behaviour, the amount of sleep you get or the number of hours of sport you do, have a large influence on health claims.

The Internet of Things will have an impact on the way pricing is done for each individual. Thanks to modern sensors, insurers will be able to acquire data at the individual, personal level. Each policyholder will in that way be encouraged to sleep enough, exercise enough and eat healthily. All in all, it is the consumer that benefits from fewer car accidents, a healthy lifestyle and … lower premiums.

5.4 The all-seeing Insurer
Insurance companies have always been interested in gathering as much information as possible on the risks being insured and the people insuring them. With the possibilities of Big Data, this interest in people's everyday life increases enormously. Insurance is therefore becoming more and more an embedded part of the everyday life of people and businesses. Previously, consumers just needed to fill in a form at the beginning of an insurance contract, and the impact of that insurance was more or less stable and predictable during the whole duration of the contract, whatever the future behaviour of the consumer. With the introduction of Big Data, insurers have influence on every aspect of everyday life. The way you drive, what you buy, what you don't buy, the way you sleep, etc., can have a big impact on your financial situation. Indeed, insurers are moving into the position of a central tower, observing our everyday life by means of smartphones and all other devices and sensors.

The future will tell us how far the general public will allow this influence of insurance companies. Sharing your driving behaviour with insurers will probably not be a problem for most of us, but sharing what we eat and how we sleep is a bigger step. Every person will have to make a trade-off between privacy and a better insurance offer. Currently, for instance in the case of car insurance telematics, drivers have an opt-in option and can decide whether they are interested in the telematics-based offer. In the future, however, data collection might be the default, and you may have to pay extra to opt out and keep your life private.


5.5 Change in Insurance business
From an actuarial point of view we tend to focus on the opportunities big data holds for managing and pricing risk. But the digital transformation that is at the basis of big data (cf. the increased data flow: the V's from section 2.1, and the increased computational power: section 2.2) has also led to a change in customers' expectations and behaviour. The ease with which the end customer can access information and interact with companies, and the way digital enterprises have developed their services to enhance this ease of use, have set a new standard in customer experience. Customers are used to getting quick, online reactions from the companies they buy goods and services from. Industries that do not adapt to this new standard can quickly acquire an image of being old-fashioned, traditional and simply not interesting. We have already seen new distribution models changing the insurance market in surrounding countries, e.g. aggregator websites in the UK, that result from (or play into) this trend. It is in this new customer experience that big data plays an important role and can be a real element of competitive advantage, as it gives access to a new level of personalization. Getting this personalization right can give a company the buy-in for future customer interactions and therefore the opportunity to expand the customer wallet or relationship. This has led to the evolution where some big digital retailers have continuously expanded their offer to a wide and loyal customer base, even into the insurance business (e.g. Alibaba Insurance). If these players get it right, they can change the insurance distribution landscape, monopolizing the customer relationship and leaving traditional insurers the role of pure risk carriers. For now this evolution is less noticeable in Belgium, where the traditional insurance distribution model (brokers and banks) still firmly holds its ground, giving the Belgian insurance industry an opportunity to modernize (read: digitalize) and personalize the customer experience before newcomers do so.

6 Actuarial sciences and the role of actuaries
Big Data opens a new world for insurance and any other activity based on data. The access to the data, the scope of the data, the frequency of the data and the extension of the samples of the data are important elements that determine to what extent the final decision is inspired by the statistical evidence. As Big Data changes those properties drastically, it also drastically changes the environment of those who use these data. The activity of the actuary is particularly influenced by the underlying data, and therefore it is appropriate to conclude that the development of the Big Data world has a major impact on the education and training of the actuary, the tools used by the actuary and the role of the actuary in the process. Data science, which aims to optimise analytics as a function of the volume and diversity of the data, is an emerging and fast-developing field. The combination of actuarial skills and research allows for an optimal implementation of the insights and tools offered by the data science world.

6.1 What is Big Data bringing for the actuary?
6.1.1 Knowledge gives power
Big data gives access to more information than before: this gives the actuary a richer basis for actuarial mathematical analysis. When data are more granular and readily available, actuaries can extend their analysis and better identify the risk factors and the underlying dependencies. Best estimate approaches are upgraded to stochastic evidence. Christophe Geissler13 states that big data will progressively stimulate the actuary to abandon purely explicative models in favour of more complex models aiming to identify heterogeneous subgroups. The explicative models are based on the assumption that there exists a formula that explains the behaviour of all persons. Big data and the available calculation power allow the development of innovative algorithms to detect visible and verifiable indicators of differing risk profiles.
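As a hedged, synthetic illustration of detecting such subgroups, the sketch below fits a shallow decision tree to simulated telematics indicators and reports the observed claim frequency per leaf. All data, variable names and parameters are invented for the example; the paper does not endorse a particular algorithm.

```python
# Segment policyholders into risk subgroups with a shallow decision tree on
# two simulated telematics indicators. Purely synthetic, illustrative data.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 5000
mileage = rng.uniform(2_000, 40_000, n)      # annual km driven (assumed)
speeding_share = rng.uniform(0.0, 0.4, n)    # share of time above the limit (assumed)

# Assumed "true" claim frequency rises with mileage and speeding.
lam = 0.03 + 0.000002 * mileage + 0.3 * speeding_share
claims = rng.poisson(lam)

X = np.column_stack([mileage, speeding_share])
tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=500).fit(X, claims)

# Report the observed claim frequency in each detected subgroup (tree leaf).
leaves = tree.apply(X)
for leaf_id in np.unique(leaves):
    mask = leaves == leaf_id
    print(f"subgroup {leaf_id}: {mask.sum()} policies, claim frequency {claims[mask].mean():.3f}")
```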

6.1.2 Dynamic risk management
"Even if an actuary uses data to develop an informed judgement, that type of estimate does not seem sufficient in today's era of Big Data", a statement that can be read on a discussion forum of actuaries. Instead, dynamic risk management is considered to be an advanced form of actuarial science. Actuarial science is about collecting all pertinent data, using models and expertise to factor risks, and then making a decision. Dynamic risk management entails real-time decision-making based on a stream of data.
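A minimal sketch of what such stream-based updating could look like is given below: a claim-frequency estimate is refreshed as each new observation arrives, using an exponentially weighted update. The smoothing constant and the toy data are assumptions, not part of this paper.

```python
# Update a risk indicator as observations stream in, instead of refitting
# on a fixed dataset. The smoothing constant alpha is an assumption.
def update_frequency(current_estimate, new_observation, alpha=0.05):
    return (1.0 - alpha) * current_estimate + alpha * new_observation

estimate = 0.10                      # prior claim frequency per exposure period
stream = [0, 0, 1, 0, 0, 0, 1, 0]    # incoming per-period claim indicators (toy data)
for observation in stream:
    estimate = update_frequency(estimate, observation)
print(round(estimate, 4))
```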

6.1.3 Scope and resources
Big data opens the horizon of the actuary. The applications of Big Data go far beyond the insurance activity and relate to various domains where statistical analysis and its economic and financial implications are essential. Jobert Koomans14, board member of the Actuarieel Genootschap, refers to estimates that Big Data will create a large number of jobs ("1.5 million new data analysts will be required in the US in 2018"). Actuaries have very strong analytical skills combined with business knowledge, thanks to being involved in everything from pricing to financial reporting, which gives them many new opportunities across different industries.

13 Christophe Geissler, Le nouveau big bang de l'actuariat, L'Argus de l'Assurance, November 2013

6.2 What is the actuary bringing to Big Data?
6.2.1 The Subject Matter Expert
Data are a tool to quantify the implications of events and behaviour. The initial modelling and analysis nevertheless define the framework and the ultimate outcome. Deductive and inductive approaches can be used in this context. Kevin Pledge15 refers to the role of the Subject Matter Expert: "Understanding the business is a critical factor for analytics, understanding does not come from a system, but from training and experience. … Not only do actuaries have the quantitative skills to be data scientists of insurance, but our involvement in everything from pricing to financial reporting gives us the business knowledge to make sense of this. This business knowledge is as important as the statistical and quant skills typically thought of when you think data scientist".

Actuaries are well placed to combine data analytics and business knowledge. The specific education of the actuary, as well as real-life experience in the insurance industry and other domains with actuarial roots, is essential for a successful implementation of the Big Data approach.

6.2.2 Streamlining the process
The actuary formulates the objectives and framework for the quantitative research and thereby initiates the Big Data process. Big data requires the appropriate technology and the use of advanced data science. Actuaries can help to optimise this computer-science-driven analysis with their in-depth understanding of the full cycle. Streamlining the full process, from detecting the needs and defining the models, through using the appropriate data, to monitoring the outcome while taking into account the general interest and specific stakeholder interests, is the key to the success of data science in the hands of the actuary.

6.2.3 Simple models with predictive power
Esko Kivisaari16: "The real challenge of Big Data for actuaries is to create valid models with good predictive power with the use of lots of data. The value of a good model is not that it is just adapted to the data at hand but it should have predictive power outside experience. There will be the temptation to create complicated models with lots of parameters that closely replicate what is in the data. The real challenge is to have the insight to still produce simple models that have real predictive power."
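A small synthetic demonstration of this point is sketched below: a simple fit and a heavily parameterised fit are compared on held-out data, so that in-sample closeness can be weighed against out-of-sample predictive power. The data, polynomial degrees and split are arbitrary assumptions for illustration.

```python
# Compare in-sample and out-of-sample error for a simple and a heavily
# parameterised polynomial fit on noisy synthetic data.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 60)
y = 2.0 * x + rng.normal(0.0, 0.3, 60)   # assumed true relation: linear plus noise

x_train, y_train = x[:40], y[:40]
x_test, y_test = x[40:], y[40:]

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    mse_train = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    mse_test = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {mse_train:.3f}, test MSE {mse_test:.3f}")
```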

The added value of the actuary can be found in the modelling skills and the ability to use professional judgement. The organisation of the profession and the interaction with peers create the framework that allows this judgement to be exercised. Actuaries also focus on appropriate communication of the results, so that the contribution to value creation can be optimized.

6.2.4 Information to the individual customer
Big Data can help to find answers to the needs of consumers and society. Customers will be informed about their behaviour so that they are able to correct, influence and change their risk behaviour. Actuaries will be in the perfect position to bring the data back to the customer, be it through the pricing of insurance products or through helping to establish awareness campaigns.

14 Jobert Koomans, Big Data – Kennis maakt macht, De Actuaris (Actuarieel Genootschap), May 2014
15 Kevin Pledge, Newsletters of the Society of Actuaries, October 2012
16 Esko Kivisaari, Big Data and actuarial mathematics, working paper, Insurance Committee of the Actuarial Association of Europe, March 2015


7 Conclusions
The rise of technology megatrends like ubiquitous mobile phones and social media, customer personalization, cloud computing and Big Data has an enormous impact on our daily lives but also on business operations. There are plenty of very successful businesses, across different industries, that regard Big Data as very important and central to their strategy.

In this information paper we wanted to understand what the impact of Big Data would be on the insurance industry and the actuarial profession. We asked ourselves whether insurers are immune to these recent changes. Will they be able to leverage the huge volumes of newly available data coming from various sources (mobile phones, social media, telematics sensors, wearables) and the power of Big Data?

We think that Big Data will have various effects. It will demand that companies adopt a new business culture and become data-driven businesses. It will have an impact on the entire insurance value chain, ranging from underwriting to claims management.

Today's advanced analytics in insurance go much further than traditional underwriting and actuarial science. Machine learning and predictive modelling are the way forward for insurers to improve pricing and segmentation and to increase profitability. For instance, direct measurement of driving behaviour provides new rating factors and transforms auto insurance underwriting and pricing processes.

Big Data can also play a tremendous role in the improvement of claims management, for instance by providing very efficient fraud detection models.

We would note that there are a few inhibitors that could block these changes, with legislation being one of the main concerns. The EU is currently working on the General Data Protection Regulation (GDPR), which updates data processing and privacy protection rules and establishes legislation adapted to the digital era. It is still unclear what the final agreement will be, but the Regulation must be appropriately balanced in order to guarantee a high level of protection of individuals and allow companies to preserve innovation and competitiveness.

Finally, we discussed the new frontiers of insurance. Big Data gives us a huge amount of information and allows the creation of "fairer", more personalized insurance premiums, which can be at odds with the solidarity aspect of insurance. However, we think that Big Data will not revolutionize this: risk pooling will remain core, it will just become better.

Big Data opens a lot of new possibilities for actuaries. Data science and actuarial science mutually reinforce each other. More data allow for a richer basis for actuarial mathematical analysis, and big data leads to a dynamic risk management approach; the applications of Big Data go far beyond the insurance activity and therefore offer a lot of new opportunities. The implementation of Big Data in insurance and the financial services industry requires the input of the actuary as the subject matter expert who also understands the complex methodology. For Big Data to be successful, understandable models with predictive power are required, for which the professional judgement of the actuary is essential.

We hope that this paper will be a good starting point for the discussion about the interplay between Big Data, insurance and the actuarial profession. The Institute for Actuaries in Belgium will further develop the subject and prepare the Belgian actuaries.


8 References
8.1 Section 3.1
[1] Predictive modeling for life insurance, Mike Batty, 2010, Deloitte (https://www.soa.org/files/pdf/research-pred-mod-life-batty.pdf)
[2] Predictive modeling in insurance: key issues to consider throughout the lifecycle of a model, Chris Homewood, 2012, Swiss Re (http://www.swissre.com/library/archive/?searchByType=1010965&searchByType=1010965&sort=descending&sort=descending&search=yes&search=yes&searchByLanguage=851547&searchByLanguage=851547&m=m&m=m&searchByCategory=1023505&searchByCategory=1023505&searchByYear=872532&searchByYear=872532#inline)
[3] Data analytics in life insurance: lessons from predictive underwriting, William Trump, 2014, Swiss Re (http://cgd.swissre.com/risk_dialogue_magazine/Healthcare_revolution/Data_Analytics_in_life_insurance.html)
[4] Advanced analytics and the art of underwriting: transforming the insurance industry, 2007, Deloitte (https://www.risknet.de/fileadmin/.../Deloitte-Underwriting-2007.pdf)
[5] Data Management: Foundation for a 360-degree Customer View – White Paper, 2012?, Pitney Bowes Software (http://www.retailsolutionsonline.com/doc/data-management-foundation-for-a-degree-customer-view-0001)
[6] Unleashing the value of advanced analytics in insurance, Richard Clarke and Ari Libarikian, 2014, McKinsey (http://www.mckinsey.com/insights/financial_services/unleashing_the_value_of_advanced_analytics_in_insurance)

8.2 Section 3.2
[7] Ptolemus, Usage-Based Insurance Global Study, 2013
[8] Milliman: Usage-based insurance: Big data, machine learning, and putting telematics to work, Marcus Looft and Scott C. Kurban
[9] Capitalizing on Big Data Analytics for the Insurance Industry
[10] Driving profitability and lowering costs in the Insurance Industry using Machine Learning on Hadoop, Amit Rawlani, August 6, 2015
[11] HSBC, Big data opens new horizons for insurers

8.3 Section 3.3
[12] Antonio, K., & Plat, R. (2014). Micro-level stochastic loss reserving in general insurance. Scandinavian Actuarial Journal, 649-669.
[13] Arjas, E. (1989). The Claims Reserving Problem in Non-Life Insurance: Some Structural Ideas. ASTIN Bulletin 19 (2), 140-152.
[14] England, P., & Verrall, R. (2002). Stochastic Claims Reserving in General Insurance. British Actuarial Journal (8), 443-544.
[15] Gremillet, M., Miehe, P., & Trufin, J. (n.d.). Implementing the Individual Claims Reserving Method, A New Approach in Non-Life Reserving. Working Paper.
[16] Haastrup, S., & Arjas, E. (1996). Claims Reserving in Continuous Time - A Nonparametric Bayesian Approach. ASTIN Bulletin (26), 139-164.
[17] Jewell, W. (1989). Predicting IBNYR Events and Delays, Part I Continuous Time. ASTIN Bulletin (19), 25-56.
[18] Jin, X., & Frees, E. W. (n.d.). Comparing Micro- and Macro-Level Loss Reserving Models. Working Paper.
[19] Larsen, C. R. (2007). An Individual Claims Reserving Model. ASTIN Bulletin (37), 113-132.
[20] Mack, T. (1993). Distribution-free calculation of the standard error of Chain Ladder reserve estimates. ASTIN Bulletin (23), 213-225.
[21] Mack, T. (1999). The standard error of Chain Ladder reserve estimate: recursive calculation and inclusion of a tail factor. ASTIN Bulletin (29), 361-366.
[22] Norberg, R. (1999). Prediction of Outstanding Liabilities II: Model Variations and Extensions. ASTIN Bulletin (29), 5-25.
[23] Norberg, R. (1993). Prediction of Outstanding Liabilities in Non-Life Insurance. ASTIN Bulletin (23), 95-115.


[24] Pigeon, M., Antonio, K., & Denuit, M. (2014). Individual loss reserving using paid-incurred data. Insurance: Mathematics & Economics (58), 121-131.
[25] Pigeon, M., Antonio, K., & Denuit, M. (2013). Individual Loss Reserving with the Multivariate Skew Normal Model. ASTIN Bulletin (43), 399-428.
[26] Wüthrich, M., & Merz, M. (2008). Modelling the claims development result for Solvency purposes. ASTIN Colloquium.
[27] Wüthrich, M., & Merz, M. (2008). Stochastic Claims Reserving Methods in Insurance. New York: Wiley.
[28] Zhao, X., & Zhou, X. (2010). Applying Copula Models to Individual Claim Loss Reserving Methods. Insurance: Mathematics and Economics (46), 290-299.
[29] Zhao, X., Zhou, X., & Wang, J. (2009). Semiparametric Model for Prediction of Individual Claim Loss Reserving. Insurance: Mathematics and Economics (45), 1-8.

8.4 Section 4
[30] Avraham, R., Logue, K. D., & Schwarcz, D. B. (2013). Understanding Insurance Anti-Discrimination Laws. Law & Economics Working Papers 52, University of Michigan.
[31] Davey, J. (2014). Genetic discrimination in insurance: lessons from Test-Achats. In De Paor, A., Quinn, G. and Blanck, P. (eds.), Genetic Discrimination - Transatlantic Perspectives on the Case for a European Level Legal Response. Abingdon.
[32] Joly, Y., et al. (2014). Life insurance: genomic stratification and risk classification. European Journal of Human Genetics, 22(5), 575-579.