
Large Commercial Risks (LCR) in Insurance:

Focus on Asia-Pacific

Principal Investigator: Andreas Milidonis,

University of Cyprus, Cyprus, & IRFRC, Nanyang Technological University, Singapore.

Co-Investigator: Enrico Biffis

Imperial College Business School, London, U.K.

External Collaborators: Davide Benedetti

Imperial College Business School, London, U.K.

September 2015

Please cite as: Benedetti, D., Biffis, E., and A. Milidonis (2015). Large Commercial Risks (LCR) in Insurance: Focus on Asia-Pacific, Insurance Risk and Finance Research Centre Technical report, Retrieved from www.irfrc.com.

Email: [email protected] / Website: www.irfrc.com


Table of Contents

1. Executive Summary

2. Motivation

3. Objectives

4. Data collection exercise

5. Descriptive Statistics

6. Methodology

7. Results

8. Price sensitivity analysis

9. Limitations

10. Acknowledgements

11. References

12. Appendix A: Data Collection Details


1. Executive Summary

We study losses in commercial insurance lines in the Asia-Pacific (APAC) region, by

using a new dataset resulting from contributions by SCOR Services Asia-Pacific Pte Ltd

and two large Lloyd’s syndicates, Hiscox and Liberty. We focus on man-made risks for

commercial, manufacturing, and on-shore energy exposures (typically referred to as

Large Commercial Risks; LCR), and provide a comparison of tail risk profiles within the

APAC region, by distinguishing between developed and developing countries, as well as

different occupancy types. We also explore the existence of possible structural breaks, by

looking at pre- vs. post- financial crisis data, and pre- vs. post-2011 Thai floods data.

The dataset, referred to as the IRFRC LCR dataset, was manually collected by research

teams at IRFRC (Nanyang Business School, NTU, Singapore) and at Imperial College

Business School (London, UK). The IRFRC LCR dataset is made freely available

through www.irfrc.com.

We develop our analysis in three steps. First, we measure tail risk by approximating the

tail behaviour of the claims with a power law (and estimating the tail index with different

methodologies), and by estimating a Generalized Pareto Distribution. As a second step,

instead of conditioning on relevant exposure characteristics, we analyse our data jointly,

by using tail index regression and Generalized Pareto Distribution with covariates. Third,

we construct simple benchmark portfolios of APAC exposures that allow us to quantify

the contribution of different occupancy types and geographical areas to average loss

severity and portfolio tail risk.
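
As a minimal illustration of the first step, the sketch below computes the classical Hill estimator of the tail index on synthetic Pareto data. It is not the estimation code used in this report: the function name `hill_tail_index`, the choice of `k`, and the simulated sample are assumptions made purely for illustration.

```python
import math
import random


def hill_tail_index(losses, k):
    """Classical Hill estimate of the tail index alpha.

    Uses the k largest observations, measured relative to the
    (k+1)-th largest one, which acts as the tail threshold.
    """
    if not 0 < k < len(losses):
        raise ValueError("k must satisfy 0 < k < len(losses)")
    ordered = sorted(losses, reverse=True)
    threshold = ordered[k]  # (k+1)-th largest observation
    # Average log-excess over the threshold estimates 1 / alpha.
    gamma = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / gamma


# Synthetic data only (assumption): exact Pareto losses with alpha = 1.5.
random.seed(0)
true_alpha = 1.5
sample = [random.paretovariate(true_alpha) for _ in range(5000)]
alpha_hat = hill_tail_index(sample, k=500)
print(f"Hill estimate of alpha: {alpha_hat:.2f} (true value {true_alpha})")
```

The number of upper order statistics `k` governs the trade-off between bias and variance: a small `k` uses only the most extreme losses and gives noisy estimates, whereas a large `k` brings in observations that may no longer follow the power law.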

The range of methodologies used in the analysis allows us to provide a more robust

picture of large commercial risks in the APAC region. The main results are that LCR



losses are in general heavy tailed, with manufacturing exposures more so than

commercial exposures. Increasing the exposure to developed countries in an APAC

insurance portfolio leads in general to an increase in average loss severity and portfolio

tail risk. Large shocks, such as the 2008 financial crisis, seem to lead to

lighter tail risk in the post-crisis period, the effect being more pronounced for developing

countries than developed ones. The empirical distribution of LCR losses for developed

APAC countries is well captured by a power law; that of developing APAC countries

presents a richer behavior, possibly reflecting greater heterogeneity of exposures.


2. Motivation

According to Swiss Re (2012), USD 600 billion of direct insurance premiums were

written in commercial insurance lines worldwide in 2010. The US is by far the largest

market, both in absolute dollar value terms and relative to the size of its economy, but the

APAC region is also of strategic importance: Japan, for example, is the second largest

market in the world, followed by China. Figures reported in Swiss Re (2012) indicate a

premium volume equal to USD 237bn for the US, 35.4bn for Japan, and 30.7bn for

China. In this project, we focus on commercial property and energy on-shore risk

exposures in the APAC region, and on man-made risks such as fire and explosion. We

aim to shed light on the heterogeneity of tail risk profiles of different risk exposures

within the APAC region, as well as to allow end users to quantify any divergence

between the experience in this area and in North America and Europe.

Despite the relevance of commercial lines for the global P&C business and the corporate

sector, limited public information is available on their risk characteristics, in particular

about extreme loss realizations, which make up a large proportion of the total claims

value. The reason is that the heterogeneity of businesses by type and size makes it

difficult for individual insurers to develop reliable statistical claims information, and

induces those who have larger portfolios and longer claims histories not to disclose

information for competitive reasons. Underwriters therefore tend to

apply a considerable degree of judgment in pricing decisions, often giving too much

weight to the value of reported claims. These may not adequately reflect the risk of the

business written, and the practice often exacerbates price volatility in response to claims occurrence.


In this research project we use a combination of Asia Pacific (APAC) LCR data

sourced from a global reinsurer (SCOR) and from two large Lloyd’s syndicates (Hiscox

and Liberty). The latter contributed to the creation of a worldwide LCR dataset in

collaboration with Enrico Biffis from Imperial College London, and with the support of

the Insurance Intellectual Capital Initiative (IICI) and Lloyd’s. An overview of the

Imperial-IICI dataset is offered in Biffis and Chavez (2014). Some of their findings on

the tail risk of LCR are discussed in Section 2.1 below, and compared with the results of

this project.

The dataset resulting from the combination of the APAC subset of the Imperial-IICI

dataset, and the APAC data provided by SCOR Services Asia-Pacific Pte Ltd, will be

referred to as the IRFRC LCR dataset. The database provides information on FGU (from

the ground up) losses that occurred during the period 2000-2013 in the APAC region for

commercial, manufacturing, energy on shore, residential and miscellaneous1 exposures.

In line with the information contained in the Imperial-IICI dataset, the focus is on

man-made risks, such as fire and explosion, which are often regarded as un-modeled risks. On

the other hand, natural catastrophes are excluded, as they are typically covered by

catastrophe models. In addition to FGU loss information, the dataset provides information

on the risk exposure, including location, occupancy type, and Total Insurable Value

(TIV). For anonymization purposes, aggregation of the two data sources has been carried

out by bucketing data into three time periods (2000-2003; 2004-2008; 2009-2013), and

replacing original currency and country information with the categorical values

1 Miscellaneous losses correspond to claims that, after the merge, could not be classified as commercial or manufacturing.


“developed country” and “developing country”. To define the latter, we follow the World

Bank’s economic development classification2. Details on the data collection,

anonymization, and aggregation procedures are discussed in section 4 and the data

appendix.

The combination of the Singapore and London market data sources into the IRFRC LCR

dataset represents an important step forward for insurance market participants and

researchers. In particular, the new dataset provides the following benefits:

(a) It provides the first publicly available overview of LCRs in the APAC region, thus

offering direct benefits to insurers already operating in the region, or planning to enter

the APAC market. More broadly, the dataset helps businesses and governmental

organizations concerned with LCRs to better understand risk exposures in the APAC

region.

(b) It allows market participants and researchers to draw comparisons with the global

market for LCRs (as presented in Swiss Re, 2012, or Biffis and Chavez, 2014), thus

identifying geographical sources of heterogeneity, which may reflect socio-economic

differentials, or the fact that more developed markets may cover risks not yet largely

underwritten in the APAC region. The importance of such comparisons is

demonstrated by similar studies being carried out by ISO-Verisk.3

(c) It provides historical information on losses and exposures associated with LCRs, to be

enjoyed by the data-contributing parties, as well as the wider community of

2 See www.worldbank.org.

3 See http://www.verisk.com/. John Buchanan and Chris Kent helped us to better understand this point by making available on a confidential basis some of their results on international claims cost comparison.


researchers and market participants. By offering broad information on TIV,

occupancy type, and location, the dataset improves on typical datasets used by the

academic community, which typically only focus on the loss dimension, without

allowing researchers to differentiate the shape of the tail distribution of LCRs by

exposure type.

(d) It allows end users to develop robust benchmarks that can be used to compare LCRs

across macro APAC areas (e.g., developed vs. developing countries) by exposure

type, and to identify spatial/time trends or structural breaks. When complemented

with more granular internal claim history and underwriting information, the

benchmarks can be used to inform the application of appropriate rating factors to

reference loss curves that are widely used, but have no reliable statistical value.

(e) It provides a better understanding of pricing and capital requirement differentials

associated with different mixes of business by exposure type and geographical area.

(f) It allows end users to improve the statistical value of their own claims experience, by

using a larger statistical basis as an anchor for internal models and underwriting

assumptions. The project represents a successful example of data enrichment strategy

that can benefit both data providers and the broader insurance community. Several

initiatives are currently addressing these issues, and will benefit from the output of

this project. For example, the Lloyd’s Market Association Loss & Exposure Data

Working Group (chaired by Mike Hood, XL Catlin), and a joint Casualty Actuarial

Society and Faculty and Institute of Actuaries Working Party (chaired by John

Buchanan, ISO Verisk), are working on the definition of data enrichment strategies


for the insurance market, and the definition of data quality standards for reinsurance

submissions, respectively.

(g) It ultimately provides a more robust view of LCRs, thus contributing towards reducing

pricing volatility and allowing corporations to better budget for, and benefit from,

insurance coverage for LCRs.

The Imperial-IICI dataset

The discussion of the Imperial-IICI dataset is based on Biffis and Chavez (2014). The

dataset contains claim and exposure information obtained from Hiscox and Liberty, two

leading syndicates of Lloyd’s of London. As the latter is a subscription market, the data

span business written by a number of other syndicates. Granular information on claims

and exposures was obtained from internal data systems, loss adjusters’ reports, and to a

large extent brokers’ submissions. The latter are documents informing the ‘lead’

underwriter of any claims occurring under a policy; the information is then shared with

the market, in order to allocate the losses to each ‘follower’, depending on the individual

retentions of the syndicates that co-insured the risk underwritten by the ‘lead’. It is in

general very difficult to recover FGU claims from the losses incurred by individual

syndicates, due to the complex layering and coinsurance arrangements characterizing

large commercial property insurance. Brokers’ submissions are therefore fundamental in

the analysis of Biffis and Chavez (2014).

All data were anonymized and aggregated by using fictitious claims and policy

identifiers. Internal validation of the data was carried out by looking at individual claims

narratives and policy schedules (documents listing the asset values insured under a


policy). External macro-validation was carried out by using data from fire protection

agencies as compiled by ISO Verisk.

The Imperial-IICI FGU claims provide aggregate information on indemnities for physical

damage and business interruption, as well as claims assessment and settlement fees. Both

claims and exposures are expressed in 2012 USD terms; the normalization is obtained by

trending claims and exposures at an average rate of 2.5% per annum across the two

syndicates. For the purpose of this project, we extrapolated the data to the end of 2014 by

using the same trending factors.

In terms of exposure information, in addition to TIV information, Biffis and Chavez

(2014) classify occupancy types by developing a classification based on three levels of

increasing granularity. The first one broadly classifies exposures into commercial (e.g.,

offices, banks, stores), manufacturing (e.g., utilities, food processors, mines), residential

property (e.g., hotels, hospitals), and energy on shore (e.g., oil refinery). The second

level, reported in Table 1 in Section 4, provides some more detail, allowing one to

distinguish, for example, a hotel from a hospital, or metals from food producers. The

third occupancy level offers a more granular view of the exposures, distinguishing for

example between large vs. small hotels, heavy vs. light fabrication infrastructure, and

food & drugs vs. chemicals vs. metal & minerals processing plants. Finally, occupancy

information is complemented by the claim narrative, which may also provide some more

information on the hazard event (e.g., burst of waterpipe, electrical failure, fire from hotel

restaurant).

The dataset developed in this project follows the Imperial-IICI template. As opposed to

Biffis and Chavez (2014), the analysis cannot be extended to the residential class, as it is


severely underrepresented in the APAC dataset. Similarly, the use of the second and third

levels of occupancy information is very limited due to small sample issues. Going beyond

Biffis and Chavez (2014), however, the analysis is partially extended to the

energy-on-shore class. Given the focus of the project on the APAC region, the analysis distinguishes

between developed and developing countries. Moreover, the study explores the presence

of structural breaks in the data, looking at pre- vs. post-financial crisis results, as well as

pre- vs. post-2011 Thai floods data.


3. Objectives

The main objectives of the project are summarized below:

1st objective: Improving our understanding of LCRs, which are largely unexplored

and often referred to as un-modeled risks.

LCRs are a largely unexplored area of research. As opposed to natural hazards, for which

studies based on catastrophe models are available, LCR losses generated by human error,

machine failure, and other non-natural hazards (e.g., fire and explosion) are poorly

understood, and difficult to model, due to the complex nonlinear relation between hazard

events and realized losses. This poses challenges that are relevant from a research

perspective. Even if losses are modelled in reduced form (i.e., abstracting away from a

structural link between hazard event and loss realization), the heterogeneity of LCRs

makes small sample issues particularly relevant when trying to gauge the shape of the tail

distribution of different exposure types. This provides an excellent opportunity to use

estimation techniques that explicitly address the trade-off between bias and efficiency in

tail risk estimation (Dacorogna et al., 1995), such as the Weighted Hill estimator

proposed by Huisman et al. (2001), or the Rank-1/2 approach proposed by Gabaix and

Ibragimov (2011) and used in Biffis and Chavez (2014). The results allow us to better

gauge the tail distribution of different occupancy types, and hence understand more

precisely the potential limits to diversification for LCRs. In addition to the popular

distributions used in actuarial modelling (Klugman et al., 2012), we use tail risk

estimation methodologies with an explicit view towards taking into account exposure

characteristics, in line with rating methodologies and experience-based methods widely


used by actuaries and underwriters (e.g., Michaelides et al., 1997, Guggisberg, 2004,

Riegel, 2010, Desmedt et al., 2012, Buchanan and Angelina, 2015).
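
To make the Rank-1/2 idea concrete, the sketch below regresses log(rank - 1/2) on log(loss size) for the `k` largest observations; the negative of the slope estimates the tail index, with an approximate standard error of alpha times the square root of 2/k, as proposed by Gabaix and Ibragimov (2011). The function name, the synthetic sample, and the value of `k` are assumptions for illustration, not the implementation behind the results reported here.

```python
import math
import random


def rank_half_tail_index(losses, k):
    """Rank-1/2 log-log regression on the k largest losses.

    Fits log(rank - 1/2) = a - alpha * log(loss) by ordinary least
    squares and returns (alpha, approximate standard error).
    """
    if not 2 <= k <= len(losses):
        raise ValueError("k must satisfy 2 <= k <= len(losses)")
    ordered = sorted(losses, reverse=True)[:k]
    xs = [math.log(x) for x in ordered]
    ys = [math.log(rank - 0.5) for rank in range(1, k + 1)]
    x_bar = sum(xs) / k
    y_bar = sum(ys) / k
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    alpha = -sxy / sxx  # slope is negative for a decaying tail
    std_err = alpha * math.sqrt(2.0 / k)
    return alpha, std_err


# Synthetic data only (assumption): exact Pareto losses with alpha = 2.
random.seed(1)
sample = [random.paretovariate(2.0) for _ in range(5000)]
alpha_hat, std_err = rank_half_tail_index(sample, k=500)
print(f"Rank-1/2 estimate: {alpha_hat:.2f} (s.e. {std_err:.2f})")
```

The shift of 1/2 in the rank is what reduces the small-sample bias of the plain log-log rank regression, which matters when only a few tail observations are available for a given exposure type.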

2nd objective: Developing benchmarks for understanding the tail risk profile of large

commercial losses.

Pricing and capital modeling exercises for LCRs by and large rely on baseline models

(such as first loss curves) that have limited statistical relevance, and are simply used as

reference benchmarks that need to be adjusted based on exposure characteristics, amount

of insurance, risk mitigation systems in place, as well as underwriters’ information. The

availability of LCR loss data with exposure information allows us to develop simple

empirical benchmarks that can be used to better understand how baseline models widely

used in the industry compare to their empirical counterparts, and whether benchmark

technical pricing tools factor in any margins for prudence along the loss severity

dimension. The focus on APAC exposures contributes to the important issue of

extrapolating more reliable information available for some developed countries (e.g.

North America) to other parts of the world. We are not aware of any publicly available

datasets that can inform such extrapolation exercises, in particular for the APAC region.

If our data are used to complement internal pricing models (which is beyond the scope of

this project), end users could test assumptions on how the market prices insurance for

different layers of risk in the presence of model uncertainty.

3rd objective: Exploring the structural determinants of LCR losses.

It is in general difficult to pin down the main drivers of divergence between LCR losses

across time and space. Anecdotal evidence, however, suggests that socio-economic


factors may affect loss severities considerably. Similarly, large economic shocks may

affect loss frequency and severity due to moral hazard, or induce a selection bias via

supply and demand considerations (e.g., higher insurance prices and lower take up of

coverage by low-risk corporates). We explore some of these dimensions by looking at

pre- vs. post-financial crisis data, as well as pre- vs. post- 2011 Thai floods data.

Although sample size considerations prevent us from supporting some of the views

shared by market participants (in relation to the impact of Thai floods, for example), we

find some evidence of a structural break induced by the global financial crisis. Beyond

the specific findings we discuss in this project, the approach demonstrates how data

enrichment efforts can allow market participants to develop a deeper understanding of

LCRs and potentially test for the statistical significance of structural assumptions that

may have important operational implications.


4. Data collection exercise

Chronologically, the Imperial-IICI dataset was already in place before this project began.

Hence, the data collection experience acquired through the Imperial-IICI dataset was

used to collect data through the IRFRC. In particular, the lists of relevant data fields and

exposure rating factors present in the Imperial-IICI database were used as a reference

template to collect data in Singapore. More details about the data collection process are

provided in the Data Appendix.

Challenges encountered in the data collection exercise include the following:

• Reinsurance treaty business poses some challenges that are different from the London

market data collection exercise. For example, top location profiles (the largest risks

insured in a treaty) may not coincide with the exposures actually giving rise to the

losses recorded.

• In line with the London market experience, a fundamental challenge is that recovering

FGU losses is in general difficult, and requires gaining access to external information

that is often stored in unstructured form, or cannot be searched electronically. Also as

in the London market, FGU information, when directly available in the

data providers’ systems, may just represent an approximation of the actual FGU. The

approximation may be useful for business purposes, but it typically does not allow us

to properly quantify FGU losses. The collection of FGU loss information therefore

required the use of brokers’ submissions and loss adjusters’ reports, as well as the

careful study of individual policy limits and deductibles.


• Another challenge is represented by the fact that some data providers may store

information on an event basis, others on a claim basis. This clearly required careful

reconciliation of data sources when a single event affected several contracts at the

same time.

• Although it will not be shown in the publicly available dataset, considerable effort

was placed into breaking down losses into physical damage and business interruption.

It is envisaged that such information may be made available if and when the dataset is

suitably enlarged following contributions from additional data providers.

Data in London and Singapore were collected in original currency (local country

currency). To make data comparable across time and space, we converted the amounts

into USD at the end of the relevant calendar year, and then inflated FGU losses and

exposures in order to express them in 2014 values. In line with the Imperial-IICI dataset,

we applied an inflation factor of 2.5% per annum. The factor was originally based on

averaging of assumptions used by individual data providers, and may not reflect actual

inflation experienced by losses and exposures in the APAC region. The idea is to use a

simple approach to make aggregation of data and comparison of results feasible.
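
The normalization just described can be sketched as follows. The function name, the example exchange rate, and the convention of compounding once per whole calendar year between the loss year and 2014 are assumptions made for illustration; the exact compounding convention applied to the dataset is not specified here.

```python
def to_2014_usd(amount_local, usd_per_local_year_end, loss_year,
                trend_rate=0.025, base_year=2014):
    """Convert a local-currency amount to USD and trend it to base_year.

    usd_per_local_year_end: USD value of one unit of local currency at
    the end of the loss year (hypothetical input, supplied by the user).
    """
    if loss_year > base_year:
        raise ValueError("loss_year must not exceed base_year")
    usd_amount = amount_local * usd_per_local_year_end
    # Compound the 2.5% p.a. trend over the whole years to base_year.
    return usd_amount * (1.0 + trend_rate) ** (base_year - loss_year)


# Hypothetical example: a loss of 1,000,000 in local currency in 2010,
# with an assumed year-end exchange rate of 0.80 USD per local unit.
trended = to_2014_usd(1_000_000, 0.80, 2010)
print(f"{trended:,.2f}")
```

In this example the trending factor is 1.025 raised to the fourth power, roughly 1.104, so the USD 800,000 loss is carried at about USD 883,050 in 2014 values.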

The losses in the aggregated and anonymized IRFRC LCR dataset are grouped into

different Economic Development, Period, and Occupancy classes. Economic

Development refers to the country where the loss occurred, and can be either developed

(D), if the loss happened in a developed country, or developing (represented by E, which

stands for “emerging”), if the loss occurred in a developing country. The losses are

grouped into 4 periods; period 1 if the loss occurred before or during 2003; period 2 if the

loss occurred during and between 2004 and 2008; period 3 if the loss occurred during the


years 2009 or 2010; finally period 4 if the losses occurred during 2011, 2012 or 2013.

Occupancy type refers to the type of exposure hit by the loss. Following Biffis and

Chavez (2014), we use a high-level classification based on Commercial (CO),

Manufacturing (MA), Residential (RE), Miscellaneous (MI), and Energy on Shore (EON)

exposures. Moreover, we use a more granular classification based on Table 1 reported

below. As the RE class is underrepresented in our APAC dataset, in the following we

limit our analysis to the classes MA, CO, and EON. As it turns out, sample size

considerations make our analysis more compelling for MA and CO exposures. We still

report some results for the EON class when statistical significance is reasonable.

Table 1: The table shows the categories for the second level of exposure information provided in Biffis and Chavez (2014).
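
The grouping of losses by Economic Development, Period, and Occupancy described in this section can be sketched as below. The four period boundaries follow the description given above; the function names and the format of the combined label (for example `E-4-MA`) are hypothetical and are not field names of the IRFRC LCR dataset.

```python
DEVELOPMENT_CODES = {"D", "E"}  # developed, developing ("emerging")
OCCUPANCY_CODES = {"CO", "MA", "RE", "MI", "EON"}
ANALYSED_OCCUPANCIES = {"CO", "MA", "EON"}  # RE and MI are left out


def period_bucket(year):
    """Map a loss year to one of the four periods described in the text."""
    if year <= 2003:
        return 1
    if year <= 2008:
        return 2
    if year <= 2010:
        return 3
    if year <= 2013:
        return 4
    raise ValueError("loss year outside the 2000-2013 sample")


def class_key(development, year, occupancy):
    """Build a hypothetical label combining the three grouping variables."""
    if development not in DEVELOPMENT_CODES:
        raise ValueError("development must be 'D' or 'E'")
    if occupancy not in OCCUPANCY_CODES:
        raise ValueError("unknown occupancy code")
    return f"{development}-{period_bucket(year)}-{occupancy}"


# A manufacturing loss in a developing country in 2012.
label = class_key("E", 2012, "MA")
print(label, "MA" in ANALYSED_OCCUPANCIES)
```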


5. Descriptive Statistics

This section presents descriptive statistics and univariate analysis of our dataset in

Tables 2 through 4. The focus is FGU loss, our main variable of interest. In the

presentation of the results we first show the entire dataset (all collected losses) over the

period 2000-2013, and then present results based on three occupancy types: CO, MA, and

EON. We then also split the data by economic development classification (developed and

developing countries).

Table 2 is based on the whole period from 2000 to 2013. Panel A shows summary

statistics for the whole group, the group of developed countries, and that of developing

countries. We can see that our dataset is composed of 526 losses. As mentioned before,

we ignore classes RE and MI, and among the remaining 508 data points we have

180 that occurred in developed countries, and the remaining 328 in developing ones.

Hence, around two thirds of our data come from developing economies.

The average loss for developed countries is almost 4 times larger than the average

loss for developing countries, and almost twice as large as the average loss for the total

group. The first, second and third quartiles for developed countries are also higher.

Volatility, expressed in terms of standard deviation, is also quite large. However, in terms

of coefficient of variation (CV, measured as standard deviation divided by the average)

this group is less volatile, with a CV of 1.77. The group with the highest variability is the

developing one, with a CV of 2.58. Each of the three groups of panel A shows positive

skewness and high values of kurtosis, to indicate a high likelihood of positive extremes.


Comparing the quartiles with the average, we notice that almost all of them

are below the mean, which signals strong positive skewness. Surprisingly, the group of

developed countries is the group with the lowest kurtosis. This tells us that although exposures

in developed economies tend to have on average higher losses (which might reflect the

fact that insurance policy schedules contain high-value insurable assets, such as

buildings, infrastructure, and machinery), in terms of kurtosis and likelihood of extremes

developing countries have a big cluster of losses in the higher quantiles. As a result, from

these descriptive statistics it appears that developing countries have thicker tails.
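
For readers who wish to reproduce this type of summary on their own data, the sketch below computes the statistics discussed above (mean, quartiles, standard deviation, CV, skewness, and kurtosis) for a list of losses. The toy input is invented, and the specific conventions (sample standard deviation, moment-based skewness, non-excess kurtosis, default quartile interpolation) are assumptions; the tables in this report may rely on different ones.

```python
import statistics


def describe(losses):
    """Summary statistics for a list of FGU losses (needs >= 2 values)."""
    n = len(losses)
    mean = statistics.fmean(losses)
    sd = statistics.stdev(losses)  # sample standard deviation
    # Central moments for moment-based skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in losses) / n
    m3 = sum((x - mean) ** 3 for x in losses) / n
    m4 = sum((x - mean) ** 4 for x in losses) / n
    q1, median, q3 = statistics.quantiles(losses, n=4)
    return {
        "n": n,
        "mean": mean,
        "q1": q1,
        "median": median,
        "q3": q3,
        "sd": sd,
        "cv": sd / mean,            # coefficient of variation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,   # equals 3 for a normal distribution
    }


# Invented toy sample: four small losses and one very large loss.
example = describe([1.0, 2.0, 3.0, 4.0, 100.0])
for name, value in example.items():
    print(f"{name:>9}: {value:.3f}")
```

In the toy sample a single large loss pulls the mean (22) far above the median (3) and produces positive skewness, the same qualitative pattern noted above for the LCR losses.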

Panel B of Table 2 considers only occupancy type CO exposures. The total

number of observations we have amounts to 81 losses, of which 33 in developed

countries, and 48 in developing ones. The considerations are similar to those of panel A,

as the developed countries group is the one with higher mean and quartiles, but lower CV

(hence variability), lower skewness and lower kurtosis.

Panel C focuses on occupancy type EON. We have 44 losses split into 20 for

developed countries and 24 for developing ones. For this type of occupancy not only the

average, but also CV, skewness and kurtosis of the developed group are higher. Hence,

EON exposures in developed countries not only have higher losses on average, but also a

bigger cluster of losses in the high quantiles.

Panel D shows descriptive statistics for occupancy type MA, which accounts

alone for 383 losses (75% of the sample), divided into 127 losses for developed countries

and 256 for developing ones. Considerations are similar to those for panels A and B.


Comparing the panels with one another, we can see that the average loss in the

total sample does not change significantly across different occupancy types. What

changes is the kurtosis, which tends to be lower for the CO group compared to the others.

For developed countries, CO exposures have lower average losses, while EON and MA

exposures have average losses slightly above the total average of developed countries in

panel A (total). The CO class is also the one with lower kurtosis in general. For the

developing group instead, EON exposures are those with higher average losses, while

MA exposures have the lowest average.

Table 3 is based on the period before the financial crisis, from 2000 to 2008. As

before, panel A shows summary statistics for the whole group, panel B focuses only on

the CO class, while panels C and D deal with EON and MA, respectively. As we can see

from panel A, the total group corresponds to 272 losses, almost evenly distributed among

developed and developing countries. It is worth mentioning that since we have a total of

272 losses for the nine calendar years before the crisis (2000 to 2008), this leaves us with 236 losses for the

following five years, from 2009 to 2013. Hence, we have a roughly balanced sample before and after

the crisis, even though the post-crisis window is only about half as long as the

pre-crisis one. The average loss of developed countries is higher than that of developing

ones, while the CV for developed countries is lower, meaning a lower variability.

Quartiles, skewness and kurtosis of the developed group are also higher. These results are

in line with what we found for the whole sample, as shown in table 2 and described

above.

Panel B refers to the CO occupancy, and has 41 observations; 26 for developed

and just 15 for developing countries. Here the average losses of developed and


developing countries are similar (slightly higher for developing), but the quartiles of

developed countries are still higher. Moreover, even the CV is still lower for developed

countries.

For the EON class (panel C) we only have 31 losses; 17 for developed countries

and 14 for developing ones. The average loss of developed countries is again higher here.

The quartiles are higher for developed countries, with the exception of the first one

(similar to panel C of the previous table). As in panel C of Table 2, the CV for the EON

class is higher in developed countries. Hence, losses in developed countries seem in general to

have lower variability relative to their mean, except in the subsample of

energy companies.

The MA group (panel D) accounts for 200 losses split into 96 for developed and

104 for developing countries. This table confirms the results we had so far. Developed

countries have higher average loss and lower variability. Skewness is always positive

while kurtosis is generally high. Averages are always above the median. For developed

countries the average loss is close to the third quartile, while for developing countries it is

above that.

Comparing occupancy types with one another, the average loss of CO is smaller than

that of MA and EON, in both the total and developed groups, while EON has higher average

loss in the developing subsample. These results are consistent with those obtained in the

whole period as we saw in table 2.

Table 4 is based on the period after the financial crisis, from 2009 to 2013. Again,

panel A shows summary statistics for the whole group, while panel B focuses only on

the CO class, and C and D deal with EON and MA, respectively. As we already

discussed, we have 236 losses from 2009 onward, of which only 41 in developed

countries, while 195 happened in the developing ones. Hence, after the crisis the majority

of our losses (83%) happened in the subsample of developing economies.

CO exposures (panel B) have 40 losses, of which 7 in developed countries, and 33

in developing ones. EON exposures have only 13 losses, of which 3 in developed

countries, and 10 in developing ones. Finally, MA is the group with the highest number

of observations, having 183 losses split into 31 for developed countries and 152 for the

developing ones. Results here are consistent with those obtained in the period before the

crisis. Developed countries have higher average loss, higher quartiles and lower CV

(hence lower variability), except for the EON group (panel C). Skewness and kurtosis are

positive and high except for the EON class, which has zero skewness and low kurtosis.

Average losses are always higher than medians, and close to or higher than the third

quartile. During this period, developing countries and the MA subsample present an

exceptionally high kurtosis.

Tables 5 and 6 report univariate analyses testing the difference between average

losses across different subsamples. Table 5 shows t-statistics for differences in average

losses of developed and developing economies, and of different periods (before and after

crisis). Panel A is based on the whole sample of losses. It shows a statistically significant

difference between type of country (developed vs. developing), both before and after

crisis, meaning that the average loss in developed countries is higher than that in

developing ones, both before and after the crisis. It also shows a statistically significant

difference between periods (before vs. after crisis) only for the subsample of developing

countries, with a lower average loss after the crisis. Panels B, C and D in Table 5 refer to

the subsamples of CO, EON and MA losses, respectively. For CO there does not seem to

be any significant difference across groups. For EON, in the post-crisis period only,

developing countries have a significantly higher average loss than developed countries. For

MA, instead, results are similar to those of the whole sample.
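To illustrate how such a comparison of means can be computed from summary statistics alone, the sketch below evaluates a two-sample t-statistic for developed vs. developing countries before the crisis. The text does not state which variant of the test was used; the unequal-variance (Welch) form is assumed here, so the value need not coincide exactly with the one reported in Table 5. The function name `welch_t` is ours.

```python
import math

def welch_t(mean1, std1, n1, mean2, std2, n2):
    """t-statistic for the difference of two sample means (unequal variances)."""
    standard_error = math.sqrt(std1**2 / n1 + std2**2 / n2)
    return (mean1 - mean2) / standard_error

# Developed vs. developing, 2000-2008 (Avg, Std, N from Table 3, Panel A).
t_before = welch_t(32_514_052, 58_790_774, 139, 12_812_599, 25_243_728, 133)

print(f"t-statistic (before crisis): {t_before:.2f}")
print("exceeds 1.96:", abs(t_before) > 1.96)
```

Under this assumption the statistic is well above 1.96, which agrees qualitatively with the significant difference between developed and developing countries discussed above.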

Table 6 shows differences between CO and MA occupancy types across different

subsamples. Panel A refers to the whole sample, while panel B and C focus on the

periods before and after the crisis, respectively. There is no statistically significant

difference between CO and MA in the overall sample, both before and after the crisis.

However, if we focus on developed countries only, the MA class has significantly

higher average losses than the CO class, both during the whole period and

before the crisis. There is no statistically significant difference between CO and MA for

developing countries.

Table 2 (part 1): The table shows descriptive Statistics for FGU (from the ground up) losses for the entire sample period 2000-2013. The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.

Panel A: Total

              Total     Developed    Developing
N               508           180           328
Avg      15,635,610    30,323,265     7,575,311
Std      37,118,930    53,541,941    19,572,282
Skw               5             4             5
Kur              41            22            29
Min         140,503       157,431       140,503
Q1          438,854     4,370,947       306,261
Q2        2,543,246    10,647,249       766,226
Q3       13,485,010    34,863,759     5,920,833
Max     365,698,968   365,698,968   171,771,593

Panel B: Commercial



Max 112,806,811 82,730,286 112,806,811

Table 2 (part 2): The table shows descriptive Statistics for FGU (from the ground up) for the entire sample period 2000-2013. The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.

Panel C: Energy on Shore


              Total     Developed    Developing
N                44            20            24
Avg      23,770,287    32,801,006    16,244,688
Std      57,012,275    80,107,204    25,410,331
Skw               5             4             3
Kur              35            21            17
Min         533,472       533,472       535,567
Q1        5,971,383     4,632,006     6,386,209
Q2        8,240,584     9,182,496     8,179,591
Q3       15,413,632    28,222,177    14,266,922
Max     365,698,968   365,698,968   121,515,749

Panel D: Manufacturing



Max 327,487,391 327,487,391 171,771,593

Table 3 (part 1): The table shows descriptive Statistics for FGU (from the ground up) losses during the period 2000-2008 (before crisis period). The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.

Panel A: Total


              Total     Developed    Developing
N               272           139           133
Avg      22,880,621    32,514,052    12,812,599
Std      46,559,671    58,790,774    25,243,728
Skw               4             4             3
Kur              27            19            10
Min         140,503       157,431       140,503
Q1          849,165     3,847,560       305,632
Q2        6,986,531    10,181,198     1,788,444
Q3       22,652,456    35,915,523     9,514,080
Max     365,698,968   365,698,968   121,515,749

Panel B: Commercial



Max 112,806,811 82,730,286 112,806,811

Table 3 (part 2): The table shows descriptive Statistics for FGU (from the ground up) losses during the period 2000-2008 (before crisis period). The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.



Panel C: Energy on Shore

              Total     Developed    Developing
N                31            17            14
Avg      29,532,027    37,631,216    19,697,296
Std      67,292,390    86,342,611    32,931,134
Skw               4             3             2
Kur              25            18            11
Min         533,472       533,472     4,614,178
Q1        6,152,414     4,459,795     6,168,552
Q2        8,278,230    10,284,547     7,317,686
Q3       18,812,095    36,216,828    10,646,318
Max     365,698,968   365,698,968   121,515,749



Panel D: Manufacturing

              Total     Developed    Developing
N               200            96           104
Avg      23,235,858    36,196,153    11,272,508
Std      45,624,397    59,448,134    21,472,077
Skw               4             3             2
Kur              24            15             9
Min         140,503       249,429       140,503
Q1          653,394     4,691,163       280,360
Q2        7,069,782    12,225,456     1,041,258
Q3       25,450,886    40,212,623     9,868,709
Max     327,487,391   327,487,391    95,142,364

Table 4 (part 1): The table shows descriptive Statistics for FGU (from the ground up) losses during the period 2009-2013 (after crisis period). The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.

Panel A: Total


              Total     Developed    Developing
N               236            41           195
Avg       7,285,428    22,895,965     4,003,212
Std      18,490,366    28,827,275    13,434,179
Skw               6             2            10
Kur              42             9           130
Min         142,156       205,392       142,156
Q1          342,151     5,560,269       305,623
Q2          865,529    14,508,489       630,431
Q3        6,319,004    25,334,197     2,413,517
Max     171,771,593   128,764,052   171,771,593

Panel B: Commercial



Max 35,981,821 35,981,821 30,867,774

Table 4 (part 2): The table shows descriptive Statistics for FGU (from the ground up) losses during the period 2009-2013 (after crisis period). The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.



Panel C: Energy on Shore

              Total     Developed    Developing
N                13             3            10
Avg      10,030,755     5,429,816    11,411,037
Std       6,027,777       774,942     6,256,112
Skw               0             0             0
Kur               2           NaN             2
Min         535,567     4,600,150       535,567
Q1        5,744,446     4,838,699     6,702,544
Q2        8,202,938     5,554,346    11,810,886
Q3       14,841,126     5,989,801    16,558,643
Max      20,088,128     6,134,952    20,088,128




Max 171,771,593 128,764,052 171,771,593

Table 5: The table shows averages for FGU (from the ground up) losses by Economic Development and Period (before or after crisis), and the relative t-Statistic for the difference between the two sample means. We denote t-statistic values above 1.96 in bold and italics, as they show a statistically significant difference between the two sub-samples examined. Panel A refers to the whole dataset. Panel B refers to CO occupancy only. Panel C refers to EON occupancy only. Panel D refers to MA occupancy only.

Panel A: Total

                   Developed    Developing   t-Statistic
Before Crisis     32,514,052    12,812,599          3.87
After Crisis      22,895,965     4,003,212          4.14
t-Statistic             1.46          4.99

Panel B: Commercial









Table 6: The table shows averages for FGU (losses from the ground up) by occupancy type, and the relative t-Statistic for the difference between the two sample means. We denote t-statistic values above 1.96 in bold and italics, as they show a statistically significant difference between the two sub-samples examined. Panel A refers to the whole dataset. Panel B refers to pre-crisis period only. Panel C refers to the after crisis period only.

Panel A: Total


                       Total     Developed    Developing
Commercial        11,422,102    15,210,288     8,817,725
Manufacturing     15,592,185    33,860,064     6,529,604
t-Statistic             1.34          3.10          0.65

Panel B: Before Crisis



Panel C: After Crisis



Tables 7 to 9 show summary statistics for the total insurable value (TIV),

corresponding to the total asset values that could be insured. This variable was not

available in our data for every loss. When it was missing we used the total sum insured

(TSI) as a proxy for it, and we left it blank when no information was available. We will

call this variable TIV* to clarify the fact that the TIV is not always available. We should

also emphasize that even when the TIV is available, it is not consistently defined across

the dataset, as it may be derived, for example, from a top location profile or a policy

profile, thus overestimating the actual TIV (see Riegel, 2010, for a discussion of different

types of exposure information). When the TSI is used instead of the TIV, then we are

clearly underestimating the exposure. Note that descriptive statistics are computed based

only on the non-zero values of TIV*.
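The construction of TIV* described above can be sketched as follows. The records are hypothetical (they are not taken from our dataset), `None` stands for a missing entry, and the function name `tiv_star` is ours.

```python
from statistics import mean

def tiv_star(tiv, tsi):
    """Return TIV if available, else TSI as a proxy, else None (left blank)."""
    if tiv is not None:
        return tiv
    return tsi  # may itself be None when no exposure information exists

# Hypothetical (TIV, TSI) pairs; None marks a missing entry.
records = [(1_200_000_000, 900_000_000), (None, 450_000_000), (None, None)]

values = [tiv_star(tiv, tsi) for tiv, tsi in records]
# Descriptive statistics use only the available, non-zero values.
usable = [v for v in values if v]

print(values)
print(mean(usable))
```

The third record stays blank and is excluded from the statistics, which is why the number of observations for TIV* is smaller than for FGU losses.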

Panel A of table 7 shows summary statistics for the whole sample, while panels B,

C and D focus on the CO, EON and MA class respectively. Average TIV* of developed

countries is always higher than that of developing ones, across all panels of table 7.

However, in terms of quartiles the relation is not as clear as it was for FGU losses. In

terms of CV, the developing group is the one with highest variability only in the whole

sample and the MA subsample (panels A and D). The distribution of this variable across

different subsamples is positively skewed with high kurtosis, similar to that of FGUs.

Comparing different occupancy types, we see that MA has higher average TIV* for all

geographic configurations, followed by EON, and lastly CO.

Table 8 refers to the period before the crisis only, with panels A, B, C and D

focusing on total, CO, EON and MA respectively. The relation here is not as clear as

before. For total and MA, the developed countries group has lower average and lower

quartiles than developing ones. For CO and EON instead, the developed countries group

has higher average TIV* but lower quartiles (except the third one). As before, the CV is

higher for developing countries in the whole sample and the MA subsample, but higher

for developed countries in the CO and EON class. Skewness and kurtosis are always

positive as before. Manufacturing is still the group with highest average TIV* for all

geographic splits, followed again by EON and lastly CO.

Table 9 refers to the period after the crisis only, with panels A, B, C and D

focusing on the total sample and the CO, EON and MA classes, respectively. Average

TIV* is generally higher here for developed countries except for the MA subsample. The

relation in terms of quartiles is again not very clear. In terms of CV, this time the

developing countries group is always the one with lowest variability with respect to the

developed one. The EON subsample, during this period, has also zero skewness and low

kurtosis (similar to the case of FGUs). Comparing across occupancy types, MA is no longer

the class with the highest average TIV*: EON has the highest

average in all geographic splits.

Table 7 (part 1): The table shows descriptive Statistics for TIV* (total insurable value) for the entire sample period 2000-2013. The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.

Panel A: Total


                  Total         Developed        Developing
N                   452               133               319
Avg       2,166,960,565     2,425,663,982     2,059,099,893
Std      15,289,334,724     9,998,003,110    17,029,396,779
Skw                   5                 4                 5
Kur                  41                22                29
Min             787,252         1,069,725           787,252
Q1          268,716,208        42,571,677       441,748,792
Q2          780,976,256       334,003,896       900,762,469
Q3        1,454,611,208     1,218,786,562     1,494,971,080
Max     304,794,042,837   100,546,318,805   304,794,042,837

Panel B: Commercial


                  Total         Developed        Developing
N                    72                24                48
Avg         865,121,287       935,732,856       829,815,502
Std         917,555,393     1,130,830,023       801,383,175
Skw                   3                 2                 4
Kur                  13                 6                19
Min             787,252         1,851,096           787,252
Q1          174,860,492        38,082,152       292,019,131
Q2          689,759,816       288,215,111       705,774,330
Q3        1,127,453,593     1,885,655,377     1,014,072,721
Max       4,001,683,538     3,664,345,555     4,001,683,538

*TIV was replaced by TSI (total sum insured) when unavailable.

Table 7 (part 2): The table shows descriptive Statistics for TIV* (total insurable value) for the entire sample period 2000-2013. The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.



Panel C: Energy on Shore

                  Total         Developed        Developing
N                    39                15                24
Avg       1,688,299,126     2,438,517,314     1,219,412,758
Std       3,388,105,740     5,336,299,469     1,017,834,334
Skw                   5                 4                 3
Kur                  35                21                17
Min           1,069,725         1,069,725        97,688,983
Q1          244,006,261        39,015,793       526,578,931
Q2          912,685,266       256,525,203     1,006,514,269
Q3        1,452,157,006     2,576,308,696     1,388,220,841
Max      20,706,158,419    20,706,158,419     3,883,766,841



Panel D: Manufacturing

                  Total         Developed        Developing
N                   341                94               247
Avg       2,496,579,991     2,804,020,865     2,379,578,201
Std      17,553,569,883    11,685,813,090    19,344,047,633
Skw                   5                 3                 5
Kur                  34                17                37
Min           3,044,195         4,988,207         3,044,195
Q1          312,265,655        43,111,663       463,227,247
Q2          804,911,129       346,898,230       957,255,016
Q3        1,493,503,149     1,087,026,440     1,611,459,713


Table 8 (part 1): The table shows descriptive Statistics for TIV* (total insurable value) for the period 2000-2008 (before crisis period). The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.

Panel A: Total


                  Total         Developed        Developing
N                   223                99               124
Avg       3,064,137,956     2,653,725,704     3,391,805,802
Std      21,637,266,845    11,167,876,342    27,301,588,774
Skw                   4                 4                 3
Kur                  27                19                10
Min             787,252         1,069,725           787,252
Q1          183,947,649        44,169,842       312,265,655
Q2          528,124,738       409,636,121       660,594,894
Q3        1,287,170,784     1,321,509,909     1,264,808,937
Max     304,794,042,837   100,546,318,805   304,794,042,837

Panel B: Commercial


                  Total         Developed        Developing
N                    32                17                15
Avg         732,603,458       812,895,876       641,605,385
Std         889,783,253     1,097,238,448       600,381,985
Skw                   2                 2                 2
Kur                   8                 6                 7
Min             787,252         1,851,096           787,252
Q1           34,051,911        28,813,611       199,728,199
Q2          409,434,084       162,662,436       581,993,680
Q3          970,985,043     1,755,524,468       894,503,311


Table 8 (part 2): The table shows descriptive Statistics for TIV* (total insurable value) for the period 2000-2008 (before crisis period). The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.



Panel C: Energy on Shore

                  Total         Developed        Developing
N                    26                12                14
Avg       1,146,968,550     1,225,458,913     1,079,691,097
Std       1,371,644,358     1,905,617,804       731,318,006
Skw                   4                 3                 2
Kur                  25                18                11
Min           1,069,725         1,069,725        97,688,983
Q1           97,688,983        30,950,193       662,238,520
Q2          897,463,412       248,179,241     1,143,699,061
Q3        1,260,348,510     2,053,742,326     1,260,348,510
Max       5,943,175,016     5,943,175,016     2,455,591,004



Panel D: Manufacturing

                  Total         Developed        Developing
N                   165                70                95
Avg       3,818,413,765     3,345,630,112     4,166,780,667
Std      25,121,541,436    13,213,902,149    31,186,465,472
Skw                   4                 3                 2
Kur                  24                15                 9
Min           3,044,195        10,678,828         3,044,195
Q1          200,923,671        95,982,120       316,975,259
Q2          521,862,038       445,727,863       606,204,207
Q3        1,327,416,546     1,303,789,998     1,353,994,775


Table 9 (part 1): The table shows descriptive Statistics for TIV* (total insurable value) for the period 2009-2013 (after crisis period). The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.

Panel A: Total


                  Total         Developed        Developing
N                   229                34               195
Avg       1,293,290,005     1,761,601,907     1,211,635,622
Std       2,235,211,910     5,370,748,469       959,186,171
Skw                   6                 2                10
Kur                  42                 9               130
Min           4,248,686         4,988,207         4,248,686
Q1          422,545,507        42,112,393       548,254,897
Q2          938,784,035       122,202,747     1,027,992,397
Q3        1,545,025,599       797,249,883     1,611,459,713
Max      24,587,215,882    24,587,215,882     5,764,621,032

Panel B: Commercial


                  Total         Developed        Developing
N                    40                 7                33
Avg         971,135,549     1,234,051,235       915,365,555
Std         936,726,534     1,242,945,182       872,582,710
Skw                   2                 1                 2
Kur                   5                 2                 6
Min          15,644,074        42,112,393        15,644,074
Q1          292,019,131       140,933,143       297,845,475
Q2          780,976,256       797,249,883       780,976,256
Q3        1,299,388,334     2,242,579,190     1,183,129,326


Table 9 (part 2): The table shows descriptive Statistics for TIV* (total insurable value) for the period 2009-2013 (after crisis period). The summary statistics reported are: Number of observations (N), Minimum and Maximum values (Min and Max), Average (Avg), Standard deviation (Std), Skewness (Skw), Kurtosis (Kur), First, Second and Third Quartile (Q1, Q2, Q3). Panel A refers to the whole dataset: total and split by Economic Development. Panel B refers to CO occupancy only: total and split by Economic Development. Panel C refers to EON occupancy only: total and split by Economic Development. Panel D refers to MA occupancy only: total and split by Economic Development.



Panel C: Energy on Shore

                  Total         Developed        Developing
N                    13                 3                10
Avg       2,770,960,276     7,290,750,918     1,415,023,083
Std       5,525,091,760    11,628,187,216     1,342,427,839
Skw                   0                 0                 0
Kur                   2                 -                 2
Min          98,414,191        98,414,191       319,060,580
Q1          473,779,627       340,730,679       523,842,003
Q2          912,685,266     1,067,680,144       755,753,418
Q3        2,649,656,011    15,796,538,850     2,372,029,659
Max      20,706,158,419    20,706,158,419     3,883,766,841



Panel D: Manufacturing

                  Total         Developed        Developing
N                   176                24               152
Avg       1,257,360,827     1,224,327,226     1,262,576,659
Std       2,008,300,837     4,985,166,983       942,726,962
Skw                   5                 2                11
Kur                  37                 7               128
Min           4,248,686         4,988,207         4,248,686
Q1          483,142,753        17,798,827       626,499,512
Q2          998,276,124        81,208,353     1,062,636,026
Q3        1,572,037,563       306,880,033     1,659,202,309


6. Methodology

In this section we provide a detailed explanation of our methodology and the theory

behind it. In part (a) we list the models we implement, and discuss the selection of

estimators suitable for our analysis. In part (b) we describe the theory behind the models

and methodology implemented. In part (c) we describe how we apply the methodology of

parts (a) and (b) to our dataset in order to make our results easily replicable.

a. Approach

Since we are interested in large risks, we focus on the tail risk profile of LCR losses. We

therefore rely on Pareto and Generalized Pareto distributions. These probabilistic models

are well known to provide parsimonious proxies for the shape of the tail of the

distribution of large losses, or characterize the asymptotic behavior of the tail as

exceedances over larger and larger thresholds are considered (see Beirlant et al., 2006, or

Embrechts et al., 1997, for example). Our dataset consists of losses above USD140k, but

for robustness the analysis is systematically applied to higher thresholds representing

different fractions of the original sample.

We focus on the Hill and the Weighted Hill estimator for the Pareto distribution, and

on the peak over threshold (POT) method for the Generalized Pareto distribution. The

Hill estimator is a maximum likelihood type of estimator, which applies to losses above a

given threshold. It provides a simple analytical expression to estimate the single

parameter of the Pareto model. However, it strongly depends on the chosen threshold. If

the threshold is not high enough, the estimator might be biased if the underlying

distribution is a more general Power Law. At the same time, a higher threshold reduces

the sample size, leading to loss of efficiency. This trade-off is well understood since the

pioneering works of Hall (1990) and Dacorogna et al. (1995). The Weighted Hill

estimator of Huisman et al. (2001) addresses small sample issues by considering a

weighted average of different Hill estimates computed for increasingly high thresholds.

The POT approach (see Embrechts et al., 1997) is an estimation methodology which

applies to the excess losses over a high threshold. It is based on the assumption that the

conditional distribution of these excess losses will converge to a Generalized Pareto

distribution, and the estimation of its parameters usually relies on maximum likelihood.

Given that our data is quite rich in the cross section (e.g., occupancy type and

country’s economic development), but also in the time dimension (e.g., global financial

crisis of 2008 and Thai floods of 2011), we estimate the models on different subsamples,

in order to be able to perform sensitivity analysis to different economic conditions, time

periods, and occupancy/location types.

We also implement models that provide a more structural understanding of the Pareto

and the Generalized Pareto distributions. In particular, we estimate Pareto and

Generalized Pareto distributions whose parameters are driven by covariates, which reflect

occupancy and location characteristics of the exposures. We adopt the tail index

regression methodology of Wang and Tsai (2009) for the Pareto model, and of Chavez-

Demoulin et al. (2015) for the Generalized Pareto distribution.

b. General Theory

Our first approach to tail risk quantification relies on the idea of approximating

the tail distribution with a Power law. The latter is a probabilistic model of the form

$$P(X > x) \propto x^{-\alpha} L(x), \quad \text{for } x > c \geq 0,$$

where $L(x)$ is a slowly varying function, i.e., one satisfying $\lim_{x \to \infty} L(tx)/L(x) = 1$ for every $t > 0$.

Both Pareto and Generalized Pareto are special cases of this more general class of

distributions, obtained for specific forms of the function 𝐿(𝑥).

Given a sample of n positive ordered losses, $x(n) > x(n-1) > \dots > x(i) > \dots > x(1)$,

independently generated by an unknown fat-tailed distribution, Hill (1975)

assumed a standard Pareto distribution for the losses over a threshold $x(n-k) = c$, i.e.,

$$P(X > x \mid X > c) = \left(\frac{x}{c}\right)^{-\alpha}.$$

The estimator proposed by Hill (1975) for the tail index 𝛼 takes the following form:

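$$\hat{\alpha} = \left[ \frac{1}{k} \sum_{i=0}^{k-1} \ln x(n-i) - \ln x(n-k) \right]^{-1} = \left[ \frac{1}{k} \sum_{i=0}^{k-1} \ln \frac{x(n-i)}{x(n-k)} \right]^{-1},$$

which is nothing else than the maximum likelihood estimator of $\alpha$. Hence, the Pareto

model sets $L(x) = c^{\alpha}$, which is clearly slowly varying as it does not depend on x. Under

the above assumptions we have that $\sqrt{k}(\hat{\alpha} - \alpha) \xrightarrow{d} N(0, \alpha^2)$, providing consistency and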

asymptotic normality of the estimator. The parameter 𝛼 measures the thickness of the tail

of the distribution. A lower 𝛼 indicates a slower decay of the tail towards zero, and hence

a higher probability of extreme events. Moreover, the tail index is in one to one

correspondence with the maximal order of finite (centered) moments of the risk

considered. For example, for 𝛼 < 1 the first moment (hence the mean) of the distribution

does not exist, for 𝛼 < 2 the second moment (hence the variance) is infinite, and so on.
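A minimal Python sketch of the Hill estimator is given below. It is run on simulated Pareto losses with a known tail index, not on our dataset; the function name `hill_estimator`, the seed, the sample size and the choice k = 500 are illustrative assumptions. The standard error uses the asymptotic normality result stated above.

```python
import math
import random

def hill_estimator(losses, k):
    """Hill estimate of the tail index alpha from the k largest losses.

    The threshold c is the (k+1)-th largest loss, i.e. x(n-k) in the text.
    """
    ordered = sorted(losses, reverse=True)      # x(n), x(n-1), ...
    if not 1 <= k < len(ordered):
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = ordered[k]                      # x(n-k) = c
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    alpha_hat = 1.0 / mean_log_excess
    # Asymptotic standard error from sqrt(k)(alpha_hat - alpha) -> N(0, alpha^2).
    std_error = alpha_hat / math.sqrt(k)
    return alpha_hat, std_error

# Simulated Pareto losses with true alpha = 1.5 and scale 140,000.
rng = random.Random(0)
sample = [140_000 * rng.paretovariate(1.5) for _ in range(5_000)]

alpha_hat, se = hill_estimator(sample, k=500)
print(f"alpha_hat = {alpha_hat:.3f} (s.e. {se:.3f})")
```

Since the simulated data are exactly Pareto, the estimate should be close to 1.5 for any k; with real losses the estimate typically moves with k, which is the threshold sensitivity discussed in the text.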

A clear drawback of the above estimator is the choice of k. If the data generating

process above the threshold is a more general Power law distribution the estimator will be

biased. Dacorogna et al. (1995) show that the bias will generally be increasing in k,

meaning that, as we lower the threshold, the Hill estimator $\hat{\alpha}$ will on average be more and

more different from the true $\alpha$. On the other hand, the variance of $\hat{\alpha}$ is decreasing with k,

meaning that increasing the threshold will reduce the bias at the expense of efficiency.

Huisman et al. (2001) state that, for a given class of parameterizations of $L(x)$, the

bias of the estimator $\hat{\gamma} = 1/\hat{\alpha}$ is linear in k, i.e., $E(\hat{\gamma}(k)) = \beta_0 + \beta_1 k$. They therefore

propose the following estimator for $1/\alpha$:

$$\frac{1}{\hat{\alpha}} = \hat{\gamma} = \sum_{k=m}^{n-1} w(k)\, \hat{\gamma}(k),$$

where $\hat{\gamma}(k) = \frac{1}{k} \sum_{i=0}^{k-1} \ln x(n-i) - \ln x(n-k)$ is the estimator of $1/\alpha$ above the threshold

$x(n-k)$, and $w(k)$ is a weighting factor. The simple idea behind the Weighted Hill

estimator is to pick a threshold $x(m)$ and construct all the possible estimators $\hat{\gamma}(k)$ for

$x(k)$ greater than $x(m)$, and then to regress $\hat{\gamma}(k)$ over k and a constant, thus estimating

the following linear model: $\hat{\gamma}(k) = \beta_0 + \beta_1 k + \varepsilon(k)$.

The coefficient $\hat{\beta}_0$ estimated with this regression is the Weighted Hill estimator

for $1/\alpha$. Since the $\hat{\gamma}(k)$ estimators are clearly correlated among themselves, OLS (ordinary

least squares) estimation of $\beta_0$ is likely to be biased. To deal with this problem Huisman

et al. (2001) suggest using WLS (weighted least squares) regression with weighting

matrix $W = \mathrm{diag}(1, \sqrt{2}, \dots, \sqrt{m})$. So, by defining $\hat{\gamma}$ as the $m \times 1$ ordered vector of $\hat{\gamma}(k)$,

and $Z$ as the $m \times 2$ matrix containing a vector of ones in the first column, and the vector

$(1, 2, \dots, m)'$ in the second one, the Weighted Hill estimator is the coefficient $\hat{\beta}_0$ from

$\hat{\beta}_{wls} = (Z'W'WZ)^{-1}(Z'W'W\hat{\gamma})$. An analytical derivation of the standard errors for $\hat{\beta}_0$ can

be found in the appendix of Huisman et al. (2001).
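The WLS step can be sketched in a few lines of Python. Because Z has only two columns, the normal equations are solved in closed form without a linear algebra library. The function name `weighted_hill`, the choice m = 1,000 and the simulated data are illustrative assumptions; the range k = 1, ..., m follows the definition of W and Z above.

```python
import math
import random

def weighted_hill(losses, m):
    """Weighted Hill estimate of gamma = 1/alpha using k = 1, ..., m.

    Regresses gamma_hat(k) on a constant and k by weighted least squares
    with W = diag(1, sqrt(2), ..., sqrt(m)); the intercept is the estimate.
    """
    logs = sorted((math.log(x) for x in losses), reverse=True)
    if not 2 <= m < len(logs):
        raise ValueError("m must satisfy 2 <= m < n")
    # gamma_hat(k) = mean of the k largest log-losses minus log x(n-k).
    gammas, running_sum = [], 0.0
    for k in range(1, m + 1):
        running_sum += logs[k - 1]
        gammas.append(running_sum / k - logs[k])
    # W'W = diag(1, 2, ..., m), so observation k carries weight k.
    s_w = s_wk = s_wkk = s_wg = s_wkg = 0.0
    for k, g in enumerate(gammas, start=1):
        s_w += k
        s_wk += k * k
        s_wkk += k * k * k
        s_wg += k * g
        s_wkg += k * k * g
    det = s_w * s_wkk - s_wk ** 2
    beta0 = (s_wkk * s_wg - s_wk * s_wkg) / det   # intercept: estimate of 1/alpha
    beta1 = (s_w * s_wkg - s_wk * s_wg) / det     # slope: linear bias term
    return beta0, beta1

# Simulated Pareto losses with true alpha = 1.5, i.e. gamma = 2/3.
rng = random.Random(1)
sample = [rng.paretovariate(1.5) for _ in range(2_000)]

gamma_hat, slope = weighted_hill(sample, m=1_000)
print(f"gamma_hat = {gamma_hat:.3f}, implied alpha = {1 / gamma_hat:.3f}")
```

For exactly Pareto data the slope should be close to zero, since the Hill estimates carry no threshold-dependent bias in that case.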

The general theory behind the POT approach is based on a result due to Pickands

(1975), which applies to a series of i.i.d. losses, and states that for a large class of

distributions the conditional distribution of the excess losses $Y = X - c$ above the given

threshold $c$ will follow a Generalized Pareto distribution as the threshold grows larger:

$$\frac{P(X - c > y)}{P(X > c)} = P(X - c > y \mid X > c) = P_c(Y > y) = \begin{cases} \left(1 + \frac{\xi}{\beta} y\right)^{-1/\xi}, & \xi \neq 0, \\ e^{-y/\beta}, & \xi = 0. \end{cases}$$

This distribution reduces to the Exponential distribution as 𝜉 goes to zero, is said to be

heavy tailed for 𝜉 > 0, and to have a bounded tail (finite upper endpoint) for 𝜉 < 0. The coefficient 𝜉 can be seen as the

inverse of the parameter 𝛼 of the standard Pareto, which means that for 𝜉 > 1 the first

moment (hence the mean) of the distribution does not exist, for 𝜉 > 1/2 the second

moment (hence the variance) is infinite, and so on. The estimation of 𝜉 and 𝛽 usually

relies on maximum likelihood estimation. If we take again the series of ordered

realizations $x(n) > \dots > x(i) > \dots > x(1)$, and set $x(n-k) = c$, we can define the excess

ordered losses over c as $y(k) = x(n) - c > \dots > y(k-i) = x(n-i) - c > \dots > y(0) = x(n-k) - c = 0$,

and write the log-likelihood as follows:

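$$l(\xi, \beta \mid y) = \begin{cases} \sum_{i=1}^{k} \left[ -\log(\beta) - \left(1 + \frac{1}{\xi}\right) \log\left(1 + \frac{\xi}{\beta} y(i)\right) \right], & \xi \neq 0, \\ \sum_{i=1}^{k} \left[ -\log(\beta) - \frac{y(i)}{\beta} \right], & \xi = 0. \end{cases}$$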
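The Generalized Pareto log-likelihood can be written as a short Python function. The sketch below only evaluates the log-likelihood; in practice it would be maximized over the two parameters with a numerical optimizer. The function name `gpd_log_likelihood` and the excess losses are hypothetical and serve only as an illustration.

```python
import math

def gpd_log_likelihood(xi, beta, excesses):
    """Log-likelihood of excess losses y(i) under a Generalized Pareto model."""
    if beta <= 0:
        return -math.inf
    if abs(xi) < 1e-12:                          # exponential limit, xi = 0
        return sum(-math.log(beta) - y / beta for y in excesses)
    total = 0.0
    for y in excesses:
        scaled = xi * y / beta
        if 1.0 + scaled <= 0:                    # y lies outside the support
            return -math.inf
        total += -math.log(beta) - (1.0 + 1.0 / xi) * math.log1p(scaled)
    return total

# Hypothetical excess losses over a threshold (not taken from our data).
excesses = [0.5, 1.2, 3.4, 0.1, 7.9]

ll_heavy = gpd_log_likelihood(0.3, 2.0, excesses)
ll_near_zero = gpd_log_likelihood(1e-8, 2.0, excesses)
ll_exponential = gpd_log_likelihood(0.0, 2.0, excesses)
print(ll_heavy, ll_near_zero, ll_exponential)
```

The last two values should almost coincide, reflecting the fact that the distribution reduces to the Exponential as the shape parameter goes to zero.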

To understand how tail risk is shaped by economic development and different

occupancy types, we also implement Pareto and Generalized Pareto regression with

covariates following two works: Wang and Tsai (2009) and Chavez-Demoulin et al.

(2015).

Wang and Tsai assume that the probability of losses over the threshold can be

approximated by

$$P(X > x \mid Z = z, X > c) = x^{-\alpha(z)} L(x),$$

where Z is a vector of covariates and $\alpha(z) = \exp(z'\beta)$. In their work they approximate

the above probability as $P(X > x \mid Z = z, X > c) = (x/c)^{-\alpha(z)}$, and apply the following

approximate negative log-likelihood estimator to derive the coefficients $\beta$:

$$\hat{\beta} := \arg\min_{\beta} \sum_{i=1}^{n} \left[ \exp(z_i'\beta) \ln\frac{x_i}{c} - z_i'\beta \right] I(x_i > c).$$
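As a consistency check on this objective, consider the special case (our own illustration) in which the only covariate is a constant, so that $z_i = 1$ and $\beta$ is a scalar. Writing $k$ for the number of losses with $x_i > c$, the first-order condition shows that the estimator collapses to the Hill estimator:

```latex
\begin{aligned}
\frac{\partial}{\partial \beta} \sum_{i=1}^{n} \left[ e^{\beta} \ln\frac{x_i}{c} - \beta \right] I(x_i > c)
  &= e^{\beta} \sum_{i:\, x_i > c} \ln\frac{x_i}{c} - k = 0 \\
\Longrightarrow \quad \hat{\alpha} = e^{\hat{\beta}}
  &= \left[ \frac{1}{k} \sum_{i:\, x_i > c} \ln\frac{x_i}{c} \right]^{-1}.
\end{aligned}
```

The objective is convex in $\beta$ in this case, so the stationary point is the minimizer. Adding dummy covariates then lets the tail index differ across groups.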

Chavez-Demoulin et al. (2015) work with the following parameterization for the

Generalized Pareto distribution:

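$$P(X - c > x \mid Z = z, X > c) = \begin{cases} \left(1 + \frac{\xi(z)(1 + \xi(z))}{e^{\nu(z)}} x\right)^{-1/\xi(z)}, & \xi(z) \neq 0, \\ \exp\left(-x e^{-\nu(z)}\right), & \xi(z) = 0, \end{cases}$$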

where 𝜈 = ln ((1 + 𝜉)𝛽) is orthogonal to 𝜉. They assume a non-parametric relation in Z,

and a spline function in time for both 𝜉 and 𝜈. They then apply maximum likelihood

estimation to obtain the estimates of the regression coefficients. In our implementation

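we assume the linear relations $\xi(z) = z'\theta_{\xi}$ and $\nu(z) = z'\theta_{\nu}$, in order to make it easier to

compare the results with those obtained with the approach of Wang and Tsai (2009). The

log-likelihood therefore takes the following form:

$$l(\theta_{\xi}, \theta_{\nu} \mid y) = \begin{cases} \sum_{i=1}^{k} \left[ \log(1 + \xi(z_i)) - \nu(z_i) - \left(1 + \frac{1}{\xi(z_i)}\right) \log\left(1 + \frac{\xi(z_i)(1 + \xi(z_i))}{e^{\nu(z_i)}} y_i\right) \right], & \xi(z_i) \neq 0, \\ \sum_{i=1}^{k} \left[ -\nu(z_i) - y_i e^{-\nu(z_i)} \right], & \xi(z_i) = 0. \end{cases}$$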

c. Application to our dataset

We begin our analysis by ordering our dataset, as most of our estimators require a

series of ordered losses. As discussed before, we try different thresholds to understand

tail risk as we consider higher quantiles. We estimate the model on the entire dataset, and

then reduce the sample size by 5% at a time, working on the 5th, 10th,…, 90th and 95th

percentiles.
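The threshold sweep described above can be sketched as follows. The code assumes NumPy and uses the standard Hill estimator, k divided by the sum of log-ratios to the threshold, which is assumed to match the definition given earlier in the report; the function names and the use of `np.quantile` to place each threshold are illustrative choices, so the observation counts need not coincide exactly with those reported in the tables.

```python
import numpy as np


def hill_alpha(losses, c):
    """Hill estimate of the tail index from the losses strictly above c."""
    tail = losses[losses > c]
    return tail.size / np.sum(np.log(tail / c))


def threshold_sweep(losses, shares=range(5, 101, 5)):
    """Return (share, threshold, n_exceedances, alpha_hat) for each share of the sample."""
    losses = np.sort(np.asarray(losses, dtype=float))
    rows = []
    for share in shares:
        # share = 20 places the threshold at the 80th percentile;
        # share = 100 places it at the sample minimum.
        c = np.quantile(losses, 1.0 - share / 100.0)
        k = int(np.sum(losses > c))
        rows.append((share, c, k, hill_alpha(losses, c)))
    return rows
```

Each row of the output corresponds to one point on the x-axis of the parameter plots: the share of observations retained, the implied threshold, the number of exceedances and the resulting estimate.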

As all the estimators require the losses to be independent and identically distributed, we carry out univariate analyses on different subsamples, focusing on CO or MA occupancy types (the sample size of EON losses being too small), on pre- vs. post-financial crisis time periods, and on developed vs. developing countries. We also consider combinations of occupancy types and time periods. The results for pre- and post-2011 Thai floods are not reported because of poor statistical significance, possibly due to the small post-2011 sample size. We then apply the methodologies to slightly more granular occupancy type classifications (such as general industry, processing and commercial exposures vs. metals, mines, chemicals and energy), obtaining similar results.

The regressions with covariates are instead carried out only on the entire sample,

and for thresholds up to the 85th percentile, by assuming that the losses are conditionally

i.i.d., given the regressors. The independent variables we use are dummy variables for

occupancy type, economic development, time period, and interaction between period and

economic development. As a robustness check, we also run the regression including the

logarithm of TIV*, in order to control for size effects. This is in line with widely used

pricing approaches, which rely on using different rating factors that usually depend on

TIV bands.
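As an illustration of the regressors just described, the sketch below stacks an intercept, the dummy variables, the period-by-development interaction and, optionally, the logarithm of TIV into a design matrix. It assumes NumPy; the function name `build_design`, the argument names and the use of a single manufacturing dummy for occupancy type are hypothetical simplifications, not the coding used in this report.

```python
import numpy as np


def build_design(is_manufacturing, is_developed, is_post_crisis, tiv=None):
    """Stack an intercept, dummy regressors and optionally log(TIV) into Z."""
    ma = np.asarray(is_manufacturing, dtype=float)    # occupancy-type dummy
    dev = np.asarray(is_developed, dtype=float)       # economic development dummy
    post = np.asarray(is_post_crisis, dtype=float)    # time-period dummy
    cols = [np.ones_like(ma), ma, dev, post, post * dev]   # last term: interaction
    if tiv is not None:
        cols.append(np.log(np.asarray(tiv, dtype=float)))  # size control
    return np.column_stack(cols)
```

The resulting matrix can be passed as the covariate matrix to either tail regression; including `tiv` corresponds to the robustness check with the size control.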

7. Results

In this section we provide our results. In part (a) we report results of the curve fitting

exercise on the entire sample and different sub-samples. In part (b) we show how

estimates vary depending on the covariates described above (economic development, pre-

vs. post-crisis, occupancy type). In part (c) we plot the most robust CDFs from the

models run in part (b), and finally in part (d) we provide a discussion of the tail

regression results as well as robustness checks.

a. Curve Fitting Exercise

Figures 1 to 5 show parameter estimates obtained with the Hill, Weighted Hill and POT (GPD) methods, compared across different subsamples and different threshold levels. The x-axis represents different threshold levels, measured as the percentage of observations above the threshold; hence 20 means that the threshold has been set at the 80th percentile, 25 that it has been set at the 75th percentile, and so on until 100, which implies we are using the whole sample. On the y-axis we report the estimates of our parameters. To ensure results are comparable, the parameters reported are $\hat{\alpha}$ for the Hill estimator, $1/\hat{\gamma}$ for the Weighted Hill estimator, and $1/\hat{\xi}$ for the POT method. For these figures we only show thresholds below the 80th percentile.

Tables 10 to 14 show t-statistics of the estimated coefficients and KS-test p-values for goodness of fit, obtained with the Hill, Weighted Hill and POT (GPD) methods and compared across different subsamples and different threshold levels. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator, and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one.
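The two diagnostics can be sketched in a few lines, assuming NumPy and SciPy. The helper names are hypothetical. The KS p-value below treats the fitted parameters as known, so it is only approximate when they are estimated from the same data. The t-statistic helper assumes the usual asymptotic standard error of the Hill estimate, α̂/√k, under which the statistic reduces to √k; this is an assumption about how the statistics were computed, although it is consistent with the first row of Table 10, panel A (23 observations, t-statistic 4.8).

```python
import numpy as np
from scipy.stats import genpareto, kstest


def ks_pvalue_gpd(excesses, xi, beta):
    """KS p-value for H0: the excesses follow a GPD with shape xi and scale beta."""
    return kstest(excesses, genpareto(c=xi, scale=beta).cdf).pvalue


def hill_t_stat(k):
    """t-statistic of a Hill estimate under the asymptotic s.e. alpha_hat / sqrt(k)."""
    return np.sqrt(k)
```

A large p-value means the fitted distribution is not rejected, which is the sense in which the text speaks of an estimator fitting the data well.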

Figure 1 shows results for the whole period. Panel A reports estimates for the whole sample, while panels B and C focus on developed and developing countries, respectively. The GPD parameter tends to be higher than those of the other methods for any given threshold level; hence the GPD tends to imply a lighter tail. In panel A the GPD parameter is above two for higher thresholds and decreases as we lower the threshold, becoming less than two after the 65th percentile (more than 35% of observations) and less than one after the 35th percentile (more than 65% of observations). It is worth remembering that a $1/\hat{\xi}$ parameter below two implies infinite variance, and a value below one implies infinite mean. The Hill estimator is close to one already at the 80th percentile and falls below one right after, implying extremely heavy tails. Both the GPD and the Hill estimators are greatly influenced by the choice of the threshold, and both decrease as we lower the threshold. The Weighted Hill estimator instead proves to be more stable, and does not change significantly in panel A across thresholds. The Weighted Hill estimator also gives the lowest value of the parameter, at around 0.3.

Looking at panel A of table 10, we notice that after the 80th percentile all the estimated parameters are significant, except for the GPD estimator at the higher percentiles. It is important to note that tests for GPD parameters are based on ξ: when this coefficient is not statistically different from zero, $\hat{\alpha}$ (its inverse) approaches infinity, and hence the Generalized Pareto tends to the Exponential distribution, which is a light-tailed probability model. In terms of goodness of fit, the GPD is the most reliable when more than 25% of observations are used. The Weighted Hill estimator seems to fit the data well only when almost all the sample is used, while the Hill estimator works best for higher percentiles.

In panel B (developed group) the GPD parameter is still the highest; it starts at four and decreases non-monotonically to a lower value of 1.7 when we use the full sample to perform the estimation. The Hill parameter starts around 1.5 for the developed group, but then falls to low values, as in panel A, when 100% of the observations are used. The Weighted Hill estimator is still the one which generates the lowest parameter values, but it increases slightly as we lower the threshold, overtaking the Hill estimates when more than 85% of observations are used (threshold below the 15th percentile). Looking at panel B of table 10, we can see that the GPD parameter is again not very significant for high thresholds, meaning that the model is not heavy tailed but an Exponential distribution (hence all the moments exist). The KS p-value for the Weighted Hill is always close to zero, indicating that this model fits the data poorly at any given threshold. The Hill estimator performs well for thresholds above the 35th percentile (less than 65% of the sample), while the GPD seems to represent the data at any level.

For the developing group (panel C) the GPD starts below two this time, and decreases faster than in the other groups as we increase the threshold, getting closer to the parameters estimated with the other methods. Both the Hill and Weighted Hill estimators are below one and fairly stable, with the Weighted Hill being the lowest. In panel C of table 10 we can see that all parameters are significant below the 80th percentile, with the GPD becoming Exponential when less than 20% of observations are used. Moreover, Hill and GPD fit the data well at any given threshold, while Weighted Hill works well below the 70th percentile.
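The moment thresholds used in this discussion (values below two implying infinite variance, values below one implying infinite mean) can be checked with a short calculation. As a sketch, assume an exact Pareto tail with index $\alpha$ above the threshold $c$; for the GPD, $1/\xi$ plays the same role in the tail.

```latex
\[
E\left[X^{m} \mid X > c\right]
  = \int_{c}^{\infty} x^{m}\, \alpha\, c^{\alpha}\, x^{-\alpha-1}\, dx
  = \begin{cases}
      \dfrac{\alpha\, c^{m}}{\alpha - m}, & m < \alpha \\[2ex]
      \infty, & m \geq \alpha
    \end{cases}
\]
```

Setting $m = 1$ shows that the mean is finite only when $\alpha > 1$, and setting $m = 2$ shows that the second moment, and hence the variance, is finite only when $\alpha > 2$.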

Figure 2 shows results for the period before the crisis, and table 11 reports the associated t-statistics and KS tests. Panel A reports estimates for the whole sample, while panels B and C focus on developed and developing countries, respectively. The GPD parameters are always the highest in all panels, tending to be higher than one or two (meaning finite mean and variance, respectively) for higher thresholds, and decrease as we lower the threshold. The GPD parameter is negative for high thresholds in the developing group, meaning that the support of the distribution has an upper bound (as for a Beta distribution). The GPD seems to fit the data well in terms of KS, and the ξ parameter is not statistically different from 0 for the highest quantiles (especially when the parameter is negative), meaning as before that $\hat{\alpha}$ is close to infinity and hence the GPD approaches the Exponential distribution instead of the Beta. Hill estimates are also decreasing in threshold levels; they are lower than the GPD's and generally higher than Weighted Hill's, except for the lowest percentiles of the developed group (when more than 85% of the sample is used). They are always below two (infinite variance), and tend to go below one quite quickly. Weighted Hill estimates are the lowest; they are the most stable and significant, and always below 0.5. However, this approach does not generally seem to fit the data well in terms of the KS test, except for the developing group when more than half of the sample is used.

Figure 3 shows results for the period after the crisis,4 whereas table 12 provides the associated t-statistics and KS tests. Panel A reports estimates for the whole sample, while panels B and C focus on developed and developing countries, respectively. The GPD parameter is always the highest in all panels, except when it is negative. In panel A, the GPD parameter tends to be higher than one or two (meaning finite mean and variance, respectively) for higher thresholds, and decreases as we lower the threshold. In the developed group, the GPD parameter is negative for high thresholds, and tends to be above two, or even higher, for all other thresholds. For the developing group the GPD parameter is lower and gives results similar to the other estimators when a large part of the sample is used. The GPD seems to fit the data well in terms of KS, and the ξ parameter is never statistically different from 0 in panel B (developed group), where it is the highest. Hill estimates are also decreasing in threshold levels; they are lower than the GPD's and higher than Weighted Hill's, but go below the Weighted Hill parameter for very low thresholds in the developed group only. They are still always below two (infinite variance), and generally below one (infinite mean) as well. They also do very well in terms of goodness of fit as measured with the KS test. Weighted Hill parameters are the lowest; they are as usual the most stable and significant, and always below 0.6. In this subsample this estimator also seems to perform well in terms of the KS test, at least when a large portion of the sample is used.

4 Due to lack of data above high thresholds, the subsample of developed countries after the crisis presents a spike in the GPD estimator. The same applies to figure 4.

Figure 4 shows results for the commercial subsample during the whole period, and table 13 reports the associated t-statistics and KS tests. Panel A reports estimates for the whole sample, while panels B and C focus on developed and developing countries, respectively. The GPD parameter is always the highest in all panels, tends to be higher than one or two (meaning finite mean and variance, respectively) for higher thresholds, and decreases as we lower the threshold. The GPD parameter is negative for thresholds above the median in the developed group, and when less than 25% of the sample is used in panel A. The GPD seems to fit the data well in terms of KS, and the ξ parameter is not statistically different from 0 in the developed group at any given level, and when less than half of the sample is used for the other two groups. Hill estimates are also decreasing in threshold levels; they are lower than the GPD's and higher than Weighted Hill's, except for lower percentiles of the developed group. They are generally below one (infinite mean), and rarely above two. They also do very well in terms of goodness of fit as measured with the KS test. Weighted Hill parameters are the lowest, are as usual the most stable and significant, and always lie below 0.5. In these subsamples the estimator also seems to perform well in terms of the KS test, especially for lower thresholds, and at any given level for the developing group.

Figure 5 shows results for the manufacturing subsample during the whole period, and table 14 reports the associated t-statistics and KS tests. Panel A reports estimates for the whole sample, while panels B and C focus on developed and developing countries, respectively. As before, the GPD parameter is the highest in all panels, tends to be higher than one or two (meaning finite mean and variance, respectively) for higher thresholds, and decreases as we lower the threshold. The GPD seems to fit the data well in terms of KS, and the ξ parameter is not statistically different from 0 for high thresholds. Hill estimates are also decreasing in threshold levels, are lower than the GPD's and higher than Weighted Hill's, except for low thresholds in the developed group. They are generally below one (infinite mean), and never above two. They also do very well in terms of goodness of fit as measured with the KS test. Weighted Hill estimates are generally the lowest, are as usual the most stable and significant, and always lie below 0.5. In these subsamples the estimator seems to perform well in terms of the KS test, especially in the developing group.

Figure 1: The figure shows Parameter Estimates for FGU (losses from the ground up) for the whole dataset from 2000-2013, obtained with the three methodologies. The parameters reported on the y-axes are α for the Hill estimator, 1/γ for the Weighted Hill estimator and 1/ξ for the Generalized Pareto. The x-axes correspond to different threshold levels expressed as percentage of the total sample. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

[Figure 1, Panel A: Parameter Estimates for FGU from 2000-2013 for Total. Series plotted: Hill, Weighted Hill, GPD.]

[Figure 1, Panel B: Parameter Estimates for FGU from 2000-2013 for Developed.]

[Figure 1, Panel C: Parameter Estimates for FGU from 2000-2013 for Developing.]

Figure 2: The figure shows Parameter Estimates for FGU (losses from the ground up) for the whole dataset from 2000-2008 (before crisis period), obtained with the three methodologies. The parameters reported on the y-axes are α for the Hill estimator, 1/γ for the Weighted Hill estimator and 1/ξ for the Generalized Pareto. The x-axes correspond to different threshold levels expressed as percentage of the total sample. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

[Figure 2, Panels A, B and C: plots omitted.]

Figure 3: The figure shows Parameter Estimates for FGU (losses from the ground up) for the whole dataset from 2009-2013 (after crisis period), obtained with the three methodologies. The parameters reported on the y-axes are α for the Hill estimator, 1/γ for the Weighted Hill estimator and 1/ξ for the Generalized Pareto. The x-axes correspond to different threshold levels expressed as percentage of the total sample. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

[Figure 3, Panels A, B and C: plots omitted.]

Figure 4: The figure shows Parameter Estimates for FGU (losses from the ground up) for Commercial occupancy only from 2000-2013, obtained with the three methodologies. The parameters reported on the y-axes are α for the Hill estimator, 1/γ for the Weighted Hill estimator and 1/ξ for the Generalized Pareto. The x-axes correspond to different threshold levels expressed as percentage of the total sample. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

[Figure 4, Panels A, B and C: plots omitted.]

Figure 5: The figure shows Parameter Estimates for FGU (losses from the ground up) for Manufacturing occupancy only from 2000-2013, obtained with the three methodologies. The parameters reported on the y-axes are α for the Hill estimator, 1/γ for the Weighted Hill estimator and 1/ξ for the Generalized Pareto. The x-axes correspond to different threshold levels expressed as percentage of the total sample. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

[Figure 5, Panels A, B and C: plots omitted.]

Table 10 (panel A): The table refers to figure 1 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole dataset from 2000-2013, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel A

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5    23   83,732,689      4.8      1.000      9.1      0.000      1.4      1.000
  10    46   40,821,893      6.8      0.254      9.0      0.000      0.9      0.909
  15    70   26,710,076      8.4      0.667      9.0      0.000      1.6      0.942
  20    93   18,205,761      9.6      0.930      9.1      0.000      2.1      0.978
  25   116   13,097,962     10.8      0.243      9.4      0.000      2.7      0.985
  30   139    8,895,495     11.8      0.049      9.7      0.000      3.1      0.977
  35   162    6,980,850     12.7      0.050     10.2      0.000      3.8      0.974
  40   186    4,581,297     13.6      0.006     10.8      0.000      4.2      0.767
  45   209    2,789,389     14.5      0.000     11.7      0.000      4.6      0.901
  50   232    1,949,649     15.2      0.000     12.8      0.000      5.2      0.992
  55   255    1,244,590     16.0      0.000     14.2      0.000      5.8      0.510
  60   278      877,810     16.7      0.000     16.1      0.000      6.5      0.224
  65   302      679,526     17.4      0.000     18.4      0.000      7.2      0.065
  70   325      493,934     18.0      0.000     20.7      0.000      7.9      0.053
  75   348      402,139     18.7      0.000     23.4      0.000      8.8      0.006
  80   371      317,679     19.3      0.000     27.2      0.001      9.7      0.001
  85   394      261,144     19.8      0.000     32.0      0.013     10.7      0.000
  90   418      222,505     20.4      0.000     38.5      0.057     11.7      0.000
  95   441      186,095     21.0      0.000     46.9      0.216     12.6      0.000
 100   463      140,503     21.5      0.000        -      0.175     13.3      0.000

Table 10 (panel B): The table refers to figure 1 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole dataset from 2000-2013, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel B

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5     8  114,963,189      2.8      0.996      4.6      0.000        -      0.976
  10    16   85,071,530      4.0      1.000      5.8      0.000      1.0      1.000
  15    24   53,663,923      4.9      1.000      6.4      0.000      0.6      1.000
  20    32   42,504,805      5.7      0.699      6.9      0.000      1.0      0.999
  25    40   34,863,759      6.3      1.000      8.0      0.000      1.4      1.000
  30    48   27,776,326      6.9      0.968      8.8      0.000      1.6      1.000
  35    56   21,965,627      7.5      0.969      9.9      0.000      1.7      1.000
  40    64   18,175,338      8.0      0.924     11.0      0.000      1.9      1.000
  45    72   14,222,968      8.5      0.599     11.9      0.000      2.0      0.998
  50    80   11,985,530      8.9      0.401     13.1      0.000      2.3      1.000
  55    88   10,038,844      9.4      0.120     14.8      0.000      2.6      1.000
  60    96    8,814,046      9.8      0.095     17.4      0.000      2.9      1.000
  65   104    7,403,872     10.2      0.202     20.7      0.000      3.1      0.959
  70   112    5,755,581     10.6      0.033     24.8      0.000      3.2      1.000
  75   120    4,245,094     11.0      0.005     30.3      0.000      3.4      1.000
  80   128    2,495,716     11.3      0.000     38.0      0.000      3.4      0.979
  85   136    1,772,484     11.7      0.000     47.3      0.000      3.6      0.981
  90   144    1,056,576     12.0      0.000     58.8      0.000      3.8      0.988
  95   152      471,164     12.3      0.000     75.5      0.000      4.0      0.864
 100   159      157,431     12.6      0.000        -      0.000      4.3      0.665

Table 10 (panel C): The table refers to figure 1 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole dataset from 2000-2013, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel C

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5    15   34,516,023      3.9      0.405      8.1      0.003      0.9      0.996
  10    30   17,062,955      5.5      0.971      8.5      0.001      0.4      0.936
  15    46    9,112,013      6.8      0.798      8.8      0.002      1.4      0.996
  20    61    6,206,870      7.8      0.548      9.1      0.005      2.3      0.996
  25    76    3,484,732      8.7      0.161      9.3      0.033      2.8      0.998
  30    91    2,372,768      9.5      0.226      9.5      0.074      3.5      1.000
  35   106    1,554,141     10.3      0.070      9.8      0.168      4.1      0.939
  40   122    1,110,381     11.0      0.157     10.1      0.380      4.8      0.808
  45   137      798,256     11.7      0.149     10.5      0.269      5.4      0.796
  50   152      683,688     12.3      0.176     11.0      0.516      6.1      0.468
  55   167      549,386     12.9      0.215     11.3      0.440      6.8      0.380
  60   182      452,753     13.5      0.252     11.9      0.460      7.4      0.369
  65   198      391,471     14.1      0.289     12.8      0.725      8.0      0.279
  70   213      331,811     14.6      0.337     14.4      0.886      8.5      0.332
  75   228      284,471     15.1      0.411     16.1      0.909      9.0      0.341
  80   243      257,812     15.6      0.411     18.6      0.578      9.5      0.382
  85   258      225,531     16.1      0.468     21.9      0.486      9.9      0.403
  90   274      196,036     16.6      0.522     25.9      0.098     10.3      0.430
  95   289      167,590     17.0      0.621     30.7      0.099     10.7      0.464
 100   303      140,503     17.4      0.735        -      0.076     11.0      0.511

Table 11 (panel A): The table refers to figure 2 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) before the crisis from 2000-2008, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel A

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5    12   93,692,959      3.5      1.000      5.5      0.000      0.6      1.000
  10    24   67,080,330      4.9      1.000      5.6      0.000      1.1      1.000
  15    36   44,091,320      6.0      0.956      5.7      0.000      1.0      1.000
  20    48   32,990,236      6.9      0.743      6.0      0.000      1.3      1.000
  25    60   22,984,435      7.7      0.502      6.4      0.000      1.3      1.000
  30    72   14,803,561      8.5      0.046      7.0      0.000      1.3      1.000
  35    84   11,943,807      9.2      0.163      7.7      0.000      1.8      0.999
  40    96    8,932,315      9.8      0.043      8.7      0.000      2.2      0.997
  45   108    8,091,849     10.4      0.132      9.9      0.000      2.8      0.918
  50   120    5,952,546     11.0      0.093     11.4      0.000      3.1      0.873
  55   133    3,847,483     11.5      0.001     13.2      0.000      3.4      0.664
  60   145    2,440,832     12.0      0.000     15.1      0.000      3.7      0.724
  65   157    1,515,903     12.5      0.000     17.4      0.000      4.1      0.778
  70   169    1,063,963     13.0      0.000     20.4      0.000      4.6      0.747
  75   181      663,406     13.5      0.000     24.7      0.001      5.0      0.568
  80   193      465,437     13.9      0.000     30.0      0.000      5.5      0.195
  85   205      304,249     14.3      0.000     36.1      0.000      6.0      0.113
  90   217      233,428     14.7      0.000     43.5      0.000      6.5      0.040
  95   229      189,171     15.1      0.000     53.1      0.000      7.1      0.003
 100   240      140,503     15.5      0.000        -      0.000      7.8      0.001

Table 11 (panel B): The table refers to figure 2 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) before the crisis from 2000-2008, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel B



Table 11 (panel C): The table refers to figure 2 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) before the crisis from 2000-2008, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel C

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5     6   79,116,288      2.4      1.000      3.7      0.002        -      1.000
  10    12   37,538,875      3.5      0.938      4.1      0.000        -      0.929
  15    18   23,095,059      4.2      0.718      4.4      0.000        -      0.454
  20    24   14,056,001      4.9      0.403      4.4      0.000      1.6      0.913
  25    30    9,040,856      5.5      0.541      4.5      0.001      0.6      0.950
  30    36    7,115,472      6.0      0.530      4.5      0.001      0.7      0.780
  35    42    4,154,373      6.5      0.389      4.6      0.005      1.0      0.953
  40    48    2,699,224      6.9      0.208      5.0      0.009      1.6      0.960
  45    54    1,397,806      7.3      0.027      5.1      0.044      2.1      0.790
  50    59      984,065      7.7      0.015      5.3      0.080      2.6      0.765
  55    65      694,228      8.1      0.020      5.8      0.084      3.1      0.697
  60    71      583,733      8.4      0.052      6.8      0.486      3.6      0.716
  65    77      438,720      8.8      0.060      7.6      0.592      4.1      0.379
  70    83      320,464      9.1      0.056      8.8      0.654      4.7      0.222
  75    89      279,760      9.4      0.079      9.9      0.712      5.3      0.137
  80    95      235,992      9.7      0.095     11.3      0.576      5.8      0.111
  85   101      220,720     10.0      0.103     13.1      0.468      6.2      0.101
  90   107      190,905     10.3      0.119     15.7      0.276      6.7      0.114
  95   113      152,487     10.6      0.149     18.9      0.295      7.1      0.140
 100   118      140,503     10.9      0.172        -      0.147      7.3      0.149

Table 12 (panel A): The table refers to figure 3 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) after the crisis from 2009-2013, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel A

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5    11   29,539,298      3.3      0.958      6.7      0.022      0.5      0.972
  10    22   18,724,315      4.7      0.999      7.0      0.001      1.6      0.995
  15    33   14,512,646      5.7      1.000      7.3      0.000      2.2      0.998
  20    45    8,065,183      6.7      0.164      7.5      0.000      2.0      0.963
  25    56    4,947,709      7.5      0.115      7.8      0.002      2.2      0.985
  30    67    3,158,684      8.2      0.089      8.0      0.007      2.4      0.943
  35    78    2,018,541      8.8      0.057      8.2      0.031      2.8      0.849
  40    89    1,458,411      9.4      0.102      8.4      0.084      3.2      0.615
  45   100    1,006,393     10.0      0.053      8.8      0.181      3.6      0.362
  50   111      772,154     10.5      0.086      9.3      0.230      4.2      0.305
  55   123      670,090     11.1      0.098      9.6      0.214      4.9      0.230
  60   134      508,367     11.6      0.078     10.4      0.266      5.6      0.289
  65   145      414,724     12.0      0.093     11.6      0.226      6.2      0.286
  70   156      380,188     12.5      0.121     12.9      0.137      6.8      0.217
  75   167      328,319     12.9      0.126     15.1      0.090      7.3      0.205
  80   178      275,764     13.3      0.173     17.5      0.112      7.8      0.249
  85   190      252,813     13.8      0.201     20.8      0.048      8.2      0.223
  90   201      217,261     14.2      0.180     24.2      0.046      8.6      0.184
  95   212      176,808     14.6      0.255     27.9      0.145      9.0      0.230
 100   222      142,156     14.9      0.374        -      0.512      9.3      0.294

Table 12 (panel B): The table refers to figure 3 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) after the crisis from 2009-2013, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel B

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5     2  103,522,423      1.4      1.000      0.9      0.116      1.2      1.000
  10     4   53,773,712      2.0      1.000      1.8      0.012        -      1.000
  15     6   40,990,527      2.4      0.979      3.5      0.001        -      1.000
  20     8   35,240,589      2.8      1.000      4.8      0.000      0.1      1.000
  25     9   27,684,864      3.0      1.000      4.7      0.000      0.6      0.996
  30    11   23,128,230      3.3      0.999      4.2      0.000      0.1      0.988
  35    13   21,430,280      3.6      1.000      4.1      0.000      0.6      0.999
  40    15   18,144,914      3.9      1.000      4.2      0.001      0.7      0.998
  45    17   17,596,587      4.1      0.999      4.6      0.000      1.1      1.000
  50    19   15,681,099      4.4      1.000      5.0      0.001      1.2      0.999
  55    21   14,352,963      4.6      1.000      5.7      0.001      1.3      0.999
  60    23   12,533,840      4.8      1.000      6.5      0.002      1.3      0.999
  65    25    9,878,412      5.0      1.000      7.8      0.009      1.1      1.000
  70    27    7,091,877      5.2      0.186      9.6      0.053      0.9      1.000
  75    28    6,356,987      5.3      0.086     10.8      0.102      0.9      0.987
  80    30    4,636,813      5.5      0.049     15.2      0.305      0.9      0.986
  85    32    4,223,900      5.7      0.058     20.7      0.321      1.1      0.997
  90    34    1,213,985      5.8      0.003     27.1      0.000      0.7      0.965
  95    36      249,948      6.0      0.000     68.3      0.000      0.8      0.958
 100    37      205,392      6.1      0.000        -      0.000      1.0      0.968

Table 12 (panel C): The table refers to figure 3 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) after the crisis from 2009-2013, obtained with the three methodologies on different thresholds. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel C

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5     9   17,063,442      3.0      0.937      6.6      0.006      1.4      0.965
  10    18    8,170,898      4.2      0.634      7.2      0.005      1.5      0.926
  15    28    4,564,751      5.3      0.718      7.8      0.013      1.8      0.977
  20    37    2,598,260      6.1      0.768      8.3      0.060      2.0      0.970
  25    46    1,975,098      6.8      0.878      8.7      0.112      2.5      0.914
  30    55    1,431,149      7.4      0.856      9.3      0.196      2.9      0.936
  35    65    1,069,840      8.1      0.855      9.7      0.297      3.4      0.917
  40    74      829,973      8.6      0.856     10.0      0.453      3.8      0.954
  45    83      695,109      9.1      0.887     10.1      0.579      4.3      0.979
  50    92      571,848      9.6      0.874     10.6      0.676      4.7      0.991
  55   102      481,938     10.1      0.905     11.5      0.713      5.1      0.983
  60   111      410,847     10.5      0.908     11.8      0.775      5.4      0.948
  65   120      388,753     11.0      0.927     12.3      0.753      5.7      0.966
  70   129      338,661     11.4      0.929     13.3      0.768      6.1      0.907
  75   139      294,797     11.8      0.971     15.0      0.816      6.4      0.984
  80   148      260,022     12.2      0.988     16.4      0.969      6.7      0.970
  85   157      234,974     12.5      0.987     18.9      0.945      7.0      0.993
  90   167      205,423     12.9      0.997     23.5      0.903      7.2      0.971
  95   176      173,308     13.3      0.645     30.6      0.758      7.4      0.989
 100   184      142,156     13.6      0.164        -      0.541      7.6      0.937

Table 13 (panel A): The table refers to figure 4 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole period, 2000-2013, obtained with the three methodologies on different thresholds in the subsample of commercial occupancy only. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel A

                          Hill (α)            Weighted Hill (γ)   GPD (ξ)
%Obs  #Obs    Threshold   t-Stat  KS pvalue   t-Stat  KS pvalue   t-Stat  KS pvalue
   5     4   59,530,730      2.0      0.996      2.5      0.015        -      1.000
  10     8   33,343,244      2.8      0.980      3.0      0.005        -      0.995
  15    12   23,097,671      3.5      0.997      3.2      0.001      0.4      0.994
  20    16   14,405,868      4.0      0.561      3.3      0.001      0.2      0.993
  25    20    9,563,530      4.5      0.681      3.5      0.002      0.1      1.000
  30    24    6,960,400      4.9      0.797      3.6      0.003      0.6      1.000
  35    28    6,018,581      5.3      0.895      4.1      0.001      1.2      1.000
  40    32    4,544,160      5.7      0.717      4.6      0.000      1.5      0.905
  45    36    2,832,349      6.0      0.397      5.0      0.001      1.6      0.925
  50    40    2,127,620      6.3      0.216      5.7      0.001      2.0      0.985
  55    45    1,775,753      6.7      0.201      6.5      0.001      2.4      0.955
  60    49    1,128,507      7.0      0.186      7.7      0.003      2.6      0.893
  65    53      854,139      7.3      0.082      9.0      0.009      2.9      0.974
  70    57      386,740      7.5      0.009     10.3      0.151      3.1      0.801
  75    61      341,340      7.8      0.025     11.8      0.308      3.4      0.968
  80    64      260,509      8.0      0.023     13.6      0.200      3.6      0.738
  85    69      239,424      8.3      0.029     17.2      0.384      3.9      0.394
  90    73      195,863      8.5      0.028     21.1      0.363      4.2      0.233
  95    77      162,940      8.8      0.034     25.6      0.357      4.6      0.222
 100    80      142,156      8.9      0.055        -      0.507      4.9      0.196

Table 13 (panel B): The table refers to figure 4 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole period, 2000-2013, obtained with the three methodologies on different thresholds in the subsample of commercial occupancy only. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel B


%Obs  #Obs   Threshold  t-Stat  KS pvalue  t-Stat  KS pvalue  t-Stat  KS pvalue
   5     2  66,363,328     1.4      1.000     1.0      0.123       -      1.000
  10     3  44,582,338     1.7      1.000     1.9      0.025       -      1.000
  15     5  38,882,857     2.2      1.000     3.1      0.006       -      1.000
  20     7  30,540,329     2.6      1.000     3.6      0.001       -      0.989
  25     8  16,230,101     2.8      0.671     3.7      0.006       -      0.949
  30    10  13,366,127     3.2      0.744     3.7      0.006       -      1.000
  35    12  11,465,146     3.5      0.762     4.5      0.005     0.6      1.000
  40    13   7,720,566     3.6      0.596     4.9      0.011     0.9      0.777
  45    15   6,770,841     3.9      0.864     5.7      0.012     0.2      0.999
  50    16   5,952,546     4.0      0.782     6.1      0.014     0.0      0.992
  55    18   5,562,820     4.2      0.888     6.8      0.018     0.8      0.977
  60    20   4,377,522     4.5      0.879     7.5      0.047     1.0      0.885
  65    21   3,823,377     4.6      0.882     8.6      0.073     1.1      0.903
  70    23   2,247,626     4.8      0.644    10.6      0.219     1.0      0.932
  75    25   2,095,801     5.0      0.714    12.9      0.265     1.4      0.905
  80    26   1,581,183     5.1      0.423    16.7      0.381     1.4      0.964
  85    28   1,129,211     5.3      0.230    22.5      0.418     1.6      0.957
  90    30     816,888     5.5      0.140    25.4      0.266     1.8      0.952
  95    31     248,283     5.6      0.007    27.2      0.000     1.8      0.993
 100    32     157,431     5.7      0.006       -      0.000     1.9      0.986

Table 13 (panel C): The table refers to figure 4 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole period, 2000-2013, obtained with the three methodologies on different thresholds in the subsample of commercial occupancy only. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel C


%Obs  #Obs   Threshold  t-Stat  KS pvalue  t-Stat  KS pvalue  t-Stat  KS pvalue
   5     2  39,061,678     1.4      0.747     1.5      0.462       -      0.960
  10     5  24,890,915     2.2      0.873     2.4      0.187     1.1      0.906
  15     7  14,848,536     2.6      0.993     2.6      0.240     0.7      0.965
  20    10   7,357,437     3.2      0.994     2.8      0.391     0.9      0.993
  25    12   5,840,551     3.5      1.000     2.9      0.307     1.2      0.999
  30    14   2,862,787     3.7      0.700     2.9      0.735     1.2      0.992
  35    17   1,977,335     4.1      0.691     3.0      0.810     1.6      1.000
  40    19   1,389,102     4.4      0.803     3.0      0.934     1.8      0.999
  45    22     971,114     4.7      0.722     2.8      0.800     2.1      0.993
  50    24     653,844     4.9      0.624     3.0      0.353     2.3      0.994
  55    26     383,353     5.1      0.493     3.6      0.181     2.5      0.994
  60    29     346,682     5.4      0.539     4.1      0.515     2.7      0.914
  65    31     328,327     5.6      0.694     4.5      0.935     2.8      0.773
  70    33     260,509     5.7      0.788     4.8      0.969     3.2      0.837
  75    36     258,593     6.0      0.774     5.8      0.806     3.6      0.773
  80    38     233,305     6.2      0.805     6.6      0.357     3.9      0.783
  85    41     196,768     6.4      0.643     8.0      0.143     4.2      0.734
  90    43     175,536     6.6      0.825     9.3      0.332     4.4      0.798
  95    46     148,601     6.8      0.836    11.3      0.221     4.6      0.837
 100    47     142,156     6.9      0.889       -      0.110     4.6      0.890

Table 14 (panel A): The table refers to figure 5 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole period, 2000-2013, obtained with the three methodologies on different thresholds in the subsample of manufacturing occupancy only. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel A


%Obs  #Obs   Threshold  t-Stat  KS pvalue  t-Stat  KS pvalue  t-Stat  KS pvalue
   5    19  87,537,592     4.4      1.000     8.2      0.000     1.4      1.000
  10    38  42,507,267     6.2      0.200     8.2      0.000     0.8      0.800
  15    57  27,694,010     7.5      0.573     8.2      0.000     1.5      0.898
  20    77  19,814,119     8.8      0.832     8.3      0.000     2.2      0.968
  25    96  13,501,743     9.8      0.497     8.6      0.000     2.5      0.968
  30   115   9,171,755    10.7      0.138     8.8      0.000     2.9      0.963
  35   134   7,761,958    11.6      0.307     9.3      0.000     3.7      0.913
  40   153   4,616,407    12.4      0.022     9.6      0.000     3.9      0.882
  45   172   2,769,325    13.1      0.000    10.4      0.000     4.2      0.829
  50   191   1,882,356    13.8      0.000    11.3      0.000     4.8      0.990
  55   210   1,181,820    14.5      0.000    12.6      0.000     5.3      0.964
  60   230     799,606    15.2      0.000    14.3      0.000     5.9      0.400
  65   249     671,724    15.8      0.000    16.2      0.000     6.5      0.074
  70   268     526,629    16.4      0.000    18.2      0.000     7.2      0.011
  75   287     409,063    16.9      0.000    20.5      0.000     8.1      0.002
  80   306     330,756    17.5      0.000    23.7      0.003     9.1      0.000
  85   326     274,954    18.1      0.000    28.0      0.015    10.1      0.000
  90   345     228,020    18.6      0.000    33.4      0.090    10.9      0.000
  95   364     189,018    19.1      0.000    40.8      0.337    11.7      0.000
 100   382     140,503    19.5      0.000       -      0.336    12.3      0.000

Table 14 (panel B): The table refers to figure 5 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole period, 2000-2013, obtained with the three methodologies on different thresholds in the subsample of manufacturing occupancy only. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel B




Table 14 (panel C): The table refers to figure 5 and shows t-Stats of the estimated coefficients and KS-test for goodness-of-fit on FGU (losses from the ground up) for the whole period, 2000-2013, obtained with the three methodologies on different thresholds in the subsample of manufacturing occupancy only. The parameters tested are α for the Hill estimator, γ for the Weighted Hill estimator and ξ for the Generalized Pareto. The null hypothesis of the KS test is that the distribution being estimated is the correct one. Statistically significant values are shown in bold and italics. Panel A refers to the full group. Panel B refers to developed countries only. Panel C refers to developing countries only.

Panel C


%Obs  #Obs   Threshold  t-Stat  KS pvalue  t-Stat  KS pvalue  t-Stat  KS pvalue
   5    13  34,562,984     3.6      0.608     7.4      0.002     0.2      0.969
  10    26  15,844,932     5.1      0.532     7.9      0.003     0.5      1.000
  15    38   8,732,513     6.2      0.921     8.2      0.004     1.2      0.991
  20    51   5,425,747     7.1      0.803     8.5      0.015     1.9      0.998
  25    64   3,379,646     8.0      0.322     8.7      0.051     2.6      0.999
  30    77   2,344,706     8.8      0.405     9.0      0.096     3.3      0.995
  35    90   1,428,214     9.5      0.291     9.4      0.221     3.8      0.953
  40   102   1,085,713    10.1      0.245     9.6      0.317     4.4      0.926
  45   115     771,347    10.7      0.152    10.1      0.251     5.0      0.864
  50   128     683,688    11.3      0.330    10.5      0.810     5.6      0.541
  55   141     570,669    11.9      0.290    10.7      0.662     6.3      0.419
  60   154     472,647    12.4      0.301    11.1      0.581     6.9      0.460
  65   166     406,848    12.9      0.502    11.9      0.869     7.4      0.481
  70   179     346,943    13.4      0.599    13.2      0.921     7.9      0.512
  75   192     295,398    13.9      0.438    15.0      0.976     8.3      0.574
  80   205     264,664    14.3      0.554    17.3      0.605     8.7      0.588
  85   218     230,125    14.8      0.628    20.6      0.705     9.0      0.607
  90   230     206,378    15.2      0.609    23.9      0.166     9.4      0.666
  95   243     173,105    15.6      0.742    28.5      0.190     9.7      0.693
 100   255     140,503    16.0      0.580       -      0.147    10.0      0.667
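The two statistics tabulated in tables 13 and 14 can be illustrated with a short Python sketch. It is not the code used for the report: the helper name `hill_tail_fit`, the choice of the (k+1)-th largest loss as threshold, and the asymptotic standard error α/√k are assumptions made here for illustration. Under that standard error the t-statistic reduces to √k, which is consistent with the first t-Stat column of the tables (for example, 16 observations give 4.0). The KS p-value below treats the fitted α as if it were known, so it is only approximate.

```python
import numpy as np
from scipy import stats


def hill_tail_fit(losses, k):
    """Hill fit on the k largest losses (illustrative helper, not from the report).

    Returns (alpha_hat, t_stat, ks_pvalue).
    """
    x = np.sort(np.asarray(losses, dtype=float))[::-1]  # largest first
    if not 1 <= k < x.size:
        raise ValueError("k must satisfy 1 <= k < number of losses")
    threshold = x[k]                        # (k+1)-th largest loss
    log_excess = np.log(x[:k] / threshold)
    alpha_hat = k / log_excess.sum()        # Hill estimator of the tail index
    std_err = alpha_hat / np.sqrt(k)        # asymptotic standard error (assumed)
    t_stat = alpha_hat / std_err            # equals sqrt(k)
    # Under a Pareto tail, log(X / threshold) is exponential with rate alpha.
    ks = stats.kstest(log_excess, "expon", args=(0.0, 1.0 / alpha_hat))
    return alpha_hat, t_stat, ks.pvalue


rng = np.random.default_rng(0)
simulated = 1e5 * (1.0 + rng.pareto(1.5, size=2000))  # synthetic Pareto losses
alpha_hat, t_stat, p_value = hill_tail_fit(simulated, k=400)
print(f"alpha={alpha_hat:.2f}  t-stat={t_stat:.1f}  KS p-value={p_value:.3f}")
```

A high KS p-value means the Pareto tail is not rejected at that threshold; a low one signals that the fit deteriorates, as happens in the tables when most of the sample is used.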


b. Curve Fitting for each parameter estimator by each covariate

Figures 6 to 8 show parameter estimates for the whole period by occupancy type, economic development and time period, obtained with the Hill, Weighted Hill and POT estimators. The x-axis reports different threshold levels, measured as the percentage of observations above the threshold; hence 5 means that the threshold has been set at the 95th percentile, 25 that it has been set at the 75th percentile, and so on until 100, which indicates that we are using the whole sample. The y-axis reports the parameter estimates. To make the estimates comparable, the parameters reported are the estimates of α for the Hill estimator, 1/γ for the Weighted Hill estimator and 1/ξ for the POT method (GPD).
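The mapping from "percentage of observations above the threshold" to a percentile, and the two directly comparable parameters α (Hill) and 1/ξ (GPD), can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the report's implementation: the function name `sweep_thresholds` and the simulated losses are hypothetical, the threshold is taken as an interpolated sample percentile, and the Weighted Hill estimator is left out because its weighting scheme is not described in this section.

```python
import numpy as np
from scipy import stats


def sweep_thresholds(losses, percents=range(5, 101, 5)):
    """Return rows (pct, n_obs, threshold, hill_alpha, inv_xi), one per threshold."""
    x = np.asarray(losses, dtype=float)
    rows = []
    for pct in percents:
        # pct% of observations above the threshold <=> (100 - pct)th percentile.
        threshold = np.percentile(x, 100 - pct)
        tail = x[x > threshold]
        hill_alpha = tail.size / np.log(tail / threshold).sum()
        # POT: fit a GPD to the excesses over the threshold, location fixed at 0.
        xi, _, _ = stats.genpareto.fit(tail - threshold, floc=0)
        inv_xi = 1.0 / xi if xi != 0 else np.inf
        rows.append((pct, tail.size, threshold, hill_alpha, inv_xi))
    return rows


rng = np.random.default_rng(1)
losses = 1e5 * (1.0 + rng.pareto(2.0, size=2000))  # synthetic data, true alpha = 2
for pct, n_obs, u, hill_alpha, inv_xi in sweep_thresholds(losses, (5, 25, 50, 100)):
    print(f"{pct:>3}%  n={n_obs:>4}  u={u:>12,.0f}  alpha={hill_alpha:.2f}  1/xi={inv_xi:.2f}")
```

Plotting `hill_alpha` and `inv_xi` against `pct` gives curves of the kind described for figures 6 and 8.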

Figure 6 refers to the Hill estimator only. Panel A shows results for the entire dataset and by occupancy type. Panel B focuses on total and economic development. Panel C plots total and period. Looking at occupancy type, we see that MA and total are close to each other; this is because our sample contains more MA data than CO data. The parameter estimates are rarely above two, and fall below one for thresholds below the 80th percentile. It is not clear whether CO or MA has the heavier tail, given that the CO parameter is higher for some thresholds and the MA parameter for others. Looking at results by economic development, we see that estimates for the developed class appear to be greater than for the developing class, meaning that the developed class is lighter tailed, except when more than 80% of the sample is used for the estimation. Looking at results by period, we see that the estimates are close to each other, with the before-crisis coefficients higher than the after-crisis ones for thresholds above the 40th percentile (i.e., a thinner tail than after the crisis) and smaller for thresholds below the median (hence thicker tails). Both before- and after-crisis parameters start below two (no variance), at 1.95 for the before-crisis parameter and 1.6 for the after-crisis one; they go below one (no mean) at the 70th and 80th percentiles, respectively; they both reach the minimum of 0.12 when all the observations are used.
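The labels "no variance" and "no mean" follow from the moments of a Pareto-type tail. As a brief worked derivation, assuming an exact Pareto tail above a threshold u (the symbols u and k are introduced here for illustration only):

```latex
P(X > x) = \left(\frac{x}{u}\right)^{-\alpha}, \quad x \ge u
\quad\Longrightarrow\quad
E\!\left[X^{k}\right]
  = \int_{u}^{\infty} x^{k}\,\alpha\,u^{\alpha}\,x^{-\alpha-1}\,dx
  = \begin{cases}
      \dfrac{\alpha\,u^{k}}{\alpha-k}, & \alpha > k,\\[2ex]
      \infty, & \alpha \le k.
    \end{cases}
```

Hence the mean (k = 1) is finite only if α > 1, and the variance (which requires k = 2) only if α > 2.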

Figure 7 refers to the Weighted Hill estimator only. Panel A shows results for total (all data) and by occupancy type. Panel B focuses on total and economic development. Panel C plots results for total and by period. Looking at results by occupancy type, manufacturing and total are still close to each other, the parameters for both being below one (no mean) and oscillating between 0.3 and 0.36. The CO class is also below one, and oscillates in a manner similar to the MA class. When the percentage of the sample used is between 15% and 65%, the CO parameter is lower, hence the CO class is heavier tailed for these thresholds. For the other thresholds the MA class seems heavier tailed; however, the differences are not great. Looking at results by economic development, we see that the developed group parameter increases almost linearly as we lower the threshold, going from 0.23 to 0.44, so that the mean is never defined. The developing group parameter is more stable, at around 0.5, when less than 60% of the sample is used; it decreases to 0.37 for lower thresholds. Hence, developing countries are less heavy tailed than developed countries according to the Weighted Hill estimator, except for the bottom 20 percentiles (we had the opposite for Hill). Looking at the results by period, we see that the parameters follow a similar pattern and never intersect with one another, the after-crisis estimates being higher (less heavy tailed) than total, which is higher (less heavy tailed) than before-crisis (with Hill the result was mixed).


Figure 8 refers to the GPD estimator only. Panel A shows results for total and occupancy type. Panel B focuses on total and economic development. Panel C plots results for total and different periods. Looking at the results by occupancy type, we see that MA and total are close to each other; parameters are high for high thresholds, and decrease as we lower the threshold. The parameter for the CO class is negative for the highest quartile, but then goes above the one obtained for the MA class and tends to stay above it, meaning that CO is less heavy tailed. So results by occupancy type are clearer with GPD than with the other two methods. Looking at results by economic development, we see that estimates for both developed and developing countries start negative, but are both positive and large when more than 10% of the observations are used. The estimates for developed countries tend to be higher than two (finite variance), and are greater than for developing countries when more than 15% of observations are used; hence the developing class is more heavy tailed. The developed class goes below two (no variance) for thresholds below the 80th percentile, and below one (no mean) below the 65th. Hence GPD is more in line with Hill than with Weighted Hill by economic development as well. Looking at the results by period, we see that the before-crisis estimates are higher than the after-crisis ones, meaning that the after-crisis class is heavier tailed (this is the opposite of what we obtained with Weighted Hill, and also differs from the results for Hill).


Figure 6: The figure shows Parameter Estimates for FGU (losses from the ground up) from 2000-2013, obtained with the Hill estimator. The parameter reported on the y-axes is α. The x-axes correspond to different threshold levels expressed as percentage of the total sample. Panel A shows results for total sample and split by occupancy type. Panel B shows results for total sample and split by Economic Development. Panel C shows results for total sample and split by Time Period.

[Figure 6, Panel A: Parameter Estimates for Hill estimator for FGU from 2000-2013 by Occupancy. Legend: Total, CO, MA.]

[Figure 6, Panel B: Parameter Estimates for Hill estimator for FGU from 2000-2013 by Economic Development.]

[Figure 6, Panel C: Parameter Estimates for Hill estimator for FGU from 2000-2013 by Period. Legend: Total, Before Crisis, After Crisis.]

Figure 7: The figure shows Parameter Estimates for FGU (losses from the ground up) from 2000-2013, obtained with the Weighted Hill estimator. The parameter reported on the y-axes is 1/γ. The x-axes correspond to different threshold levels expressed as percentage of the total sample. Panel A shows results for total sample and split by occupancy type. Panel B shows results for total sample and split by Economic Development. Panel C shows results for total sample and split by Time Period.

[Figure 7, Panel A: Parameter Estimates for Weighted-Hill estimator for FGU from 2000-2013 by Occupancy. Legend: Total, CO, MA.]

[Figure 7, Panel B: Parameter Estimates for Weighted-Hill estimator for FGU from 2000-2013 by Economic Development.]

[Figure 7, Panel C: Parameter Estimates for Weighted-Hill estimator for FGU from 2000-2013 by Period.]

Figure 8: The figure shows Parameter Estimates for FGU (losses from the ground up) from 2000-2013, obtained with the Generalized Pareto distribution. The parameter reported on the y-axes is 1/ξ. The x-axes correspond to different threshold levels expressed as percentage of the total sample. Panel A shows results for total sample and split by occupancy type. Panel B shows results for total sample and split by Economic Development. Panel C shows results for total sample and split by Time Period.

[Figure 8, Panel A: Parameter Estimates for Generalized Pareto for FGU from 2000-2013 by Occupancy. Legend: Total, CO, MA.]

[Figure 8, Panel B: Parameter Estimates for Generalized Pareto for FGU from 2000-2013 by Economic Development.]

[Figure 8, Panel C: Parameter Estimates for Generalized Pareto for FGU from 2000-2013 by Period.]

c. Plots of Cumulative Distribution Functions (CDFs)

Figure 9 shows empirical and estimated CDFs obtained with the three different methodologies, compared across different subsamples. The y-axis reports the forecasted quantiles, while the x-axis maps different loss values scaled by the threshold. In all panels the threshold chosen is the median. Panel A shows results for the whole sample. Panel B shows results for developed countries before the crisis. Panel C shows results for developing countries before the crisis. Panel D shows results for developed countries after the crisis. Panel E shows results for developing countries after the crisis.

During the whole period the CDF obtained with GPD is the one with the thinnest tail, and is the closest to the empirical CDF. The Weighted Hill CDF is the heaviest tailed, whereas the Hill CDF is in the middle.

For developed countries before the crisis, GPD has the thinnest tail and is close to the empirical distribution. The Hill CDF has a heavier tail and is still quite close to the empirical CDF. The Weighted Hill CDF has the heaviest tail, but is far from the empirical CDF.

For the developing group before the crisis, GPD again provides results close to the empirical distribution and a lighter tail. The Hill and Weighted Hill CDFs are heavier tailed than GPD, with the Weighted Hill CDF having the heaviest tail.

For the developed group after the crisis, both the Hill and GPD CDFs are lighter tailed than the Weighted Hill CDF, and close to the empirical CDF.

For the developing group after the crisis, all the CDFs look close to each other. GPD has the lightest tail and approximates the empirical CDF well for higher losses. The Weighted Hill CDF is again the heaviest tailed, while the Hill CDF is in the middle.
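The comparison of fitted and empirical CDFs on the threshold-scaled axis can be sketched in Python as follows. The function names and the scaling convention z = x/u (loss divided by threshold) are assumptions made for this illustration, and the GPD formula assumes ξ > 0; the parameter values are arbitrary and are not estimates from the report. A heavier tail shows up as a CDF that approaches one more slowly as z grows.

```python
import numpy as np


def empirical_cdf(z):
    """Empirical CDF evaluated at the sorted sample points."""
    z_sorted = np.sort(np.asarray(z, dtype=float))
    probs = np.arange(1, z_sorted.size + 1) / z_sorted.size
    return z_sorted, probs


def pareto_cdf(z, alpha):
    """CDF of threshold-scaled losses z = x / u >= 1 under a Pareto tail."""
    z = np.asarray(z, dtype=float)
    return 1.0 - z ** (-alpha)


def gpd_cdf(z, xi, sigma_over_u):
    """GPD CDF of the excess x - u, written in terms of z = x / u.

    sigma_over_u is the GPD scale divided by the threshold; assumes xi > 0.
    """
    z = np.asarray(z, dtype=float)
    return 1.0 - (1.0 + xi * (z - 1.0) / sigma_over_u) ** (-1.0 / xi)


z_grid = np.array([1.0, 2.0, 5.0, 20.0, 100.0])
print(pareto_cdf(z_grid, alpha=1.0))               # heavier tail, slower approach to 1
print(gpd_cdf(z_grid, xi=0.5, sigma_over_u=1.0))   # lighter tail
```

When ξ = 1/α and the scale equals u/α, the two parametric CDFs coincide, which is why α and 1/ξ are compared directly in the figures.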


Figure 9: The figure shows empirical and estimated CDFs for FGU (losses from the ground up) from 2000-2013, obtained with the three different methodologies. The y-axes report the forecasted quantiles, while the x-axes correspond to different losses values scaled by the threshold. In all panels the threshold chosen is the median. Panel A shows results for total sample. Panel B shows results for the developed countries before the crisis. Panel C shows results for the developing countries before the crisis. Panel D shows results for the developed countries after the crisis. Panel E shows results for the developing countries after the crisis.

[Figure 9, Panel A: CDFs for FGU from 2000 to 2013 for Total. Legend: Empirical, Hill, Weighted Hill, GPD.]

[Figure 9, Panel B: CDFs for FGU from 2000 to 2008 for Developed.]

[Figure 9, Panel C: CDFs for FGU from 2000 to 2008 for Developing.]

[Figure 9, Panel D: CDFs for FGU from 2009 to 2013 for Developed.]

[Figure 9, Panel E: CDFs for FGU from 2009 to 2013 for Developing.]

d. Regression results

Tables 15 and 16 show results for the Hill and GPD regressions with covariates, respectively. The parameters are estimated for different thresholds, ranging from 15% of the observations (85th percentile) to 100% (full sample), and for different sets of covariates. All the dummy variables are relative to the constant. The models are estimated on the whole dataset, but with EON and RE excluded. The dummy for MA is relative to CO, the dummy for developing (Emerg.) is relative to developed, and the dummy for after-crisis is relative to before-crisis. We also include an interaction dummy (Interact.) for observations that are both in a developing country and in the after-crisis period. The only continuous variable we use is the logarithm of total insured values (LogTIV*), to control for size effects. Panel A uses only the dummies for developing, after-crisis and interaction as covariates. Panel B adds the dummy for MA to the initial regression. Panel C adds the logarithm of total insured value, removing the records with TIV not available from the regression. Panel D adds both the dummy for MA and LogTIV* to the initial regression. It is worth pointing out that the coefficients estimated for the GPD regression are relative to the ξ parameter; hence their average marginal effect on α=1/ξ has the opposite sign.
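The sign reversal between ξ and α follows from the chain rule. As a short worked step, writing x_j for any covariate (notation introduced here for illustration):

```latex
\alpha = \frac{1}{\xi}
\quad\Longrightarrow\quad
\frac{\partial \alpha}{\partial x_j}
  = -\frac{1}{\xi^{2}}\,\frac{\partial \xi}{\partial x_j}
```

Since 1/ξ² is positive, a covariate that raises ξ (heavier tail) lowers α, and a covariate that lowers ξ raises α.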

Table 15 refers to the Hill regression with covariates. For the baseline model (panel A) the dummy for developing is always positive, and significant when 80% or more of the sample is used, meaning that developing countries tend to have higher α (thinner tail) than developed ones. The dummy for crisis is negative when 55% or more of the sample is used (heavier tails after the crisis) and positive for higher thresholds, but never significant. Interaction is positive, and significant when between 55% and 90% of the sample is used (lighter tail for developing countries after the crisis).

When adding the dummy for the MA class (panel B), this variable is generally positive for thresholds below the median, and negative for higher thresholds, but not statistically significant. The other results are unaffected.

When controlling for size (adding LogTIV*, see panel C) the constant loses statistical power. The dummy for developing countries is still generally positive, and statistically significant only above 80% of the sample, as in panels A and B. The crisis dummy is still generally negative and not statistically significant. Interaction is still positive and statistically significant for sample sizes ranging between 55% and 90% of the sample, the same as for panels A and B. LogTIV* is generally negative (hence the tail becomes heavier as we increase TIV*) but not statistically significant. The results for the full model (panel D) are similar to those of panels A, B and C. Hence, adding the dummy for occupancy, or controlling for size, or both, does not change the results much.
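To give an idea of how a tail index regression with dummy covariates can be estimated, here is a rough Python sketch. It is only an illustration under explicit assumptions and should not be read as the specification behind table 15: it assumes a log link, α_i = exp(x_i'β), fitted by maximum likelihood on the log-excesses y_i = log(X_i/u), which are exponential with rate α_i under a Pareto tail. The function name, the link and the simulated "developing" dummy are all hypothetical.

```python
import numpy as np
from scipy import optimize


def fit_tail_index_regression(log_excess, design):
    """ML fit of alpha_i = exp(design_i @ beta) for exponential log-excesses.

    Returns (beta_hat, t_stats). Illustrative only; assumes a log link.
    """
    y = np.asarray(log_excess, dtype=float)
    X = np.asarray(design, dtype=float)

    def neg_loglik(beta):
        eta = X @ beta
        return -(eta - np.exp(eta) * y).sum()

    def gradient(beta):
        eta = X @ beta
        return -X.T @ (1.0 - np.exp(eta) * y)

    result = optimize.minimize(neg_loglik, np.zeros(X.shape[1]),
                               jac=gradient, method="BFGS")
    beta_hat = result.x
    # Observed information matrix: sum_i alpha_i * y_i * x_i x_i'.
    weights = np.exp(X @ beta_hat) * y
    information = X.T @ (weights[:, None] * X)
    std_err = np.sqrt(np.diag(np.linalg.inv(information)))
    return beta_hat, beta_hat / std_err


rng = np.random.default_rng(2)
n = 4000
developing = rng.integers(0, 2, size=n)              # hypothetical dummy covariate
true_alpha = np.where(developing == 1, 2.0, 1.0)     # thinner tail when dummy = 1
log_excess = rng.exponential(1.0 / true_alpha)       # simulated log(X / u)
design = np.column_stack([np.ones(n), developing])
beta_hat, t_stats = fit_tail_index_regression(log_excess, design)
print("coefficients:", np.round(beta_hat, 3), "t-stats:", np.round(t_stats, 2))
```

In this sketch a positive coefficient on the dummy means a higher α (thinner tail) for that group, which is the reading applied to the Hill regression coefficients in the text.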

Table 16 refers to the GPD regression with covariates. For the baseline model (panel A) the dummy for developing countries is positive when more than 50% of the sample is used, and statistically significant only above 80%, meaning that developing countries tend to have lower α (heavier tail) than developed ones, the opposite of the Hill regression. However, for thresholds above the median the dummy for developing countries is negative, although not statistically significant (except for the highest thresholds), which implies higher α (lighter tail) as in the Hill regression. The crisis dummy is negative (lighter tails after the crisis), the opposite of what we found with Hill, and statistically significant for thresholds below the median. Unlike in the Hill regression, interaction is generally positive, but not statistically significant.

Adding the MA dummy does not affect the results much, as was the case with the Hill methodology. The dummy is negative, but not statistically significant, except for a couple of high thresholds for which it is positive, meaning that MA is slightly heavier tailed.

When controlling for size (panel C) the constant loses statistical power (as in the Hill case). The dummy for developing countries is still generally positive when more than 50% of the sample is used, and negative otherwise. It is statistically significant for thresholds lower than the 25th percentile and above the 15th. The crisis dummy is still almost everywhere negative, and statistically significant for thresholds below the median. Interaction is generally positive, but not statistically significant. LogTIV* is generally negative (hence the tail becomes thinner as we increase TIV*, the opposite of what we obtained with the Hill regression), and it is rarely statistically significant (only when the full sample or 90% of it is used).

The results for the full model (panel D) are similar to those presented in panels A, B and C, meaning that including either MA or size or both has no great impact on the other parameters.

Tables 17 and 18 show results for the Hill and GPD regressions when different explanatory variables are used. Table 17 adds the EON class to the sample and as a new dummy. Table 18 shows regressions where EON is still out of sample, and replaces the dummy for the after-crisis period with a dummy for the period after the Thai flood of 2011. As a result, the interaction dummy changes as well. The parameters are estimated for different thresholds, ranging from 15% of the observations (85th percentile) to 100% (full sample), and for different sets of covariates. All the dummy variables are relative to the constant. The dummies for the MA and EON classes are relative to the CO class, the dummy for developing (Emerg.) is relative to developed, and the dummy for after crisis (flood) is relative to the period before the crisis (flood). The variable “Interact” refers to the interaction between being in a developing country and after the crisis (flood). The only continuous variable is the logarithm of total insured values (LogTIV*), which again we use to control for size effects. Significant t-statistics (above 1.96) are reported in bold font. Panel A shows results for the Hill regression without TIV*. Panel B adds LogTIV* to the Hill regression. Panel C shows results for the GPD regression without TIV*. Panel D adds LogTIV* to the GPD regression. The coefficients estimated for the GPD are relative to the ξ parameter; hence their average marginal effect on α=1/ξ has the opposite sign.

When the EON class is added to the sample and as a covariate (table 17), the results do not change. Parameters estimated with and without TIV* for both Hill and GPD are similar to those of tables 15 and 16, panels B and D. The EON dummy is generally not statistically significant. For GPD it is negative, meaning a slightly thinner tail than CO; for Hill it is positive (i.e. a thinner tail, as with GPD) only for thresholds above the median, and negative (thicker tail) for lower thresholds.

When the crisis dummy is replaced with the flood one (table 18), for the Hill regression this dummy is generally positive, but not statistically significant, without the variable LogTIV*, while it is generally negative, but not statistically significant, when we control for size effects. The interaction between the flood and developing dummies is positive, but not statistically significant, in both cases. The MA and developing dummies behave similarly to the previous case, with the difference that the developing dummy is statistically significant for more thresholds, while the MA one is never statistically significant.

Results for the GPD regression are very similar to those of table 16: the flood dummy behaves like the crisis one, and the results for the interaction between the flood and developing dummies are similar to those of the interaction between the crisis and developing dummies.

Table 15: The table shows results for the Hill regression with covariates. The parameters are estimated for different thresholds, ranging from 15% of observations (85th percentile) to 100% (full sample), and for different sets of covariates. All the dummy variables are relative to the constant. The dummy for manufacturing (Manufac.) is relative to commercial, the dummy for developing (Emerg.) is relative to developed, and the dummy for after crisis is relative to before the crisis. We also include an interaction dummy (Interact.) for being in a developing country after the crisis. The only continuous variable is the logarithm of total insured values (LogTIV*), to control for size effects. Significant t-statistics (above 1.96) are in bold and italics. The models are estimated with the exclusion of energy-on-shore. Panel A uses only dummies for developing, after crisis and interaction as covariates. Panel B adds the dummy for manufacturing to the initial regression. Panel C adds the logarithm of total insured value, removing the records with TIV not available from the regression. Panel D adds both the dummy for manufacturing and LogTIV* to the initial regression.

Table 15: Panel A

Hill Regression (without TIV* and occupancy)

%Obs               Const.       Emerg. After Crisis    Interact.
  15  Coeff         0.125        0.137        0.248        0.142
      t-stat         0.78         0.47         0.70         0.22
  20  Coeff        -0.043        0.075        0.300        0.241
      t-stat         0.30         0.29         1.02         0.47
  25  Coeff        -0.228        0.165        0.472        0.048
      t-stat         1.68         0.69         1.89         0.11
  30  Coeff        -0.306        0.057        0.261        0.151
      t-stat         2.51         0.26         1.13         0.39
  35  Coeff        -0.325        0.094        0.108        0.178
      t-stat         2.89         0.47         0.49         0.49
  40  Coeff        -0.505        0.072        0.082        0.357
      t-stat         4.71         0.38         0.39         1.10
  45  Coeff        -0.702        0.091        0.070        0.349
      t-stat         6.80         0.51         0.35         1.16
  50  Coeff        -0.816        0.136        0.038        0.430
      t-stat         8.12         0.79         0.19         1.52
  55  Coeff        -0.936        0.103       -0.029        0.553
      t-stat         9.64         0.62         0.15         2.04
  60  Coeff        -1.013        0.133       -0.077        0.570
      t-stat        10.72         0.84         0.39         2.17
  65  Coeff        -1.085        0.181       -0.059        0.606
      t-stat        11.59         1.17         0.31         2.40
  70  Coeff        -1.172        0.252       -0.070        0.558
      t-stat        12.62         1.69         0.36         2.26
  75  Coeff        -1.217        0.268       -0.082        0.623
      t-stat        13.22         1.84         0.43         2.58
  80  Coeff        -1.277        0.312       -0.085        0.616
      t-stat        13.93         2.19         0.44         2.59
  85  Coeff        -1.322        0.372       -0.062        0.544
      t-stat        14.48         2.69         0.32         2.35
  90  Coeff        -1.356        0.437       -0.067        0.508
      t-stat        14.91         3.25         0.35         2.23
  95  Coeff        -1.400        0.487       -0.012        0.417
      t-stat        15.41         3.69         0.06         1.88
 100  Coeff        -1.460        0.516       -0.019        0.360
      t-stat        16.12         3.99         0.10         1.64

Table 15: Panel B

Hill Regression (without TIV*)

%Obs               Const.     Manufac.       Emerg. After Crisis    Interact.
  15  Coeff         0.432       -0.350        0.092        0.264        0.171
      t-stat         1.36         1.09         0.31         0.74         0.26
  20  Coeff        -0.003       -0.046        0.072        0.300        0.234
      t-stat         0.01         0.16         0.28         1.02         0.46
  25  Coeff        -0.210       -0.021        0.165        0.472        0.043
      t-stat         0.78         0.08         0.69         1.89         0.10
  30  Coeff        -0.377        0.080        0.058        0.260        0.169
      t-stat         1.50         0.33         0.27         1.13         0.43
  35  Coeff        -0.354        0.034        0.094        0.108        0.185
      t-stat         1.55         0.15         0.47         0.48         0.51
  40  Coeff        -0.443       -0.072        0.073        0.082        0.340
      t-stat         2.19         0.36         0.38         0.39         1.04
  45  Coeff        -0.645       -0.066        0.093        0.071        0.334
      t-stat         3.44         0.36         0.52         0.35         1.10
  50  Coeff        -0.716       -0.119        0.140        0.040        0.405
      t-stat         4.11         0.70         0.82         0.20         1.42
  55  Coeff        -0.923       -0.016        0.104       -0.029        0.550
      t-stat         5.45         0.10         0.62         0.15         2.01
  60  Coeff        -0.970       -0.052        0.136       -0.076        0.560
      t-stat         6.06         0.34         0.85         0.39         2.12
  65  Coeff        -1.103        0.021        0.180       -0.060        0.609
      t-stat         7.00         0.14         1.16         0.31         2.40
  70  Coeff        -1.252        0.096        0.247       -0.072        0.572
      t-stat         7.98         0.64         1.65         0.37         2.31
  75  Coeff        -1.326        0.131        0.261       -0.085        0.641
      t-stat         8.59         0.89         1.79         0.44         2.64
  80  Coeff        -1.320        0.052        0.309       -0.086        0.622
      t-stat         8.95         0.37         2.16         0.45         2.61
  85  Coeff        -1.399        0.093        0.367       -0.064        0.556
      t-stat         9.55         0.68         2.64         0.34         2.39
  90  Coeff        -1.395        0.047        0.434       -0.070        0.515
      t-stat         9.83         0.36         3.22         0.37         2.25
  95  Coeff        -1.430        0.035        0.485       -0.013        0.421
      t-stat        10.26         0.28         3.67         0.07         1.89
 100  Coeff        -1.439       -0.025        0.517       -0.018        0.358
      t-stat        10.67         0.21         4.00         0.10         1.62

Table 15: Panel C

Hill Regression (without occupancy)

%Obs

Const. Emerg. After Crisis Interact. LogTIV*

15 Coeff 0.994 0.168 0.535 -0.136 -0.049 t-stat 0.72 0.55 1.48 0.23 0.72

20 Coeff 1.249 0.154 0.586 -0.027 -0.074 t-stat 1.05 0.55 1.96 0.06 1.27

25 Coeff 1.655 0.026 0.256 0.079 -0.097 t-stat 1.61 0.10 0.95 0.19 1.89

30 Coeff 1.330 0.046 0.068 0.223 -0.082 t-stat 1.44 0.20 0.26 0.59 1.79

35 Coeff 0.995 -0.040 0.052 0.319 -0.073 t-stat 1.18 0.18 0.22 0.91 1.73

40 Coeff 0.223 0.028 -0.034 0.391 -0.042 t-stat 0.28 0.14 0.15 1.19 1.09

45 Coeff -0.439 0.069 -0.029 0.476 -0.015 t-stat 0.59 0.36 0.12 1.54 0.40

50 Coeff -0.627 0.054 -0.087 0.575 -0.012 t-stat 0.88 0.29 0.38 1.93 0.33

55 Coeff -0.639 0.108 -0.132 0.583 -0.015 t-stat 0.94 0.61 0.58 2.02 0.45

60 Coeff -0.734 0.132 -0.120 0.601 -0.014 t-stat 1.13 0.76 0.54 2.15 0.42

65 Coeff -0.922 0.224 -0.128 0.538 -0.007 t-stat 1.47 1.35 0.58 1.97 0.23

70 Coeff -1.188 0.251 -0.131 0.553 0.003 t-stat 1.95 1.55 0.59 2.06 0.11

75 Coeff -1.519 0.263 -0.119 0.599 0.017 t-stat 2.54 1.65 0.54 2.26 0.59

80 Coeff -1.673 0.311 -0.071 0.546 0.022 t-stat 2.85 1.98 0.33 2.11 0.78

85 Coeff -1.709 0.368 -0.079 0.525 0.023 t-stat 2.98 2.41 0.36 2.05 0.81

90 Coeff -1.638 0.425 -0.092 0.486 0.017 t-stat 2.93 2.85 0.42 1.93 0.63

95 Coeff -1.717 0.476 -0.021 0.353 0.019 t-stat 3.12 3.25 0.10 1.44 0.69

100 Coeff -1.720 0.515 -0.022 0.307 0.016 t-stat 3.17 3.58 0.10 1.26 0.59

Table 15: Panel D

Hill Regression

%Obs

Const. Manufac. Emerg. After Crisis Interact. LogTIV*

15 Coeff 0.995 0.117 0.177 0.532 -0.128 -0.054 t-stat 0.71 0.33 0.58 1.47 0.22 0.78

20 Coeff 1.228 0.151 0.161 0.584 0.000 -0.080 t-stat 1.02 0.50 0.57 1.95 0.00 1.33

25 Coeff 1.600 0.233 0.031 0.250 0.123 -0.104 t-stat 1.53 0.83 0.12 0.92 0.29 1.99

30 Coeff 1.302 0.085 0.048 0.067 0.239 -0.084 t-stat 1.40 0.34 0.21 0.26 0.62 1.82

35 Coeff 1.007 -0.035 -0.040 0.052 0.312 -0.072 t-stat 1.19 0.16 0.18 0.22 0.89 1.70

40 Coeff 0.237 -0.025 0.027 -0.034 0.387 -0.042 t-stat 0.30 0.12 0.13 0.14 1.17 1.07

45 Coeff -0.407 -0.080 0.070 -0.029 0.464 -0.013 t-stat 0.54 0.43 0.36 0.12 1.49 0.35

50 Coeff -0.644 0.043 0.053 -0.087 0.581 -0.013 t-stat 0.90 0.24 0.28 0.38 1.94 0.36

55 Coeff -0.655 0.041 0.107 -0.132 0.588 -0.016 t-stat 0.96 0.24 0.60 0.58 2.03 0.48

60 Coeff -0.759 0.072 0.130 -0.121 0.610 -0.015 t-stat 1.16 0.44 0.75 0.54 2.17 0.48

65 Coeff -0.973 0.147 0.220 -0.130 0.554 -0.011 t-stat 1.54 0.90 1.32 0.58 2.02 0.34

70 Coeff -1.241 0.162 0.246 -0.132 0.568 -0.001 t-stat 2.02 1.01 1.52 0.60 2.11 0.03

75 Coeff -1.565 0.151 0.258 -0.119 0.612 0.013 t-stat 2.60 0.97 1.61 0.54 2.31 0.45

80 Coeff -1.700 0.097 0.307 -0.072 0.554 0.019 t-stat 2.89 0.66 1.96 0.33 2.14 0.67

85 Coeff -1.730 0.076 0.365 -0.079 0.530 0.020 t-stat 3.01 0.53 2.39 0.36 2.07 0.72

90 Coeff -1.664 0.098 0.422 -0.092 0.494 0.015 t-stat 2.96 0.69 2.82 0.42 1.95 0.53

95 Coeff -1.736 0.070 0.473 -0.021 0.358 0.017 t-stat 3.14 0.51 3.23 0.10 1.46 0.61

100 Coeff -1.723 0.015 0.514 -0.022 0.309 0.015 t-stat 3.17 0.11 3.57 0.10 1.27 0.57

Table 16: The table shows results for the GPD regression with covariates. The parameters are estimated for different threshold levels, ranging from 15% of observations (85th percentile) to 100% (full sample), and for different sets of covariates. All dummy variables are relative to the constant. The dummy for manufacturing (Manufac.) is relative to commercial, the dummy for developing (Emerg.) is relative to developed, and the dummy for after the crisis (After Crisis) is relative to before the crisis. We also include a dummy (Interact.) for the interaction between being in a developing country and being after the crisis. The only continuous variable is the logarithm of total insured values (LogTIV*), which controls for size effects. Significant t-statistics (above 1.96) are in bold and italics. The models are estimated with the exclusion of energy-on-shore. Panel A uses only the dummies for developing, after crisis, and interaction as covariates. Panel B adds the dummy for manufacturing to the initial regression. Panel C adds the logarithm of total insured value, removing records with TIV not available from the regression. Panel D adds both the dummy for manufacturing and LogTIV* to the initial regression. The estimated coefficients are relative to the ξ parameter, hence their average marginal effect on α=1/ξ has the opposite sign.
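The sign reversal mentioned in the caption follows from differentiating α=1/ξ. The short derivation below is a sketch: the increasing link function g and the linear index are assumptions introduced here for illustration, since the exact specification is given in the methodology section rather than in this caption.

```latex
% Tail index as a function of the GPD shape parameter (xi > 0)
\[
  \alpha = \frac{1}{\xi}
  \quad\Longrightarrow\quad
  \frac{\partial \alpha}{\partial \xi} = -\frac{1}{\xi^{2}} < 0 .
\]
% Assumption: xi = g(x^T beta) with g increasing (g' > 0), so that
\[
  \frac{\partial \alpha}{\partial x_j}
  = -\frac{g'(x^{\top}\beta)\,\beta_j}{\xi^{2}},
  \qquad
  \operatorname{sign}\!\left(\frac{\partial \alpha}{\partial x_j}\right)
  = -\operatorname{sign}(\beta_j).
\]
```

A positive coefficient therefore raises ξ and lowers α, which corresponds to a heavier tail.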

Table 16: Panel A

GPD Regression (without TIV* and occupancy)

%Obs

Const. Emerg. After Crisis Interact.

15 Coeff 0.324 -1.285 -0.346 5.763 t-stat 1479.72 196.96 0.52 1.74

20 Coeff 0.336 -1.020 -0.052 1.426 t-stat 1.64 2.88 0.11 1.93

25 Coeff 0.285 -0.617 0.211 0.627 t-stat 1.59 1.66 0.54 1.03

30 Coeff 0.433 -0.621 -0.139 0.737 t-stat 2.21 1.71 0.42 1.38

35 Coeff 0.604 -0.299 -0.389 0.477 t-stat 2.82 0.74 1.21 0.89

40 Coeff 0.553 -0.285 -0.360 0.611 t-stat 2.98 0.84 1.25 1.22

45 Coeff 0.526 -0.178 -0.332 0.494 t-stat 3.10 0.57 1.23 1.09

50 Coeff 0.545 -0.007 -0.364 0.570 t-stat 3.25 0.02 1.39 1.19

55 Coeff 0.602 -0.051 -0.458 0.752 t-stat 3.53 0.17 1.78 1.67

60 Coeff 0.673 0.087 -0.547 0.687 t-stat 3.80 0.26 2.11 1.52

65 Coeff 0.682 0.282 -0.534 0.739 t-stat 3.86 0.80 2.05 1.55

70 Coeff 0.692 0.649 -0.553 0.348 t-stat 3.93 1.67 2.14 0.71

75 Coeff 0.717 0.764 -0.583 0.433 t-stat 4.02 1.95 2.25 0.87

80 Coeff 0.722 0.994 -0.592 0.202 t-stat 4.06 2.50 2.29 0.41

85 Coeff 0.732 1.310 -0.574 -0.160 t-stat 4.10 3.20 2.19 0.32

90 Coeff 0.746 1.624 -0.589 -0.449 t-stat 4.14 3.86 2.25 0.90

95 Coeff 0.738 1.705 -0.519 -0.600 t-stat 4.13 4.26 1.92 1.24

100 Coeff 0.750 1.621 -0.534 -0.597 t-stat 4.18 4.38 1.97 1.32

Table 16: Panel B

GPD Regression (without TIV*)

%Obs

Const. Manufac. Emerg. After Crisis Interact.

15 Coeff 0.187 0.039 -1.171 -0.523 5.943 t-stat 618.01 15.21 178.15 0.95 1.81

20 Coeff -0.975 1.302 -0.013 0.029 0.026 t-stat 176.19 7.13 2.40 1.28 0.46

25 Coeff -0.964 1.345 -0.020 0.010 0.012 t-stat 91.65 9.24 2.26 0.71 0.78

30 Coeff -0.955 1.358 -0.021 0.025 -0.003 t-stat 63.44 9.90 1.74 1.55 0.14

35 Coeff -0.415 1.028 -0.427 -0.161 0.383 t-stat 0.87 3.55 0.97 0.38 0.71

40 Coeff 0.538 0.010 -0.241 -0.364 0.581 t-stat 1.25 0.02 0.66 1.26 1.11

45 Coeff 0.560 -0.050 -0.142 -0.333 0.490 t-stat 1.50 0.13 0.43 1.24 1.05

50 Coeff 0.903 -0.426 0.004 -0.342 0.630 t-stat 2.25 1.07 0.01 1.30 1.30

55 Coeff 0.779 -0.221 -0.047 -0.449 0.781 t-stat 2.27 0.66 0.15 1.75 1.73

60 Coeff 0.965 -0.374 0.085 -0.516 0.739 t-stat 2.66 1.05 0.26 1.99 1.63

65 Coeff 0.907 -0.293 0.279 -0.508 0.772 t-stat 2.56 0.83 0.81 1.96 1.61

70 Coeff 0.816 -0.160 0.647 -0.541 0.359 t-stat 2.43 0.48 1.67 2.10 0.73

75 Coeff 0.891 -0.219 0.768 -0.564 0.429 t-stat 2.54 0.63 1.96 2.18 0.86

80 Coeff 1.064 -0.434 1.026 -0.549 0.165 t-stat 2.84 1.18 2.57 2.13 0.33

85 Coeff 0.979 -0.314 1.341 -0.542 -0.209 t-stat 2.79 0.92 3.28 2.08 0.42

90 Coeff 1.050 -0.391 1.670 -0.546 -0.526 t-stat 2.97 1.14 3.99 2.09 1.05

95 Coeff 1.073 -0.425 1.755 -0.475 -0.690 t-stat 3.08 1.28 4.41 1.78 1.42

100 Coeff 1.130 -0.489 1.669 -0.476 -0.690 t-stat 3.34 1.53 4.57 1.79 1.53

Table 16: Panel C

GPD Regression (without occupancy)

%Obs

Const. Emerg. After Crisis Interact. LogTIV*

15 Coeff -0.644 -0.943 0.270 1.175 0.048 t-stat 1.14 3.62 0.44 1.37 1.57

20 Coeff -0.956 -0.947 -0.094 1.126 0.070 t-stat 0.36 1.42 0.18 1.27 0.55

25 Coeff 1.210 -0.466 -0.304 0.600 -0.035 t-stat 0.86 1.08 0.76 1.01 0.54

30 Coeff 1.227 -0.420 -0.614 0.705 -0.026 t-stat 0.80 0.92 1.59 1.15 0.36

35 Coeff 1.123 -0.533 -0.590 0.820 -0.021 t-stat 0.79 1.36 1.67 1.46 0.31

40 Coeff 0.314 -0.275 -0.584 0.663 0.018 t-stat 0.20 0.67 1.77 1.05 0.24

45 Coeff -0.229 -0.064 -0.542 0.671 0.046 t-stat 0.15 0.15 1.63 0.98 0.60

50 Coeff 0.435 -0.120 -0.658 0.963 0.015 t-stat 0.30 0.30 2.08 1.58 0.20

55 Coeff 0.845 0.055 -0.766 0.940 -0.003 t-stat 0.60 0.13 2.48 1.63 0.04

60 Coeff 1.519 0.141 -0.836 1.057 -0.034 t-stat 1.12 0.33 2.74 1.86 0.52

65 Coeff 1.405 0.749 -0.840 0.402 -0.027 t-stat 1.05 1.58 2.77 0.69 0.43

70 Coeff 1.637 0.883 -0.880 0.325 -0.037 t-stat 1.23 1.94 2.91 0.59 0.58

75 Coeff 1.794 0.892 -0.899 0.423 -0.044 t-stat 1.35 2.05 2.97 0.80 0.70

80 Coeff 2.177 1.071 -0.880 0.252 -0.064 t-stat 1.65 2.48 2.94 0.48 1.02

85 Coeff 2.701 1.327 -0.928 0.088 -0.089 t-stat 2.10 3.01 3.15 0.17 1.46

90 Coeff 3.228 1.512 -0.970 -0.079 -0.114 t-stat 2.62 3.46 3.34 0.15 1.97

95 Coeff 3.166 1.490 -0.869 -0.255 -0.111 t-stat 2.48 3.66 2.92 0.52 1.86

100 Coeff 3.363 1.458 -0.868 -0.258 -0.122 t-stat 2.71 3.80 2.94 0.55 2.09

Table 16: Panel D

GPD Regression

%Obs

Const. Manufac. Emerg. After Crisis Interact. LogTIV*

15 Coeff -0.629 1.293 -0.136 0.018 0.331 -0.009 t-stat 4.25 4.77 1.97 0.13 1.12 1.19

20 Coeff -0.721 1.193 -0.089 0.037 0.102 -0.005 t-stat 11.01 8.00 1.11 0.54 0.62 4.21

25 Coeff 0.329 0.833 -0.472 -0.170 0.470 -0.032 t-stat 0.09 1.53 0.62 0.30 0.51 0.23

30 Coeff 1.102 0.820 -0.799 -0.499 0.818 -0.055 t-stat 1.98 3.81 1.94 1.06 1.62 2.19

35 Coeff 1.095 0.813 -0.974 -0.644 0.922 -0.046 t-stat 1.90 3.52 3.13 1.53 2.32 1.61

40 Coeff 0.408 -0.165 -0.262 -0.562 0.677 0.019 t-stat 0.24 0.41 0.63 1.65 1.01 0.24

45 Coeff 0.395 -0.496 -0.068 -0.508 0.791 0.034 t-stat 0.23 1.13 0.16 1.47 1.09 0.42

50 Coeff 0.640 -0.192 -0.130 -0.659 1.010 0.012 t-stat 0.40 0.55 0.32 2.04 1.60 0.16

55 Coeff 1.003 -0.258 -0.001 -0.777 1.066 0.000 t-stat 0.67 0.72 0.00 2.42 1.79 0.00

60 Coeff 1.519 -0.263 0.077 -0.834 1.167 -0.024 t-stat 1.08 0.74 0.19 2.65 2.03 0.36

65 Coeff 1.375 -0.131 0.710 -0.831 0.410 -0.020 t-stat 0.97 0.38 1.51 2.68 0.70 0.30

70 Coeff 1.608 -0.162 0.835 -0.875 0.337 -0.028 t-stat 1.14 0.46 1.84 2.82 0.61 0.42

75 Coeff 1.789 -0.238 0.859 -0.888 0.445 -0.034 t-stat 1.29 0.65 1.98 2.87 0.84 0.52

80 Coeff 2.510 -0.352 1.070 -0.890 0.271 -0.066 t-stat 1.88 0.92 2.49 2.90 0.52 1.06

85 Coeff 3.067 -0.427 1.427 -0.896 -0.016 -0.091 t-stat 2.40 1.10 3.21 2.95 0.03 1.55

90 Coeff 3.467 -0.392 1.622 -0.933 -0.216 -0.112 t-stat 2.85 1.08 3.67 3.16 0.41 2.01

95 Coeff 3.230 -0.408 1.609 -0.798 -0.414 -0.100 t-stat 2.54 1.19 3.92 2.65 0.83 1.72

100 Coeff 3.507 -0.407 1.545 -0.823 -0.362 -0.114 t-stat 2.91 1.26 4.04 2.77 0.77 2.07

Table 17: The table shows results for the Hill and GPD regressions with covariates when energy-on-shore (EON) is added. The parameters are estimated for different threshold levels, ranging from 15% of observations (85th percentile) to 100% (full sample), and for different sets of covariates. All dummy variables are relative to the constant. The dummy for manufacturing (Manufac.) is relative to commercial, the dummy for developing (Emerg.) is relative to developed, and the dummy for after the crisis (After Crisis) is relative to before the crisis. Interact. is the dummy for the interaction between being in a developing country and being after the crisis. The only continuous variable is the logarithm of total insured values (LogTIV*), which controls for size effects. Significant t-statistics (above 1.96) are in bold and italics. Panel A shows results for the Hill regression without TIV*. Panel B adds LogTIV* to the Hill regression. Panel C shows results for the GPD regression without TIV*. Panel D adds LogTIV* to the GPD regression. The coefficients estimated for the GPD are relative to the ξ parameter, hence their average marginal effect on α=1/ξ has the opposite sign.

Table 17: Panel A

Hill Regression

%Obs

Const. EON Manufac. Emerg. After Crisis Interact.

15 Coeff 0.388 -0.404 -0.270 0.064 0.252 -0.082 t-stat 1.19 0.83 0.81 0.23 0.71 0.12

20 Coeff -0.024 0.015 -0.029 0.028 0.245 0.556 t-stat 0.08 0.04 0.10 0.11 0.81 1.15

25 Coeff -0.250 0.087 0.004 0.109 0.528 0.308 t-stat 0.93 0.24 0.02 0.47 2.12 0.78

30 Coeff -0.370 0.238 0.046 0.063 0.291 0.236 t-stat 1.49 0.74 0.19 0.30 1.24 0.64

35 Coeff -0.368 0.316 0.109 0.052 0.135 0.369 t-stat 1.57 1.06 0.47 0.27 0.60 1.07

40 Coeff -0.368 0.283 -0.063 0.201 0.176 0.111 t-stat 1.75 1.09 0.30 1.15 0.82 0.35

45 Coeff -0.483 0.237 -0.103 0.173 0.235 0.013 t-stat 2.48 1.00 0.53 1.04 1.19 0.04

50 Coeff -0.704 0.079 -0.047 0.160 0.114 0.181 t-stat 3.84 0.35 0.26 1.01 0.59 0.66

55 Coeff -0.722 -0.093 -0.125 0.144 0.047 0.298 t-stat 4.29 0.43 0.74 0.94 0.25 1.14

60 Coeff -0.885 -0.132 -0.068 0.125 -0.021 0.445 t-stat 5.48 0.62 0.42 0.84 0.11 1.77

65 Coeff -0.987 -0.199 -0.057 0.139 -0.033 0.483 t-stat 6.39 0.96 0.37 0.96 0.18 1.98

70 Coeff -1.139 -0.185 0.050 0.211 -0.059 0.492 t-stat 7.39 0.89 0.33 1.52 0.32 2.08

75 Coeff -1.233 -0.158 0.083 0.233 -0.078 0.529 t-stat 8.15 0.78 0.56 1.71 0.42 2.28

80 Coeff -1.288 -0.209 0.080 0.252 -0.083 0.590 t-stat 8.74 1.04 0.56 1.88 0.45 2.60

85 Coeff -1.331 -0.258 0.074 0.318 -0.064 0.529 t-stat 9.27 1.31 0.54 2.44 0.35 2.38

90 Coeff -1.338 -0.329 0.034 0.359 -0.070 0.526 t-stat 9.63 1.70 0.26 2.81 0.38 2.40

95 Coeff -1.380 -0.368 0.027 0.438 -0.018 0.398 t-stat 10.12 1.92 0.21 3.52 0.10 1.87

100 Coeff -1.399 -0.422 -0.026 0.465 -0.024 0.351 t-stat 10.60 2.24 0.21 3.82 0.14 1.67

Table 17: Panel B

Hill Regression

%Obs

Const. EON Manufac. Emerg. After Crisis Interact. LogTIV*

15 Coeff 0.939 -0.319 0.146 0.172 0.522 0.034 -0.053 t-stat 0.70 0.57 0.41 0.59 1.46 0.06 0.79

20 Coeff 0.455 0.379 0.224 0.120 0.605 0.391 -0.047 t-stat 0.38 0.91 0.71 0.43 1.95 0.88 0.80

25 Coeff 0.683 0.399 0.143 0.079 0.381 0.141 -0.059 t-stat 0.67 1.13 0.51 0.32 1.39 0.35 1.14

30 Coeff 1.007 0.460 0.279 0.020 0.132 0.350 -0.075 t-stat 1.10 1.40 1.03 0.09 0.51 0.94 1.63

35 Coeff 0.856 0.393 0.019 0.183 0.196 0.050 -0.064 t-stat 1.02 1.39 0.08 0.90 0.79 0.14 1.53

40 Coeff 0.619 0.277 -0.032 0.142 0.221 0.029 -0.057 t-stat 0.80 1.07 0.15 0.74 0.98 0.09 1.45

45 Coeff -0.154 0.084 -0.021 0.108 0.060 0.201 -0.025 t-stat 0.21 0.34 0.11 0.60 0.27 0.67 0.68

50 Coeff -0.298 -0.094 -0.097 0.101 -0.002 0.312 -0.019 t-stat 0.43 0.40 0.53 0.59 0.01 1.10 0.55

55 Coeff -0.818 -0.079 0.030 0.104 -0.041 0.445 -0.005 t-stat 1.23 0.34 0.17 0.62 0.19 1.62 0.15

60 Coeff -0.506 -0.187 -0.013 0.110 -0.118 0.501 -0.022 t-stat 0.80 0.83 0.08 0.68 0.56 1.87 0.68

65 Coeff -0.809 -0.169 0.103 0.139 -0.109 0.537 -0.013 t-stat 1.32 0.75 0.63 0.89 0.52 2.07 0.44

70 Coeff -1.034 -0.151 0.153 0.194 -0.102 0.525 -0.008 t-stat 1.73 0.69 0.95 1.27 0.49 2.06 0.27

75 Coeff -1.077 -0.216 0.164 0.186 -0.126 0.605 -0.007 t-stat 1.87 0.99 1.05 1.25 0.61 2.43 0.26

80 Coeff -1.048 -0.321 0.101 0.229 -0.077 0.537 -0.008 t-stat 1.89 1.51 0.68 1.58 0.38 2.23 0.30

85 Coeff -1.470 -0.341 0.117 0.316 -0.064 0.466 0.008 t-stat 2.67 1.61 0.81 2.21 0.31 1.94 0.31

90 Coeff -0.576 -0.309 0.104 0.244 -0.370 0.804 -0.033 t-stat 1.10 1.52 0.75 1.77 1.73 3.25 1.26

95 Coeff -1.625 -0.419 0.063 0.426 -0.019 0.352 0.014 t-stat 3.05 2.05 0.46 3.09 0.10 1.51 0.54

100 Coeff -1.122 -0.378 0.046 0.313 -0.377 0.734 -0.009 t-stat 2.17 1.91 0.35 2.35 1.74 2.99 0.35

Table 17: Panel C

GPD Regression

%Obs

Const. EON Manufac. Emerg. After Crisis Interact.

15 Coeff 0.326 0.017 0.034 -1.314 -0.691 2.644 t-stat 1942.77 1.70 155.84 319.76 1.27 2.07

20 Coeff 0.400 -0.008 0.028 -1.364 -0.352 2.055 t-stat 64.79 0.50 4.52 101.37 0.86 3.14

25 Coeff 0.332 -0.016 0.029 -1.269 0.177 1.245 t-stat 15.83 0.55 1.01 55.92 0.48 2.62

30 Coeff -0.962 1.679 1.319 -0.018 0.020 -0.003 t-stat 85.71 3.58 10.13 1.90 2.04 0.18

35 Coeff -0.949 2.071 1.461 -0.024 0.043 -0.011 t-stat 52.30 3.70 9.62 1.69 1.86 0.43

40 Coeff 0.236 0.631 0.358 0.246 -0.335 -0.172 t-stat 0.53 1.22 0.77 0.54 1.07 0.29

45 Coeff 0.429 0.318 0.138 0.008 -0.256 -0.013 t-stat 1.05 0.72 0.32 0.02 0.86 0.03

50 Coeff 0.420 0.095 0.172 0.043 -0.351 0.058 t-stat 1.20 0.25 0.46 0.15 1.33 0.14

55 Coeff 0.884 -0.363 -0.300 -0.002 -0.386 0.298 t-stat 2.39 0.87 0.81 0.01 1.48 0.71

60 Coeff 0.904 -0.329 -0.290 -0.024 -0.452 0.560 t-stat 2.60 0.80 0.84 0.09 1.79 1.31

65 Coeff 1.005 -0.429 -0.364 0.031 -0.460 0.629 t-stat 2.85 1.03 1.04 0.11 1.79 1.46

70 Coeff 0.888 -0.376 -0.135 0.275 -0.534 0.460 t-stat 2.67 0.92 0.41 0.92 2.02 1.03

75 Coeff 0.910 -0.285 -0.157 0.379 -0.557 0.535 t-stat 2.70 0.67 0.47 1.21 2.14 1.21

80 Coeff 1.013 -0.323 -0.300 0.470 -0.544 0.652 t-stat 2.82 0.72 0.84 1.48 2.11 1.48

85 Coeff 1.029 -0.335 -0.286 0.789 -0.533 0.262 t-stat 2.90 0.71 0.83 2.23 2.01 0.57

90 Coeff 1.111 -0.340 -0.383 1.009 -0.534 0.080 t-stat 3.07 0.68 1.10 2.74 2.01 0.17

95 Coeff 1.119 -0.264 -0.403 1.393 -0.466 -0.427 t-stat 3.17 0.50 1.20 3.59 1.71 0.91

100 Coeff 1.176 -0.288 -0.465 1.370 -0.469 -0.473 t-stat 3.43 0.55 1.46 3.81 1.72 1.08

Table 17: Panel D

GPD Regression

%Obs

Const. EON Manufac. Emerg. After Crisis Interact. LogTIV*

15 Coeff 1.363 0.400 0.896 -0.460 -0.064 0.707 -0.087 t-stat 2.23 2.06 3.10 1.08 0.18 2.21 2.63

20 Coeff 0.468 0.250 0.502 -0.714 0.072 0.823 -0.028 t-stat 0.28 0.96 1.75 3.37 0.17 1.50 0.30

25 Coeff 0.252 1.522 0.885 -0.341 -0.126 0.360 -0.037 t-stat 0.21 3.05 4.09 1.19 0.65 1.72 1.00

30 Coeff 0.904 1.453 0.859 -0.779 -0.500 0.760 -0.044 t-stat 0.58 2.60 2.94 1.68 1.02 1.36 0.61

35 Coeff 1.217 0.818 0.138 -0.337 -0.610 0.533 -0.032 t-stat 0.81 1.22 0.31 0.75 1.68 0.91 0.46

40 Coeff 0.981 0.353 0.074 -0.379 -0.591 0.508 -0.015 t-stat 0.69 0.74 0.16 1.06 1.73 0.98 0.23

45 Coeff -0.107 0.009 0.197 -0.242 -0.710 0.311 0.041 t-stat 0.08 0.02 0.41 0.75 2.10 0.68 0.65

50 Coeff 0.389 -0.489 -0.348 -0.138 -0.610 0.386 0.037 t-stat 0.31 1.06 0.82 0.45 1.98 0.82 0.60

55 Coeff 0.282 -0.334 -0.107 -0.080 -0.659 0.489 0.034 t-stat 0.21 0.78 0.29 0.25 2.20 0.94 0.52

60 Coeff 1.122 -0.563 -0.365 -0.148 -0.804 0.854 0.006 t-stat 0.92 1.21 0.94 0.46 2.76 1.62 0.10

65 Coeff 0.929 -0.476 -0.091 0.053 -0.806 0.657 0.009 t-stat 0.74 1.03 0.25 0.15 2.62 1.18 0.14

70 Coeff 0.941 -0.348 -0.016 0.259 -0.781 0.556 0.004 t-stat 0.76 0.76 0.05 0.73 2.55 1.07 0.06

75 Coeff 1.169 -0.311 -0.073 0.337 -0.801 0.713 -0.005 t-stat 0.93 0.66 0.20 0.93 2.62 1.40 0.08

80 Coeff 1.405 -0.509 -0.331 0.516 -0.793 0.586 -0.007 t-stat 1.12 0.96 0.84 1.40 2.59 1.20 0.11

85 Coeff 1.728 -0.498 -0.207 0.669 -0.842 0.360 -0.025 t-stat 1.34 0.91 0.57 1.69 2.68 0.74 0.39

90 Coeff 2.575 -0.504 -0.356 0.947 -0.907 0.182 -0.061 t-stat 2.05 0.84 0.96 2.28 2.96 0.38 1.01

95 Coeff 2.647 -0.575 -0.410 1.028 -0.813 -0.005 -0.062 t-stat 2.04 0.91 1.12 2.47 2.53 0.01 1.00

100 Coeff 2.965 -0.592 -0.459 1.019 -0.816 -0.052 -0.077 t-stat 2.42 0.94 1.33 2.63 2.58 0.12 1.31

Table 18: The table shows results for the Hill and GPD regressions with covariates when a dummy for the period from the Thai flood of 2011 onwards is used instead of the crisis dummy. The parameters are estimated for different threshold levels, ranging from 15% of observations (85th percentile) to 100% (full sample), and for different sets of covariates. All dummy variables are relative to the constant. The dummy for manufacturing (Manufac.) is relative to commercial, the dummy for developing (Emerg.) is relative to developed, and the dummy for the flood (Flood) is relative to the period before the flood. Interact. is the dummy for the interaction between being in a developing country and being after the flood. The only continuous variable is the logarithm of total insured values (LogTIV*), which controls for size effects. Significant t-statistics (above 1.96) are in bold and italics. The models are estimated with the exclusion of energy-on-shore. Panel A shows results for the Hill regression without TIV*. Panel B adds LogTIV* to the Hill regression. Panel C shows results for the GPD regression without TIV*. Panel D adds LogTIV* to the GPD regression. The coefficients estimated for the GPD are relative to the ξ parameter, hence their average marginal effect on α=1/ξ has the opposite sign.

Table 18: Panel A

Hill Regression

%Obs

Const. Manufac. Emerg. Flood Interact.

15 Coeff 0.390 -0.299 0.155 0.808 -0.725 t-stat 1.22 0.93 0.55 1.55 0.89

20 Coeff 0.017 -0.059 0.188 0.664 -0.690 t-stat 0.06 0.20 0.78 1.66 1.03

25 Coeff -0.124 -0.068 0.276 0.706 -0.826 t-stat 0.48 0.26 1.30 2.17 1.43

30 Coeff -0.325 0.035 0.157 0.470 -0.370 t-stat 1.34 0.14 0.80 1.57 0.73

35 Coeff -0.338 0.002 0.175 0.331 -0.208 t-stat 1.52 0.01 0.97 1.15 0.45

40 Coeff -0.410 -0.124 0.206 0.260 -0.108 t-stat 2.10 0.63 1.24 0.97 0.26

45 Coeff -0.616 -0.104 0.188 0.169 0.163 t-stat 3.40 0.57 1.20 0.65 0.44

50 Coeff -0.694 -0.156 0.245 0.155 0.260 t-stat 4.13 0.92 1.64 0.61 0.76

55 Coeff -0.904 -0.058 0.251 0.073 0.372 t-stat 5.51 0.35 1.77 0.29 1.13

60 Coeff -0.962 -0.087 0.278 0.020 0.409 t-stat 6.20 0.56 2.04 0.08 1.29

65 Coeff -1.085 -0.017 0.346 -0.005 0.461 t-stat 7.10 0.11 2.64 0.02 1.50

70 Coeff -1.239 0.063 0.399 -0.021 0.444 t-stat 8.13 0.42 3.16 0.08 1.48

75 Coeff -1.316 0.100 0.418 -0.035 0.554 t-stat 8.76 0.68 3.37 0.14 1.89

80 Coeff -1.305 0.015 0.500 -0.038 0.431 t-stat 9.12 0.11 4.18 0.15 1.49

85 Coeff -1.384 0.056 0.555 0.007 0.327 t-stat 9.73 0.41 4.75 0.03 1.16

90 Coeff -1.382 0.012 0.618 0.002 0.272 t-stat 10.04 0.09 5.43 0.01 0.98

95 Coeff -1.411 0.001 0.664 0.047 0.156 t-stat 10.46 0.01 5.96 0.20 0.58

100 Coeff -1.425 -0.053 0.671 0.038 0.126 t-stat 10.92 0.43 6.13 0.16 0.47

Table 18: Panel B

Hill Regression

%Obs

Const. Manufac. Emerg. Flood Interact. LogTIV*

15 Coeff 1.121 0.169 0.239 1.062 -0.983 -0.061 t-stat 0.83 0.47 0.83 2.37 1.39 0.91

20 Coeff 1.690 0.115 0.281 0.727 -0.800 -0.098 t-stat 1.51 0.38 1.14 1.96 1.31 1.72

25 Coeff 1.837 0.201 0.125 0.406 -0.293 -0.114 t-stat 1.85 0.72 0.56 1.22 0.55 2.25

30 Coeff 1.430 0.051 0.164 0.244 -0.170 -0.090 t-stat 1.60 0.21 0.82 0.77 0.35 2.00

35 Coeff 1.201 -0.072 0.083 0.143 0.032 -0.081 t-stat 1.48 0.33 0.45 0.48 0.07 1.95

40 Coeff 0.424 -0.051 0.127 0.041 0.289 -0.051 t-stat 0.55 0.25 0.72 0.14 0.72 1.33

45 Coeff -0.167 -0.108 0.192 0.072 0.342 -0.025 t-stat 0.23 0.58 1.15 0.25 0.92 0.68

50 Coeff -0.402 0.012 0.218 -0.003 0.415 -0.025 t-stat 0.58 0.07 1.37 0.01 1.16 0.71

55 Coeff -0.448 0.018 0.261 -0.052 0.458 -0.027 t-stat 0.67 0.10 1.72 0.19 1.32 0.81

60 Coeff -0.501 0.053 0.277 -0.095 0.562 -0.029 t-stat 0.78 0.32 1.89 0.34 1.67 0.89

65 Coeff -0.787 0.128 0.367 -0.101 0.459 -0.020 t-stat 1.26 0.78 2.60 0.36 1.38 0.65

70 Coeff -1.045 0.147 0.390 -0.104 0.499 -0.011 t-stat 1.72 0.92 2.83 0.37 1.54 0.37

75 Coeff -1.342 0.135 0.424 -0.093 0.519 0.002 t-stat 2.26 0.87 3.14 0.33 1.63 0.06

80 Coeff -1.522 0.076 0.499 -0.013 0.353 0.011 t-stat 2.63 0.51 3.78 0.05 1.14 0.37

85 Coeff -1.597 0.052 0.564 -0.018 0.289 0.014 t-stat 2.83 0.36 4.38 0.07 0.94 0.49

90 Coeff -1.590 0.073 0.617 -0.030 0.229 0.011 t-stat 2.88 0.52 4.90 0.11 0.75 0.40

95 Coeff -1.686 0.047 0.639 0.026 0.115 0.014 t-stat 3.10 0.34 5.16 0.10 0.39 0.54

100 Coeff -1.698 -0.005 0.659 0.025 0.089 0.014 t-stat 3.17 0.04 5.39 0.10 0.30 0.54

Table 18: Panel C

GPD Regression

%Obs

Const. Manufac. Emerg. Flood Interact.

15 Coeff 0.111 0.056 -1.093 -0.998 2.687 t-stat 52.84 4.19 138.08 8.90 2.50

20 Coeff -0.960 1.164 -0.025 0.000 0.185 t-stat 68.91 7.18 2.08 0.00 1.11

25 Coeff 0.023 0.293 0.154 -0.475 0.599 t-stat 0.04 0.57 0.23 1.27 0.62

30 Coeff -0.468 0.916 -0.387 -0.315 0.745 t-stat 1.45 3.36 1.36 1.02 1.21

35 Coeff 0.276 0.277 -0.083 -0.765 1.070 t-stat 0.76 0.82 0.22 2.42 1.37

40 Coeff 0.542 -0.066 -0.043 -0.723 1.122 t-stat 1.15 0.14 0.14 1.93 1.65

45 Coeff 0.563 -0.099 -0.056 -0.746 1.271 t-stat 1.41 0.24 0.21 2.19 2.12

50 Coeff 0.856 -0.428 0.100 -0.666 1.187 t-stat 2.25 1.13 0.37 2.26 2.06

55 Coeff 0.639 -0.135 0.162 -0.813 1.185 t-stat 1.79 0.37 0.61 2.62 2.21

60 Coeff 0.838 -0.312 0.300 -0.832 1.012 t-stat 2.32 0.85 1.07 2.86 2.02

65 Coeff 0.738 -0.164 0.596 -0.911 0.923 t-stat 1.95 0.41 1.93 2.93 1.73

70 Coeff 0.665 -0.057 0.816 -0.968 0.686 t-stat 2.06 0.18 2.62 3.29 1.38

75 Coeff 0.743 -0.123 0.890 -0.986 0.895 t-stat 2.15 0.35 2.91 3.31 1.73

80 Coeff 0.978 -0.409 1.218 -0.910 0.295 t-stat 2.58 1.07 3.94 3.21 0.63

85 Coeff 0.863 -0.260 1.345 -0.895 0.083 t-stat 2.41 0.72 4.44 3.04 0.18

90 Coeff 0.952 -0.357 1.490 -0.878 -0.087 t-stat 2.71 1.03 4.99 3.05 0.20

95 Coeff 1.035 -0.438 1.504 -0.849 -0.194 t-stat 3.00 1.31 5.27 2.94 0.45

100 Coeff 1.126 -0.540 1.365 -0.839 -0.118 t-stat 3.36 1.69 5.18 2.93 0.29

Table 18: Panel D

GPD Regression

%Obs

Const. Manufac. Emerg. Flood Interact. LogTIV*

15 Coeff 0.794 1.065 -0.238 -0.237 0.762 -0.071 t-stat 2.39 3.01 1.07 1.26 1.31 8.88

20 Coeff 1.298 0.831 -0.425 -0.415 1.028 -0.084 t-stat 4.24 4.32 1.95 1.67 1.93 9.70

25 Coeff 1.696 0.743 -0.656 -0.678 1.356 -0.091 t-stat 2.98 4.13 2.34 2.16 2.13 4.04

30 Coeff 1.152 -0.072 -0.301 -1.366 1.900 -0.025 t-stat 5.45 0.46 0.92 6.24 2.73 5.62

35 Coeff 1.230 0.008 -0.412 -1.443 1.914 -0.029 t-stat 9.23 0.08 1.37 6.69 3.36 3.69

40 Coeff 1.054 -0.025 -0.246 -1.282 2.004 -0.024 t-stat 10.35 0.17 0.89 7.53 3.62 4.94

45 Coeff 0.249 -0.497 0.048 -0.713 1.300 0.036 t-stat 0.19 1.20 0.15 1.94 1.87 0.58

50 Coeff -0.068 -0.182 0.090 -0.834 1.195 0.041 t-stat 0.05 0.50 0.29 2.18 1.89 0.66

55 Coeff 0.231 -0.201 0.310 -0.933 1.044 0.029 t-stat 0.16 0.51 0.94 2.10 1.63 0.40

60 Coeff 0.849 -0.093 0.452 -1.245 1.347 -0.004 t-stat 0.64 0.33 1.32 1.48 1.39 0.06

65 Coeff 0.976 -0.121 0.766 -1.233 0.826 -0.006 t-stat 0.30 0.26 2.04 0.93 0.61 0.04

70 Coeff 1.372 -0.180 0.796 -1.418 1.046 -0.021 t-stat 10.19 0.69 2.32 4.28 5.50 1.46

75 Coeff 1.409 -0.249 0.910 -1.297 0.913 -0.021 t-stat 1.57 0.67 2.70 2.72 1.50 0.50

80 Coeff 1.689 -0.356 1.148 -1.145 0.480 -0.031 t-stat 1.49 0.82 3.42 2.18 0.77 0.60

85 Coeff 1.944 -0.419 1.358 -1.244 0.362 -0.041 t-stat 2.28 0.90 4.02 3.14 0.69 1.13

90 Coeff 2.014 -0.307 1.405 -1.402 0.373 -0.048 t-stat 2.69 0.69 4.25 10.83 1.11 1.66

95 Coeff 2.091 -0.580 1.315 -1.423 0.425 -0.040 t-stat 5.15 1.77 4.33 7.77 1.17 3.25

100 Coeff 2.235 -0.597 1.207 -1.285 0.392 -0.047 t-stat 2.50 1.36 4.17 2.87 0.75 0.97

8. Price sensitivity analysis

In this section we adopt the perspective of an insurer operating in the APAC region and explore the impact of business mix on tail risk. In particular, we look at how the risk profile of a portfolio of policies, covering risks with statistical characteristics similar to those in our dataset, would change as we rebalance the underwriting strategy towards specific exposures. To do this, we fix a baseline threshold of USD 1 million and estimate the parameter α of the Pareto model with the Hill estimator. We then perform two types of analysis: one by occupancy type, where we change the percentage of CO and MA risks in the portfolio, and one by economic development, where we change the percentage of developed and developing countries in the portfolio.
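As an illustration of the estimation step, the following is a minimal sketch of the Hill estimator for a fixed threshold. The function name `hill_alpha` and the synthetic sample are assumptions made for this example and are not taken from the project's code or data.

```python
import numpy as np

def hill_alpha(losses, threshold):
    """Hill estimate of the Pareto tail index alpha for losses above a fixed threshold."""
    x = np.asarray(losses, dtype=float)
    exceed = x[x > threshold]
    if exceed.size == 0:
        raise ValueError("no losses exceed the threshold")
    # Hill estimator: number of exceedances divided by the sum of log-excesses.
    return exceed.size / np.sum(np.log(exceed / threshold))

# Synthetic check (hypothetical data, not the report's dataset):
# draw Pareto losses with a known alpha above USD 1 million.
rng = np.random.default_rng(0)
true_alpha, u = 0.5, 1_000_000.0
uniforms = 1.0 - rng.random(200_000)          # uniform on (0, 1]
sample = u * uniforms ** (-1.0 / true_alpha)  # inverse-transform sampling
alpha_hat = hill_alpha(sample, u)
```

On a sample of this size the estimate should sit close to the value used to generate the data; on real loss data the result depends on the chosen threshold.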

We report the Hill estimates for the Pareto models above USD 1 million by occupancy type in panel A of table 19, whereas panel A of table 20 reports the α parameters for developed and developing countries. Class CO has a slightly higher α than MA (0.478 versus 0.424), making it slightly less heavy tailed, while developed countries have a considerably lower α than developing countries (0.375 versus 0.515), meaning that they are heavier tailed.

The analysis of portfolio risk is performed via Monte Carlo simulation (MCS). We run 1 million Monte Carlo simulations from a mixture of two Pareto models with different weights, representing the exposure of the portfolio to a given type of risk. We also use bootstrapping, drawing 1 million samples of size 50 from the different subsamples of the empirical data. We report the average severity of the losses, where the latter are truncated at USD 10 million, as well as the 80th percentile.
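The Monte Carlo step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each simulated loss is assigned to one of the two Pareto components with probability equal to the portfolio weight, and "truncated" severity is read as the average of simulated losses falling between USD 1 and 10 million, in line with the captions of tables 19 and 20.

```python
import numpy as np

def simulate_mixture(alpha_a, alpha_b, weight_a, n_sims, threshold=1e6, seed=0):
    """Draw losses from a two-component Pareto mixture above a common threshold."""
    rng = np.random.default_rng(seed)
    # Pick the component of each simulated loss according to the portfolio weight.
    from_a = rng.random(n_sims) < weight_a
    alpha = np.where(from_a, alpha_a, alpha_b)
    # Inverse-transform sampling: X = u * U**(-1/alpha), with U uniform on (0, 1].
    uniforms = 1.0 - rng.random(n_sims)
    return threshold * uniforms ** (-1.0 / alpha)

def summarize(losses, lower=1e6, upper=1e7, q=0.80):
    """Average loss within [lower, upper] (assumed severity definition) and q-quantile."""
    inside = losses[(losses >= lower) & (losses <= upper)]
    return inside.mean(), np.quantile(losses, q)

# Equally weighted CO/MA portfolio, using the alphas reported in panel A of table 19.
losses = simulate_mixture(0.478, 0.424, weight_a=0.5, n_sims=1_000_000)
severity, q80 = summarize(losses)
```

Under these assumptions the output should be of the same order as the 50/50 row of table 19, up to simulation noise and the rounding of the reported α values.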

Panel B of table 19 shows the results of the price sensitivity analysis for different types of portfolios, where we vary the occupancy type. Row 1 simulates a portfolio with only MA risk. Row 2 simulates a portfolio with a 25% share in CO and 75% in MA. Row 3 is an equally weighted portfolio between the CO and MA classes. Row 4 simulates a portfolio with 75% of business written in the CO class and 25% in the MA class. Finally, row 5 presents a portfolio with only CO risks. With both MCS and Bootstrap, severity is stable and changes little as we modify the portfolio weights between the CO and MA classes. The 80% quantile, however, increases as the percentage of MA risk in the portfolio increases relative to the CO class.

Severity for MCS is lower than for Bootstrap. However, the MCS quantiles are generally higher (except for the portfolio with only CO risk, for which the Bootstrap quantile is marginally higher), meaning that the Pareto model puts more weight on the tail than the empirical distribution. Severity for MCS is around USD 3.2 million, while it is between USD 4.1 and 4.3 million for the Bootstrap method. The 80% quantile ranges between USD 29.0 and 44.7 million for the MCS, and between USD 29.0 and 38.0 million for the Bootstrap method. It is important to note that these are conditional quantiles, given that the loss exceeds a threshold of USD 1 million. Hence, conditional on a loss exceeding the USD 1 million level, there is a 20% chance that it will be above roughly USD 30 to 40 million, depending on the insurer’s portfolio.
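For a portfolio made up of a single class, the conditional quantile has a closed form that can be checked against the pure MA and pure CO rows of table 19. The derivation below is a sketch; the symbols u (threshold), p (probability level), and q_p (conditional quantile) are notation introduced here.

```latex
% Conditional survival function of a Pareto loss above the threshold u
\[
  \Pr(X > x \mid X > u) = \left(\frac{x}{u}\right)^{-\alpha}, \quad x \ge u
  \qquad\Longrightarrow\qquad
  q_p = u\,(1-p)^{-1/\alpha}.
\]
% With u = USD 1 million and p = 0.8:
\[
  \alpha = 0.424:\; q_{0.8} = 5^{1/0.424} \approx 44.5 \text{ million},
  \qquad
  \alpha = 0.478:\; q_{0.8} = 5^{1/0.478} \approx 29.0 \text{ million}.
\]
```

These values are close to the simulated MCS quantiles for the 100% MA and 100% CO portfolios, with the remaining gap attributable to simulation noise and the rounding of α.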

Panel B of table 20 shows the results of the price sensitivity analysis for different types of portfolios, where we vary the exposure by country type. Row 1 simulates a portfolio with only developing countries. Row 2 simulates a portfolio with a 25% share in developed and 75% in developing countries. Row 3 is an equally weighted portfolio between developed and developing countries. Row 4 simulates a portfolio with a 75% share in developed and 25% in developing countries. Finally, row 5 represents a portfolio with only developed country risk. With both MCS and Bootstrap, severity increases as we increase the share of developed countries in the portfolio relative to developing ones, with the Bootstrap method being more sensitive. The 80% quantile also increases as the percentage of developed country risk in the portfolio increases relative to developing country risk.

Severity with MCS is lower than with Bootstrap; however, the MCS quantile is higher, meaning that the Pareto model puts more weight on the tail than the empirical distribution. Severity via MCS goes from USD 3.1 to 3.3 million as the ratio of developed to developing increases, while it goes from USD 4.2 to 5.1 million with the Bootstrap method. Severity is highest in the portfolio with only developed countries. The 80% quantile ranges between USD 22.8 and 73.0 million for the MCS and between USD 21.3 and 46.3 million for the Bootstrap. Since these are conditional quantiles over a threshold of USD 1 million, there is a 20% chance that a loss will be above USD 21 to 46 million (23 to 73 with MCS), depending on the insurer’s portfolio, conditional on the loss being above USD 1 million.
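The Bootstrap columns discussed in this section can be sketched in a similar way. The text does not spell out how the 1 million samples of size 50 are combined, so the choices below are assumptions made for illustration: each draw comes from the first subsample with probability equal to the portfolio weight, and severity and the 80% quantile are computed per bootstrap sample and then averaged. The function name and the synthetic subsamples are hypothetical, and a smaller number of samples is used to keep the example light.

```python
import numpy as np

def bootstrap_portfolio(sub_a, sub_b, weight_a, n_boot=20_000, size=50,
                        lower=1e6, upper=1e7, q=0.80, seed=0):
    """Bootstrap severity and q-quantile for a weighted mix of two empirical subsamples."""
    rng = np.random.default_rng(seed)
    sub_a = np.asarray(sub_a, dtype=float)
    sub_b = np.asarray(sub_b, dtype=float)
    # For every draw, choose the subsample according to the portfolio weight,
    # then resample an observed loss from it with replacement.
    from_a = rng.random((n_boot, size)) < weight_a
    draws_a = rng.choice(sub_a, size=(n_boot, size), replace=True)
    draws_b = rng.choice(sub_b, size=(n_boot, size), replace=True)
    samples = np.where(from_a, draws_a, draws_b)
    # Per-sample severity: mean of the losses falling in [lower, upper].
    inside = (samples >= lower) & (samples <= upper)
    counts = inside.sum(axis=1)
    sums = np.where(inside, samples, 0.0).sum(axis=1)
    valid = counts > 0  # skip samples with no loss in the band
    severity = (sums[valid] / counts[valid]).mean()
    # Per-sample quantile, averaged over the bootstrap samples.
    q_level = np.quantile(samples, q, axis=1).mean()
    return severity, q_level

# Hypothetical subsamples standing in for the empirical CO and MA losses.
rng = np.random.default_rng(1)
co = 1e6 * (1.0 - rng.random(300)) ** (-1.0 / 0.478)
ma = 1e6 * (1.0 - rng.random(300)) ** (-1.0 / 0.424)
sev, q80 = bootstrap_portfolio(co, ma, weight_a=0.5)
```

Because the subsamples here are synthetic, the output is not comparable to the Bootstrap columns of tables 19 and 20; the sketch only shows the mechanics of resampling under different portfolio weights.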

Table 19: The table shows the results of the price sensitivity analysis for insurers’ portfolios with different shares in occupancy risk. All analyses are carried out above a common threshold of USD 1 million. Estimated parameters of the Pareto model for commercial and manufacturing, obtained via the Hill estimator, are reported in panel A. Panel B reports the simulation results. The ‘MCS from Pareto’ columns report results from 1 million Monte Carlo simulations from a mixture of two Pareto models (one for commercial and one for manufacturing risk) with different weights, representing the exposure of the portfolio to a given combination of risks. The ‘Bootstrap’ columns instead show results for 1 million bootstrap samples of size 50 from different subsamples of the empirical data. The severity column reports the simulated truncated average loss between USD 1 and 10 million.

Panel A

Alpha Commercial    Alpha Manufacturing
0.478               0.424

Panel B

Diversification by Occupancy

              MCS from Pareto              Bootstrap
%CO   %MA     Severity     80%-Quantile    Severity     80%-Quantile
0     100     3,267,610    44,662,723      4,286,755    37,959,024
25    75      3,244,315    39,846,516      4,223,719    35,300,511
50    50      3,234,257    35,744,576      4,168,801    33,074,711
75    25      3,210,258    32,086,974      4,119,654    31,015,769
100   0       3,187,284    28,995,511      4,071,358    29,048,506

Table 20: The table shows the results of the price sensitivity analysis for insurers’ portfolios with different shares in country risk. All analyses are carried out above a common threshold of USD 1 million. Estimated parameters of the Pareto model for developed and developing countries, obtained via the Hill estimator, are reported in panel A. Panel B reports the simulation results. The ‘MCS from Pareto’ columns report results from 1 million Monte Carlo simulations from a mixture of two Pareto models (one for developed and one for developing risk) with different weights, representing the exposure of the portfolio to a given combination of risks. The ‘Bootstrap’ columns instead show results for 1 million bootstrap samples of size 50 from different subsamples of the empirical data. The severity column reports the simulated truncated average loss between USD 1 and 10 million.

Panel A

Alpha Developed    Alpha Developing
0.375              0.515

Panel B

Diversification by Economic Development

                            MCS from Pareto              Bootstrap
%Developed   %Developing    Severity     80%-Quantile    Severity     80%-Quantile
0            100            3,142,132    22,799,495      4,152,679    21,274,137
25           75             3,183,666    29,985,268      4,320,499    27,420,642
50           50             3,230,400    40,035,807      4,519,356    33,985,140
75           25             3,277,642    53,917,778      4,765,618    40,301,381
100          0              3,334,827    72,969,358      5,102,308    46,301,724

9. Limitations

This study shares limitations that are common to data enrichment exercises for large commercial risks. They are mainly due to challenges faced at the data collection phase, to the limited granularity of the data collected, and to the small-sample biases that commonly affect studies focusing on extremes. We now address these limitations in detail. They are shared by both the London market and Singapore projects (see Biffis and Chavez, 2014).

• Data collection: Data sourcing is complicated by the fact that different departments within a company may store different information, or the same information in different formats, depending on whether the focus is on pricing, reserving, or claims, for example. Some companies may organize data on a claim basis, some on an event basis, and others on both bases. (Re)insurance companies are concerned with losses directly affecting their business, but a proper understanding of risk requires FGU losses, which in turn requires addressing the censoring and truncation issues induced by deductibles and limits. Recovering FGU losses is complicated by the fact that such information may be available only to primary insurers or brokers. This means that data sourcing is often not self-sufficient, and needs to rely on external inputs (e.g., broker submissions, loss adjusters’ reports). Losses have several important dimensions, such as fees, physical damage, business interruption, and third party liability. In this project, we computed FGU losses for settled claims by aggregating all these loss dimensions. The extension of the analysis to claims in development, or to different loss dimensions (e.g. physical damage vs. business interruption), would be extremely interesting, and may be the object of a follow-up project.

• Data quality and granularity: The link between claims and exposures is a typical challenge of data enrichment exercises. As discussed in Sections 4 and 5, an important proxy for the exposure would be the TIV at location. However, this is often not available, in particular when losses originate from large policy schedules, which may only report top location TIVs or aggregate TIVs. In some cases, only the TSI is available, leading to underestimation of the exposure. In this project, we addressed this issue by introducing the simple proxy TIV*, which aims at making the best use of what the data tell us. However, an extension of the dataset to more precise TIV information would allow us to estimate the link between claims and exposures more precisely. When it comes to data sharing, the classification of an exposure into different occupancy types is often heterogeneous, as companies develop internal systems that reflect individual operational and business considerations. In this study, we relied on the classification framework developed in Biffis and Chavez (2014). More granular approaches are in principle possible, but would require time-consuming manual analysis of loss adjusters’ reports and policy schedules.

• Small sample issues: As discussed in the methodological part of this report, the study of extremes can be severely biased in small samples, particularly in the presence of heavy tails. In this project, we started with the collection of losses for which the data provider incurred a loss above EUR 1 million in the year of occurrence, and then extended the analysis to losses below that threshold. Time constraints and data quality issues allowed us to extend the data only partially, to claims above EUR 140k. The London market component of the data was sourced according to similar criteria. A lower loss threshold, as well as the addition of more data sources, would mitigate small sample issues and improve the quality of our estimates. In the aggregated dataset, the residential (RE) and miscellaneous (MI) classes are severely underrepresented, and we cannot carry out a comparison with other classes as done in Biffis and Chavez (2014). The Energy-on-Shore (EON) class is also underrepresented, but we still managed to carry out a partial analysis of this risk class.
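The small-sample concern can be illustrated with the Hill estimator, which is used for the Pareto parameters in Tables 19 and 20. The sketch below is an illustration under stated assumptions, not the code used for this report: it uses the fixed-threshold form of the estimator (number of exceedances divided by the sum of log-excesses), simulates exact Pareto losses with a hypothetical tail index of 0.424, and compares the spread of the estimates for samples of 50 and 5,000 losses.

```python
import math
import random
import statistics


def hill_alpha(losses, threshold):
    """Hill estimate of the Pareto tail index above a fixed threshold."""
    log_excess = [math.log(x / threshold) for x in losses if x > threshold]
    if not log_excess:
        raise ValueError("no losses above the threshold")
    return len(log_excess) / sum(log_excess)


def simulate_estimates(alpha, sample_size, n_repeats,
                       threshold=1_000_000, seed=0):
    """Hill estimates from repeated exact-Pareto samples (hypothetical data)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_repeats):
        sample = [threshold * (1.0 - rng.random()) ** (-1.0 / alpha)
                  for _ in range(sample_size)]
        estimates.append(hill_alpha(sample, threshold))
    return estimates


small = simulate_estimates(0.424, sample_size=50, n_repeats=500)
large = simulate_estimates(0.424, sample_size=5_000, n_repeats=50)
print("n=50:   mean", statistics.fmean(small), "sd", statistics.stdev(small))
print("n=5000: mean", statistics.fmean(large), "sd", statistics.stdev(large))
```

For exact Pareto data, this estimator is biased upward by a factor n/(n-1), and its standard deviation shrinks roughly with the square root of the number of exceedances. Real loss data are only approximately Pareto above the threshold, so the bias in practice can be larger than in this idealized setting.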

A limitation of the sensitivity analysis carried out in Section 8 is that pricing and capital modeling considerations are limited to the severity dimension of LCRs. Clearly, it would have been extremely valuable to have information on empirical frequency, but given the challenges faced in the data collection phase, this was not feasible. Despite this drawback, the sensitivity analysis of severity is still very useful for two main reasons:

i) The study of severity offered by the IRFRC LCR dataset can be complemented by internal information on frequency available to end users.

ii) Pricing actuaries and underwriters rely on rating methodologies based on adjusting baseline curves depending on amount of insurance, occupancy characteristics, etc. Any divergence (in the statistical sense) of such tools from their empirical counterparts is often poorly understood, making it hard to pinpoint margins for prudence (if any) applied to different layers and types of exposure. Our sensitivity analysis helps to understand such divergence by distinguishing between the frequency and severity dimensions, and by providing some useful results on the latter.

Going forward, we envisage that, as data enrichment gets prioritized by insurance companies, and wider sharing agreements are developed, most of the limitations discussed above will become less material. Although competitive pressure and confidentiality issues still represent significant hurdles to studies like the present one, and may slow down the process, the returns from data enrichment are becoming more widely appreciated by the industry. We think that this study provides some convincing evidence precisely in that direction.

10. Acknowledgements

We would like to acknowledge the excellent research assistance of Xiaoxu Liu, Bowen Yang and Wei Yu. We are grateful for the generous support of the Insurance Risk and Finance Research Centre (IRFRC) at Nanyang Business School, Nanyang Technological University. We are grateful to Luke Armitage, John Buchanan, Rob Caton, Lauren Clarke-Wiest, Daniela Collis, Michel Dacorogna, Markus Gesmann, Mike Hood, Chris Kent, Sie Liang Lau, Bronek Masojada, James Slaughter, Paul Wee and Jerry Wu for valuable advice throughout the project. Any errors are our own responsibility.

11. References

Beirlant, J., Goegebeur, Y., Segers, J., & Teugels, J. (2006). Statistics of extremes: theory and applications. John Wiley & Sons.

Bernegger, S. (1997). The Swiss Re exposure curves and the MBBEFD distribution class. ASTIN Bulletin, 27(1), 99-111.

Biffis, E. and Chavez, E. (2014). Tail risk in commercial property insurance. Risks, 2(4), 393-410. doi:10.3390/risks2040393.

Buchanan, J. and Angelina, M. (2014). The hybrid reinsurance pricing method: A practitioner’s guide. Technical report, to appear in Variance.

Chavez-Demoulin, V., Embrechts, P., & Hofert, M. (2015). An extreme value approach for modeling operational risk losses depending on covariates. The Journal of Risk and Insurance.

Dacorogna, M. M., Müller, U. A., Pictet, O. V., & De Vries, C. G. (1995). The distribution of extremal foreign exchange rate returns in extremely large data sets. Technical report. Tinbergen Institute: Rotterdam, Netherlands.

Dacorogna, M. M., Müller, U. A., Pictet, O. V., & De Vries, C. G. (2001). Extremal forex returns in extremely large data sets. Extremes, 4(2), 105-127.

Desmedt, S., Snoussi, M., Chenut, X., & Walhin, J. F. (2012). Experience and exposure rating for property per risk excess of loss reinsurance revisited. ASTIN Bulletin, 42(1), 233-270.

Embrechts, P., Klüppelberg, C., & Mikosch, T. (1997). Modelling extremal events (Vol. 33). Springer Science & Business Media.

Gabaix, X. and Ibragimov, R. (2011). Rank-1/2: a simple way to improve the OLS estimation of tail exponents. Journal of Business & Economic Statistics.

Gillott, N., Carroll, P., Chamberlin, G., Hudson, B., Malde, S., Masters, G., Taylor, P., and Thomson, A. (1988). Commercial fire insurance. Technical report.

Guggisberg, D. (2004). Exposure Rating. Swiss Re Technical Publishing Series.

Hill, B. M. (1975). A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3(5), 1163-1174.

Huisman, R., Koedijk, K. G., Kool, C. J. M., & Palm, F. (2001). Tail-index estimates in small samples. Journal of Business & Economic Statistics, 19(2), 208-216.

Ibragimov, R. and Walden, J. (2007). The limits of diversification when losses may be large. Journal of Banking & Finance, 31(8), 2551-2569.

Klugman, S., Panjer, H., and Willmot, G. E. (2012). Loss models: from data to decisions. Vol. 715. John Wiley & Sons.

Michaelides, N., Brown, P., Chacko, F., Graham, M., Haynes, J., Hindley, D., Howard, S., Johnson, H., Morgan, K., Pettengell, C., Rodriguez, R., and Simmons, D. (1997). The premium rating of commercial risks. Technical report, Working Party on Premium Rating of Commercial Risks, General Insurance Convention.

Pickands III, J. (1975). Statistical inference using extreme order statistics. The Annals of Statistics, 119-131.

Riegel, U. (2010). On fire exposure rating and the impact of the risk profile type. ASTIN Bulletin, 40(2), 727-777.

Swiss Re (2012). Insuring ever-evolving commercial risks. SIGMA, 2012.

Wang, H., & Tsai, C. L. (2009). Tail index regression. Journal of the American Statistical Association, 104(487), 1233-1240.

12. Appendix A: Data Collection Details

Overview

The LCR APAC dataset is constructed by merging the hand-collected dataset from SCOR Services Asia-Pacific Pte Ltd with the Imperial IICI dataset, which was contributed by Hiscox and Liberty and developed under the lead of Dr Enrico Biffis at Imperial College London. As previously mentioned, the final dataset is called the IRFRC LCR dataset and is freely available for download from the IRFRC’s website (www.irfrc.com).

In this data appendix we provide details on the data collection exercise involving the three contributing companies.

Data definitions:

• LCR: A large commercial risk (LCR) is defined as a loss caused by man-made risks (e.g. fire, explosion). We exclude natural catastrophe events, and started by focusing on claims that made the data provider incur a loss amount of at least EUR 1 million. We then extended our dataset to include claims leading to loss amounts smaller than EUR 1 million. Given time constraints, we only partially extended the loss data by obtaining FGU losses larger than EUR 140k. One should note that any selection bias arising from the data collection exercise is driven by both data quality and reliability. Based on our experience, these two attributes are homogeneous across APAC claims from developed and developing countries.

• FGU: From-the-ground-up losses. They were sourced by using a combination of internal and external sources. The latter include loss adjusters’ reports and broker submissions.


Process followed at SCOR

Conditional on the occurrence of an event, several claims may be triggered. One reason is that different claims could have originated from different cedants, and be structured as different types of contracts. Hence, information from several claim cases had to be collected to reach a robust value for the FGU loss.

In many cases, we had to parse numerous email exchanges between clients, brokers, and the reinsurance company (among others) before being able to record an accurate value for the FGU loss with confidence. Extra care was taken to make sure that all loss references were made relative to the insured and not the intermediary.

Some of the documents we had to search for information are the following: final loss adjuster’s report, final loss advice from broker, court arbitration, loss summary table, etc. The target data collection source was the “Loss adjuster’s final report”, as it was deemed the safest data source for our data collection efforts. When in doubt, claims managers were consulted.

When no reliable information on the FGU loss could be found, or the claim belonged to a facultative contract, the FGU amount was reconstructed from the contract structure and the loss incurred by SCOR, as specified in their Claims Database. We used the resulting figure as an approximation for the FGU loss when the latter was not available. The same figure was also used for cross-validation.
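The reconstruction of an FGU amount from the contract structure can be sketched as follows. The formula actually used is not stated here, so this is a hypothetical illustration for a single excess layer: the reinsurer is assumed to take a fixed share of the loss between an attachment point and the attachment point plus a limit. All parameter names are assumptions. When the layer is exhausted, the FGU loss is only known to be at least the top of the layer, which is an instance of the censoring induced by limits discussed in Section 9.

```python
def reconstruct_fgu(incurred, share, attachment, limit):
    """Invert incurred = share * min(max(FGU - attachment, 0), limit).

    Returns (fgu_estimate, is_lower_bound). is_lower_bound is True when
    the layer is exhausted, so the true FGU loss may be higher.
    All arguments are hypothetical contract terms in the same currency.
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    if incurred <= 0:
        # A zero loss to the layer only tells us that FGU <= attachment.
        raise ValueError("incurred loss must be positive")
    layer_loss = incurred / share        # 100% loss to the layer
    if layer_loss >= limit:
        return attachment + limit, True  # censored at the top of the layer
    return attachment + layer_loss, False


# Example: a 25% share of a 5m xs 1m layer with 500k incurred.
print(reconstruct_fgu(500_000, 0.25, 1_000_000, 5_000_000))
```

A proportional contract without an attachment point is the special case with attachment equal to zero and a limit large enough never to be reached.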

We found that some events had duplicate entries, so the records had to be checked carefully to remove any duplicate information. The same procedure was followed when merging the SCOR data with the Imperial-IICI dataset.

In addition to FGU losses, we also collected claim narratives, which provide considerable help in validating the classification of an exposure into the occupancy types described in Section 4 of the report. Although the information is rich enough to allow quite granular categorization of occupancy types, sample size considerations made us focus on broadly defined occupancy types (occupancy levels one and two, according to the classification outlined in Biffis and Chavez, 2014).

The period of coverage begins in 1992 or later. However, only a negligible number of data points exist before the year 2000. In line with the Imperial-IICI dataset, we therefore focus only on data after 2000. Losses included in the final dataset are those which are fully settled and considered “permanently closed” at the time of data collection (May 2015).

Process followed at Hiscox and Liberty

The process is similar to the one described above, and was used as a template to structure the data collection exercise at SCOR. See Section 2.1 for an overview, and Biffis and Chavez (2014) for details.
