Risk Management
Topic One – Financial risk: Loss distributions and risk measures
1.1 Types of financial risks
• Systemic risk: Wall Street fails Main Street
• Black Monday in 1987: Market failure arising from program trading
• 2010 Flash crash
• Cash flow and liquidity risk
• Operational / Model / Legal risks
1.2 Hedging of market risks
• Dynamic hedging of options
• Minimum variance hedge ratio
1.3 Portfolio loss distribution
• Credit risk: Loan portfolio losses
• Fitting of loss distribution
1.4 VaR (Value-at-Risk), expected shortfall and economic capital
• VaR calculations
• Expected shortfall
• Coherent risk measures
• Risk control for expected utility-maximizing investors
• Economic capital
• Extreme value theory
1.1 Types of financial risks
Risk can be defined as loss or exposure to mischance, or more quantita-
tively as the volatility of unexpected outcomes, generally related to the
value of assets or liabilities of concern. It is best measured in terms of
probability distribution functions.
• While some firms may passively accept financial risks, others attempt
to create a competitive advantage by judicious exposure to financial
risk. This is similar to the counterparties (buyer of insurance and
insurance company) in insurance contracts. Another example is the
highly liquid credit default swap, where the counterparties are the
protection buyer and protection seller. The protection references a
single credit asset or a basket of risky assets. In both cases, these risks
must be monitored because of their potential for damage (or even
ruin).
Risk management is the process by which various risk exposures are iden-
tified, measured, and controlled.
Systemic risk: Wall Street fails Main Street
Systemic risk is the risk of loss from some catastrophic event that can
trigger a collapse in a certain industry or economy.
• The Asian turmoil of 1997 wiped off about three-fourths of the dollar
capitalization of equities in Indonesia, Korea, Malaysia, and Thailand.
• The Russian default in August 1998 sparked a global financial crisis
that culminated in the near failure of a big hedge fund, Long Term
Capital Management.
• The subprime lending crisis and mortgage meltdown triggered the
financial tsunami in 2008, which saw the bankruptcy of Lehman Brothers
and the failure of other major financial institutions, like AIG,
Citigroup and Merrill Lynch.
• The European debt crisis in 2011, caused by failure of several euro-
zone member states (PIIGS: Portugal, Italy, Ireland, Greece, Spain)
to repay or refinance their government debts, triggered crashes in the
financial markets around the globe.
Black Monday, October 19, 1987
U.S. stocks (DJIA) collapsed by about 22.6 percent, wiping out $1 trillion
in capital. This remains the largest one-day percentage decline
in the DJIA. It was a trading event, not an economic one.
• The most important factor was program trading. In its intention to
protect every single portfolio from risk, it became the largest single
source of market risk.
• The computer programs in program trading automatically began to
liquidate stocks as certain loss targets were hit, pushing prices lower.
The lower prices fueled more liquidation with stocks dropping 22%
on the day. This is an example of behavior that is rational on an
individual level, but irrational if everyone adopts the same behavior.
• One of the core assumptions of these programs was proven false: they
assumed that there are always sufficient buyers and sellers on both
sides to provide enough liquidity. The same programs also automatically
turned off all buying. Since these programs were widely adopted by
institutional investors, buying orders vanished all around the stock
market at basically the same time that these programs started selling.
Global market impacts in the aftermath
By the end of October, stock markets had fallen in Hong Kong (45.5%),
Australia (41.8%), Spain (31%), United Kingdom (26.45%). Howev-
er, the US economy was barely affected and growth actually increased
throughout 1987 and 1988, with the DJIA regaining its pre-crash closing
high of 2,722 points in early 1989. Can we invest like Buffett: buy
on fear, sell on greed?
Downward market corrections occurred in the few days prior to Black
Monday, coupled with a political crisis (Iran hit an American supertanker
with a Silkworm missile).
Market factor: excessive valuations
• 1987 had been a strong year for the stock market leading up to the
crash, as it continued the bull market that began in 1982. The stock
market and economy were diverging for the first time in the bull
market. Due to this factor, valuation for the stock market climbed to
excessive levels, with the price to earnings ratio climbing above 20.
Future estimates for earnings were trending lower, yet stock prices
were unaffected.
• Market participants were aware of these issues, but the use of portfolio
insurance (hedging a portfolio of stocks against the market risk by
short selling index futures) led many to ignore these warning signs.
This false belief ended up fueling excessive risk taking, which only
became apparent when stocks began to weaken in the days leading
up to the stock market crash.
2010 Flash Crash
The May 6, 2010 Flash Crash was a trillion-dollar stock market crash, which
started at 2:32pm and lasted for approximately 36 minutes. The DJIA
plunged 998.5 points (about 9%), most within minutes, only to recover
a large part of the loss.
• Navinder Singh Sarao (London based home trader) was trading E-
mini S&P 500 futures contracts as a spoofer (bid or offer with intent
to cancel before the orders are filled). He would put in a big order to
sell a whole bunch of futures at a price a few ticks higher than the
best offer. He would not sell any futures since he was not offering
the best price. But he had to keep constantly updating his orders to
keep them a few ticks higher than the best offer, to make sure that
he did not accidentally sell any futures as the market moved.
• Sarao built a spoofing robot (an order is canceled by an automated
algorithm if the market gets close) and traded a ton of E-mini fu-
tures during the flash crash: “62,077 E-mini S&P contracts with a
notional value of $3.5 billion” and made “approximately $879,018 in
net profits” on that day.
Cash flow risk: Long Term Capital Management
• Highly leveraged portfolios are subject to margin calls from the
lender. The portfolio manager may be forced to liquidate assets,
thereby transforming paper losses into realized losses.
• Long Term Capital Management (LTCM) was a hedge fund formed in
the mid 1990s. The hedge fund’s investment strategy was known as
convergence arbitrage.
• It would find two bonds, X and Y , issued by the same company
promising the same payoffs, with X being less liquid than Y . The
market always places a value on liquidity. As a result the price of
X would be less than the price of Y . LTCM would buy X, short Y
and wait, expecting the prices of the two bonds to converge at some
future time. Normally, with better liquidity on X provided by LTCM,
price of X moves up while price of Y moves down.
• When interest rates increased (decreased), the company expected
both bonds to move down (up) in price by about the same amount, so
that the collateral it paid on bond X would be about the same as the
collateral it received on bond Y . It therefore expected that there would
be no significant outflow of funds as a result of its collateralization
agreements.
• In August 1998, Russia defaulted on its debt and this led to what is
termed a “flight to quality” in capital markets. One result was that
investors valued liquid instruments more highly than usual and the
spreads between the prices of the liquid and illiquid instruments in
LTCM’s portfolio increased dramatically (instead of convergence).
Posting Collateral
• The prices of the bonds LTCM had bought went down and the prices
of those it had shorted increased. LTCM was required to post collat-
eral on both.
• The company was highly leveraged (about 50 times) and unable to
make the payments required under the collateralization agreements.
The result was that positions had to be closed out and there was a
total loss of about $4 billion.
• If the company had been less highly leveraged, it would probably have
been able to survive the flight to quality and could have waited for
the prices of the liquid and illiquid bonds to become closer (eventual
convergence of prices). Collateralization is meant to safeguard against
default risk. However, in many historical cases (like Orange County
with its Treasury bond positions and AIG with its senior default swap
positions), the call to post collateral led to default due to the
immediate realization of losses.
Liquidity risk
Asset liquidity risk
This arises when a transaction cannot be conducted at prevailing market
prices due to the size of the position relative to normal trading lots.
– Some assets, like Treasury bonds, have deep markets where most
positions can be liquidated easily with very little price impact.
– For other assets, like OTC (over-the-counter) derivatives or emerging
market equities, any significant transaction can quickly affect prices.
Measuring market liquidity
If a buyer is willing to bid $9.90 for a stock but a seller is only willing
to post an offer price of $10.10, then the bid-offer spread is
$10.10 − $9.90 = $0.20.
One measure of the market liquidity of an asset is its bid-offer spread.
This can be measured either as a dollar amount or as a proportion of the
asset price. The dollar bid-offer spread is
\[ p = \text{Offer price} - \text{Bid price}. \]
The proportional bid-offer spread for an asset is defined as
\[ s = \frac{\text{Offer price} - \text{Bid price}}{\text{Mid-market price}}, \]
where the mid-market price is halfway between the bid and the offer price.
One measure of the liquidity of a book is how much it would cost to liqui-
date the book in normal market conditions within a certain time. Suppose
that si is an estimate of the proportional bid-offer spread in normal market
conditions for the ith financial instrument held by a financial institution
and αi is the dollar value of the position in the instrument. We then have
\[ \text{cost of liquidation (normal market)} = \sum_{i=1}^{n} \frac{s_i \alpha_i}{2}, \]
where n is the number of positions.
Since si increases with the size of position i, holding many small positions
rather than a few large positions therefore tends to entail less liquidity
risk. Setting limits to the size of any one position can therefore be one
way of reducing liquidity trading risk.
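The spread and liquidation-cost formulas above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the two-position book are hypothetical, and only the quote (bid 9.9, offer 10.1) is taken from the text.

```python
def proportional_spread(bid: float, offer: float) -> float:
    """Proportional bid-offer spread s = (offer - bid) / mid-market price."""
    mid = 0.5 * (bid + offer)
    return (offer - bid) / mid


def liquidation_cost(positions) -> float:
    """Cost of liquidation in normal markets: sum over i of s_i * alpha_i / 2.

    positions: iterable of (s_i, alpha_i) pairs, where s_i is the proportional
    spread and alpha_i is the dollar value of position i.
    """
    return sum(0.5 * s * alpha for s, alpha in positions)


# Quote from the text: bid $9.9, offer $10.1.
s_quote = proportional_spread(9.9, 10.1)

# Hypothetical two-position book (dollar values are made up for illustration).
book = [(s_quote, 1_000_000.0), (0.005, 2_500_000.0)]
total_cost = liquidation_cost(book)

print(f"proportional spread = {s_quote:.4f}")
print(f"liquidation cost    = ${total_cost:,.2f}")
```

The factor 1/2 reflects that liquidating a position costs half the spread relative to the mid-market price.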
Loss of risk controls: Barings Bank disaster
• Nicholas Leeson, an employee of Barings Bank in the Singapore office
in 1995, had a mandate to look for arbitrage opportunities between
the Nikkei 225 futures prices on the Singapore exchange and the
Osaka exchange. Over time Leeson moved from being an arbitrageur
to being a speculator without anyone in the Barings London head
office noticing that he had changed the way he was using derivatives.
In 1994, Leeson is thought to have made $20 million for Barings,
one-fifth of the firm’s total profit. He drew a $150,000 salary with a
$1 million bonus.
• He began to make losses in 1995, which he was able to hide. He then
began to take bigger speculative positions in an attempt to recover
the losses, but this only made the losses worse. In the end,
Leeson’s total loss was close to 1 billion dollars. As a result, Barings
– a bank that had been in existence for 200 years – was wiped out.
• Lesson to be learnt: Both financial and nonfinancial corporations
must set up controls to ensure that derivatives are being used for
their intended purpose. Risk limits should be set and the activities
of traders should be monitored daily to ensure that the risk limits are
adhered to.
Similar cases continued to occur in later years
• Société Générale lost $6.7 billion in January 2008 after trader Jérôme
Kerviel took unauthorized positions on European stock index futures.
• UBS lost $2 billion in September 2011 after Kweku Adoboli took
unauthorized positions on currency swap trades.
Model risk
• Mathematicians understand the limitations of a model, while
practitioners may use it to advance their own interests.
Model risk arises when the mathematical model used to value positions
is flawed or misused. A good example is the rating of Constant
Proportion Debt Obligations (CPDOs) based on flawed mathematical
models. Investors received a coupon rate of LIBOR + 200 bps on CPDOs,
which had been rated AAA. Most CPDO investors lost almost 100% of
their investments during the financial tsunami in 2008.
• Suppose 3 different models give prices of $6 million, $7.5 million and
$8.5 million for a particular structured product. Even if the financial
institution believes the first model is the best one and plans to use
that model for daily repricing and hedging, it would ensure that the
price it charges the client is at least $8.5 million. If the product is
sold at $9 million, the potential profit will be $3 million if market
behavior follows the first model.
Legal risk
Legal risk is generally related to credit risk, since counterparties that lose
money on a transaction may try to find legal grounds for invalidating
the transaction. It may take the form of shareholder lawsuits against
counterparty corporations.
Examples
• Two municipalities in Britain had taken large positions in interest rate
swaps that turned out to produce large losses. The swaps were later
ruled invalid by the British High Court. The court decreed that the city
councils did not have the authority to enter into these transactions
and so the cities were not responsible for the losses. Their bank
counterparties had to swallow the losses.
• A Hong Kong example is the Lehman Brothers’ mini-bonds. Local
banks which sold these sophisticated products to layman (measured
by education level) investors had to swallow the losses.
1.2 Hedging of market risks
Dynamic hedging of options
A trader sells 100,000 European call options on a non-dividend-paying
stock: S = $49, X = $50, r = 5%, σ = 20%, T = 20 weeks.
Terminal payoff of a call option = \( \max(S_T - X, 0) \).
Some theoretical considerations
Let \( V(S, t) \) denote the price function of the European call option,
where S is the underlying stock price and t is the calendar time.
Write \( \Delta = \frac{\partial V}{\partial S} \) as the rate of change of V
with respect to S. Note that \( 0 \le \Delta \le 1 \) (why?). When an issuer
writes a European call option, he faces the liability of paying
\( S_T - X \) at maturity T if \( S_T > X \) (the option buyer chooses to
exercise).
The risk arises from the fluctuation of the stock price S.
Given the short position in V, the writer can hedge the liability exposure
by buying α units of the stock to make the portfolio delta neutral.
The value \( V(S, t) \) and the delta \( \Delta = \frac{\partial V}{\partial S} \)
of the call are calculated based on an option pricing model (a potential
exposure to model risk, since the hedger needs to specify the volatility
of the underlying asset price).
Let \( \pi(S, t) \) be the portfolio value, where
\[ \pi(S, t) = -V(S, t) + \alpha S. \]
The risk arises from S, so we are interested in examining
\( \frac{\partial \pi}{\partial S} \) (the variation of the portfolio value
with varying levels of S). We obtain
\[ \frac{\partial \pi}{\partial S} = -\frac{\partial V}{\partial S} + \alpha = -\Delta + \alpha. \]
If we choose α = ∆, then the portfolio becomes delta neutral.
Black and Scholes pioneered the concept of riskless hedging to derive the
option pricing model. Scholes received the Nobel Prize in Economics in
1997 in recognition of this contribution (Black passed away in 1995).
Intuitively, suppose \( \Delta = \frac{\partial V}{\partial S} = 0.3 \). A one
dollar increase in S leads to a $0.3 increase in the option value, so the
writer's liability increases by $0.3. If the writer simultaneously holds
0.3 units of stock, the net gain in the hedged portfolio is zero.
Note that ∆ has dependence on S and t, so ∆ changes dynamically over
the life of the option as S changes continuously. The number of units
of stock held by the option writer has to change dynamically over time.
This is why we use the term “dynamic hedging”.
This is not the same as the forward contract, where ∆ = 1 (always
holds one unit of the underlying asset) since the forward buyer has the
obligation to buy the underlying asset (unlike an option buyer who has
the right but not the obligation to exercise the option).
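Black-Scholes-Merton option pricing formula:
\[ V(S, t) = S N(d_1) - X e^{-r(T-t)} N(d_2), \]
where
\[ N(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\, du, \]
\[ d_1 = \frac{\ln\frac{S}{X} + \left(r + \frac{\sigma^2}{2}\right)(T - t)}{\sigma\sqrt{T - t}}
\quad\text{and}\quad
d_2 = \frac{\ln\frac{S}{X} + \left(r - \frac{\sigma^2}{2}\right)(T - t)}{\sigma\sqrt{T - t}}. \]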
Here, σ is the volatility (standard deviation of the log daily return) of the stock
price. The interest rate r is assumed to be constant since the impact of
fluctuation in r on option price is secondary.
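As a quick check of the formula, the following Python sketch evaluates the call price and delta for the trader example of this section (S = $49, X = $50, r = 5%, σ = 20%, T = 20 weeks). The function names are illustrative, and taking one year as 52 weeks is an assumption about the time convention.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bsm_call(S: float, X: float, r: float, sigma: float, tau: float):
    """Return (price, delta) of a European call; tau = T - t in years."""
    d1 = (log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)  # same as the d2 formula with r - sigma^2/2
    price = S * norm_cdf(d1) - X * exp(-r * tau) * norm_cdf(d2)
    delta = norm_cdf(d1)
    return price, delta


# Assumption: 20 weeks is converted to years as 20/52.
price, delta = bsm_call(S=49.0, X=50.0, r=0.05, sigma=0.20, tau=20 / 52)
print(f"call price = {price:.3f}, delta = {delta:.3f}")
```

Under these assumptions the values should come out close to the fair value of $2.40 and the delta of 0.522 quoted in the hedging example.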
The challenge is the specification of σ, which is the major source of
model risk. The hedger may use the implied volatility inferred from
traded options or his own choice of volatility. However, in general
neither would match the actual volatility delivered by Mother Nature.
The profit or loss depends on the difference between the actual
volatility and the implemented volatility multiplied by
\[ \frac{1}{2} S^2 \frac{\partial \Delta}{\partial S} = \frac{1}{2} S^2 \frac{\partial^2 V}{\partial S^2}, \]
where \( \frac{\partial \Delta}{\partial S} = \frac{\partial^2 V}{\partial S^2} \)
is called the gamma Γ.
As its major contribution to option trading, the option pricing theory
provides \( \Delta = \frac{\partial V}{\partial S} = N(d_1) \), which can be
found once σ is specified.
The writer has to rely on the option price formula to obtain ∆ in his
implementation of the dynamic hedging strategy.
The option pricing theory plays an important role in facilitating the
trading of options. Before the advent of the Black-Scholes-Merton theory,
trading of options was not popular since writers did not know how to
hedge and thereby earn a fee beyond the fair premium at low risk.
• When the current stock price is low, the chance of expiring in-the-
money is low, so the delta is close to zero.
• When the current stock price is high (relative to the strike price X),
the chance of expiring in-the-money is high, so the delta is close to
one.
• When the current stock price is around the strike price, the delta
is around 0.5, representing roughly equal chance of expiring in-the-
money or out-of-the-money.
Similar ideas can be extended to hedging a portfolio of instruments with
various risk sources. The challenges include
(i) identification of the risk sources;
(ii) sensitivity of the portfolio value with respect to these random risk
sources.
Some of these risk sources may not be prices of available traded
instruments; examples include macroeconomic factors (say, the Fed rate).
The Fed rate is not a directly tradeable instrument. However, there are
instruments whose values are dependent on the Fed rate. If one’s
portfolio value is dependent on the Fed rate, the portfolio risk can be
hedged by holding offsetting instruments whose values are also
dependent on the Fed rate.
The number of units held of the hedging instrument depends on the
ratio of the sensitivity of the portfolio value to that of the hedging
instrument with respect to the underlying risk factor.
Dynamic hedging of a European call position at work
At the time of the trade, the call option fair value is $2.40 and the delta
is 0.522. Suppose the amount received by the seller for the options is
$300,000 (good for the seller). Since the seller is short 100,000 options,
the value of the seller’s portfolio is −$240,000.
Immediately after the trade, the seller’s portfolio can be made delta neu-
tral by buying 52,200 shares of the underlying stock. The cost of shares
purchased = 52,200 × $49 = $2,557,800 (2,557.8 thousand).
Since the delta changes when stock price changes over the life of the
option, the trader has to adjust the stock holding amount via rebalancing
in order to maintain delta neutrality. This is called dynamic hedging.
Cash flows arising from rehedging (dynamic rebalancing) and interest
costs
The stock price falls by the end of the first week to $48.12. The delta
declines to 0.458. A long position in 45,800 shares is now required to
hedge the option position. A total of 6,400 (= 52,200− 45,800) shares
are therefore sold to maintain the delta neutrality of the hedge.
The sale realizes about $308,000 in cash (6,400 × $48.12), and the
cumulative borrowings at the end of week 1 are reduced to $2,252,300.
Note that the interest cost of one week, calculated by
2,557.8 thousand × 0.05/52 ≈ 2.5 thousand,
has to be added. The cumulative borrowings come out to be (in thousands)
2,557.8 − 308 + 2.5 = 2,252.3.
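The week 0 and week 1 bookkeeping can be reproduced with a short Python sketch. It is illustrative only: rounding delta to three decimals, simple weekly interest of r/52, and financing the whole share purchase by borrowing are assumptions chosen to match the figures quoted above.

```python
from math import erf, log, sqrt


def call_delta(S: float, X: float, r: float, sigma: float, tau: float) -> float:
    """Black-Scholes-Merton delta N(d1) of a European call."""
    d1 = (log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return 0.5 * (1.0 + erf(d1 / sqrt(2.0)))


N_OPTIONS = 100_000
X, r, sigma = 50.0, 0.05, 0.20

# Week 0: buy delta * N shares at $49, financed by borrowing (assumption).
delta0 = round(call_delta(49.00, X, r, sigma, 20 / 52), 3)
shares0 = round(delta0 * N_OPTIONS)
borrowings = shares0 * 49.00

# Week 1: accrue one week of simple interest, then rebalance at $48.12.
interest = borrowings * r / 52
delta1 = round(call_delta(48.12, X, r, sigma, 19 / 52), 3)
shares1 = round(delta1 * N_OPTIONS)
cash_from_sale = (shares0 - shares1) * 48.12
borrowings = borrowings + interest - cash_from_sale

print(f"shares held: {shares0} -> {shares1}")
print(f"cash from sale = {cash_from_sale:,.0f}, interest = {interest:,.0f}")
print(f"cumulative borrowings after week 1 = {borrowings:,.0f}")
```

Repeating the week 1 step for each later week, with the observed stock price and the remaining time to maturity, gives the full rehedging schedule.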
• During the second week, the stock price reduces to $47.37 and delta
declines again. This leads to 5,800 shares being sold at the end of
the second week.
• During the third week, the stock price increases to over $50 and delta
increases. This leads to 19,600 shares being purchased at the end of
the third week!
Toward the maturity date of the option, it becomes apparent that the
option will be exercised and delta approaches 1.0. By week 20, therefore,
the hedger owns 100,000 shares.
Since the strike price is $50, the hedger receives $5 million (= 100,000×$50) for these shares when the option is exercised so that the total cost
of hedging it is $5,263,300− $5,000,000 = $263,300.
How would you compare the fair value of the call, which is $2.4 at initiation
of the trade, with the total cost of hedging per unit of the call option,
which is $2.633?
It is necessary to adjust for the time value of money: the fair value of
the call carried forward to 20 weeks after the trade is
$2.4 × (1 + 0.05 × 20/52) ≈ $2.446. Since this is below the hedging cost
of $2.633, the seller loses if he charges only the “fair value” for the
call option.
• The higher cost of hedging when compared with the fair value may
be attributed to the overhedging due to delay in rebalancing (weekly
adjustment of hedging position). However, more frequent rebalancing
means higher transaction costs.
• Luckily, the seller received $3 per call option, which grows to
$3 × (1 + 0.05 × 20/52) ≈ $3.058 at maturity, so he retains a gain of
$3.058 − $2.633 = $0.425 per option.
• The delta-hedging procedure in effect creates a long position in the
option synthetically to neutralize the seller’s short option position.
The hedger is forced into the buy-high and sell-low trading strategy
since the hedging procedure involves selling stock just after the price
has gone down and buying stock just after the price has gone up.
Note that transaction costs have not been included.
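The per-option comparison can be checked with a few lines of arithmetic in Python. All inputs are the figures quoted in the text; the variable names are illustrative, and simple interest over 20/52 of a year is the convention the text itself uses.

```python
cost_of_hedging = 5_263_300 - 5_000_000  # dollars, from the hedging table totals
n_options = 100_000
growth = 1 + 0.05 * 20 / 52              # simple interest over 20 weeks

hedging_cost_per_option = cost_of_hedging / n_options
fair_value_at_maturity = 2.40 * growth   # fair value carried forward
premium_at_maturity = 3.00 * growth      # premium actually received, carried forward
gain_per_option = premium_at_maturity - hedging_cost_per_option

print(f"hedging cost per option   = {hedging_cost_per_option:.3f}")
print(f"fair value at maturity    = {fair_value_at_maturity:.3f}")
print(f"premium at maturity       = {premium_at_maturity:.3f}")
print(f"gain per option           = {gain_per_option:.3f}")
```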
Remarks
• As the call option expires in-the-money (ST = $57.25 and X = $50),
the total sum of the stock units purchased over the 20 weeks must be
100,000 shares. These shares can be delivered to honor the obligation
since the option buyer chooses to exercise the call.
• The fair call option premium received upfront by the writer is the
present value of the total costs of setting up the hedging procedure.
One may query whether the cost of buying 100,000 shares over time can
be covered by the option premium of $3 per unit. For example, can
the hedger cover the high cost if the stock price increases sharply (say,
up to $150 which is well above X = $50)?
It is not necessary to worry if one follows the dynamic hedging procedure
throughout the whole life of the option (not to start buying more shares
only when the stock price increases sharply). This is quite a miracle.
Indeed, the hedger almost holds the full amount of 100,000 units well
before the stock price rises to $150.
Remarks
• In case the call option expires out-of-the-money, the net number of
shares bought throughout the hedging procedure would be zero.
• Though the hedged portfolio ends up with zero number of shares held
at maturity, there is cost incurred in performing the dynamic hedging
procedure. This hedging cost is compensated by the option premium
collected at initiation.
Understanding and implementation of the dynamic hedging strategy en-
hance the growth of trading of options (like the strong warrant markets
in Hong Kong).
Minimum variance hedge ratio
Suppose a risk manager wants to hedge his exposure on asset S using
N units of hedging instrument F . One example is the hedge of jet fuel
using available heating oil futures contract. The total change in the value
of the hedged portfolio from the current time to the end of the hedging
period is given by
∆V = ∆S +N∆F.
Our goal is to find the number of units of the hedging instrument to be
used in order to minimize the variance of ∆V (which is used as a proxy
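of the risk exposure). Recalling that var(X + Y) = var(X) + 2cov(X, Y) + var(Y),
we have
\[ \sigma^2_{\Delta V} = \sigma^2_{\Delta S} + 2N\sigma_{\Delta S,\Delta F} + N^2\sigma^2_{\Delta F}. \]
The variances and covariance are expressed in dollars, not in rates of
return. The first order condition to find \( N^* \) that minimizes
\( \sigma^2_{\Delta V} \) is given by
\[ 0 = \frac{\partial \sigma^2_{\Delta V}}{\partial N} = 2\sigma_{\Delta S,\Delta F} + 2N\sigma^2_{\Delta F}. \]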
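In our later exposition, we drop ∆ from the subscripts for notational
simplicity. We obtain
\[ N^* = -\frac{\sigma_{SF}}{\sigma_F^2} = -\rho_{SF}\frac{\sigma_S}{\sigma_F},
\quad\text{where}\quad \rho_{SF} = \frac{\sigma_{SF}}{\sigma_S\sigma_F}. \]
Suppose we perform linear regression of \( \frac{\Delta S}{S} \) on
\( \frac{\Delta F}{F} \), where
\[ \frac{\Delta S}{S} = \alpha + \beta_{SF}\frac{\Delta F}{F} + \epsilon, \]
where ϵ is the error term with zero mean. The beta coefficient is known
to be \( \beta_{SF} = \sigma_{SF}/\sigma_F^2 \) (recall a similar derivation
in CAPM). At the optimal hedge \( N^* \), the variance of ∆V becomes
\[ \sigma_V^{*2} = \sigma_S^2 + \left(-\frac{\sigma_{SF}}{\sigma_F^2}\right)^2\sigma_F^2
+ 2\left(-\frac{\sigma_{SF}}{\sigma_F^2}\right)\sigma_{SF}
= \sigma_S^2 - \frac{\sigma_{SF}^2}{\sigma_F^2}. \]
The variance is reduced by \( \sigma_{SF}^2/\sigma_F^2 \). If the correlation
coefficient of F and S is close to one, then
\[ \sigma_V^{*2} = \sigma_S^2 - \rho_{SF}^2\sigma_S^2 \to 0^+ \quad\text{as}\quad \rho_{SF} \to 1^-. \]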
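Effectiveness of the hedge can be quantified by R, where
\[ R^2 = \frac{\sigma_S^2 - \sigma_V^{*2}}{\sigma_S^2}
\quad\text{or}\quad \sigma_V^* = \sigma_S\sqrt{1 - R^2}. \]
This represents the relative decrease of variance of S with the introduction
of the \( N^* \) units of the hedging instrument. Combining all the above
results, we obtain
\[ R^2 = \frac{\sigma_S^2 - \left(\sigma_S^2 - \frac{\sigma_{SF}^2}{\sigma_F^2}\right)}{\sigma_S^2}
= \frac{\sigma_{SF}^2}{\sigma_S^2\sigma_F^2} = \rho_{SF}^2. \]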
As expected, R→ 1− when ρSF → 1−. This ideal scenario of almost per-
fect correlation corresponds to the absence of basis risk. Basis risk is the
risk associated with imperfect hedging, which arises from the difference
between the price of the asset to be hedged and the price of the asset
serving as the hedge.
Example
An airline knows that it will need to purchase 10,000 metric tons of jet
fuel in three months. It wants some protection against an upturn in prices
using futures contracts.
The company can hedge using heating oil futures contracts traded on
NYMEX. The notional for one contract is 42,000 gallons. As there is no
futures contract on jet fuel, the risk manager wants to check if heating
oil could provide an efficient hedge instead. The current price of jet fuel
is $277/metric ton. The futures price of heating oil is $0.6903/gallon.
The standard deviation of the rate of change in jet fuel prices over three
months is 21.17%, that of futures is 18.59%, and the correlation is 0.8243.
Compute the following quantities.
(a) The notional and standard deviation of the unhedged fuel cost in
dollars.
(b) The optimal number of futures contracts to buy, rounded to the closest
integer.
(c) The standard deviation of the hedged fuel cost in dollars.
Solution
(a) The position notional of the jet fuel is $277 × 10,000 = $2,770,000.
The standard deviation in dollars is
σS = 0.2117 × $277 × 10,000 = $586,409.
For reference, that of one futures contract is
σF = 0.1859 × $0.6903 × 42,000 = $5,389.72
with a futures notional of $0.6903 × 42,000 = $28,992.60.
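(b) The airline company has to buy heating oil futures as protection.
First, we compute the beta of the rates of return of the two different
types of oil prices, which is
\[ \beta_{sf} = \rho_{sf}\frac{\sigma_s}{\sigma_f} = 0.8243 \times \frac{0.2117}{0.1859} = 0.9387. \]
The corresponding covariance term is
\[ \sigma_{sf} = 0.8243 \times 0.2117 \times 0.1859 = 0.03244. \]
Adjusting for the notionals, this is
\[ \sigma_{SF} = 0.03244 \times 2{,}770{,}000 \times 28{,}993 = 2{,}605{,}268{,}452. \]
The airline's exposure is a future purchase of fuel (in effect a short
position in S), so the hedge is a long futures position. The optimal
number of contracts to buy is given by
\[ N^* = \frac{\sigma_{SF}}{\sigma_F^2} = \frac{2{,}605{,}268{,}452}{5{,}389.72^2} = 89.7 \]
or 90 contracts after rounding.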
(c) We find the risk of the hedged position and effectiveness of the hedge.
The volatility of the unhedged position is σS = $586,409. The variance
of the hedged position is
    σ²S                = ($586,409)²                = +343,875,515,281
    −σ²SF/σ²F          = −(2,605,268,452/5,390)²    = −233,653,264,867
    variance (hedged)                               = +110,222,250,414
Taking the square root, the volatility of the hedged position is
\( \sigma_V^* = \$331{,}997 \). Thus the hedge has reduced the risk from
$586,409 to $331,997. Computing the R², we find that one minus the ratio
of the hedged and unhedged variances is
\[ 1 - \frac{110{,}222{,}250{,}414}{343{,}875{,}515{,}281} = 67.95\%. \]
This is exactly the square of the correlation coefficient,
0.8243² = 0.6795, which measures the effectiveness of the hedge.
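The three parts of the airline example can be recomputed with the following Python sketch. It uses only the inputs quoted in the example; the variable names are illustrative, and the number of contracts is obtained from the equivalent form ρ σS/σF of the hedge ratio, which avoids the intermediate rounding of σSF.

```python
from math import sqrt

# Inputs quoted in the example.
tons, price_per_ton = 10_000, 277.0
contract_gallons, futures_price = 42_000, 0.6903
vol_s, vol_f, rho = 0.2117, 0.1859, 0.8243

# (a) Dollar standard deviations of the fuel position and of one contract.
sigma_S = vol_s * price_per_ton * tons
sigma_F = vol_f * futures_price * contract_gallons

# (b) Number of contracts to buy: sigma_SF / sigma_F^2 = rho * sigma_S / sigma_F.
n_star = rho * sigma_S / sigma_F
contracts = round(n_star)

# (c) Volatility of the hedged position and hedge effectiveness R^2.
hedged_vol = sigma_S * sqrt(1.0 - rho**2)
r_squared = 1.0 - (hedged_vol / sigma_S) ** 2

print(f"sigma_S = {sigma_S:,.0f}, sigma_F = {sigma_F:,.2f}")
print(f"N* = {n_star:.1f} -> {contracts} contracts")
print(f"hedged volatility = {hedged_vol:,.0f}, R^2 = {r_squared:.4f}")
```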
Technical result on linear regression
Suppose we perform linear regression of S on F , where
S = α+ βF + ϵ,
where α is the intercept and β is the slope of the regression line.
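The residual ϵ can be taken to have zero mean, E(ϵ) = 0, since any
nonzero E(ϵ) can be absorbed into α. Also, the random noise ϵ is
uncorrelated with F.
Consider
\[ \begin{aligned}
\mathrm{cov}(S, F) &= \mathrm{cov}(\alpha + \beta F + \epsilon, F) \\
&= \mathrm{cov}(\alpha, F) + \beta\,\mathrm{cov}(F, F) + \mathrm{cov}(\epsilon, F) = \beta\,\mathrm{var}(F),
\end{aligned} \]
so that
\[ \beta = \frac{\mathrm{cov}(S, F)}{\mathrm{var}(F)} = \frac{\rho_{SF}\sigma_F\sigma_S}{\sigma_F^2} = \rho_{SF}\frac{\sigma_S}{\sigma_F}, \]
where \( \rho_{SF} \) is the correlation coefficient of S and F.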
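The result β = cov(S, F)/var(F) can be illustrated numerically. The sketch below fits the regression line on a small made-up sample (the data are hypothetical) and checks the two properties of the residual used in the derivation: zero mean and zero covariance with F.

```python
from statistics import mean


def ols_fit(s, f):
    """Least-squares fit of s = alpha + beta * f + eps.

    Returns (alpha, beta, residuals), with beta = cov(s, f) / var(f).
    """
    ms, mf = mean(s), mean(f)
    cov_sf = sum((si - ms) * (fi - mf) for si, fi in zip(s, f))
    var_f = sum((fi - mf) ** 2 for fi in f)
    beta = cov_sf / var_f  # the common 1/(n-1) factor cancels in the ratio
    alpha = ms - beta * mf
    residuals = [si - alpha - beta * fi for si, fi in zip(s, f)]
    return alpha, beta, residuals


# Hypothetical sample, for illustration only.
f = [1.0, 2.0, 3.0, 4.0, 5.0]
s = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha, beta, eps = ols_fit(s, f)
resid_mean = mean(eps)
resid_cov_f = sum(e * (fi - mean(f)) for e, fi in zip(eps, f))

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
print(f"mean residual = {resid_mean:.2e}, residual-F covariance sum = {resid_cov_f:.2e}")
```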
1.3 Portfolio loss distribution
Risk elements
1. Exposure at default and recovery rate, both of which are random variables.
2. Default probability.
3. Credit migration – the process of changing the creditworthiness of an
obligor as characterized by the transition probabilities from one credit
state to other credit states.
• Arrival risk: timing of the event of default, modeled by a stopping
time τ
• Magnitude risk: loss amount (exposure net of the recovery value)
Loss amount = par value (possibly plus accrued interest) –
market value of a defaultable bond
Characterization of the credit risk of loans
• Financial variables to be considered include
– default probability (DP )
– loss fraction called the loss given default (LGD)
– exposure at default (EAD)
Goal: Derive the portfolio loss distribution based on the information of
individual risks and their correlations.
Loss variable of single name
\[ L = EAD \times SEV \times \mathbf{1}_D, \]
where \( \mathbf{1}_D \) is the default indicator with \( E[\mathbf{1}_D] = DP \). Here, D is the default event that the
obligor defaults within a certain period of time. We treat severity (SEV )
of loss in case of default as a random variable with E[SEV ] = LGD.
Based on the assumption that the exposure, severity and default event
are independent, the expected loss (EL):
EL = E[L] = E[EAD]× LGD ×DP.
Here, EAD is in general stochastic and E[EAD] is the expectation of
several relevant underlying random variables.
It is common to have the situation where the severity of losses and the
default events are random variables driven by a common set of underlying
factors. In this case we need to have the information of the joint distri-
bution of SEV and 1D in order to perform the expectation calculations.
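The effect of a common driving factor can be seen in a small discrete example. In the Python sketch below, the two-state economy and all numbers are hypothetical; the point is only that the product formula EAD × LGD × DP understates the expected loss when severity and default are both worse in the same state.

```python
# Hypothetical two-state economy: (state probability, default probability, severity).
states = [
    (0.8, 0.01, 0.40),  # good state: few defaults, low severity
    (0.2, 0.10, 0.80),  # bad state: more defaults, high severity
]
EAD = 1_000_000.0       # deterministic exposure (made up)

# Unconditional default probability and mean severity.
DP = sum(w * dp for w, dp, _ in states)
LGD = sum(w * sev for w, _, sev in states)

# Product formula, valid only under independence of SEV and the default event.
el_independent = EAD * LGD * DP

# Expected loss from the joint distribution: E[SEV * 1_D] summed over states.
el_joint = EAD * sum(w * dp * sev for w, dp, sev in states)

print(f"DP = {DP:.3f}, LGD = {LGD:.2f}")
print(f"EL (independence formula) = {el_independent:,.0f}")
print(f"EL (joint distribution)   = {el_joint:,.0f}")
```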
Unexpected loss – standard deviation of L
As a measure of the magnitude of the deviation of losses from the EL,
a natural choice is the standard deviation of the loss variable L, which is
termed unexpected loss in the risk management literature.
\[ \text{Unexpected loss (UL)} = \sqrt{\mathrm{var}(L)} = \sqrt{\mathrm{var}(EAD \times SEV \times \mathbf{1}_D)}. \]
Under the assumption that the severity and the default event D are
independent, and also EAD is taken to be deterministic, we have
\[ UL = EAD \times \sqrt{\mathrm{var}(SEV) \times DP + LGD^2 \times DP(1 - DP)}. \]
In contrast to EL, UL serves as the proxy for the uncertainty faced by
the bank when investing in a portfolio since UL captures the deviation
from the expectation.
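Proof
We make use of \( \mathrm{var}(X) = E[X^2] - E[X]^2 \), so that
\[ \mathrm{var}(\mathbf{1}_D) = DP(1 - DP) \]
since \( E[\mathbf{1}_D^2] = E[\mathbf{1}_D] = DP \). Assuming SEV and
\( \mathbf{1}_D \) are independent, we have
\[ \begin{aligned}
\mathrm{var}(SEV\,\mathbf{1}_D) &= E[SEV^2\,\mathbf{1}_D^2] - E[SEV\,\mathbf{1}_D]^2 \\
&= E[SEV^2]\,E[\mathbf{1}_D^2] - E[SEV]^2\,E[\mathbf{1}_D]^2 \\
&= \left(\mathrm{var}(SEV) + E[SEV]^2\right) DP - E[SEV]^2\,DP^2 \\
&= \mathrm{var}(SEV) \times DP + LGD^2 \times DP(1 - DP).
\end{aligned} \]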
When SEV is random, an additional contribution to var(SEV1D) arises
from var(SEV )×DP .
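The UL formula can be verified against a direct calculation on a small discrete example. In the Python sketch below, the exposure, default probability and two-point severity distribution are hypothetical; severity is taken to be independent of default and EAD is deterministic, as in the proof.

```python
from math import sqrt

EAD, DP = 100.0, 0.02
sev_dist = [(0.3, 0.5), (0.7, 0.5)]  # (severity value, probability), hypothetical

# Moments of the severity.
LGD = sum(v * p for v, p in sev_dist)
var_sev = sum(p * (v - LGD) ** 2 for v, p in sev_dist)

# Closed-form unexpected loss.
ul_formula = EAD * sqrt(var_sev * DP + LGD**2 * DP * (1 - DP))

# Exact distribution of L = EAD * SEV * 1_D: no default, or default with each severity.
outcomes = [(0.0, 1 - DP)] + [(EAD * v, DP * p) for v, p in sev_dist]
el = sum(x * q for x, q in outcomes)
ul_exact = sqrt(sum(q * (x - el) ** 2 for x, q in outcomes))

print(f"EL = {el:.4f}")
print(f"UL (formula) = {ul_formula:.6f}, UL (exact) = {ul_exact:.6f}")
```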
Portfolio losses
Consider a portfolio of m risky obligors
\[ L_i = EAD_i \times SEV_i \times \mathbf{1}_{D_i}, \quad i = 1, \ldots, m,
\qquad P[D_i] = E[\mathbf{1}_{D_i}] = DP_i. \]
The random portfolio loss \( L_p \) is given by
\[ L_p = \sum_{i=1}^{m} L_i = \sum_{i=1}^{m} EAD_i \times SEV_i \times \mathbf{1}_{D_i}. \]
Using the additivity of expectation, we obtain
\[ EL_p = \sum_{i=1}^{m} EL_i = \sum_{i=1}^{m} EAD_i \times LGD_i \times DP_i. \]
In the case of UL, additivity of the variances holds if the loss variables
\( L_i \) are pairwise uncorrelated. That is,
\[ \mathrm{var}\left(\sum_{i=1}^{m} L_i\right) = \sum_{i=1}^{m} \mathrm{var}(L_i), \]
provided that the \( L_i \) are uncorrelated. Unfortunately, correlations
are the “main part of the game” and a main driver of credit risk.
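The general case with non-zero correlations is given by
\[ UL_p = \sqrt{\operatorname{var}(L_p)} = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{m} EAD_i \times EAD_j \times \operatorname{cov}(SEV_i \times 1_{D_i},\, SEV_j \times 1_{D_j})}. \]
For a portfolio with constant severities, we have the following simplified
formula
\[ UL_p^2 = \sum_{i=1}^{m} \sum_{j=1}^{m} EAD_i \times EAD_j \times LGD_i \times LGD_j \times \sqrt{DP_i(1 - DP_i)\, DP_j(1 - DP_j)}\; \rho_{ij}, \]
where
\[ \rho_{ij} = \text{correlation coefficient between default events} = \frac{\operatorname{cov}(1_{D_i}, 1_{D_j})}{\sqrt{\operatorname{var}(1_{D_i})\operatorname{var}(1_{D_j})}}, \]
with \( \operatorname{var}(1_{D_i}) = DP_i(1 - DP_i) \).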
Example: Two-asset credit portfolio
Take m = 2, \( LGD_i = EAD_i = 1 \), i = 1, 2, then
\[ UL_p^2 = p_1(1 - p_1) + p_2(1 - p_2) + 2\rho\sqrt{p_1(1 - p_1)\, p_2(1 - p_2)}, \]
where \( p_i \) is the default probability of obligor i, i = 1, 2, and ρ is the
correlation coefficient.
Remarks on default correlation
(i) When ρ = 0, the two default events are uncorrelated. Under full
diversification with widely different assets of diversified classes, the
correlation is viewed as being close to zero.
(ii) When ρ > 0, the default of one counterparty increases the likelihood
that the other counterparty may also default. Consider
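\[ P[1_{D_2} = 1 \mid 1_{D_1} = 1] = \frac{P[1_{D_2} = 1,\, 1_{D_1} = 1]}{P[1_{D_1} = 1]} = \frac{E[1_{D_1} 1_{D_2}]}{p_1} = \frac{p_1 p_2 + \operatorname{cov}(1_{D_1}, 1_{D_2})}{p_1} = p_2 + \frac{\operatorname{cov}(1_{D_1}, 1_{D_2})}{p_1}. \]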
Positive correlation leads to a conditional default probability higher
than the unconditional default probability p2 of obligor 2.
• Under perfect correlation (ρ = 1) and \( p = p_1 = p_2 \), we have
\[ UL_p = 2\sqrt{p(1 - p)}. \]
This means the portfolio contains the risk of only one obligor but
with double intensity (concentration risk). The default of one obligor
makes the other obligor default almost surely.
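The two-asset formulas above can be checked numerically. The sketch below assumes unit exposures and unit LGDs as in the example; the function names and the default probabilities used are hypothetical.

```python
import math

def two_asset_ul(p1, p2, rho):
    # UL_p^2 = p1(1-p1) + p2(1-p2) + 2 rho sqrt(p1(1-p1) p2(1-p2))
    s1 = math.sqrt(p1 * (1.0 - p1))
    s2 = math.sqrt(p2 * (1.0 - p2))
    return math.sqrt(s1 ** 2 + s2 ** 2 + 2.0 * rho * s1 * s2)

def conditional_default_prob(p1, p2, rho):
    # P[D2 | D1] = p2 + cov(1_D1, 1_D2) / p1, with cov = rho * s1 * s2
    cov = rho * math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))
    return p2 + cov / p1

p = 0.01  # hypothetical default probability
ul_perfect = two_asset_ul(p, p, 1.0)               # equals 2 sqrt(p(1-p))
cond_uncorr = conditional_default_prob(p, p, 0.0)  # stays at p
cond_perfect = conditional_default_prob(p, p, 1.0) # rises to 1
```

With ρ = 1 and equal default probabilities the conditional default probability becomes 1, which matches the remark that one default makes the other almost sure.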
Matrix representation of portfolio variance UL2p in terms of individual
unexpected loss ULi
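Recall
\[ UL_p^2 = \operatorname{var}(L_p) = \operatorname{cov}(L_1 + \cdots + L_m,\, L_1 + \cdots + L_m) = \sum_{i=1}^{m} \sum_{j=1}^{m} \operatorname{cov}(L_i, L_j) = \sum_{i=1}^{m} \sum_{j=1}^{m} UL_i\, UL_j\, \rho_{ij}, \]
where \( UL_i^2 = \operatorname{var}(L_i) \) and the correlation coefficient between loss variables is
\[ \rho_{ij} = \frac{\operatorname{cov}(L_i, L_j)}{\sqrt{\operatorname{var}(L_i)}\sqrt{\operatorname{var}(L_j)}}, \quad i, j = 1, 2, \ldots, m. \]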
Suppose we write
\[ L = (UL_1 \;\; UL_2 \;\; \cdots \;\; UL_m)^T \]
as the vector of unexpected losses of the m obligors and
\[ \Omega = \begin{pmatrix} \rho_{11} & \cdots & \rho_{1m} \\ \vdots & & \vdots \\ \rho_{m1} & \cdots & \rho_{mm} \end{pmatrix} \]
as the correlation matrix, then the matrix representation of portfolio variance
is given by
\[ UL_P^2 = L^T \Omega L. \]
Since \( UL_P^2 \geq 0 \), Ω is positive semidefinite. A matrix A is said to be
positive semidefinite if \( x^T A x \geq 0 \) for all x.
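We examine how \( UL_P \) is affected by the change in \( UL_k \).
For a fixed k, consider
\begin{align*}
\frac{\partial UL_P^2}{\partial UL_k} &= \sum_{j=1}^{m} \sum_{i=1}^{m} \frac{\partial UL_i}{\partial UL_k} UL_j\, \rho_{ij} + \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{\partial UL_j}{\partial UL_k} UL_i\, \rho_{ij} \\
&= \sum_{j=1}^{m} UL_j\, \rho_{kj} + \sum_{i=1}^{m} UL_i\, \rho_{ik}.
\end{align*}
Since \( \rho_{kj} = \rho_{jk} \), we obtain
\[ \frac{\partial UL_P^2}{\partial UL_k} = 2 \sum_{j=1}^{m} UL_j\, \rho_{kj}, \]
and, since \( \partial UL_P^2 / \partial UL_k = 2\, UL_P\, \partial UL_P / \partial UL_k \), this gives
\[ \frac{\partial UL_P}{\partial UL_k} = \frac{\sum_{j=1}^{m} \rho_{kj}\, UL_j}{UL_P}. \]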
Risk contribution
Recall that \( \partial UL_p / \partial UL_i \) gives the change in the portfolio risk \( UL_p \) due to one
unit of exposure of risky asset i. The risk contribution of a risky asset i
to the portfolio unexpected loss is defined to be the incremental risk that
the exposure of a single asset contributes to the portfolio’s total risk,
namely,
\[ RC_i = UL_i \frac{\partial UL_p}{\partial UL_i} = \frac{UL_i \sum_j UL_j\, \rho_{ij}}{UL_p}. \]
Using the unexpected losses \( UL_i \) and \( UL_p \) as the quantifiers of risk, we
expect that the sum of the risk contributions from the risky assets is simply the total
portfolio risk. As a verification, it is seen mathematically that
\[ \sum_i RC_i = \frac{\sum_i UL_i \sum_j UL_j\, \rho_{ij}}{UL_p} = \frac{UL_p^2}{UL_p} = UL_p. \]
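The portfolio unexpected loss and the risk contributions can be computed directly from the vector of individual unexpected losses and the correlation matrix. The following is a minimal sketch using plain Python lists; the function names and the two-asset inputs (UL = 3 and 4, ρ = 0.5) are hypothetical.

```python
import math

def portfolio_ul(ul, rho):
    # UL_p^2 = sum_i sum_j UL_i UL_j rho_ij
    m = len(ul)
    variance = sum(ul[i] * ul[j] * rho[i][j] for i in range(m) for j in range(m))
    return math.sqrt(variance)

def risk_contributions(ul, rho):
    # RC_i = UL_i * sum_j UL_j rho_ij / UL_p
    ul_p = portfolio_ul(ul, rho)
    m = len(ul)
    return [ul[i] * sum(ul[j] * rho[i][j] for j in range(m)) / ul_p for i in range(m)]

# Hypothetical two-asset inputs.
ul = [3.0, 4.0]
rho = [[1.0, 0.5],
       [0.5, 1.0]]
ul_p = portfolio_ul(ul, rho)
rc = risk_contributions(ul, rho)
```

Here the portfolio variance is 9 + 16 + 2 × 0.5 × 12 = 37, and the two risk contributions add back to the portfolio unexpected loss.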
Calculation of EL, UL and RC for a two-asset credit portfolio
ρ       default correlation coefficient between the two exposures
EL_p    portfolio expected loss
        \( EL_p = EL_1 + EL_2 \)
UL_p    portfolio unexpected loss
        \( UL_p = \sqrt{UL_1^2 + UL_2^2 + 2\rho\, UL_1 UL_2} \)
RC_1    risk contribution from Exposure 1
        \( RC_1 = UL_1 (UL_1 + \rho\, UL_2) / UL_p \)
RC_2    risk contribution from Exposure 2
        \( RC_2 = UL_2 (UL_2 + \rho\, UL_1) / UL_p \)
        \( UL_p = RC_1 + RC_2 \)
Fitting of loss distribution
The two statistical measures of the credit portfolio are
1. the mean, called the portfolio expected loss;
2. the standard deviation, called the portfolio unexpected loss.
We approximate the loss distribution of the original portfolio by a beta
distribution through matching the first and second moments of the
portfolio loss distribution.
The risk quantiles of the original portfolio can be approximated by the
respective quantiles of the approximating random variable X. The price
for such convenience of fitting is the model risk.
Beta distribution
The density function of a beta distribution is
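\[ f(x; \alpha, \beta) = \begin{cases} \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha - 1}(1 - x)^{\beta - 1}, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases} \qquad \alpha > 0,\ \beta > 0, \]
where \( \Gamma(\alpha) = \int_0^\infty e^{-x} x^{\alpha - 1}\, dx \). The mean is \( \mu = \frac{\alpha}{\alpha + \beta} \) and the variance is \( \sigma^2 = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)} \). Determine α and β in terms of µ and σ².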
A beta distribution with only two degrees of freedom is perhaps insufficient
to give an adequate description of the tail events in the loss distribution.
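One way to carry out the moment matching is sketched below. Solving the mean and variance equations gives α = µν and β = (1 − µ)ν with ν = µ(1 − µ)/σ² − 1, which requires σ² < µ(1 − µ). The sketch assumes the portfolio loss has been expressed as a fraction of total exposure so that it lies in (0, 1); the function name and the inputs are hypothetical.

```python
def beta_from_moments(mu, sigma2):
    # Method of moments: alpha = mu * nu, beta = (1 - mu) * nu,
    # where nu = mu (1 - mu) / sigma^2 - 1 must be positive.
    if not 0.0 < mu < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    nu = mu * (1.0 - mu) / sigma2 - 1.0
    if nu <= 0.0:
        raise ValueError("variance too large for a beta distribution")
    return mu * nu, (1.0 - mu) * nu

# Hypothetical inputs: mean loss fraction 20%, variance 0.01.
alpha, beta = beta_from_moments(0.2, 0.01)

# Recompute the moments from the fitted parameters as a consistency check.
fitted_mean = alpha / (alpha + beta)
fitted_var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
```

For these inputs ν = 0.16/0.01 − 1 = 15, so α = 3 and β = 12.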
Third and fourth order moments of a distribution
• Skewness describes the departure from symmetry of a distribution:
\[ \gamma = \frac{\int_{-\infty}^{\infty} (x - E[X])^3 f(x)\, dx}{\sigma^3}. \]
The skewness of a normal distribution is zero. Positive skewness
indicates that the distribution has a long right tail and so entails large
positive values.
• Kurtosis describes the degree of flatness of a distribution:
\[ \delta = \frac{\int_{-\infty}^{\infty} (x - E[X])^4 f(x)\, dx}{\sigma^4}. \]
The kurtosis of a normal distribution is 3. A distribution with kurtosis
greater than 3 has tails that decay less quickly than those of the normal
distribution, implying a greater likelihood of large values in both tails.
Characteristics of loss distributions for different risk types
Type of risk        Second moment           Third moment    Fourth moment
                    (standard deviation)    (skewness)      (kurtosis)
Market risk         High                    Zero            Low
Credit risk         Moderate                Moderate        Moderate
Operational risk    Low                     High            High
• The market risk loss distribution is symmetrical but not perfectly
normally distributed.
• The credit risk loss distribution is quite skewed with long right tail.
• The operational risk distribution has a quite extreme shape. Most
of the times, losses are modest, but occasionally they are very large.
Loss distribution of a credit portfolio
All risk quantities on a portfolio level are based on the portfolio loss
variable Lp. Once the loss distribution is generated, all risk measures can
be calculated.
1.4 VaR (value-at-risk), expected shortfall and coherent risk measure
The Value-at-Risk (VaR) can be translated as “I am X percent certain
there will not be a loss of more than V dollars in the next N days.” If
the VaR on a risky portfolio is $1 million at one-month, 95% confidence
levels, then there is only 5% chance that the portfolio loses more than
$1 million over the next one month period.
• The variable V is the VaR of the portfolio. It is a function of (i) time
horizon (N days); (ii) confidence level (X%).
• It is the loss level over N days that has a probability of only (100−X)%
of being exceeded.
• The Bank for International Settlements proposes that banks calculate
VaR for market risk with N = 10 and X = 99.
• Weakness of volatility as the measure of risk: it does not care about
the direction of portfolio value movement. Thick “head” of upside
gain is viewed the same as thick “tail” of downside loss by volatility.
Calculation of VaR from the probability distribution of the change in
the portfolio value; confidence level is X%. Gains in portfolio value are
positive; losses are negative.
• VaR disregards the details of loss distribution beyond the VaR Level
(tail risk)
Alternative situation where VaR is the same, but the mean of loss
beyond VaR is larger.
• VaR is commonly calculated based on historical scenarios. Under
catastrophic market conditions or an extreme dependence structure of
assets (clustering effect), VaR may underestimate risk due to survival
bias. We require a default correlation model to quantify tail risk under
distressed state, which may not be captured by historical scenarios.
Formal definition
VaR is defined for some confidence level α as the α-quantile of a loss
random variable X:
\[ \operatorname{VaR}_\alpha(X) = \inf\{x \mid P[X \leq x] \geq \alpha\}. \]
For example, take α = 99% and a one-month horizon; the above definition
states that, with 99% chance, the loss amount (value of X) does not
exceed \( \operatorname{VaR}_\alpha(X) \) within the one-month period.
Banks should hold some capital cushion (economic capital) against unex-
pected losses. Using UL is not sufficient since there might be a significant
likelihood that losses will exceed portfolio’s EL by more than one standard
deviation of the portfolio loss. While VaR is determined with reference to
a given choice of α, there is no consideration of confidence level in UL.
Risk measures that rely on one absolute value and a single choice of
confidence level are subject to game playing by fund managers. A man-
ager may choose a portfolio that meets the VaR requirement but pays no
attention to the severity of losses beyond VaR.
Generalized inverse and α-quantile
Given a nondecreasing function F : R → R, the generalized inverse of F is
given by
\[ F^{\leftarrow}(y) = \inf\{x \in \mathbb{R} : F(x) \geq y\} \]
with the convention \( \inf \emptyset = \infty \).
If F is strictly increasing, then \( F^{\leftarrow} = F^{-1} \). We recover the usual inverse.
Using the generalized inverse, we define the α-quantile of F by
\[ q_\alpha(F) = F^{\leftarrow}(\alpha) = \inf\{x \in \mathbb{R} : F(x) \geq \alpha\}, \quad \alpha \in (0, 1). \]
Note that \( \operatorname{VaR}_\alpha(F) = q_\alpha(F) \), where F is the loss distribution. Also, it is
seen that
\[ \operatorname{VaR}_\alpha(aX + b) = a \operatorname{VaR}_\alpha(X) + b, \]
for a > 0 and b ∈ R.
Example 1 – Portfolio gain treated as a normal random variable
Suppose that the gain from a portfolio during six months is normally
distributed with a mean of $2 million and a standard deviation of $10
million.
Recall the cumulative normal distribution:
\[ N(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt, \]
and N(−2.33) ≈ 0.01 = 1%.
• From the properties of the normal distribution, the one-percentile
point of this distribution is 2− 2.33× 10, or -$21.3 million.
• The VaR for the portfolio with a time horizon of six months and
confidence level of 99% is therefore $21.3 million.
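Example 1 can be reproduced with the standard library's `statistics.NormalDist`. This is only a sketch of the calculation; the variable names are hypothetical. The exact 1% quantile of the standard normal is about −2.326 rather than the rounded −2.33, so the result differs slightly from $21.3 million.

```python
from statistics import NormalDist

# Six-month gain in $ millions: mean 2, standard deviation 10.
gain = NormalDist(mu=2.0, sigma=10.0)

# One-percentile point of the gain distribution (a negative number).
one_percentile_gain = gain.inv_cdf(0.01)

# The 99% VaR is the corresponding loss, quoted as a positive amount.
var_99 = -one_percentile_gain
```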
Example 2
Suppose that for a one-year project all outcomes between a loss of $50
million and a gain of $50 million are considered equally likely.
• The loss from the project has a uniform distribution extending from
−$50 million to +$50 million. There is a 1% chance that there will
be a loss greater than $49 million.
• The VaR with a one-year time horizon and a 99% confidence level is
therefore $49 million.
Calculation of VaR using historical simulation
Suppose that VaR is to be calculated for a portfolio using a 1-day time
horizon, a 99% confidence level, and 501 days of data.
• The first step is to identify the market variables affecting the portfolio.
These are typically exchange rates, equity prices, interest rates, etc.
• Data is then collected on the movements in these market variables
over the most recent 501 days. This provides 500 alternative scenarios
for what can happen between two consecutive days.
• Scenario 1 is where the percentage changes in the values of all vari-
ables are the same as they were between Day 0 and Day 1, scenario
2 is where they are the same as they were between Day 1 and Day 2,
and so on.
Data for VaR historical simulation calculation
Day    Market        Market        . . .    Market
       variable 1    variable 2             variable n
0      20.33         0.1132        . . .    65.37
1      20.78         0.1159        . . .    64.91
2      21.44         0.1162        . . .    65.02
3      20.97         0.1184        . . .    64.90
...    ...           ...           ...      ...
498    25.72         0.1312        . . .    62.22
499    25.75         0.1323        . . .    61.99
500    25.85         0.1343        . . .    62.10
Define vi as the value of a market variable on Day i and suppose that
today is Day m, say m = 500. The ith scenario assumes that the value
of the market variable tomorrow will be
\[ v_m \frac{v_i}{v_{i-1}}. \]
For the first variable, the value today, \( v_{500} \), is 25.85. Also \( v_0 = 20.33 \) and
\( v_1 = 20.78 \). It follows that the value of the first market variable in the
first scenario is \( 25.85 \times \frac{20.78}{20.33} = 26.42 \). For the second scenario, we have
\( 25.85 \times \frac{21.44}{20.78} = 26.67 \).
We generate all 500 historical scenarios for each market variable, and
repeat the calculation for all market variables.
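The scenario construction can be sketched as follows. The function name and its signature are hypothetical; the numbers are the first three observations of market variable 1 and today's value from the data table.

```python
def scenario_values(history, today_value):
    # Scenario i applies the historical ratio v_i / v_{i-1} to today's value v_m.
    return [today_value * history[i] / history[i - 1]
            for i in range(1, len(history))]

# Days 0, 1 and 2 of market variable 1, with today's value v_500 = 25.85.
first_two = scenario_values([20.33, 20.78, 21.44], 25.85)
rounded = [round(v, 2) for v in first_two]
```

Passing the full 501-day history would give all 500 scenarios for that variable.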
Scenarios generated for tomorrow (Day 501) using data in the last table
Scenario    Market        Market        . . .    Market        Portfolio value    Change in value
number      variable 1    variable 2             variable n    ($ millions)       ($ millions)
1           26.42         0.1375        . . .    61.66         23.71              0.21
2           26.67         0.1346        . . .    62.21         23.12              -0.38
3           25.28         0.1368        . . .    61.99         22.94              -0.56
...         ...           ...           ...      ...           ...                ...
499         25.88         0.1354        . . .    61.87         23.63              0.13
500         25.95         0.1363        . . .    62.21         22.87              -0.63
Under scenario 1, the market variables are assumed to move between
Day 500 and Day 501 by the same percentage changes as between Day 0 and Day 1.
Based on today’s portfolio value of 23.50, the changes in portfolio value
in scenario 1 and scenario 2 are, respectively, 23.71 − 23.50 = 0.21 and
23.12 − 23.50 = −0.38.
How to estimate the 1-percentile point of the distribution of changes in
the portfolio value?
Since there are a total of 500 scenarios, we can estimate this as the fifth
worst number in the final column of the table. With confidence level of
99%, the maximum loss does not exceed the fifth worst number. The
N-day VaR for a 99% confidence level is calculated as \( \sqrt{N} \) times the 1-day
VaR.
Query
Why the choice of \( \sqrt{N} \) as the multiplier when the time horizon is N days?
This is associated with the property of dispersion that grows at square
root of time. This is consistent with the physics of diffusion, where
diffusion distance ∼ \( \sqrt{\text{diffusion time}} \). To double the diffusion distance,
the diffusion time required is four-fold.
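The percentile step and the square-root-of-time scaling can be sketched as below. The function names are hypothetical, and the 500 changes in value are synthetic numbers chosen only to make the fifth worst outcome easy to read off; they are not the lecture's data.

```python
import math

def historical_var(changes, confidence=0.99):
    # With 500 scenarios and 99% confidence, take the fifth worst change.
    k = max(1, round(len(changes) * (1.0 - confidence)))
    worst_first = sorted(changes)
    return -worst_first[k - 1]  # quote the loss as a positive number

def scale_var(one_day_var, n_days):
    # Square-root-of-time rule.
    return math.sqrt(n_days) * one_day_var

# Synthetic changes in portfolio value: -2.50, -2.49, ..., 2.49 ($ millions).
changes = [i / 100.0 for i in range(-250, 250)]
one_day = historical_var(changes)
ten_day = scale_var(one_day, 10)
```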
Expected shortfall
The expected shortfall (tail conditional expectation) with respect to a
confidence level α is defined as
ESα(X) = E[X|X > VaRα(X)].
Let \( c = \operatorname{VaR}_\alpha(X) \) be the critical loss threshold corresponding to some
confidence level α. The expected shortfall capital provides a cushion against
the mean value of losses exceeding the critical threshold c.
The expected shortfall focuses on the expected loss in the tail, starting
at c, of the portfolio’s loss distribution.
• When the loss distribution is normal, VaR and expected shortfall give
essentially the same information. Because a normal distribution is fully
specified by its mean and standard deviation σ, both VaR and ES
(measured from the mean) are multiples of σ. For example, VaR at the
99% confidence level is 2.33σ while ES at the same level is 2.67σ.
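The two multiples quoted for the normal case can be checked numerically. The sketch relies on the standard closed form for a zero-mean normal loss, ES = σ φ(z) / (1 − α) with z the α-quantile of the standard normal and φ its density; this closed form is not stated in the lecture, and the function name is hypothetical.

```python
from statistics import NormalDist

def normal_var_es_multiples(alpha):
    # Returns (VaR / sigma, ES / sigma) for a zero-mean normal loss.
    std = NormalDist()
    z = std.inv_cdf(alpha)
    es = std.pdf(z) / (1.0 - alpha)
    return z, es

var_mult, es_mult = normal_var_es_multiples(0.99)
```

At α = 99% the multiples are roughly 2.33 and 2.67.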
Expected shortfall: E[X|X > V aRα(X)].
• The computation of the expected shortfall requires the information
on the tail distribution (extreme value distribution).
Relation between VaR and ES
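For a loss L with continuous distribution function \( F_L \) and density function
\( f_L \), the expected shortfall is given by
\[ ES_\alpha(L) = \frac{1}{1 - \alpha} \int_\alpha^1 \operatorname{VaR}_u(L)\, du. \]
To show the claim, note that
\[ ES_\alpha(L) = E[L \mid L \geq \operatorname{VaR}_\alpha] = \frac{1}{1 - \alpha} E\left[L\, 1_{\{L \geq \operatorname{VaR}_\alpha\}}\right] = \frac{1}{1 - \alpha} \int_{\operatorname{VaR}_\alpha}^{\infty} \ell f_L(\ell)\, d\ell. \]
We set \( u = F_L(\ell) \) and write ℓ as \( \operatorname{VaR}_u \) so that
u = α when \( \ell = \operatorname{VaR}_\alpha \) and u = 1 when ℓ tends to ∞.
Also, we observe \( \ell f_L(\ell)\, d\ell = \operatorname{VaR}_u\, du \). Hence, we obtain
\[ ES_\alpha(L) = \frac{1}{1 - \alpha} \int_\alpha^1 \operatorname{VaR}_u(L)\, du. \]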
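The quantile-average representation can be checked numerically for a normal loss. The sketch below approximates the integral of VaR_u over (α, 1) with a midpoint rule and compares it with the closed form µ + σ φ(z_α)/(1 − α), which is a standard result not stated in the lecture. The function names, the number of steps and the choice of a standard normal loss are assumptions.

```python
from statistics import NormalDist

def es_by_quantile_average(dist, alpha, steps=20000):
    # Midpoint rule for (1 / (1 - alpha)) * integral of VaR_u over (alpha, 1).
    h = (1.0 - alpha) / steps
    total = sum(dist.inv_cdf(alpha + (k + 0.5) * h) for k in range(steps))
    return total * h / (1.0 - alpha)

def es_normal_closed_form(dist, alpha):
    # Closed form for a normal loss: mu + sigma * phi(z_alpha) / (1 - alpha).
    std = NormalDist()
    z = std.inv_cdf(alpha)
    return dist.mean + dist.stdev * std.pdf(z) / (1.0 - alpha)

loss = NormalDist(mu=0.0, sigma=1.0)
es_numeric = es_by_quantile_average(loss, 0.99)
es_exact = es_normal_closed_form(loss, 0.99)
```

The midpoint rule never evaluates the quantile at u = 1, where it is infinite, so the sum stays finite.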
Coherent risk measures
Let X and Y be two random variables, like the dollar loss amount of two
portfolios. A risk measure γ is called a coherent measure if the following
properties hold:
1. monotonicity
For X ≤ Y, γ(X) ≤ γ(Y)
2. translation invariance
For all x ∈ R, γ(X + x) = γ(X) + x.
Here, x is a deterministic scalar quantity.
3. positive homogeneity
For all λ > 0, γ(λX) = λγ(X)
4. subadditivity (benefit of diversification)
γ(X + Y) ≤ γ(X) + γ(Y)
Financial interpretation of the four properties
1. Monotonicity: If a portfolio produces a worse result than another port-
folio for every state of the world, its risk measure should be greater.
2. Translation invariance: If an amount of cash K is added to a portfolio,
its risk measure should go down by K. This is seen by setting x = −K,
so that γ(X − K) = γ(X) − K. The risk as quantified by γ(X) can be
reduced to zero by adding cash of amount γ(X) to the credit portfolio.
3. Positive homogeneity: Changing the size of a portfolio by a positive
factor λ, while keeping the relative amounts of different items in the
portfolio the same, should result in the risk measure being multiplied
by λ.
4. Subadditivity: The risk measure for two portfolios after they have
been merged should be no greater than the sum of their individual
risk measures before they were merged. This property reflects the
benefit of diversification, in a similar spirit to
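\[ \sigma_{X+Y} \leq \sigma_X + \sigma_Y, \]
where \( \sigma_X \) denotes the standard deviation of X. Note that
\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y \leq \sigma_X^2 + \sigma_Y^2 + 2\sigma_X\sigma_Y = (\sigma_X + \sigma_Y)^2. \]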
VaR satisfies the first three conditions. However, it does not always
satisfy the fourth one.
Example 3 – Violation of subadditivity in VaR
Suppose each of two independent projects has a probability of 0.02 of a
loss of $10 million and a probability of 0.98 of a loss of $1 million during
a one-year period. Suppose we set the confidence level α to be 97.5%.
In this example, the loss random variable X of a single project can assume
two discrete values: $1 million and $10 million.
• We have P[X = 1] = 0.98 and P[X = 10] = 0.02, so that P[X ≤ 1] =
0.98 and P[X ≤ 10] = 1. For 1 ≤ x < 10, we have P[X ≤ x] = 0.98 ≥ 0.975;
for x < 1, P[X ≤ x] = 0. The smallest value of x that satisfies
P[X ≤ x] ≥ 97.5% is x = 1, so VaR_97.5% = 1.
One-project portfolio
P [X = 1] = 0.98 and P [X = 10] = 0.02 so that P [X ≤ 1] = 0.98 and
P [X ≤ 10] = 1. The distribution function F (x) = P [X ≤ x] jumps by 0.98
and 0.02 when x crosses 1 and 10, respectively.
Two-project portfolio
P [Y = 2] = 0.9604, P [Y = 11] = 0.0392 and P [Y = 20] = 0.0004 so that
P [Y ≤ 2] = 0.9604, P [Y ≤ 11] = 0.9996 and P [Y ≤ 20] = 1.
When the projects are put in the same portfolio, there is a 0.02× 0.02 =
0.0004 probability of a loss of $20 million, a 2 × 0.02 × 0.98 = 0.0392
probability of a loss of $11 million, and a 0.98×0.98 = 0.9604 probability
of a loss of $2 million.
Let Y denote the loss random variable of the two projects. Note that
P[Y ≤ 2] = 0.9604, P[Y ≤ 11] = 0.9996 and P[Y ≤ 20] = 1. For 2 ≤ y < 11,
P[Y ≤ y] = 0.9604 < 97.5%; for 11 ≤ y < 20, P[Y ≤ y] = 0.9996 ≥ 97.5%. The
smallest value of y that satisfies P[Y ≤ y] ≥ 97.5% is y = 11, so the
one-year 97.5% VaR for the portfolio is $11 million.
The sum of the VaRs of the projects considered separately is $2 million.
The VaR of the portfolio is therefore greater than the sum of the VaRs
of the projects by $9 million. This violates the subadditivity condition.
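The VaR figures in Example 3 follow directly from the quantile definition. The sketch below applies it to a discrete loss distribution given as (loss, probability) pairs; the function names and the small numerical tolerance are assumptions.

```python
from itertools import product

def discrete_var(dist, alpha):
    # Smallest loss x with P[X <= x] >= alpha.
    cumulative = 0.0
    for loss, prob in sorted(dist):
        cumulative += prob
        if cumulative >= alpha - 1e-12:  # tolerance for floating-point sums
            return loss
    return max(loss for loss, _ in dist)

def sum_of_independent(dist_x, dist_y):
    # Distribution of X + Y for independent discrete X and Y.
    combined = {}
    for (lx, px), (ly, py) in product(dist_x, dist_y):
        combined[lx + ly] = combined.get(lx + ly, 0.0) + px * py
    return sorted(combined.items())

project = [(1.0, 0.98), (10.0, 0.02)]            # losses in $ millions
portfolio = sum_of_independent(project, project)  # losses 2, 11 and 20

var_single = discrete_var(project, 0.975)
var_portfolio = discrete_var(portfolio, 0.975)
```

The portfolio VaR of 11 exceeds the sum 1 + 1 = 2 of the stand-alone VaRs, which is the violation of subadditivity.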
Expected shortfall
In Example 3, the VaR for one of the projects considered on its own is $1
million. To calculate the expected shortfall for a 97.5% confidence level
we note that, of the 2.5% tail of the loss distribution, X is either equal
to 1 or 10. We observe 2% corresponds to a loss of $10 million and the
remaining 2.5%− 2% = 0.5% to a loss of $1 million.
• Conditional on being in the 2.5% tail of the loss distribution, there
is therefore a 0.02/0.025 = 80% probability of a loss of $10 million and
a 100% − 80% = 20% probability of a loss of $1 million. The expected
loss is 0.8 × 10 + 0.2 × 1 or $8.2 million.
• When the two projects are combined, of the 2.5% tail of the loss
distribution, 0.04% corresponds to a loss of $20 million and the re-
maining 2.5%− 0.04% = 2.46% corresponds to a loss of $11 million.
Conditional that we are in the 2.5% tail of the loss distribution, the
expected loss is therefore (0.04/2.5)×20+(2.46/2.5)×11, or $11.144
million. Since 8.2+8.2 > 11.144, the expected shortfall measure does
satisfy the subadditivity condition for this example.
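The tail averaging used above can be written as a short routine that fills the worst (1 − α) probability mass starting from the largest loss. This is a sketch for discrete distributions only; the function name is hypothetical, and the two-project distribution is typed in from the probabilities given in Example 3.

```python
def discrete_es(dist, alpha):
    # Average loss over the worst (1 - alpha) probability mass.
    tail = 1.0 - alpha
    remaining = tail
    total = 0.0
    for loss, prob in sorted(dist, reverse=True):
        take = min(prob, remaining)  # part of this atom that lies in the tail
        total += take * loss
        remaining -= take
        if remaining <= 1e-15:
            break
    return total / tail

project = [(1.0, 0.98), (10.0, 0.02)]                     # $ millions
portfolio = [(2.0, 0.9604), (11.0, 0.0392), (20.0, 0.0004)]

es_single = discrete_es(project, 0.975)
es_portfolio = discrete_es(portfolio, 0.975)
```

The results are 8.2 and 11.144, so the portfolio ES is below the sum 16.4 of the stand-alone values.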
Example 4 – Violation of subadditivity in VaR
A bank has two $10 million one-year loans, each of which has a 1.25%
chance of defaulting. If a default occurs, all losses between 0% and 100%
of the principal are equally likely. If the loan does not default, a profit of
$0.2 million is made. To simplify matters, we suppose that if one loan
defaults it is certain that the other loan will not default. We would like
to compute VaR_99%. Since the random loss variable is continuous, it
amounts to finding x such that P[X > x] = 0.01 or
\[ P[X > x \mid 1_{D_1} = 1] = \frac{0.01}{0.0125} = 80\%. \]
1. Consider first a single loan. This has a 1.25% chance of defaulting.
When a default occurs, the loss experienced is evenly distributed be-
tween zero and $10 million. Conditional on a loss being made, there
is an 80% (0.8) chance that the loss will be greater than $2 million.
Since the probability of a loss is 1.25% (0.0125), the unconditional
probability of a loss greater than $2 million is 0.8× 0.0125 = 0.01 or
1%. Mathematically, we observe
\[ P[X > 2 \mid 1_{D_1} = 1] = 0.8 \]
so that
\[ P[X > 2] = P[X > 2 \mid 1_{D_1} = 1]\, P[1_{D_1} = 1] = 0.8 \times 0.0125 = 0.01. \]
The one-year 99% VaR is therefore $2 million.
2. Consider next the portfolio of two loans. Each loan defaults 1.25%
of the time and they never default together so that
P [one loss] = 2× 1.25% = 2.5%
P [two losses] = 0.
Upon occurrence of a loan loss event, the loss amount is uniformly
distributed between zero and $10 million.
There is a 2.5% (0.025) chance of one of the loans defaulting and
conditional on this event there is a 40% (0.4) chance that the loss
on the loan that defaults is greater than $6 million. That is,
\[ P[X > 6 \mid (1_{D_1} = 1,\, 1_{D_2} = 0) \text{ or } (1_{D_2} = 1,\, 1_{D_1} = 0)] = \frac{0.01}{0.025} = 40\%. \]
The unconditional probability of a loss from a default being greater
than $6 million is therefore 0.4× 0.025 = 0.01 or 1%.
In the event that one loan defaults, a profit of $0.2 million is made on
the other loan, showing that the one-year 99% VaR is $5.8 million.
The total VaR of the loans considered separately is 2 + 2 = $4 million.
The total VaR after they have been combined in the portfolio is $1.8
million more at the value $5.8 million. This shows that the subadditivity
condition is violated.
Managers can game the VaR measure to report good risk management
while exposing the firm to substantial risks
• Computing firmwide VaR is often a formidable task to perform. The
alternative is to segment the computations by instruments and risk
drivers, and to compute separate VaR’s on tranches and desks of a
company. This is quite necessary in financial institutions since the
technological trading platforms are often desk by desk. With loss of
subadditivity, the firmwide VaR cannot be properly assessed since the
firmwide VaR may become significantly larger than the sum of all
VaRs from individual desks.
Expected shortfall
One-loan portfolio
We showed that the VaR for a single loan is $2 million. In the 1.0% tail
where X > VaR99% = 2, the loss ranges from 2 million to 10 million.
The expected shortfall from a single loan when the time horizon is one
year and the confidence level is 99% is the expected loss on the
loan conditional on a loss greater than $2 million. Since the loss is
uniformly distributed in this tail, it is halfway between $2
million and $10 million, or $6 million.
Two-loan portfolio
When one loan defaults, the other loan (by assumption) does not. The
non-defaulting loan contributes a profit of $0.2 million. Conditional on a
default, the profit/loss is uniformly distributed between a gain of $0.2
million and a loss of $9.8 million. The VaR for a portfolio consisting of
the two loans was calculated
in Example 4 as $5.8 million. The expected shortfall from the portfolio is
therefore the expected loss on the portfolio conditional on the loss being
greater than $5.8 million.
In the 1.0% tail of the loss distribution, where X > VaR99% = 5.8, the
loss ranges from 5.8 to 9.8. The expected loss, given that we are in the
part of the uniform distribution between $5.8 million and $9.8 million, is
$7.8 million. This is the expected shortfall of the portfolio.
Since $7.8 million is less than 2×$6 = $12 million, the expected shortfall
measure does satisfy the subadditivity condition.
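The VaR and ES figures of Example 4 can be reproduced with two small helpers. The sketch assumes, as in the example, that the loss on a defaulted loan is uniform on (0, principal) and that at most one loan defaults; the function names and the `offset` parameter (the profit on the surviving loan) are hypothetical.

```python
def uniform_default_var(p_default_event, principal, alpha, offset=0.0):
    # Solve p_default_event * (1 - x / principal) = 1 - alpha for the loss x
    # on the defaulted loan, then subtract any profit earned elsewhere.
    x = principal * (1.0 - (1.0 - alpha) / p_default_event)
    return x - offset

def uniform_tail_es(var, max_loss):
    # For a uniform tail, the conditional mean is the midpoint of the tail.
    return 0.5 * (var + max_loss)

# Single loan: default probability 1.25%, principal $10 million.
var_single = uniform_default_var(0.0125, 10.0, 0.99)
es_single = uniform_tail_es(var_single, 10.0)

# Two loans: one default with probability 2.5%, $0.2 million profit on the other.
var_portfolio = uniform_default_var(0.025, 10.0, 0.99, offset=0.2)
es_portfolio = uniform_tail_es(var_portfolio, 10.0 - 0.2)
```

The values come out as 2 and 6 for a single loan and 5.8 and 7.8 for the portfolio, so VaR fails subadditivity (5.8 > 4) while ES satisfies it (7.8 < 12).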
The subadditivity condition is not of purely theoretical interest. It is
not uncommon for a bank to find that, when it combines two portfolios
(e.g., its equity portfolio and its fixed income portfolio), the VaR of the
combined portfolio goes up.
Risk control for expected utility-maximizing investors
Utility-maximizing investors with VaR constraint optimally choose to con-
struct vulnerable positions that can result in large losses exceeding the
VaR level.
Example
Suppose that an investor invests 100 million yen in the following four
mutual funds:
• concentrated portfolio A, consisting of only one defaultable bond with
4% default rate;
• concentrated portfolio B, consisting of only one defaultable bond with
0.5% default rate;
• a diversified portfolio that consists of 100 defaultable bonds with 5%
default rate;
• a risk-free asset.
The profiles of all bonds in these funds are stated as follows:
• maturity is one year
• defaults are mutually independent
• recovery rate is 10%.
Bond A has higher coupon rate with higher default rate. The yield to
maturity, default rate, and recovery rate are fixed until maturity.
Profiles of bonds included in the mutual funds
                            Number of bonds    Yield to         Default     Recovery
                            included           maturity (%)     rate (%)    rate (%)
Concentrated portfolio A    1                  4.75             4.00        10
Concentrated portfolio B    1                  0.75             0.50        10
Diversified portfolio       100                5.50             5.00        10
Risk-free asset             1                  0.25             0.00        -
W    final wealth,
W_0  initial wealth,
X_1  amount invested in concentrated portfolio A,
X_2  amount invested in concentrated portfolio B,
X_3  amount invested in diversified portfolio.
Assuming logarithmic utility, the expected utility of the investor is:
\begin{align*}
E[u(W)] = {} & \sum_{n=0}^{100} 0.96 \cdot 0.995 \cdot 0.05^n \cdot 0.95^{100-n} \cdot {}_{100}C_n \cdot \ln w(1, 1) \\
& + \sum_{n=0}^{100} 0.04 \cdot 0.995 \cdot 0.05^n \cdot 0.95^{100-n} \cdot {}_{100}C_n \cdot \ln w(0.1, 1) \\
& + \sum_{n=0}^{100} 0.96 \cdot 0.005 \cdot 0.05^n \cdot 0.95^{100-n} \cdot {}_{100}C_n \cdot \ln w(1, 0.1) \\
& + \sum_{n=0}^{100} 0.04 \cdot 0.005 \cdot 0.05^n \cdot 0.95^{100-n} \cdot {}_{100}C_n \cdot \ln w(0.1, 0.1),
\end{align*}
where
\[ w(a, b) = 1.0475\, a X_1 + 1.0075\, b X_2 + 1.055\, X_3\, \frac{100 - 0.9n}{100} + 1.0025\,(W_0 - X_1 - X_2 - X_3). \]
The multiplier 1.0475 in the first term is 1 + yield to maturity of bond A,
etc.
The terms in E[u(W)] correspond to (i) non-default of the two bonds with
a = b = 1, (ii) default of the first bond and non-default of the second
bond with a = 0.1 and b = 1, etc. The index n counts the number of
defaults in the basket of 100 bonds. Note that (100 − 0.9n)/100 is the fraction
of par value received from the diversified portfolio when n of its bonds
default (the 100 − n non-defaulting bonds pay in full and each defaulted
bond recovers 10%), and \( W_0 - X_1 - X_2 - X_3 \) is the amount invested in the
risk-free asset.
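The expected utility can be evaluated by summing over the four default states of bonds A and B and over the number n of defaults in the diversified portfolio. The sketch below assumes an initial wealth of 100 (million yen), as in the example; the function names and the trial allocation are hypothetical.

```python
import math

def wealth(a, b, n, x1, x2, x3, w0=100.0):
    # Final wealth w(a, b) when n of the 100 diversified bonds default.
    return (1.0475 * a * x1 + 1.0075 * b * x2
            + 1.055 * x3 * (100 - 0.9 * n) / 100
            + 1.0025 * (w0 - x1 - x2 - x3))

def expected_log_utility(x1, x2, x3, w0=100.0):
    total = 0.0
    for a, prob_a in ((1.0, 0.96), (0.1, 0.04)):        # bond A survives / defaults
        for b, prob_b in ((1.0, 0.995), (0.1, 0.005)):  # bond B survives / defaults
            for n in range(101):
                # Binomial probability of n defaults among 100 bonds.
                prob_n = math.comb(100, n) * 0.05 ** n * 0.95 ** (100 - n)
                total += prob_a * prob_b * prob_n * math.log(
                    wealth(a, b, n, x1, x2, x3, w0))
    return total

all_risk_free = expected_log_utility(0.0, 0.0, 0.0)
trial_allocation = expected_log_utility(10.0, 0.0, 30.0)  # hypothetical weights
```

When everything is placed in the risk-free asset, final wealth is 100.25 in every state, so the expected utility reduces to ln 100.25.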
We analyze the impact of risk management with VaR and expected short-
fall on the rational investor’s decisions by solving the following five opti-
mization problems, where the holding period is one year.
1. No constraint
\[ \max_{X_1, X_2, X_3} E[u(W)]. \]
2. Constraint with VaR at the 95% confidence level
\[ \max_{X_1, X_2, X_3} E[u(W)] \quad \text{subject to VaR(95\% confidence level)} \leq 3. \]
3. Constraint with expected shortfall at the 95% confidence level
\[ \max_{X_1, X_2, X_3} E[u(W)] \quad \text{subject to expected shortfall(95\% confidence level)} \leq 3.5. \]
4. Constraint with VaR at the 99% confidence level
\[ \max_{X_1, X_2, X_3} E[u(W)] \quad \text{subject to VaR(99\% confidence level)} \leq 3. \]
5. Constraint with expected shortfall at the 99% confidence level
\[ \max_{X_1, X_2, X_3} E[u(W)] \quad \text{subject to expected shortfall(99\% confidence level)} \leq 3.5. \]
Impact in risk concentration under VaR or ES constraint
• We analyze the effect of risk management with VaR and expected
shortfall by comparing solutions (2)-(5) with solution (1).
• The solution of the optimization problem with a 95% VaR constraint
shows that the amount invested in concentrated portfolio A is greater
than that of solution (1): that is, the portfolio concentration is en-
hanced by risk management using VaR. While VaR is reduced from
3.35 (unconstrained case) to 3, the expected shortfall increases from
5.26 (unconstrained case) to 14.35.
• The figure depicts the tails of the cumulative probability distributions
of the profit-loss of the portfolios. The left tail under VaR constraint
(95% confidence level) may suffer significant loss when bond A de-
faults (this risk is not well captured by VaR95%).
95% confidence level: Portfolios obtained with (i) no constraint, (ii) VaR
constraint, (iii) ES constraint.
Cumulative distribution of profit-loss: the left tail (95% confidence level).
VaR95% can be found directly by finding the profit/loss value at 5% of
the cumulative probability. The stepwise increments of the cumulative
distribution reflect the discrete loss amount upon default of a bond in
the portfolio. The extended horizontal segment along the 4% level in the
broken curve (2) exemplifies the occurrence of default of bond A (with
significant holding of 20.1%) with default rate 4%.
Under VaR constraint
When constrained by VaR, the investor must reduce her investment in the
diversified portfolio to reduce maximum losses with a 95% confidence level.
This is made possible by increasing investments either in a concentrated
portfolio or in a risk-free asset.
• Concentrated portfolio A has little effect on VaR, since the probabil-
ity of default lies beyond the 95% confidence interval. Concentrated
portfolio A also yields a higher return than other assets, except diver-
sified portfolio. Thus, the investor chooses to invest in concentrated
portfolio A.
• Although VaR is reduced, the optimal portfolio is vulnerable due to
its concentration and larger losses under conditions beyond the VaR
level. With a high percentage holding (20.1%) of bond A, which provides a
high yield to maturity of 4.75%, VaR is kept under control while the
tail can become quite fat.
Under expected shortfall constraint
When constrained by expected shortfall, there is a higher chance that
the investor chooses optimally to reallocate his investment to a risk-free
asset, significantly reducing the portfolio risk.
• The investor cannot increase his investment in the concentrated port-
folio without affecting expected shortfall, which takes into account the
losses beyond the VaR level.
• Unlike risk management with VaR, risk management with expected
shortfall does not enhance credit concentration.
Remark
If investors can invest in assets whose loss is infrequent but large (such
as concentrated credit portfolios), the problem of tail risk can be serious.
Investors can manipulate the profit-loss distribution using those assets,
so that VaR becomes small while the tail becomes fat.
Raising the confidence level to 99%
• We examine whether raising the confidence level of VaR solves the
problem. The new table gives the results of the optimization problem
with a 99% VaR or expected shortfall constraint. It shows that when
constrained by VaR at the 99% confidence level, the investor optimally
chooses to increase his/her investment in concentrated portfolio B
because the default rate of concentrated portfolio B is 0.5%, outside
the confidence level of VaR.
• Risk management with expected shortfall reduces the potential loss
beyond the VaR level by reducing credit concentration.
• VaR may enhance credit concentration because it disregards losses
beyond the VaR level, even at high confidence levels. On the other
hand, expected shortfall reduces credit concentration because it takes
into account losses beyond the VaR level as a conditional expectation.
99% confidence level: Portfolios obtained with (i) no constraint, (ii) VaR
constraint, (iii) ES constraint.
• With a higher confidence level, it is expected that the portfolio obtained
with "no constraint" would have higher VaR and ES.
Cumulative distribution P [X ≤ x], X is the random profit
Cumulative distribution of profit-loss when the tail risk of VaR occurs.
Since we plot cumulative distribution against profit/loss, VaR can be
obtained directly by the point that the horizontal line 1 − α cuts the
curve.
The upper curve on the left side shows a higher probability of significant
loss. The expected shortfall on loss is larger (shown by the left side arrow
of expected shortfall). The corresponding VaR has less negative value on
profit (or smaller value of VaR on loss). This shows a tradeoff between
VaR and expected shortfall, where lower VaR on loss is achieved with
higher expected shortfall on loss.
Spectral risk measure
A risk measure can be characterized by the weights it assigns to quantiles
of the loss distribution.
• VaR gives a 100% weighting to the Xth quantile and zero to other
quantiles.
• Expected shortfall gives equal weight to all quantiles greater than the
Xth quantile and zero weight to all quantiles below Xth quantile.
A spectral risk measure is defined by making assumptions about the
weights assigned to quantiles. A general result is that a spectral risk
measure is coherent (i.e., it satisfies the subadditivity condition) if the
weight assigned to the qth quantile of the loss distribution is a
nondecreasing function of q. Expected shortfall satisfies this condition,
since the nondecreasing property is (marginally) satisfied under constant weights.
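The weighting view can be sketched on an empirical loss sample, where the sorted losses play the role of quantiles. This is an illustrative discretization, not the lecture's definition; the function names and the sample of 100 losses are assumptions.

```python
def spectral_risk(losses, weights):
    """Weighted average of the sorted losses (empirical quantiles)."""
    ordered = sorted(losses)
    if len(ordered) != len(weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("need one weight per loss, with weights summing to 1")
    return sum(w * l for w, l in zip(weights, ordered))

def var_weights(n, k):
    """All weight on the k-th smallest loss (1-based rank)."""
    return [1.0 if i == k - 1 else 0.0 for i in range(n)]

def es_weights(n, m):
    """Equal weight on the m largest losses, zero weight elsewhere."""
    return [0.0] * (n - m) + [1.0 / m] * m

def is_nondecreasing(weights):
    """Check the coherence condition: weights do not fall as the quantile rises."""
    return all(a <= b for a, b in zip(weights, weights[1:]))

# Hypothetical sample: losses 1, 2, ..., 100, so rank 95 is the 95% quantile.
losses = list(range(1, 101))
var_95 = spectral_risk(losses, var_weights(100, 95))
es_95 = spectral_risk(losses, es_weights(100, 5))
print(var_95, es_95)
print(is_nondecreasing(var_weights(100, 95)), is_nondecreasing(es_weights(100, 5)))
```

The VaR weights jump to one and fall back to zero, so they fail the nondecreasing check, while the expected shortfall weights pass it.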
Economic capital (risk capital)
• This is the amount of capital a financial institution needs in order to
absorb losses over a certain time horizon (usually one year) with a
certain confidence level.
• The confidence level depends on financial institutions’ objectives.
Corporations rated AA have a one-year probability of default less than
0.1%. This suggests that the confidence level should be 99.9%, or
even higher.
Take a target level of statistical confidence into account. For a given
level of confidence α, let L_P denote the random portfolio loss amount. We
define the credit VaR as the α-quantile of L_P:
\[ q_\alpha = \inf\{\, q > 0 : P[L_P \le q] \ge \alpha \,\}. \]
Also, we define
\[ EC_\alpha = \text{economic capital} = q_\alpha - E[L_P]. \]
For example, with α = 99.98%, ECα would be sufficient to cover losses in
9,998 out of 10,000 years (that is, it would be exceeded about twice in
10,000 years), assuming a planning horizon of one year.
Why subtract the expected loss (EL) from the quantile qα in setting ECα?
This is the usual practice of decomposing the total risk capital into
(i) the expected loss and (ii) a cushion against catastrophic losses.
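As a minimal sketch of these two definitions, the quantile and the economic capital can be computed from a sample of simulated annual losses. The 1,000 loss scenarios below are hypothetical, and the function names are assumptions made for illustration.

```python
import math

def credit_var(losses, alpha):
    """Empirical alpha-quantile: smallest sample value q with P[L <= q] >= alpha."""
    ordered = sorted(losses)
    n = len(ordered)
    # Rank of the quantile; the small offset guards against floating-point noise.
    k = max(math.ceil(alpha * n - 1e-9), 1)
    return ordered[k - 1]

def economic_capital(losses, alpha):
    """EC_alpha = q_alpha - E[L], with E[L] estimated by the sample mean."""
    expected_loss = sum(losses) / len(losses)
    return credit_var(losses, alpha) - expected_loss

# Hypothetical annual losses: mostly zero, a few moderate, one catastrophic.
losses = [0.0] * 950 + [10.0] * 40 + [50.0] * 9 + [200.0] * 1
q_999 = credit_var(losses, 0.999)
ec_999 = economic_capital(losses, 0.999)
print(q_999, ec_999)
```

Here the expected loss is 1.05, so the capital cushion is the 99.9% quantile less that amount.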
Extreme value theory
Extreme value theory (EVT) is used to estimate the tails of a distribution.
EVT can be used to improve VaR and ES estimates with a very high
confidence level. It involves smoothing and extrapolating the tails of an
empirical distribution.
Suppose that F (v) = P [V ≤ v] is the cumulative distribution function for
a loss variable V (such as the loss on a portfolio over a certain period of
time) and that u is a value assumed by V in the right-hand tail of the
distribution. The probability that V lies between u and u + y (y > 0) is
F (u + y) − F (u). The probability that V assumes a value greater
than u is 1 − F (u). Define Fu(y) as the probability that V lies between u
and u + y conditional on V > u. This is
\[ F_u(y) = P[V \le u + y \mid V > u] = \frac{P[u < V \le u + y]}{P[V > u]} = \frac{F(u + y) - F(u)}{1 - F(u)}. \]
The function Fu(y) defines the right tail of the probability distribution.
It is the cumulative probability distribution of the amount by which V
exceeds u, given that it does exceed u.
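The definition of Fu(y) can be checked on an empirical sample. The sketch below uses a small hypothetical sample and assumed function names; it computes Fu(y) both by direct conditioning and through the ratio involving F.

```python
def empirical_cdf(sample, x):
    """Empirical F(x) = fraction of observations not exceeding x."""
    return sum(1 for v in sample if v <= x) / len(sample)

def excess_cdf(sample, u, y):
    """Empirical F_u(y) = P[V <= u + y | V > u], by direct conditioning."""
    exceed = [v for v in sample if v > u]
    if not exceed:
        raise ValueError("no observations exceed the threshold u")
    return sum(1 for v in exceed if v <= u + y) / len(exceed)

# Hypothetical losses 1..10 with threshold u = 6 and excess y = 2.
sample = list(range(1, 11))
u, y = 6, 2
direct = excess_cdf(sample, u, y)
ratio = (empirical_cdf(sample, u + y) - empirical_cdf(sample, u)) / (1.0 - empirical_cdf(sample, u))
print(direct, ratio)
```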
Generalized Pareto distribution
For a wide class of distributions F (v), the distribution of the tail beyond u
as denoted by Fu(y) converges to a generalized Pareto distribution as the
threshold u is increased. The generalized Pareto (cumulative) distribution
is
\[ G_{\xi,\beta}(y) = 1 - \left(1 + \xi\,\frac{y}{\beta}\right)^{-1/\xi} \approx F_u(y) \quad \text{when } u \text{ is large}. \]
The distribution has two parameters that have to be estimated from the
data, namely, ξ and β. The parameter ξ is the shape parameter which
determines the heaviness of the tail of the distribution. The parameter β
is a scale parameter that serves to scale y through the form y/β.
When the underlying variable V has a normal distribution, ξ = 0. As
the tails of the distribution become heavier, the value of ξ increases. For
most financial data, ξ is positive and in the range 0.1 to 0.4. For example,
when ξ = 1/2, the exponent −1/ξ equals −2, so the tail probability
1 − Gξ,β(y) decays like an inverse square in y. A larger positive ξ
indicates a thicker-tailed distribution.
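A short sketch makes the role of ξ concrete by evaluating the tail probability 1 − Gξ,β(y) at the same point for several shape values. The function name gpd_cdf and the evaluation point y = 5β are assumptions; the ξ = 0 branch uses the exponential limit of the formula.

```python
import math

def gpd_cdf(y, xi, beta):
    """Generalized Pareto cumulative distribution G_{xi,beta}(y) for y >= 0."""
    if xi == 0.0:
        # Limit of the formula as xi -> 0: an exponential distribution.
        return 1.0 - math.exp(-y / beta)
    return 1.0 - (1.0 + xi * y / beta) ** (-1.0 / xi)

beta = 1.0
y = 5.0 * beta
tails = {xi: 1.0 - gpd_cdf(y, xi, beta) for xi in (0.0, 0.1, 0.4)}
for xi, tail in tails.items():
    print("xi =", xi, "P[excess > 5*beta] =", round(tail, 5))

# With xi = 1/2 the exponent is -2: (1 + y / (2*beta))**(-2).
print(1.0 - gpd_cdf(2.0, 0.5, 1.0))
```

The tail probability at a fixed multiple of β grows with ξ, which is what a heavier tail means here.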
Estimating ξ and β
The parameters ξ and β can be estimated using maximum likelihood
methods. The probability density function, gξ,β(y), of the cumulative
distribution is calculated by differentiating Gξ,β(y) with respect to y. This
gives
\[ g_{\xi,\beta}(y) = \frac{1}{\beta}\left(1 + \frac{\xi y}{\beta}\right)^{-1/\xi - 1}. \]
We first choose a value for u. Recall that y is the loss above u. A value
close to the 95th percentile point of the empirical distribution usually
works well. We then rank the observations on V from the highest to the
lowest and focus our attention on those observations for which V > u.
Suppose there are nu such observations and they are vi (1 ≤ i ≤ nu).
We calibrate the distribution function by choosing the parameters to
maximize the joint probability of the data points given the parameter values.
Assuming the data points are sampled independently, the joint probability
is equal to the product of their individual probabilities. The likelihood
function (assuming that ξ ≠ 0) is the product of the probability density
values of these nu observations:
\[ \prod_{i=1}^{n_u} \frac{1}{\beta}\left(1 + \frac{\xi (v_i - u)}{\beta}\right)^{-1/\xi - 1}. \]
Maximizing this function is the same as maximizing its logarithm:
\[ \sum_{i=1}^{n_u} \ln\left[\frac{1}{\beta}\left(1 + \frac{\xi (v_i - u)}{\beta}\right)^{-1/\xi - 1}\right]. \]
Standard numerical procedures can be used to find the values of ξ and β
that maximize this expression.
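The maximization can be sketched in plain Python. The lecture does not say which numerical procedure is used; the crude compass search below is an assumed stand-in, and the excesses are hypothetical (evenly spaced quantiles of a generalized Pareto distribution with ξ = 0.4 and β = 30), not the lecture's data.

```python
import math

def gpd_log_likelihood(excesses, xi, beta):
    """Sum of ln g_{xi,beta}(y) over the excesses y = v_i - u.

    Restricted to xi > 0 and beta > 0; other values are rejected with -inf.
    """
    if xi <= 0.0 or beta <= 0.0:
        return -math.inf
    return sum(-math.log(beta) - (1.0 / xi + 1.0) * math.log1p(xi * y / beta)
               for y in excesses)

def fit_gpd(excesses, xi=0.3, beta=40.0, step=0.5, tol=1e-4):
    """Compass search: scale xi or beta up or down, halve the step when stuck."""
    best = gpd_log_likelihood(excesses, xi, beta)
    moves = ((1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0))
    while step > tol:
        improved = False
        for dx, db in moves:
            cand_xi = xi * math.exp(dx * step)
            cand_beta = beta * math.exp(db * step)
            cand = gpd_log_likelihood(excesses, cand_xi, cand_beta)
            if cand > best:
                xi, beta, best, improved = cand_xi, cand_beta, cand, True
        if not improved:
            step /= 2.0
    return xi, beta, best

# Hypothetical excesses over the threshold (not the lecture's 22 observations).
m = 200
excesses = [30.0 / 0.4 * ((1.0 - (i + 0.5) / m) ** -0.4 - 1.0) for i in range(m)]
start_ll = gpd_log_likelihood(excesses, 0.3, 40.0)
xi_hat, beta_hat, max_ll = fit_gpd(excesses)
print(round(xi_hat, 3), round(beta_hat, 3), round(max_ll, 3), round(start_ll, 3))
```

The search starts from the trial values ξ = 0.3 and β = 40 and keeps both parameters positive by using multiplicative moves. A library optimizer would normally replace the hand-written search.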
Estimating the tail of the distribution
The probability that V > u + y conditional on V > u is 1 − Gξ,β(y). The
probability that V > u is 1 − F (u). The unconditional probability that
V > x (when x > u) is therefore
[1 − F (u)][1 − Gξ,β(x − u)], where y = x − u.
If n is the total number of observations, an estimate of 1−F (u), calculated
from the empirical data, is nu/n. The unconditional probability that V > x
is therefore
\[ P(V > x) = \frac{n_u}{n}\left[1 - G_{\xi,\beta}(x - u)\right] = \frac{n_u}{n}\left(1 + \xi\,\frac{x - u}{\beta}\right)^{-1/\xi}. \tag{A} \]
Calculation of VaR and ES
To calculate VaR with a confidence level of q, we solve
F (VaR) = q.
Since F (x) = 1− P(V > x), we obtain
\[ q = 1 - \frac{n_u}{n}\left(1 + \xi\,\frac{\mathrm{VaR} - u}{\beta}\right)^{-1/\xi}, \]
so that
\[ \mathrm{VaR} = u + \frac{\beta}{\xi}\left\{\left[\frac{n}{n_u}(1 - q)\right]^{-\xi} - 1\right\}. \tag{B} \]
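The algebra behind eq.(B) can be spelled out as follows. This is only a rearrangement of the equation for q, using the same symbols.

```latex
\begin{align*}
\left(1 + \xi\,\frac{\mathrm{VaR} - u}{\beta}\right)^{-1/\xi} &= \frac{n}{n_u}(1 - q) \\
1 + \xi\,\frac{\mathrm{VaR} - u}{\beta} &= \left[\frac{n}{n_u}(1 - q)\right]^{-\xi} \\
\mathrm{VaR} &= u + \frac{\beta}{\xi}\left\{\left[\frac{n}{n_u}(1 - q)\right]^{-\xi} - 1\right\}
\end{align*}
```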
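The expected shortfall is the expected loss conditional on the loss
exceeding VaR. Since the unconditional density of V for x > u is
(nu/n) gξ,β(x − u) and P(V > VaR) = 1 − q, we have (for ξ < 1)
\[ \mathrm{ES} = \frac{1}{1 - q}\,\frac{n_u}{n}\int_{\mathrm{VaR}}^{\infty} x\, g_{\xi,\beta}(x - u)\, dx = \frac{1}{1 - q}\,\frac{n_u}{n}\int_{\mathrm{VaR}}^{\infty} \frac{x}{\beta}\left[1 + \frac{\xi (x - u)}{\beta}\right]^{-1/\xi - 1} dx = \frac{\mathrm{VaR} + \beta - \xi u}{1 - \xi}. \]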
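One way to reach this closed form is sketched below, assuming 0 < ξ < 1 so that the mean of the tail is finite. It uses the fact that the excess over the higher threshold VaR is again generalized Pareto with the same ξ; the rescaled parameter β' is notation introduced here for the sketch.

```latex
\begin{align*}
P[V > \mathrm{VaR} + z \mid V > \mathrm{VaR}]
  &= \frac{\left(1 + \xi\,\frac{\mathrm{VaR} - u + z}{\beta}\right)^{-1/\xi}}
          {\left(1 + \xi\,\frac{\mathrm{VaR} - u}{\beta}\right)^{-1/\xi}}
   = \left(1 + \xi\,\frac{z}{\beta'}\right)^{-1/\xi},
   \qquad \beta' = \beta + \xi(\mathrm{VaR} - u), \\
E[V - \mathrm{VaR} \mid V > \mathrm{VaR}]
  &= \int_0^{\infty} \left(1 + \xi\,\frac{z}{\beta'}\right)^{-1/\xi} dz
   = \frac{\beta'}{1 - \xi}, \\
\mathrm{ES} &= \mathrm{VaR} + \frac{\beta + \xi(\mathrm{VaR} - u)}{1 - \xi}
   = \frac{\mathrm{VaR} + \beta - \xi u}{1 - \xi}.
\end{align*}
```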
Historical simulation
To express the approach algebraically, define vi as the value of a market
variable on Day i and suppose that today is Day n. The ith scenario in
the historical simulation approach assumes that the value of the market
variable tomorrow will be
\[ \text{value under the } i\text{th scenario} = v_n\,\frac{v_i}{v_{i-1}}. \]
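The scenario rule can be written as a short function. This is a minimal sketch: the function name and the four-day price history are hypothetical.

```python
def historical_scenarios(values):
    """Build tomorrow's scenario values from a history of daily values.

    values[i] is the market variable on Day i (i = 0, ..., n) and today is Day n.
    Scenario i (i = 1, ..., n) applies the Day i-1 to Day i change to today's value.
    """
    today = values[-1]
    return [today * values[i] / values[i - 1] for i in range(1, len(values))]

# Hypothetical history of a single market variable.
history = [100.0, 102.0, 99.96, 110.0]
scenarios = historical_scenarios(history)
print(scenarios)
```

A history of n + 1 daily values yields n scenarios, each repeating one past daily percentage change.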
September 25, 2008, is an interesting date to choose in evaluating an
equity investment. The turmoil in credit markets, which started in August
2007, was more than a year old. Equity prices had been declining for
several months. Volatilities were increasing. Lehman Brothers had filed
for bankruptcy 10 days earlier. The Treasury Secretary’s $700 billion
Troubled Asset Relief Program (TARP) had not yet been passed by the
United States Congress. Note that the Nikkei 225 (Japan) and FTSE 100
(UK) were hit harder than the DJIA (US) in the midst of the
2008 financial tsunami.
Sample calculations
The DJIA was 11,022.06 on September 25, 2008. On August 8, 2006,
it was 11,173.59, down from 11,219.38 on August 7, 2006. The value
of the DJIA under Scenario 1 is therefore
\[ 11{,}022.06 \times \frac{11{,}173.59}{11{,}219.38} = 10{,}977.08. \]
Similarly, the values of the FTSE 100, the CAC 40, and the Nikkei 225
(measured in U.S. dollars) under Scenario 1 are 9,569.23, 6,204.55, and
115.05, respectively. The value of the portfolio under Scenario 1 is
therefore (in $000s):
\[ 4{,}000 \times \frac{10{,}977.08}{11{,}022.06} + 3{,}000 \times \frac{9{,}569.23}{9{,}599.90} + 1{,}000 \times \frac{6{,}204.55}{6{,}200.40} + 2{,}000 \times \frac{115.05}{112.82} = 10{,}014.334. \]
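The Scenario 1 arithmetic can be reproduced with a few lines of Python. The index levels and investment amounts are those quoted in the text; the dictionary layout is an assumption. Because the quoted index values are rounded, the result matches the stated portfolio value only to about the first decimal place.

```python
# Index levels on September 25, 2008 (today), in U.S. dollars.
today = {"DJIA": 11022.06, "FTSE 100": 9599.90, "CAC 40": 6200.40, "Nikkei 225": 112.82}

# Scenario 1 levels: the DJIA is recomputed, the others are the quoted values.
djia_scenario = today["DJIA"] * 11173.59 / 11219.38
scenario1 = {"DJIA": djia_scenario, "FTSE 100": 9569.23, "CAC 40": 6204.55, "Nikkei 225": 115.05}

# Amounts invested in each index, in $000s.
investment = {"DJIA": 4000.0, "FTSE 100": 3000.0, "CAC 40": 1000.0, "Nikkei 225": 2000.0}

portfolio_value = sum(investment[k] * scenario1[k] / today[k] for k in today)
loss = sum(investment.values()) - portfolio_value

print(round(djia_scenario, 2))
print(round(portfolio_value, 3), round(loss, 3))
```

A negative loss means the portfolio gains value under this scenario.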
Scenario generated for September 26, 2008, using Data in the above
table (all indices are measured in U.S. dollars)
The FTSE100 in British pound × exchange rate of one British pound in
US dollars gives the FTSE100 in US dollars.
Extreme Value Theory Calculations
The parameter u is chosen to be 160 so that nu = 22. The trial values
for β and ξ are 40 and 0.3, respectively.
• If the choice of u is changed to 200, then nu = 8. There is a tradeoff
between u and nu: a higher u leads to a smaller nu, since fewer
scenarios have losses that exceed u.
The above table shows calculations for the trial values β = 40 and ξ = 0.3.
The value of the log-likelihood function is −108.37.
The search for the values of β and ξ that maximize the log-likelihood
function gives β = 32.532 and ξ = 0.436, and the maximum value of the
log-likelihood function is −108.21.
Suppose that we wish to estimate the probability that the portfolio loss be-
tween September 25 and September 26, 2008, will be more than $300,000
(or 3% of its value). Using u = 160, we obtain from eq.(A) that
\[ \frac{22}{500}\left(1 + 0.436 \times \frac{300 - 160}{32.532}\right)^{-1/0.436} = 0.0039, \]
which is more accurate than counting observations. The probability that
the portfolio loss will be more than $500,000 (or 5% of total portfolio
value) is 0.00086 by following a similar procedure.
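Both probabilities follow from eq.(A) and can be checked with a short sketch. The parameter values are the ones quoted in the text; the function name is an assumption.

```python
def tail_probability(x, u, n, n_u, xi, beta):
    """Eq. (A): estimated unconditional probability that the loss exceeds x (x > u)."""
    return (n_u / n) * (1.0 + xi * (x - u) / beta) ** (-1.0 / xi)

# Values from the example: u = 160, n = 500 scenarios, n_u = 22 exceedances.
params = dict(u=160.0, n=500, n_u=22, xi=0.436, beta=32.532)
p_300 = tail_probability(300.0, **params)
p_500 = tail_probability(500.0, **params)
print(round(p_300, 4), round(p_500, 5))
```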
VaR calculations
Using eq.(B), the value of VaR with a 99% confidence level is
\[ 160 + \frac{32.532}{0.436}\left\{\left[\frac{500}{22}(1 - 0.99)\right]^{-0.436} - 1\right\} = 227.8 \]
or $227,800. In this instance, the VaR estimate is about $25,000 less
than the fifth worst loss. When the confidence level is increased to 99.9%,
VaR becomes
\[ 160 + \frac{32.532}{0.436}\left\{\left[\frac{500}{22}(1 - 0.999)\right]^{-0.436} - 1\right\} = 474.0 \]
or $474,000. When it is increased further to 99.97%, VaR becomes
\[ 160 + \frac{32.532}{0.436}\left\{\left[\frac{500}{22}(1 - 0.9997)\right]^{-0.436} - 1\right\} = 742.5 \]
or $742,500.
ES calculations
EVT can also improve ES estimates and allows the confidence level used
for ES estimates to be increased. In our example, when the confidence
level is 99%, the estimated ES is
\[ \frac{227.8 + 32.532 - 0.436 \times 160}{1 - 0.436} = 337.9 \]
or $337,900. When the confidence level is 99.9%, the estimated ES is
\[ \frac{474.0 + 32.532 - 0.436 \times 160}{1 - 0.436} = 774.8 \]
or $774,800.
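The VaR and ES figures can be recomputed from eq.(B) and the ES formula with the quoted parameters. The function names are assumptions. With ξ and β rounded to three decimals, the results agree with the stated figures only to within about one unit (in $000s), presumably because the lecture's numbers come from unrounded estimates.

```python
def evt_var(q, u, n, n_u, xi, beta):
    """Eq. (B): VaR at confidence level q from the fitted tail."""
    return u + (beta / xi) * ((n / n_u * (1.0 - q)) ** (-xi) - 1.0)

def evt_es(var, u, xi, beta):
    """Expected shortfall from VaR and the fitted tail (requires xi < 1)."""
    return (var + beta - xi * u) / (1.0 - xi)

u, n, n_u, xi, beta = 160.0, 500, 22, 0.436, 32.532
results = {}
for q in (0.99, 0.999, 0.9997):
    var_q = evt_var(q, u, n, n_u, xi, beta)
    results[q] = (var_q, evt_es(var_q, u, xi, beta))
    print(q, round(results[q][0], 1), round(results[q][1], 1))
```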
Probability density calculations
The probability density function evaluated at the VaR level for the prob-
ability distribution of the loss, conditional on it being greater than 160,
is given by the gξ,β function. It is
\[ \frac{1}{32.532}\left(1 + \frac{0.436 \times (227.8 - 160)}{32.532}\right)^{-1/0.436 - 1} = 0.0037. \]
The unconditional probability density function evaluated at the VaR level
is nu/n = 22/500 times 0.0037 = 0.00016.
Choices of u
The choice of u represents a tradeoff between the accuracy of approximating
the tail distribution (higher u) and the number of data points available
for the calibration (lower u).
It is often found that values of ξ and β do depend on u, but the estimates
of F (x) remain roughly the same. We want u to be sufficiently high that
we are truly investigating the shape of the tail of the distribution, but
sufficiently low that the number of data items included in the maximum
likelihood calculation is not too low. More data lead to improved accuracy
in the assessment of the shape of the tail.
A rule of thumb is that u should be approximately equal to the 95th
percentile of the empirical distribution. In the case of the data we have
been looking at, the 95th percentile of the empirical distribution is 156.5.
In the search for the optimal values of ξ and β, both variables should be
constrained to be positive.
Semi-positive definiteness and nonnegativity of eigenvalues
Recall that a matrix A is said to be semi-positive definite if xTAx ≥ 0
for all x. An eigenvalue λ and a corresponding eigenvector v ≠ 0 of the
matrix A are defined by
Av = λv.
The eigenvalues of a semi-positive definite matrix A are known to be
non-negative.
To prove the claim, suppose not: let λ be a negative eigenvalue with
eigenvector v. Then vTAv = λvTv < 0, since vTv > 0. This contradicts
xTAx ≥ 0 for all x.
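The claim can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: it builds a symmetric 2×2 matrix A = BᵀB from a hypothetical matrix B, so that xᵀAx = |Bx|² ≥ 0 by construction, and computes the eigenvalues from the characteristic polynomial.

```python
import math

def gram(b):
    """Return A = B^T B for a two-column matrix B given as a list of rows."""
    a11 = sum(r[0] * r[0] for r in b)
    a12 = sum(r[0] * r[1] for r in b)
    a22 = sum(r[1] * r[1] for r in b)
    return [[a11, a12], [a12, a22]]

def eigenvalues_sym2(a):
    """Eigenvalues of a symmetric 2x2 matrix from its trace and determinant."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return tr / 2.0 - disc, tr / 2.0 + disc

def quad_form(a, x):
    """Compute x^T A x for a 2x2 matrix A."""
    return a[0][0] * x[0] ** 2 + 2.0 * a[0][1] * x[0] * x[1] + a[1][1] * x[1] ** 2

# Hypothetical matrix B; A = B^T B is semi-positive definite by construction.
a = gram([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
lam_min, lam_max = eigenvalues_sym2(a)
print(a, lam_min, lam_max, quad_form(a, (1.0, -1.0)))
```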