Random Number Generation Using Low Discrepancy Points

Donald Mango, FCAS, MAAA

Centre Solutions

June 7, 1999

1999 CAS/CARe Reinsurance Seminar

Baltimore, Maryland

What is Discrepancy?

• Large number of points inside a unit hypercube: an n-dimensional hypercube of length 1 on each side

• For any “sub-volume” of the hypercube, Discrepancy = the difference between the proportion of points inside the volume and the volume itself
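The definition can be illustrated with a minimal Python sketch that measures the discrepancy of a single sub-volume. The function name `box_discrepancy` and the choice of a box anchored at the origin are illustrative assumptions, not taken from the paper.

```python
import random

def box_discrepancy(points, upper):
    """Proportion of points inside the box [0, u1) x ... x [0, un)
    minus the volume of that box."""
    inside = sum(all(x < u for x, u in zip(p, upper)) for p in points)
    volume = 1.0
    for u in upper:
        volume *= u
    return inside / len(points) - volume

# A regular 10 x 10 grid of cell midpoints in the unit square.
grid = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
grid_disc = box_discrepancy(grid, (0.5, 0.5))

# Pseudo-random points for comparison (seed fixed only for repeatability).
rng = random.Random(1999)
rand_pts = [(rng.random(), rng.random()) for _ in range(100)]
rand_disc = box_discrepancy(rand_pts, (0.5, 0.5))

print(grid_disc, rand_disc)
```

A low discrepancy generator aims to keep this difference small for every sub-volume, not just the one box tested here.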

Low Discrepancy Point Generator:

• Method to generate a set of points which fills out a given n-dimensional unit hypercube, with as little discrepancy as possible

• Attempt to be systematic and efficient in filling a space, given the number of points

• My paper discusses “Faure” Points, just one of many alternatives

• Faure method relies on prime numbers

Other Low Discrepancy Point Generators:

• Named after number theorists: Sobol’, Niederreiter, Halton, Hammersley, ...

• More advanced methods use “irreducible polynomials” -- polynomial equivalents of prime numbers (cannot be factored)

• More complex algorithms

• Less flexible than Faure

Linear Congruential Generator:

• X_{n+1} = (a X_n + c) mod m

• Used in spreadsheets -- RAND() in Excel, @RAND in Lotus

• Sequential

• Cyclical, with a long cycle length or “period”

• “Randomized” in spreadsheets by using a random seed value (X_0) = the system clock
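The recurrence and its cyclical behavior can be sketched in a few lines of Python. The parameters a = 5, c = 3, m = 16 are toy values chosen here so the full cycle is short enough to inspect; they are not the constants used by Excel or Lotus.

```python
def lcg(seed, a, c, m):
    """Yield X_1, X_2, ... from X_{n+1} = (a * X_n + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

def take(gen, k):
    """Collect the next k values from a generator."""
    return [next(gen) for _ in range(k)]

# Toy parameters (illustrative assumption): small modulus, so the period is visible.
seq = take(lcg(seed=7, a=5, c=3, m=16), 32)
uniforms = [x / 16 for x in seq]   # scale to [0, 1) as a spreadsheet RAND() would

print(seq[:16])
print(seq[16:])   # the same 16 values again: the generator has cycled
```

With these toy values the sequence visits all 16 residues once and then repeats, which is the “period” referred to above. Changing the seed only changes where in the cycle the sequence starts.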

LDPMAKER Excel 97 Workbook:

• Available in the 1999 Spring Forum section of the CAS Website:

www.casact.org/pubs/forum/99spforum/99spftoc.htm

• Includes both:

  • A spreadsheet-only calculation (recalc-driven), and

  • A Visual Basic for Applications (VBA) macro-driven generator (run with a button)

LDPMAKER Excel 97 Workbook:

• “Example” sheet is spreadsheet-only calculation

• Demonstrates formulas

• Not very flexible

Example: 4 Dimensions, 24 Iterations

• Dimension #1:

• First, convert each iteration number N to base Prime (= 5)

• Iteration 1 = 01 (base 5); Iteration 10 = 20 (base 5)

• F(N, 1) = Faure point (Iteration N, Dimension 1)

  • F(1,1) = 0/5^2 + 1/5 = 0.20

  • F(10,1) = 2/5^2 + 0/5 = 0.08
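The Dimension #1 calculation can be sketched in Python as follows. The helper names `to_digits` and `radical_inverse` are illustrative and do not come from the LDPMAKER workbook; digits are stored least significant first.

```python
def to_digits(n, base):
    """Digits of n in the given base, least significant digit first."""
    digits = []
    while n > 0:
        n, r = divmod(n, base)
        digits.append(r)
    return digits

def radical_inverse(digits, base):
    """Reflect the digits about the radix point: d0/base + d1/base^2 + ..."""
    return sum(d / base ** (k + 1) for k, d in enumerate(digits))

PRIME = 5
f_1 = radical_inverse(to_digits(1, PRIME), PRIME)    # iteration 1  = 01 in base 5
f_10 = radical_inverse(to_digits(10, PRIME), PRIME)  # iteration 10 = 20 in base 5

print(f_1, f_10)   # compare with F(1,1) = 0.20 and F(10,1) = 0.08 on the slide
```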

Example: 4 Dimensions, 24 Iterations

• Dimension #2:

• Start with the base Prime digits from Dimension #1 and “shuffle” them

• Using combinations, sum of digits and MOD operator

• First digit in Dimension #2 = [ Sum (first digit, second digit) from Dimension #1 ] MOD Prime

• Dimension #1, Iteration 10 = 20 (base 5); Dimension #2, Iteration 10 = 22 (base 5)

• Formula for F(N,2) is the same as for F(N,1), applied to the shuffled digits: F(10,2) = 2/5^2 + 2/5 = 0.48
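The digit shuffle can be sketched in Python under the assumption that the workbook follows the standard Faure construction, in which digit j of the next dimension is the sum over i >= j of C(i, j) times digit i, taken MOD Prime. With two digits this reduces to the sum-of-digits rule stated above. The name `shuffle_digits` is illustrative.

```python
from math import comb

def shuffle_digits(digits, prime):
    """Map one dimension's base-Prime digits (least significant first)
    to the next dimension's digits using binomial coefficients mod Prime."""
    m = len(digits)
    return [
        sum(comb(i, j) * digits[i] for i in range(j, m)) % prime
        for j in range(m)
    ]

dim1 = [0, 2]                      # iteration 10 = 20 in base 5, least significant first
dim2 = shuffle_digits(dim1, 5)     # digits for Dimension #2

# Written most significant digit first, as on the slide.
print("".join(str(d) for d in reversed(dim2)))
```

For iteration 10 this reproduces the slide's Dimension #2 digits, 22 in base 5.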

Example: 4 Dimensions, 24 Iterations

• Dimensions #3 and higher:

• Start with the base Prime digits from the previous dimension and “shuffle” them

• Formula for F(N,3) ... is the same
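Putting the steps together, a complete point for all four dimensions can be sketched as below. This is a compact Python illustration that assumes the standard Faure binomial shuffle; it is not a transcription of the workbook's formulas or VBA code, and `faure_point` is a hypothetical name.

```python
from math import comb

def faure_point(n, dims, prime):
    """Return the point for iteration n as a list of dims coordinates."""
    # Base-Prime digits of n, least significant first (Dimension #1).
    digits = []
    while n > 0:
        n, r = divmod(n, prime)
        digits.append(r)
    point = []
    for _ in range(dims):
        # Same formula in every dimension: reflect digits about the radix point.
        point.append(sum(d / prime ** (k + 1) for k, d in enumerate(digits)))
        # Shuffle the digits to obtain the next dimension.
        m = len(digits)
        digits = [
            sum(comb(i, j) * digits[i] for i in range(j, m)) % prime
            for j in range(m)
        ]
    return point

# The example from the slides: 4 dimensions, 24 iterations, Prime = 5.
points = [faure_point(n, dims=4, prime=5) for n in range(1, 25)]
print([round(x, 2) for x in points[9]])   # iteration 10
```

The first two coordinates of iteration 10 can be checked against the slides: 0.08 for Dimension #1, and digits 22 (base 5) for Dimension #2.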

Loops in the Faure Algorithm:

• Fills out the space in ever-larger loops of ever-smaller spacing

• Fills out the space sequentially

• There MAY be an issue with ending the iterations in the middle of one of these loops

• Examples later in the test results...

Visual Basic for Applications (VBA) Version:

• VBA = real programming language

• Recursive algorithm using “dynamic arrays”: arrays which are dimensioned (sized) at run-time

• Generalization of spreadsheet-only calculations

• FAST

Performance Test #1: Sum of Limited Paretos

Table 2 (from Paper) - Pareto Parameters

Test # / Pareto #             B        Q      Policy Limit   Limited Expected Value
1 / 1                         10,000   1.10   100,000        21,321
1 / 2                         15,000   1.30   250,000        28,874
Test #1 Theoretical Result                                   50,194
2 / 1                         10,000   1.10   50,000         16,404
2 / 2                         15,000   1.30   25,000         12,745
2 / 3                         25,000   1.20   40,000         21,744
2 / 4                         12,500   1.40   50,000         14,834
2 / 5                         30,000   2.00   25,000         13,636
Test #2 Theoretical Result                                   79,364
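The Limited Expected Value column is consistent with the closed-form limited expected value of a two-parameter Pareto with scale B and shape Q, E[min(X, L)] = B/(Q-1) × (1 - (B/(B+L))^(Q-1)). That parameterization is an assumption inferred from the tabulated values; the slides do not state it. A short Python check:

```python
def pareto_lev(b, q, limit):
    """Limited expected value E[min(X, limit)] for a Pareto with
    survival function (b / (b + x)) ** q (assumed parameterization, q != 1)."""
    return b / (q - 1.0) * (1.0 - (b / (b + limit)) ** (q - 1.0))

# Test #1 rows from Table 2: (B, Q, Policy Limit).
test1 = [(10_000, 1.10, 100_000), (15_000, 1.30, 250_000)]
levs = [pareto_lev(b, q, lim) for b, q, lim in test1]
total = sum(levs)

print([round(v) for v in levs], round(total))
```

The sum of the unrounded values agrees with the Test #1 theoretical result of 50,194 shown in the table.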


Table 3: Sum of 2 Limited Paretos

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
250               49,170      -2.04%        47,573         -5.22%
728               50,022      -0.34%        50,267         0.15%
1,000             49,769      -0.85%        49,640         -1.10%
1,500             49,903      -0.58%        51,307         2.22%
2,186             50,137      -0.11%        50,737         1.08%


Table 4: Sum of 5 Limited Paretos

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
342               79,319      -0.06%        80,179         1.03%
1,000             79,201      -0.21%        78,837         -0.66%
1,500             79,206      -0.20%        79,088         -0.35%
2,000             79,280      -0.11%        79,049         -0.40%
2,400             79,358      -0.01%        79,154         -0.27%

Performance Test #2: Sum of Poissons

Table 5: Sum of 2 Poissons (λ = 8)

# of Iterations   LDP % Error   RAND() % Error
250               -0.42%        1.30%
728               -0.03%        0.64%
1,000             -0.22%        0.23%
2,000             -0.09%        -0.08%
2,186             -0.01%        0.17%

Performance Test #2: Sum of Poissons

Table 6: Sum of 5 Poissons (λ = 8)

# of Iterations   LDP % Error   RAND() % Error
342               -0.24%        0.78%
1,000             -0.20%        0.59%
2,000             -0.11%        -0.22%
2,400             -0.04%        -0.23%

Performance Test #3: Low Frequency Events

Table 7 - Pareto Parameters used for Severity

Pareto #                      B        Q      Policy Limit   Limited Expected Value
1                             10,000   1.30   50,000         13,860
Test #1 Theoretical Result                                   693
2                             25,000   1.60   50,000         20,113
3                             5,000    1.10   50,000         10,660
Test #2 Theoretical Result                                   2,232


Table 8: One Event, 5% Prob of Occurrence

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
250               563         -18.82%       1,009          45.60%
728               615         -11.19%       657            -5.23%
1,000             670         -3.27%        569            -17.93%
1,500             667         -3.81%        613            -11.58%
2,186             690         -0.50%        662            -4.45%


Table 9: Two Events, each with 5% Prob of Occurrence

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
342               2,199       -1.46%        3,175          42.26%
1,000             2,251       0.86%         2,456          10.04%
1,500             2,221       -0.49%        2,295          2.83%
2,400             2,204       -1.22%        2,348          5.20%

Performance Test #4: 99th Percentile of Sum of Normals

Table 10 - Normal Parameters

Test # / Normal #   Mean    StdDev    99th Percentile
1 / 1               2,000   750       -
1 / 2               1,000   500       -
1 Combined          3,000   901.4     5,097
2 / 1               1,000   300       -
2 / 2               1,000   800       -
2 / 3               500     300       -
2 / 4               750     600       -
2 / 5               2,000   100       -
2 Combined          5,250   1,090.9   7,788
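The “Combined” rows can be reproduced with a short Python check, assuming the component Normals are independent so that means add and variances add. Independence is an assumption here, though it is consistent with the combined standard deviations in the table.

```python
from math import sqrt
from statistics import NormalDist

def combined_percentile(params, p=0.99):
    """Mean, standard deviation and p-th percentile of a sum of
    independent Normals given as (mean, std dev) pairs."""
    mean = sum(m for m, _ in params)
    sd = sqrt(sum(s * s for _, s in params))
    return mean, sd, NormalDist(mean, sd).inv_cdf(p)

test1 = [(2_000, 750), (1_000, 500)]
test2 = [(1_000, 300), (1_000, 800), (500, 300), (750, 600), (2_000, 100)]

for name, params in (("Test 1", test1), ("Test 2", test2)):
    mean, sd, pct = combined_percentile(params)
    print(name, mean, round(sd, 1), round(pct))
```

These closed-form percentiles are the benchmarks against which the simulated LDP and RAND() values are compared.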

Performance Test #4: 99th Percentile of Sum of Normals

Table 11 - 99th Pctle of Sum of 2 Normals

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
250               5,084       -0.25%        4,800          -5.82%
728               5,036       -1.19%        4,898          -3.91%
1,000             4,995       -2.00%        4,934          -3.19%
1,500             5,047       -0.98%        4,989          -2.12%
2,186             5,070       -0.52%        4,967          -2.55%

Performance Test #4: 99th Percentile of Sum of Normals

Table 12 - 99th Pctle of Sum of 5 Normals

# of Iterations   LDP Value   LDP % Error   RAND() Value   RAND() % Error
342               7,661       -1.63%        7,524          -3.38%
1,000             7,808       0.26%         7,653          -1.73%
1,500             7,808       0.26%         7,650          -1.76%
2,400             7,804       0.21%         7,703          -1.09%

Performance Test #5: Mixed Bag

• Sum of 5 each from:

  • LogNormal
  • Pareto
  • Uniform
  • Normal

• Testing variability of estimates over 10 runs

Performance Test #5: Mixed Bag

Table 14 - Avg % Error and Std Dev of % Error over 10 runs

# of Iterations   LDP Average % Error   LDP Std Dev of % Error   RAND() Average % Error   RAND() Std Dev of % Error
250               -10.39%               0.33%                    -0.36%                   5.51%
500               -2.28%                0.71%                    -3.03%                   7.79%
1,000             -0.47%                1.36%                    -0.76%                   4.39%
1,500             -0.41%                0.69%                    -0.67%                   4.62%
2,000             -0.41%                0.62%                    -1.40%                   4.01%
3,000             -0.72%                0.47%                    -1.17%                   2.79%

Possible Concerns in Using LDPs

• Unused Dimensions:

  • Example: modeling Excess Claims

  • # of Excess claims between 0 and 30

    • requires 30 dimensions

  • If # claims < 30, are the “used” dimensions still filled out with low discrepancy?

  • Dr. Tom?


• Time Series:

  • Example: Probability of 2 consecutive years of loss ratio exceeding 75%

  • How many dimensions is this problem?

  • Can’t use a single dimension of LDPs, because they are sequentially dependent

  • Need to know “over how many years”, then set dimensions


• Correlation:

  • If two variables are

    • 100% correlated ==> 1 dimension
    • 0% correlated ==> 2 dimensions
    • x% correlated ==> ? dimensions

  • Is promise of “low discrepancy” still fulfilled?

  • How to implement?


• Loop Boundaries:

  • Faure algorithm fills out space sequentially in ever-expanding loops of ever-finer granularity

  • If iteration count does not finish on a loop boundary (depends on Prime), there may be potential bias...

  • See Appendix B of paper
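One reading of “loop boundary”, assumed here rather than taken from the paper, is an iteration count of the form Prime^k - 1 when iterations are numbered from 1. Several iteration counts used in the test tables (24, 728, 2,186, 342 and 2,400) fit that pattern for primes 5, 3 and 7. A small Python sketch lists the candidate stopping points; `loop_boundaries` is a hypothetical helper.

```python
def loop_boundaries(prime, max_iterations):
    """Iteration counts of the form prime**k - 1 (k >= 1) up to max_iterations.
    Assumption: these are the points at which a Faure 'loop' completes."""
    boundaries = []
    power = prime
    while power - 1 <= max_iterations:
        boundaries.append(power - 1)
        power *= prime
    return boundaries

for prime in (3, 5, 7):
    print(prime, loop_boundaries(prime, 2_500))
```

Under this assumption, an iteration count such as 1,000 or 1,500 stops partway through a loop, which is the situation the bullet above flags as a possible source of bias.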
