Detecting Fraud Using Benford’s Law

Upload: mckenzie-english

Post on 01-Jan-2016


BENFORD’S LAW

Simon Newcomb

Benford’s Law

Number  Log     Formula
10      1.0000  LOG(10)
11      1.0414  LOG(11)
12      1.0792  LOG(12)
13      1.1139  LOG(13)
14      1.1461  LOG(14)
15      1.1761  LOG(15)
16      1.2041  LOG(16)
17      1.2304  LOG(17)
18      1.2553  LOG(18)
19      1.2788  LOG(19)
20      1.3010  LOG(20)

Logarithm Example

Multiply 320 by 417 (Answer 133,440)

Log(320) = 2.50515

Log(417) = 2.620136

Log (320) + Log (417) = 5.125286

10^5.125286 = 133,440
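The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch; the function name `multiply_via_logs` is made up for illustration and is not part of the presentation.

```python
import math


def multiply_via_logs(a: float, b: float) -> float:
    """Multiply two positive numbers by adding their base-10 logarithms."""
    log_sum = math.log10(a) + math.log10(b)
    return 10 ** log_sum


# Same numbers as the slide: 320 x 417.
log_sum = math.log10(320) + math.log10(417)
product = multiply_via_logs(320, 417)

print(round(log_sum, 6))  # sum of the two logarithms
print(round(product))     # product recovered from the sum
```

Adding the logs and raising 10 to the sum recovers the product, which is how printed log tables were used for multiplication.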

Note on the Frequency of Use of the Different Digits in Natural Numbers

Theory: “A multi-digit number is more likely to begin with ‘1’ than any other number.” In other words, these are probably the most faded numbers on our calculators.

Newcomb’s Shortcoming

He failed to provide a reason why his theory and formula worked!

Frank Benford

Noted the same phenomenon as Newcomb, in exactly the same manner, in the late 1920s, and theorized that unless his friends had a predilection for low-digit numbers, there must be a reason to explain it.

Benford Tests

Analyzed 20,229 numbers drawn from 20 data sets, including areas of rivers, baseball averages, numbers in magazine articles, atomic weights, electricity bills in the Solomon Islands, etc.

Benford’s Conclusion

• Multi-digit numbers beginning with 1, 2 or 3 appear more frequently than multi-digit numbers beginning with 4, 5, 6, etc.

• The frequency with which these digits appear in nature was published in “The Law of Anomalous Numbers.”

Percentages

Digit   Position in Number
        1st     2nd     3rd
1       .301    .113    .1013
2       .176    .108    .1009
3       .124    .104    .1005
4       .096    .100    .1001
5       .079    .096    .0997
6       .066    .093    .0994
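The proportions in this table can be reproduced from the logarithmic formula usually associated with Benford’s Law, P(first digit = d) = log10(1 + 1/d), summed over the preceding digits for later positions. The sketch below is an illustration only; the function name `digit_probability` is an assumption, and the slide’s figures appear to be truncated rather than rounded, so the last printed decimal can differ by one.

```python
import math


def digit_probability(digit: int, position: int) -> float:
    """Expected probability that `digit` appears at `position` (1 = leading digit)."""
    if position == 1:
        if digit == 0:
            return 0.0  # a leading digit is never zero
        return math.log10(1 + 1 / digit)
    # Sum over every possible value of the preceding (position - 1) digits.
    start = 10 ** (position - 2)
    stop = 10 ** (position - 1)
    return sum(
        math.log10(1 + 1 / (10 * prefix + digit)) for prefix in range(start, stop)
    )


for d in range(1, 7):
    print(d, [round(digit_probability(d, pos), 4) for pos in (1, 2, 3)])
```

The further right the position, the closer every digit gets to an even 10% share, which is why the first-digit column shows the strongest skew.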

Percentages

                    First Digit
                    1       2       3
Area Rivers         31      16.4    10.7
Populations         33.9    20.4    14.2
Newspapers          30      18      12
Pressure            29.6    18.3    12.8
Mol. Weight         26.7    25.2    15.4
Atomic Weight       47.2    18.7    5.5
X-Ray Volts         27.9    17.5    14.4
Batting Averages    32.7    17.6    12.6
Death Rate          27      18.6    15.7
Average             30.6    18.5    12.4
Probable Error      0.8     0.4     0.4

Conclusion Cont.

• The digit 1 predominates at every step of most progressions.

• Stock market example: Assume a 20% annual return on a $1,000 investment. It takes roughly 4 years for the investment to go from $1,000 to $2,000, roughly 2 years to go from $2,000 to $3,000, and roughly 1.5 years to go from $3,000 to $4,000. Before long the leading digit starts over at 1, at $10,000.

Conclusion Cont.

Months in which Investment ranged between:

$1,000 and $1,999 41 29.50%

$2,000 and $2,999 25 17.99%

$3,000 and $3,999 17 12.23%

$4,000 and $4,999 14 10.07%

$5,000 and $5,999 11 7.91%

$6,000 and $6,999 9 6.47%

$7,000 and $7,999 8 5.76%

$8,000 and $8,999 7 5.04%

$9,000 and $9,999 7 5.04%
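The month counts above can be approximated with a short simulation. This is a sketch under an assumption the slides do not state: the 20% annual return is applied as simple monthly compounding (20% / 12 per month), and a month is counted by the leading digit of its month-end balance. The function name is hypothetical.

```python
import math


def months_per_leading_digit(start=1000.0, annual_rate=0.20, ceiling=10000.0):
    """Count month-end balances by leading digit while growing from start to ceiling."""
    monthly_growth = 1 + annual_rate / 12  # assumption: simple monthly compounding
    counts = {d: 0 for d in range(1, 10)}
    balance = start
    while True:
        balance *= monthly_growth
        if balance >= ceiling:
            break
        # Leading digit: scale the balance into the range [1, 10) and truncate.
        leading = int(balance / 10 ** math.floor(math.log10(balance)))
        counts[leading] += 1
    return counts


counts = months_per_leading_digit()
total_months = sum(counts.values())
for digit, months in counts.items():
    print(digit, months, f"{months / total_months:.2%}")
```

Because growth is proportional to the balance, the investment lingers longest while its leading digit is 1 and passes through the higher digits progressively faster.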

Stock Market Example

• Sample of 12,022 stock market quotes from the Wall Street Journal.

         Actual     Expected   Actual     Expected
         Frequency  Frequency  Frequency  Frequency  Difference
Digit 1  3364       3619       27.98%     30.10%     -2.12%
Digit 2  1554       2116       12.93%     17.60%     -4.67%
Digit 3  1182       1502       9.83%      12.49%     -2.66%
Digit 4  1240       1165       10.31%     9.69%      0.62%
Digit 5  1026       952        8.53%      7.92%      0.61%
Digit 6  1103       804        9.17%      6.69%      2.48%
Digit 7  897        697        7.46%      5.80%      1.66%
Digit 8  820        616        6.82%      5.12%      1.70%
Digit 9  836        551        6.95%      4.58%      2.37%
Total    12,022
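One way to summarize how far the actual counts sit from the expected ones is a chi-square goodness-of-fit statistic. The slides do not name a test, so this choice is an assumption made for illustration; the critical value 15.507 is the standard 5% cutoff for 8 degrees of freedom.

```python
import math

# Actual first-digit counts copied from the table above (digits 1 through 9).
actual = [3364, 1554, 1182, 1240, 1026, 1103, 897, 820, 836]
total = sum(actual)

# Expected counts under Benford's Law: total * log10(1 + 1/d).
expected = [total * math.log10(1 + 1 / d) for d in range(1, 10)]

# Chi-square statistic: sum of squared gaps, each scaled by its expected count.
chi_square = sum((a - e) ** 2 / e for a, e in zip(actual, expected))

CRITICAL_5PCT_DF8 = 15.507  # chi-square critical value, 8 degrees of freedom, 5% level
print(total, round(chi_square, 1), chi_square > CRITICAL_5PCT_DF8)
```

A statistic far above the cutoff says only that the sample departs from the expected proportions; it does not say why.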


Newcomb vs. Benford

• Benford also did not have an explanation for this phenomenon; however, at least he had evidence that demonstrated the law’s ubiquity.

• The theory remained unchallenged, but failed to generate any publicity.

1961

• Research revealed that Benford’s probabilities are scale invariant; therefore, it doesn’t matter if the numbers are denominated in dollars, yen, marks, pesos, rubles, etc.
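Scale invariance can be illustrated with simulated data. This is a sketch only: the amounts are synthetic (spread evenly on a log scale, which follows Benford’s Law), and the conversion factor 7.3 is an arbitrary, made-up exchange rate.

```python
import random
from collections import Counter


def first_digit_shares(values):
    """Share of values whose leading significant digit is 1 through 9."""
    # Scientific notation ("3.030000e+02") puts the leading digit first.
    counts = Counter(int(f"{v:e}"[0]) for v in values)
    return [counts[d] / len(values) for d in range(1, 10)]


random.seed(0)
# Assumption: synthetic amounts between 1 and 1,000,000, uniform on a log scale.
dollars = [10 ** random.uniform(0, 6) for _ in range(100_000)]
other_currency = [v * 7.3 for v in dollars]  # hypothetical exchange rate

usd_shares = first_digit_shares(dollars)
fx_shares = first_digit_shares(other_currency)
for d, (a, b) in enumerate(zip(usd_shares, fx_shares), start=1):
    print(d, f"{a:.3f}", f"{b:.3f}")
```

Both columns should stay close to the same proportions, since changing the unit only shifts every value by the same amount on the log scale.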

Betting

• Other than checking the reasonableness of financial forecasts, the main use of Benford’s Law was making money by betting with unsuspecting friends.

Mark Nigrini

• In 1992, Nigrini published a thesis noting that Benford’s Law could be used to detect fraud.

How Does This Help Us?

• Because human choices are not random, invented numbers are unlikely to follow Benford’s Law; i.e., when people invent numbers, their digit patterns (which have been artificially added to a list of true numbers) will cause the data set to appear unnatural.

Source: Mark Nigrini

Five Major Digit Tests

• 1st digit test
• 2nd digit test
• First two digits
• First three digits
• Last two digits

Source: Mark Nigrini
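Each of the five tests groups amounts by a different slice of their digits. The sketch below shows one way to pull those slices out of an amount written as text; the function names are hypothetical, and taking the last two digits from the amount as written (cents included) is an assumption, not something the slides specify.

```python
def significant_digits(amount: str) -> str:
    """Digits of a positive amount without currency sign, separators, or leading zeros."""
    return "".join(ch for ch in amount if ch.isdigit()).lstrip("0")


def digit_test_keys(amount: str) -> dict:
    """Return the digit slice each of the five tests would group this amount by."""
    digits = significant_digits(amount)
    return {
        "first": digits[:1],
        "second": digits[1:2],
        "first_two": digits[:2],
        "first_three": digits[:3],
        "last_two": digits[-2:],
    }


print(digit_test_keys("$4,999.50"))
```

Counting how often each key occurs across a population gives the actual frequencies that are then compared with the expected ones.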

First Digit Test

• High-level test
• Will only identify the blinding glimpse of the obvious
• Should not be used to select audit samples, as the sample size will be too large

Source: Mark Nigrini

Second Digit Test

• Also a high-level test
• Used to identify conformity
• Should not be used to select audit samples

Source: Mark Nigrini

First Two Digits Test

• More focused
• Identifies manifest deviations for further review
• Can be used to select audit targets for preliminary review

Source: Mark Nigrini
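A first-two-digits comparison can be sketched as follows. The data here is hypothetical: 5,000 synthetic Benford-like amounts plus 200 copies of a single repeated $303.00 payment, so that one combination (30) is deliberately overrepresented. Ranking by the absolute gap between actual and expected share is a simplification chosen for this illustration.

```python
import math
import random
from collections import Counter


def first_two(value: float) -> int:
    """First two significant digits of a positive number, e.g. 303.0 -> 30."""
    s = f"{value:e}"         # e.g. "3.030000e+02"
    return int(s[0] + s[2])  # leading digit plus the digit after the point


def first_two_deviations(values):
    """Return (first_two, actual_share, expected_share) rows, largest gap first."""
    counts = Counter(first_two(v) for v in values)
    n = len(values)
    rows = [(k, counts[k] / n, math.log10(1 + 1 / k)) for k in range(10, 100)]
    return sorted(rows, key=lambda r: abs(r[1] - r[2]), reverse=True)


random.seed(1)
# Hypothetical population: Benford-like amounts plus one heavily repeated payment.
amounts = [10 ** random.uniform(1, 5) for _ in range(5000)] + [303.00] * 200

top = first_two_deviations(amounts)[:3]
for digits, actual_share, expected_share in top:
    print(digits, f"{actual_share:.3%}", f"{expected_share:.3%}")
```

The combinations at the top of the list are candidates for preliminary review; a spike may turn out to have an ordinary explanation, such as a recurring contract payment.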

First Three Digits Test

• Highly focused
• Used to select audit samples
• Tends to identify number duplication

Source: Mark Nigrini

Last Two Digits Test

• Used to identify invented (overused) and rounded numbers

• The expected proportion of each possible last-two-digit combination is .01

Source: Mark Nigrini
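The tally behind the last-two-digits test can be sketched in a few lines. The amounts below are hypothetical whole-dollar figures, and there are far too few of them for a real test; they are shown only to illustrate the counting.

```python
from collections import Counter

EXPECTED_SHARE = 0.01  # each of the 100 endings "00" through "99" is expected equally often


def last_two_shares(whole_amounts):
    """Share of each last-two-digit ending among whole-number amounts, most common first."""
    counts = Counter(f"{abs(a) % 100:02d}" for a in whole_amounts)
    n = len(whole_amounts)
    return {ending: c / n for ending, c in counts.most_common()}


# Hypothetical amounts, not taken from the presentation.
sample = [1200, 450, 999, 1500, 730, 2099, 300, 815]
shares = last_two_shares(sample)

for ending, share in shares.items():
    print(ending, f"{share:.1%}", "expected", f"{EXPECTED_SHARE:.1%}")
```

In a large population, endings such as 00 that sit well above 1% point to rounded or overused numbers.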

Not All Data Conforms!

• The data set should describe similar data (e.g., populations of towns)

• Artificial limits should not exist (e.g., no minimum sale amount)

• The data can’t consist of pre-arranged numbers (SSNs, telephone numbers)

• The data should consist of more small items than large items

Not All Data Conforms

• The data should not be a subset of a set
• Does not work if data has been aggregated, e.g., daily deposits are combined and recorded weekly
• Data should relate to a specific period
• The data population should be large enough so that the proportions can manifest themselves

Fraud Cases

• What you will generally see:
• The fraudster starts out small, then increases the dollar amount. The amounts will be just below a limit that requires further review. The numbers will not follow the expected digit pattern. The amounts will not be rounded, and certain digit patterns will be repeated.

Source: Mark Nigrini

Example

• Examined over 1,000 cash disbursements (entire population) during the year (amounts over $500 required 2 signatures and amounts over $5,000 required competitive bids).

• Sample is on next slide

Example

Amount Description Check No.

$225.95 SEIU - LU 82 ED ASSES FUND 6/98. 4001

$1,212.97 SCHINDLER ELEV CORP JUN 98. 4002

$4,999.50 YORK INT CORP - 7/98-9/98. 4003

$339.13 US FOODSERVICE 10/29/98. 4004

$473.98 VIRGINIA DEPT OF TAXATION JUNE '98 4005

$250.81 W W GRAINGER INC - SUPPLIES 4006

$504.00 LJC LIGHTING SUPPLY - LIGHT BULBS. 4007

$171.70 CLERK, DC SUPERIOR COURT 12/25/98. 4008

$225.15 SEIU - SEIU LU 82 ED ASSES FD 9/98. 4009

$477.26 VIRGINIA DEPT OF TAXATION -1998. 4010
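Amounts sitting just under an approval limit can be picked out with a simple filter. This sketch uses the ten sample amounts above and the $500 and $5,000 limits stated in the example; treating "just below" as within 10% of a limit is an assumption made here, not a rule from the presentation.

```python
from decimal import Decimal

LIMITS = [Decimal("500"), Decimal("5000")]  # thresholds stated in the example
MARGIN = Decimal("0.10")                    # assumption: "just below" means within 10%

# The ten sample disbursement amounts from the slide, in check-number order.
amounts = ["225.95", "1212.97", "4999.50", "339.13", "473.98",
           "250.81", "504.00", "171.70", "225.15", "477.26"]


def just_below_limit(amount: Decimal) -> bool:
    """True when the amount falls at or within MARGIN below any limit."""
    return any(limit * (1 - MARGIN) <= amount <= limit for limit in LIMITS)


flagged = [a for a in amounts if just_below_limit(Decimal(a))]
print(flagged)
```

A flagged amount is only a prompt for a closer look at the supporting documents, not evidence of fraud by itself.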

Expected First Digit Frequency

[Chart: expected frequency of each first digit, 1 through 9; vertical axis 0% to 50%]

Actual First Digit Frequency

[Chart: actual frequency of each first digit, 1 through 9; vertical axis 0% to 50%]

Expected First and Second Digit Frequency

[Chart: expected frequency of each first-two-digit combination; horizontal axis ticks from 10 to 94; vertical axis 0% to 8%]

Actual First and Second Digit Frequency

[Chart: actual frequency of each first-two-digit combination; same axes; spikes are labeled at 13, 30, 47-50, 77, 87 and 93]

Regular payroll garnishment.

Monthly supply contract for $303.

Maint. Contract.

Kay Grogan Food/Bev. Company uses ARAMARK.

Pest Control.

Possible structuring to avoid authorization thresholds.

Applying Benford’s Law

• Income tax agencies.
• Audits of Accounts Payable (I/A, Ext. Auditors, Fraud Examiners, etc.).
• Expense reimbursements.

Who Uses This

• US West, Sprint, Colgate, P&G, Nortel, American Airlines, United Airlines, Ameritech, Lockheed Martin, KPMG, ARCO, State of Texas.

Source: Mark Nigrini

Cost of Data Analysis Software

• $245 for 13 programs which run on Excel 97 or Excel 2000.

• $795 for all programs. Works with ACL and Idea.

Source: Mark Nigrini

Caution

• Does not work with lottery numbers.
• May not work for certain types of expenses in which documentation is not required for expenses under a certain category.
• Authorization levels.

Caution

• It only works with natural numbers (those numbers that are not ordered in a particular numbering scheme, e.g., telephone numbers, social security numbers).

Caution

• The sample should be large enough so that the predicted proportions can assert themselves, and the data should be free of artificial limits. For example, don’t analyze the prices of 10 different types of beer, as the sample is small and the prices are forced by competition to stay within a narrow range.

Summary

• Benford’s Law provides a data analysis method that can help alert us to possible errors, biases, potential fraud, costly processing inefficiencies or other irregularities.

Articles

• Journal of Accountancy (5/99)
• New Scientist (7/99)
• Internal Auditor (2/99)
• Inside Fraud Bulletin (3/99)
• Auditing: A Journal of Practice & Theory (Fall of 1997)

Articles Continued

• White Paper (4/94)
• White Paper (9/99)
• New York Times (8/4/98)
• Information Technology (9/97)

Web Sites

• www.doc.ic.ac.uk
• www.maximag.co.uk/bull701.htm
• www.Nigrini.com/Benford’s_law

Web Sites

• Benford’s Law
• Digital Analysis
• Fraud Detection
• Analytical Procedures

Books

• Digital Analysis Using Benford’s Law (Mark Nigrini)