Page 1: Analysis and propagation of errors

1

Peter Fox

GIS for Science

ERTH 4750 (98271)

Week 8, Tuesday, March 20, 2012

Analysis and propagation of errors

Page 2: Analysis and propagation of errors

Contents

• Error!!!

• Projects

• Lab assignment on Friday

2

Page 3: Analysis and propagation of errors

Spatial analysis of continuous fields

• Possibly more important than our answer is our confidence in the answer.

• Our confidence is quantified by uncertainties as discussed earlier.

• Once we combine numbers, we need to be able to assess how the uncertainties change for the combination.

• This is called propagation of errors or, more correctly, the propagation of our understanding/estimate of the errors in the result we are looking at.

3

Page 4: Analysis and propagation of errors

Types of errors

• Mistakes

• Natural variation

• Systematic and random equipment problems

• Data collection methods

• Observer diligence

• Location errors/accuracy

• Rasterizing and digitizing

• Mismatch of data collected by different methods (e.g., seafloor bathymetry)

4

Page 5: Analysis and propagation of errors

Bathymetry

5

Page 6: Analysis and propagation of errors

Cause of errors?

6

Page 7: Analysis and propagation of errors

Resolution

7

Page 8: Analysis and propagation of errors

Reliability

• Changes in data over time

• Non-uniform coverage

• Map scales

• Observation density

• Sampling theorem (aliasing)

• Surrogate data and their relevance

• Round-off errors in computers

8

Page 9: Analysis and propagation of errors

Error propagation

• Errors arise from data quality, model quality and data/model interaction.

• We need to know the sources of the errors and how they propagate through our model.

• Simplest representation of errors is to treat observations/attributes as statistical data – use mean and standard deviation.

9

Page 10: Analysis and propagation of errors

Analytic approaches

10

Addition and subtraction

Page 11: Analysis and propagation of errors

Multiply, divide, exponent, log
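
The formulas shown on these two slides did not survive in this transcript. As a hedged sketch, the standard first-order rules for independent (uncorrelated) errors can be written as below. The function names are illustrative and are not taken from the lecture, and the product and quotient rules assume nonzero values of A and B.

```python
import math

def sigma_sum(sigma_a, sigma_b):
    """Std. dev. of U = A + B or U = A - B: absolute errors add in quadrature."""
    return math.sqrt(sigma_a ** 2 + sigma_b ** 2)

def sigma_product(a, sigma_a, b, sigma_b):
    """Std. dev. of U = A * B: relative errors add in quadrature (a, b nonzero)."""
    u = a * b
    return abs(u) * math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)

def sigma_quotient(a, sigma_a, b, sigma_b):
    """Std. dev. of U = A / B: same relative-error rule as the product."""
    u = a / b
    return abs(u) * math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)

def sigma_power(a, sigma_a, n):
    """Std. dev. of U = A**n: sigma_U = |n * A**(n-1)| * sigma_A."""
    return abs(n * a ** (n - 1)) * sigma_a

def sigma_log(a, sigma_a):
    """Std. dev. of U = ln(A): sigma_U = sigma_A / |A|."""
    return sigma_a / abs(a)
```

These rules are first-order approximations, so they are most trustworthy when each standard deviation is small compared with its value; correlated attributes would need an extra covariance term that is not included here.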

11

Page 12: Analysis and propagation of errors

Monte Carlo simulation

• If a new attribute U is given by U = f(A1, A2, A3, …, An), where the A’s are attributes and f represents some function combining them, then we want to know: what is the standard deviation of the combination U, and how does the standard deviation of each A contribute to it?

• By MC simulation we look at the statistical distribution of a lot of realizations (random samples) of U.

12

Page 13: Analysis and propagation of errors

MC (ctd)

• A single realization of U is Ui = f(R1, R2, R3, …, Rn), where each Rj is a random sample of its corresponding attribute Aj based on the statistical properties (mean and standard deviation, for example) of Aj.

• The probability functions for the attributes themselves need not be Gaussian and could even be taken from histograms of observed values.
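
A minimal sketch of this procedure, assuming every attribute is Gaussian with a known mean and standard deviation (the slides note that other distributions are possible). The function name monte_carlo, the example function and all numbers are hypothetical, chosen only for illustration.

```python
import random
import statistics

def monte_carlo(f, means, sigmas, n_realizations=10000, seed=0):
    """Return a list of realizations Ui = f(R1, ..., Rn).

    Each Rj is drawn from a normal distribution with the mean and
    standard deviation of attribute Aj (an assumption of this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    realizations = []
    for _ in range(n_realizations):
        samples = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        realizations.append(f(*samples))
    return realizations

# Hypothetical example: U = A1 * A2 + A3
u = monte_carlo(lambda a1, a2, a3: a1 * a2 + a3,
                means=[10.0, 2.0, 5.0],
                sigmas=[0.5, 0.1, 1.0])
u_mean = statistics.mean(u)
u_std = statistics.stdev(u)  # sample standard deviation (N - 1 divisor)
```

To see how much one attribute contributes, a simple option is to rerun the simulation with that attribute's standard deviation set to zero and compare the resulting spread of U.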

13

Page 14: Analysis and propagation of errors

Recall…

• The mean m and standard deviation s of U are estimated by

– $m = \frac{1}{N} \sum_{i=1}^{N} U_i$

– $s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (U_i - m)^2$

• where N is a very large number of realizations (hundreds or thousands).
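
As a small sketch, the two estimators can be coded directly from their definitions. The function name is illustrative; in practice Python's statistics.mean and statistics.stdev compute the same quantities.

```python
def sample_mean_and_std(values):
    """Return (m, s) for a list of realizations Ui.

    m is the arithmetic mean; s is the square root of the sample
    variance, which divides by N - 1 rather than N.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two realizations")
    m = sum(values) / n
    s2 = sum((u - m) ** 2 for u in values) / (n - 1)
    return m, s2 ** 0.5
```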

14

Page 15: Analysis and propagation of errors

When to use?

• MC simulation is most useful when the function relating the attributes is complex or the statistical distribution is known only empirically (from a histogram, for example).

• For simpler combinations of attributes, there are easier, direct (analytical) ways to estimate the new uncertainties from the attribute uncertainties.

15

Page 16: Analysis and propagation of errors

Generating pseudo random numbers

• For the Monte Carlo simulation, you will want to generate a series of random numbers with a normal (bell-curve) distribution.

• There are 2 ways to do this in Excel.

• In older versions of Excel, you can use Tools > Data Analysis > Random number generation > Normal distribution to generate a sequence of random numbers.

16

Page 17: Analysis and propagation of errors

Second way

• Or, you can take advantage of the central limit theorem, which states that, under certain conditions, the sum (or mean) of many independent random samples from any distribution is approximately normally distributed.

• The Excel function RAND generates a uniformly distributed random number, that is, the probability is the same for any number between 0 and 1 to be generated.

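• To get an approximately normally distributed random sample with mean of 0 and standard deviation of 1, we can simply add 12 uniformly distributed random numbers and subtract 6. (Each uniform number has mean 1/2 and variance 1/12, so the sum of 12 has mean 6 and variance 1.)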

17

Page 18: Analysis and propagation of errors

• To get a normally distributed random sample with mean of m and standard deviation of s we use:

• $\left[ \sum_{i=1}^{12} \mathrm{RAND}() - 6 \right] \cdot s + m$

• In Matlab – rand (uniform), randn (normal)

• In R – runif, rnorm, set.seed, sample
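
The same sum-of-12-uniforms recipe can be sketched in Python, with random.random() standing in for Excel's RAND(). The function name approx_normal and the example mean and standard deviation are assumptions made for illustration.

```python
import random

def approx_normal(m=0.0, s=1.0, rng=random):
    """Approximate normal sample: (sum of 12 uniforms - 6) * s + m.

    The result can never fall more than 6 standard deviations from m,
    so the tails are cut off compared with a true normal distribution.
    """
    total = sum(rng.random() for _ in range(12))
    return (total - 6.0) * s + m

rng = random.Random(42)  # seeded generator so the run is repeatable
samples = [approx_normal(m=100.0, s=15.0, rng=rng) for _ in range(20000)]
```

Python also provides random.gauss(m, s), which draws from a normal distribution directly; the version above is shown only because it mirrors the spreadsheet formula.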

18

Page 19: Analysis and propagation of errors

Tip

• Because this expression is quite long in Excel, you can create a macro to facilitate using it again and again.

• To record a macro, select Tools > Macro > Record new macro > give name to the macro > ok > type in expression > Stop recording.

• You can refer to re-named cells from within a macro, so you might want to use variable names for the mean and standard deviation to keep your macro general.

19

Page 20: Analysis and propagation of errors

Shortcuts

• You can also specify a Control-key to run the macro from the worksheet. Otherwise, to run the macro, go to Tools > Macro > Macros > select the macro name and press Run.

• Once the macro is run in a cell, you can drag the expression to other cells using the drag handle in the lower-right corner of the cell.

20

Page 21: Analysis and propagation of errors

Statistical ‘tests’

• F-test: tests whether two samples come from populations with the same variance, based on the ratio of their sample variances and their degrees of freedom.

• T-test: tests whether two samples come from populations with the same mean, based on the difference in their sample means, their variances and degrees of freedom.

21

Page 22: Analysis and propagation of errors

F-test

22

$F = S_1^2 / S_2^2$, where $S_1^2$ and $S_2^2$ are the sample variances.

The more this ratio deviates from 1, the stronger the evidence for unequal population variances.
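
A minimal sketch of computing this ratio from two samples. The function name f_ratio is illustrative; deciding whether the ratio is significant still requires comparing it with a critical value of the F distribution for the two degrees of freedom, which this sketch does not do.

```python
import statistics

def f_ratio(sample1, sample2):
    """Return (F, dof1, dof2) for two samples.

    F is the ratio of the sample variances (N - 1 divisor) and each
    degrees-of-freedom value is the sample size minus one.
    """
    var1 = statistics.variance(sample1)
    var2 = statistics.variance(sample2)
    return var1 / var2, len(sample1) - 1, len(sample2) - 1
```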

Page 23: Analysis and propagation of errors

T-test

23

Page 24: Analysis and propagation of errors

Variability

24

Page 25: Analysis and propagation of errors

Dealing with errors

• In analyses:

– report on the statistical properties

– does it pass tests at some confidence level?

• On maps:

– exclude data that are not reliable (map only a subset of the data)

– show an additional map of some measure of confidence
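
The first map option can be sketched as a simple mask, assuming each mapped value comes with an error estimate and that a maximum acceptable error has been chosen. The function name, the threshold and the use of None for blanked-out cells are all assumptions of this sketch, not something specified in the lecture.

```python
def mask_unreliable(values, errors, max_error):
    """Keep values whose error is within max_error; blank the rest.

    Blanked entries are returned as None so a plotting step can leave
    them empty on the map.
    """
    return [v if e <= max_error else None
            for v, e in zip(values, errors)]
```

For example, mask_unreliable([10.0, 12.0, 9.5], [0.5, 3.0, 1.0], 1.0) keeps the first and third values and blanks the second. The errors list itself is what the second option, a separate map of confidence, would display.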

25

Page 26: Analysis and propagation of errors

Elevation map (figure; units: meters)

26

Page 27: Analysis and propagation of errors

Larger errors ‘whited out’ (figure; units: m)

27

Page 28: Analysis and propagation of errors

Elevation errors (figure; units: meters)

28

Page 29: Analysis and propagation of errors

Contaminants

29

Page 30: Analysis and propagation of errors

Regions with errors ‘whited out’

30

Page 31: Analysis and propagation of errors

Map of errors

31

Page 32: Analysis and propagation of errors

Summary

• Topics for GIS (for Science)

– Estimating, propagating and displaying error considerations

• For learning purposes remember:

– Demonstrate proficiency in using geospatial applications and tools (commercial and open-source).

– Present verbally relational analysis and interpretation of a variety of spatial data on maps.

– Demonstrate skill in applying database concepts to build and manipulate a spatial database, SQL, spatial queries, and integration of graphic and tabular data.

– Demonstrate intermediate knowledge of geospatial analysis methods and their applications.

32

Page 33: Analysis and propagation of errors

Friday Mar. 23

• Lab assignment session – three problems, posted around Wednesday

• Complete them in class, get signed off before leaving

• 10% of grade

33

Page 34: Analysis and propagation of errors

Reading for this week

• http://www.chemtopics.com/aplab/errors.pdf

• http://www.nuim.ie/staff/dpringle/gis/gis11.pdf

34

Page 35: Analysis and propagation of errors

Next classes

• Friday, March 23 – lab with material from week 7 (lab assignment 10%)

• Tuesday, March 27 – Using uncertainties, working with discrete entity types

• Note March 30 – open lab (no assignment, work on your projects, get help from Max), attendance will be taken

35

