
Page 1: Student_Garden_geostatistics_course

Student Garden Geostatistics course

1

Page 2: Student_Garden_geostatistics_course

INDEX

• Types and purpose of data

• Point data (s.3)

• Point data file format (s.4)

• Viewing point data (s.5-7)

• Grid data (s.8)

• Grid data file format and viewing (s.9-10)

• Grid spatial parameters (s.11-12)

• Statistical parameters

• Basic parameters (s.13)

• Mean (s.14)

• Variance (s.15)

• Percentile (s.16)

• Univariate analysis

• Histogram (s.17)

• Boxplot (s.18)

• Lineplot (s.19)

• Bivariate analysis

• Scatterplot (s.20)

• Correlation and regression (s.21)

• Stereonet (s.22)

• Special scatterplots (s.23)

• Spatial estimation

• Purpose (s.24)

• Nearest neighbor (s.25)

• Inverse weighted distance (s.26)

• Variography

• Anisotropy (s.27)

• Building a variogram (s.28-31)

• Kriging

• Simple kriging (s.32-35)

• Ordinary kriging (s.36)

• Sequential simulation

• Uncertainty (s.37-38)

• Random walk (s.39)

• Node value as hard value (s.40)

• Probability function generation (s.41-42)

• Procedures (s.43)

• Sequential Gaussian Simulation (s.44)

• Direct Sequential Simulation (s.45)

Page 3: Student_Garden_geostatistics_course

INDEX

• Simulation post-processing

• Getting mean and variance of simulations (s.47)

• Co-located co-simulation

• When to use… (s.48)

• How to do… (s.49-51)

• Sequential indicator simulation

• Categorical data (s.52)

• Indicator function (s.53)

• Indicator variogram (s.54)

• How to do… (s.55)

• Indicator simulation post-processing

• Getting most-likely value and entropy of simulations (s.56)

• Stochastic Genetic procedures

• Genetic algorithms (s.57)

• Global stochastic inversion (s.58)

• Convolution (s.59-61)

• Objective function (s.62-65)

3

Page 4: Student_Garden_geostatistics_course

Types and purpose of data – point data

Place where a sample was gathered

Objective

• Study the dispersion of a contaminant

in the flooding areas of a river.

We’ve gathered samples in the areas

where flooding occurred and retrieved the

following variables:

• X coordinate

• Y coordinate

• Z coordinate

• Iron content

• Organic content

We call this point data (and also hard data, because it was retrieved with direct methods that result in a physical sample).

Page 5: Student_Garden_geostatistics_course

Types and purpose of data – point data file format

Flooding_contents_project
5
X
Y
Z
Iron_content
Organic_content
4.1  4     0.9  0.11  0.09
3.8  6.6   1.1  0.10  0.09
3.2  7.2   1.3  0.12  0.11
4.4  7.9   1.2  0.09  0.09
2.6  8.2   1.3  0.08  0.10
3.5  8.6   1.1  0.07  0.09
2.9  8.8   0.9  0.07  0.11
2.4  9.6   0.7  0.06  0.07
3.9  9.8   1.4  0.03  0.04
3.3  10.3  1.5  0.01  0.03

[Figure: X–Y plot of the ten sample locations, with one of the points highlighted both in the file and in the plot.]

This is an example of an ASCII (text) point-data file (GEOEAS format, because it has a header). On the right you can see the plot of the data in the file. One of the points is even highlighted both in the file and in the plot.

Page 6: Student_Garden_geostatistics_course

Types and purpose of data – viewing point data

We usually view variables as colors. Each

color indicates a specific range of values for

that variable. For this example we’ll use the

“Jet” color mapping (sometimes called

colorbar). Let’s view the iron content:

[The same Flooding_contents_project file as on the previous slide, with the Iron_content column highlighted.]

Colorbar limits: 0.01, 0.021, 0.032, 0.043, 0.054, 0.065, 0.076, 0.087, 0.098, 0.109, 0.12

[Figure: map of the sample locations colored by Iron_content using these limits.]

Page 7: Student_Garden_geostatistics_course

Types and purpose of data – viewing point data

We’ve used the colorbar but how do we build one? I want

to make a colorbar with 10 colors which means having 10

value ranges.

0.01 , 0.021, 0.032, 0.043, 0.054, 0.065, 0.076, 0.087, 0.098, 0.109, 0.12

a) I retrieve the minimum from the variable to be color

mapped: 0.01

b) I retrieve the maximum from the same variable: 0.12

c) I calculate the range between them: 0.12 – 0.01 = 0.11

d) I divide that range by 10 (because I want 10 colors):

0.11/10 = 0.011

e) Then I calculate the limits of each bin by adding the bin size to the upper limit of the previous bin (a short sketch of this follows the list).

f) 0.01 + 0.011 = 0.021, so the first bin is [0.01, 0.021[

g) 0.021 + 0.011 = 0.032, second bin is [0.021, 0.032[

h) 0.032 + 0.011 = 0.043, third bin is [0.032, 0.043[

i) And so on…, until the last bin
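For reference, a minimal Python sketch of steps a)–i), using the iron-content values from the point-data file (numpy is assumed to be available):

```python
# Compute the 10 bin edges of the colorbar exactly as in steps a)-i).
import numpy as np

iron = np.array([0.11, 0.10, 0.12, 0.09, 0.08, 0.07, 0.07, 0.06, 0.03, 0.01])

vmin, vmax = iron.min(), iron.max()     # a) 0.01, b) 0.12
step = (vmax - vmin) / 10               # c)-d) 0.11 / 10 = 0.011
edges = vmin + step * np.arange(11)     # e)-i) 0.01, 0.021, ..., 0.12
print(np.round(edges, 3))
```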

Each color is given by a RGB triplet (it

may be RGB-A but the last value is

transparency).

• R for red

• G for green

• B for blue

• (optional) A for alpha

It is quite common for software that lets the user choose colors to provide a color dialog where you insert the exact RGB triplet you want.

• Red is: 255;0;0 (255 is the max).

• Green is: 0;255;0

• Blue is: 0;0;255

• Purple is: 128;0;128

To build purple we need 128 parts in

255 of red, 0 of green, and 128 parts

in 255 of blue.

7

Page 8: Student_Garden_geostatistics_course

Types and purpose of data – viewing point data

There are many kinds of colorbar. Many have been developed for a specific purpose, like displaying a colored image in black and white, or getting the best contrast between positive values, negative values and zero values in a seismic cube. Like this:

[Figure: the "Seismic" colorbar, from min. to max.]

This color map is usually called “Seismic” colormap or

“RdBu” (for red to blue or blue to red). Notice that in the

seismic cube this colormap will show negatives in blue

colors, positives in red colors and near zero values in whites.

It gets very simple to retrieve the strength of the seismic

signal.

[Figure: the "Jet" colorbar, from min. to max.]

The "Jet" color map (sometimes called "rainbow"), on the other hand, was made to show a wider range of values easily while keeping a feeling of continuity.

RGB triplets of "Jet", from blue to red: [0 0 143], [0 0 239], [0 79 255], [0 175 255], [15 255 255], [111 255 159], [207 255 63], [255 223 0], [255 127 0], [255 31 0], [191 0 0]

8

Page 9: Student_Garden_geostatistics_course

Types and purpose of data – grid data

On the left you have a grid. In this case we

are viewing that grid as a surface but it is still

a grid. Let’s make a definition.

- A grid is a mesh of cells, each with its own

position, and its own value (or values if

multiple variables).

[Figure: a regular mesh of cells with columns x=1…4 and rows y=1…6.]

This is a regular rectangular mesh (a geostatistics grid with constant cell size along each axis), but there are other kinds of meshes.

Page 10: Student_Garden_geostatistics_course

Types and purpose of data – grid data

Let’s see some other examples of grids.

[Figure: example meshes — a regular grid (geostatistics), an irregular grid (cell size may change), and structured grids (the shape of the cell changes as well as its size).]

In geostatistics we usually use the regular grid with rectangular cells. However, it would also be possible to work with the other kinds of meshes.

10

Page 11: Student_Garden_geostatistics_course

Types and purpose of data – grid data file format and viewing

Flooding_contents_project
1
Iron_content
0.09
0.10
0.09
0.08
0.07
0.05
0.05
0.03
0.01

[Figure: a 3 × 3 grid (columns x=1…3, rows y=1…3) showing how the column of values in the file is laid out, row by row:
0.09 0.12 0.09
0.08 0.07 0.05
0.05 0.03 0.01]

This is an example of an ASCII (text) grid-data file (GEOEAS format, because it has a header). On the right you can see how the values in the file's single column are placed in the grid.

[Figure: the same 3 × 3 grid colored with the Jet colorbar, with limits 0.01, 0.021, 0.032, 0.043, 0.054, 0.065, 0.076, 0.087, 0.098, 0.109, 0.12.]

11

Page 12: Student_Garden_geostatistics_course

Types and purpose of data – grid spatial parameters

[Figure: the 3 × 3 grid with cell size x = 1, cell size y = 2, first X coordinate = 2.2 and first Y coordinate = 1.]

The parameters that define this

regular grid are:

a) Number of cells in X: 3

b) Number of cells in Y: 3

c) Number of cells in Z: 1

d) Size of cell in X: 1

e) Size of cell in Y: 2

f) Size of cell in Z: 1

g) First coordinate in X: 2.2

h) First coordinate in Y: 1

i) First coordinate in Z: 0

We use these parameters to place the grid with the correct layout, dimensions, and location. Without them we only have a column of values.

12

Page 13: Student_Garden_geostatistics_course

Types and purpose of data – grid spatial parameters

A typical problem: the grid is misplaced relative to the hard data because of a wrong cell size or first coordinate. Ensure both point data and grid data are in the same spatial units and correctly positioned.

13

Page 14: Student_Garden_geostatistics_course

Statistical parameters – basic parameters

30, 5, 32, 26, 18, 9, 11, 11, 13, 7, 24, 25, 28, 8, 9, 9, 7,24, 32, 27, 26, 18, 12

Let’s use a few sample values:

Some basic parameters important to understand your data are (a small check follows the list):

• Minimum: 5

• Maximum: 32

• Arithmetic mean: 17.86

• Standard deviation: 9.03

• Variance: 81.67

• Percentile 25: 9

• Percentile 50 (median): 18

• Percentile 75: 26
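These values can be verified with a short numpy sketch (a sketch only; percentile conventions differ slightly between implementations, so small deviations are expected):

```python
import numpy as np

data = np.array([30, 5, 32, 26, 18, 9, 11, 11, 13, 7, 24, 25, 28, 8, 9, 9,
                 7, 24, 32, 27, 26, 18, 12])

print(data.min(), data.max())                # 5, 32
print(round(data.mean(), 2))                 # ~17.87 (the slide truncates to 17.86)
print(round(data.var(), 2), round(data.std(), 2))   # population variance/std, ~81.7 and ~9.04
print(np.percentile(data, [25, 50, 75]))     # ~[9, 18, 26]
```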

Let’s think about each one of them. There are many types of mean:

• Arithmetic mean

• Geometric mean

• Harmonic mean

• Etc…

Each has its own advantages, but throughout this course you'll consider mainly two: the arithmetic mean and the weighted mean.

Page 15: Student_Garden_geostatistics_course

Statistical parameters – mean

The arithmetic mean is given by $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ is the value of sample "i".

The weighted mean is given by $\mu = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$, where $w_i$ is the weight of each sample value.

We'll talk about the weighted mean later; we use it every time we want to calculate a mean in which some samples matter more than others. Examples are kriging and inverse weighted distance, where samples are worth more or less depending on their distance and/or direction (it also depends on other things, but for now let's take it easy).

The arithmetic mean is used to get a representative value (a central tendency, meaning your distribution is clustered around this value) for a distribution that may have too many samples to study one by one. But using only the mean has a problem:

Set 1: 30, 5, 32, 26, 18, 9, 11, 11, 13, 7, 24, 25, 28, 8, 9, 9, 7, 24, 32, 27, 26, 18, 12

Set 2: 17.86 repeated 23 times

Both sets have the same mean. However one varies a lot and the other doesn't vary at all. We need something to tell us the level of variation.

Page 16: Student_Garden_geostatistics_course

Statistical parameters – variance

One possible way of measuring the variability of a distribution is to calculate the distance of each value to the mean (the most arithmetically representative value of a distribution). We call this the absolute deviation:

$d_i = |\mu - x_i|$

Of course we still need a representative value for the variability, so we take the mean of the absolute deviations:

$d_\mu = \frac{1}{n}\sum_{i=1}^{n} |\mu - x_i|$

Unfortunately the modulus is not a straightforward function (it's actually the square root of a squared number, a composed function). So the modulus was replaced by a power of 2, giving the mean of squared deviations, also called the variance:

$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (\mu - x_i)^2$

The problem with the variance is that it changes the order of magnitude, so it's pretty common to take the square root of the variance, calling this the standard deviation:

$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (\mu - x_i)^2}$

16

Page 17: Student_Garden_geostatistics_course

Statistical parameters – percentile

The percentile is a way of calculating a number that bounds a given fraction of a distribution. For example, if I have the (already sorted) samples: 2, 2, 3, 4, 5, 8, 13, 20, 21

3 is the number that separates the first 25% of values from the other 75%, thus 3 is percentile 25. There are two numbers to the left of 3 and six numbers to the right of 3.

I also know that 5 is the number that separates the first 50% of my data from the last 50%. Thus 5 is percentile 50 (the median), with four numbers on the left and four numbers on the right.

So the percentile is about quantity, about local quantity in a distribution. Imagine that you want to compare two distributions of samples. They have the same mean and the same variance, as well as the same maximum and minimum value. Are the two distributions equal? You can't really state that. In fact the percentiles may be different, meaning that the data is clustered in a different manner throughout the distribution.

Before we finish this section let’s just see exactly what the mode is. The mode is the value that

appears most often in a set of data. In the case of continuous variables the mode is the value at

which its probability density function has its maximum value, so, informally speaking, the mode is

at the peak. This is the reason some distributions are called bi-modal, because they have two

peaks (also multi-modal, meaning multiple peaks).

17

Page 18: Student_Garden_geostatistics_course

Univariate analysis - histogram

30, 5, 32, 26, 18, 9, 11, 11, 13, 7, 24, 25, 28, 8, 9, 9, 7,24, 32, 27, 26, 18, 12

Let’s use the sample values from the previous section:

Univariate analysis means that you're studying the variable by itself. In fact the previous section (about mean, variance and so on) was already univariate analysis. Now we're going to plot our data. The most typical univariate plot is the histogram. To make a histogram I must:

• Calculate the maximum (32) and minimum (5) and take the difference (32 − 5 = 27).

• Calculate the bin size for 5 bins, which is 27/5 = 5.4.

• Build the limits of the bins: 1st: [5, 10.4[, 2nd: [10.4, 15.8[, 3rd: [15.8, 21.2[, 4th: [21.2, 26.6[, 5th: [26.6, 32].

• Count how many values fall inside each bin: 1st: 7, 2nd: 4, 3rd: 2, 4th: 5, 5th: 5.

• Finally, plot the intervals on the X-axis and the frequency (number of values per bin) on the Y-axis.

Sorted: 5, 7, 7, 8, 9, 9, 9, 11, 11, 12, 13, 18, 18, 24, 24, 25, 26,26, 27, 28, 30, 32, 32

[Figure: histogram with bin limits 5, 10.4, 15.8, 21.2, 26.6, 32 and frequencies 7, 4, 2, 5, 5.]
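A minimal sketch of the binning above (the cumulated percentages at the end anticipate the lineplot slide):

```python
import numpy as np

data = np.array([30, 5, 32, 26, 18, 9, 11, 11, 13, 7, 24, 25, 28, 8, 9, 9,
                 7, 24, 32, 27, 26, 18, 12])

edges = np.linspace(data.min(), data.max(), 6)   # 5, 10.4, 15.8, 21.2, 26.6, 32
counts, _ = np.histogram(data, bins=edges)       # [7, 4, 2, 5, 5]
perc = 100 * counts / counts.sum()               # percentage per bin
print(edges, counts, np.round(np.cumsum(perc), 2))
```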

18

Page 19: Student_Garden_geostatistics_course

Univariate analysis - boxplot

[Figure: the histogram from the previous slide.]

With a histogram you can see how probable a given bin is. The mean of this data set is 17.86, which actually falls in the least probable bin. You can also see some resemblance to two peaks, or two populations. This could mean that more than one phenomenon is involved with this variable.

This is a boxplot. It shows minimum (5),

maximum (32), percentile 25 (9),

percentile 50 (18), percentile 75 (26),

and mean (17.86). The boxplot is very useful when studying data by its quantiles. The bigger the blue box, the

wider the interval between percentile

25 and percentile 75. In this distribution

you’ll notice a tendency towards the left

side of the variable range. The first 25 %

of data have the least variability.

[Figure: boxplot of the variable, marking 5, 9, 18, 26, 32 and the mean 17.86.]

19

Page 20: Student_Garden_geostatistics_course

Univariate analysis - Lineplot

Sorted: 5, 7, 7, 8, 9, 9, 9, 11, 11, 12, 13, 18, 18, 24, 24, 25, 26,26, 27, 28, 30, 32, 32

30, 5, 32, 26, 18, 9, 11, 11, 13, 7, 24, 25, 28, 8, 9, 9, 7,24, 32, 27, 26, 18, 12

Histogram: 7,4,2,5,5

Histogram Percentage: 30.43478261, 17.39130435, 8.69565217, 21.73913043, 21.73913043

Histogram Percentage cumulated: 30.43478261, 47.82608696, 56.52173913, 78.26086957, 100.

[Figure: lineplot of the cumulated histogram percentage (0–100) against bin number (1 to 5).]

A lineplot is used when we want to see information where only one of the axes varies randomly (on the left, the X-axis goes from 1 to 5 with equal spacing, while the Y-axis carries the information about our study variable). A common example of a lineplot is a well log, because you see information along depth (for example), which grows continuously. Lineplots are also used to study cumulated probability distributions, as in the example (this one was actually taken from the histogram of the variable and not the variable itself, but the point stands).

Page 21: Student_Garden_geostatistics_course

Bivariate analysis - scatterplot

21

[The Flooding_contents_project point-data file from earlier, with its Iron_content and Organic_content columns.]

Remember the point data from the beginning of the course? We have two variables. Let's see how they relate.

This is a scatterplot. From a scatterplot you can see the relation between two variables. In this case you'll notice that there seems to be something similar to a positive linear relation (when one grows, the other also grows) between iron content and organic content. Numerically we could retrieve the correlation coefficient and the linear regression line (plotted as dashed red).

[Figure: scatterplot of Organic_content (Y) against Iron_content (X), both ranging from 0.01 to 0.12, with the regression line dashed in red.]

Page 22: Student_Garden_geostatistics_course

Bivariate analysis – correlation and regression

22

There are many methods to measure relation and dependence between two or more variables. In fact there are quite a few correlation coefficients. The most usual is the Pearson correlation coefficient:

$\rho = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$

The Pearson coefficient is between -1 and 1. Numbers closer to 1 (or -1) indicate stronger correlation, positive if close to 1 and negative (one variable increases while the other decreases) if closer to -1. Numbers around 0 mean no Pearson correlation exists (normally the points appear as clouds with little to no shape).

To do linear regression means to find a line that represents the general relation of your data (if it

is at all linear or similar). That means discovering this:

$Y = m X + b$

"Y" and "X" are known to us: they are the variable data on the Y-axis and the X-axis. The only problem is how to discover both "m" and "b". The formulas are (with $\sum X = \sum x_i$ and $\sum Y = \sum y_i$):

$m = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2}$

$b = \frac{\sum Y - m \sum X}{n}$
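A small sketch computing the Pearson coefficient and the regression coefficients for the iron/organic data (values copied from the data file shown earlier; numpy assumed):

```python
import numpy as np

iron    = np.array([0.11, 0.10, 0.12, 0.09, 0.08, 0.07, 0.07, 0.06, 0.03, 0.01])
organic = np.array([0.09, 0.09, 0.11, 0.09, 0.10, 0.09, 0.11, 0.07, 0.04, 0.03])

rho = np.corrcoef(iron, organic)[0, 1]               # Pearson correlation coefficient
n = len(iron)
m = (n * np.sum(iron * organic) - iron.sum() * organic.sum()) / \
    (n * np.sum(iron**2) - iron.sum()**2)            # regression slope
b = (organic.sum() - m * iron.sum()) / n             # regression intercept
print(round(rho, 2), round(m, 2), round(b, 3))
```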

Page 23: Student_Garden_geostatistics_course

Bivariate analysis – Stereonet

23

Pair:          1:2     1:3      1:4      1:5      2:3    2:4      2:5    3:4      3:5   4:5
Azimuth:       68(22)  -58(148) -68(158) -20(110) 28(62) -23(113) 17(73) -75(165) 5(85) 50(40)
Dip:           0       30       0        45       82     5        0      10       0     0
Distance:      4.5     3.2      6.5      5.4      3.1    4.1      6.6    4.7      3.7   4.9
Squared diff.: 0.36    0.64     6.76     4.41     0.16   4        2.25   3.24     1.69  0.16

This is a variogram table; we will see later how to build it. For now we need the azimuth and dip columns.

[Figure: stereo plot of the azimuth/dip pairs, with azimuth marks every 45º around the circle and dip rings at 45º and 90º.]

Notice that in the variogram table above I've put the normal mathematical value of the angle inside parentheses (the originals are geostatistics angles) to make the stereo plot easier to interpret. The stereonet or stereo plot (sometimes these names are given to specific kinds of stereo plot) is exactly the same as the scatterplot; the only difference is that the axes have a polar projection. It's good for variogram directions, fracture orientations, and any phenomenon that depends on two angles.

Page 24: Student_Garden_geostatistics_course

Bivariate analysis – Special scatterplots

24

The plots you saw in the previous slides are generic plots for one or two variables. It should be clear that you could make a 3D scatterplot for three variables (each point projected on three axes). You can also map a variable to the color of the marker (perhaps adding a colorbar):

[Figure: scatterplot of Organic_content against Iron_content with a third variable mapped to the marker color.]

And you can even map another variable to the marker size, getting four variables in one plot (or five if 3D).

[Figure: the same scatterplot with a fourth variable mapped to the marker size.]

Page 25: Student_Garden_geostatistics_course

Spatial estimation - Purpose

25

[Figure: five sample points, numbered 1 to 5, with values 2.3, 2.9, 3.1, 4.4 and 4.9 scattered over an area; the scale bar is 1 m.]

Look at the data on your left. We only know what is going on where point data exists, and we need a map in order to have a real notion of how a phenomenon or variable behaves in space.

There are two terms that specifically address this kind of problem: interpolation and estimation. The difference depends on the author, but for the purpose of this course both terms mean doing an exercise that demands calculating a value in a place where it was not measured.

There are many methods for spatial estimation (interpolation). Most carry over to any number of dimensions (from 1 dimension to "n" dimensions). Specifically we'll train how to do this in spatial dimensions (2D or 3D, for "x; y; z"). Notice however that nothing stops us from using time or any other variable as a dimension.

Page 26: Student_Garden_geostatistics_course

Spatial estimation – Nearest neighbor

26

[Figure: the five sample points (2.3, 2.9, 3.1, 4.4, 4.9) and the nearest-neighbor estimation map built on a grid with 1 m cells.]

We have a point set and built a grid with cell size 1 in both the X and Y directions. Then we took the following steps for each node:

1) Calculate the distance to all points.

2) Select the point with the minimum distance.

3) Give the node the value of that point.

With this procedure we'll only get values that appear in our data, so no continuous transition from one value to another appears (see the sketch below).
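A minimal nearest-neighbor sketch; the point coordinates below are illustrative, only the values come from the slide:

```python
import numpy as np

pts  = np.array([[1.0, 1.0], [2.0, 4.0], [4.0, 2.0], [6.0, 5.0], [7.0, 1.0]])
vals = np.array([2.3, 2.9, 3.1, 4.4, 4.9])

nx, ny, cell = 8, 6, 1.0                       # grid definition (cell size 1 m)
xc = (np.arange(nx) + 0.5) * cell              # cell-centre coordinates
yc = (np.arange(ny) + 0.5) * cell
grid = np.empty((ny, nx))
for j, y in enumerate(yc):
    for i, x in enumerate(xc):
        d = np.hypot(pts[:, 0] - x, pts[:, 1] - y)   # 1) distance to all points
        grid[j, i] = vals[np.argmin(d)]              # 2)-3) value of the closest point
print(grid)
```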

Page 27: Student_Garden_geostatistics_course

Spatial estimation – Inverse weighted distance

27

[Figure: the five sample points and the grid; the example node is estimated from the three nearest points (values 2.3, 2.9 and 3.1).]

The inverse weighted distance estimate at location $x$ is a weighted mean:

$\mu(x) = \frac{\sum_i w_i(x)\,\mu_i}{\sum_j w_j(x)}$, with $w_i(x) = \frac{1}{d(x, x_i)^p}$

where $d$ is the distance from the node and $p$ is the power. With $p = 2$ we have "inverse squared distance".

On your right there's the calculation for only one example node, using the three nearest points (values 2.3, 2.9 and 3.1 at distances 2.2 m, 1.9 m and 3.1 m). Notice that we are doing a weighted mean where closer points have a higher weight than farther points:

$w_1 = \frac{1}{2.2^2} = 0.20$, $w_2 = \frac{1}{1.9^2} = 0.27$, $w_3 = \frac{1}{3.1^2} = 0.10$

$\sum_j w_j(x) = 0.20 + 0.27 + 0.10 = 0.57$

$\mu(x) = \frac{0.20 \cdot 2.3}{0.57} + \frac{0.27 \cdot 2.9}{0.57} + \frac{0.10 \cdot 3.1}{0.57} = 0.80 + 1.37 + 0.54 = 2.71$

To finish our estimation we have to do the calculations above for every node.
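A short sketch of the single-node example above (p = 2); note that the slide's 2.71 comes from rounded weights:

```python
import numpy as np

vals = np.array([2.3, 2.9, 3.1])
dist = np.array([2.2, 1.9, 3.1])
p = 2

w = 1.0 / dist**p                   # ~0.21, 0.28, 0.10 (slide rounds to 0.20, 0.27, 0.10)
est = np.sum(w * vals) / np.sum(w)  # ~2.72
print(round(est, 2))
```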

Page 28: Student_Garden_geostatistics_course

Variography - anisotropy

28

We'll be calling anisotropy a measure of how one direction has more continuity than another. Let's see an example:

Can you guess which direction has a greater sense of continuity? In the horizontal direction you'll be following more or less the same geological layer, so you'll probably find things that are more similar to your starting point. The greater the similarity, the greater the range of continuity. The vertical direction, on the other hand, cuts across the three example layers, so you are less likely to find anything similar to your starting point.

We can say that we have anisotropy, with the horizontal more continuous than the vertical, but we need some way to study this numerically. And we do know how to study variability: we use a formula similar to the variance to calculate a variogram, the tool that can give us a numeric account of anisotropy.

Page 29: Student_Garden_geostatistics_course

Variography – building a variogram

These are 5 data points, each with a value, a location, and an ID number (1 to 5). Let's build the variogram table:

[Figure: the five sample points (values 2.3, 2.9, 3.1, 4.4, 4.9) with the direction of each point pair annotated both as a geostatistics azimuth and as its equivalent in the -90º/90º range; scale bar 1 m.]

Pair:          1:2    1:3    1:4    1:5    2:3    2:4   2:5    3:4    3:5    4:5
Azimuth:       68     -58    -68    -20    28     -23   17     -75    5      50
Dip:           0      0      0      0      0      0     0      0      0      0
Distance:      4.5    3.2    6.5    5.4    3.1    4.1   6.6    4.7    3.7    4.9
Squared diff.: 0.36   0.64   6.76   4.41   0.16   4     2.25   3.24   1.69   0.16
Semi (diff/2): 0.36/2 0.64/2 6.76/2 4.41/2 0.16/2 4/2   2.25/2 3.24/2 1.69/2 0.16/2

Mean: 3.52, Variance: 0.94

29

$2\gamma(x, y) = E\big[(Z(x) - Z(y))^2\big]$

Page 30: Student_Garden_geostatistics_course

Variography – building a variogram

Exercise 1

[The variogram table and the point-pair directions from the previous slide.]

I want to make a variogram in azimuth = 20º with

tolerance 10º and 3 bins.

a) Let’s get all angles from [20-tol,20+tol[ = [10,30[

b) Maximum distance is 6.6 so our lag distance for 3 bins is 6.6/3 = 2.2.

[Figure: experimental semi-variogram for azimuth 20º ± 10º, with lags 2.2, 4.4 and 6.6 and the sill (0.94) marked.]

NOTE: I’m plotting semi-variogram values which are half the normal variogram values.
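A minimal sketch of the stated procedure for Exercise 1 (the azimuths, distances and squared differences come from the variogram table; the exact binning convention of the original software is an assumption):

```python
import numpy as np

azimuth = np.array([68, -58, -68, -20, 28, -23, 17, -75, 5, 50])        # degrees
dist    = np.array([4.5, 3.2, 6.5, 5.4, 3.1, 4.1, 6.6, 4.7, 3.7, 4.9])  # metres
sqdiff  = np.array([0.36, 0.64, 6.76, 4.41, 0.16, 4.0, 2.25, 3.24, 1.69, 0.16])

az0, tol, nbins = 20.0, 10.0, 3
keep = (azimuth >= az0 - tol) & (azimuth < az0 + tol)   # a) pairs inside [10, 30[
lag = dist[keep].max() / nbins                          # b) 6.6 / 3 = 2.2
bins = np.minimum((dist[keep] / lag).astype(int), nbins - 1)
gamma = [sqdiff[keep][bins == b].mean() / 2 if np.any(bins == b) else np.nan
         for b in range(nbins)]                         # semi-variogram per lag bin
print(lag, gamma)
```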

30

Page 31: Student_Garden_geostatistics_course

Variography – building a variogram

Exercise 2

[The same variogram table as in Exercise 1.]

I want to make a variogram in azimuth = -70º with

tolerance 15º and 3 bins.

a) Let’s get all angles from [-70-tol,-70+tol[ = [-85,-55[

b) Maximum distance is 6.5 so our lag distance for 3 bins is 6.5/3 = 2.16.

[Figure: the point-pair directions.]

[Figure: experimental semi-variogram for azimuth -70º ± 15º, with lags 2.16, 4.32 and 6.5 and the sill (0.94) marked.]

NOTE: for the third bin I've calculated the mean of the values that fall inside that bin.

31

Page 32: Student_Garden_geostatistics_course

Variography – building a variogram

[Figure: the point-pair directions.]

If my main direction is azimuth = -70º, then minor 1 will be the orthogonal direction, -70 + 90 = 20º.

To do this for a 3D case, in which we may manipulate the azimuth, dip and rake of the main direction, we must do a series of rotations (using linear algebra) to find the orthogonal directions.

[Figure: the point-pair directions again.]

Let's take an example of direction azimuth 90º with

tolerance of 10º. The considered interval should be [90-

10,90+10[ = [80,100[. Usually in geostatistics only ranges

between -90 and 90 are used so the actual considered

interval is a composition of [80,90[ U [-90,-80[.

32

Page 33: Student_Garden_geostatistics_course

Kriging – simple kriging

33

[Figure: variogram ellipsoid with range 3.2 in the "90;0" (East, X) direction and 1.5 in the "0;0" (North, Y) direction.]

$\gamma(h) = C_0 + C_1\left(1 - e^{-3h/a}\right)$

We've studied a set of point data and got the following variograms, which were fitted with an exponential model. The ellipsoid is on your right and the exponential model formula is above. Notice that the main direction (with the highest range) is "90;0", and minor 1 is "0;0". There's no minor 2 since this is a 2D study case. With $C_0 = 0$ and $C_1 = 1$ the fitted model is:

$\gamma(h) = 0 + 1 \cdot \left(1 - e^{-3h/a(\theta)}\right)$, where $a(\theta)$ is the range along the direction of the pair.

Page 34: Student_Garden_geostatistics_course

Kriging – simple kriging

34

[Figure: the five sample points; the node p is estimated from the three nearest points, with values 2.3, 2.9 and 3.1 at distances 2.2 m, 1.9 m and 3.1 m. The distances between the points themselves are 4.1 m (1–2) and 2.8 m (1–3), and the pair directions are 45º (1–p), 28º (1–2) and -32º, i.e. 32º (1–3).]

We intend to estimate the value of this node using 3 data points and the simple kriging method. For each pair we need the range of the variogram ellipse along the pair's direction:

$\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1$, with $x = a\cos\theta$, $y = b\sin\theta$ and $r(\theta) = \sqrt{x^2 + y^2}$

Let's start by studying point 1:

Pair 1–p (45º): $x_{1p} = 3.2\cos 45º = 2.26$, $y_{1p} = 1.5\sin 45º = 1.06$, $r_{1p} = \sqrt{x^2 + y^2} = 2.49$

Pair 1–2 (28º): $x_{12} = 3.2\cos 28º = 2.82$, $y_{12} = 1.5\sin 28º = 0.70$, $r_{12} = 2.91$

Pair 1–3 (32º): $x_{13} = 3.2\cos 32º = 2.71$, $y_{13} = 1.5\sin 32º = 0.79$, $r_{13} = 2.82$

The corresponding variogram values are:

$\gamma(2.2) = 0 + 1\cdot\left(1 - e^{-3\cdot 2.2 / 2.49}\right) = 0.92$

$\gamma(4.1) = 0 + 1\cdot\left(1 - e^{-3\cdot 4.1 / 2.91}\right) = 0.98$

$\gamma(2.8) = 0 + 1\cdot\left(1 - e^{-3\cdot 2.8 / 2.82}\right) = 0.94$
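A small sketch of the directional range and the exponential model used in these calculations (it reproduces the slide's ellipse parametrization):

```python
import numpy as np

def directional_range(theta_deg, a=3.2, b=1.5):
    # Range of the ellipse along the pair direction, as parametrized on the slide.
    t = np.radians(theta_deg)
    x, y = a * np.cos(t), b * np.sin(t)
    return np.hypot(x, y)

def gamma_exp(h, a_theta, c0=0.0, c1=1.0):
    # Exponential variogram model: c0 + c1 * (1 - exp(-3h / a(theta))).
    return c0 + c1 * (1.0 - np.exp(-3.0 * h / a_theta))

r1p = directional_range(45)                           # ~2.5 (slide: 2.49)
print(round(r1p, 2), round(gamma_exp(2.2, r1p), 2))   # gamma(2.2) ~0.93 (slide: 0.92)
```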

Page 35: Student_Garden_geostatistics_course

Kriging – simple kriging

35

[Figure: the same configuration seen from point 2 (directions 45º to point 1, 0º to the node and 70º to point 3) and from point 3 (directions 32º to point 1, 70º to point 2 and 80º to the node).]

For point 2:

$r_{21} = r_{12} = 2.91 \Rightarrow \gamma(4.1) = 0.98$

Pair 2–3 (70º): $x_{23} = 3.2\cos 70º = 1.09$, $y_{23} = 1.5\sin 70º = 1.40$, $r_{23} = 1.78$, so $\gamma(3.2) = 0 + 1\cdot\left(1 - e^{-3\cdot 3.2/1.78}\right) = 1$

Pair 2–p (0º): $r_{2p} = 3.2$, so $\gamma(1.9) = 0 + 1\cdot\left(1 - e^{-3\cdot 1.9/3.2}\right) = 0.83$

For point 3:

$r_{31} = r_{13} = 2.82 \Rightarrow \gamma(2.8) = 0.94$

$r_{32} = r_{23} = 1.78 \Rightarrow \gamma(3.2) = 1$

Pair 3–p (80º): $x_{3p} = 3.2\cos 80º = 0.55$, $y_{3p} = 1.5\sin 80º = 1.47$, $r_{3p} = 1.57$, so $\gamma(3.1) = 0 + 1\cdot\left(1 - e^{-3\cdot 3.1/1.57}\right) = 1$

Page 36: Student_Garden_geostatistics_course

Kriging – simple kriging

36

$\begin{bmatrix} 0 & 0.98 & 0.94 \\ 0.98 & 0 & 1 \\ 0.94 & 1 & 0 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 0.92 \\ 0.83 \\ 1 \end{bmatrix}$

(rows and columns correspond to points 1, 2 and 3; the right-hand side holds the point-to-node variogram values.)

We need to find w1, w2 and w3, so we must solve the system. I've solved it:

• w1 = 0.45

• w2 = 0.57

• w3 = 0.38

So to get the kriged value I must do:

$v(p) = (2.3 - \mu_p)\cdot 0.45 + (2.9 - \mu_p)\cdot 0.57 + (3.1 - \mu_p)\cdot 0.38 + \mu_p = 2.75$

with $\mu_p = \frac{2.3 + 2.9 + 3.1}{3} = 2.76$ (in simple kriging this mean can also be user input).

To achieve simple kriging we would have to do this procedure for all cells in our grid. But this is pretty much it.
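A minimal numpy sketch that solves this illustrative system (built from variogram values, as noted later) and computes the simple kriging estimate:

```python
import numpy as np

A = np.array([[0.0,  0.98, 0.94],
              [0.98, 0.0,  1.0 ],
              [0.94, 1.0,  0.0 ]])
b = np.array([0.92, 0.83, 1.0])

w = np.linalg.solve(A, b)           # ~[0.45, 0.57, 0.38], as on the slide
mu = (2.3 + 2.9 + 3.1) / 3
v = np.sum(w * (np.array([2.3, 2.9, 3.1]) - mu)) + mu
print(np.round(w, 2), round(v, 2))  # estimate ~2.76 (slide rounds to 2.75)
```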

Page 37: Student_Garden_geostatistics_course

Kriging – ordinary kriging

37

The difference between simple and ordinary kriging is that in ordinary we must ensure

that the sum of weights is equal to 1. Therefore the following system modification is

required:

$\begin{bmatrix} 0 & 0.98 & 0.94 & 1 \\ 0.98 & 0 & 1 & 1 \\ 0.94 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ \lambda \end{bmatrix} = \begin{bmatrix} 0.92 \\ 0.83 \\ 1 \\ 1 \end{bmatrix}$

I've solved it:

• w1 = 0.32 (approx.)

• w2 = 0.42 (approx.)

• w3 = 0.24 (approx.)

There is a fourth value (the Lagrange multiplier) but it's not used to calculate the kriged value.

So to get the kriged value I must do:

$v(p) = 2.3\cdot 0.32 + 2.9\cdot 0.42 + 3.1\cdot 0.24 = 2.68$

To achieve ordinary kriging we would have to do this procedure for all cells in our grid.
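The same sketch with the ordinary kriging constraint added through a Lagrange multiplier:

```python
import numpy as np

A = np.array([[0.0,  0.98, 0.94, 1.0],
              [0.98, 0.0,  1.0,  1.0],
              [0.94, 1.0,  0.0,  1.0],
              [1.0,  1.0,  1.0,  0.0]])
b = np.array([0.92, 0.83, 1.0, 1.0])

w1, w2, w3, lagrange = np.linalg.solve(A, b)   # weights close to the slide's 0.32, 0.42, 0.24
v = 2.3 * w1 + 2.9 * w2 + 3.1 * w3             # ordinary kriging estimate
print(round(w1, 2), round(w2, 2), round(w3, 2), round(v, 2))
```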

Page 38: Student_Garden_geostatistics_course

Sequential simulation - uncertainty

38

The first thing you need to know before studying sequential simulation methods is why we use simulation (stochastic) methods in the first place. Let's start with an easy example:

[Figure: distance (Y) against time (X); the band between the lines for m = 0.7 and m = 1.3 shows the possible values.]

We have a relation between time and distance:

$y = m\,x$, where $m$ is constant

The problem is that we don't know the value of "m" with certainty. However, we estimate that it is somewhere between 0.7 and 1.3, which means that at any given time we have several possibilities of distance. This can be seen in the plot on your left.

This is uncertainty: mathematical uncertainty, since even retrieving the model value "m" is an estimation. This problem is easy to handle because "m" has a constant value throughout time. But what if it doesn't? What if it doesn't even follow any recognizable function? Perhaps we should try stochastic methods.

Page 39: Student_Garden_geostatistics_course

Sequential simulation - uncertainty

39

[Figure: three simulated distance-versus-time paths, each in its own color, inside the 0.7–1.3 band.]

I've done 3 simulations, each with its own color. To do a simulation I've randomly generated a distance (y) for time = 1 that follows the given formula (m in [0.7, 1.3]):

$y = m\,x$, where $m$ is not constant

Then for time = 2 I've randomly generated a distance that depends on the one at time = 1 (otherwise we could get points outside the allowed "m" range). I've followed this procedure for all time steps and done 3 stochastic simulations.

With three simulations we get a much better sense of the uncertainty range at time step 3. In fact, if we wanted to decrease all this uncertainty we could introduce new data, for example time = 3, distance = 2.9. This way the distances that precede and follow it are conditioned to the distance value at time = 3; in fact we could call it hard data.

Stochastic simulation follows the same concept. Let's see what parameters are randomized in these procedures.

Page 40: Student_Garden_geostatistics_course

Sequential simulation – random walk

40

[Figure: the five sample points and a grid whose cells are visited in some order 1, 2, 3, 4, 5, …]

When we do kriging, or any other conventional estimation method, only the hard data is used to estimate any point on the grid. For this reason we could estimate the cells from first to last, or from last to first, and it wouldn't make a difference.

In simulation, however, when you simulate a cell, that cell can be used to simulate the following values. This means that simulating from the first cell to the last, or from the last to the first, does make a difference (and probably a big one, depending on the case).

To avoid tendencies in the simulation, the cells are simulated following a random walk, which says that the first cell to be simulated is at x,y = 3,9, the second at x,y = 5,2, and so on (this is an example). So we actually randomly generate the order in which the nodes are simulated.

Examples of 3 random walks in a 3×3 grid (each cell shows the order in which it is simulated):

1 5 9      3 9 5      5 8 3
6 7 3      8 6 2      4 9 6
8 4 2      4 7 1      1 7 2
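A one-line way to generate such a random walk (a sketch; the grid reshape is only for display):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
nx, ny = 3, 3
order = rng.permutation(nx * ny)          # order[k] = index of the k-th node to simulate

rank = np.empty_like(order)
rank[order] = np.arange(1, nx * ny + 1)   # rank[i] = position of node i in the walk
print(rank.reshape(ny, nx))               # a 3x3 grid like the examples above
```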

Page 41: Student_Garden_geostatistics_course

Sequential simulation – node value as hard value

41

[Figure: the five sample points and the grid, following a given random walk. First value being simulated: the node uses only the original hard data. Second value being simulated: the node can now also use the node simulated before it.]

As said before, the simulated nodes can be used as hard data to simulate new ones. The procedure above shows the first two nodes simulated with a given random walk (only a few numbers appear).

Page 42: Student_Garden_geostatistics_course

Sequential simulation – probability function generation

42

The random walk is one of two stochastic steps when doing a simulation. When we krige a node, the kriged value won't be (or probably won't be) the simulated value. There is something that happens in between. When you do kriging you can retrieve two things: the kriging mean (which we already saw how to calculate) and the kriging variance.

So to get the kriging mean on the slide 25 example we would do:

$v(p) = (2.3 - \mu_p)\cdot 0.45 + (2.9 - \mu_p)\cdot 0.57 + (3.1 - \mu_p)\cdot 0.38 + \mu_p = 2.75$, with $\mu_p = \frac{2.3 + 2.9 + 3.1}{3} = 2.76$

And the kriging variance would be:

$kv(p) = 0.45\cdot 0.92 + 0.57\cdot 0.83 + 0.38\cdot 1$

[The kriging system from the previous slides.]

NOTE: this is just for illustration purposes; in fact we usually solve the kriging matrix with correlogram values and not variogram values.

Page 43: Student_Garden_geostatistics_course

Sequential simulation – probability function generation

43

So if we have a mean and a variance we can build a Gaussian distribution. And inside that

distribution randomly generate a value which is more probable around the mean (closer to the

mean).

[Figure, left: the probability density function of a Gaussian distribution with the given mean and variance (value range vs. probability). Figure, right: its cumulated probability function.]

So I generate a probability from 0 to 1 and retrieve the respective simulated value from the cumulated probability function.
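A sketch of this step: draw a probability in [0, 1] and invert the Gaussian CDF built from the kriging mean and variance (the mean is the one from the earlier example; the variance value here is only illustrative):

```python
import random
from statistics import NormalDist

k_mean = 2.75          # kriging mean from the simple kriging example
k_var = 0.5            # kriging variance (illustrative value)

p = min(max(random.random(), 1e-12), 1 - 1e-12)               # probability from 0 to 1
value = NormalDist(mu=k_mean, sigma=k_var ** 0.5).inv_cdf(p)  # simulated value
print(round(p, 3), round(value, 3))
```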

Page 44: Student_Garden_geostatistics_course

Sequential simulation - procedures

44

So sequential simulation has two fundamental stochastic steps:

a) The random walk.

b) The random value retrieved from probability distributions.

To do a sequential simulation, for every node we would:

1) See which node is to be simulated in the random walk.

2) Search for the neighboring nodes and hard-data.

3) Get the kriged value and kriging variance with those nodes and points.

4) Build a probability function based on the kriged value and kriging variance.

5) Generate a probability and retrieve the value that corresponds with that probability.

Step 4 is an important one because the main differences between sequential simulation procedures lie there. We will see two types of procedures: Sequential Gaussian Simulation and Direct Sequential Simulation. They're almost identical except in the way they build the probability distribution function.

Page 45: Student_Garden_geostatistics_course

Sequential simulation – Sequential Gaussian Simulation

45

To do sequential gaussian simulation we must do a transformation to our variable

distribution, a gaussian transformation, which means transforming the real values into

gaussian values. From this point on we would proceed with the normal sequential

simulation procedure:

1) See which node is to be simulated in the random walk.

2) Search for the neighboring nodes and hard-data.

3) Get the kriged value and kriging variance with those nodes and points.

In step 4) we would use the kriging mean and variance to build a local Gaussian distribution, and from that distribution we would retrieve our simulated value.

Since all simulated values come from a Gaussian transformation, in the end we would have to transform all simulated Gaussian values back into values of the original distribution, meaning we would do the exact opposite of the first step.

Sequential Gaussian Simulation assumes a Gaussian behavior for the variables and may have problems when this is far from the truth. It is still widely used, although another algorithm was developed to avoid the Gaussian transformation, using instead a procedure which, while still using Gaussian distributions, is much closer to the real data. We call it Direct Sequential Simulation.

Page 46: Student_Garden_geostatistics_course

Sequential simulation – Direct Sequential Simulation

46

[Figure: the sampled interval in the real data distribution and the equivalent Gaussian interval.]

In Direct Sequential Simulation we would follow the common sequential simulation procedure and then, when we get a kriging mean and variance, convert that interval (in the real distribution) into a Gaussian interval. From the Gaussian interval we would build a local Gaussian function and randomly generate a probability there. That probability has an equivalent in the global Gaussian distribution, and the global has an equivalent in the real distribution. This would be our final simulated value.

It is important to use Gaussian distributions to ensure that values closer to the mean are more probable and values further away less probable. If we didn't do this, we would have no guarantee that the variogram (the input variogram ellipsoid) would be reproduced in the simulation.

Sequential simulations usually reproduce both the distribution of the real data (can be seen on a histogram

for example) and the input variogram (can be seen on a mesh variogram). Also usually (depending on the

procedure), the limits of the data (minimum and maximum) remain the same.

Page 47: Student_Garden_geostatistics_course

Simulation post-processing – getting mean and variance

47

Simulation 1, Simulation 2, Simulation 3

Mean of the simulations, node by node: $\bar{v} = \frac{v_{sim1} + v_{sim2} + v_{sim3}}{3}$

Variance of the simulations, node by node: $\frac{(v_{sim1} - \bar{v})^2 + (v_{sim2} - \bar{v})^2 + (v_{sim3} - \bar{v})^2}{3}$

Since we have a set of simulations for the same case study, we have a distribution for each node. This means we can take any statistical parameter from that node distribution. The most common, however, are the mean and the variance.
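A minimal sketch, with random arrays standing in for real realizations:

```python
import numpy as np

rng = np.random.default_rng(1)
sims = rng.normal(size=(3, 50, 50))       # shape: (n_simulations, ny, nx)

node_mean = sims.mean(axis=0)             # (sim1 + sim2 + sim3) / 3, node by node
node_var = sims.var(axis=0)               # mean of squared deviations from node_mean
print(node_mean.shape, node_var.shape)
```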

Page 48: Student_Garden_geostatistics_course

Co-located co-simulation – When to use…

48

Sometimes we have two variables that are correlated with each other:

[Figure: scatterplot of Organic_content (the variable to estimate) against Iron_content (the variable we use), with the linear correlation between them.]

If so, we can measure that correlation and retrieve a number. If we have an image that we know has little (or at least less) uncertainty than the correlated variable we intend to estimate, then we can use that image as a secondary variable and estimate the primary variable with co-located co-kriging methods.

That said, let's see how to perform this in a stochastic sequential simulation (doing, therefore, co-located co-simulation).

Page 49: Student_Garden_geostatistics_course

Co-located co-simulation – How to do…

49

$2\gamma(h) = E\big[(Z(x) - Z(x + h))^2\big]$, where $E$ is the expected value (mean)

$\gamma(h) = C(0) - C(h)$, where $C(0)$ is the sill and $C(h)$ the covariance

$\rho(h) = \frac{C(h)}{C(0)}$, where $\rho(h)$ is the correlogram

So far we've been dealing directly with variogram values in the kriging matrix, but normally we actually use correlogram values (or the covariance, with sill = 1). Let's see how to calculate the correlogram from the variogram.

Subtracting the variogram value from the sill gives us the covariance value. The covariance divided by the sill gives us the correlogram. Since the sill is 1 for our study case, the correlogram equals the covariance. The difference in a plot would be:

[Figure: the variogram rises from 0 toward the sill, while the covariance/correlogram falls from correlation = 1 toward correlation = 0.]

Page 50: Student_Garden_geostatistics_course

Co-located co-simulation – How to do…

50

So, assuming the study case from slide 25 (actually the kriging matrix from slide 42, in simulation) and wanting to do co-simulation, then we should have a secondary image and a correlation for that node.

[Figure: the node and its three neighboring points (2.3, 2.9 and 3.1 at 2.2 m, 1.9 m and 3.1 m), plus the co-located secondary value.]

$cc = 0.7$ and $V_s$ is the co-located secondary value.

So our kriging matrix should be this one (notice the variogram values were transformed into correlogram values, and the new terms are the ones involving cc). The correlation value is cc = 0.7.

$\begin{bmatrix} 1 & 0.02 & 0.06 & cc\cdot 0.08 & 1 \\ 0.02 & 1 & 0 & cc\cdot 0.17 & 1 \\ 0.06 & 0 & 1 & cc\cdot 0 & 1 \\ cc\cdot 0.08 & cc\cdot 0.17 & cc\cdot 0 & 1 & 1 \\ 1 & 1 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ w_s \\ \lambda \end{bmatrix} = \begin{bmatrix} 0.08 \\ 0.17 \\ 0 \\ cc \\ 1 \end{bmatrix}$

(rows and columns 1 to 3 are the primary samples, "s" is the co-located secondary sample, and the last row carries the sum-to-one constraint.)

Page 51: Student_Garden_geostatistics_course

Co-located co-simulation – How to do…

51

Solving the system gives W = 0.10812852, 0.15659974, 0.07054397, 0.74175945 (plus the constraint value -0.07703169, which is not used), so:

$v(p) = (2.3 - \mu_p)\cdot 0.10 + (2.9 - \mu_p)\cdot 0.15 + (3.1 - \mu_p)\cdot 0.07 + (V_s - \mu_p)\cdot 0.74 + \mu_p$, with $\mu_p = \frac{2.3 + 2.9 + 3.1}{3} = 2.76$

We know the weights and the value $V_s = 2.7$, so the kriged value is:

$v(p) = (2.3 - \mu_p)\cdot 0.10 + (2.9 - \mu_p)\cdot 0.15 + (3.1 - \mu_p)\cdot 0.07 + (2.7 - \mu_p)\cdot 0.74 + \mu_p = 2.71$

Notice that we use the secondary image value as a sample which has a weight. There is another important point, though: we must ensure the secondary variable has the same range as the primary variable. For this reason we must, before anything else, do a linear transformation so that the secondary variable has the same minimum and maximum as the primary. You can do it using this formula:

$V_s' = \frac{(V_s - \min V_s)\,(\max V_p - \min V_p)}{\max V_s - \min V_s} + \min V_p$

$V_s$ is the secondary variable and $V_p$ is the primary.
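A small sketch of this rescaling (the secondary values are illustrative):

```python
import numpy as np

def rescale(vs, vp):
    # Map the secondary variable onto the [min(vp), max(vp)] range.
    return (vs - vs.min()) * (vp.max() - vp.min()) / (vs.max() - vs.min()) + vp.min()

vp = np.array([2.3, 2.9, 3.1, 4.4, 4.9])        # primary (hard data)
vs = np.array([10.0, 14.0, 22.0, 31.0, 40.0])   # secondary image values (illustrative)
print(rescale(vs, vp))                          # now spans [2.3, 4.9]
```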

Page 52: Student_Garden_geostatistics_course

Sequential indicator simulation – categorical data

52

Until now we’ve used only continuous variables but sometimes it’s useful to estimate

and/or simulate discrete variables which we commonly call categorical because they’re

largely based on categories. One possible example would be the estimation of the area

covered by a specific kind of vegetation. In this case you would have two categories:

covered, and uncovered. You can also have the same example but with more than one

type of cover (different types of vegetation). The first case would be binary (or indicator,

I'll explain later why), the second multiphasic (multiple phases or categories).

The example on your left shows a map with two colors, meaning two different categories. It is likely a simulation of the dissemination of some kind of phenomenon (it either exists or not), because the blue color (or whatever it may represent) seems to fill the entire study area, as opposed to the orange, which is much more scattered.

Page 53: Student_Garden_geostatistics_course

Sequential indicator simulation – indicator function

53

The first thing you need to understand when developing with indicator algorithms is that

each class or category is a variable. A variable whose nature is the probability of the

category itself. This means that for every sample that exists we have two possible outcomes:

either probability 1 (the category exists at that position) or 0 (the category does not exist at that position).

[Figure: four samples, numbered 1 to 4, on a map.]

Categorical_project
5
X
Y
Z
Category_1
Category_2
Category_3
Category_4
1    2    0    1  0  0  0
2.8  1.6  0    0  0  1  0
2    2    0    0  1  0  0
2.4  0    0    0  0  0  1

This example file (whose format will depend on the software) shows the real nature of the information in the data: in each sample one category has probability 1 and all the others 0.

So each category has the following function:

$I_C(x) := \begin{cases} 1, & x \in C \\ 0, & x \notin C \end{cases}$

We call this the indicator function because it gives us 1 if "x" belongs to the category "C" and 0 if it does not.
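A minimal sketch of the indicator transform for the four samples above:

```python
import numpy as np

categories = np.array([1, 3, 2, 4])     # category observed at each sample
n_cat = 4
# One 0/1 indicator column per category, as in the file above.
indicators = (categories[:, None] == np.arange(1, n_cat + 1)[None, :]).astype(int)
print(indicators)
# [[1 0 0 0]
#  [0 0 1 0]
#  [0 1 0 0]
#  [0 0 0 1]]
```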

Page 54: Student_Garden_geostatistics_course

Sequential indicator simulation – indicator variogram

54

We usually do variograms for continuous variables, but a set of "n" categories is "n" different variables. So we need to compute a variogram for each of those variables:

$2\gamma_I(x, x + h) = E\big[(I(x) - I(x + h))^2\big]$

For the case study in the previous slide we would have four different indicator variograms, because of the four different categories. The correct procedure to simulate or krige indicator variables is to use all of the variogram ellipsoids (one per category) to build the kriging matrix. However, sometimes a multiphasic variogram is used, built from the sum of the variograms of all variables; other times an approximate mean is used. It depends largely on the intended result.

If the variables only have values between 0 and 1, you can probably guess the variogram model will be something like a probability model for that specific category.

Think about this: imagine that we have four categories, therefore 4 variogram models, and we intend to use a multiphasic one to do simulation. The problem is that one of the categories is so rare that using its variogram for the multiphasic could endanger the correct simulation of the other categories. I could consider building a multiphasic with all categories except that one…

Page 55: Student_Garden_geostatistics_course

Sequential indicator simulation – how to do…

55

Assuming you have all the variogram ellipsoids, or simply the multiphasic one, we pretty much build the kriging matrix as in the normal continuous sequential simulation, as shown in slide 42.

Once you have the weights, you need to multiply them by each sample's values, meaning for sample 1 in slide 53: 1;0;0;0.

This means you'll have a kriged mean for each of the categories (e.g. 0 -> 0.3, 1 -> 0.2, 2 -> 0.4, 3 -> 0.1), meaning a probability for each category. So now we can build our distribution (we actually normalize these values first by dividing them by their total sum):

[Figure: cumulated probability bar over the categories 0, 1, 2 and 3.]

So I generate a probability from 0 to 1 and retrieve the respective simulated category.

Notice that the categories with a bigger probability are more likely to be generated.
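A small sketch of this drawing step, using the example probabilities 0.3, 0.2, 0.4 and 0.1:

```python
import numpy as np

probs = np.array([0.3, 0.2, 0.4, 0.1])
probs = probs / probs.sum()                # normalize by the total sum
cdf = np.cumsum(probs)                     # 0.3, 0.5, 0.9, 1.0

rng = np.random.default_rng()
p = rng.uniform(0.0, 1.0)                  # generate a probability from 0 to 1
category = int(np.searchsorted(cdf, p, side="right"))   # retrieve the category
print(round(p, 3), category)               # rng.choice(4, p=probs) is the one-line shortcut
```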

Page 56: Student_Garden_geostatistics_course

Indicator simulation post-processing – most likely and entropy

56

Simulation 1, Simulation 2, Simulation 3

We've seen this before for continuous variables. Right now, however, we have 3 simulations, and for each of the nodes we may have 3 different categories. The most-likely value is the category that appears most often at each node (it's actually the mode). Entropy gives us a level of uncertainty for each node.

[Figure: the most-likely-value map and the entropy map computed from the three simulations.]

$e = -\sum_k p_k \log p_k$, where $p_k$ is the probability of category k.
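A minimal sketch with a tiny illustrative stack of three categorical realizations:

```python
import numpy as np

sims = np.array([[0, 1, 1],     # simulation 1: category of each of 3 nodes
                 [0, 1, 2],     # simulation 2
                 [1, 1, 2]])    # simulation 3
n_cat = 3

counts = np.stack([(sims == k).sum(axis=0) for k in range(n_cat)])   # (n_cat, n_nodes)
most_likely = counts.argmax(axis=0)                  # mode per node
p = counts / sims.shape[0]                           # category probabilities per node
entropy = -np.sum(p * np.log(np.where(p > 0, p, 1.0)), axis=0)
print(most_likely, np.round(entropy, 2))
```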

Page 57: Student_Garden_geostatistics_course

Stochastic genetic procedures – genetic algorithms

57

Using stochastic simulation for basic parameter-uncertainty studies is only one of its possible uses. In fact, since stochastic sequential simulations explore multiple solutions to a single parameterization, we can use them in optimization algorithms with a genetic approach. Genetic algorithm is the name given to a procedure which relies on successive generations, each created using the previous one and evaluated through an objective function (quantifying fitness, to use the original expression). Let's see a general illustration of a genetic procedure.

Generation 0 → fitness evaluation → best-fit individuals of Generation 0 → Generation 1 (created from the best individuals of Generation 0) → fitness evaluation → best-fit individuals of Generation 1 → Generation 2 (created from the best individuals of Generation 1) → …

So we decide many parameters, like the number of individuals per generation, the objective function that evaluates fitness, the number of generations, etc.

Page 58: Student_Garden_geostatistics_course

Stochastic genetic procedures – global stochastic inversion

58

Global stochastic inversion (GSI) is a type of genetic approach used to build a model of acoustic impedance, evaluating the fitness of each generation with an objective function which compares the real seismic data to the synthetic seismic data from each generation. The best locations (those more similar to the real data) are used to create the individuals of the next generation.

Generation 0 uses the hard data to run simulations 0, 1, 2, …, n → fitness evaluation (comparing simulated data with the real data) → best image from Generation 0 → Generation 1 uses the hard data and the best image to run co-simulations 0, 1, 2, …, n → …

So how does the evaluation actually occur? And what is a best image?

Page 59: Student_Garden_geostatistics_course

Stochastic genetic procedures – convolution

59

On the left you have the real seismic (profile). On the right you have a simulation of

acoustic impedance (the same profile).

So how do we compare the real seismic data with the simulation of acoustic

impedance? Well, we actually build a synthetic seismic from the simulation using a

procedure called convolution.

To do a convolution we need a wavelet, which is usually built using the real well-log data (acoustic impedance) and the seismic data at the same location. Let's see what a wavelet looks like.

Page 60: Student_Garden_geostatistics_course

Stochastic genetic procedures – convolution

60

[Figure: plot of a wavelet, with the depth step on the X-axis and the amplitude on the Y-axis.]

Example wavelet file (depth step, amplitude):
-4.000    588.006
-3.000   -567.287
-2.000  -2130.426
-1.000  -3632.075
 0.000  -4242.837
 1.000  -3562.341
 2.000  -1889.319
 3.000   -104.545
 4.000   1097.485

To the left you have a plot of a

wavelet. To the right you have an

example of a wavelet file (not the

same example as on the left).

The X-axis gives us the depth step,

the Y-axis the wavelet magnitude.

Wavelets can transform reflection data into seismic data. But we still need to calculate the reflections from the acoustic impedance simulation, which we do for each vertical trace in the simulation. For every trace, at every depth position "i", we calculate a reflectivity using the following formula (from this point on we have a reflectivity image for our simulation):

$R_i = \frac{AI_{i+1} - AI_i}{AI_i + AI_{i+1}}$

Page 61: Student_Garden_geostatistics_course

Stochastic genetic procedures – convolution

61

[Figure: one convolution step of a trace, combining reflectivity values with the wavelet (offsets -4 to 4).]

Using the reflectivity image, for every trace and every depth position "i" we calculate a value which is the result of the convolution. Notice, however, that if I start at position i = 0 (the first value in the trace) the calculation is done not only at position "i" but over the whole interval [i - wavelet up size, i + wavelet down size]. So the same point is going to be involved in multiple operations. For instance, if the wavelet up size = 3, then i = 0 is updated when calculating at positions i = 0, 1, 2, 3, because the interval for that trace is [i-3, i+3] = [0, 6] when i = 3. When i = 4 the interval is [1, 7] and i = 0 is no longer involved. So the calculation of each position accumulates contributions:

$S_{i+k} = S_{i+k} + R_i \cdot W_k$, for every wavelet offset $k$ in the window

Notice, however, that although I'm calling $S$ a seismic value, that is only true once the trace is fully convolved. $R_i$ is the reflectivity and $W_k$ the wavelet value at offset k.
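A sketch of the reflectivity calculation and the convolution for one trace (the impedance and wavelet numbers are illustrative; numpy's convolve performs the windowed accumulation described above):

```python
import numpy as np

ai = np.array([2500.0, 2600.0, 2550.0, 3000.0, 3100.0, 2900.0])   # one AI trace
wavelet = np.array([-0.1, 0.3, 1.0, 0.3, -0.1])                    # centred wavelet

refl = (ai[1:] - ai[:-1]) / (ai[1:] + ai[:-1])        # R_i = (AI_{i+1}-AI_i)/(AI_i+AI_{i+1})
synthetic = np.convolve(refl, wavelet, mode="same")   # each R_i spreads over the wavelet window
print(np.round(refl, 3), np.round(synthetic, 3))
```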

Page 62: Student_Garden_geostatistics_course

Stochastic genetic procedures – Objective function

62

So now that we have a synthetic seismic and the real seismic, we can compare the two by computing the correlation between them (the following is the Pearson correlation; others can be used), where X and Y are the two series being compared:

$\rho = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$

The correlation is done using something we call a layer map. The layer map is an instruction (stochastically generated for each generation) defining the series used in the correlation. For instance:

[Figure: a correlation trace split into Layer 1, with a series of 3 values, and Layer 2, with a series of 5 values.]

The correlation is done for each trace using the series defined in the layer map. In the end we have a correlation image for the acoustic impedance simulation.

Page 63: Student_Garden_geostatistics_course

Stochastic genetic procedures – Objective function

63

Let's review all the steps for each simulation in the GSI procedure:

Acoustic impedance simulation → reflectivity image (slide 60) → synthetic seismic image, using the wavelet (slide 61) → correlation image, by comparing with the real seismic (slide 62)

In the end we keep two very important images: the acoustic impedance simulation and the correlation image for that simulation.

Page 64: Student_Garden_geostatistics_course

Stochastic genetic procedures – Objective function

64

[Figure: Generation 0 produces acoustic impedance simulations 0, 1, 2, …, n and their correlation images 0, 1, 2, …, n.]

As you can imagine, the first generation (we call it iteration 0) has "n" simulation images and "n" correlation images. So we can build one acoustic impedance image that has the best parts of all these simulations; by best I mean the parts with the highest correlations. As an example, if I want the best value for node 1, I search all the correlation images at node 1 for the one with the highest value. Then I take the value from the respective acoustic impedance simulation and put it in the best acoustic impedance image. I do this for all nodes. In the end I have the best acoustic impedance image and the best correlation image.

[Figure: the best acoustic impedance image and the best correlation image.]
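A minimal sketch of this node-by-node selection (random arrays stand in for the simulation and correlation cubes):

```python
import numpy as np

rng = np.random.default_rng(2)
ai_sims = rng.normal(size=(4, 30, 30))      # n simulations of acoustic impedance
corr_sims = rng.uniform(size=(4, 30, 30))   # matching correlation images

best_idx = corr_sims.argmax(axis=0)         # winning simulation per node
best_ai = np.take_along_axis(ai_sims, best_idx[None, ...], axis=0)[0]
best_corr = corr_sims.max(axis=0)
print(best_ai.shape, best_corr.shape)
```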

Page 65: Student_Garden_geostatistics_course

Stochastic genetic procedures – Objective function

65

So, moving from a single generation (iteration) to the whole procedure, we would get:

Generation 0 (iteration 0): simulations → best acoustic impedance image 0 and best correlation image 0
Generation 1 (iteration 1): co-simulations → best acoustic impedance image 1 and best correlation image 1
…
Generation n (iteration n): co-simulations → best acoustic impedance image n and best correlation image n

As you can probably guess, the higher the iteration, the higher the correlations of the simulations. What usually happens is that from some iteration onward the improvement is so small that doing more iterations would just be wasting time. At the end of the procedure you can see which simulation had the highest correlation of all; that is your best acoustic impedance model (not to be mistaken for the best image of each iteration).
