
SCALE MIXTURE OF GAUSSIANS MODELLING OF POLARIMETRIC SAR DATA

Anthony Doulgeris (1), Torbjørn Eltoft (1)

(1) University of Tromsø, Department of Physics and Technology, NO-9037 Tromsø, Norway

Email: {anthony.doulgeris,torbjorn.eltoft}@phys.uit.no

ABSTRACT

This paper discusses a multivariate, non-Gaussian parametric modelling technique for analysing fully polarimetric SAR (PolSAR) data. We investigate a simple class of multivariate non-Gaussian distributions, the 'scale mixture of Gaussians', and discuss their appropriateness for modelling radar data on both theoretical grounds and by observation. Three closed-form non-Gaussian scale mixture models are described and a fast parameter estimation method is presented. Each of the models, plus the Gaussian, is then analysed and compared for goodness-of-fit to real polarimetric radar data, taken from airborne fully polarimetric SAR studies. The intended application of the modelling is image classification based upon the new parametric feature set.

1 INTRODUCTION

Many previous works have looked at scale mixture models [1] for non-Rayleigh amplitude modelling and multivariate non-Gaussian modelling. These include the widely used K-distribution [2]–[4] and the normal inverse Gaussian distribution [5], [6]. The basic scale mixture approach is to model the non-Gaussian variable as the product of a scalar scale random variable with a normalised multivariate Gaussian variable. In this paper we investigate only the symmetric case of the more general models, which would also include a skewness term. The multidimensional scale mixture of Gaussians model used here is expressed as

$$Y = \mu + \sqrt{Z}\,\Gamma^{1/2} X, \qquad (1)$$

where µ is the mean vector of Y, Z is the (positive-only) scalar scale parameter, Γ is the internal covariance structure matrix, normalised such that det(Γ) = 1, and X is a standardised Gaussian variable with zero mean and identity covariance matrix, i.e. X ∼ N(0, I).
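To make the construction in (1) concrete, the following is a minimal sampling sketch and not the code used for the results in this paper. It assumes a Cholesky factor in place of the symmetric square root of Γ (both give the same distribution for Y), and an exponentially distributed Z chosen purely for illustration; the function names and the example values of µ and Γ are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Return lower-triangular L with L L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_scale_mixture(mu, root, draw_z, rng):
    """Draw one Y = mu + sqrt(Z) * root * X, with X ~ N(0, I) and Z = draw_z(rng)."""
    d = len(mu)
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    scale = math.sqrt(draw_z(rng))
    return [mu[i] + scale * sum(root[i][k] * x[k] for k in range(d)) for i in range(d)]


rng = random.Random(0)
mu = [0.0, 0.0]
gamma = [[2.0, 0.6], [0.6, 0.68]]   # hypothetical structure matrix, det = 1
mean_z = 3.0                        # assumed E{Z} for the illustration
root = cholesky(gamma)
samples = [sample_scale_mixture(mu, root, lambda r: r.expovariate(1.0 / mean_z), rng)
           for _ in range(20000)]
n = len(samples)
var0 = sum(y[0] * y[0] for y in samples) / n    # sample estimate of E{Z} * gamma[0][0]
cov01 = sum(y[0] * y[1] for y in samples) / n   # sample estimate of E{Z} * gamma[0][1]
```

With these assumed values the sample second moments should lie close to E{Z} Γ, i.e. near 6 and 1.8 for the two quantities computed.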

The scale mixture model is effectively a scheme, and many different model families can be derived depending on the distribution assumed for the scale parameter. This paper explores the three models described in [7]: the multivariate K-distribution (MK), from a Gamma distributed scale parameter; the multivariate normal inverse Gaussian (MNIG), from an inverse Gaussian distributed scale; and the multivariate Laplacian (ML), from an exponentially distributed scale. For comparison we also include the multivariate Gaussian (MG), which corresponds to a constant (or Dirac delta distributed) scale parameter.

Although these four model families can look very different, they all share some common traits because of the underlying scheme. All models are symmetric about their mean and, although each dimension may have a different relative width, as distributed by the normalised covariance matrix, each dimension will have a similar shape governed by the distribution of the scale parameter.

We first investigate whether the scale mixture of Gaussians models are well suited to radar data by checking the predicted property of a global shape in each dimension independently.

Secondly, we investigate which model, if any, is the most suitable by fitting all four parametric models to real PolSAR data sets and using the log-likelihood score as a measure of the goodness-of-fit to the data. The results can be depicted as a 'best fit' coloured map indicating which model was best at each location, and they clearly show the known result that radar data is often non-Gaussian. Additionally, by taking a log-likelihood threshold to separate good from poor fits, we show that the flexible two-parameter models are well suited to fitting most real PolSAR data points.

This paper presents a little scattering theory in section 2, describes the three models in section 3 and their parameter estimation in section 4, gives arguments for the suitability of the models in section 5, and discusses the model fitting results in section 6.

2 THEORY

The theoretical basis for the scale mixture model is that the radar reflection from a distributed target is a vector sum of many individual scattered signals within the resolution cell area,

$$E_r \propto \sum_{i=1}^{N} S_i E_t e^{j\phi_i}, \qquad (2)$$

where the received signal, E_r, is the vector sum of an unknown number N of scattered signals S_i E_t with independent phases φ_i. The individual scattered signals are the transmitted signal E_t transformed by the complex scattering matrix S_i, which describes the scattering characteristics of the target media.

If we consider the unknown number of scatterers, N, to be a random variable and take the continuous limit of the random walk, then we reach the scale mixture description in (1). The scale mixture modelling is essentially modelling the effect known as speckle.
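The effect of a fluctuating number of scatterers can be illustrated numerically. The sketch below is only an illustration under stated assumptions and is not part of the analysis in this paper: it assumes unit-amplitude scatterers in place of S_i E_t, looks at the in-phase component only, and assumes a gamma-shaped fluctuation of N; the function names are hypothetical.

```python
import math
import random


def in_phase_sum(n_scatterers, rng):
    """In-phase part of a sum of unit-amplitude scatterers with uniform random phases."""
    return sum(math.cos(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_scatterers))


def kurtosis(values):
    """Fourth standardised moment (equal to 3 for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


rng = random.Random(1)
mean_n = 30

# Fixed number of scatterers in every resolution cell.
fixed = [in_phase_sum(mean_n, rng) for _ in range(5000)]

# Assumption for illustration: N fluctuates from cell to cell (gamma law, same mean).
varying = [in_phase_sum(max(1, round(rng.gammavariate(2.0, mean_n / 2.0))), rng)
           for _ in range(5000)]

kurt_fixed = kurtosis(fixed)      # close to the Gaussian value of 3
kurt_varying = kurtosis(varying)  # larger than 3, i.e. heavier tails
```

A fixed N gives a near-Gaussian component, whereas letting N vary produces the heavier tails that the scale variable Z in (1) is intended to capture.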

It is commonly assumed that the target cell roughness is greater than the radar wavelength, so that the geometrical phase component, φ_i, can be considered uniformly random. This leads directly to the expected 'in phase' (I) and 'quadrature' (Q) signals, being cos(φ_i) and sin(φ_i) times the amplitude, having a mean value of zero.

This work views each complex scattering matrix coefficient as two real values, the I and Q signals. The four coefficients therefore give an 8-dimensional vector at each image location, that is, the real and imaginary parts of each of the VV, VH, HV and HH polarisations. Split in this way, it is easy to show that the main diagonal elements of the corresponding covariance matrix have pair-wise equal values, and that the off-diagonal elements within each pair are zero (i.e. the I and Q parts are uncorrelated). Natural scattering symmetries also lead to many other zero elements in the covariance matrix structure, and reciprocity implies that the four cross-polar terms are equally distributed.
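The real-valued splitting can be sketched as follows. The ordering of the eight components (VV, VH, HV, HH, each as real part then imaginary part) is an assumption made for the illustration, as are the function name and the example coefficient values.

```python
def to_real_vector(s_vv, s_vh, s_hv, s_hh):
    """Stack the real and imaginary parts of the four complex coefficients."""
    vector = []
    for coefficient in (s_vv, s_vh, s_hv, s_hh):
        vector.extend((coefficient.real, coefficient.imag))
    return vector


# Hypothetical pixel; the cross-polar terms are set equal, as under reciprocity.
pixel = to_real_vector(0.02 + 0.01j, -0.003 + 0.004j, -0.003 + 0.004j, 0.015 - 0.02j)
```

With this ordering the cross-polar terms occupy the centre four positions of the vector.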

3 THE MODELS

Given a probability density function (pdf) for the scale parameter, f_Z(z), the marginal pdf for Y can be obtained by averaging the pdf of Y|Z over the density of Z

$$f_Y(y) = \int_0^\infty f_{Y|Z}(y|z)\, f_Z(z)\, dz = \int_0^\infty \frac{1}{(2\pi z)^{d/2}} \exp\left(-\frac{1}{2z}\, q(y)\right) f_Z(z)\, dz, \qquad (3)$$

where for brevity we have defined

$$q(y) = (y - \mu)^T \Gamma^{-1} (y - \mu). \qquad (4)$$

The three distributions derived in this manner are named as multivariate extensions of existing one-dimensional distributions and retain the general characteristics of their namesakes.

3.1 Multivariate Laplacian

The multivariate Laplacian, ML(λ, µ, Γ), is derived with Z from an exponential distribution

$$f_Z(z) = \frac{1}{\lambda} \exp\left(-\frac{z}{\lambda}\right), \qquad (5)$$

and the d-dimensional marginal pdf of Y, using (3), becomes

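$$f_Y(y) = \frac{1}{(2\pi)^{d/2}}\, \frac{2}{\lambda}\, \frac{K_{d/2-1}\left(\sqrt{\frac{2}{\lambda}\, q(y)}\right)}{\left(\sqrt{\frac{\lambda}{2}\, q(y)}\right)^{d/2-1}}, \qquad (6)$$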

where K_m(x) denotes the modified Bessel function of the second kind and order m, evaluated at x.
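As a consistency check, (6) can be recovered from (3) and (5) with a standard integral representation of the modified Bessel function. The steps below are a sketch; they use the fact that, from (1), Y given Z = z is Gaussian with mean µ and covariance zΓ, with det(Γ) = 1.

```latex
% Standard identity, valid for a > 0, b > 0:
\int_0^\infty z^{\nu-1} \exp\!\left(-\frac{a}{z} - b z\right) dz
  = 2 \left(\frac{a}{b}\right)^{\nu/2} K_\nu\!\left(2\sqrt{ab}\right)

% Conditional density implied by (1):
f_{Y|Z}(y|z) = (2\pi z)^{-d/2} \exp\!\left(-\frac{q(y)}{2z}\right)

% Insert (5) and apply the identity with \nu = 1 - d/2, a = q(y)/2, b = 1/\lambda:
f_Y(y) = \frac{1}{(2\pi)^{d/2}\,\lambda}
         \int_0^\infty z^{-d/2} \exp\!\left(-\frac{q(y)}{2z} - \frac{z}{\lambda}\right) dz
       = \frac{1}{(2\pi)^{d/2}}\,\frac{2}{\lambda}
         \left(\frac{\lambda\, q(y)}{2}\right)^{(1-d/2)/2}
         K_{1-d/2}\!\left(\sqrt{\frac{2}{\lambda}\, q(y)}\right)

% Since K_{-\nu}(x) = K_{\nu}(x), this is (6).
```

The same identity, with different values of ν, a and b, also covers the Gamma and inverse Gaussian scale densities.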

3.2 Multivariate K

The multivariate K, MK(α, λ, µ, Γ), is formed when Z is from a Gamma distribution

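$$f_Z(z) = \frac{\lambda^{\alpha+1} z^{\alpha}}{\Gamma(\alpha + 1)} \exp(-\lambda z), \qquad (7)$$

where Γ(α + 1) denotes the gamma function (not the structure matrix), and the general d-dimensional pdf of Y is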

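$$f_Y(y) = \frac{2}{(2\pi)^{d/2}}\, \frac{\lambda^{\alpha+1}}{\Gamma(\alpha + 1)} \left(\sqrt{\frac{q(y)}{2\lambda}}\right)^{\alpha+1-d/2} K_{\alpha+1-d/2}\left(\sqrt{2\lambda\, q(y)}\right). \qquad (8)$$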

3.3 Multivariate Normal Inverse Gaussian

When Z has an inverse Gaussian distribution,

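$$f_Z(z) = \frac{\delta}{\sqrt{2\pi}}\, e^{\delta\gamma} z^{-3/2} \exp\left(-\frac{1}{2}\left(\frac{\delta^2}{z} + \gamma^2 z\right)\right), \qquad (9)$$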

the mixture model produces the multivariate normal inverse Gaussian, MNIG(δ, γ, µ, Γ), with marginal density function

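$$f_Y(y) = 2\delta e^{\delta\gamma} \left(\frac{\gamma}{2\pi\sqrt{\delta^2 + q(y)}}\right)^{(d+1)/2} K_{(d+1)/2}\left(\gamma\sqrt{\delta^2 + q(y)}\right). \qquad (10)$$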

[Figure 1: four panels labelled Gaussian, Laplacian, K dist. and NIG.]

Fig. 1. Example shapes for each model distribution, fixed width.

3.4 Properties

All models are symmetric about the mean and, although each dimension may have a different relative width, as distributed by the covariance matrix Γ, they will each have a similar (global) shape governed by the scalar parameters. All of the non-Gaussian models are also sparse distributions, meaning that they are more pointed in the peak and heavier tailed than the Gaussian. The MG and ML distributions have a fixed shape, and the scalar parameter varies only the width. The two scalar parameters of the MK and MNIG lead to a range of shapes as well as overall widths. The shapes range from more pointed than the Laplacian through to rounded like the Gaussian (see fig. 1). The shape does not vary linearly with the value of the shape parameter: most of the variation occurs near zero, and the shape converges asymptotically towards the Gaussian. Also note that the pdfs of both the ML and MK distributions can go to infinity at the mean value, whereas the MNIG always has a finite peak.
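The behaviour at the mean can be read from the small-argument limit of the Bessel function. The following is a sketch of that argument using standard limiting forms, with x standing for the Bessel argument in each pdf; here Γ(ν) is the gamma function.

```latex
% Limiting forms as x -> 0+:
K_\nu(x) \sim \tfrac{1}{2}\,\Gamma(\nu) \left(\frac{2}{x}\right)^{\nu} \ (\nu > 0),
\qquad K_0(x) \sim -\ln x

% ML, (6): \nu = d/2 - 1, x = \sqrt{2 q(y)/\lambda}, and \sqrt{\lambda q(y)/2} = \lambda x / 2, so
f_Y(y) \propto \frac{K_\nu(x)}{x^{\nu}} \sim x^{-2\nu} \to \infty
\quad \text{as } q(y) \to 0 \text{ for } d > 2
\quad (\text{logarithmically for } d = 2)

% MK, (8): \nu = \alpha + 1 - d/2, x = \sqrt{2 \lambda q(y)}, so
f_Y(y) \propto x^{\nu} K_{\nu}(x) \to
\begin{cases}
  \text{finite}, & \alpha + 1 > d/2 \\
  \infty,        & \alpha + 1 \le d/2
\end{cases}

% MNIG, (10): the argument \gamma \sqrt{\delta^2 + q(y)} \ge \gamma\delta > 0
% never reaches zero, so the peak is always finite.
```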

The discrete mixture of Gaussians method was also reviewed, because many of the non-Gaussian shapes may be created by a mixture of several independent Gaussian distributions. However, there is no clear rule to determine how many Gaussians to use, or why, and the estimation would be poor with the reasonably small sample sizes used (we use as few as 169 samples of 8 dimensions). The scale mixture of Gaussians is a much more compact description, with only 3 or 4 parameters, that still captures the overall shape of the distribution.

4 PARAMETER ESTIMATION

Statistical modelling is achieved by a local neighbourhood approach to image analysis. Every pixel is modelled using the statistical data contained in that pixel and its surrounding neighbours. In practice, we found that we needed a neighbourhood of 13×13 pixels (20 m × 20 m) for reasonable estimation accuracy. Larger windows may be used to improve statistical accuracy, but then there is a greater likelihood of including a mixture of terrain types in the statistical collection, and therefore of blurring the overall result.
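The neighbourhood collection can be sketched as below. The nested-list image layout, the function name and the handling of image borders (pixels outside the image are simply skipped) are assumptions made for the illustration; the text above does not specify them.

```python
def window_samples(image, row, col, half_width=6):
    """Collect the data vectors in the (2 * half_width + 1)^2 neighbourhood of a pixel.

    `image` is a list of rows, each row a list of per-pixel data vectors.
    Pixels outside the image are skipped, so border windows hold fewer samples.
    """
    rows, cols = len(image), len(image[0])
    samples = []
    for r in range(max(0, row - half_width), min(rows, row + half_width + 1)):
        for c in range(max(0, col - half_width), min(cols, col + half_width + 1)):
            samples.append(image[r][c])
    return samples


# Hypothetical 20 x 20 image of 8-dimensional vectors, for illustration only.
image = [[[float(r + c)] * 8 for c in range(20)] for r in range(20)]
interior = window_samples(image, 10, 10)   # full 13 x 13 window
corner = window_samples(image, 0, 0)       # truncated 7 x 7 window at the corner
```

An interior pixel yields the 169 samples of 8 dimensions mentioned in section 3.4.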

Observe that, since Z and X are assumed independent, from (1) we obtain the expectations

$$E\{Y\} = \mu \qquad (11)$$

and

$$E\{(Y - \mu)(Y - \mu)^T\} = E\{Z\}\,\Gamma. \qquad (12)$$

Hence, for all of the models, we can estimate the parameter µ directly from the sample mean, and Γ from the normalised sample covariance matrix.

The other parameters can be estimated from one or two moment estimates for Z, related back to each chosen distribution. Key to this approach is the assumed separability of X and Z. Initially, an iterative EM approach was used (as in [7]); however, the long processing time proved impractical over a large image, and we have chosen to use a moment-based method on the grounds of speed.

The first moment of Z is derived from the determinant of the sample covariance matrix, and the second moment of Z from a multivariate fourth-order moment of the sample, Mardia's kurtosis [8], defined as

$$\beta_{2,d} = E\{[(Y - \mu)^T \Sigma^{-1} (Y - \mu)]^2\}, \qquad (13)$$

where Σ is the covariance matrix of Y and d is the data dimension. The special case of a standard normal variable N(0, I) gives

$$\beta_{2,d} = d(d + 2). \qquad (14)$$
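The value in (14) follows from the moments of the chi-squared distribution; a short worked version is given here for completeness.

```latex
% For X ~ N(0, I) in d dimensions, X^T X is chi-squared with d degrees of freedom:
E\{X^T X\} = d, \qquad \mathrm{Var}\{X^T X\} = 2d

% Hence, with \mu = 0 and \Sigma = I in (13):
\beta_{2,d} = E\{(X^T X)^2\}
            = \mathrm{Var}\{X^T X\} + \left(E\{X^T X\}\right)^2
            = 2d + d^2 = d(d + 2)

% For the 8-dimensional data vectors used here, \beta_{2,8} = 80.
```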

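With the use of (12) and (14),

$$E\{[(Y - \mu)^T \Sigma^{-1} (Y - \mu)]^2\} = E\left\{\left[(\sqrt{Z}\,\Gamma^{1/2} X)^T\, \frac{\Gamma^{-1}}{E\{Z\}}\, (\sqrt{Z}\,\Gamma^{1/2} X)\right]^2\right\} = \frac{E\{Z^2\}}{(E\{Z\})^2}\, d(d + 2), \qquad (15)$$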

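which can be used to estimate E{Z^2} thus

$$E\{Z^2\} \approx \frac{(E\{Z\})^2}{N} \sum_{i=1}^{N} \frac{[(y_i - \mu)^T \Sigma^{-1} (y_i - \mu)]^2}{d(d + 2)}, \qquad (16)$$

where N is here the number of samples y_i in the neighbourhood, and µ, Σ = E{Z} Γ and E{Z} are calculated from the sample mean and sample covariance.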
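The moment estimates can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used for the results: it evaluates the quadratic form of (13) with the inverse of the sample covariance (the estimate of E{Z} Γ), uses a Cholesky factorisation for both the determinant and the quadratic form, and tests on synthetic two-dimensional data with an exponential scale. All names and test values are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Return lower-triangular L with L L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def scale_moments(samples):
    """Estimate (E{Z}, E{Z^2}) from the sample covariance and Mardia's kurtosis."""
    n, d = len(samples), len(samples[0])
    mean = [sum(y[i] for y in samples) / n for i in range(d)]
    cov = [[sum((y[i] - mean[i]) * (y[j] - mean[j]) for y in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    lower = cholesky(cov)
    # det(cov) = prod(L_ii)^2, and det(cov) = E{Z}^d because det(Gamma) = 1.
    ez = math.exp(2.0 * sum(math.log(lower[i][i]) for i in range(d)) / d)
    total = 0.0
    for y in samples:
        # Forward substitution solves L w = y - mean; w.w is the quadratic form.
        w = []
        for i in range(d):
            resid = y[i] - mean[i] - sum(lower[i][k] * w[k] for k in range(i))
            w.append(resid / lower[i][i])
        total += sum(v * v for v in w) ** 2
    kurt = total / n                       # sample version of (13)
    ez2 = ez * ez * kurt / (d * (d + 2))   # rearranged from (15)
    return ez, ez2


# Synthetic test data: exponential scale with E{Z} = 3 and E{Z^2} = 18,
# and Gamma = diag(2, 0.5), which has unit determinant.
rng = random.Random(2)
samples = []
for _ in range(20000):
    z = rng.expovariate(1.0 / 3.0)
    samples.append([math.sqrt(2.0 * z) * rng.gauss(0.0, 1.0),
                    math.sqrt(0.5 * z) * rng.gauss(0.0, 1.0)])
ez, ez2 = scale_moments(samples)
```

The fourth-order estimate is noticeably noisier than the determinant-based first moment, which is one reason a reasonably large window is needed.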

In the case of the multivariate Laplacian, the first moment of Z from (5) can be shown to be E{Z} = λ. Combined with the observation of (12) and the normalisation det(Γ) = 1, we can estimate λ directly from the determinant of the sample covariance thus

$$\lambda = E\{Z\} \approx \left(\det(\mathrm{cov}(Y))\right)^{1/d}. \qquad (17)$$

The multivariate K and NIG distributions have two parameters each, and therefore require two moment estimates. For the K-distribution, with Z Gamma distributed as in (7), we have

$$E\{Z\} = \frac{\alpha + 1}{\lambda}, \qquad (18)$$

$$E\{Z^2\} = \frac{(\alpha + 1)(\alpha + 2)}{\lambda^2}, \qquad (19)$$

which can be solved to find the estimates of α and λ.
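Dividing (19) by the square of (18) gives E{Z^2}/(E{Z})^2 = (α + 2)/(α + 1), from which the solution follows in closed form. A minimal sketch, with a hypothetical function name:

```python
def fit_k_scale(ez, ez2):
    """Solve (18) and (19) for (alpha, lam) given estimates of E{Z} and E{Z^2}."""
    ratio = ez2 / (ez * ez)            # equals (alpha + 2) / (alpha + 1)
    if ratio <= 1.0:
        raise ValueError("need E{Z^2} > (E{Z})^2 for a finite shape parameter")
    shape = 1.0 / (ratio - 1.0)        # this is alpha + 1
    return shape - 1.0, shape / ez     # (alpha, lam)


# Round trip: alpha = 2, lam = 2 gives E{Z} = 1.5 and E{Z^2} = 3.0.
alpha, lam = fit_k_scale(ez=1.5, ez2=3.0)
```

As the ratio approaches one, the shape parameter grows without bound, which corresponds to the Gaussian limit of a constant scale.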

The inverse Gaussian distribution, (9), has moments

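$$E\{Z\} = \frac{\delta}{\gamma}, \qquad (20)$$

$$E\{Z^2\} = \left(1 + \frac{1}{\delta\gamma}\right) \frac{\delta^2}{\gamma^2}, \qquad (21)$$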

and we can therefore solve for δ and γ.
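Here the ratio E{Z^2}/(E{Z})^2 equals 1 + 1/(δγ), so the product δγ and the quotient δ/γ are both known and the two parameters follow directly. A minimal sketch, with a hypothetical function name:

```python
import math


def fit_inverse_gaussian_scale(ez, ez2):
    """Solve (20) and (21) for (delta, gamma) given estimates of E{Z} and E{Z^2}."""
    ratio = ez2 / (ez * ez)              # equals 1 + 1 / (delta * gamma)
    if ratio <= 1.0:
        raise ValueError("need E{Z^2} > (E{Z})^2")
    product = 1.0 / (ratio - 1.0)        # delta * gamma
    delta = math.sqrt(product * ez)      # (delta * gamma) * (delta / gamma) = delta^2
    gamma = math.sqrt(product / ez)      # (delta * gamma) / (delta / gamma) = gamma^2
    return delta, gamma


# Round trip: delta = 2, gamma = 4 gives E{Z} = 0.5 and E{Z^2} = 0.28125.
delta, gamma = fit_inverse_gaussian_scale(ez=0.5, ez2=0.28125)
```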

In the general case, all the model parameters are free to be optimised in the fitting procedures. However, if we have some a priori knowledge about a parameter's value, we would expect a better model fit by actually constraining it. The most obvious constraint in our radar data is the expected zero mean, and we have further zero and pair-value constraints on the covariance structure matrix. Simulation studies show that applying the constraints greatly improves the estimated parameters, particularly when the sample size is small, with the covariance constraints being the most significant. However, in this paper the modelling has been left free because the sample sizes are large enough that the mean is very nearly zero and the covariance constraints are approximately met anyway.

5 SUITABILITY

If we take our assumption of scale mixture of Gaussians modelling and our theoretical radar scattering as a vector sum with uniformly random phase, then two main properties emerge: zero mean and global shape. It seemed appropriate to investigate whether the real PolSAR data showed similar general features, as a validation for using such a mixture model.

Fig. 2 shows three different sets of real PolSAR data distributions, depicted as marginal Parzen estimates for each of the 8 dimensions of the data vectors. The intention is to observe the two general features of the data. Firstly, it can be seen that all sets of data do appear to be distributed with a mean of zero in each dimension. Secondly, taking each set individually (columns), it appears that the shape is consistent down all dimensions (rows), although the shape can appear distinctly different from one sample location to another. Specifically, the first set, (a), has a Gaussian-like roundness to each dimensional distribution; the second set, (b), has a reasonably pointed peak with smoothly decaying sides, quite Laplacian in appearance; and the third set, (c), has a marked kink in the sides and is very heavy tailed, again with each dimension showing basically the same shape.

It is also interesting to note that the polarimetric information becomes visible in the form of the different widths of each dimension, which can be roughly all similarly scaled, as in the first set (a), or with some wide and some narrow, as in the other two plots. Also note the pair-wise equality in the distributions, because of our real and imaginary splitting of the data, and the centre four dimensions being equal due to reciprocity.

6 MODELLING RESULTS

After obtaining four parametric descriptions of the data, we compare a goodness-of-fit measure for each to determine which model fits best. Since we are comparing four different parametric descriptions of the same data set, it is sufficient to use a relative ranking measure only; we do not require an absolute, or normalised, measure of fit. The log-likelihood measure is fast and efficient, and simply requires summing the log of the model pdf value at each data point. The logarithmic nature of this measure also makes it sensitive to differences in the tails of the distributions, and it is therefore well suited for testing sparse distributions.
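The ranking step can be sketched in one dimension, where the Bessel function in (6) reduces to an exponential and the Laplacian pdf becomes (2λ)^(-1/2) exp(-sqrt(2/λ) |y − µ|). The sketch below is a simplification for illustration only: it compares just the Gaussian and Laplacian models on synthetic univariate data, with both fitted by moments; the function names and test data are hypothetical.

```python
import math
import random


def log_likelihoods_1d(samples):
    """Log-likelihood scores of moment-fitted 1-D Gaussian and Laplacian models."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((y - mean) ** 2 for y in samples) / n
    # Gaussian with the sample mean and variance.
    ll_gauss = sum(-0.5 * math.log(2.0 * math.pi * var) - (y - mean) ** 2 / (2.0 * var)
                   for y in samples)
    # One-dimensional case of (6), with lam set to the sample variance as in (17).
    lam = var
    ll_laplace = sum(-0.5 * math.log(2.0 * lam) - math.sqrt(2.0 / lam) * abs(y - mean)
                     for y in samples)
    return {"MG": ll_gauss, "ML": ll_laplace}


def best_model(scores):
    """Name of the model with the highest log-likelihood score."""
    return max(scores, key=scores.get)


rng = random.Random(3)
gaussian_data = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Laplacian data generated as a scale mixture: sqrt(Z) * X with exponential Z.
laplacian_data = [math.sqrt(rng.expovariate(1.0)) * rng.gauss(0.0, 1.0)
                  for _ in range(2000)]
best_for_gaussian = best_model(log_likelihoods_1d(gaussian_data))
best_for_laplacian = best_model(log_likelihoods_1d(laplacian_data))
```

Because all models are scored on the same samples, only the ordering of the scores matters, which is why no normalisation of the measure is needed.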

[Figure 2: three columns of eight stacked panels, one per data dimension, each showing a marginal density estimate. Panel titles: Bleikvatnet example, (a) Rounded, Gaussian-like; Foulum example, (b) Smooth, Laplacian-like; Okstinden example, (c) Kinked, heavy tailed.]

Fig. 2. Density distributions from three real PolSAR sample locations showing example shape representations for all 8 data dimensions.

A 'best fit' map is produced by goodness testing all four fitted models and mapping the chosen best-fitted model in different colours: white for Gaussian, green for Laplacian, red for K-distribution and blue for normal inverse Gaussian. We also observed that, although one model may be chosen as the best fit, some of the other models may be quite reasonable fits, with very similar log-likelihood scores. By choosing an empirical threshold of 0.5% of the log-likelihood score, we can produce a 'coverage' map for each distribution that shows not only where it was the best fit, but also where it was considered a 'good' fit. Where each model was the best fit is depicted in red, a good fit in magenta and a poor fit in black. Of real interest are in fact the regions where a model was considered a poor fit (black) and thus does not represent the data very well at all.
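The three coverage classes can be sketched as follows. How exactly the 0.5% threshold is applied is not spelled out above, so the sketch assumes it is taken relative to the magnitude of the best score at that location; the function name and the example scores are hypothetical.

```python
def coverage_class(scores, model, threshold=0.005):
    """Classify one model at one location as 'best', 'good' or 'poor'."""
    best = max(scores.values())
    if scores[model] == best:
        return "best"
    # Assumption: 'within 0.5%' is measured relative to the best score.
    if best - scores[model] <= threshold * abs(best):
        return "good"
    return "poor"


# Hypothetical log-likelihood scores for a single pixel neighbourhood.
scores = {"MG": 4350.0, "ML": 4100.0, "MK": 4495.0, "MNIG": 4500.0}
classes = {name: coverage_class(scores, name) for name in scores}
```

In this hypothetical case the MNIG is the best fit, the MK is within the threshold and counts as a good fit, and the Gaussian and Laplacian are poor fits.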

The modelling was tested on several different real PolSAR data images to compare the behaviour for quite different terrain types. The 'Bleikvatnet' (a) mountain lake and forest area and the 'Okstinden' (b) mountain glacier area, both in Norway, are from airborne EMISAR flights in 1995. The 'sea ice' (c) image is from an airborne CONVAIR flight in Canada in 2001, and the 'Foulum' (d) agricultural and urban area is from an airborne EMISAR flight over Denmark in 1998. The best fit maps, and reference intensity maps, are shown for all four areas in fig. 3, from which the following observations may be made:

• Uniform, smooth or homogeneous areas are usually best fitted as Gaussian (white), as seen in the central lake area in (a), the large open snow areas in (b), the (presumably) snow covered old ice patches in (c) and the water inlet and several large fields in (d).

• The land in general, the visible icy crevasses, rocky outcrops, urban areas and anything with small-scale details and high contrast are clearly non-Gaussian in nature and were poorly fitted by the Gaussian model.

• All types of vegetated land appear to be best described by the normal inverse Gaussian distribution, whereas the visible glacial ice and sea ice are better described by the K-distribution, although the difference compared to the NIG was negligible.

• The urban areas and coastlines are more often best fitted by the Laplacian; however, this may be due to high contrast edge mixture effects, because it appears at all water/land boundaries, around point sources like known huts within the forest, and along hedge/fence lines around fields.

The coverage maps for each separate model and for each area are shown in fig. 4. We observe the following:

• The Gaussian model is usually a poor fit for significant parts of the image area, around 20% or more.

• The Laplacian model is very good at detecting edges and point sources and is otherwise very poor at fitting natural terrain types. Its seemingly good fit for urban areas is presumably because of the predominance of points and edges of mixed terrain in the urban landscape.

• In all cases the two parameters of the MK and MNIG give a shape space that finds a 'good' fit for the majority of the data points (over 90%), and they mostly 'fail', i.e. are more poorly fitted, at the high contrast edges and point sources.

• The normal inverse Gaussian model has the greatest 'good' fitted area for all areas and is usually the most frequent best fit also.

[Figure 3: for each area, an intensity image and the corresponding 'best fit' map, with the percentage of pixels best fitted by each model given in the panel titles.]

(a) Mountain lake and forest (Bleikvatnet sub-area intensity, C-band, July). Best fit: MG 38.2385%; ML 3.3255%; MK 17.5594%; MNIG 40.8766%.

(b) Mountain glacier: snow, ice and rock. Best fit: MG 53.70%; ML 4.76%; MK 13.55%; MNIG 27.99%.

(c) Sea ice. Best fit: MG 16.63%; ML 0.35%; MK 49.24%; MNIG 33.78%.

(d) Agricultural, urban and water. Best fit: MG 16.29%; ML 25.67%; MK 9.69%; MNIG 48.36%.

Fig. 3. Intensity and 'best fit' coloured maps for the four areas. Gaussian in white, Laplacian in green, K-dist. in red and normal inverse Gaussian in blue.

The reason the Laplacian seems better at edges than either the MK or MNIG is probably the influence of the outer shoulder in those distributions. If the 'edge' distribution is indeed a mix of a very narrow and a very broad distribution, then it is unlikely to have the expected peak height for the apparent shoulder and tail of either the MK or MNIG, and the best fit parameters seem to emphasise the tail region. The Laplacian has a fixed shape and simply fits roughly centrally over the middle 'kink', and therefore leads to a higher goodness-of-fit score. This can be seen in the simple example of a mixture of two Gaussians in fig. 5. The mixed distribution is shown in black. Observe that the Laplacian (in green) fits higher up the peak and lower in the tails, whereas the K-distribution (red) and normal inverse Gaussian (blue dashed) are almost identical and fit most tightly in the tails.

It is important to remember that only the goodness-of-fit testing of each model has been depicted in the figures so far, and not an actual image segmentation based upon the modelled parameters. Image segmentation of the parametric feature set derived from the modelling has a much richer distinguishing power, as seen in a simple preliminary segmentation of the Bleikvatnet area shown in fig. 6. Depicted is an unsupervised image classification created by a discrete mixture of Gaussians clustering to 16 classes on a 10% sub-sample of the image, with subsequent Bayesian classification of the whole image to the 16 Gaussian classes obtained. The input feature set is the width and shape parameters, transformed by logarithm to improve linearity, plus four key Γ matrix elements.

[Figure 4: coverage maps for each model in each area. The panel titles give the percentage of pixels for which each model came top / was within 0.5% / was a poor fit:]

(a) Bleikvatnet area: Gaussian 38.2385 / 41.7931 / 19.9683; Laplacian 3.3255 / 6.432 / 90.2425; K-distribution 17.5594 / 73.6287 / 8.812; NIG 40.8766 / 54.5686 / 4.5548.

(b) Okstinden area: Gaussian 53.70 / 19.78 / 26.52; Laplacian 4.76 / 10.44 / 84.80; K-distribution 13.5469 / 72.7993 / 13.6539; NIG 27.9908 / 64.9808 / 7.0283.

(c) Sea ice area: Gaussian 16.63 / 59.82 / 23.55; Laplacian 0.35 / 1.60 / 98.06; K-distribution 49.24 / 48.98 / 1.78; NIG 33.78 / 63.67 / 2.55.

(d) Foulum area: Gaussian 16.29 / 16.47 / 67.25; Laplacian 25.67 / 27.69 / 46.64; K-distribution 9.69 / 38.52 / 51.79; NIG 48.36 / 33.54 / 18.11.

Fig. 4. Coverage maps for all four models and all areas. Best fit in red, good fit in magenta and poor fit in black.

[Figure 5: density plot over the range −0.5 to 0.5. Panel title: LL scores: MG = 4350.9911, ML = 5656.0925, MK = 4831.0024, MNIG = 4713.1035.]

Fig. 5. Two part discrete mixture of Gaussians (black) fitted as Laplacian (green), K-dist. (red) and normal inverse Gaussian (blue dashed).

[Figure 6: segmentation map. Panel title: ZG data set, BMoG classify into 16 classes.]

Fig. 6. Bleikvatnet image segmentation example based upon modelled parameters. 16 classes.

7 CONCLUSION

The scale mixture of Gaussians models do indeed seem well suited to modelling PolSAR data, which show inherently sparse distributions with a global shape for each dimension.

We have confirmed that many terrain types are clearly non-Gaussian in nature and that a flexible, two-parameter model is able to capture the full shape range of PolSAR data distributions. Different terrain types can show quite different distribution shapes; therefore the shape parameter should be of benefit to subsequent image segmentation.

It was demonstrated that the normal inverse Gaussian distribution is the best fitting model of those analysed, and usually better than the commonly used K-distribution. It captures a greater proportion of the distribution shape variations and has less trouble at boundary mixtures than the MK. The normal inverse Gaussian also has strong theoretical grounds derived from Brownian motion theory.

Future work might include:

• Investigating discrete mixture models, perhaps fitting a mixture of MNIGs, to try to reduce the edge mixture effects

• Applying decomposition theorems based upon physical constraints to the model parameters

• Attempting to find physical interpretations, such as texture, for the modelled parameters

• And, of course, investigating image classification with the new features.

8 ACKNOWLEDGMENTS

The authors would like to thank Prof. Henning Skiver, Dr Jorgen Dalli and the Danish Technical University for the Foulum data-set. Thanks to Dr Daniel Delisle, Dr Sahebi Mahmod Reza and the Canadian Space Agency for the sea ice data-set. Both data-sets were downloaded from the European Space Agency website (http://earth.esa.int/polsarpro/datasets.html). Thanks to Norut IT of Tromsø, Norway, for the Bleikvatnet and Okstinden data-sets.

9 REFERENCES

[1] A. F. Andrews and C. L. Mallows, “Scale mixtures of normal distributions,” Journal of the Royal Statistical Society, Series B, vol. 36, no. 1, pp. 99–102, 1974.

[2] E. Jakeman and P. N. Pusey, “A model for non-Rayleigh sea echo,” IEEE Trans. Antennas Propagat., vol. 24, no. 6, pp. 806–814, Nov. 1976.

[3] S. H. Yueh, J. A. Kong, J. K. Jao, R. T. Shin, and L. M. Novak, “K-Distribution and polarimetric terrain radar clutter,” J. Electro. Waves Applic., vol. 3, pp. 747–768, 1989.

[4] E. Jakeman and R. J. A. Tough, “Generalized K distribution: a statistical model for weak scattering,” J. Opt. Soc. Am. A, vol. 4, no. 9, pp. 1764–1772, Sept. 1987.

[5] O. E. Barndorff-Nielsen, “Normal Inverse Gaussian Distributions and Stochastic Volatility Modelling,” Scand. J. Statist., vol. 24, pp. 1–13, 1997.

[6] T. A. Øigard, A. Hanssen, R. E. Hansen, and F. Godtliebsen, “EM-estimation and modeling of heavy-tailed processes with the multivariate normal inverse Gaussian distribution,” Signal Processing, vol. 85, no. 8, pp. 1655–1673, 2005.

[7] T. Eltoft, T. Kim, and T.-W. Lee, “Multivariate scale mixture of Gaussians models,” in Proceedings of the International Conference on Independent Component Analysis 2006, Charleston, SC, USA, Mar. 2006.

[8] K. V. Mardia, “Measure of Multivariate Skewness and Kurtosis with Applications,” Biometrika, vol. 57, no. 3, pp. 519–530, Dec. 1970.
