Modelling the distribution of fish accounting for spatial correlation and overdispersion

Peter Lewy and Kasper Kristensen

Abstract: The spatial distribution of cod (Gadus morhua) in the North Sea and the Skagerrak was analysed over a 24-year period using the Log Gaussian Cox Process (LGCP). In contrast to other spatial models of the distribution of fish, LGCP avoids problems with zero observations and includes the spatial correlation between observations. It is therefore possible to predict and interpolate unobserved densities at any location in the area. This is important for obtaining unbiased estimates of stock concentration and other measures depending on the distribution in the entire area. Results show that the spatial correlation and dispersion of cod catches remained unchanged during winter throughout the period, in spite of a drastic decline in stock abundance and a movement of the centre of gravity of the distribution towards the northeast in the same period. For the age groups considered, the concentration of the stock was found to be constant or declining in the period. This means that cod does not follow the theory of density-dependent habitat selection, as the concentration of the stock does not increase when stock abundance decreases.

Résumé : Nous avons analysé la répartition spatiale de la morue (Gadus morhua) dans la mer du Nord et le Skagerrak sur une période de 24 ans à l’aide du processus log-gaussien de Cox (LGCP). Contrairement aux autres modèles spatiaux de répartition des poissons, le LGCP évite les problèmes reliés à l’absence d’observations et inclut la corrélation spatiale entre les observations. Il est, par conséquent, possible de prédire et d’interpoler les densités non mesurées dans n’importe quel site dans la région. Cela est important pour obtenir des estimations non erronées de la concentration du stock et d’autres mesures qui dépendent de la répartition dans la région toute entière. Les résultats montrent que la corrélation spatiale et la dispersion des captures de morues restent inchangées durant l’hiver pendant toute la période, malgré un déclin spectaculaire de l’abondance du stock et un déplacement vers le nord-est du centre de gravité de la répartition durant la même période. Pour les groupes d’âge considérés, la concentration du stock est demeurée constante ou a décliné durant la période. Cela signifie que les morues ne se conforment pas à la théorie de la sélection des habitats dépendante de la densité, puisque la concentration du stock n’augmente pas lorsque l’abondance du stock diminue.

[Traduit par la Rédaction]

Introduction

Knowledge of the spatial distribution of fish and its temporal changes is important for the fishery and fishery management and for understanding the mechanisms of fish behaviour. The distribution of cod (Gadus morhua) has been analysed in several studies (Perry et al. 2005; Rindorf and Lewy 2006). These analyses used a single point, the centre of gravity, as an overall measure to describe changes in the spatial distribution. However, if we want to study the spatial distribution of stock abundance in the entire area, another type of modelling is required.

Previously, fishery or scientific survey data have been analysed assuming that observations are independent, irrespective of trawl position, and distributed according to either extensions of the lognormal (Stefánsson 1996) or negative binomial distributions (O’Neill and Faddy 2003; Kristensen et al. 2006). Hrafnkelsson and Stefánsson (2004) presented extensions of the multinomial distribution to account for dispersion and correlation in length measurement samples. To avoid the assumptions of independent observations, other authors used kriging to account for spatial correlation in the analysis of trawl and acoustic survey data (Rivoirard et al. 2000; Stelzenmüller et al. 2005). Estimating parameters using kriging methods, however, requires that data follow a multivariate normal distribution, an assumption that usually is not fulfilled, at least not when part of the data consists of zeroes. The log(catch + constant) transformation is often applied to avoid this problem, a solution which is problematic because the results depend heavily on the choice of the constant. Here we instead use a counting model to describe the discrete catch in number of observations (including the zero catch observations) and to account for the spatial correlation between catches. This model, the so-called Log Gaussian Cox Process, LGCP (K. Kristensen, unpublished; Møller et al. 1998; Diggle and Ribeiro 2007), is also known as the multivariate Poisson–lognormal distribution (Aitchison and Ho 1989) and is a mixture of Poisson-distributed observations with mean densities following a multivariate lognormal distribution. The Poisson process can be regarded as the sampling process generated by the fishing process. The spatial correlation is included by assuming correlation between densities to be a decreasing function of the distance between them.

Received 20 November 2008. Accepted 22 June 2009. Published on the NRC Research Press Web site at cjfas.nrc.ca on 13 October 2009. J20889

Paper handled by Associate Editor Terrance Quinn II.

P. Lewy (1) and K. Kristensen. National Institute of Aquatic Resources, Technical University of Denmark, Charlottenlund Castle, 2920 Charlottenlund, Denmark.

(1) Corresponding author (e-mail: [email protected]).

Can. J. Fish. Aquat. Sci. 66: 1809–1820 (2009) doi:10.1139/F09-114 Published by NRC Research Press

The focus of Kristensen (unpublished) was to develop methods for and implementation of maximum likelihood (ML) estimation of the parameters in the LGCP, which hitherto had been estimated by Markov chain Monte Carlo (Møller and Waagepetersen 2004). Aspects of predictions and interpolation were not included. These aspects are crucial when estimating total biomass or the biomass in specified areas, a prerequisite to evaluating the effects of spatial closures and temporal changes of stock concentration.

The objective of this paper is to develop ML-based methods for predicting the unobserved densities at any point in space and to enable goodness-of-fit tests. The use of the model is illustrated by an analysis of the distribution of North Sea and Skagerrak cod in 1983–2006. The temporal change in the dispersion and spatial correlation was examined, and the effect of a range of local hydrographical parameters was investigated. Contour plots of the spatial distribution of age group 1 were produced by interpolation. The theory of density-dependent habitat selection as formulated by MacCall (1990) was investigated, i.e., whether the spatial distribution of a stock contracts (expands) when stock abundance decreases (increases). The analysis is based on the measure of concentration, D95 (Swain and Sinclair 1994), calculated from interpolations of densities.

Statistical model

Let $X_i$ be the catch in number from haul $i$ with a known position, let $\lambda_i$ be the unknown true density at the same position, let $\alpha$ and $\delta$ be dispersion parameters, and let $\beta$ be a spatial correlation parameter. Further, let $X = (X_1, \ldots, X_n)^t$ be the vector of $n$ catch samples covering the area, and let $\lambda = (\lambda_1, \ldots, \lambda_n)^t$ be the corresponding true densities, where $t$ denotes the transposition of a matrix. It is assumed that the durations of the hauls are the same.

The model considered is a compound Poisson distribution in which the catches, $X_i$, given the densities, $\lambda_i$, are independent Poisson-distributed variables and $\lambda$ follows a multivariate lognormal distribution:

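$$X_i \mid \lambda_i \sim \mathrm{Poisson}(\lambda_i) \tag{1}$$

$$\eta = \begin{pmatrix} \eta_1 \\ \vdots \\ \eta_n \end{pmatrix} = \begin{pmatrix} \ln(\lambda_1) \\ \vdots \\ \ln(\lambda_n) \end{pmatrix} \sim N(\mu, \Sigma)$$

where

$$\mu = (\overbrace{\mu, \ldots, \mu}^{n})^t$$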

The variance–covariance matrix $\Sigma$ is defined by

$$\Sigma_{ii} = \mathrm{VAR}(\eta_i) = \alpha + \delta, \qquad \alpha \ge 0 \wedge \delta \ge 0$$

where $\alpha + \delta$ is the dispersion of the model. This implies that $E(\lambda_i) = e^{\mu + (\alpha + \delta)/2}$, $i = 1, \ldots, n$.

The covariance between two densities is assumed to be a decreasing function of distance between the haul positions such that it approaches zero when distance increases:

$$\Sigma_{ij} = \mathrm{COV}(\eta_i, \eta_j) = \alpha \, e^{-\beta \, \mathrm{dist}(i,j)} + \delta \, I_{\mathrm{dist}(i,j) = 0} \tag{2}$$

where $I$ is the indicator function and dist(i, j) denotes the distance (in kilometres) on the surface of a sphere between positions i and j. The interpretation of the covariance model, eq. 2, is described below. The relationship between the distance between two points (in kilometres) and the corresponding longitudes and latitudes (lon and lat) is

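$$\mathrm{dist}(i,j) = \mathrm{dist}(\mathrm{lon}_i, \mathrm{lat}_i, \mathrm{lon}_j, \mathrm{lat}_j) = 6378.1 \cdot \operatorname{acos}\{\sin(\mathrm{lat}_i \cdot c)\,\sin(\mathrm{lat}_j \cdot c) + \cos(\mathrm{lat}_i \cdot c)\,\cos(\mathrm{lat}_j \cdot c)\,\cos[(\mathrm{lon}_i - \mathrm{lon}_j) \cdot c]\}$$

where $c = \pi/180$ (Meeus 1998). The correlation between log densities is

$$\mathrm{COR}(\eta_i, \eta_j) = \frac{\alpha}{\alpha + \delta} \, e^{-\beta \, \mathrm{dist}(i,j)}, \qquad i \ne j \tag{3}$$

If $\delta$ is zero, the correlation is $e^{-\beta \, \mathrm{dist}(i,j)}$, independent of $\alpha$. The model contains the four parameters $\theta = \{\mu, \alpha, \beta, \delta\}$ and the unobserved random effects densities, $\eta$.

Differences in the duration of the hauls have been ignored and are implicitly included in the small-scale, nugget effect (see below).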
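As a rough illustration of eq. 2 and the distance formula, the following Python sketch builds the covariance matrix for a few haul positions. The function names (`dist_km`, `covariance_matrix`), the argument names `alpha`, `beta`, and `delta` (standing for the two dispersion parameters and the spatial correlation parameter), and the example positions are assumptions made for illustration; this is not the authors' implementation.

```python
import math

EARTH_RADIUS_KM = 6378.1  # radius used in the distance formula of the text


def dist_km(lon_i, lat_i, lon_j, lat_j):
    """Great-circle distance in km between two positions given in degrees."""
    c = math.pi / 180.0
    cos_angle = (math.sin(lat_i * c) * math.sin(lat_j * c)
                 + math.cos(lat_i * c) * math.cos(lat_j * c)
                 * math.cos((lon_i - lon_j) * c))
    # Guard against rounding pushing the cosine slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


def covariance_matrix(positions, alpha, beta, delta):
    """Covariance of the log densities (eq. 2) for (lon, lat) positions."""
    n = len(positions)
    sigma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                sigma[i][j] = alpha + delta  # dispersion on the diagonal
            else:
                d = dist_km(*positions[i], *positions[j])
                # The nugget term only applies at zero distance; for distinct
                # hauls this exact comparison is an illustrative simplification.
                nugget = delta if d == 0.0 else 0.0
                sigma[i][j] = alpha * math.exp(-beta * d) + nugget
    return sigma


# Hypothetical positions (lon, lat) and parameter values.
hauls = [(3.0, 56.0), (4.0, 56.0), (3.0, 58.0)]
sigma = covariance_matrix(hauls, alpha=2.0, beta=0.01, delta=0.5)
```

With these made-up values, the off-diagonal entries shrink as the hauls move apart, while the diagonal stays at `alpha + delta`.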

The interpretation of the model

First, the observed numbers caught in a haul given the density are assumed to follow a Poisson distribution. This process is interpreted as the fishery sampling process, for instance, due to variation of the behaviour of the trawl or fish movements.

Second, the densities in the sea are assumed to follow a multivariate lognormal distribution in which the correlation between densities is a decreasing function of the distance between them. The mean (E) and variance (V) of observations in the LGCP are as follows:

$$E = E(X_i) = E(\lambda_i) = e^{\mu + (\alpha + \delta)/2}$$

$$V = V(X_i) = E(V(X_i \mid \lambda_i)) + V(E(X_i \mid \lambda_i)) = E(\lambda_i) + V(\lambda_i) = E + (e^{\alpha + \delta} - 1)E^2$$

where the last term in the equation is the variance of a lognormally distributed variable with mean $E$ and log variance $\alpha + \delta$. If the variance of log densities, $\alpha + \delta$, is positive, then the variance of $X$ is greater than the variance of the Poisson process. Hence, $\alpha + \delta$ can be regarded as an overdispersion parameter relative to the Poisson process. The variance–mean relation of LGCP corresponds to that of the negative binomial distribution, for which $V = E + \mathrm{constant} \cdot E^2$.
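The mean and variance expressions above can be evaluated directly. The sketch below is only an illustration: the function names and the example parameter values are hypothetical, and the second function simply reads off the constant in the negative binomial relation, written as 1/k for a size parameter k.

```python
import math


def lgcp_mean_var(mu, alpha, delta):
    """Marginal mean and variance of a catch under the LGCP."""
    mean = math.exp(mu + (alpha + delta) / 2.0)
    var = mean + (math.exp(alpha + delta) - 1.0) * mean ** 2
    return mean, var


def equivalent_nb_size(alpha, delta):
    """Size k of a negative binomial with the same relation V = E + E^2 / k."""
    s = alpha + delta
    return math.inf if s == 0 else 1.0 / math.expm1(s)


# Hypothetical parameter values.
mean, var = lgcp_mean_var(mu=1.0, alpha=0.5, delta=0.25)
k = equivalent_nb_size(alpha=0.5, delta=0.25)
```

When the log variance is zero the variance equals the mean, which is the plain Poisson case.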

Third, the covariance defined in eq. 2 consists of a sum of two terms, which can be considered as large- and small-scale components, respectively, of the process. The large-scale component includes the large-scale variance $\alpha$ and the parameter $\beta$ ($\ge 0$) measuring the strength of the spatial correlation. When $\beta$ is small, the large-scale correlation between densities is high, and vice versa. The scaling of the correlation is measured by $1/\beta$, which is the distance for which the spatial correlation is $e^{-1} \approx 0.37$ (if $\delta = 0$). The small-scale variance is $\delta$, which corresponds to the so-called "nugget" effect in geostatistics, which may, for instance, be due to fish movements. The clear large-scale variation due to spatial correlation is illustrated for the case in which the nugget effect is excluded (Fig. 1, bold line); the superimposed small-scale variation due to the nugget effect (Fig. 1, broken line) blurs the large-scale effect.

Predictions of unobserved densities at positions with observations available

The likelihood function of $X$ of the LGCP expressed as a function of the parameters, $\theta$, is

$$L(\theta) = \int P(X; \theta, \eta)\, d\eta = \int e^{-l(X, \theta, \eta)}\, d\eta \tag{4}$$

where $l(X, \theta, \eta)$, the negative log-likelihood of $(X, \eta)$, is

$$l(X, \theta, \eta) = \sum_{i=1}^{n} e^{\eta_i} - \sum_{i=1}^{n} X_i \eta_i + \frac{1}{2}\ln(\det(\Sigma)) + \frac{1}{2}(\eta - \mu)^t \Sigma^{-1} (\eta - \mu) + \frac{n}{2}\ln(2\pi) + \sum_{i=1}^{n} \ln(X_i!)$$

Laplace approximations have been used to calculate $L(\theta)$ in eq. 4 for ML estimation of $\hat{\theta}$ and to test hypotheses (K. Kristensen, unpublished data).

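For the positions where observations are available, estimates $\hat{\eta}_\theta(X)$ of log densities $\eta \mid X$ for given observations $X$ can be obtained by minimizing the negative log-likelihood $l(X, \theta, \eta)$ defined in eq. 4 (equivalently, maximizing the joint likelihood of $(X, \eta)$), i.e., $\hat{\eta}_\theta(X) = \arg\min_\eta l(X, \theta, \eta)$. As indicated, the estimate depends on $\theta$ and $X$. As the estimate of $\eta$, we use $\hat{\eta}(X) = \hat{\eta}_{\hat{\theta}}(X)$.

Let $l(\eta \mid X) = l(X, \hat{\theta}, \eta)$ denote the negative log-likelihood of $\eta \mid X$. The distribution of $\eta \mid X$ is now approximated by the normal distribution with mean $\hat{\eta}(X)$ using a second-order Taylor expansion of $l(\eta \mid X)$ around $\hat{\eta}(X)$:

$$l(\eta \mid X) - l(\hat{\eta}(X) \mid X) \cong 0.5 \cdot (\eta - \hat{\eta}(X))^t \left( \frac{\partial^2 l(\eta \mid X)}{\partial \eta_i \, \partial \eta_j} \right)_{\eta = \hat{\eta}(X)} (\eta - \hat{\eta}(X)) = 0.5 \cdot (\eta - \hat{\eta}(X))^t (\Sigma^{-1} + D_{\hat{\eta}(X)}) (\eta - \hat{\eta}(X))$$

where

$$D_{\hat{\eta}(X)} = \begin{pmatrix} e^{\hat{\eta}_1(X)} & & 0 \\ & \ddots & \\ 0 & & e^{\hat{\eta}_n(X)} \end{pmatrix}$$

i.e., the distribution of $\eta \mid X$ is approximated by the quadratic approximation

$$\eta \mid X \sim N(\hat{\eta}(X), (\Sigma^{-1} + D_{\hat{\eta}(X)})^{-1}) \tag{5}$$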

Calculations using "realistic" parameters indicate that this is a good approximation to the true distribution.

Assuming that the approximation holds, the estimator $\hat{\eta}(X)$ equals $E(\eta \mid X)$, which is the posterior minimum variance unbiased estimator of $\eta$ for given $X$.

Using that

$$\Sigma = V(\eta) = V(E(\eta \mid X)) + E(V(\eta \mid X)) \cong V(\hat{\eta}(X)) + E((\Sigma^{-1} + D_{\hat{\eta}(X)})^{-1})$$

we find that

$$V(\hat{\eta}(X)) \cong \Sigma - (\Sigma^{-1} + D_{\hat{\eta}(X)})^{-1} \cong \Sigma_{\hat{\theta}} - (\Sigma_{\hat{\theta}}^{-1} + D_{\hat{\eta}(X), \hat{\theta}})^{-1} \tag{6}$$
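To make the Gaussian approximation of eq. 5 concrete, the sketch below treats the simplest case of a single haul (n = 1), where the covariance matrix reduces to one prior log variance. It finds the mode of the negative log-likelihood by Newton's method and returns the approximate posterior variance. The function name, argument names, starting value, and example numbers are assumptions for illustration only; the authors' multivariate implementation is not shown in the text.

```python
import math


def laplace_mode(x, mu, sigma2, tol=1e-10, max_iter=100):
    """Mode and approximate posterior variance of the log density for one haul.

    Minimizes exp(eta) - x * eta + (eta - mu)**2 / (2 * sigma2), the
    eta-dependent part of the negative log-likelihood when n = 1.
    """
    # Start at or above the minimizer: the gradient is increasing and convex,
    # so Newton steps then decrease monotonically and never overshoot.
    eta = max(mu, math.log(x + 1.0))
    for _ in range(max_iter):
        grad = math.exp(eta) - x + (eta - mu) / sigma2
        hess = math.exp(eta) + 1.0 / sigma2
        step = grad / hess
        eta -= step
        if abs(step) < tol:
            break
    # Scalar version of the covariance in eq. 5: inverse of the Hessian.
    post_var = 1.0 / (1.0 / sigma2 + math.exp(eta))
    return eta, post_var


# Hypothetical catch of 5 fish with prior log mean 0 and prior log variance 1.
eta_hat, post_var = laplace_mode(x=5, mu=0.0, sigma2=1.0)
```

The posterior variance is always smaller than the prior log variance, and it shrinks further as the estimated density grows.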

Spatial interpolation

By analogy to the kriging method, the LGCP can be used to spatially interpolate the density at positions where no observations exist. The best unbiased prediction of any function of the unobserved density is the conditional mean given the observations. In the analyses below, we assume that the formulas of the conditional means and variances are based on the true value of the parameters. In practice, the true values are replaced by the MLEs.

Fig. 1. Simulation of an LGCP without a "nugget", the small-scale variation effect (bold line), and the same process including a positive "nugget" effect (broken line).

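Assume that we want to predict the densities $\lambda_{new} = (\lambda_1, \ldots, \lambda_m)^t = (e^{\eta_1}, \ldots, e^{\eta_m})^t$ for $m$ new positions without observations. First, the log densities $\eta_{new} = \log(\lambda_{new}) = (\eta_1, \ldots, \eta_m)^t$ are predicted.

According to eq. 1, the combined set of $\eta$ and $\eta_{new}$ is distributed as

$$\begin{pmatrix} \eta \\ \eta_{new} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu \\ \mu_{new} \end{pmatrix}, \begin{pmatrix} \Sigma & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right)$$

where $\mu_{new} = (\overbrace{\mu, \ldots, \mu}^{m})^t$, and $\Sigma_{12}$, $\Sigma_{21} = \Sigma_{12}^t$, and $\Sigma_{22}$ are defined similarly to $\Sigma$.

We know that the conditional distribution of $\eta_{new} \mid \eta$ is normal with mean and variance

$$E(\eta_{new} \mid \eta) = \mu_{new} + \Sigma_{21}\Sigma^{-1}(\eta - \mu) \tag{7}$$

$$V(\eta_{new} \mid \eta) = \Sigma_{22} - \Sigma_{21}\Sigma^{-1}\Sigma_{12} \tag{8}$$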

By analogy with the predictions of $\eta \mid X$, we want to use $E(\eta_{new} \mid X)$ for spatial interpolation. From the definition of the LGCP, the conditional distribution of $X \mid \eta_{new}, \eta$ depends only on $\eta$, the log densities at the points with observations. According to Brémaud (1999, p. 12), this implies that $\eta_{new}$ and $X$ are conditionally independent given $\eta$ and, hence, $E(\eta_{new} \mid \eta, X) = E(\eta_{new} \mid \eta)$. This implies that

$$E(\eta_{new} \mid X) = E(E(\eta_{new} \mid \eta, X) \mid X) = E(E(\eta_{new} \mid \eta) \mid X) = E(\mu_{new} + \Sigma_{21}\Sigma^{-1}(\eta - \mu) \mid X) = \mu_{new} + \Sigma_{21}\Sigma^{-1}(E(\eta \mid X) - \mu) \cong \mu_{new} + \Sigma_{21}\Sigma^{-1}(\hat{\eta}(X) - \mu) \tag{9}$$

$$V(\eta_{new} \mid X) = V(E(\eta_{new} \mid \eta) \mid X) + E(V(\eta_{new} \mid \eta) \mid X) \cong \Sigma_{21}\Sigma^{-1} V(\eta \mid X) \Sigma^{-1}\Sigma_{12} + \Sigma_{22} - \Sigma_{21}\Sigma^{-1}\Sigma_{12} \cong \Sigma_{21}\Sigma^{-1}(\Sigma^{-1} + D_{\hat{\eta}(X)})^{-1}\Sigma^{-1}\Sigma_{12} + \Sigma_{22} - \Sigma_{21}\Sigma^{-1}\Sigma_{12} \tag{10}$$

Having determined log density $\eta_{new}(X)$, the interpolated values of the density $\lambda_{new}(X) = E(\lambda_{new} \mid X)$ can be approximated using a Gaussian posterior approximation based on eqs. 9 and 10:

$$E(\lambda_{new} \mid X) = \exp(E(\eta_{new} \mid X) + \mathrm{diag}(V(\eta_{new} \mid X))/2) \tag{11}$$

We also wish to predict nonlinear functions of $\lambda_{new}(X)$ such as the measure of stock concentration, D95, defined below. The mean $E(f(\lambda_{new}) \mid X) = E(f(e^{\eta_{new}}) \mid X)$ and the variance are calculated by simulation, by drawing 100 times from the Gaussian posterior approximation based on eqs. 9 and 10 and calculating the mean and variance of the simulated values of $f(e^{\eta_{new}})$.

The spatial interpolation is performed on a regular fine-scaled grid. The scale should be chosen sufficiently fine to obtain a good approximation to the continuous random field.
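The interpolation step can be sketched for one new position as follows. This simplified Python example works with positions on a line (distances in km are absolute differences) and computes only the interpolated log density of eq. 9 and the conditional variance of eq. 8; the extra posterior term of eq. 10 is left out to keep the example short. All names (`solve`, `cov`, `interpolate_log_density`) and the numbers are hypothetical, and the small Gaussian-elimination solver merely stands in for a linear algebra library.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def cov(d, alpha, beta, delta):
    """Covariance of two log densities at distance d (eq. 2)."""
    return alpha * math.exp(-beta * d) + (delta if d == 0 else 0.0)


def interpolate_log_density(pos, eta_hat, new_pos, mu, alpha, beta, delta):
    """Interpolated log density (eq. 9) and variance given eta (eq. 8)."""
    n = len(pos)
    sigma = [[cov(abs(pos[i] - pos[j]), alpha, beta, delta)
              for j in range(n)] for i in range(n)]
    sigma21 = [cov(abs(new_pos - p), alpha, beta, delta) for p in pos]
    # Sigma^{-1} (eta_hat - mu), then the weighted sum with Sigma_21.
    w = solve(sigma, [e - mu for e in eta_hat])
    mean = mu + sum(s * wi for s, wi in zip(sigma21, w))
    # Sigma_22 - Sigma_21 Sigma^{-1} Sigma_12 for the single new position.
    v = solve(sigma, sigma21)
    var = cov(0.0, alpha, beta, delta) - sum(s * vi for s, vi in zip(sigma21, v))
    return mean, var


# Hypothetical haul positions (km along a transect) and estimated log densities.
mean, var = interpolate_log_density(
    pos=[0.0, 50.0, 120.0], eta_hat=[1.0, 2.0, 0.5], new_pos=80.0,
    mu=1.0, alpha=2.0, beta=0.01, delta=0.0)
```

Far from all hauls the prediction falls back to the overall log mean with variance equal to the dispersion, whereas without a nugget effect a prediction at an observed position reproduces the estimate there.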

Other distribution measures

The ability of the LGCP to perform spatial interpolations of the (unobserved) population densities makes it possible to obtain unbiased estimates of stock characteristics based on densities in the entire space. A measure of stock concentration is considered. The measure, Dx, introduced by Swain and Sinclair (1994), is defined as the proportion of the minimum area containing x% of the stock, i.e., $D95_{true}$ (say) is

$$D95_{true} = \inf_{A \subseteq E} \left\{ \frac{|A|}{|E|} : \frac{\int_A \lambda(y)\, dy}{\int_E \lambda(y)\, dy} = 0.95 \right\}$$

where $E$ indicates the entire area in consideration, and $A$, any subarea of $E$.

If the area is divided into $n$ equally sized subareas and $\lambda_i$ represents the density in subarea $i$, then D95 can be approximated by

$$D95 = \frac{1}{n}\left( m + \frac{0.95 - z(m)}{z(m+1) - z(m)} \right)$$

where

$$z(m) = \frac{\sum_{i=1}^{m} \lambda_{(i)}}{\sum_{i=1}^{n} \lambda_i}, \qquad m \le n$$

where $m$ fulfills $z(m) \le 0.95 \le z(m + 1)$ and where $\lambda_{(i)}$, $i = 1, \ldots, n$, are the densities sorted in descending order.

D95 is greater than zero and at most 0.95. D95 is inversely related to the stock concentration, i.e., the concentration of a stock increases when D95 decreases. D95 approaches zero when concentration increases and equals 0.95 if the density is constant in the entire space, i.e., when the concentration is minimal.

The validity of the theory of density-dependent habitat selection was investigated by comparing the relation between D95 and stock abundance for 1983–2006, a period during which the abundance was drastically reduced. According to the theory formulated by MacCall (1990), individuals first occupy habitats with the highest suitability, but as the realized suitability of these habitats declines due to increasing population density, other previously less suitable unoccupied habitats become colonized. Hence the distribution is characterized by spatially equal realized suitability. If the theory holds, D95 should increase when the stock abundance increases.

Calculation of D95 has been based on predicted densities performed on the regular 50 × 50 grid consisting of 808 points as described above. This procedure ensures that an unbiased estimate of D95 is obtained (see Appendix A).
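The grid approximation of D95 can be sketched in a few lines of Python. The function name `d95` and the example densities are hypothetical; the code assumes equally sized grid cells and linear interpolation of the cumulative share between two consecutive cells.

```python
def d95(densities, level=0.95):
    """Proportion of the area holding `level` of the stock (equal-sized cells)."""
    n = len(densities)
    total = float(sum(densities))
    if n == 0 or total <= 0.0:
        raise ValueError("need at least one cell and a positive total density")
    lam = sorted(densities, reverse=True)  # densest cells first
    z_prev = 0.0  # cumulative share z(m) after m cells
    cum = 0.0
    for m in range(n):
        cum += lam[m]
        z_next = cum / total  # cumulative share z(m + 1)
        if z_next >= level:
            # Interpolate linearly between m and m + 1 cells.
            return (m + (level - z_prev) / (z_next - z_prev)) / n
        z_prev = z_next
    raise RuntimeError("cumulative share never reached the requested level")


# Hypothetical densities: uniform versus concentrated in one of four cells.
uniform = d95([2.0, 2.0, 2.0, 2.0])
concentrated = d95([100.0, 0.0, 0.0, 0.0])
```

A uniform density gives the maximum value of 0.95, and the value drops as the stock concentrates in fewer cells.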

Analysis of residuals and goodness-of-fit tests

The residuals can be calculated as $X - e^{\hat{\eta}(X)}$. Setting the derivative of the negative log-likelihood $l(X, \theta, \eta)$ (eq. 4) with respect to $\eta$ to zero shows that $\hat{\eta}(X) - \mu = \Sigma(X - e^{\hat{\eta}(X)})$, and hence the quantity $R = \hat{\eta}(X) - \mu$ can be regarded as residuals on a log scale, which is a linear transformation of the original residuals. We prefer to apply these transformed residuals, which express the deviation between predicted log density and the mean. To obtain standardized residuals of $R$, the variance of $R$ is needed:

$$V(R) = V(\hat{\eta}(X) - \mu) = \Sigma - E((\Sigma^{-1} + D_{\hat{\eta}(X)})^{-1})$$

As the last term in the expression of the variance is not known, we instead use $R^* = R + u$, where $u \sim N(0, (\Sigma^{-1} + D_{\hat{\eta}(X)})^{-1})$ and for which $V(R^*) = \Sigma$, as modified residuals to circumvent this problem. We now assume that $R^* \sim N(0, \Sigma)$, and accordingly, $U^* = L^{-1}R^*$ is used as normal, standardized, and independent residuals, where $L$ is the lower Choleski triangle of $\Sigma$.

Now assume that we want to examine if residuals are independent of some specified spatial characteristics, for instance, the longitude and latitude. We know that $R^*$ is related to the longitude, whereas $U^*$ is not. However, if the absolute value of $(L^{-1})_{ij}$ is decreasing as the distance between the points $i$ and $j$ increases, then a specific element $U^*_i = \sum_j (L^{-1})_{ij} R^*_j$ of $U^*$ only depends on the residuals close to the specified observation. This implies that the residuals may be considered as area-specific residuals, which were applied to examine model deviations according to the longitude and latitude.
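The standardization by the lower Choleski triangle can be illustrated as below. The sketch takes the modified residuals as given (it does not draw the extra noise term) and solves the triangular system rather than forming the inverse explicitly. Function names and the 2 × 2 example are assumptions for illustration.

```python
import math


def cholesky_lower(a):
    """Lower Choleski triangle L of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def forward_solve(low, b):
    """Solve L x = b for lower-triangular L by forward substitution."""
    x = []
    for i in range(len(b)):
        s = sum(low[i][k] * x[k] for k in range(i))
        x.append((b[i] - s) / low[i][i])
    return x


def standardized_residuals(sigma, r_star):
    """Residuals with unit variance and no correlation if V(r_star) = sigma."""
    return forward_solve(cholesky_lower(sigma), r_star)


# Hypothetical covariance matrix and modified residuals.
u_star = standardized_residuals([[4.0, 2.0], [2.0, 3.0]], [2.0, 0.5])
```

Because the triangle is lower, element i of the result only combines residuals 1 to i, weighted by the corresponding row of the inverse triangle.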

Goodness-of-fit tests for validation of the model have been based on the estimated values of log density, $\hat{\eta}$, and the MLE of the other parameters, $\hat{\theta}$. Two tests were considered, $T_1$ and $T_2$.

$$T_1 = (\hat{\eta}(X) - \hat{\mu})^t \, V(\hat{\eta}(X))^{-1} \, (\hat{\eta}(X) - \hat{\mu})$$

where the variance–covariance matrix is determined by eq. 6.

The second test is based on the Kolmogorov–Smirnov test quantity and the quantity $U = L^{-1}_{V(\hat{\eta})}(\hat{\eta} - \hat{\mu})$:

$$T_2 = \max_x |F_U(x) - N(x)|$$

where $L_{V(\hat{\eta})}$ is the lower Choleski triangle of $V(\hat{\eta})$, $F_U$ is the empirical distribution function of $U$, and $N$ is the distribution function of the normal distribution.

If log density $\eta(X)$ for given observations $X$ is normally distributed, i.e., $\eta \mid X \sim N(\hat{\eta}, V(\hat{\eta}))$, then $T_1$ and $T_2$, respectively, follow the $\chi^2$ and Kolmogorov distributions. Even though the assumption of normality is probably reasonable, implying that the two distributions may be used to obtain test probabilities, we instead simulate the exact distributions of $T_1$ and $T_2$. First, estimate the parameters in the model and calculate $T_1$ and $T_2$. Second, simulate new sets of parameters from the normal distribution $N(\hat{\theta}, \hat{\Sigma})$ 100 times using the parameters estimated. Third, calculate $T_1$ and $T_2$ for each of the 100 repetitions.

The first test should be two-sided, and the second, one-sided. Hence, the test probabilities $p_1 = P(T_1 > T_{1,obs})$ and $p_2 = 2 \cdot \min(q_2, 1 - q_2)$, where $q_2 = P(T_2 \ge T_{2,obs})$, were calculated. The model is accepted if $p$ is greater than 0.05. The likelihood ratio test was applied to test successive hypotheses regarding the parameters.
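The Kolmogorov–Smirnov quantity and a simulated tail probability can be computed as in the following sketch. It assumes the standardized values are compared with the standard normal distribution function and that the simulated test statistics are already available as a list; the function names and numbers are hypothetical, and the simulation of new parameter sets is not shown.

```python
import math


def normal_cdf(x):
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ks_statistic(u):
    """Largest gap between the empirical distribution of u and the normal."""
    xs = sorted(u)
    n = len(xs)
    t = 0.0
    for k, x in enumerate(xs):
        cdf = normal_cdf(x)
        # The empirical distribution jumps from k/n to (k+1)/n at x.
        t = max(t, (k + 1) / n - cdf, cdf - k / n)
    return t


def simulated_tail_probability(t_obs, t_sim):
    """Share of simulated statistics at least as large as the observed one."""
    return sum(1 for t in t_sim if t >= t_obs) / len(t_sim)


# Hypothetical standardized residuals and simulated statistics.
t2_obs = ks_statistic([-1.2, -0.3, 0.1, 0.8, 1.5])
q2 = simulated_tail_probability(t2_obs, [0.15, 0.22, 0.31, 0.40])
```

With 100 simulated repetitions the tail probability is resolved in steps of 0.01, which is sufficient for a test at the 5% level.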

Application

The LGCP was applied to cod catch rates from the International Bottom Trawl Survey (IBTS) in the North Sea and the Skagerrak in February 1983–2006. IBTS is coordinated by the International Council for the Exploration of the Sea (ICES) and data are available on www.ices.dk/datacentre/datras/public.asp. The area is confined within 4°W and 13°E longitude and 50°N and 62°N latitude. The period 1983 and onwards was chosen because the coverage and survey gear standardization was better compared with those of previous years. For the first-quarter survey, the annual number of hauls lies between 322 and 534, with a mean of 390. The area contains 186 statistical rectangles (1° longitude × 0.5° latitude), which were covered twice or more. The gear used is a bottom trawl and the haul positions within the rectangles are randomly selected among trawlable areas. The haul duration is, on average, 30 min, but in 12% of the hauls taken before 1999, the duration was about 1 h, which may introduce a bias.

The length of the cod caught was recorded and used to determine age using age–length keys. The spatial distribution using LGCP was studied for each of the age groups 1, 2, and 3 years and older.

The hydrographical data (depth, bottom temperature, and salinity by haul) were provided by ICES’ hydrographical database. Data for the stock numbers by year and age were obtained from the ICES Working Group Report (ICES 2006).

Results

The model was used separately for each combination of age groups 1, 2, and 3+ and years 1983–2006, i.e., for 3 × 24 = 72 combinations. First, the LGCP model was used to investigate if the position, depth, temperature, and salinity could describe the variation of the catch-per-unit-effort (CPUE), i.e., for given age and year, it is assumed that

$$\ln(E(X_{age,year,i})) = a + \mathrm{poly}(\mathrm{lon}_{age,year,i}, 2) + \mathrm{poly}(\mathrm{lat}_{age,year,i}, 2) + \mathrm{poly}(\mathrm{depth}_{age,year,i}, 2) + \mathrm{poly}(t_{age,year,i}, 2) + \mathrm{poly}(\mathrm{sal}_{age,year,i}, 2) \tag{12}$$

where $i$ denotes the sample number for a given age and year; "lon", the longitude; "lat", the latitude; depth, the bottom depth (in metres); $t$, the temperature (in °C); "sal", the salinity (in ppm); poly(·, 2), a second-degree polynomial; and $a$, a parameter. The covariates enter the right-hand side of eq. 12 as second-degree polynomials because this allows, for instance, a preferred temperature with a decreasing preference when moving away from the optimum. The assumption of a log-linear mean structure was made to ensure that mean CPUE remains positive.

MLEs of the parameters in eq. 12 and their confidence intervals obtained from the Hessian matrix were used to test the significance of the parameters. For all years, age groups, and parameters, the confidence intervals contained zero. One more run of the model with the second-degree terms removed gave the same result. Hence, for all age groups and years, it was concluded that none of the effects associated with the covariates was significant, i.e., the log of the expected value, $\mu$, is constant throughout the area, independent of any of the explanatory covariates. For this model and for all ages and years, the four parameters $\mu$, $\alpha$, $\beta$, and $\delta$ have been estimated.

Regarding the residual analysis, the elements of the inverse Choleski, $(L^{-1})_{ij}$, were plotted against the distance between points for all three age groups and all 24 years. For all 72 plots, the functional relation between the inverse Choleski and the distance is very similar. As an example, age group 1 in the middle of the period from 1983 to 2006, i.e., 1994, has been selected. The result is illustrated (Fig. 2a), showing that the absolute values of the inverse Choleski elements actually decrease when the distance increases. It appears that outside a circle of 100 km, the corresponding residuals $R^*$ can be neglected, indicating that only residuals $U^*$ within the circles are correlated. The residuals $U^*$ were plotted against longitude and latitude. No trend or systematic pattern was found for any of the residual plots (plots again for age group 1 in 1994 are shown in the middle and lower panels of Fig. 2).

The validity of the model has been tested using both goodness-of-fit test statistics $T_1$ and $T_2$. Both tests resulted in the model not being rejected for all age groups and years using a level of significance of 5%.

Fig. 2. (a) Plots of the relationship between the elements of the inverse of the lower Choleski triangle, $L^{-1}$, and the distance between corresponding points and (b–c) residuals plotted against (b) longitude and (c) latitude for age group 1 in 1994.

Table 1. Estimated parameters and the 95% confidence limits (lower (L95%) and upper (U95%)) by age.

            Age 1                     Age 2 (a)                 Age 3+ (b)
            L95%    Mean    U95%      L95%    Mean      U95%    L95%    Mean      U95%
1/β (c)     179.1   242.0   327.0     66.2    87.2      114.5   44.8    58.9      77.3
α           4.08    4.98    6.07      2.11    2.41      2.75    0.98    1.12      1.27
δ           1.26    1.42    1.58              1.02 (d)                  0.71 (d)

(a) 1999, 2001, and 2005 excluded.
(b) 1983, 1984, 1988, 1994, 2001, and 2005 excluded.
(c) 1/β, the characteristic distance, is the distance in kilometres for which the correlation between log densities is 0.37.
(d) exp(log(δ)).

1814 Can. J. Fish. Aquat. Sci. Vol. 66, 2009


For each age group separately, we tested the hypothesis that the parameters a, b, and d remain constant over years using the likelihood ratio test in the following way. Let $l_y(q) = -\log(L_y(q))$ denote the likelihood function for year y, let $\hat{q}_y$ denote the MLE of the parameters, and let $\hat{H}_y = H_{y,\,q=\hat{q}_y}$ denote the estimated Hessian matrix. For each year, we approximate the likelihood function with the second-order approximation, i.e.,

$$l_y(q_y) - l_y(\hat{q}_y) \cong (q_y - \hat{q}_y)'\,\hat{H}_y\,(q_y - \hat{q}_y)$$

Using this approximation, the simultaneous likelihood function including all years can be approximated by

$$l(q) - l(\hat{q}) \cong (q - \hat{q})'\,\hat{H}\,(q - \hat{q}) \tag{13}$$

where

$$\hat{H} = \begin{pmatrix} \hat{H}_1 & & 0 \\ & \ddots & \\ 0 & & \hat{H}_n \end{pmatrix} \quad \text{and} \quad q = (q_1, \ldots, q_n)' \tag{14}$$

For the likelihood function approximated in eq. 13, linear hypotheses of the form q = Ab can be tested by the likelihood ratio test and using $\hat{b} = (A^{t}\hat{H}A)^{-1}A^{t}\hat{H}\hat{q}$. The homogeneity of the parameters over years, i.e., $q_y = q$ for all y, was tested by setting A = (q, …, q)′.
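As a worked illustration of the homogeneity test, the quadratic approximation in eq. 13 can be minimized explicitly. The sketch below assumes that the design matrix stacks n identity matrices, $A = (I_p, \ldots, I_p)'$ with p the number of parameters per year, and it takes eq. 13 at face value (no factor 1/2); both are assumptions rather than statements from the text.

```latex
% Assumption: A = (I_p, ..., I_p)' and \hat{H} block diagonal as in eq. 14.
\begin{aligned}
\hat{b} &= \arg\min_{b}\,(Ab - \hat{q})'\,\hat{H}\,(Ab - \hat{q})
         = (A^{t}\hat{H}A)^{-1}A^{t}\hat{H}\hat{q} \\
A^{t}\hat{H}A &= \sum_{y=1}^{n}\hat{H}_y , \qquad
A^{t}\hat{H}\hat{q} = \sum_{y=1}^{n}\hat{H}_y\,\hat{q}_y \\
\hat{b} &= \Bigl(\sum_{y=1}^{n}\hat{H}_y\Bigr)^{-1}\sum_{y=1}^{n}\hat{H}_y\,\hat{q}_y \\
-2\log\Lambda &= 2\,\bigl[\,l(A\hat{b}) - l(\hat{q})\,\bigr]
  \approx 2\sum_{y=1}^{n}(\hat{b} - \hat{q}_y)'\,\hat{H}_y\,(\hat{b} - \hat{q}_y)
\end{aligned}
```

Under these assumptions the common estimate is a Hessian-weighted mean of the yearly estimates, and the statistic would conventionally be compared with a chi-square distribution with (n − 1)p degrees of freedom. If $\hat{H}_y$ denotes the plain matrix of second derivatives instead, the leading factor 2 drops out.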

Fig. 3. Contour plots for 1-year-old cod in the North Sea and the Skagerrak in 1983–2006 based on interpolation onto a 50 × 50 grid. The shading scale indicates log density.


For age group 1, a, b, and d were accepted to be constant for all years using a likelihood ratio test (p = 0.29). For age group 2, a and b were accepted to be constant for all years except 1999, 2001, and 2005. The analyses of age groups 1 and 2 are the most important because the main part of the data consists of positive catches. This is in contrast to age 3+, for which the zero proportion is large, increasing from about 40% to about 60% of the hauls. Hence, the results for this age group should be treated with caution. For age 3+, 1983 was clearly an outlier and was excluded from the analysis. For the remaining years, two levels of parameter d appear to divide the years into two groups: 1984, 1988, 1994, 2001, and 2005, for which the nugget effect, d, was not significantly different from zero, and the remaining 18 years, for which d is larger than 0.07. For the latter 18 years, a and b were accepted to be constant (p = 0.86). The results are summarized in Table 1.

The characteristic distance 1/b, the large-scale variation a, and the small-scale variation d decrease with increasing age, indicating that both the spatial correlation and the overdispersion or patchiness decline with increasing age (Table 1).

Contour plots and D95 were calculated based on interpolated values of stock density for a regular 50 × 50 grid covering the North Sea and the Skagerrak (confined within 4°W and 13°E longitude and 50°N and 62°N latitude). This corresponds to areas of about 27 × 24 km². The areas covering land have been removed, which leaves us with a total of 808 positions for which the densities should be predicted, compared with the average of 390 observations available for each of the years 1983–2006. We also tried the finer 70 × 70 grid. The deviations between the two cases with respect to both the mean density and the measure of concentration mentioned below were less than 2%, indicating that the 50 × 50 grid results in reliable estimates of functions of $l_{new}$.

Contour plots are provided for age 1 (Fig. 3). Until 1997, age group 1 was mainly situated in the southern North Sea and the Skagerrak, but the distribution has since changed such that a major part is situated in the Skagerrak. It should be noted that this geographical change of distribution does not contradict the unchanged concentration measured using D95. This can happen if, for instance, high-density areas shift location. Similarly, for age group 2, the high-density area was the northern North Sea and the Skagerrak before 2002 but mainly the Skagerrak thereafter.

The validity of the theory of density-dependent habitat selection

Plots of D95 and the 95% confidence limits vs. abundance and year are shown for the age groups 1, 2, and 3+ (Fig. 4). For age 1, linear regression analysis indicates that D95 is independent of stock abundance, whereas D95 seems to decline slightly with increasing abundance for age 2 and older. This means that although stock abundance decreased drastically during the period, the concentration remained unaffected or decreased, and accordingly the theory of density-dependent habitat selection does not hold for cod in the North Sea in February–March.

Discussion

The LGCP applied to analyse the spatial distribution of fishery survey data is a flexible counting model that was able to describe the spatial distribution of cod in the North Sea and the Skagerrak. The model does not assume that observations are independent, but accounts for possible spatial correlation and enables modelling of separate small- and large-scale variations. Problems with zero catches are avoided by the discreteness of the LGCP. A method for calculating residuals related to latitude and longitude, enabling graphical validation of the model, has been developed, which makes it possible to examine possible geographical deviations from the model. Finally, two simulated exact tests have been formulated and implemented to perform goodness-of-fit tests.

Fig. 4. Minimum area occupied by 95% of the stock, D95, by age plotted against stock number for cod in the North Sea and the Skagerrak in 1983–2006 (bold lines) and 95% confidence limits (broken lines): (a) age group 1; (b) age group 2; and (c) age group 3+. The straight lines indicate linear regression lines.

One of the most important features of the LGCP introduced is the ability to predict and interpolate unobserved densities at any location in the area independent of the sampling locations. This ability is important because it is then possible to obtain unbiased estimates of, for instance, the stock concentration in the area (see Appendix A) or the total sum of individuals or biomass. The expected value of the posterior distribution, $E(h_{new} \mid X)$, is used as the basis for interpolation of the spatial distribution of the densities as it is a minimum variance estimator of $h_{new}$ (Diggle and Ribeiro 2007). Many authors (e.g., Møller et al. 1998) have used MCMC to simulate the posterior mean, which has the advantage that the estimates are unbiased. In the present paper, we have instead used a Gaussian approximation to the posterior distribution to estimate posterior means analytically. Simulations indicate that this assumption is reasonable (K. Kristensen, unpublished data). The analytical approach has the advantage that the convergence problems with MCMC for high dimensional data are avoided and the computer time is reduced. The technique of interpolating by sampling from the posterior distribution may be further improved using fast Fourier transform and conditioning by kriging (Rue and Held 2005).

The spatial correlation and the large-scale variation of the cod distribution did not change in 1983–2006. This is remarkable in that the conditions of the stock in the same period drastically changed as cod abundance declined by about 75% (ICES 2006), and the centre of gravity of the North Sea component of the stock moved about 200 km northeast (Rindorf and Lewy 2006). This indicates that spatial correlation and variance for cod in the North Sea and the Skagerrak seem to be insensitive to major stock changes in the period.

The stability also applies to the concentration of the stock, which is either unchanged over time (age group 1) or declines slightly (age 2 and older). This implies that the theory of density-dependent habitat selection or other density-dependent theories do not apply to cod in the North Sea and the Skagerrak in winter. This result is in contrast to the results of Blanchard et al. (2005), who analysed data from the English Groundfish Survey in the summer (August–September). The conflicting results may be due to differences in the behaviour of cod in the winter and summer or it could be caused by bias in the estimation of D95 using raw or smoothed data, especially for small mean catch rates (see Appendix A).

From the point of view of fishery management, it is crucial that the concentration does not increase with declining abundance. Other things being equal, this means that the mean catch rates will not be retained in the commercial fishery when cod abundance declines. If the stock did concentrate, it could lead to an overestimation of the stock size, as was the case for cod off Newfoundland (Hutchings 1996; Atkinson et al. 1997).

Analyses of possible relations between local cod occurrence and local hydrographical parameters such as temperature, salinity, depth, latitude, and longitude showed that none of the variables affected the cod distribution. In particular, this means that there was no evidence that adult cod locally move to avoid high or low temperature in the winter, for which the range of temperature is –1 to 9 °C. This is in agreement with the results of Rindorf and Lewy (2006) that the centre of gravity for adult fish was not affected by average temperature and wind.

The effect of the spatial distribution of the fishery on the distribution of the stock is not included in the analyses because of lack of data. If bycatch and discard of age group 1 are limited, the effect is of minor importance as the fishing mortality rate for the trawl and gill net fishery is only about half of the natural mortality in the period. For fish 2 years old and older, the effect may be important as the fishing mortality is about 50% greater than the natural mortality in the period (ICES 2006).

The interpolated densities indicate that age group 1 shifted from being located mainly in the southern North Sea and the Skagerrak to being mainly in the Skagerrak. This indication of temporal correlation would be valuable to incorporate into the model. If such a model with positive temporal correlation were accepted, it would enable annual or seasonal predictions of the spatial distribution of fish stocks.

In conclusion, LGCP is a flexible model of the spatial distribution of fish accounting for spatial correlation between densities and avoiding problems with zero observations. It is therefore possible to interpolate the densities at any location in the area, which could be used, for instance, in connection with evaluation of the effects of closed areas. This model can be used to test the significance of relations between fish occurrence and hydrographical or climatic factors.

Acknowledgements

The work was partly funded by the Danish National project SUNFISH. We thank A. Nielsen, A. Rindorf, and two referees for helpful comments that improved the paper considerably.

References

Aitchison, J., and Ho, C.H. 1989. The multivariate Poisson-log normal distribution. Biometrika, 76(4): 643–653. doi:10.1093/biomet/76.4.643.

Atkinson, D.B., Rose, G.A., Murphy, E.F., and Bishop, C.A. 1997. Distribution changes and abundance of northern cod (Gadus morhua), 1981–1993. Can. J. Fish. Aquat. Sci. 54(Suppl. 1): 132–138. doi:10.1139/cjfas-54-S1-132.

Blanchard, J.L., Mills, C., Jennings, S., Fox, C.J., Rackham, B.D., Eastwood, P.D., and O’Brien, C.M. 2005. Distribution–abundance relationships for North Sea Atlantic cod (Gadus morhua): observation versus theory. Can. J. Fish. Aquat. Sci. 62(9): 2001–2009. doi:10.1139/f05-109.

Bremaud, P. 1999. Markov chains, Gibbs fields, Monte Carlo simulation, and queues. Springer, New York.

Diggle, P.J., and Ribeiro, P.J., Jr. 2007. Model-based geostatistics. Springer, New York.

Hrafnkelsson, B., and Stefansson, G. 2004. A model for categorical length data from groundfish surveys. Can. J. Fish. Aquat. Sci. 61(7): 1135–1142. doi:10.1139/f04-049.

Hutchings, J.A. 1996. Spatial and temporal variation in the density of northern cod and a review of hypotheses for the stock’s collapse. Can. J. Fish. Aquat. Sci. 53(5): 943–962. doi:10.1139/cjfas-53-5-943.


International Council for the Exploration of the Sea. 2006. Report of the Working Group on the Assessment of Demersal Stocks in the North Sea and Skagerrak. ICES ACFM:35.

Kristensen, K., Lewy, P., and Beyer, J.E. 2006. How to validate a length-based model of single species fish stock dynamics. Can. J. Fish. Aquat. Sci. 63(11): 2531–2542. doi:10.1139/F06-135.

MacCall, A.D. 1990. Dynamic geography of marine fish populations. Washington Sea Grant Program, Seattle, Washington.

Meeus, J. 1998. Astronomical algorithms. Willmann-Bell Inc., Richmond, Virginia.

Møller, J., and Waagepetersen, R.P. 2004. Statistical inference and simulation for spatial point processes. Chapman & Hall/CRC, Boca Raton, Florida.

Møller, J., Syversveen, A., and Waagepetersen, R. 1998. Log Gaussian Cox processes. Scand. J. Stat. 25(3): 451–482. doi:10.1111/1467-9469.00115.

O’Neill, M.F., and Faddy, M.J. 2003. Use of binary and truncated negative binomial modelling in the analysis of recreational catch data. Fish. Res. 60(2-3): 471–477. doi:10.1016/S0165-7836(02)00101-7.

Perry, A.L., Low, P.J., Ellis, J.R., and Reynolds, J.D. 2005. Climate change and distribution shifts in marine fishes. Science (Washington, D.C.), 308(5730): 1912–1915. doi:10.1126/science.1111322.

Rindorf, A., and Lewy, P. 2006. Warm, windy winters drive cod north and homing of spawners keeps them there. J. Appl. Ecol. 43(3): 445–453. doi:10.1111/j.1365-2664.2006.01161.x.

Rivoirard, J., Simmonds, J., Foote, K.G., Fernandes, P., and Bez, N. 2000. Geostatistics for estimating fish abundance. Blackwell Science, Oxford, UK.

Rue, H., and Held, L. 2005. Gaussian Markov random fields: theory and applications. Chapman & Hall/CRC, Boca Raton, Florida.

Stefansson, G. 1996. Analysis of groundfish survey abundance data: combining the GLM and delta approaches. ICES J. Mar. Sci. 53(3): 577–588. doi:10.1006/jmsc.1996.0079.

Stelzenmuller, V., Ehrich, S., and Zauke, G.-P. 2005. Effects of survey scale and water depth on the assessment of spatial distribution patterns of selected fish in the northern North Sea showing different levels of aggregation. Mar. Biol. Res. 1(6): 375–387. doi:10.1080/17451000500361009.

Swain, D.P., and Sinclair, A.F. 1994. Fish distribution and catchability: what is the appropriate measure of distribution? Can. J. Fish. Aquat. Sci. 51(5): 1046–1054. doi:10.1139/f94-104.

Appendix A

When calculating the index related to stock concentration, D95, one has to ensure that the estimate is unbiased. The amount of bias of D95 will be examined here assuming that data follow the LGCP and that the estimation of D95 is based on the estimated LGCP parameters and interpolation onto a 50 × 50 grid. Further, it will be shown that using the raw or smoothed observations as the basis for estimating D95 may result in biased estimates for small values of the mean catch rate. Finally, it will be demonstrated that D95 is closely related to the dispersion and spatial correlation of data. The simulation experiments are performed as follows.

Estimation of D95

D95 estimated based on LGCP predictions

An area confined by longitudes 0°E to 10°E and latitudes 50°N to 60°N has been considered, for which the maximum distance between the corners is 1280 km. For a regular 51 × 51 (= 2601) grid with longitudes (0°, 0.2°, …, 10°) and latitudes (50°, 50.2°, …, 60°), one realization of the densities, $l = (l_1, \ldots, l_{2601})'$, for the 2601 grid points has been simulated assuming that they follow a LGCP with known parameters m (the common log density), a (the overdispersion), and b (the spatial correlation parameter). A nugget effect is not included. The simulations are performed by first calculating the distances between the 2601 points and, based on these, the variance–covariance matrix, S, using the known parameters a and b and eq. 2. Then the 2601 log densities, $h = (h_1, \ldots, h_{2601})'$, are simulated by randomly drawing from the multivariate normal distribution N(M, S), where

$$M' = \overbrace{(\log(m) - a/2,\; \ldots,\; \log(m) - a/2)}^{2601}$$

The densities are then $l = e^{h} = (e^{h_1}, \ldots, e^{h_{2601}})'$, for which $E(l_i) = m$, i = 1, …, 2601. Based on the 2601 values of l, D95 has been calculated. For the selected values of the parameters, it has been shown that a 51 × 51 grid is sufficient to obtain an estimate of the true D95 with an error of less than 0.01. Hence we consider this estimate, $D95_{true}$, as the true D95 for the realized distribution of densities.
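The simulation of one realization of the densities can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the covariance S_ij = a·exp(−b·distance_ij) in place of eq. 2 (not reproduced here), uses planar coordinates in kilometres instead of distances between longitude and latitude positions, and works on a much smaller grid than 51 × 51. The function name `simulate_densities` is hypothetical.

```python
import math
import random


def simulate_densities(points_km, m, a, b, rng):
    """One realization of l = exp(h), h ~ N(M, S), on the given points.

    Assumed covariance: S_ij = a * exp(-b * distance_ij), no nugget effect.
    """
    n = len(points_km)
    S = [[a * math.exp(-b * math.dist(p, q)) for q in points_km]
         for p in points_km]
    # Lower-triangular Cholesky factor C with C C' = S.
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(C[i][k] * C[j][k] for k in range(j))
            C[i][j] = math.sqrt(s) if i == j else s / C[j][j]
    # h = M + C z with z standard normal and M_i = log(m) - a/2,
    # so that E(exp(h_i)) = exp(log(m) - a/2 + a/2) = m.
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean_log = math.log(m) - a / 2.0
    h = [mean_log + sum(C[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return [math.exp(v) for v in h]


# Illustrative 6 x 6 grid with 100 km spacing (planar coordinates in km).
grid = [(100.0 * i, 100.0 * j) for i in range(6) for j in range(6)]
rng = random.Random(2009)
densities = simulate_densities(grid, m=10.0, a=1.5, b=0.005, rng=rng)
```

The shift of the mean by −a/2 is what makes the expected density equal to m. With NumPy available, `numpy.linalg.cholesky` would replace the explicit factorization loop.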

We now simulate the catches $X = (X_1, \ldots, X_{121})'$ on the 11 × 11 (= 121) grid with longitudes (0°, 1°, …, 10°) and latitudes (50°, 51°, …, 60°), which is a regular subset of the 51 × 51 grid. This grid approximately corresponds to one haul being taken in a statistical square of 60 × 60 nautical miles, which is a coarse grid with poor coverage of the area considered. The catches are simulated by randomly drawing from independent Poisson distributions with means equal to the corresponding subset, $l_{11 \times 11}$, of l. From the simulated catches, X, the estimates of the parameters, $\hat{m}$, $\hat{a}$, and $\hat{b}$, have been obtained by ML, and based on these, the predictions on the 51 × 51 grid of the log densities, $\hat{h}_{predicted} = (\hat{h}_1, \ldots, \hat{h}_{2601})'$, and their variance have been calculated using eqs. 9 and 10. In principle, an estimate $\hat{l}_{predicted} = \exp(\hat{h}_{predicted} + \hat{a}/2)$ could be obtained, but as this may be seriously biased (Aitchison and Brown 1976), we instead simulate an unbiased estimate by (1) drawing $\hat{h}_{sim}$ from the multivariate normal distribution $N(\hat{h}_{predicted}, \mathrm{VAR}(\hat{h}_{predicted}))$, (2) calculating $\widehat{D95}_{sim} = D95(\exp(\hat{h}_{sim}))$, and (3) repeating steps 1 and 2 1000 times and calculating $\widehat{D95}_{LGC}$ as the mean of $\widehat{D95}_{sim}$. The possible bias of $\widehat{D95}_{LGC}$ has been calculated as $\widehat{D95}_{LGC}/D95_{true} - 1$.
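Steps 1 to 3 above can be sketched in a few lines. The sketch assumes that D95 is the smallest proportion of equal-area grid cells holding 95% of the total density (the formal definition is given elsewhere in the paper), and it takes the predicted log densities and a Cholesky factor of their variance as given inputs, since eqs. 9 and 10 are not reproduced here. The names `d95` and `d95_lgc` and the example inputs are hypothetical.

```python
import math
import random


def d95(densities, share=0.95):
    """Smallest proportion of equal-area cells that holds `share` of the total."""
    total = sum(densities)
    if total <= 0:
        raise ValueError("total density must be positive")
    target = share * total
    acc, cells = 0.0, 0.0
    for dens in sorted(densities, reverse=True):
        if acc + dens >= target:
            cells += (target - acc) / dens  # only part of this cell is needed
            break
        acc += dens
        cells += 1.0
    return cells / len(densities)


def d95_lgc(h_pred, var_chol, n_draws=1000, seed=1):
    """Mean of D95(exp(h_sim)) over draws h_sim ~ N(h_pred, V), V = var_chol var_chol'."""
    rng = random.Random(seed)
    n = len(h_pred)
    total = 0.0
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        h_sim = [h_pred[i] + sum(var_chol[i][k] * z[k] for k in range(i + 1))
                 for i in range(n)]                      # step 1
        total += d95([math.exp(v) for v in h_sim])       # step 2
    return total / n_draws                               # step 3


# Hypothetical inputs: 40 predicted log densities with independent errors (sd 0.3).
h_pred = [math.log(1.0 + (i % 5)) for i in range(40)]
var_chol = [[0.3 if i == k else 0.0 for k in range(40)] for i in range(40)]
estimate = d95_lgc(h_pred, var_chol, n_draws=200)
```

Given a reference value `d95_true`, the relative bias would then be `estimate / d95_true - 1`. Averaging D95 over draws, instead of applying D95 once to a back-transformed point prediction, is the design choice the text motivates by the bias of the back-transformation.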

D95 estimates based on the observations

As some authors use the observations or smoothed values of these as the basis for estimating D95 (Swain and Sinclair 1994; Atkinson et al. 1997; Blanchard et al. 2005), we also examine the possible bias by calculating $D95_{observations}(X)$ based on the (raw) 121 observations and two alternatives based on smoothed values. The first estimate based on smoothed observations, $D95_{81}(X)$, is obtained by dividing the longitude and latitude ranges, [0, 10] and [50, 60], which define the area considered, into nine intervals each and calculating the mean of the observations in each of the 9 × 9 (= 81) rectangles defined. Correspondingly, $D95_{25}(X)$ is obtained by dividing each range into five intervals.
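The smoothing into rectangles can be sketched as a simple binning of the 121 grid observations. How observations on interval boundaries are assigned is not stated in the text; the sketch assumes half-open intervals with the upper edge folded into the last interval. The function name and the example catches are hypothetical.

```python
def smooth_by_rectangles(obs, k, lon_range=(0.0, 10.0), lat_range=(50.0, 60.0)):
    """Mean catch in each of the k x k rectangles; obs is a list of (lon, lat, catch)."""
    def index(value, lo, hi):
        # Interval number 0..k-1; the upper edge belongs to the last interval.
        return min(int((value - lo) * k / (hi - lo)), k - 1)

    sums, counts = {}, {}
    for lon, lat, catch in obs:
        cell = (index(lon, *lon_range), index(lat, *lat_range))
        sums[cell] = sums.get(cell, 0.0) + catch
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Made-up catches on the 11 x 11 grid of whole degrees.
obs = [(lon, lat, (lon + lat) % 4) for lon in range(11) for lat in range(50, 61)]
smoothed81 = smooth_by_rectangles(obs, 9)   # basis for D95_81
smoothed25 = smooth_by_rectangles(obs, 5)   # basis for D95_25
```

With nine intervals most rectangles contain a single observation, so the 81-rectangle version smooths only lightly, whereas the 25-rectangle version averages four to nine observations per rectangle.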


The relationship between D95 and the spatial correlation and the dispersion

To examine the above relationship, $D95_{true}$ has been calculated for a range of values of 1/b with fixed dispersion a, and vice versa.

Results: estimation of D95

To examine the effect of varying mean catch rate, sets of simulations have been performed with a and b fixed at a = 1.5 and b = 0.005 and with catch rates = (0.25, 0.4, 0.50, 0.75, 1, 1.5, 2, 3, 5, 7, 8, 10), i.e., m = ln(catch rate).

For b = 0.005, the characteristic spatial correlation distance 1/b equals 200 km, the distance at which the spatial correlation equals 0.37. For the case considered, the proportion of the points for which the distance is less than 200 km is about 10%, indicating that observations are available for estimating the parameters in the model. The coefficient of variation of the densities l is $\sqrt{1.5} = 1.22$.

The results of the simulations are illustrated, showing the relationship between the relative bias of estimates of D95 and the mean catch rate (Fig. A1). This figure shows that in general, $\widehat{D95}_{LGC}$ is the least biased estimator of D95 and that the bias is less than 5% for mean catch rates larger than 1. For mean catch rates less than 1, the relative bias is less than 15%. The smoothed estimate, $D95_{81}$, is the second-best estimator, with a relative bias less than 10% for mean catch rates larger than 2. For mean catch rates less than 2, however, the relative bias is very large, up to about –60%. In general, the bias of $D95_{25}$ seems to be positive, up to 20%. For small values of mean catch rates, the bias is still limited, below 10%. $D95_{observations}$ is negatively biased, and the bias is especially large (up to –60%) for small values of the mean catch rate.

We conclude that the LGCP estimator, $\widehat{D95}_{LGC}$, is the best estimator. Smoothing of the observations may lead to satisfactory D95 estimates for mean catch rates larger than 1 or 2. Raw observations should not be used for estimation of D95.

Results: the relationship between D95 and spatial correlation and dispersion

The results are illustrated showing the relationship between D95 and 1/b for fixed values of the mean catch rate of 10 and a = 1.5 (Fig. A2a), as well as the relationship between D95 and a for a mean catch rate of 10 and b = 0.02 (Fig. A2b). The figure shows that D95 depends on the spatial correlation, the dispersion, and the log densities. Hence, changes in the concentration of a stock (1 – D95/0.95) may be caused by changes either in the dispersion or the spatial correlation or in both stock characteristics. The quantity 1 – D95/0.95 is a measure of the concentration, which lies between 0 and 1.
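The stated range of the concentration measure follows directly if D95 is expressed as a proportion of the total area, which is an assumption here because the definition of D95 is not repeated in this appendix. A short worked check, with C introduced only as shorthand for the concentration:

```latex
C = 1 - \frac{D95}{0.95}, \qquad 0 < D95 \le 0.95 \;\Longrightarrow\; 0 \le C < 1 .
% Evenly spread stock:                     D95 = 0.95  gives  C = 0.
% Stock confined to a vanishing area:      D95 \to 0   gives  C \to 1.
% Illustrative value:                      D95 = 0.57  gives  C = 1 - 0.57/0.95 = 0.40.
```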

References

Aitchison, J., and Brown, J.A.C. 1976. The lognormal distribution. Cambridge University Press, Cambridge, UK.

Atkinson, D.B., Rose, G.A., Murphy, E.F., and Bishop, C.A. 1997. Distribution changes and abundance of northern cod (Gadus morhua), 1981–1993. Can. J. Fish. Aquat. Sci. 54(S1): 132–138. doi:10.1139/cjfas-54-S1-132.

Blanchard, J.L., Mills, C., Jennings, S., Fox, C.J., Rackham, B.D., Eastwood, P.D., and O’Brien, C.M. 2005. Distribution–abundance relationships for North Sea Atlantic cod (Gadus morhua): observation versus theory. Can. J. Fish. Aquat. Sci. 62(9): 2001–2009. doi:10.1139/f05-109.

Swain, D.P., and Sinclair, A.F. 1994. Fish distribution and catchability: what is the appropriate measure of distribution? Can. J. Fish. Aquat. Sci. 51(5): 1046–1054. doi:10.1139/f94-104.

Fig. A1. Relative bias of estimated D95 versus mean catch rate: bold solid line, estimates based on predictions using the LGCP; solid line, estimates based on raw catch observations; broken lines, estimates based on smoothed catch observations. See text in the appendix.

Fig. A2. The relationship between D95 and (a) 1/b and (b) the variance of log density a.
