SCRS/2013/030 Collect. Vol. Sci. Pap. ICCAT, 70(1): 190-201 (2014)
VPA-2BOX MODEL DIAGNOSTICS USED IN THE 2012 ASSESSMENT OF WESTERN ATLANTIC BLUEFIN TUNA (THUNNUS THYNNUS)
Shannon L. Cass-Calay and Matthew Lauretta1
SUMMARY
This report documents the model diagnostics used in the assessment of West Atlantic bluefin tuna during 2012. Model diagnostics included model convergence statistics, sensitivity analyses of abundance indices and life history assumptions, and bootstrap analysis to evaluate model robustness and estimate central tendency and variance of parameter estimates. The usefulness of these diagnostics in evaluating model performance during the stock assessment is discussed, and recommendations for future improvement are provided.
KEYWORDS
Stock assessment, Model diagnostics, Bluefin tuna
1 NOAA Fisheries, Southeast Fisheries Science Center, Sustainable Fisheries Division, 75 Virginia Beach Drive, Miami, FL, 33149-1099, USA.
1. Introduction

A number of model diagnostic tools were used in the 2012 assessment of West Atlantic bluefin tuna (Thunnus thynnus) to evaluate VPA-2BOX model performance and to assess the robustness of stock biomass and fishing mortality estimates. These diagnostics were important for determining the validity of model estimates, given the assumptions about the data and the life-history parameters of the stock. The diagnostic tools are outlined below, and figures are included to demonstrate the approach.

2. Methods

The following diagnostic tools were used during the 2012 assessment of West Atlantic bluefin tuna.

Relative Abundance Indices

- A summary table was constructed to evaluate the appropriateness and validity of individual abundance indices based on predetermined criteria provided by ICCAT.
- Plots were constructed to visually compare model fit to the standardized relative abundances. Individual outliers that were believed to be false signals of stock abundance trends (i.e. the CAN GSL 2010 index) were removed to determine the influence of those data points on VPA results.
- Indices jackknife: Individual indices were iteratively removed from the VPA model to determine the most influential data series and the effects of excluding individual indices on model results.

Model Convergence

- Model convergence statistics provided within the VPA-2BOX program were examined to verify model performance.
- Model estimates were compared across a range of parameter starting values to assess the robustness of the model solution.
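
The starting-value comparison can be pictured as a multi-start loop: the same objective is refitted from several jittered starting points and the solutions are checked for agreement. The sketch below is an illustration only and does not call VPA-2BOX. The quadratic `objective`, the helpers `fit_from` and `multistart_spread`, the jitter range and the plain gradient-descent optimizer are all assumptions chosen to keep the example self-contained.

```python
import random

def objective(params):
    """Hypothetical stand-in for the model's objective function (assumption)."""
    f_term, q_term = params
    return (f_term - 0.3) ** 2 + 2.0 * (q_term - 1.5) ** 2

def gradient(params):
    """Analytical gradient of the stand-in objective."""
    f_term, q_term = params
    return [2.0 * (f_term - 0.3), 4.0 * (q_term - 1.5)]

def fit_from(start, step=0.1, n_iter=2000):
    """Plain gradient descent from a given starting point."""
    params = list(start)
    for _ in range(n_iter):
        grad = gradient(params)
        params = [p - step * g for p, g in zip(params, grad)]
    return params

def multistart_spread(base_start, n_starts=20, jitter=0.5, seed=1):
    """Refit from jittered starts; return all fits and the largest parameter spread."""
    rng = random.Random(seed)
    fits = []
    for _ in range(n_starts):
        start = [b * (1.0 + rng.uniform(-jitter, jitter)) for b in base_start]
        fits.append(fit_from(start))
    spreads = [max(col) - min(col) for col in zip(*fits)]
    return fits, max(spreads)

fits, spread = multistart_spread([1.0, 1.0])
print(f"largest parameter spread across starts: {spread:.2e}")
```

A spread near zero suggests that the solution does not depend on where the search began; a large spread would point to multiple local minima or a flat objective surface.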
Model Sensitivity Runs

- Selectivity sensitivity: Model sensitivity to selectivity assumptions was evaluated through alternative model runs that included no constraints on selectivity schedules, as well as constraints applied to selectivity standard deviations and across different age classes. SSB, apical fishing mortality and recruitment estimates were compared for all selectivity sensitivity model runs to determine the effect of those assumptions on model results.
- Maturity sensitivity: Three maturity schedules were evaluated to determine the effects of stock maturity assumptions on VPA results. Alternative maturity assumptions included early maturation (i.e. eastern Atlantic ogive with 50% maturity at age 4 and full maturity of older age classes), late maturation (i.e. approximately logistic increase in maturity from 0 at age 8 to 1 at age 16), and intermediate maturation (fully mature at age 9). SSB and fishing mortality estimates under each alternative were compared.
- Retrospective analysis: Annual indices and catch information were removed iteratively, starting with the most recent year and extending back up to five years, to determine the influence of data from recent years on trends in stock biomass and fishing mortality estimates.
- Recruitment sensitivity: Two alternative hypotheses of the functional form of the stock-recruitment relationship were modeled to determine model sensitivity to recruitment assumptions. Projections of future stock status were compared between the alternative recruitment assumptions.
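
To illustrate why the maturity assumption matters, the sketch below applies the three maturity schedules to one fixed numbers-at-age vector and computes SSB as the sum over ages of numbers, weight and proportion mature. It is a sketch under stated assumptions, not the assessment's calculation: the age range, the numbers-at-age and the weights-at-age are invented, ages below 4 are assumed immature in the early schedule, ages below 9 are assumed immature in the intermediate schedule, and the logistic slope of the late schedule is an arbitrary choice.

```python
import math

AGES = list(range(1, 17))  # ages 1..16 (assumed range for illustration)

def early_maturity(age):
    """50% mature at age 4, fully mature at older ages; younger ages assumed immature."""
    if age < 4:
        return 0.0
    return 0.5 if age == 4 else 1.0

def intermediate_maturity(age):
    """Knife-edge schedule: fully mature from age 9 onward (younger assumed immature)."""
    return 1.0 if age >= 9 else 0.0

def late_maturity(age, a_lo=8, a_hi=16, slope=0.8):
    """Logistic curve rescaled to equal 0 at a_lo and 1 at a_hi (slope is assumed)."""
    if age <= a_lo:
        return 0.0
    if age >= a_hi:
        return 1.0
    mid = 0.5 * (a_lo + a_hi)

    def logistic(a):
        return 1.0 / (1.0 + math.exp(-slope * (a - mid)))

    return (logistic(age) - logistic(a_lo)) / (logistic(a_hi) - logistic(a_lo))

def ssb(numbers, weights, maturity):
    """Spawning stock biomass: sum over ages of numbers * weight * proportion mature."""
    return sum(n * w * maturity(a) for a, n, w in zip(AGES, numbers, weights))

# Invented numbers-at-age (declining 20% per age) and weights-at-age in kg.
numbers = [10000 * 0.8 ** (a - 1) for a in AGES]
weights = [10.0 * a for a in AGES]

results = {
    "early": ssb(numbers, weights, early_maturity),
    "intermediate": ssb(numbers, weights, intermediate_maturity),
    "late": ssb(numbers, weights, late_maturity),
}
for name, value in results.items():
    print(f"{name:>12}: SSB = {value:,.0f} kg")
```

With the same numbers-at-age, the earlier a schedule counts fish as mature, the larger the resulting SSB, so the three schedules shift the scale of the SSB series.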
Bootstrap Iterations
A total of 500 bootstrap iterations of the VPA model were conducted to determine model stability and variation in SSB and fishing mortality estimates. Histograms of the bootstrap estimates were constructed to examine bias and normality of estimates across bootstraps.
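
One simple way to summarize a set of bootstrap estimates is with the median, a central percentile interval, and the offset of the median from the deterministic estimate. The sketch below assumes the bootstrap estimates are already available as a list. Here they are simulated from a lognormal distribution purely for illustration. The function names, the 80% level, the invented point estimate and the relative-bias measure are assumptions, not output or settings from VPA-2BOX.

```python
import random
import statistics

def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in [0, 100]) of an already sorted list."""
    if not sorted_values:
        raise ValueError("no values supplied")
    k = (len(sorted_values) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)

def summarize_bootstrap(estimates, point_estimate, level=80.0):
    """Median, central interval and relative bias of bootstrap estimates."""
    ordered = sorted(estimates)
    tail = (100.0 - level) / 2.0
    median = statistics.median(ordered)
    return {
        "median": median,
        "lower": percentile(ordered, tail),
        "upper": percentile(ordered, 100.0 - tail),
        "relative_bias": (median - point_estimate) / point_estimate,
    }

# Simulated stand-in for 500 bootstrap estimates of terminal-year SSB.
rng = random.Random(2012)
point_estimate = 14000.0  # invented deterministic estimate
boot = [point_estimate * rng.lognormvariate(0.0, 0.15) for _ in range(500)]

summary = summarize_bootstrap(boot, point_estimate)
print(summary)
```

A median close to the deterministic estimate and a roughly symmetric interval are consistent with low bias and approximate normality; a histogram of `boot` would show the same thing visually.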
Results and Discussion

A summary table of abundance index diagnostics (Table 1), constructed by ICCAT and filled in by Cooperating Parties, was very useful for comparing data assumptions and standardization methods across indices. Plotting model fits to the indices provided a visual comparison of model performance between the current assessment and the previous one (Figure 1), as well as between the base model and select sensitivity runs (Figure 2).

Model convergence diagnostics

Stock biomass and fishing mortality estimates across the various sensitivity analyses indicated that historical stock trend patterns were fairly robust to the parameter starting values, to the life-history parameters, particularly the maturity assumptions (Figure 3), to the selectivity assumptions (Figure 3), and to the removal of individual abundance indices in the jack-knife analysis (Figure 4). Retrospective analyses indicated that model results were not strongly influenced by the removal of indices and catch data from recent years. In contrast, the alternative hypotheses about the stock-recruitment relationship had a large effect on the perception of current stock status (Figure 5) and on the stock biomass projections (Figure 6).

Bootstrap resampling provided a measure of uncertainty in the model estimates, and histograms of stock biomass and fishing mortality estimates across bootstrap iterations demonstrated good central tendency and approximate normality. From these bootstraps, it was possible to estimate confidence intervals around parameter estimates (Figures 7 and 8) to characterize model uncertainty.

Recommendations
Table 1. Summary table to evaluate the available Atlantic bluefin abundance indices. Reproduced from the 2012 Atlantic bluefin tuna stock assessment report. Two of the index columns carry the label "Fishery Independent".

Index (SCRS doc), in column order:
- Bay of Biscay baitboat (SCRS/2012/100)
- Morocco and Spanish traps (SCRS-12-103)
- Spanish traps (SCRS/2012/111)
- Juvenile western Med (SCRS/2012/124)
- Japanese LL (SCRS/2012/131)
- US rod and reel (SCRS/2012/158)
- US LL (SCRS/2012/160)
- Larval survey (SCRS/2012/164)
- Southern Gulf of St. Lawrence (SCRS/2012/118)
- Southwest Nova Scotia (SCRS/2012/118)

Diagnostics
- Bay of Biscay baitboat: Most of the appropriate diagnostics appear to be included
- Morocco and Spanish traps: Most of the appropriate diagnostics appear to be included
- Spanish traps: Most of the appropriate diagnostics appear to be included
- Juvenile western Med: No diagnostics presented
- Japanese LL: Yes, diagnostics presented
- US rod and reel: Few diagnostics presented
- US LL: Presented, with some deviation from expectations
- Larval survey: Most of the appropriate diagnostics appear to be included
- Southern Gulf of St. Lawrence: All the appropriate diagnostics were included
- Southwest Nova Scotia: All the appropriate diagnostics were included

Appropriateness of data exclusions and classifications (e.g. to identify targeted trips)
- Bay of Biscay baitboat: Data exclusions/classifications are listed and justified; specific targeting factors included in standardisation
- Morocco and Spanish traps: Data exclusions not discussed; targeting not an issue
- Spanish traps: Data exclusions not discussed, although data is classified
- Juvenile western Med: Data described and method clearly explained with caveats and limitations
- Japanese LL: Data exclusions are clearly identified and justified; alternate CPUE runs are attempted using additional exclusions. GLM includes factors that could be considered proxies for targeting
- US rod and reel: Data exclusions, if any, are not mentioned. GLMM specifically includes a target factor
- US LL: Data exclusions are described. Vessel exclusions applied to limit trips to those that caught BF in >= 2 years
- Larval survey: Data collection method clearly explained; as it is a survey, presumably few data exclusions
- Southern Gulf of St. Lawrence: Data exclusions are indicated, classifications appropriate
- Southwest Nova Scotia: Data exclusions are indicated, classifications appropriate

Geographical Coverage
- Bay of Biscay baitboat: Geographical coverage is limited to Bay of Biscay; maps are provided
- Morocco and Spanish traps: Coverage limited to the Straits of Gibraltar
- Spanish traps: Coverage limited to the Straits of Gibraltar
- Juvenile western Med: Coverage limited to Med. Maps of surveys provided
- Japanese LL: Covers west and north-east Atlantic. Distribution maps are provided
- US rod and reel: Northeast US coast only, no distribution maps provided
- US LL: U.S. GOM
- Larval survey: Coverage limited to Med. No maps of surveys provided
- Southern Gulf of St. Lawrence: Coverage limited to NAFO area 4T
- Southwest Nova Scotia: Coverage limited to NAFO area 4X

Catch Fraction (entries in column order)
- catch fraction is roughly 5%; ? Not clear; appears to be small; NA; significant; ? Not discussed.; NA

Length of Time Series relative to the history of exploitation
- Bay of Biscay baitboat: Longer series (starts in 1952); shorter series (starts in 2000)
- Morocco and Spanish traps: Time series starts at beginning of the 1980s
- Spanish traps: Time series starts at beginning of the 1980s
- Juvenile western Med: Starts 2000
- Japanese LL: for the west ; for the northeast
- US rod and reel: Time series starts at beginning of the 1980s
- US LL: 1987-2010
- Larval survey: Since 2001
- Southern Gulf of St. Lawrence: Since 1981, exploitation began in 1972-73
- Southwest Nova Scotia: Since 1988

Are other indices available for the same time period?
- Bay of Biscay baitboat: Yes (5)
- Morocco and Spanish traps: Yes (3)
- Spanish traps: Yes (3)
- Juvenile western Med: Yes (2)
- Japanese LL: Yes (3)
- US rod and reel: Yes (3)
- US LL: Yes (3)
- Larval survey: Yes (1)
- Southern Gulf of St. Lawrence: No
- Southwest Nova Scotia: No

Does the index standardization account for known factors that influence catchability/selectivity? (entries in column order)
- Analysis includes many factors that could affect fishing efficiency/selectivity. Multiple interactions included
- Factors included in the model, table 1, are not explained in the text and impossible to understand for those not immediately familiar with the fishery. It would appear only one factor was included that could influence catchability - trap
- Due to the nature of the fishery, only the trap factor could explain catchability but the factors were explained. GLM contains only factors
- Methodology for standardisation of the series appears to be appropriate for a survey
- Gear type is included as is a selectivity proxy. No interactions included
- Model includes multiple factors that could influence catchability and selectivity
- Methodology for standardisation of the series appears to be appropriate for a survey
- Factors are month, fleet, gear and hours fished
- Factors are month, fleet, gear and hours fished

Are there conflicts between the catch history and the CPUE response?
- Bay of Biscay baitboat: No conflict noted
- Morocco and Spanish traps: No conflict noted
- Spanish traps: No conflict noted
- Juvenile western Med: No conflict noted
- Japanese LL: No conflict noted
- US rod and reel: No conflict noted
- US LL: Yes. Circle hooks, weak hooks and regulatory measures were discussed. The index was not adjusted for these factors
- Larval survey: No conflict noted
- Southern Gulf of St. Lawrence: Response sensitive to management measures and shrinking quotas
- Southwest Nova Scotia: Response sensitive to management measures and shrinking quotas

Is the interannual variability within plausible bounds (e.g. SCRS/2012/039)? (entries in column order)
- Variability increases over the latter years of the series
- There is a high degree of variability, but no formal tests conducted to see whether this is biologically plausible
- There is a high degree of variability between the beginning of the series and the final few years, but no formal tests conducted to see whether this is biologically plausible
- Annual variability higher for west Atlantic base case CPUEs, but northeast CPUE has extreme increase in most recent years
- Unknown
- Variability does not impair interpretation of the trend. Increased variability in recent times
- Variability does not impair interpretation of the trend. Increased variability in recent times

Are biologically implausible interannual deviations severe? (e.g. SCRS/2012/039) (entries in column order)
- Moderate
- No tests conducted
- No tests conducted
- Unknown
- Deviations relate to known impacts
- Deviations relate to known impacts

Assessment of data quality and adequacy of data for standardization purpose (e.g. sampling design, sample size, factors considered)
- Bay of Biscay baitboat: Information does not include length frequencies of catches in recent years. Multiple factors and interactions included. Model design takes into account effort distribution. Discussions of data quality touched on
- Morocco and Spanish traps: Document states LF data was recorded, but it is not presented. Document states series applied to spawners 10+; model is extremely low on factors
- Spanish traps: Data is discussed and method has been adjusted from prior studies accordingly. Paper provides size distribution of sampled catch
- Juvenile western Med: Data is presented and limitations of the method discussed
- Japanese LL: Information includes length frequencies of catches. Multiple factors included. Sample design and sensitivity runs investigate effort distribution as well as data assumptions/concerns, and effort is presented
- US rod and reel: Information includes length frequencies of catches. Multiple factors included. Sample design and sensitivity runs investigate alternative data assumptions/concerns
- US LL: Data is presented and methodology for standardisation explicitly presented. Factors appear to be appropriate
- Larval survey: Data is presented and methodology for standardisation explicitly presented. Factors appear to be appropriate for a survey
- Southern Gulf of St. Lawrence: Includes trends in forage fish and recent changes in environmental variables. Shows weight frequencies and trends in condition
- Southwest Nova Scotia: Includes trends in forage fish and recent changes in environmental variables. Shows weight frequencies and trends in weight

Is this CPUE time series continuous? (entries in column order)
- Yes; Yes; Yes; Yes; Yes; No missing points, although series have been split into time periods; Yes; Yes; Yes
[Figure 1. Twelve panels, one per index: CAN_GSL, CAN_SWNS, US_RR_0_145, US_RR_66_114, US_RR_115_144, US_RR_195plus, US_RR_177plus, JPN_LL_AREA2, US_LARVAL, US_LL_GOM, JPN_LL_GOM, TAGGING. x-axis: Year (1970 to 2010); y-axis: Index of Abundance.]
Figure 1. Fits to the CPUE indices for the 2012 western Atlantic BFT continuity VPA (observed shown as black points, predicted shown as blue lines), compared with the 2010 base model (observed shown as open points, predicted shown as red lines). Reproduced from the 2012 Atlantic bluefin tuna stock assessment report.
[Figure 2. Twelve panels, one per index: CAN_GSL, CAN_SWNS, US_RR_0_145, US_RR_66_114, US_RR_115_144, US_RR_195plus, US_RR_177plus, JPN_LL_AREA2, US_LARVAL, US_LL_GOM, JPN_LL_GOM, TAGGING. x-axis: Year (1970 to 2010); y-axis: Index of Abundance.]
Figure 2. Fits to the CPUE indices (black points) for western Atlantic BFT base VPA runs without the Canadian GSL (red lines) and U.S. RR >177 cm (turquoise lines) indices, compared to the 2012 continuity model (dark blue lines). Reproduced from the 2012 Atlantic bluefin tuna stock assessment report.
[Figure 3. Panel groups: Selectivity Sensitivity Models; Maturity Sensitivity Models.]
Figure 3. Annual estimates of spawning stock biomass (SSB), depletion with regard to 1970 (SSB/SSB1970), apical fishing mortality and recruitment for the VPA continuity and select sensitivity runs. Sensitivity run 2 displayed poor model behavior (e.g. Apical F = 5.0 in 2011 – the upper bound on F allowable in VPA). Reproduced from the 2012 Atlantic bluefin tuna stock assessment report.
[Figure 4. Panels: SSB and Recruits, each plotted over the full time series and over the most recent years. Legend: Continuity, CAN_GSL_Removed, CAN_SWNS_Removed, US_RR_66_114_Removed, US_RR_115_144_Removed, US_RR_145less_Removed, USRR_177plus_Removed, US_RR_195plus_Removed, JPN_LL_AREA2_Removed, US_LL_GOM_Removed, US_Larval_Removed, JPN_LL_GOM_Removed, Tagging_Removed.]
Figure 4. Jack-knife analysis demonstrating the effects of iteratively removing individual relative abundance indices and associated partial catch-at-age matrices from the western BFT VPA continuity model. Reproduced from the 2012 Atlantic bluefin tuna stock assessment report.
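
The leave-one-index-out logic behind this jack-knife can be sketched in a few lines. The example below is a toy illustration and not the VPA-2BOX procedure: `fit_model` is a hypothetical stand-in that returns the mean terminal-year value across indices, the index series are invented (their names only echo the figure labels), and the removal of the associated partial catch-at-age matrices is not represented.

```python
def fit_model(indices):
    """Hypothetical stand-in for a VPA fit: mean terminal-year value across indices."""
    terminal = [series[-1] for series in indices.values()]
    return sum(terminal) / len(terminal)

def index_jackknife(indices, fit=fit_model):
    """Refit with each index left out; return the change relative to the full fit."""
    full = fit(indices)
    changes = {}
    for name in indices:
        reduced = {k: v for k, v in indices.items() if k != name}
        changes[name] = fit(reduced) - full
    return full, changes

# Invented relative-abundance series used only to exercise the loop.
indices = {
    "CAN_GSL": [1.0, 1.4, 3.0],
    "US_LARVAL": [1.0, 0.9, 0.8],
    "JPN_LL_AREA2": [1.0, 1.1, 1.2],
    "US_LL_GOM": [1.0, 1.0, 1.0],
}

full, changes = index_jackknife(indices)
most_influential = max(changes, key=lambda name: abs(changes[name]))
print(f"full fit: {full:.3f}")
for name, delta in changes.items():
    print(f"without {name:<13} change = {delta:+.3f}")
print("most influential index:", most_influential)
```

The index whose removal moves the estimate the most is flagged as the most influential. In an actual assessment the compared quantities would be full SSB and recruitment trajectories rather than a single number.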
Figure 5. Estimated status of stock relative to the Convention objectives (MSY) by year (1973 to 2011). The lines represent the time series of point estimates for each recruitment scenario. The estimated stock status in 2009 (the geometric mean fishing mortality during 2006-2008 is the proxy for F in 2009) is shown as a red “X”. Reproduced from the 2012 Atlantic bluefin tuna stock assessment report.
[Figure 6. Panel groups: 60% Probability, Low Recruitment Potential; 60% Probability, High Recruitment Potential.]
Figure 6. WBFT: The projected SSB/SSBMSY and F/FMSY trajectories at various catch levels for the two recruitment scenarios. These trajectories correspond to a 60% probability of achieving a given level of SSB/SSBMSY or F/FMSY. Reproduced from the 2012 Atlantic bluefin tuna stock assessment report.
Figure 7. Median (solid line) estimates of spawning stock biomass, abundance of spawners (Age 9+), apical fishing mortality and recruitment. The 2009 to 2011 recruitment estimates were replaced with values from the two-line S-R relationship. Dashed lines indicate the 80% confidence interval. Reproduced from the 2012 Atlantic bluefin tuna stock assessment report.
[Figure 8. Panel groups: Low Recruitment Scenario; High Recruitment Scenario.]
Figure 8. Histograms of bootstrap estimates of 2011 stock status using FMSY and F0.1 references. The yellow bar contains the value corresponding to the continuity-case deterministic estimate. The cumulative frequency is shown as a solid red line. Reproduced from the 2012 Atlantic bluefin tuna stock assessment report.