library(rema)
library(ggplot2)
library(dplyr)
library(knitr)
ggplot2::theme_set(cowplot::theme_cowplot(font_size = 14) +
cowplot::background_grid() +
cowplot::panel_border())In this series of vignettes we walk through several examples of how
to prepare and fit a suite of random effects (RE) models for biomass
estimation and apportionment using the rema package.
rema workflow:Load rema and data. The user can read biomass or
other abundance index data from file (e.g. .csv file), or they can use
the rwout.rep report file from the ADMB version of the RE
model using read_admb_re().
Specify model structure and assumptions using
prepare_rema_input(). This function allows users to quickly
transition from a single to two survey model, specify alternative
process error structures, add likelihood penalties or priors on
parameters, and evaluate alternative assumptions about zero biomass
observations.
Fit the specified REMA model using fit_rema() and
determine whether the model has met basic convergence criteria (e.g.,
Hessian is positive definite, a maximum gradient component approximately
equal to zero).
Extract rema model output into clean, consistently
formatted data frames using tidy_rema(). The user can
visualize this model output using plot_rema(), or quickly
format it into tables for a report.
Compare alternative REMA models and conduct model selection using
compare_rema_models(). Output from this function includes a
table of Akaike Information Criteria (AIC) when appropriate, figures,
and tidied data frames. This function also accepts model output from the
ADMB version of the RE model for easy comparison to past
models.
Taken together, these functions allow R users to quickly fit and interrogate a suite of simple statistical models in TMB without needing software-specific training or expertise.
This example uses Aleutian Islands shortraker
(aisr_rwout.rep) and fits to NMFS bottom trawl survey
estimates.
rwout.rep, a custom report file generated
by the ADMB version of the random effects model.# ?read_admb_re
admb_re <- read_admb_re(filename = 'aisr_rwout.rep',
# optional label for the single biomass survey stratum
biomass_strata_names = 'Aleutians Islands',
model_name = 'ADMB: AI shortraker')
names(admb_re)
#> [1] "biomass_dat" "cpue_dat" "model_yrs"
#> [4] "init_log_biomass_pred" "admb_re_results"
kable(admb_re$biomass_dat)| strata | year | biomass | cv |
|---|---|---|---|
| Aleutians Islands | 1980 | 6829.3 | 0.5531111 |
| Aleutians Islands | 1983 | 26275.5 | 0.2006878 |
| Aleutians Islands | 1986 | 11667.2 | 0.2505718 |
| Aleutians Islands | 1991 | 21835.4 | 0.6916255 |
| Aleutians Islands | 1994 | 26284.4 | 0.2198947 |
| Aleutians Islands | 1997 | 36058.3 | 0.2694324 |
| Aleutians Islands | 2000 | 37151.8 | 0.4460828 |
| Aleutians Islands | 2002 | 15342.0 | 0.2009741 |
| Aleutians Islands | 2004 | 32611.6 | 0.3743298 |
| Aleutians Islands | 2006 | 11780.6 | 0.2522081 |
| Aleutians Islands | 2010 | 18223.7 | 0.2336468 |
| Aleutians Islands | 2012 | 15667.7 | 0.2636677 |
| Aleutians Islands | 2014 | 16400.5 | 0.3771796 |
| Aleutians Islands | 2016 | 15024.0 | 0.3199997 |
admb_re data
object.# ?prepare_rema_input
input <- prepare_rema_input(model_name = 'TMB: AI shortraker', admb_re = admb_re)
names(input)
#> [1] "data" "par" "map" "random" "model_name"
#> [6] "osa" "biomass_dat"The input$data, input$par,
input$map, and input$random are used by
TMB to fit the REMA model. The other objects are used by
other functions to process model results or conduct residual
analyses.
Data could alternatively be entered into
prepare_rema_input() using the biomass_dat
argument:
# (not run)
input <- prepare_rema_input(model_name = 'tmb_rema_aisr', biomass_dat = admb_re$biomass_dat)# ?fit_rema
m <- fit_rema(input)
#> Model runtime: 0.1 seconds
#> stats::nlminb thinks the model has converged: mod$opt$convergence == 0
#> Maximum gradient component: 1.92e-11
#> Max gradient parameter: log_PE
#> TMB:sdreport() was performed successfully for this modelNote that the 95% confidence intervals of the observations (i.e.,
obs_lci and obs_uci in
output$biomass_by_strata are based on the assumption of
normality in log-space; therefore, they are asymmetric on the arithmetic
scale.
# ?tidy_rema
output <- tidy_rema(rema_model = m)
kable(output$parameter_estimates) # estimated fixed effects parameters| model_name | parameter | estimate | std_err | lci | uci |
|---|---|---|---|---|---|
| TMB: AI shortraker | process_error | 0.1557774 | 0.1044475 | 0.0418582 | 0.5797331 |
| model_name | strata | variable | year | log_pred | sd_log_pred | pred | pred_lci | pred_uci | obs | obs_cv | log_obs | sd_log_obs | obs_lci | obs_uci |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TMB: AI shortraker | Aleutians Islands | biomass_pred | 1980 | 9.689900 | 0.3344751 | 16153.63 | 8386.187 | 31115.44 | 6829.3 | 0.5531111 | 8.828978 | 0.51664 | 2480.912 | 18799.27 |
| TMB: AI shortraker | Aleutians Islands | biomass_pred | 1981 | 9.768171 | 0.2704638 | 17468.78 | 10281.190 | 29681.22 | NA | NA | NA | NA | NA | NA |
| TMB: AI shortraker | Aleutians Islands | biomass_pred | 1982 | 9.846441 | 0.2155807 | 18891.00 | 12380.903 | 28824.22 | NA | NA | NA | NA | NA | NA |
| TMB: AI shortraker | Aleutians Islands | biomass_pred | 1983 | 9.924711 | 0.1784564 | 20429.01 | 14399.418 | 28983.42 | 26275.5 | 0.2006878 | 10.176392 | 0.19871 | 17799.538 | 38787.63 |
| TMB: AI shortraker | Aleutians Islands | biomass_pred | 1984 | 9.848307 | 0.1919366 | 18926.28 | 12992.375 | 27570.33 | NA | NA | NA | NA | NA | NA |
# ?plot_rema
plots <- plot_rema(tidy_rema = output, biomass_ylab = 'Biomass (t)') # optional y-axis label
plots$biomass_by_strata# ?compare_rema_models
compare <- compare_rema_models(rema_models = list(m),
admb_re = admb_re,
biomass_ylab = 'Biomass (t)')
compare$plots$biomass_by_strataThis example uses Bering Sea and Aleutian Islands shortspine
thornyhead (bsaisst_rwout.rep) and fits to the NMFS EBS
slope and AI bottom trawl survey estimates. The original ADMB model has
a single, shared process error (PE) over the three strata. This example
demonstrates how to define alternative PE structures, and how to u to
perform model selection using AIC.
admb_re <- read_admb_re(filename = 'bsaisst_rwout.rep',
biomass_strata_names = c('AI survey', 'EBS slope survey', 'S. Bering Sea (AI survey)'),
model_name = 'ADMB: single PE')
# the original ADMB model shared PE across all strata
input1 <- prepare_rema_input(model_name = 'TMB: single PE',
admb_re = admb_re,
PE_options = list(pointer_PE_biomass = c(1, 1, 1)))
m1 <- fit_rema(input1)
#> Model runtime: 0.1 seconds
#> stats::nlminb thinks the model has converged: mod$opt$convergence == 0
#> Maximum gradient component: 9.80e-10
#> Max gradient parameter: log_PE
#> TMB:sdreport() was performed successfully for this model
output <- tidy_rema(rema_model = m1)
kable(output$parameter_estimates) # estimated fixed effects parameters| model_name | parameter | estimate | std_err | lci | uci |
|---|---|---|---|---|---|
| TMB: single PE | process_error | 0.121168 | 0.0286548 | 0.0762236 | 0.1926136 |
| model_name | strata | variable | year | log_pred | sd_log_pred | pred | pred_lci | pred_uci | obs | obs_cv | log_obs | sd_log_obs | obs_lci | obs_uci |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TMB: single PE | AI survey | biomass_pred | 1982 | 8.792938 | 0.4034014 | 6587.561 | 2987.781 | 14524.48 | NA | 9.000024 | NA | NA | NA | NA |
| TMB: single PE | AI survey | biomass_pred | 1983 | 8.792938 | 0.3847739 | 6587.561 | 3098.878 | 14003.77 | NA | 9.000024 | NA | NA | NA | NA |
| TMB: single PE | AI survey | biomass_pred | 1984 | 8.792938 | 0.3651976 | 6587.561 | 3220.089 | 13476.63 | NA | 9.000024 | NA | NA | NA | NA |
| TMB: single PE | AI survey | biomass_pred | 1985 | 8.792938 | 0.3445107 | 6587.561 | 3353.333 | 12941.14 | NA | 9.000024 | NA | NA | NA | NA |
| TMB: single PE | AI survey | biomass_pred | 1986 | 8.792938 | 0.3224995 | 6587.561 | 3501.165 | 12394.72 | NA | 9.000024 | NA | NA | NA | NA |
| model_name | variable | year | pred | pred_lci | pred_uci |
|---|---|---|---|---|---|
| TMB: single PE | tot_biomass_pred | 1982 | 24821.5 | 11212.94 | 54946.08 |
| TMB: single PE | tot_biomass_pred | 1983 | 24821.5 | 11446.45 | 53825.14 |
| TMB: single PE | tot_biomass_pred | 1984 | 24821.5 | 11691.42 | 52697.34 |
| TMB: single PE | tot_biomass_pred | 1985 | 24821.5 | 11948.96 | 51561.55 |
| TMB: single PE | tot_biomass_pred | 1986 | 24821.5 | 12220.35 | 50416.46 |
We can use ggplot2 functions to modify formatting of
plots:
plots <- plot_rema(tidy_rema = output, biomass_ylab = 'Biomass (t)')
plots$biomass_by_strata + facet_wrap(~strata, ncol = 1, scales = 'free_y')One of the primary uses for the RE model is apportioning catch by
management area. The tidy_rema() function output includes a
table with proportions of predicted biomass by strata:
| model_name | year | AI survey | EBS slope survey | S. Bering Sea (AI survey) |
|---|---|---|---|---|
| TMB: single PE | 2018 | 0.2882088 | 0.6856694 | 0.0261218 |
| TMB: single PE | 2019 | 0.2882088 | 0.6856694 | 0.0261218 |
| TMB: single PE | 2020 | 0.2882088 | 0.6856694 | 0.0261218 |
We can compare the TMB model results with the original ADMB model. Note different confidence intervals in the ADMB and TMB versions. The ADMB code uses the Marlow method to sum the variances of biomass in log-space, whereas the TMB version applies the standard Delta method.
compare <- compare_rema_models(rema_models = list(m1),
admb_re = admb_re,
biomass_ylab = 'Biomass (t)')
# Note different confidence intervals between the ADMB version (Marlow method) and the TMB version (Delta method)
compare$plots$biomass_by_strata + facet_wrap(~strata, ncol = 1, scales = 'free_y')We can easily fit an alternative REMA model with strata-specific PE
parameters, and compare the two models using AIC. In this case we see
the single PE model has the lowest AIC value. Note that ADMB models
cannot currently be compared with REMA models using AIC, therefore we
omit the admb_re = admb_re argument in the
compare_rema_models() function call below:
# REMA defaults to strata-specific parameters, which could be explicitly defined
# as: PE_options = list(pointer_PE_biomass = c(1, 2, 3))
input2 <- prepare_rema_input(model_name = 'TMB: strata-specific PE',
admb_re = admb_re)
m2 <- fit_rema(input2)
#> Model runtime: 0.1 seconds
#> stats::nlminb thinks the model has converged: mod$opt$convergence == 0
#> Maximum gradient component: 2.31e-07
#> Max gradient parameter: log_PE
#> TMB:sdreport() was performed successfully for this model
compare <- compare_rema_models(rema_models = list(m1, m2),
biomass_ylab = 'Biomass (t)')
compare$plots$biomass_by_strata + facet_wrap(~strata, ncol = 1, scales = 'free_y')| model_name | objective_function | n_parameters | aic | delta_aic |
|---|---|---|---|---|
| TMB: single PE | 34.73786 | 1 | 71.5 | 0.0 |
| TMB: strata-specific PE | 34.65104 | 3 | 75.3 | 3.8 |
| model_name | parameter | estimate | std_err | lci | uci |
|---|---|---|---|---|---|
| TMB: single PE | process_error | 0.1211680 | 0.0286548 | 0.0762236 | 0.1926136 |
| TMB: strata-specific PE | process_error | 0.1257423 | 0.0384807 | 0.0690223 | 0.2290726 |
| TMB: strata-specific PE | process_error | 0.1084952 | 0.0419428 | 0.0508566 | 0.2314588 |
| TMB: strata-specific PE | process_error | 0.1552381 | 0.1289141 | 0.0304885 | 0.7904241 |