The example shown here will include a simulated spatial process giving rise to heterogeneity in fish length-at-age with a predominant north-south cline.
The first step is to ensure your data are formatted correctly and to
be aware of any sample size issues. This is accomplished via
check_data(), which returns plots of the observations and
residuals.
library(growthbreaks)
data(simulated_data) ## from the package
head(simulated_data)
#> year age length lat long
#> 2 1 1 5.174880 56.97959 -156.2602
#> 3 1 1 6.532786 57.34694 -142.2419
#> 5 1 2 9.067383 56.61224 -151.8233
#> 7 1 2 13.178737 67.63265 -169.8984
#> 8 1 2 16.415712 55.14286 -169.2174
#> 9 1 2 19.372655 51.46939 -148.3869
p <- check_data(simulated_data, showPlot = FALSE)The first plot (p[[1]]) shows the input data, colored by
year:
The second two plots p[[2]];p[[3]] are maps of the
observations and simple residuals (observation minus age-specific mean).
The red colors are the highest values. We’d expect these to look similar
to one another.
Once you’ve completed that step you are ready to investigate
potential breakpoints in length-at-age via get_Breaks().
This example will use the default option axes = 0 which
looks for spatial breakpoints only. The function is ignorant of any
underlying structure in the data and is not fitting growth curves at
this time. If you keep the default settings you will get back plots of
the hypothesized breaks as well as a dataframe with the breakpoints.
Now we can pass our data to get_Breaks().
The ages_to_use argument allows you to specify a subset
of your age observations for which you’d like to test for breakpoints.
If you are unsure, you may choose to use age(s) that are well sampled in
your data, or all ages. However, you will want to include at least some
observations of small (young) fish, since discrepancies in size may be
less obvious for fish at or near their asymptotic length. Here I am
testing three ages and saving the output to a dataframe called
breakpoints.
breakpoints <- get_Breaks(dat = simulated_data,
ages_to_use = c(5,10,15),
sex = FALSE,
axes = 0,
showPlot = FALSE)
breakpoints$breakpoints
#> lat long detected_break freq
#> 87 65.63636 -141.4534 TRUE 0.3333333
plot_Breaks(dat=simulated_data, breakpoints$breakpoints, showData = TRUE)
As we might have expected based on the raw observations, the algorithm
detected a break at about \(66^{\circ}\)N, as well as an unexpected one
at \(~141^{\circ}\)W. We can see this
on the map as well as in the dataframe. The
count column
indicates the proportion of tested ages for which this breakpoint was
detected (in this case, 1/3, suggesting that no breakpoints were
detected for 2/3 ages). You may choose to stop here. If you’re
interested in some of the automated checking and curve fitting
functionality, proceed to the next step.
The function refit_Growth() can be used to generate
plots of the growth parameters and associated curves by splitting your
data into the spatio/temporal zones (“strata”) defined by
breakpoints. This can allow you to make inferences about a)
the magnitude and significance of differences in individual parameters
and b) the resultant impact on our perception of length-at-age.
It is common for growth curves to be very similar especially when the number of strata is large (>3), so user judgment is required to discern the best use of the information provided here.
Run the function with the breakpoints as-is and inspect the outputs.
fits1 <- refit_Growth(simulated_data, breakpoints$breakpoints, showPlot = FALSE)
names(fits1$split_tables) ## description of the strata used
#> [1] "lat_less_66_long_less_-141" "lat_greater_66_long_less_-141"
#> [3] "lat_less_66_long_greater_-141"
## split_tables contains your observed data broken up by strataWe can inspect the fitted curves in two ways: first, by visualizing
the $par_plot. The red points & error bars indicate
estimates for which the mean fell outside the confidence interval(s) for
other strata. This can be used to infer which components of the growth
curve are possibly contributing to detected differences among strata. It
doesn’t look like the second strata’s \(L_{\infty}\) nor \(k\) values are very different than the
other regions’.
The AIC of this first model is given by fits1$AIC:
1.4499293^{4}.
The second way to explore the results are by visualizing the fitted
curves against the strata-specific observations contained in
$fits_plot. Here we see that the second and third strata
(all data points below \(66^{\circ}\))N
exhhibit very similar curves, and there are not many data points in the
second strata to begin with.
Based on the exploration above, you may decide to manually combine
the second and third strata, using just a single breakpoint north to
south. In this case, we would do away with the breakpoint at \(-141^{\circ}\))W, replacing it with
-Inf.
breakpoints$breakpoints$long <- -Inf
fits2 <- refit_Growth(simulated_data,
breakpoints$breakpoints,
showPlot = FALSE) ## refit the curvesWe can repeat the visualization and see that 1) the key parameter
values are now more distinct from one another and 2) so are the
curves.The AIC of this updated model is given by fits2$AIC:
1.4496484^{4}, for a \(\Delta\)AIC of
2.8091856.