Function components
In R, function() is itself a function, and every R
function has three parts: the formals(), the
body(), and the environment(). Since we are
creating an R package, the majority of our functions will live in the
package environment (as opposed to the global environment). Because an R
function returns the value of its last evaluated expression, we will not
call return() at the end of our functions. The function formals() are its
arguments: variables, TRUE/FALSE flags, or NULL values.
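The three parts can be inspected directly. This is a minimal sketch; add_tax() is a hypothetical function made up for illustration and is not part of the package.

```r
# Hypothetical example function, used only to inspect its three components.
add_tax <- function(price, rate = 0.1) {
  price * (1 + rate)
}

formals(add_tax)         # argument list: price (no default) and rate = 0.1
body(add_tax)            # the code between the braces
environment(add_tax)     # where the function was defined
names(formals(add_tax))  # just the argument names
```

When this is run at the console the environment is the global environment; for a function defined inside a package it is the package namespace.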
If a variable must have a value, then it should be named, either without
a default or with one (e.g., year or year = 2017). If a variable
will be TRUE or FALSE, then set its default to the most
often used state (e.g., by_strata = FALSE). If the function
depends on a variable that may or may not be supplied, then default it to
NULL (e.g., alt_file = NULL).
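The three argument styles described above can be sketched in one function. The argument names come from the examples in the text; summarise_catch() and the strings it returns are hypothetical and only illustrate the pattern.

```r
# Hypothetical function illustrating the three argument styles.
summarise_catch <- function(year, by_strata = FALSE, alt_file = NULL) {
  # `year` has no default, so the caller must supply it.
  # `alt_file` is optional: fall back to a default source when it is NULL.
  src <- if (is.null(alt_file)) "default file" else alt_file
  # `by_strata` is a flag that defaults to its most commonly used state.
  grp <- if (isTRUE(by_strata)) "by strata" else "overall"
  paste(year, grp, "from", src)
}

summarise_catch(2017)
summarise_catch(2017, by_strata = TRUE, alt_file = "alt.csv")
```

Testing an optional argument with is.null() keeps the "not supplied" case explicit inside the function body.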
What does leaving out return() look like in practice?
Do this:
square <- function(x) {
  x * x
}
Not this:
square <- function(x) {
  s <- x * x
  return(s)
}
We will use roxygen2 documentation - a standard function will look
something like:
#' query fishery catch data
#'
#' @param year max year to retrieve data from
#' @param species species group code e.g., "DUSK"
#' @param area sample area "GOA", "AI" or "BS" (or combos)
#' @param db data server to connect to
#' @param save saves a file to the data/raw folder, otherwise sends output to the global environment (default: TRUE)
#'
#' @export
#'
#' @examples
#' \dontrun{
#' q_catch(year = 2022, species = 'NORK', area = 'GOA', db = akfin, save = TRUE)
#' }
q_catch <- function(year, species, area, db, save = TRUE) {
# function code...
}
We will use informative error messages throughout our functions -
this can be done using the stop() function.
q_catch <- function(year, species, area, db, save = TRUE) {
  area <- toupper(area)
  # all() lets a combination such as c("BS", "AI") pass through if()
  if (!all(area %in% c("GOA", "AI", "BS"))) {
    stop("the area name is incorrect, it must be 'GOA', 'BS', 'AI', or a combo e.g., c('BS', 'AI')")
  }
}
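To show how such an error reaches the caller, here is a minimal sketch of the same check as a stand-alone validator. check_area() is a hypothetical helper, not a function from the package. Naming the offending value in the message and using call. = FALSE are choices made for this example.

```r
# Hypothetical stand-alone validator mirroring the area check in q_catch().
check_area <- function(area) {
  area <- toupper(area)
  valid <- c("GOA", "AI", "BS")
  bad <- setdiff(area, valid)  # any values that are not valid area codes
  if (length(bad) > 0) {
    stop("unrecognized area: ", paste(bad, collapse = ", "),
         "; must be one or more of 'GOA', 'AI', 'BS'", call. = FALSE)
  }
  area
}

check_area(c("goa", "bs"))  # valid combination, returned in upper case

# tryCatch() captures the error so the message can be inspected
msg <- tryCatch(check_area("ebs"), error = function(e) conditionMessage(e))
msg
```

Reporting which value failed tells the user what to fix without reading the function source.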
When building an R package that uses functions from other packages,
those functions must be called explicitly using the package::function() notation:
fsc_table <- function(year, folder){
options(scipen = 999)
fsc = vroom::vroom(here::here(year, "data", "output", "fish_size_comp.csv"))
fsc %>%
tidytable::select(n_s, n_h) %>%
t(.) %>%
as.data.frame() %>%
tibble::rownames_to_column("name") -> samps
fsc %>%
tidytable::select(-n_s, -n_h, -AA_Index) %>%
tidytable::pivot_longer(-year) %>%
tidytable::pivot_wider(names_from = year, values_from = value, names_prefix = "y") %>%
as.data.frame() %>%
tidytable::mutate(tidytable::across(tidytable::where(is.numeric), round, digits = 4)) %>%
tidytable::mutate(name = gsub("X", "", name),
name = ifelse(tidytable::row_number() == tidytable::n(),
paste0(name, "+"), name )) %>%
tidytable::rename_with(~stringr::str_replace(., "y", "")) -> comp
names(samps) <- base::names(comp)
tidytable::bind_rows(comp, samps) %>%
vroom::vroom_write(here::here(year, folder, "tables", "tbl_10_07.csv"), delim = ",")
}
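To run this function, five packages must be installed
(vroom, here, tidytable, tibble, stringr). Note that
base is also called explicitly, although this is not technically necessary
because base is always available. The %>% pipe is not prefixed with a
package name, so it must also be imported into the package namespace.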
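One rough way to audit which packages a function relies on is to scan its code for pkg:: prefixes and check whether each package is installed. This is a sketch using base R only. pkgs_used() and pkgs_missing() are hypothetical helpers, and the regular expression is a simple heuristic that would also pick up a "pkg::" pattern inside a string.

```r
# Hypothetical helper: list the packages a function references via pkg::fun.
pkgs_used <- function(f) {
  code <- paste(deparse(f), collapse = "\n")
  # match the identifier that sits immediately before each `::`
  hits <- regmatches(code, gregexpr("[A-Za-z][A-Za-z0-9.]*(?=::)", code, perl = TRUE))[[1]]
  sort(unique(hits))
}

# Hypothetical helper: return the packages that are not installed.
pkgs_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Example function that uses two packages shipped with R
demo_fun <- function(x) {
  utils::head(stats::na.omit(x))
}

pkgs_used(demo_fun)
pkgs_missing(pkgs_used(demo_fun))
```

The same list is a useful reminder of what needs to be declared as a dependency of the package.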
The packages can be added to the R package using the
usethis package:
usethis::use_package("vroom")
This will add the package to the "Imports" field of the DESCRIPTION
file. Once the packages are added, run
devtools::document() to update the package documentation and NAMESPACE.
We want to keep the number of dependencies to a minimum.