Given one or more sites and a model specification, builds a model of vegetation cover and reports model assessment.
Usage
fit(
site = NULL,
datafile = "data",
name = "",
method = "rf",
vars = "{*}",
exclude_vars = "",
exclude_classes = NULL,
reclass = c(13, 2),
max_samples = NULL,
years = NULL,
minscore = 0,
maxmissing = 20,
max_miss_train = 0.2,
top_importance = 20,
holdout = 0.2,
auc = FALSE,
hyper = NULL,
resources = NULL,
local = FALSE,
trap = TRUE,
comment = NULL
)
Arguments
- site
Three-letter site code, or a vector of site names if fitting multiple sites
- datafile
Name of data file. It must be an .RDS file, but exclude the extension. If fitting multiple sites, either use a single datafile name shared among sites, or a vector matching site.
- name
Optional model name
- method
One of rf for Random Forest, boost for AdaBoost. Default = rf.
- vars
Vector of variables to restrict analysis to. Default = {*}, all variables. vars is processed by find_orthos, and may include file names, portable names, search names and regular expressions of file and portable names.
- exclude_vars
An optional vector of variables to exclude. As with vars, variables are processed by find_orthos.
- exclude_classes
Numeric vector of subclasses to exclude
- reclass
Vector of paired classes to reclassify, e.g., reclass = c(13, 2, 3, 4) would reclassify all 13s to 2 and 3s to 4, lumping each pair of classes.
- max_samples
Maximum number of samples to use - subsample if necessary
- years
Vector of years to restrict variables to
- minscore
Minimum score for orthos. Files with a minimum score of less than this are excluded from results. Default is 0, but rejected orthos are always excluded.
- maxmissing
Maximum percent missing in orthos. Files with percent missing greater than this are excluded.
- max_miss_train
Maximum proportion of missing training points allowed before a variable is dropped
- top_importance
Number of variables to keep for variable importance
- holdout
Proportion of points to hold out. For Random Forest, this specifies the size of the single validation set, while for boosting, it is the size of each of the testing and validation sets.
- auc
If TRUE, calculate class probabilities so we can calculate AUC
- hyper
Hyperparameters. To be defined.
- resources
Slurm launch resources. See launch. These take priority over the function's defaults.
- local
If TRUE, run locally; otherwise, spawn a batch run on Unity
- trap
If TRUE, trap errors in local mode; if FALSE, use normal R error handling. Use this for debugging. If you get unrecovered errors, the job won't be added to the jobs database. Has no effect if local = FALSE.
- comment
Optional launch / slurmcollie comment
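
The pairing used by reclass can be illustrated with a small base R sketch. This is not the package's implementation: apply_reclass is a hypothetical helper, and the sketch assumes that each pair is ordered (from, to), as in the default c(13, 2), and that pairs are matched against the original class codes rather than chained.

```r
# Hypothetical helper (not part of the package): apply reclass-style
# (from, to) pairs to a vector of class codes.
apply_reclass <- function(classes, reclass) {
  # reclass must hold complete pairs: from1, to1, from2, to2, ...
  stopifnot(length(reclass) %% 2 == 0)
  pairs <- matrix(reclass, ncol = 2, byrow = TRUE)
  out <- classes
  for (i in seq_len(nrow(pairs))) {
    # Assumption: match against the original codes, so an earlier pair
    # never feeds into a later one
    out[which(classes == pairs[i, 1])] <- pairs[i, 2]
  }
  out
}

classes <- c(1, 13, 2, 7, 13, 5)
apply_reclass(classes, c(13, 2))        # 1 2 2 7 2 5
apply_reclass(classes, c(13, 2, 7, 5))  # 1 2 2 5 2 5
```

An odd-length vector stops with an error in this sketch, since a trailing class code would have no partner to be lumped with.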