Gather raster and vector data from the source (local drive, Google Drive, or SFTP), clip to the
site boundary, and resample and align to the standard resolution. Data are copied from various
source locations (orthophotos, DEMs, canopy height models). Robust to crashes and interruptions:
cached datasets that are fully downloaded are used rather than re-downloaded, and processed rasters
won't be re-processed unless update = FALSE.
Usage
gather(
  site,
  pattern = "",
  update = TRUE,
  check = FALSE,
  field = TRUE,
  ignore_bad_classes = FALSE,
  replace_caches = FALSE,
  resources = NULL,
  local = FALSE,
  trap = TRUE,
  comment = NULL
)
Arguments
- site
One or more site names, using the 3-letter abbreviation. Use all to process all sites. In batch mode, each named site will be run in a separate job.
- pattern
Regex filtering rasters, case-insensitive. Default = "" (match all). Note: only files ending in .tif are included in any case. Examples:
  to match all Mica orthophotos, use mica_orth
  to match all Mica files from June, use Jun.*mica
  to match Mica files for a series of dates, use 11nov20.*mica|14oct20.*mica
- update
If TRUE, only process new files, assuming existing files are good; otherwise, process all files and replace existing ones.
- check
If TRUE, just check that the source directories and files exist, but don't cache or process anything.
- field
If TRUE, download and process the field transects if they don't already exist. The shapefile is downloaded for reference, and a raster corresponding to standard is created.
- ignore_bad_classes
If TRUE, don't throw an error if there are classes in the ground truth shapefile that don't occur in classes.txt. Only use this if you're paying careful attention, because bad classes will crash do_map down the line.
- replace_caches
If TRUE, all cached images (used for screen) are replaced.
- resources
Slurm launch resources. See launch. These take priority over the function's defaults.
- local
If TRUE, run locally; otherwise, spawn a batch run on Unity.
- trap
If TRUE, trap errors in local mode; if FALSE, use normal R error handling. Use this for debugging. If you get unrecovered errors, the job won't be added to the jobs database. Has no effect if local = FALSE.
- comment
Optional slurmcollie comment
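As a rough illustration of how pattern selects files, the sketch below applies case-insensitive regex matching with base R's grepl() to a few made-up file names. The file names and the select_rasters() helper are hypothetical and exist only for this example; the matching inside gather may differ in detail.

```r
# Hypothetical file names; real names depend on your source directories
files <- c(
  "11Nov20_OTH_Low_Mica_Ortho.tif",
  "14Oct20_OTH_High_Mica_Ortho.tif",
  "20Jun22_OTH_Mid_Mica_Ortho.tif",
  "20Jun22_OTH_Mid_SWIR_Ortho.tif",
  "20Jun22_OTH_notes.txt"
)

# Illustrative helper (not part of the package): keep files ending in .tif
# whose names match the pattern, ignoring case
select_rasters <- function(files, pattern = "") {
  tifs <- files[grepl("\\.tif$", files)]
  tifs[grepl(pattern, tifs, ignore.case = TRUE)]
}

select_rasters(files, "")                             # default: every .tif
select_rasters(files, "mica_orth")                    # all Mica orthophotos
select_rasters(files, "Jun.*mica")                    # Mica files from June
select_rasters(files, "11nov20.*mica|14oct20.*mica")  # Mica files for two dates
```

Trying a pattern against a vector of file names like this is a cheap way to confirm it selects what you expect before launching a batch run.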
Details
Additional parameters, set in the gather block in pars.yml (see init()):
- sourcedrive
One of local, google, sftp:
  local - read source from local drive
  google - get source data from currently connected Google Drive (login via browser on first connection) and cache it locally. Must set cachedir option.
  sftp - get source data from SFTP site. Must set sftp and cachedir options.
- sourcedir
Directory with source rasters, generally on Google Drive or SFTP site
- subdirs
Subdirectories to search, ending with slash. Default = orthos, DEMs, and canopy height models (okay to include empty or nonexistent directories). Use <site> in subdirectories that include a site name, e.g., <site> Share/Photogrammetry DEMs. WARNING: paths on the Google Drive are case-sensitive!
- transects
Directory with field transect shapefile
- exclude
List of geoTIFFs to exclude, for whatever reasons. Note that files beginning with bad are also excluded.
- sftp
list(url = <address of site>, user = <credentials>). Credentials are either username:password or *filename, where the file contains username:password. Make sure to include credential files in .gitignore and .Rbuildignore so they don't end up out in the world!
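To make the shape of these options concrete, here is a minimal sketch that mirrors the gather block as an R list and checks the constraints described above. Only the option names come from this page. Every value, and the helpers check_gather_pars(), site_subdirs(), and resolve_credentials(), are hypothetical and are not part of the package.

```r
# Hypothetical values throughout; only the option names come from the text
gather_pars <- list(
  sourcedrive = "sftp",
  sourcedir   = "data/source/",
  subdirs     = c("<site> Share/Photogrammetry DEMs/", "<site> Share/Orthos/"),
  transects   = "transects/",
  exclude     = c("old_ortho.tif"),
  sftp        = list(url = "sftp.example.org", user = "*sftp_credentials.txt")
)

# Illustrative check of the documented constraints
check_gather_pars <- function(p) {
  stopifnot(p$sourcedrive %in% c("local", "google", "sftp"))
  stopifnot(all(grepl("/$", p$subdirs)))          # subdirs end with a slash
  if (p$sourcedrive == "sftp")
    stopifnot(!is.null(p$sftp$url), !is.null(p$sftp$user))
  invisible(TRUE)
}

# Substitute a site name for the <site> placeholder
site_subdirs <- function(p, site) gsub("<site>", site, p$subdirs, fixed = TRUE)

# Credentials are either "username:password" or "*filename" pointing to a
# file whose first line is "username:password"
resolve_credentials <- function(user) {
  if (startsWith(user, "*")) {
    readLines(sub("^\\*", "", user), n = 1, warn = FALSE)
  } else {
    user
  }
}

check_gather_pars(gather_pars)
site_subdirs(gather_pars, "Example Marsh")

# Demonstrate the *filename form with a throwaway credentials file
cred_file <- tempfile(fileext = ".txt")
writeLines("someuser:somepassword", cred_file)
resolve_credentials(paste0("*", cred_file))
```

Keeping credentials in a separate file referenced with the *filename form means pars.yml itself can be shared without exposing passwords.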
Source data:
- geoTIFFs for each site
- sitesfile, table of site abbreviation, site name, footprint shapefile, raster standard, and transect shapefile
Results:
- flights/ geoTIFFs, clipped, resampled, and aligned. Make sure you've closed ArcGIS/QGIS projects that point to these before running!
- models/gather_data.log
All source data are expected to be in EPSG:4326. Non-conforming rasters will be reprojected.
sites.txt must include the name of the footprint shapefile for each site, a field transect
shapefile, and a standard geoTIFF for each site. The footprint is used for clipping and must be
present. The transect contains ground truth data, and must be present if field = TRUE. The
standard must be present. It is used as the standard for grain and alignment; all rasters will be
resampled to match. Standards MUST be in the standard projection, EPSG:4326. Best to use a Mica
orthophoto, with 8 cm resolution.
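The role of the standard and the footprint can be sketched with synthetic data. This example assumes the terra package, which this page does not name, so treat it as an illustration of resampling to a standard grid and clipping to a footprint rather than as gather's actual implementation. All rasters, extents, and cell sizes are made up.

```r
library(terra)  # assumed for this sketch; gather may use a different library

# Synthetic "standard": 0.02-degree cells over a unit square
standard <- rast(nrows = 50, ncols = 50, xmin = 0, xmax = 1, ymin = 0, ymax = 1,
                 crs = "EPSG:4326")
values(standard) <- 1

# Synthetic source raster: coarser grid covering a larger extent
src <- rast(nrows = 20, ncols = 20, xmin = -0.5, xmax = 1.5, ymin = -0.5, ymax = 1.5,
            crs = "EPSG:4326")
values(src) <- seq_len(ncell(src))

# Hypothetical site footprint as a rectangular polygon
footprint <- vect(ext(0.2, 0.8, 0.2, 0.8), crs = "EPSG:4326")

# Resample onto the standard's grid, then clip to the footprint
aligned <- resample(src, standard, method = "bilinear")
clipped <- mask(crop(aligned, footprint), footprint)

res(clipped)                   # cell size now matches the standard
compareGeom(aligned, standard) # same extent, resolution, and CRS
```

Because every raster is resampled onto the same grid, cells line up across all layers for a site, which is why changing the standard after the fact breaks an existing stack.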
Note that adding to an existing stack using a different standard will lead to sorrow. BEST
PRACTICE: don't change the standards in sites.txt; if you must change them, clear the
flights/ directory and rerun.
If you're reading from the Google Drive or SFTP, you'll need a cache. Best to put this on the
Unity scratch drive. Create it with ws_allocate cache 30 in the Unity shell. You can extend
the scratch drive (up to 5 times) with ws_extend cache 30. When you're done with it, be polite
and release it with ws_release cache. You'll need to point to the cache in ~/pars.yml, under
scratchdir:.
Note that the first run with Google Drive in a session opens the browser for authentication or waits for input from the console, so don't run blindly when using Google Drive.
At the end of a run, the log file will be copied to the flights directory.
Remember that some SFTP servers require connection via VPN.
Example runs:
Complete for all sites:
Run for one site, June only:
Run for 2 sites, low tide only: