setupProject calls a sequence of functions in this order: setupOptions (first time), setupPaths, setupModules, setupPackages, setupSideEffects, setupOptions (second time), setupParams, setupGitIgnore, and setupRestart.
This sequence will create folder structures, install missing packages from those listed in either the packages or require arguments or in the modules' reqdPkgs fields, load packages (only those in the require argument), set options, and download or confirm the existence of modules. It will also return elements that can be passed directly to simInit or simInitAndSpades, specifically modules, params, paths, times, and any named elements passed to .... If desired, this function will also change the .Rprofile file for this project so that every time the project is opened, it has a specific .libPaths().
There are a number of convenience elements described in the Details section below.
Because of this sequence, users can take advantage of settings (i.e., objects) that are created before others. For example, users can set paths, then use the paths list to set options that can update or change paths, or set times and use the times list for certain entries in params.
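The ordering idea can be illustrated in base R. The following is a minimal sketch, not the package's implementation: evalInOrder is a hypothetical helper that evaluates each named argument in turn, in an environment that already holds the earlier results. This is the kind of mechanism that lets a later argument such as params refer to times.

```r
# Hypothetical helper (not part of SpaDES.project): evaluate named arguments
# one at a time, so each expression can see the results of earlier ones.
evalInOrder <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # unevaluated argument expressions
  env <- new.env(parent = parent.frame())
  for (nm in names(exprs)) {
    assign(nm, eval(exprs[[nm]], envir = env), envir = env)
  }
  mget(names(exprs), envir = env)
}

out <- evalInOrder(
  paths  = list(projectPath = "myProject"),
  times  = list(start = 2000, end = 2010),
  params = list(mod = list(startTime = times$start))  # uses the earlier `times`
)
out$params$mod$startTime
```

In ordinary R argument matching, times would not exist when params is evaluated; capturing the expressions first and evaluating them in order is what makes the reference work in this sketch.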
setupProject(
name,
paths,
modules,
packages,
times,
options,
params,
sideEffects,
config,
require = NULL,
studyArea = NULL,
Restart = getOption("SpaDES.project.Restart", FALSE),
useGit = getOption("SpaDES.project.useGit", FALSE),
setLinuxBinaryRepo = TRUE,
standAlone = TRUE,
libPaths = NULL,
updateRprofile = getOption("Require.updateRprofile", FALSE),
overwrite = FALSE,
verbose = getOption("Require.verbose", 1L),
defaultDots,
dots,
...
)
name
Optional. If supplied, the name of the project. If not supplied, an attempt will be made to extract the name from paths[["projectPath"]]. If this is a GitHub project, then it should indicate the full GitHub repository and branch name, e.g., "PredictiveEcology/WBI_forecasts@ChubatyPubNum12".
paths
A list with named elements, specifically modulePath, projectPath, packagePath, and all others that are in SpaDES.core::setPaths() (i.e., inputPath, outputPath, scratchPath, cachePath, rasterTmpDir). Each of these has a sensible default, which will be overridden by any user-supplied values. See setup.
modules
A character vector of modules to pass to getModule. Each should be one of: a simple name (e.g., fireSense), which will be searched for locally in paths[["modulePath"]]; a GitHub repo with branch (GitHubAccount/Repo@branch, e.g., "PredictiveEcology/Biomass_core@development"); or a character vector that identifies one or more .R file(s) (local or GitHub; the file extension is not optional) to parse, which will produce a character vector assigned to the name "modules". If the entire project is a git repository, then it will not try to re-get these modules; instead it will rely on the user managing their git status outside of this function. See setup.
packages
Optional. A vector of packages that must exist in the libPaths. This will be passed to Require::Install, i.e., these will be installed, but not attached to the search path. See also the require argument. To force skipping of package installation (without assessing modules), set packages = NULL.
times
Optional. This will be returned if supplied; if supplied, the values can be used in, e.g., params, as in params = list(mod = list(startTime = times$start)). See help for SpaDES.core::simInit.
options
Optional. Either a named list to be passed to options or a character vector indicating one or more file(s) to source, in the order provided. These will be parsed locally (not in the .GlobalEnv), so they will not create globally accessible objects. NOTE: options is run twice within setupProject, once before setupPaths and once after setupPackages. This occurs because many packages use options for their behaviour (so they need to be set before, e.g., Require::require is run), but many packages also change options at startup. See Details. See setup.
params
Optional. Similar to options; however, this named list will be returned, i.e., there are no side effects. See setup.
sideEffects
Optional. This can be an expression, one or more file names, or a code chunk surrounded by {...}. If a non-text file name is specified (i.e., currently anything other than .txt or .R), these files will simply be downloaded, using their relative path as specified in the GitHub notation. They will be downloaded or accessed locally at that relative path. If these file names represent scripts (.txt or .R), they will be parsed and evaluated, but nothing is returned (i.e., any assigned objects are not returned). This is intended for operations like cloud authentication or configuration functions that are run for their side effects only.
config
Still-experimental linkage to the SpaDES.config package. Currently not working.
require
Optional. A character vector of packages to install and attach (with Require::Require). These will be installed and attached at the start of setupProject so that a user can use them during setupProject. See setup.
studyArea
Optional. If a list, it will be passed to geodata::gadm. To specify a country other than the default "CAN", the list must have a named element, "country". All other named elements will be passed to gadm. Two additional named elements can be passed for convenience: subregion = "...", which will be grepped against the column NAME_1, and epsg = "...", so a user can pass an epsg.io code to reproject the studyArea. See examples.
Restart
If the projectPath is not the current path, and the session is in RStudio and interactive, it will create an RStudio Project file (and .Rproj.user folder), then restart in a new RStudio session with that new project and with the root path (i.e., working directory) set to projectPath. Default is FALSE, in which case no RStudio Project is created.
useGit
A logical. If TRUE, it will use git clone and git checkout to get and change branch for each module, according to its specification in modules. Otherwise it will download modules with getModule. NOTE: CREATING A GIT REPOSITORY AT THE PROJECT LEVEL AND SETTING MODULES AS GIT SUBMODULES IS NOT YET IMPLEMENTED. IT IS FINE IF THE PROJECT HAS BEEN MANUALLY SET UP TO BE A GIT REPOSITORY WITH SUBMODULES: THIS FUNCTION WILL ONLY EVALUATE PATHS. This can be set with options(SpaDES.project.useGit = xxx).
setLinuxBinaryRepo
Logical. Should the binary RStudio Package Manager be used on Linux (ignored on Windows)?
standAlone
A logical. Passed to Require::standAlone. If TRUE, this keeps all packages installed in a project-level library. Default is TRUE.
libPaths
Deprecated. Use paths = list(packagePath = ...).
updateRprofile
Logical. Should paths$packagePath be set in the .Rprofile file for this project? Note: if paths$packagePath is within tempdir(), there will be a warning indicating that this won't persist. If the user is using RStudio and paths$projectPath is not the root of the current RStudio project, a warning will be given indicating that the .Rprofile may not be read upon restart.
overwrite
Logical vector or character vector; however, only getModule will respond to a vector of values. If length-one TRUE, then all files that were previously downloaded will be overwritten throughout the sequence of setupProject. If a vector of logical or character, it will be passed to getModule: only the named modules, or the modules selected by the logical vector, will be overwritten. NOTE: if a vector, no other file specified anywhere in setupProject will be overwritten except the module(s) it identifies, because only setupModules is currently responsive to a vector. For fine-grained control, a user can manually delete a file, then rerun.
verbose
Numeric or logical indicating how verbose the function should be. If -1 or -2, then as little verbosity as possible. If 0 or FALSE, then minimal outputs; if 1 or TRUE, more outputs; 2, even more. NOTE: in the Require function, when verbose >= 2, the return object will have an attribute, attr(.., "Require"), which has lots of information about the processes of the installs.
defaultDots
A named list of arbitrary R objects. These can be supplied to give default values to objects that are otherwise passed in with the ..., i.e., not specifically named for these setup* functions. If named objects are supplied as top-level arguments, then the corresponding defaultDots values will be overridden. This can be particularly useful if the arguments passed to ... do not always exist, but rely on external (e.g., batch) processing to optionally fill them. See examples.
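The override rule for defaults can be sketched in base R. This is an illustration under stated assumptions rather than the package's code: resolveDots is a hypothetical helper, and utils::modifyList stands in for whatever merging setupProject performs internally.

```r
# Hypothetical helper: values supplied by name replace the matching defaults;
# defaults without a supplied counterpart are kept.
resolveDots <- function(defaultDots = list(), ...) {
  supplied <- list(...)
  utils::modifyList(defaultDots, supplied)
}

resolved <- resolveDots(
  defaultDots = list(mode = "development", studyAreaName = "default"),
  studyAreaName = "AB"   # supplied at the top level, so it wins
)
resolved$mode
resolved$studyAreaName
```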
dots
Any other named objects, passed as a list, that a user might want for other elements.
...
Further named arguments that act like objects, but are a different way to specify them. These can be anything. The general use case is to create the objects that would be passed to SpaDES.core::simInit or SpaDES.core::simInitAndSpades (e.g., studyAreaName or objects), or additional objects to be passed to the simulation (in older versions of SpaDES.core, these were passed as a named list to the objects argument). Order matters. These are evaluated sequentially, and any that are specified before the named arguments (e.g., name, paths) will be evaluated prior to any of the named arguments, i.e., "at the start" of setupProject. If placed after the first named argument, they will be evaluated at the end of setupProject, so they can access all the packages, objects, etc.
setupProject will return a named list with elements modules, paths, params, and times. The goal of this list is to contain elements that can be passed directly to simInit. It will also append all elements passed by the user in the .... This list can be passed directly to SpaDES.core::simInit() or SpaDES.core::simInitAndSpades() using do.call(). See example.
NOTE: both projectPath and packagePath will be omitted from the paths list, as they are used to set the current directory (found with getwd()) and .libPaths()[1], but are not accepted by simInit. setupPaths will still return these two paths, as its outputs are not expected to be passed directly to simInit (unlike setupProject outputs).
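The do.call() pattern can be tried without installing anything by using a stand-in. In the sketch below, fakeSimInit is a hypothetical placeholder with a simplified signature (it is not SpaDES.core::simInit), and out is a hand-built list shaped like the return value described above.

```r
# Hypothetical stand-in for simInit(); the real function does far more.
fakeSimInit <- function(times, params, modules, paths, ...) {
  extras <- list(...)  # anything the user passed through `...`
  list(nModules = length(modules), extraNames = names(extras))
}

# A hand-built list with the documented element names plus one extra element
out <- list(
  modules = c("modA", "modB"),
  paths = list(modulePath = "modules"),
  params = list(),
  times = list(start = 0, end = 10),
  studyAreaName = "demo"
)

sim <- do.call(fakeSimInit, out)  # each list element becomes a named argument
sim$nModules
sim$extraNames
```

With SpaDES.core installed, the equivalent call on a real setupProject result would be do.call(SpaDES.core::simInit, out).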
The overarching objectives for these functions are:
To prepare what is needed for simInit.
To help a user eliminate virtually all assignments to the .GlobalEnv, as these create and encourage spaghetti code that becomes unreproducible as the project increases in complexity.
To be very simple for beginners, but powerful enough to expand to almost any needs of arbitrarily complex projects, using the same structure.
To deal with the complexities of R package installation and loading when working with modules that may have been created by many users.
To create a common SpaDES project structure, allowing easy transition from one project to another, regardless of complexity.
Throughout these functions, efforts have been made to implement sequential evaluation, within files and within lists. This means that a user can use the values from an upstream element in the list. For example, in the following valid call, projectPath is part of the list that will be assigned to the paths argument and is then used in the subsequent list element:
setupPaths(paths = list(projectPath = "here",
modulePath = file.path(paths[["projectPath"]], "modules")))
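Plain list() would fail here because paths does not exist yet when modulePath is evaluated. A minimal base R sketch of how element-by-element evaluation can make it work follows; evalListSequentially is a hypothetical helper written for illustration, not the function SpaDES.project uses.

```r
# Hypothetical helper: evaluate the elements of a list() call one at a time,
# exposing the partially built list under `name` to each later element.
evalListSequentially <- function(expr, name) {
  elems <- as.list(substitute(expr))[-1]  # drop the leading `list` symbol
  env <- new.env(parent = parent.frame())
  acc <- list()
  for (nm in names(elems)) {
    assign(name, acc, envir = env)        # what has been built so far
    acc[[nm]] <- eval(elems[[nm]], envir = env)
  }
  acc
}

paths <- evalListSequentially(
  list(projectPath = "here",
       modulePath = file.path(paths[["projectPath"]], "modules")),
  name = "paths"
)
paths$modulePath
```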
Because of such sequential evaluation, paths, options, and params files can be sequential lists that impose a hierarchy specified by the order. For example, a user can first create a list of default options, then several lists of user-desired options behind an if (user("emcintir")) block that add new or override existing elements, followed by machine-specific values, such as paths.
setupOptions(
maxMemory <- 5e+9 # if (grepl("LandWeb", runName)) 5e+12 else 5e+9
# Example -- Use any arbitrary object that can be passed in the `...` of `setupOptions`
# or `setupProject`
if (.mode == "development") {
list(test = 2)
}
if (machine("A127")) {
list(test = 3)
}
)
The arguments paths, options, and params can all understand lists of named values, character vectors, or a mixture, by using a list where named elements are values and unnamed elements are character strings/vectors. Any unnamed character string/vector will be treated as a file path. If that file path has an @ symbol, it will be assumed to be a file that exists in a GitHub repository on https://github.com. So a user can pass values, or pointers to remote and/or local paths that themselves have values.
The following will set an option as declared, plus read the local file (with relative path), plus download and read the cloud-hosted file.
setupProject(
  options = list(reproducible.useTerra = TRUE,
                 "inst/options.R",
                 "PredictiveEcology/SpaDES.project@transition/inst/options.R")
)
This approach allows for an organic growth of complexity, e.g., a user begins with only named lists of values, but then as the number of values increases, it may be helpful to put some in an external file.
NOTE: if the GitHub repository is private the user must configure their GitHub
token by setting the GITHUB_PAT environment variable -- unfortunately, the usethis
approach to setting the token will not work at this moment.
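The rule for telling values apart from file pointers can be sketched in a few lines of base R. This is only an illustration of the rule as described above (named element means value; unnamed string means file; an @ means GitHub); classifyEntries is a hypothetical helper and performs no downloading or parsing.

```r
# Hypothetical helper: label each element of a mixed list according to the
# rule described above.
classifyEntries <- function(x) {
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  vapply(seq_along(x), function(i) {
    if (nzchar(nms[i])) {
      "value"
    } else if (is.character(x[[i]]) && any(grepl("@", x[[i]], fixed = TRUE))) {
      "github file"
    } else if (is.character(x[[i]])) {
      "local file"
    } else {
      "other"
    }
  }, character(1))
}

opts <- list(reproducible.useTerra = TRUE,
             "inst/options.R",
             "PredictiveEcology/SpaDES.project@transition/inst/options.R")
classifyEntries(opts)
```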
paths, options, params
If paths, options, and/or params are a character string or character vector (or part of an unnamed list element), the string(s) will be interpreted as files to parse. These files should contain R code that specifies named lists, where the names are one or more paths or options, or are module names, each with a named list of parameters for that named module. This last named list for params follows the convention used for the params argument in simInit(..., params = ).
These files can use paths, times, plus any previous list in the sequence of params or options specified. Any functions that are used must be available, e.g., prefixed as in Require::normPath if the package has not been loaded (as recommended).
If passing a file to options, it should not set options() explicitly; it should only create named lists. This enables options checking/validating to occur within setupOptions and setupParams. The simplest case would be a file with this: opts <- list(reproducible.destinationPath = "~/destPath").
All named lists will be parsed into their own environment, and then will be sequentially evaluated (i.e., subsequent lists will have access to previous lists), with each named element setting or replacing the previously named element of the same name, creating a single list. This final list will be assigned to, e.g., options() inside setupOptions.
Because each list is parsed separately, they do not need to be assigned to objects; if they are, the object name can be any name, even if it matches another object's name used to build the same argument's (i.e., paths, params, options) final list. Hence, in a file to be passed to options, instead of incrementing the list as:
a <- list(optA = 1)
b <- append(a, list(optB = 2))
c <- append(b, list(optC = 2.5))
d <- append(c, list(optD = 3))
one can do:
a <- list(optA = 1)
a <- list(optB = 2)
c <- list(optC = 2.5)
list(optD = 3)
NOTE: only atomics (i.e., character, numeric, etc.), named lists, or either of these that are protected by 1 level of "if" are parsed. This will not work, therefore, for other side-effect elements, like authenticating with a cloud service.
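The merging behaviour can be approximated in base R to see why the object names do not matter. The sketch below is an assumption-laden illustration, not the package's parser: it evaluates each top-level expression of a file's text in turn and folds every resulting list into one with utils::modifyList.

```r
# Text standing in for the contents of a file passed to `options`
fileText <- '
a <- list(optA = 1)
a <- list(optB = 2)
c <- list(optC = 2.5)
list(optD = 3)
'

exprs <- parse(text = fileText)  # one entry per top-level expression
env <- new.env()
final <- list()
for (e in exprs) {
  val <- eval(e, envir = env)    # an assignment also returns its value
  if (is.list(val)) final <- utils::modifyList(final, val)
}
names(final)
```

Each list contributes its elements even though a is reused and the last list is never assigned; a later element with the same name as an earlier one would replace it.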
Several helper functions exist within SpaDES.project that may be useful, such as user(...) and machine(...).
To allow for batch submission, a user can specify argument = value even if value is missing. This type of specification will not work in normal parsing of arguments, but it is designed to work here. For example, .mode = .mode can be specified, and if R cannot find .mode for the right-hand side, it will just skip with no error. Thus a user can source a script containing such a call from a batch script where .mode is specified; when running that call without the batch script specification, no value will be assigned to .mode. Including .nodes as well shows an example of passing a value that does exist. The non-existent .mode will be returned in the output (out), but as an unevaluated, captured list element.
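One way such tolerant argument handling can be built in base R is sketched below. captureDots is a hypothetical helper, not the SpaDES.project implementation: it captures each argument unevaluated, tries to evaluate it in the caller's environment, and keeps the unevaluated expression when the object cannot be found.

```r
# Hypothetical helper: evaluate each argument if possible, otherwise keep the
# captured, unevaluated expression instead of raising an error.
captureDots <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  envir <- parent.frame()
  lapply(exprs, function(e) {
    tryCatch(eval(e, envir = envir), error = function(err) e)
  })
}

.nodes <- 2  # exists; `.mode` is deliberately left undefined
out <- captureDots(.mode = .mode, .nodes = .nodes)
is.name(out$.mode)  # the missing object is kept as an unevaluated symbol
out$.nodes
```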
setupPaths(), setupOptions(), setupPackages(), setupModules(), setupGitIgnore(). Also, helpful functions such as user(), machine(), node().
## For more examples:
vignette("i-getting-started", package = "SpaDES.project")
#> Warning: vignette 'i-getting-started' not found
library(SpaDES.project)
origDir <- getwd()
tmpdir <- Require::tempdir2() # for testing tempdir2 is better
if (is.null(getOption("repos"))) {
  options(repos = c(CRAN = "https://cloud.r-project.org"))
}
setwd(tmpdir)
## simplest case; just creates folders
out <- setupProject(
paths = list(projectPath = ".") #
)
#> setting up paths ...
#> Copying SpaDES.project, data.table, Require, rprojroot packages to paths$packagePath (C:/Users/emcintir/AppData/Roaming/R/data/R/Require/packages/x86_64-w64-mingw32/4.3)
#> Setting:
#> options(
#> reproducible.cachePath = 'C:/Users/emcintir/AppData/Local/Temp/RtmpOovltm/Require/cache'
#> spades.inputPath = 'C:/Users/emcintir/AppData/Local/Temp/RtmpOovltm/Require/inputs'
#> spades.outputPath = 'C:/Users/emcintir/AppData/Local/Temp/RtmpOovltm/Require/outputs'
#> spades.modulePath = 'C:/Users/emcintir/AppData/Local/Temp/RtmpOovltm/Require/modules'
#> spades.scratchPath = 'C:/Users/emcintir/AppData/Local/Temp/RtmpOovltm/Require'
#> )
#> done setting up paths
#> no packages to set up
#> .libPaths() are: C:/Users/emcintir/AppData/Roaming/R/data/R/Require/packages/x86_64-w64-mingw32/4.3, C:/Program Files/R/R-4.3.1/library
setwd(origDir)