Releases: weecology/portalcasting
portalcasting v0.15.2
Shifting to the GitHub version of portalr
- Addresses the backwards incompatibility between the CRAN and GitHub versions of portalr; the newest (GitHub) version is needed because of the break in portalData.
portalcasting v0.15.1
patch bump to try zenodo build again (the zenodo build for v0.15.0 failed)
portalcasting v0.15.0
JAGS vignette
- Added a vignette that describes how to use the JAGS/runjags API within portalcasting.
- addresses
Pulls code for match.call.defaults into the package
- Use of it from DesignLibrary causes a problematic dependency chain with the docker image building.
Patch bug in most_abundant_species
- Wasn't using the species name function, and so was pulling in the traps column, which was causing a break in plotting.
portalcasting v0.14.0
(done retroactively from the branch where this was done)
Adds exclosure data to the prefab models
portalcasting v0.13.0
Full writing of control_files in model scripts
- Previously, the controls list for the files in the model scripts was taken from the environment in which the script was run, which opens the script to everything, which is undesirable.
- After the need to include a control list for runjags models forced an explicit writing of the list inputs, the code was available to transfer to the files control list.
- This does mean that the function calls in the scripts are now super long and explicit, but that's ok.
- To avoid super long model script lines (where even default inputs are repeated in the list functions), a function control_list_arg was made to generalize what was coded up from the runjags list for use also with the files control list. This function writes a script component that only includes arguments to the list function that are different from the formal definition.
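The idea behind control_list_arg can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `files_control_sketch` and `control_list_arg_sketch` are hypothetical names, and the default values are made up.

```r
# Hypothetical stand-in for a controls-list function with default inputs.
files_control_sketch <- function(directory = "portalPredictions",
                                 save = TRUE,
                                 overwrite = TRUE) {
  list(directory = directory, save = save, overwrite = overwrite)
}

# Write the text of a call to `list_function`, keeping only the arguments
# whose values differ from the function's formal defaults.
control_list_arg_sketch <- function(control_list, list_function) {
  defaults <- formals(get(list_function, mode = "function"))
  changed <- character(0)
  for (arg in names(control_list)) {
    default_value <- eval(defaults[[arg]])
    if (!identical(control_list[[arg]], default_value)) {
      # deparse() turns the value back into R source text
      changed <- c(changed,
                   paste0(arg, " = ", deparse(control_list[[arg]])))
    }
  }
  paste0(list_function, "(", paste(changed, collapse = ", "), ")")
}

controls <- files_control_sketch(save = FALSE)
control_list_arg_sketch(controls, "files_control_sketch")
# "files_control_sketch(save = FALSE)"
```

Because only non-default arguments are written, a control list left entirely at its defaults collapses to an empty call, which keeps the generated script lines short.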
portalcasting v0.12.0
portalcast updates model scripts according to controls_model
- Previously, if you changed any controls of a prefab model, you had to manually re-write the models using fill_models before running portalcast.
- Using fill_models would result in hand-made scripts being overwritten, so a specific function (update_models) for updating the models was created. update_models by default only updates the models listed in the controls_model input, to avoid overwriting model scripts. To change this behavior and also update all of the prefab models' scripts, set update_prefab_models = TRUE. This is particularly handy when changing a global (with respect to model scripts) argument: main, quiet, verbose, or arg_checks.
- addresses
Messaging around trying to use not-complete directory improved
- Indication now made that a component of the directory is missing and suggestion is made to run create_dir.
- addresses
Patching data set bug in plotting
- There was a bug with matching the interpolated to the non-interpolated data sets within the ensembling, which has been fixed.
- addresses
Updated messaging
- Moved most of the messaging into tidied functions.
Changed behavior of prep_rodents_table and prep_rodents
- Now there is no start_moon argument, and all of the data prior to end_moon are returned.
- This aligns the rodents prep functions with the other (moons, covariates) prep functions.
- Facilitates use of data prior to start_moon in forecasting models (e.g., for distributions of starting state variables).
- Requires that model functions now explicitly trim the rodents table being used. This has been added to all prefab models.
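A minimal base R sketch of the explicit trimming that a model function now has to do. The column names (`moon`, `DM`), the toy counts, and the helper name `trim_rodents_sketch` are assumptions for illustration and are not taken from the package.

```r
# Toy rodents table; column names and counts are assumed for illustration.
rodents_table <- data.frame(
  moon = 490:500,
  DM = c(12, 9, 14, NA, 11, 10, 13, 8, 9, 15, 12)
)

# Keep only the rows from start_moon through end_moon (inclusive).
trim_rodents_sketch <- function(rodents_table, start_moon, end_moon) {
  keep <- rodents_table$moon >= start_moon & rodents_table$moon <= end_moon
  rodents_table[keep, , drop = FALSE]
}

model_data <- trim_rodents_sketch(rodents_table, start_moon = 495, end_moon = 500)
nrow(model_data)  # 6
```

The untrimmed table stays available to the model, so rows before start_moon can still inform things like starting state distributions.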
Fixed codecov targets
- Previous targets were restrictively high due to earlier near-perfect coverage.
- A codecov.yml file is now included in the repo (and ignored for the R build) which sets the target arbitrarily at the still-quite-high-but-not-restrictively-so 95%.
- It can be changed if needed in the future.
Simple EDM model added
JAGS infrastructure added
- Using the runjags package, with extensive access to the API of run.jags via a control_runjags list (see runjags_control).
- Currently in place with a very simple random walk model.
- addresses
Prepared rodents table includes more content
- Expanded back in time to the start.
- Added effort columns (all default options in prefab_rodents_controls have effort = TRUE).
Updated adding a model and data vignette
- Added section at the end about just extending existing models to new data sets.
- addresses
v0.11.0
Ensembling reintroduced
- Associated with the reconfiguration of portalcasting from v0.8.1 to 0.9.0, ensembling was removed temporarily.
- A basic ensemble is reintroduced, now as an unweighted average across all selected models, allowing us to have an ensemble but not have it be tied to AIC weighting (because AIC weighting is no longer possible with the split between interpolated and non-interpolated data for model fitting).
- In a major departure from v0.8.1 and earlier, the ensemble's output is not saved like the actual models'. Rather, it is only calculated when needed on the fly.
- In plotting, it is now the default to use the ensemble for plot_cast_ts and plot_cast_point and for the ensemble to be included in plot_casts_err_lead and plot_casts_cov_RMSE.
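An unweighted ensemble of point predictions, computed on the fly rather than saved, might look like the base R sketch below. The table layout, the values, and the function name `ensemble_sketch` are assumptions for illustration. Only the point estimate is averaged here; how the package combines uncertainty is not covered.

```r
# Toy point predictions from three models for the same species and moons.
# Values are made up for illustration.
casts <- data.frame(
  model = rep(c("AutoArima", "ESSS", "nbGARCH"), each = 3),
  moon = rep(501:503, times = 3),
  estimate = c(10, 11, 12,
               14, 13, 12,
               12, 12, 12)
)

# Unweighted ensemble: the simple mean across the selected models at each moon.
ensemble_sketch <- function(casts, models = unique(casts$model)) {
  selected <- casts[casts$model %in% models, ]
  out <- aggregate(estimate ~ moon, data = selected, FUN = mean)
  out$model <- "Ensemble"
  out
}

ensemble_sketch(casts)$estimate  # 12 12 12
```

Since no weights are involved, the component models do not need to have been fit to the same data, which is what AIC weighting would require.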
Return of most_abundant_species
- Function used to select the most common species.
- Now uses the actual data and not the casts to determine the species.
v0.10.0
Model evaluation and ensembling added back in
- Were removed with the updated version from 0.8.1 to 0.9.0 to allow time to develop the code with the new infrastructure.
- Model evaluation happens within the cast tab output as before.
Temporarily removed figures returned
- Associated with the evaluation.
- Plotting of error as a function of lead time for multiple species and multiple models. Now has a fall-back arrangement that works for a single species-model combination.
- Plotting RMSE and coverage within species-model combinations.
Flexing model controls to allow user-defined lists for prefab models
- For sandboxing with existing models, it is useful to be able to change a parameter in the model's controls, such as the data sets. Previously, that would require a lot of hacking around. Now, it's as simple as inputting the desired controls and flipping arg_checks = FALSE.
v0.9.0
Major API update: increase in explicit top-level arguments
- Moved key arguments to focal top-level inputs, rather than nested within control options list. Allows full control, but with default settings working cleanly. addresses
- Restructuring of the controls lists, retained usage in situations where necessary: model construction, data set construction, file naming, climate data downloading.
- Openness for new setup functions, in particular setup_sandbox. addresses
- Simplification of model naming inputs. Just put in the names you need, and only use the model_names functions when you need to (usually in coding inside of functions or for setting default argument levels). addresses
Directory tree structure simplified
- dirtree was removed.
- base (both as a function and a concept) was removed. To make that structure use main = "./name".
- "PortalData" has been removed as a sub and replaced with "raw", which includes all raw versions of files (post unzipping) downloaded: Portal Data and Portal Predictions and covariate forecasts (whose saving is also new here).
Tightened messaging
- Expanded use of quiet and verbose connected throughout the pipeline.
- Additional messaging functions to reduce code clutter.
- Formatting of messages to reduce clutter and highlight the outline structure.
Download capacity generalized
- Flexible interface to downloading capacity through a url, with generalized and flexible functions for generating Zenodo API urls (for retrieving the raw data and historical predictions) and NMME API urls (for retrieving weather forecasts) to port into the download function. addresses and addresses and addresses
Changes for users adding their own models to the prefab set
- Substantial reduction in effort for users who wish to add models (i.e. anyone who is sandboxing). You can even just plunk your own R script (which could be a single line calling out to an external program if desired) without having to add any model script writing controls, and just add the name of the model to the models argument in portalcast and it will run it with everything else.
- Outlined in the updated Getting Started and Adding a Model/Data vignettes.
- Users adding models to the prefab suite should now permanently add their model's control options to the source code in model_script_controls rather than write their own control functions.
- Users adding models to the prefab suite should permanently add their model's function code to the prefab_models script (reusing and adding to the documentation in prefab_model_functions), rather than to its own script.
- Users should still add their model's name to the source code in model_names.
Relaxed model requirements
- Models are no longer forced to use interpolated data.
- Models are no longer required to output a rigidly formatted data-table. Presently, the requirement is just a list, but soon some specifications will be added to improve reliability.
- Outlined in the updated Adding a Model/Data vignette.
More organization via metadata
- Generalized cast output is now tracked using a unique id in the file name associated with the cast, which is related to a row in a metadata table, newly included here. addresses and addresses and addresses
- Additional control information (like data set setup) is sent to the model metadata and saved out.
- Directory setup configuration information is now tracked in a dir_config.yaml file, which is pulled from to save information about what was used to create, set up, and run the particular casts.
Changes for users interested in analyzing their own data sets not in the standard data set configuration
- Users are now able to define rodent observation data sets that are not part of the standard data set ("all" and "controls", each also with interpolation of missing data) by giving the name in the data_sets argument and the controls defining the data set (used by portalr's summarize_rodent_data function) in the controls_rodents argument.
- In order to actualize this, a user will need to flip off the argument checking (the default in a sandbox setting, if using a standard or production setting, set arg_checks = FALSE in the relevant function).
- Users interested in permanently adding the treatment level to the available data sets should add the source code to the rodents_controls function, just like with the models.
- addresses
- Internal code points the pipeline to the files named via the data set inputs. The other data files are pointed to using the control_files (see file_controls) input list, which allows for some general flexibility with respect to what files the pipeline is reading in from the data subdirectory.
Split of standard data sets
- The prefab all and controls were both being interpolated by default for all models because of the use of AIC for model comparison and ensemble building. That forced all models to use interpolated data.
- Starting in this version, the models are not required to have been fit in the same fashion (due to generalization of comparison and post-processing code), and so interpolation is not required if not needed, and we have split out the data to standard and interpolated versions.
Application of specific models to specific data sets now facilitated
- write_model and model_template have a data_sets argument that is used to write the code out, replacing the hard code requirement of analyzing "all" and "controls" for every model. Now, users who wish to analyze a particular data component can easily add it to the analysis pipeline.
Generalization of code terms
- Throughout the codebase, terminology has been generalized from "fcast"/"forecast"/"hindcast" to "cast" except where a clear distinction is needed (here primarily due to where the covariate values used come from).
- Nice benefits: highlights commonality between the two (see next section) and reduces code volume.
- start_newmoon is now start_moon, like end_moon
- addresses
"Hindcasting" becomes more similar to "forecasting"
- In the codebase now, "hindcasting" is functionally "forecasting" with a forecast origin (end_moon) that is not the most recently occurring moon.
- Indeed, "hindcast" is nearly entirely removed from the codebase and "forecast" is nearly exclusively retained in documentation (and barely in the code itself), with both functionally being replaced with the generalized (and shorter) "cast".
- cast_type is retained in the metadata file for posterity, but functionality is more generally checked by considering end_moon and last_moon in combination, where end_moon is the forecast origin and last_moon is the most recent
- Rather than the complex machinery used to iterate through multiple forecasts ("hindcasting") that involved working backwards and skipping certain moons (which didn't need to be skipped anymore due to updated code from a while back that allows us to forecast fine even without the most recent samples yet), a simple for loop is able to manage iterating. This is also facilitated by the downloading of the raw portalPredictions repository from Zenodo and critically its retention in the "raw" subdirectory, which allows quick re-calculation of historic predictions of covariates. addresses
- cast_type has been removed as an input, it's auto determined now based on end_moon and the last moon available (if they're equal it's a "forecast", if not it's a "hindcast").
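The rule for labeling a cast, and the plain loop over forecast origins, can be sketched in base R as below. The function name `cast_type_sketch` and the moon numbers are assumptions for illustration, not the package's code.

```r
# Label a cast from its origin (end_moon) and the most recent moon (last_moon).
cast_type_sketch <- function(end_moon, last_moon) {
  if (end_moon == last_moon) "forecast" else "hindcast"
}

last_moon <- 520
end_moons <- 518:520

# A simple for loop over forecast origins; each pass would run one cast.
cast_types <- character(length(end_moons))
for (i in seq_along(end_moons)) {
  cast_types[i] <- cast_type_sketch(end_moons[i], last_moon)
}
cast_types  # "hindcast" "hindcast" "forecast"
```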
Softer handling of model failure
- Within cast, the model scripts are now sourced within a for-loop (rather than sapply) to allow for simple error catching of each script. addresses
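A minimal base R sketch of sourcing scripts in a for loop with per-script error catching. The script names and contents are made up for illustration, and the use of tryCatch is an assumption about how such catching could be written, not a description of the package internals.

```r
# Write two throwaway "model scripts" to a temporary folder; one fails on purpose.
model_dir <- tempfile("models")
dir.create(model_dir)
writeLines("result <- 1 + 1", file.path(model_dir, "good_model.R"))
writeLines("stop('model did not converge')", file.path(model_dir, "bad_model.R"))

scripts <- list.files(model_dir, full.names = TRUE)
status <- character(length(scripts))
names(status) <- basename(scripts)

# Source each script in turn; a failure is recorded and the loop moves on.
for (i in seq_along(scripts)) {
  status[i] <- tryCatch({
    source(scripts[i], local = new.env())
    "ok"
  }, error = function(e) {
    paste("failed:", conditionMessage(e))
  })
}
status
```

One failing model no longer stops the remaining scripts from running, and the recorded status can be reported back to the user afterwards.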
Improved argument checking flow
- Arg checking is now considerably tighter, code-wise.
- Each argument is either recognized and given a set of attributes (from an internally defined list) or unrecognized and stated to the user that it's not being checked (to help notify anyone building in the code that there's a new argument).
- The argument's attributes define the logical checking flow through a series of pretty simple options.
- There is also now an arg_checks logical argument that goes into check_args to turn off all of the underlying code, enabling the user to go off the production restrictions that would otherwise throw errors, even though they might technically work under the hood.
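The checking flow described above can be sketched in base R as follows. The attribute list, its entries, and the function name `check_args_sketch` are hypothetical and only illustrate the recognized/unrecognized split and the arg_checks switch.

```r
# Internally defined attributes for recognized arguments (illustrative entries).
arg_attributes <- list(
  end_moon = list(check = is.numeric, what = "numeric", length = 1),
  quiet = list(check = is.logical, what = "logical", length = 1)
)

check_args_sketch <- function(args, arg_checks = TRUE) {
  # With arg_checks = FALSE, all of the checking is skipped.
  if (!arg_checks) {
    return(invisible(NULL))
  }
  for (arg_name in names(args)) {
    attrs <- arg_attributes[[arg_name]]
    if (is.null(attrs)) {
      # Unrecognized arguments are reported, not rejected.
      message("`", arg_name, "` is not recognized and is not being checked")
      next
    }
    value <- args[[arg_name]]
    if (!attrs$check(value) || length(value) != attrs$length) {
      stop("`", arg_name, "` must be ", attrs$what, " of length ", attrs$length)
    }
  }
  invisible(NULL)
}

check_args_sketch(list(end_moon = 500, quiet = FALSE))         # passes silently
check_args_sketch(list(end_moon = "500"), arg_checks = FALSE)  # checks skipped
```

Calling `check_args_sketch(list(end_moon = "500"))` with checking left on stops with an error, which is the production behavior that arg_checks = FALSE lets a sandbox user bypass.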
Substantial re-writes of the vignettes
- Done in general to update with the present version of the codebase.
- Broke the adding a model or data vignette into "working locally" and "adding to the pipeline", also added checklists and screen shots. addresses
- Reorganized the getting started vignette to an order that makes sense. addresses
Post-processing (evaluation and ensemble building) temporarily removed
- T...
portalcasting v0.8.1
hook up with zenodo and some other minor documentation edits
no coding changes