v0.9.0 #129

Merged (181 commits into master, Sep 6, 2019)
@juniperlsimonis commented Aug 18, 2019

Major API update: increase in explicit top-level arguments

  • Moved key arguments to focal top-level inputs rather than nesting them within control options lists. This allows full control while the default settings work cleanly. addresses
  • Restructured the controls lists, retaining them where necessary: model construction, data set construction, file naming, and climate data downloading.
  • Opened things up for new setup functions, in particular setup_sandbox. addresses
  • Simplified model naming inputs. Just put in the names you need, and only use the model_names functions when necessary (usually when coding inside functions or setting default argument levels); see the sketch below. addresses
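
A minimal sketch of the new top-level interface (the directory path and model names are placeholders, and the exact argument set should be checked against the function documentation):

```r
library(portalcasting)

# key arguments are now focal top-level inputs with working defaults;
# the sandbox path and model names here are placeholders
portalcast(main = "./sandbox", models = c("AutoArima", "ESSS"))
```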

Directory tree structure simplified

  • dirtree was removed
  • base (both as a function and a concept) was removed. To recreate that structure, use main = "./name".
  • "PortalData" has been removed as a subdirectory and replaced with "raw", which holds all raw versions of the downloaded files (post unzipping): Portal Data, Portal Predictions, and the covariate forecasts (whose saving is also new here). See the sketch below.

Tightened messaging

  • Expanded, connected use of quiet and verbose throughout the pipeline.
  • Additional messaging functions (sketched below) to reduce code clutter.
  • Formatting of messages to reduce clutter and highlight the outline structure.
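
In outline, the messaging helpers follow a pattern like this (an illustrative sketch, not the exact source; the helper name and signature are assumptions):

```r
# sketch of a quiet-aware messaging helper
messageq <- function(msg, quiet = FALSE) {
  if (!quiet) {
    message(msg)
  }
  invisible(NULL)
}

messageq("Downloading raw data files", quiet = FALSE)
```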

Download capacity generalized

  • Flexible interface to the downloading capacity through a url, with generalized, flexible functions for generating Zenodo API urls (for retrieving the raw data and historical predictions) and NMME API urls (for retrieving weather forecasts) to port into the download function; see the sketch below. addresses and addresses and addresses
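
As a rough illustration of the url-based approach (the record id is a placeholder, the field names follow the public Zenodo records API from memory, and the package's own helpers may differ):

```r
# compose a Zenodo API url for a record and download one of its archived files
record_id  <- "0000000"                  # placeholder
zenodo_url <- paste0("https://zenodo.org/api/records/", record_id)

record <- jsonlite::fromJSON(zenodo_url) # JSON description of the record
utils::download.file(record$files$links$self[1],
                     destfile = "raw.zip", mode = "wb")
```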

Changes for users adding their own models to the prefab set

  • Substantial reduction in effort for users who wish to add models (i.e., anyone who is sandboxing). You can simply drop in your own R script (which could even be a single line calling out to an external program), without adding any model script writing controls, add the model's name to the models argument in portalcast, and it will be run with everything else; see the sketch after this list.
  • Outlined in the updated Getting Started and Adding a Model/Data vignettes.
  • Users adding models to the prefab suite should now permanently add their model's control options to the source code in model_script_controls rather than write their own control functions.
  • Users adding models to the prefab suite should permanently add their model's function code to the prefab_models script (reusing and adding to the documentation in prefab_model_functions), rather than to its own script.
  • Users should still add their model's name to the source code in model_names.
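
A hedged sketch of the drop-in-a-script workflow (the file path, script contents, and model names are hypothetical):

```r
# write a one-off model script into the directory's models subdirectory;
# the file name doubles as the model name (contents here are a placeholder)
writeLines(
  'cast_out <- list(note = "replace with a real model fit and cast")',
  "./sandbox/models/mymodel.R"
)

# then just list it alongside the prefab models
portalcast(main = "./sandbox", models = c("AutoArima", "mymodel"))
```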

Relaxed model requirements

  • Models are no longer forced to use interpolated data.
  • Models are no longer required to output a rigidly formatted data table. Presently, the requirement is just a list (see the sketch after this list), but some specifications will soon be added to improve reliability.
  • Outlined in the updated Adding a Model/Data vignette.
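
For instance, a custom model function now only needs to return a list; the element names below are illustrative, not required:

```r
# minimal custom model: the only hard output requirement at present is a list
my_model <- function(abundances) {
  fit  <- stats::arima(abundances, order = c(0, 1, 0))
  cast <- stats::predict(fit, n.ahead = 12)
  list(fit = fit, cast = cast)   # element names are not (yet) prescribed
}
```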

More organization via metadata

  • Generalized cast output is now tracked using a unique id in the file name associated with the cast, which is related to a row in a metadata table, newly included here. addresses and addresses and addresses
  • Additional control information (like data set setup) is sent to the model metadata and saved out.
  • Directory setup and configuration information is now tracked in a dir_config.yaml file, which is drawn upon to record what was used to create, set up, and run the particular casts (see the sketch below).
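
The saved configuration can be inspected directly (a sketch assuming the yaml package and a default file location in the directory's main folder):

```r
# read the directory configuration recorded at setup time
config <- yaml::read_yaml("./sandbox/dir_config.yaml")
str(config)
```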

Changes for users interested in analyzing their own data sets not in the standard data set configuration

  • Users can now define rodent observation data sets that are not part of the standard set ("all" and "controls", each also available with interpolation of missing data) by giving the name in the data_sets argument and the controls defining the data set (used by portalr's summarize_rodent_data function) in the controls_rodents argument; see the sketch after this list.
  • To make this work, a user will need to turn off the argument checking (the default in a sandbox setting; if using a standard or production setting, set arg_checks = FALSE in the relevant function).
  • Users interested in permanently adding the treatment level to the available data sets should add the source code to the rodents_controls function, just like with the models.
  • addresses
  • Internal code points the pipeline to the files named via the data set inputs. The other data files are pointed to using the control_files (see file_controls) input list, which allows for some general flexibility with respect to what files the pipeline is reading in from the data subdirectory.
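
A sketch of defining a non-standard data set (the data set name and the control entries are hypothetical; whatever is supplied is handed through to portalr's summarize_rodent_data):

```r
# hypothetical data set named "exclosures"; the control entries are placeholders
# passed through to portalr::summarize_rodent_data()
controls_rodents <- list(exclosures = list(name  = "exclosures",
                                           level = "Treatment"))

portalcast(main             = "./sandbox",
           data_sets        = "exclosures",
           controls_rodents = controls_rodents,
           arg_checks       = FALSE)  # checks off: not a standard data set name
```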

Split of standard data sets

  • The prefab all and controls data sets were both interpolated by default for all models because of the use of AIC for model comparison and ensemble building, which forced all models to use interpolated data.
  • Starting in this version, models are not required to have been fit in the same fashion (due to generalization of the comparison and post-processing code), so interpolation is no longer required when not needed, and we have split the data into standard and interpolated versions.

Application of specific models to specific data sets now facilitated

  • write_model and model_template have a data_sets argument that is used when writing the code out, replacing the hard-coded requirement of analyzing "all" and "controls" for every model. Now users who wish to analyze a particular data component can easily add it to the analysis pipeline; see the sketch below.
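
For example (the exact data set names and argument set are assumptions based on the standard/interpolated split described above):

```r
# write a model script that analyzes only the data sets it needs
write_model(name      = "AutoArima",
            data_sets = c("all_interp", "controls_interp"),
            main      = "./sandbox")
```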

Generalization of code terms

  • Throughout the codebase, terminology has been generalized from "fcast"/"forecast"/"hindcast" to "cast", except where a clear distinction is needed (primarily relating to where the covariate values come from).
  • Nice benefits: highlights commonality between the two (see next section) and reduces code volume.
  • start_newmoon is now start_moon, matching end_moon.
  • addresses

"Hindcasting" becomes more similar to "forecasting"

  • In the codebase now, "hindcasting" is functionally "forecasting" with a forecast origin (end_moon) that is not the most recently occurring moon.
  • Indeed, "hindcast" has been nearly entirely removed from the codebase, and "forecast" is retained almost exclusively in documentation (and barely in the code itself); both are functionally replaced by the generalized (and shorter) "cast".
  • cast_type is retained in the metadata file for posterity, but functionality is more generally checked by considering end_moon and last_moon in combination, where end_moon is the forecast origin and last_moon is the most recently occurring moon.
  • Rather than the complex machinery previously used to iterate through multiple forecasts ("hindcasting"), which involved working backwards and skipping certain moons (no longer necessary, thanks to earlier updates that allow forecasting even without the most recent samples), a simple for loop now manages the iteration (see the sketch after this list). This is also facilitated by downloading the raw portalPredictions repository from Zenodo and, critically, retaining it in the "raw" subdirectory, which allows quick re-calculation of historical covariate predictions. addresses
  • cast_type has been removed as an input; it is now determined automatically from end_moon and the last moon available (if they are equal it is a "forecast", if not a "hindcast").
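
A rough sketch of that simple iteration (the moon numbers and path are placeholders):

```r
# hindcasting is now just forecasting from earlier origins: loop over end_moon
origin_moons <- 510:515   # placeholder newmoon numbers

for (origin in origin_moons) {
  portalcast(main = "./sandbox", end_moon = origin)
}
```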

Softer handling of model failure

  • Within cast, the model scripts are now sourced within a for loop (rather than via sapply) to allow simple error catching for each script; a sketch of the pattern follows. addresses
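
In outline (a sketch of the pattern rather than the exact source; the directory path is a placeholder):

```r
# source each model script inside tryCatch so one failure does not halt the rest
model_scripts <- list.files("./sandbox/models", full.names = TRUE)

for (script in model_scripts) {
  tryCatch(source(script),
           error = function(e) {
             message("model script failed: ", basename(script),
                     " (", conditionMessage(e), ")")
           })
}
```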

Improved argument checking flow

  • Arg checking is now considerably tighter, code-wise.
  • Each argument is either recognized and given a set of attributes (from an internally defined list) or unrecognized, in which case the user is told it is not being checked (to help notify anyone building on the code that there is a new argument).
  • The argument's attributes define the logical checking flow through a series of pretty simple options.
  • There is also now an arg_checks logical argument that goes into check_args to turn off all of the underlying checking code, enabling the user to bypass the production restrictions that would otherwise throw errors for inputs that might technically work under the hood; a simplified sketch follows.
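
In simplified form, the flow looks something like this (an illustrative sketch, not the package's actual check_args code; the attribute list here is made up):

```r
# attribute-driven argument checking with a global on/off toggle
check_arg <- function(arg_name, arg_value, arg_checks = TRUE) {
  if (!arg_checks) {
    return(invisible(NULL))              # checking disabled (e.g., sandboxing)
  }
  known <- list(end_moon = list(class = "numeric", length = 1),
                quiet    = list(class = "logical", length = 1))
  if (!arg_name %in% names(known)) {
    message("argument `", arg_name, "` is not being checked")
    return(invisible(NULL))
  }
  spec <- known[[arg_name]]
  if (!inherits(arg_value, spec$class) || length(arg_value) != spec$length) {
    stop("`", arg_name, "` failed its checks")
  }
  invisible(NULL)
}
```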

Substantial re-writes of the vignettes

  • Done in general to update with the present version of the codebase.
  • Broke the adding a model or data vignette into "working locally" and "adding to the pipeline", and added checklists and screenshots. addresses
  • Reorganized the getting started vignette into an order that makes sense. addresses

Post-processing (evaluation and ensemble building) temporarily removed

  • The model evaluation and ensemble building had to be recoded to accommodate the updated flexibility in cast output.
  • These components are undergoing active development and are not yet ready for integration with the remainder of the codebase, which is stable and functional.
  • To provide a more up-to-date version and anchor all of the code work done thus far, I am releasing v0.9.0 with the proviso that these components are missing and will be added in a new release in the very near future.

Additional things

  • drop_spp has been changed to species (so the focus is on inclusion, not exclusion). addresses
  • Improved examples, which are also now wrapped in \donttest{}. addresses
  • Tightened testing, with skip_on_cran used judiciously (see the sketch after this list). addresses
  • No longer building the AIC-based ensemble. addresses
  • Default confidence limit is now the more standard 0.95.
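
For example, the tightened tests use the standard testthat pattern (the specific expectation shown here is illustrative):

```r
library(testthat)

test_that("casting runs end-to-end", {
  skip_on_cran()   # long-running directory setup and casts are skipped on CRAN
  expect_error(portalcast(main = "./sandbox"), NA)
})
```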
