fVAR algorithm

Purpose

This algorithm makes use of the typical threshold behaviour of vegetation responses (e.g., GPP, or canopy conductance) to soil moisture to detect and quantify its impact on reducing fluxes/rates. The method makes use of machine learning (neural networks, NN) and is described in Stocker et al. (2018).

library(rsofun)
library(dplyr)
library(readr)
library(lubridate)
library(ggplot2)

Installation

Development release

To install and load the fvar package (development release) run the following command in your R terminal:

if(!require(devtools)){install.packages(devtools)}
devtools::install_github( "stineb/fvar", build_vignettes = FALSE )
library(fvar)

Stable release

The fvar R package is not yet available on CRAN. We're working on it.

Usage example

Get FLUXNET 2015 data

Training data is to be read in as a data frame with column names corresponding to the variable names defined by settings (see below).

Read observational data including soil moisture for one site (see column sitename) from FLUXNET 2015 data. The following loads data extracted from original files by data-raw/prepare_data_example.R.

load("./data/df_fluxnet.Rdata")

Define settings

This named list is used by several functions of the package.

# ## use all observational soil moisture data
# varnams_soilm <- df_fluxnet %>% 
#   dplyr::select( starts_with("SWC_") ) %>% 
#   dplyr::select( -ends_with("QC") ) %>% 
#   names()

settings <- list( 
  target        = "GPP_NT_VUT_REF", 
  predictors    = c("temp_day","vpd_day", "ppfd", "fpar"), 
  varnams_soilm = "wcont_splash",
  nnodes        = 10,
  nrep          = 3,
  package       = "nnet"
  )

The settings are a named list that must contain the following variables:

target : Character, target variable for the NN model, must correspond to the respective column name in the training data.
predictors : Vector of characters, predictor variables (excluding soil moisture) for the NN model, must correspond to the respective column name in the training data.
varnams_soilm : (Vector of) character(s). Name of the soil moisture variable(s). Must correspond to the respective column name in the training data.
nnodes : Number of hidden nodes of the NN.
nrep : Number of repetitions of NN model training. fvar is calculated as a mean across repetitions.
package : R package name that implements the NN (for now, use "nnet", which is used in combination with caret)

Prepare data

To prepare training data, removing NAs and outliers, in a standardized way, do:

df_train <- prepare_trainingdata_fvar( df_fluxnet, settings )

Train model and predict fvar

Train models and get fvar (is returned as one column in the returned data frame). Here, we specify the soil moisture threshold to separate training data into moist and dry days to 0.6 (fraction of water holding capacity). Deriving an optimal threshold is implemented by the functions profile_soilmthreshold_fvar() and get_opt_threshold() (see below).

df_nn_soilm_obs <- train_predict_fvar( 
  df_train,
  settings,
  soilm_threshold = 0.6,
  weights         = NA, 
  verbose         = TRUE
  )

Evaluate performance

The function test_performance_fvar() returns a list of all ggplot objects and a data frame with evaluation results for multiple tests.

testresults_fvar <- test_performance_fvar(df_nn_soilm_obs, settings)

Get optimal soil moisture threshold

The two steps (train/predict and performance evaluation) are executed for a set of soil moisture thresholds. The optimal threshold is then selected based on the different performance criteria.

First, run the fvar algorithm for a set of soil moisture thresholds and record performance metrics for each round.

df_profile_performance <- profile_soilmthreshold_fvar(
  df_train,
  settings,
  weights = NA,    # optional weights
  len = 4          # number of soil moisture thresholds
  )

Then select best threshold as described in Stocker et al. (2018). Their procedure was as follows:

Determine the difference in the bias of NN$_\text{pot}$ between dry and moist days and select the three (originally used five) thresholds with the greatest difference.
Determine RMSE of NNpot vs. NNact during moist days and take the threshold where it's minimised

purrr::map(df_profile_performance, "df_metrics") %>% 
  bind_rows(.id = "threshold") %>% 
  arrange(-diff_ratio_dry_moist) %>% 
  slice(1:3) %>% 
  arrange(rmse_nnpot_nnact_moist) %>% 
  slice(1) %>% 
  pull(threshold) %>% 
  as.numeric()

This is implemented by a function:

get_opt_threshold(df_profile_performance)

Repeat the fvar algorithm with optimal threshold.

df_nn <- train_predict_fvar( 
  df_train,
  settings,
  soilm_threshold = get_opt_threshold(df_profile_performance),
  weights         = NA
  )

Evaluations and visualisations

Identify soil moisture droughts

Determine drought events based on fvar.

out_droughts <- get_droughts_fvar( 
  df_nn, 
  nam_target     = settings$target, 
  leng_threshold = 10, 
  df_soilm       = select(df_fluxnet, date, soilm=wcont_swbm),
  df_par         = select(df_fluxnet, date, par=ppfd),
  par_runmed     = 10
  )

Plot time series of fvar and highlight drought events.

ggplot() +
  geom_rect(
    data=out_droughts$droughts, 
    aes(xmin=date_start, xmax=date_end, ymin=-99, ymax=99), 
    fill=rgb(0,0,0,0.3), 
    color=NA) + 
  geom_line(data=out_droughts$df_nn, aes(date, fvar_smooth_filled)) +
  coord_cartesian(ylim=c(0, 1.2))

Align data by droughts

After having determined drought events above, "align" data by the day of drought onset and aggregate data across multiple drought instances.

dovars <- c("GPP_NT_VUT_REF", "fvar", "soilm", "nn_pot", "nn_act")
df_nn <- out_droughts$df_nn %>% mutate(site="FR-Pue")
df_alg <- align_events( 
  select( df_nn, site, date, one_of(dovars)), 
  select( df_nn, site, date, isevent=is_drought_byvar_recalc ),
  dovars,
  do_norm=FALSE,
  leng_threshold=10, 
  before=20, after=80, nbins=10
  )

Visualise data aggregated by drought events.

df_alg$df_dday_agg_inst_site %>% 
  ggplot(aes(dday, fvar_median)) +
  geom_ribbon(aes(ymin=fvar_q33, ymax=fvar_q66), fill="grey70") +
  geom_line() +
  geom_vline(xintercept=0, linetype="dotted") +
  ylim(0,1) + 
  labs(title = "fVAR")

median <- df_alg$df_dday_agg_inst_site %>%
  select(dday, nn_pot=nn_pot_median, nn_act=nn_act_median, gpp_obs=GPP_NT_VUT_REF_median) %>% 
  tidyr::gather(model, var_nn_median, c(gpp_obs, nn_pot, nn_act))

q33 <- df_alg$df_dday_agg_inst_site %>%
  select(dday, nn_pot=nn_pot_q33, nn_act=nn_act_q33, gpp_obs=GPP_NT_VUT_REF_q33) %>% 
  tidyr::gather(model, var_nn_q33, c(gpp_obs, nn_pot, nn_act))

q66 <- df_alg$df_dday_agg_inst_site %>%
  select(dday, nn_pot=nn_pot_q66, nn_act=nn_act_q66, gpp_obs=GPP_NT_VUT_REF_q66) %>% 
  tidyr::gather(model, var_nn_q66, c(gpp_obs, nn_pot, nn_act))

df_tmp <- median %>% 
  left_join(q33, by=c("dday", "model")) %>% 
  left_join(q66, by=c("dday", "model"))

df_tmp %>% 
  ggplot(aes(dday, var_nn_median, color=model)) +
  geom_line() +
  geom_ribbon(aes(ymin=var_nn_q33, ymax=var_nn_q66, fill=model), alpha=0.3, color=NA) +
  geom_vline(xintercept=0, linetype="dotted") +
  labs(
    title = "GPP", 
    subtitle = "Observation and models",
    x="dday", y="GPP (g C m-2 d-1)") +
  scale_color_discrete(
    name="Model", 
    breaks=c("transp_obs", "nn_act", "nn_pot"),
    labels=c("Observed", expression(NN[act]), expression(NN[pot])) ) +
  scale_fill_discrete(
    name="Model", 
    breaks=c("transp_obs", "nn_act", "nn_pot"),
    labels=c("Observed", expression(NN[act]), expression(NN[pot])) )

Name		Name	Last commit message	Last commit date
Latest commit History 55 Commits
R		R
data-raw		data-raw
data		data
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README.md		README.md
example_nee_hh_data.Rmd		example_nee_hh_data.Rmd
fvar.Rproj		fvar.Rproj
fvar_rsoil.Rmd		fvar_rsoil.Rmd
plot_fvar_cwd.R		plot_fvar_cwd.R
test_mct.Rmd		test_mct.Rmd
vignette_fvar.Rmd		vignette_fvar.Rmd
vignette_fvar_francescos.Rmd		vignette_fvar_francescos.Rmd

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

fVAR algorithm

Purpose

Installation

Development release

Stable release

Usage example

Get FLUXNET 2015 data

Define settings

Prepare data

Train model and predict fvar

Evaluate performance

Get optimal soil moisture threshold

Evaluations and visualisations

Identify soil moisture droughts

Align data by droughts

About

Releases 1

Packages

Contributors 2

Languages

geco-bern/fvar

Folders and files

Latest commit

History

Repository files navigation

fVAR algorithm

Purpose

Installation

Development release

Stable release

Usage example

Get FLUXNET 2015 data

Define settings

Prepare data

Train model and predict fvar

Evaluate performance

Get optimal soil moisture threshold

Evaluations and visualisations

Identify soil moisture droughts

Align data by droughts

About

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages