time series sl3 r and rolling cross validation #248
Comments
I get the same problem even with the sample code from https://github.com/tlverse/sl3_lecture/blob/master/sl3_timeseries.Rmd:
library(data.table)
# load data
data(bsds)
head(bsds)
tsdata <- xts(bsds$cnt, order.by = as.POSIXct(bsds$dteday))
# Visualize the time series:
PerformanceAnalytics::chart.TimeSeries(tsdata, auto.grid = FALSE, main = "Count of total rental bikes")
# Final setup
folds = origami::make_folds(tsdata, fold_fun = folds_rolling_window, window_size = 50, validation_size = 30, gap = 0, batch = 50)
covars <- "cnt"
outcome <- "cnt"
# create the sl3 task and take a look at it
ts_uni_task <- sl3_Task$new(data = bsds, covariates = covars,
# let's take a look at the sl3 task
n_ahead_param <- 2
# verify that the learner is fit
fit_arima$is_trained
head(pred_arima)
lrnr_tsdyn_setar <- Lrnr_tsDyn$new(learner = "setar", m = 1, model = "TAR",
lrnr_tsdyn_lstar <- Lrnr_tsDyn$new(learner = "lstar", m = 1,
lrnr_garch <- Lrnr_rugarch$new(n.ahead = n_ahead_param)
lrnr_expsmooth <- Lrnr_expSmooth$new(n.ahead = n_ahead_param)
lrnr_harmonicreg <- Lrnr_HarmonicReg$new(n.ahead = n_ahead_param, K = 7,
ts_stack <- Stack$new(lrnr_arima, lrnr_tsdyn_linear, lrnr_tsdyn_setar,
ts_stack_fit <- ts_stack$train(ts_uni_task)
ts_stack_preds <- ts_stack_fit$predict()
There seems to be a recent bug in
Thank you so much! I will be eagerly waiting for the update. The problem seems to be related to data.table.
Do you have any update on the above-mentioned problem?
Hi, sorry for the delay. I was able to fix it and will push the updated version in the next few days (I need to check the other CVs as well).
Hello Ivana Malenica,
This should now be fixed on devel. You can install the devel version with install_github("tlverse/sl3@devel"). It will be merged into master shortly.
First of all, I removed the old version of sl3 and reinstalled it using the link you provided. I checked again using my own data and code, and also this example: https://github.com/tlverse/sl3_lecture/blob/master/sl3_timeseries.Rmd. I still get the same problem. Am I making a mistake? Thanks in advance. Error in set(learner_preds, j = current_names, value = current_preds) :
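The call named in this error, set(), is data.table's assign-by-reference function. The following is a minimal sketch of how such a call can fail, assuming the cause is the same length mismatch reported in the original post (570 values for a 1000-row table). The table `dt`, the index `validation_rows`, and the values in `preds` are hypothetical stand-ins, and the by-row assignment at the end only illustrates the mechanism; it is not the fix that was applied in sl3.

```r
library(data.table)

# Hypothetical stand-ins: a 1000-row table and predictions for only 570 rows.
dt <- data.table(row = 1:1000)
validation_rows <- 1:570
preds <- rep(0.5, length(validation_rows))

# Assigning 570 values to a whole 1000-row column is refused by data.table.
mismatch <- tryCatch(
  {
    set(dt, j = "pred", value = preds)
    "no error"
  },
  error = function(e) "error"
)

# Assigning by row index works: create the column as NA, then fill only the
# rows that actually have a prediction.
set(dt, j = "pred_by_row", value = NA_real_)
set(dt, i = validation_rows, j = "pred_by_row", value = preds)
n_filled <- sum(!is.na(dt$pred_by_row))
```

Rows without a prediction stay NA in this sketch, so any downstream step would have to handle them explicitly.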
I want to apply time series rolling cross-validation. The data used below (washb_data) is not a time series; I am treating it as one only so that the example is reproducible and I can then apply the same code to my own time series data. I get the same error with my actual time series data as well.
I have added one line of code from your time series example:
folds = origami::make_folds(washb_data, fold_fun=folds_rolling_window, window_size = 50, validation_size = 30, gap = 0, batch = 50)
However, when I reach sl_fit <- sl$train(washb_task), I get the following error, and I don't know how to fix it:
Error in set(private$.data, j = new_col_names, value = new_data) :
Supplied 570 items to be assigned to 1000 items of column 'd47fdc00-01a0-11ea-a044-4560ff6b69d1_Pipeline(Lrnr_pkg_SuperLearner_screener_screen.corP->Stack)_Lrnr_glm_TRUE'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code
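One possible reading of the 570 in this message is that it is the total number of validation rows produced by the rolling-window folds. The sketch below checks that arithmetic in base R using the settings from the make_folds() call in this post. It assumes that each window starts `batch` rows after the previous one and is followed, after `gap` rows, by `validation_size` validation rows; it does not call origami, so the layout is an assumption rather than the package's confirmed behavior.

```r
# Fold settings copied from the make_folds() call in this post.
n <- 1000L
window_size <- 50L
validation_size <- 30L
gap <- 0L
batch <- 50L

# Assumed layout: the last window must leave room for gap + validation rows.
last_start <- n - (window_size + gap + validation_size) + 1L
window_starts <- seq(from = 1L, to = last_start, by = batch)

# Validation rows of each fold: the validation_size rows after window and gap.
validation_sets <- lapply(window_starts, function(start) {
  first_validation_row <- start + window_size + gap
  seq(from = first_validation_row, length.out = validation_size)
})

n_folds <- length(validation_sets)                           # 19
n_validation_rows <- length(unique(unlist(validation_sets))) # 570
n_never_validated <- n - n_validation_rows                   # 430
```

Under these assumptions, 19 folds of 30 validation rows give 570 cross-validated predictions, which matches the 570 items in the error, while the task has 1000 rows.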
The rest is your code:
library(data.table)
library(knitr)
library(kableExtra)
library(tidyverse)
library(origami)
library(SuperLearner)
library(sl3)
set.seed(7194)
# load data set and take a peek
washb_data <- fread("https://raw.githubusercontent.com/tlverse/tlverse-data/master/wash-benefits/washb_data.csv",
stringsAsFactors = TRUE)
washb_data <- washb_data[1:1000 ,]
head(washb_data) %>%
kable(digits = 4) %>%
kableExtra::kable_styling(fixed_thead = TRUE) %>%
scroll_box(width = "100%", height = "300px")
# specify the outcome and covariates
outcome <- "whz"
covars <- colnames(washb_data)[-which(names(washb_data) == outcome)]
folds = origami::make_folds(washb_data, fold_fun=folds_rolling_window, window_size = 50, validation_size = 30, gap = 0, batch = 50)
# create the sl3 task
washb_task <- make_sl3_Task(
data = washb_data,
covariates = covars,
outcome = outcome, folds = folds
)
# choose base learners
lrnr_glm <- make_learner(Lrnr_glm)
lrnr_mean <- make_learner(Lrnr_mean)
lrnr_glmnet <- make_learner(Lrnr_glmnet)
lrnr_ranger100 <- make_learner(Lrnr_ranger, num.trees = 100)
lrnr_hal_simple <- make_learner(Lrnr_hal9001, degrees = 1, n_folds = folds)
lrnr_gam <- Lrnr_pkg_SuperLearner$new("SL.gam")
lrnr_bayesglm <- Lrnr_pkg_SuperLearner$new("SL.bayesglm")
stack <- make_learner(
Stack,
lrnr_glm, lrnr_mean, lrnr_ranger100, lrnr_glmnet,
lrnr_gam, lrnr_bayesglm
)
metalearner <- make_learner(Lrnr_nnls)
screen_cor <- Lrnr_pkg_SuperLearner_screener$new("screen.corP")
# which covariates are selected on the full data?
screen_cor$train(washb_task)
cor_pipeline <- make_learner(Pipeline, screen_cor, stack)
fancy_stack <- make_learner(Stack, cor_pipeline, stack)
# we can visualize the stack
dt_stack <- delayed_learner_train(fancy_stack, washb_task)
plot(dt_stack, color = FALSE, height = "400px", width = "100%")
sl <- make_learner(Lrnr_sl,
learners = fancy_stack,
metalearner = metalearner
)
# we can visualize the super learner
dt_sl <- delayed_learner_train(sl, washb_task)
plot(dt_sl, color = FALSE, height = "400px", width = "100%")
sl_fit <- sl$train(washb_task)
sl_preds <- sl_fit$predict()
head(sl_preds)