
Time out when waitForJobs #4

Open
genomewalker opened this issue Apr 9, 2018 · 4 comments

@genomewalker

Hi,
the time limit for the batchtools waitForJobs function is 604800 seconds (one week). I haven't found a way to tell pulsar or the SpiecEasi pulsar branch to increase this time other than modifying the source. Right now it fails with:

Applying data transformations...
Selecting model with batch.pulsar using bstars...
Sourcing configuration file '/home/afernand/.batchtools.conf.R' ...
Created registry in '/bioinf/home/afernand/SANDBOX/batchPulsar/regdir2_init' using cluster functions 'SGE'
Adding 2 jobs ...
Submitting 2 jobs in 2 chunks using cluster functions 'SGE' ...
Error in batch.pulsar(data = X, fun = match.fun(estFun), fargs = args,  :
  Errors in batch jobs for computing initial stability
In addition: Warning message:
In batchtools::waitForJobs(reg = reg, id) : Timeout reached

Maybe a solution would be to set it to Inf, or to implement a way to resume running Pulsar/SpiecEasi jobs from the batchtools registry (sorry if this already exists).

Many thanks
Antonio

@zdk123
Owner

zdk123 commented Apr 10, 2018

I have no technical issue with setting the time limit to infinity.

One idea is to set global batchtools-related options, so as not to pollute the function arguments.

A longer-term goal of mine is to migrate to the future/future.batchtools packages so that all parameters can be passed in by the user via the futures API.
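For reference, batchtools' waitForJobs() already exposes the one-week default as its timeout argument, so the limit can in principle be lifted at the call site once pulsar forwards it. A minimal sketch (the registry directory name here is hypothetical, reusing the path from the log above):

```r
library(batchtools)

# Re-attach the registry directory left behind by the batch.pulsar run.
reg <- loadRegistry("regdir2_init", writeable = TRUE)

# timeout defaults to 604800 seconds (one week); Inf waits indefinitely.
waitForJobs(reg = reg, timeout = Inf)
```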

However, I think the larger problem is that it's taking longer than a week for even a single huge run to complete. As you can see from our published timing results, the whole pipeline took 6056 seconds for p=2000, n=2000.

I'm wondering if you can try further filtering the data, just so we can rule out system or data problems more quickly.

Thanks!

@genomewalker
Author

genomewalker commented Apr 10, 2018

I think it is a combination of my data and the parameters I am using. I had already run smaller versions of the matrix (p=955), and they computed quickly with mb and glasso using a lambda.min.ratio of 1e-2 and 1e-3.
Now I wanted to use a less filtered matrix (it is pretty sparse: 0.959). I already got results with mb and glasso with a lambda.min.ratio of 1e-2. The graphs at 1e-2 were very sparse, and since I wanted denser ones I decreased it to 1e-3. MB was quite fast, but glasso hit the one-week limit. If you want to have a look at the data, I can send you the matrix.

Many thanks!

@zdk123
Owner

zdk123 commented Apr 12, 2018

At the risk of telling you something you already know... the StARS solution will not find a denser matrix by lowering lambda.min.ratio. That number just sets the smallest value of lambda to try.

Pretend for a moment that we can sample continuously along the lambda path between l_min and l_max: we compute a graphical model and estimate variability at each value of l. StARS will always select the same l_stars, as long as l_min <= l_stars <= l_max.

StARS selects lambda such that the average edge stability is 0.05 (in the SpiecEasi setting; 0.1 in the original StARS paper), a number which doesn't depend on the lower/upper bounds of lambda you happen to try.

Raising stars.thresh is the parameter adjustment that will get you a denser graph. You don't even need to rerun SpiecEasi/pulsar at that point: just look at the summary statistic vector and pick a different opt.ind based on the fixed, higher value.

[ as an aside: since we're not sampling continuously, we instead choose the l_stars associated with the graph with the largest variability that does not exceed the threshold ]
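To make the discrete selection rule concrete, here is a toy sketch in R (the stability values are invented; in practice they would come from the fitted pulsar object's summary statistic vector mentioned above):

```r
# Hypothetical average edge-stability values along a lambda path,
# ordered from l_max (sparse) to l_min (dense): stability grows
# as lambda shrinks.
stab <- c(0.00, 0.01, 0.03, 0.05, 0.09, 0.14, 0.22)

# StARS rule on a discrete path: take the index of the largest
# variability that does not exceed the threshold.
pick_opt <- function(stab, thresh) max(which(stab <= thresh))

pick_opt(stab, 0.05)  # index 4 at the default 0.05 threshold
pick_opt(stab, 0.10)  # index 5: a raised threshold selects a denser graph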

However, I'm not convinced that your network is all that sparse. For p=4000, a network with a sparsity of 96% still has (p*(p-1)/2)*(1-sparsity) = 319920 edges, though of course YMMV.
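The edge-count arithmetic above can be checked directly:

```r
p <- 4000
sparsity <- 0.96
possible_edges <- p * (p - 1) / 2       # 7998000 node pairs
possible_edges * (1 - sparsity)         # 319920 edges remain
```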

@genomewalker
Author

Hi Zach,
thank you very much for the insights! I was following the suggestions from here.
The graphs I got with a larger lambda.min.ratio (1e-2) were quite sparse, and I used to get this warning:

In batch.pulsar(data = X, fun = match.fun(estFun), fargs = args,  :
  Accurate lower bound could not be determined with the first 2 subsamples

In addition, the achieved stability was quite far from the 0.05 threshold. With 1e-3, the optimal value was indeed within the lambda path values of 1e-2. If I understood correctly, would it be correct to use a smaller lambda value than the optimal one from the vector? Then I would be able to compare the results from mb and glasso.
Many thanks!

@zdk123 zdk123 added this to the 0.3.4 milestone Aug 24, 2018
@zdk123 zdk123 modified the milestones: 0.3.4, 0.3.5 Mar 3, 2019