This issue was moved to a discussion.

cannot submit >=1 sge job #198

Closed
nick-youngblut opened this issue Jun 15, 2020 · 2 comments

@nick-youngblut

I've tried messing around with n_jobs, job_size, and workers, but no matter what I try, Q() always just submits one cluster job instead of multiple. An example:

fx = function(x, y) x * 2 + y
tmpl = list(job_mem = '8G', log_file = '/ebio/abt3/nyoungblut/tmp/clustermq.log')
Q(fx, x=1:20, const=list(y=10), n_jobs=4, job_size=1,
  workers = clustermq::workers(4), template=tmpl)

My SGE template:

#!/bin/bash
#$ -N {{ job_name }}                    # job name
#$ -pe parallel {{ cores | 1 }}         # job threads
#$ -l h_rt={{ job_time | 00:59:00 }}    # job time
#$ -l h_vmem={{ job_mem | 7G }}         # job memory
#$ -j y                                 # combine stdout/error in one file
#$ -o {{ log_file | /dev/null }}        # output log file
#$ -cwd                                 # use pwd as work dir
#$ -V                                   # use environment variable

. ~/.bashrc
conda activate {{ conda | py3 }}

export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export MKL_NUM_THREADS=1

#ulimit -v $(( 1024 * {{ memory | 4096 }} ))
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'

SessionInfo:

R version 4.0.1 (2020-06-06)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS/LAPACK: /ebio/abt3_projects/Georg_animal_feces/envs/sandbox/lib/libopenblasp-r0.3.9.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] clustermq_0.8.9

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6      prettyunits_1.1.1 digest_0.6.25     crayon_1.3.4     
 [5] IRdisplay_0.7.0   repr_1.1.0        R6_2.4.1          jsonlite_1.6.1   
 [9] evaluate_0.14     pillar_1.4.4      progress_1.2.2    rlang_0.4.6      
[13] uuid_0.1-4        vctrs_0.3.1       IRkernel_1.1      tools_4.0.1      
[17] hms_0.5.3         parallel_4.0.1    compiler_4.0.1    pkgconfig_2.0.3  
[21] base64enc_0.1-3   rzmq_0.9.7        htmltools_0.4.0   pbdZMQ_0.3-3  

btw, does anyone know if there's a way to use wildcards for the log file names for SGE? I'd like each job to write a separate log file.

@mschubert
Owner

It looks to me like the "array job" line is missing in your template:

#$ -t 1-{{ n_jobs }} # submit jobs as array

See here for details: https://mschubert.github.io/clustermq/articles/userguide.html#scheduler-templates

You can split your log files per worker using \$TASK_ID in the file name.
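
For reference, here is a sketch of how the header of the template from the original post might look with the array line added. Only the `-t` line is new; the other directives are copied unchanged from the post. The log file name in the comment is a hypothetical example, not something given in this thread.

```sh
#!/bin/bash
#$ -N {{ job_name }}                    # job name
#$ -t 1-{{ n_jobs }}                    # submit jobs as array (the missing line)
#$ -pe parallel {{ cores | 1 }}         # job threads
#$ -l h_rt={{ job_time | 00:59:00 }}    # job time
#$ -l h_vmem={{ job_mem | 7G }}         # job memory
#$ -j y                                 # combine stdout/error in one file
#$ -o {{ log_file | /dev/null }}        # output log file
# Assumption: a log_file value such as clustermq.$TASK_ID.log (hypothetical
# name) lets SGE substitute each array task's ID, giving one log per worker.
#$ -cwd                                 # use pwd as work dir
#$ -V                                   # use environment variable
```

The rest of the script, from `. ~/.bashrc` down to the `R ... clustermq:::worker(...)` line, stays as it was. With `n_jobs=4`, the `-t` line should expand to `1-4`, so a single submission creates four array tasks, each starting one worker.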

@nick-youngblut
Author

Thanks for answering my simple questions!

Repository owner locked and limited conversation to collaborators Mar 29, 2021
