I'm encountering an issue when using the Docker example from the documentation to submit tasks to an HPC cluster managed by SLURM. During execution, the S3 download step on the HPC finished almost immediately, much faster than expected. The run then reports that res_tmpl failed, and the downloaded file is incomplete.
Additional details
ASLPrep version: latest
Docker version: Enroot 3.4.1
What were you trying to do?
I tried different Enroot settings on SLURM, but none of them worked. I then ran the Docker example in my local environment and compared the logs, and found that the S3 download step takes a noticeably different amount of time in the two runs.
Do compute nodes on your HPC have internet access?
Thank you for your reply.
The HPC administrator told me that the HPC has a network connection, but that it may be unstable. Comparing the logs, the time between the two log lines around the download is different on the HPC than in a normal local run. However, I'm not sure whether an error message is printed when the download fails, and I've noticed that the subsequent graph_flow is still built normally. If this is confirmed to be a network problem, I will discuss it with the administrator again.
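One way to narrow down whether this is a network problem would be a small reachability check run from inside a job on a compute node. The following is only a minimal sketch, assuming Python 3 is available on the node; `check_reachable` is a hypothetical helper written for illustration and is not part of ASLPrep, and the host to test has to be supplied by whoever runs it.

```python
import socket
import time


def check_reachable(host, port=443, timeout=5.0):
    """Try to open a TCP connection to host:port.

    Returns a tuple (ok, seconds_elapsed). This is an illustrative
    helper, not an ASLPrep function.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the host name and connects,
        # so DNS failures are caught here as well.
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        # Covers DNS errors, refused connections, and timeouts.
        ok = False
    return ok, time.monotonic() - start
```

Calling `check_reachable(host)` with the host that the download step contacts, once on the login node and once inside a SLURM job, would show whether the compute node can open the connection at all and how long the attempt takes. A successful TCP connection does not prove the full download works, so this only rules out the most basic failure.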
Reproducing the bug
log_exp.txt