--scratch_dir not working in v 1.4.1 #311

Closed
moritzbuck opened this issue Mar 12, 2021 · 7 comments · Fixed by #318
Labels
error Help required for a GTDB-Tk error. next version Upcoming feature/fix in staging branch.

Comments

@moritzbuck
Contributor

Hej, thanks once again for GTDB and GTDBtk, big fan.

I updated to version 1.4.1 and the --scratch_dir option no longer seems to work: everything is loaded into RAM, which then crashes my machine, as I only have 128 GB of RAM.

I downgraded to version 1.4.0 in the same environment and it works fine, so I don't think it's my env or machine, but I attached the settings anyhow!

regards and greetings from Sweden,

Moritz

Environment

  • [x] Installed via pip (include the output of pip list)
Package     Version
----------- -------------------
certifi     2020.12.5
DendroPy    4.5.2
gtdbtk      1.4.1
mkl-fft     1.3.0
mkl-random  1.2.0
mkl-service 2.3.0
numpy       1.19.5
pip         21.0.1
setuptools  49.6.0.post20210108
six         1.15.0
tqdm        4.46.1
wheel       0.36.2
  • [x] Using a conda environment (include the output of `conda list`)
 Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
blas                      2.3                    openblas    conda-forge
boost                     1.67.0           py36h3e44d54_0    conda-forge
boost-cpp                 1.67.0               h3a22d5f_0    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
certifi                   2020.12.5        py36h5fab9bb_1    conda-forge
dendropy                  4.5.2              pyh3252c3a_0    bioconda
fastani                   1.1                  h4ef8376_0    bioconda
fasttree                  2.1.10               h516909a_4    bioconda
gtdbtk                    1.4.0                    pypi_0    pypi
hmmer                     3.3.2                he1b5a44_0    bioconda
icu                       58.2              hf484d3e_1000    conda-forge
intel-openmp              2020.2                      254  
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
libblas                   3.9.0                3_openblas    conda-forge
libboost                  1.73.0              h3ff78a5_11  
libcblas                  3.9.0                3_openblas    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
libgfortran-ng            7.5.0               h14aa051_18    conda-forge
libgfortran4              7.5.0               h14aa051_18    conda-forge
liblapack                 3.9.0                3_openblas    conda-forge
liblapacke                3.9.0                3_openblas    conda-forge
libopenblas               0.3.12          pthreads_hb3c22a3_1    conda-forge
libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
llvm-openmp               11.0.1               h4bd325d_0    conda-forge
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
mkl                       2020.4             h726a3e6_304    conda-forge
mkl-service               2.3.0            py36h8c4c3a4_2    conda-forge
mkl_fft                   1.3.0            py36h92226af_1    conda-forge
mkl_random                1.2.0            py36h7c3b610_1    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
numpy                     1.19.0rc2                pypi_0    pypi
numpy-base                1.18.5           py36h2f8d375_0  
openssl                   1.1.1j               h7f98852_0    conda-forge
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
pplacer                   1.1.alpha19                   1    bioconda
prodigal                  2.6.3                h516909a_2    bioconda
py-boost                  1.73.0          py36ha9443f7_11  
python                    3.6.13          hffdb5ce_0_cpython    conda-forge
python_abi                3.6                     1_cp36m    conda-forge
readline                  8.0                  he28a2e2_2    conda-forge
setuptools                49.6.0           py36h5fab9bb_3    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sqlite                    3.34.0               h74cdb3f_0    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
tqdm                      4.46.1                   pypi_0    pypi
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.4.5                h9ceee32_0  

Server information

@moritzbuck moritzbuck added the error Help required for a GTDB-Tk error. label Mar 12, 2021
@arminlahm

arminlahm commented Mar 25, 2021

"Dirty" workaround: in the file pplacer.py, in the gtdbtk/external subdirectory, change lines 138 to 140 from

    if mmap_file:
        args.append('--mmap-file')
        args.append(mmap_file)

into

    # if mmap_file:
    args.append('--mmap-file')
    args.append('/path_to_your_scratch_dir/')

This worked for me (HP Z400, 48 GB memory, single Xeon processor, 4 cores, 8 hyperthreads). 100 GB of free space on an internal SSD was enough to classify 200-300 assemblies in about 3 hours (using the --cpus 8 flag).
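
For anyone wondering why the guarded version can fall back to RAM, here is a minimal sketch of the pattern, not GTDB-Tk's actual code. The function name `build_pplacer_args` and the base argument list are hypothetical; only the three guarded lines mirror the snippet quoted above. The point is that if `mmap_file` arrives as `None` or an empty string, the `--mmap-file` flag is silently left out. That would be consistent with the symptom reported here, although the thread does not confirm the root cause.

```python
from typing import List, Optional


def build_pplacer_args(base_args: List[str],
                       mmap_file: Optional[str] = None) -> List[str]:
    """Return a copy of base_args, adding --mmap-file only when a path is given.

    Hypothetical helper for illustration; not part of GTDB-Tk.
    """
    args = list(base_args)  # copy so the caller's list is not mutated
    if mmap_file:  # None and '' are both falsy, so the flag is skipped
        args.append('--mmap-file')
        args.append(mmap_file)
    return args


# Path supplied: the flag and its value are appended.
with_scratch = build_pplacer_args(['pplacer'], '/path_to_your_scratch_dir/')

# Path lost somewhere upstream: no flag and no error either.
without_scratch = build_pplacer_args(['pplacer'], None)
```

The workaround above removes the guard and hard-codes the path, so the flag is always passed regardless of what the caller supplied.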

@moritzbuck
Contributor Author

Lovely hack! Thanks :)
So it's just a badly handled argv ... damn I hate CLI management.... reminds me I have to fix some of my stuff..

@pchaumeil
Collaborator

Hello,
Thank you for the feedback.
We have spotted the issue, and --scratch_dir should now work with classify_wf.
The fix will be available in the new version of GTDB-Tk, coming in a few days.

Regards,
Pierre

@jsgounot

jsgounot commented Apr 5, 2021

Hi, did you upload the new version to conda?

@arminlahm

arminlahm commented Apr 6, 2021 via email

@aaronmussig aaronmussig linked a pull request Apr 22, 2021 that will close this issue
@aaronmussig aaronmussig added the next version Upcoming feature/fix in staging branch. label Apr 22, 2021
@arminlahm

arminlahm commented Apr 26, 2021 via email

@moritzbuck
Contributor Author

Thanks for the fix, by the way! And sorry for the late reply!

5 participants