Test example takes forever #33

Open
agarrubio opened this issue Aug 31, 2024 · 2 comments

@agarrubio

agarrubio commented Aug 31, 2024

Hi Luwei!

I am running the test example, apparently without errors, but this single protein is taking too long (over one hour). Also, although I am using a dedicated GPU (RTX 4090), its utilization is very low (around 3%).
What might be the cause, and how can I improve it?
Many thanks!

(dynamicbind) ada% python run_single_protein_inference.py data/origin-1qg8.pdb data/1qg8_input.csv --savings_per_complex 40 --inference_steps 20 --header test --device 1 --python /home/alejandro/miniconda3/envs/dynamicbind/bin/python --relax_python /home/alejandro/miniconda3/envs/relax/bin/python
INFO:root:run_single_protein_inference.py data/origin-1qg8.pdb data/1qg8_input.csv --savings_per_complex 40 --inference_steps 20 --header test --device 1 --python /home/alejandro/miniconda3/envs/dynamicbind/bin/python --relax_python /home/alejandro/miniconda3/envs/relax/bin/python
2024_08_31_12_54
--------------------------------

/home/alejandro/DynamicBind/run_single_protein_inference.py /home/alejandro/DynamicBind
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 57.79it/s]
hub dir /home/alejandro/DynamicBind/esm_models
Read data/prepared_for_esm_test.fasta with 1 sequences
Processing 1 of 1 batches (1 sequences)
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 201/201 [01:04<00:00,  3.13it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 201/201 [01:14<00:00,  2.69it/s]
Reading molecules and generating local structures with RDKit
4it [00:00, 14.56it/s]
Reading language model embeddings.
Generating graphs for ligands and proteins
loading complexes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00,  6.53it/s]
loading data from memory:  data/cache_torsion/limit0_INDEX_maxLigSizeNone_H0_recRad15.0_recMax24_esmEmbeddings3297976096/heterographs.pkl
Number of complexes:  4
radius protein: mean 27.280357360839844, std 0.0, max 27.280357360839844
radius molecule: mean 6.446958541870117, std 0.2975179851055145, max 6.782521724700928
distance protein-mol: mean 5.086441845492118e-08, std 2.719834313325009e-08, max 9.765030029029731e-08
rmsd matching: mean 0.0, std 0.0, max 0
common t schedule [1.   0.95 0.9  0.85 0.8  0.75 0.7  0.65 0.6  0.55 0.5  0.45 0.4  0.35
 0.3  0.25 0.2  0.15 0.1  0.05]
Size of test dataset:  4
1it [18:09, 1089.33s/it]
@patjiang

Hello agarrubio,

I may not be the owner of the repository, but I have a few possibilities.
First, I'm not sure how compatible your hardware currently is with PyTorch. The underutilization could be related to this: https://discuss.pytorch.org/t/rtx4090-issue-with-pytorch/188653/6

Second, if your hardware drivers are all compatible, then you should check the following in a terminal:
source activate dynamicbind
python
import torch
torch.cuda.is_available()
torch.cuda.device_count()
torch.cuda.current_device()
torch.cuda.get_device_name(0)
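
The same checks can also be collected into one small script, so the whole result can be pasted back into the issue at once. This is only a minimal sketch and not part of DynamicBind: `cuda_report` is a hypothetical helper name, and passing index 1 assumes that the `--device 1` flag in the command above maps directly to a CUDA device index, which this thread does not confirm.

```python
def cuda_report(device_index=0):
    """Collect basic CUDA diagnostics from PyTorch into a dict (hypothetical helper)."""
    report = {"torch_installed": False, "cuda_available": False}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version PyTorch was built against; None on CPU-only builds.
    report["built_for_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        count = torch.cuda.device_count()
        report["device_count"] = count
        report["device_names"] = [torch.cuda.get_device_name(i) for i in range(count)]
        # Assumption: the requested index is meant as a CUDA device index.
        report["index_valid"] = 0 <= device_index < count
    return report


if __name__ == "__main__":
    for key, value in cuda_report(device_index=1).items():
        print(f"{key}: {value}")
```

If `cuda_available` comes back as False, or `index_valid` is False for the index you pass on the command line, that would be worth sorting out before looking at anything else.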

Finally, if all of the above checks pass, then I would guess that this is slow because of memory issues. I have tried to run dynamicbind on smaller GPUs (a30, a100_40), and I've noticed that run time decreases significantly as GPU memory increases.
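
To see how much GPU memory is actually available before a run, something like the following could be used. It is a rough sketch under the assumption that PyTorch can see the device; `gpu_memory_gib` is a hypothetical helper name, and it only reports free and total memory, it does not show whether memory is really the bottleneck.

```python
def gpu_memory_gib(device_index=0):
    """Return (free, total) memory in GiB for one CUDA device, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    if not 0 <= device_index < torch.cuda.device_count():
        return None
    # mem_get_info returns free and total device memory in bytes.
    free_bytes, total_bytes = torch.cuda.mem_get_info(device_index)
    gib = 1024 ** 3
    return free_bytes / gib, total_bytes / gib


if __name__ == "__main__":
    result = gpu_memory_gib(0)
    if result is None:
        print("No usable CUDA device found")
    else:
        print(f"free: {result[0]:.1f} GiB, total: {result[1]:.1f} GiB")
```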

I hope this helps,

Thanks,
Patrick Jiang

@agarrubio
Author

Thanks, Patrick Jiang!
After the checks you suggested failed, I restarted the computer, and now dynamicbind runs much faster. The most likely explanation is an automatic driver update that was not followed by a restart. Somehow, other torch-dependent programs (or environments) were not affected.
Thanks again for your help.
Alejandro
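
For anyone who lands here with the same symptom: a driver that was updated without a reboot often shows up as a "Driver/library version mismatch" error from `nvidia-smi`. The sketch below checks for that. It is an assumption that this was the failure mode in this thread (the output of `nvidia-smi` was never posted), and `check_nvidia_smi` is a hypothetical helper name.

```python
import shutil
import subprocess


def check_nvidia_smi(timeout_s=10):
    """Run nvidia-smi and return (status, detail).

    status is one of 'ok', 'mismatch', 'missing' or 'error'.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return "missing", "nvidia-smi not found on PATH"
    try:
        proc = subprocess.run([exe], capture_output=True, text=True, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError) as exc:
        return "error", str(exc)
    output = (proc.stdout + proc.stderr).strip()
    # Typical message when the kernel module and user-space library disagree.
    if "version mismatch" in output.lower():
        return "mismatch", output
    return ("ok" if proc.returncode == 0 else "error"), output


if __name__ == "__main__":
    status, detail = check_nvidia_smi()
    print(status)
    if status != "ok":
        print(detail)
```

A `mismatch` result would point to a restart (or reloading the NVIDIA kernel modules) as the first thing to try.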
