
Issue with inference_demo.sh: NoneType Error During MET Parsing in CCD CIF File #43

Oaklight opened this issue Dec 28, 2024 · 10 comments

Description

When running inference_demo.sh, the script fails with an AttributeError: 'NoneType' object has no attribute 'res_id' after the warning get_component_atom_array() can not parse MET. Despite verifying that "MET" exists in the local CCD CIF file (components.v20240608.cif), the error persists. This suggests a potential issue with how the MET residue is being processed in the json_parser.py or infer_data_pipeline.py scripts.

Original terminal messages:

$ CUDA_AVAILABLE_DEVICES=0 ./inference_demo.sh 
Try to find the ccd cache data in the code directory for inference.
[2024-12-27 19:41:10,524] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
 [WARNING]  NVIDIA Inference is only supported on Ampere and newer architectures
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
 [WARNING]  using untested triton version (3.0.0), only 1.0.0 is known to be compatible
2024-12-27 19:41:29,332 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Distributed environment: world size: 1, global rank: 0, local rank: 0
2024-12-27 19:41:29,332 [/homes/pding/Protenix/runner/inference.py:63] INFO root: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
2024-12-27 19:41:29,333 [/homes/pding/Protenix/runner/inference.py:87] INFO root: Finished init ENV.
train scheduler 16.0
inference scheduler 16.0
Diffusion Module has 16.0
2024-12-27 19:41:34,850 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Loading from /homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/./release_data/checkpoint/model_v0.2.0.pt, strict: False
2024-12-27 19:41:36,706 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Sampled key: module.input_embedder.atom_attention_encoder.linear_no_bias_f.weight
2024-12-27 19:41:36,845 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Finish loading checkpoint.
2024-12-27 19:41:36,858 [/homes/pding/Protenix/runner/inference.py:226] INFO __main__: Loading data from
./examples/example.json
2024-12-27 19:41:37,447 [/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py:209] INFO protenix.data.infer_data_pipeline: Featurizing 7r6r...
2024-12-27 19:41:40,001 [/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/ccd.py:90] WARNING protenix.data.ccd: Warning: get_component_atom_array() can not parse MET
2024-12-27 19:41:40,017 [/homes/pding/Protenix/runner/inference.py:237] INFO __main__: 'NoneType' object has no attribute 'res_id':
Traceback (most recent call last):
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py", line 211, in __getitem__
    data, atom_array, _ = self.process_one(
                          ^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py", line 101, in process_one
    sample2feat = SampleDictToFeatures(
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/json_to_feature.py", line 34, in __init__
    self.input_dict = add_entity_atom_array(single_sample_dict)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/json_parser.py", line 589, in add_entity_atom_array
    atom_info = build_polymer(entity_info)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/json_parser.py", line 344, in build_polymer
    chain_array = _build_polymer_atom_array(ccd_seqs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/json_parser.py", line 292, in _build_polymer_atom_array
    residue.res_id[:] = res_id + 1
    ^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'res_id'

Steps to Reproduce

  1. Run CUDA_AVAILABLE_DEVICES=0 ./inference_demo.sh.
  2. Observe the error traceback pointing to the NoneType issue during MET parsing.

Expected Behavior

The script should successfully parse the MET residue from the CCD CIF file and proceed with inference.

Actual Behavior

The script fails with a NoneType error, indicating a problem with parsing the MET residue.

It's strange that it breaks at this line:

Warning: get_component_atom_array() can not parse MET
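
From the traceback, my reading is that this warning path makes get_component_atom_array() return None, and the polymer builder then dereferences it without a check. A minimal sketch of that reading (names and arguments taken from this thread; the exact control flow is my assumption, not the actual Protenix source):

from protenix.data.ccd import get_component_atom_array

# On a healthy install this should print an AtomArray for MET; when the warning
# above fires, None apparently comes back instead.
residue = get_component_atom_array("MET", keep_leaving_atoms=True, keep_hydrogens=False)
print(residue)

# json_parser._build_polymer_atom_array() then (per the traceback) does roughly
#     residue.res_id[:] = res_id + 1
# which explains the AttributeError when residue is None.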

I verified the local CCD CIF file in Python interactively:

>>> import os
>>> COMPONENTS_FILE = "/homes/pding/Protenix/release_data/ccd_cache/components.v20240608.cif"
>>> import biotite.structure.io.pdbx as pdbx
>>> ccd_cif = pdbx.CIFFile.read(COMPONENTS_FILE)
>>> "MET" in ccd_cif
True
>>> 
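
In hindsight, "MET" in ccd_cif only confirms that the data block exists in the file. A deeper check (assuming a recent biotite where pdbx.get_component is available) would be to build the component itself and see whether that is the step that fails:

>>> # Hypothetical follow-up check: build the MET component into an AtomArray
>>> met = pdbx.get_component(ccd_cif, data_block="MET")
>>> met.array_length()  # number of atoms parsed for MET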

Please enlighten me on what may have caused this and how I might fix it. Thanks!

@cloverzizi (Contributor) commented:

Hi @Oaklight ,
I couldn't reproduce this issue. Based on the error message, it seems there may be a problem with the integrity of the ccd_cache file. You can try creating this script in the code directory and running it:

from protenix.data.ccd import get_component_atom_array

res = get_component_atom_array("MET", keep_leaving_atoms=True, keep_hydrogens=False)
print(res)

After running it, you should expect the following output:

HET         0  MET N      N        -1.816    0.142   -1.166
HET         0  MET CA     C        -0.392    0.499   -1.214
HET         0  MET C      C         0.206    0.002   -2.504
HET         0  MET O      O        -0.236   -0.989   -3.033
HET         0  MET CB     C         0.334   -0.145   -0.032
HET         0  MET CG     C        -0.273    0.359    1.277
HET         0  MET SD     S         0.589   -0.405    2.678
HET         0  MET CE     C        -0.314    0.353    4.056
HET         0  MET OXT    O         1.232    0.661   -3.066

If it doesn't print successfully, try renaming the /homes/pding/Protenix/release_data folder to something else and running bash inference_demo.sh to re-download the release_data.
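
As an extra sanity check on file integrity (just a sketch; the ccd_cache path is taken from your logs), you could also compare the on-disk file sizes against a fresh download, since a partially downloaded file can still be readable in part:

import os

cache_dir = "/homes/pding/Protenix/release_data/ccd_cache"  # path from your logs
for name in sorted(os.listdir(cache_dir)):
    path = os.path.join(cache_dir, name)
    print(f"{os.path.getsize(path):>12}  {name}")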

@Oaklight (Author) commented Dec 28, 2024:

I used the exact script you provided and got the following result, which looks the same as yours:

$ python check.py 
Try to find the ccd cache data in the code directory for inference.
HET         0  MET N      N        -1.816    0.142   -1.166
HET         0  MET CA     C        -0.392    0.499   -1.214
HET         0  MET C      C         0.206    0.002   -2.504
HET         0  MET O      O        -0.236   -0.989   -3.033
HET         0  MET CB     C         0.334   -0.145   -0.032
HET         0  MET CG     C        -0.273    0.359    1.277
HET         0  MET SD     S         0.589   -0.405    2.678
HET         0  MET CE     C        -0.314    0.353    4.056
HET         0  MET OXT    O         1.232    0.661   -3.066

Then I immediately reran inference_demo.sh:

$ CUDA_VISIBLE_DEVICES=0 ./inference_demo.sh 
Try to find the ccd cache data in the code directory for inference.
[2024-12-28 15:52:44,121] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
 [WARNING]  NVIDIA Inference is only supported on Ampere and newer architectures
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
 [WARNING]  using untested triton version (3.0.0), only 1.0.0 is known to be compatible
2024-12-28 15:52:50,517 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Distributed environment: world size: 1, global rank: 0, local rank: 0
2024-12-28 15:52:50,517 [/homes/pding/Protenix/runner/inference.py:63] INFO root: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
2024-12-28 15:52:50,517 [/homes/pding/Protenix/runner/inference.py:87] INFO root: Finished init ENV.
train scheduler 16.0
inference scheduler 16.0
Diffusion Module has 16.0
2024-12-28 15:52:56,437 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Loading from /homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/./release_data/checkpoint/model_v0.2.0.pt, strict: False
2024-12-28 15:52:58,274 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Sampled key: module.input_embedder.atom_attention_encoder.linear_no_bias_f.weight
2024-12-28 15:52:58,413 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Finish loading checkpoint.
2024-12-28 15:52:58,426 [/homes/pding/Protenix/runner/inference.py:226] INFO __main__: Loading data from
./examples/example.json
2024-12-28 15:52:58,993 [/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py:209] INFO protenix.data.infer_data_pipeline: Featurizing 7r6r...
2024-12-28 15:53:01,030 [/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/ccd.py:90] WARNING protenix.data.ccd: Warning: get_component_atom_array() can not parse MET
2024-12-28 15:53:01,039 [/homes/pding/Protenix/runner/inference.py:237] INFO __main__: 'NoneType' object has no attribute 'res_id':
Traceback (most recent call last):
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py", line 211, in __getitem__
    data, atom_array, _ = self.process_one(
                          ^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py", line 101, in process_one
    sample2feat = SampleDictToFeatures(
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/json_to_feature.py", line 34, in __init__
    self.input_dict = add_entity_atom_array(single_sample_dict)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/json_parser.py", line 589, in add_entity_atom_array
    atom_info = build_polymer(entity_info)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/json_parser.py", line 344, in build_polymer
    chain_array = _build_polymer_atom_array(ccd_seqs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/json_parser.py", line 292, in _build_polymer_atom_array
    residue.res_id[:] = res_id + 1
    ^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'res_id'

I got the same error message, @cloverzizi.

@cloverzizi (Contributor) commented:

Hi @Oaklight
First, ensure that your code is up to date with the latest version of the "main" branch. Then, either delete or rename the release_data folder in the Protenix code directory (e.g., rename it to release_data_failed). This will prompt the system to automatically re-download the release_data folder when you rerun inference_demo.sh. This approach might fix issues caused by previously downloaded data.

@Oaklight (Author) commented Jan 4, 2025:

An update on this, @cloverzizi:

The machine I reported the above error from has V100 GPUs. After switching to a machine with A100 GPUs and rerunning the same process, I encountered an error about not finding the release data at:

/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/release_data/

I resolved this by creating a soft link to the release_data in the cloned Protenix git repo. After rerunning the code, it worked successfully.
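
For reference, the soft link amounts to the following (a sketch in Python using the paths from my setup above; the equivalent ln -s works just as well):

import os

# Point the site-packages location at the release_data inside the cloned repo.
src = "/homes/pding/Protenix/release_data"
dst = "/homes/pding/miniforge3/envs/esm/lib/python3.12/site-packages/release_data"
if not os.path.exists(dst):
    os.symlink(src, dst)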

Here is a copy of the running log from the A100 machine:

$ CUDA_VISIBLE_DEVICES=7 bash inference_demo.sh 
Try to find the ccd cache data in the code directory for inference.
[2025-01-04 03:40:45,776] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
 [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
 [WARNING]  using untested triton version (3.0.0), only 1.0.0 is known to be compatible
2025-01-04 03:40:49,365 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Distributed environment: world size: 1, global rank: 0, local rank: 0
2025-01-04 03:40:49,365 [/homes/pding/Protenix/runner/inference.py:63] INFO root: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [7]
2025-01-04 03:40:49,366 [/homes/pding/Protenix/runner/inference.py:87] INFO root: Finished init ENV.
train scheduler 16.0
inference scheduler 16.0
Diffusion Module has 16.0
2025-01-04 03:40:54,121 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Loading from /rbstor/pding/miniforge3/envs/esm/lib/python3.12/site-packages/./release_data/checkpoint/model_v0.2.0.pt, strict: False
2025-01-04 03:40:55,806 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Sampled key: module.input_embedder.atom_attention_encoder.linear_no_bias_f.weight
2025-01-04 03:40:55,943 [/homes/pding/Protenix/runner/inference.py:153] INFO __main__: Finish loading checkpoint.
2025-01-04 03:40:55,957 [/homes/pding/Protenix/runner/inference.py:226] INFO __main__: Loading data from
./examples/example.json
2025-01-04 03:40:56,676 [/rbstor/pding/miniforge3/envs/esm/lib/python3.12/site-packages/protenix/data/infer_data_pipeline.py:209] INFO protenix.data.infer_data_pipeline: Featurizing 7r6r...
2025-01-04 03:41:04,797 [/homes/pding/Protenix/runner/inference.py:246] INFO __main__: [Rank 0 (1/1)] 7r6r: N_asym 1, N_token 203, N_atom 1666, N_msa 363
2025-01-04 03:41:45,598 [/homes/pding/Protenix/runner/inference.py:265] INFO __main__: [Rank 0] 7r6r succeeded.
Results saved to ./output

Repeating the same process on the V100 machine (soft linking release_data from the Protenix git repo into site-packages/release_data) and rerunning the code still resulted in the same 'NoneType' object has no attribute 'res_id' error.

This suggests the issue may be rooted in GPU-architecture-sensitive code.

@cloverzizi (Contributor) commented:

Hi @Oaklight ,
I still suggest removing the release_data from your machine with V100 GPUs, both in the site-packages directory and within the Protenix git repository. Then, re-run the inference_demo.sh script to download the data afresh. This might resolve the issue.

@Oaklight (Author) commented Jan 6, 2025:

I forgot to mention in my last reply, but I did what you suggested on the V100 machine. I even tried re-cloning the repo and re-downloading release_data via inference_demo.sh. It just didn't work. That's what motivated me to try another machine. 🙂

@cloverzizi (Contributor) commented:

Hi @Oaklight , it shouldn't be related to the GPU type, as this processing step doesn't use the GPU yet. Please check the absolute path /af3-dev/release_data/ to see if a release_data file is present there. If it exists, try deleting it and retrying (older versions of Protenix placed files there).

@Oaklight (Author) commented Jan 7, 2025:

My account doesn't have sudo privileges, so there is no /af3-dev folder on either machine. Any other possible reason? I was using v0.3.5 when I posted the issue.

@cloverzizi (Contributor) commented:

Hi @Oaklight ,
Recently, we released Protenix version 0.4.0, which includes a script for manually regenerating the CCD cache. You can update the CCD files by running:

python3 scripts/gen_ccd_cache.py -n [num_cpu]

This might help resolve the issue.

@Oaklight (Author) commented:

I appreciate this effort! I will check that out soon. Meanwhile, I built a protein score inference server that enables RESTful queries for a few tasks; Protenix is one of the supported models. It's publicly available here: https://github.com/Oaklight/protein-score-server
Feedback and issue posts are welcome!
