
segfault in remora/model_util.py load_onnx_model() #241

Closed · benbfly opened this issue Jan 24, 2022 · 3 comments
benbfly commented Jan 24, 2022

Using:
Megalodon/2.4.1
guppy 6.0.1

I am trying to install and run megalodon 2.4.1 in GPU mode. It works fine with the old modbases model, but I am running into problems with Remora: it segfaults inside remora/model_util.py's load_onnx_model(). I'm guessing this is an incompatibility with either the CUDA libraries, or maybe related to the Pandas/numexpr warning at the top of the log (remora/model_util.py does use Pandas).


command:
megalodon ./megtemp.CPU16.RAM127000.1643019236/1-1 --output-directory ./megout.CPU16.RAM127000.1643019236/1-1 --guppy-params "--use_tcp 34487" --remora-modified-bases dna_r9.4.1_e8 fast 0.0.0 5hmc_5mc CG 0 --guppy-config dna_r9.4.1_450bps_fast.cfg --guppy-server-path /usr/local/hurcs/guppy/6.0.1/bin/guppy_basecall_server --outputs basecalls mappings mod_mappings mods per_read_mods --reference genome.fa --processes 16 --overwrite --devices 0 --suppress-progress-bars --suppress-queues-status


stderr:
[12:14:14] Loading guppy basecalling backend
[12:14:16] CRF models are not fully supported.
/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/pandas/compat/_optional.py:138: UserWarning: Pandas requires version '2.7.0' or newer of 'numexpr' (version '2.6.9' currently installed).
warnings.warn(msg, UserWarning)
Traceback (most recent call last):
  File "/usr/local/hurcs/megalodon/2.4.1/bin/megalodon", line 8, in <module>
    sys.exit(_main())
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/megalodon/__main__.py", line 754, in _main
    megalodon._main(args)
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/megalodon/megalodon.py", line 1807, in _main
    args.remora_modified_bases,
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/megalodon/backends.py", line 596, in __init__
    remora_model_spec=remora_model_spec,
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/megalodon/backends.py", line 1170, in pyguppy_load_settings
    remora_model_spec=remora_model_spec,
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/megalodon/backends.py", line 1130, in pyguppy_set_model_attributes
    quiet=True,
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/remora/model_util.py", line 474, in load_model
    return load_onnx_model(path, device, quiet=quiet)
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/remora/model_util.py", line 298, in load_onnx_model
    model_filename, providers=providers, provider_options=provider_options
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 335, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/hurcs/megalodon/2.4.1/lib/python3.7/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 368, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
RuntimeError: /onnxruntime_src/onnxruntime/core/platform/posix/env.cc:183 onnxruntime::{anonymous}::PosixThread::PosixThread(const char*, int, unsigned int (*)(int, Eigen::ThreadPoolInterface*), Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed, error code: 0 error msg:

/var/spool/slurmd/job1134068/slurm_script: line 7: 36960 Segmentation fault

benbfly changed the title from "segfault in" to "segfault in remora/model_util.py load_onnx_model()" on Jan 24, 2022
marcus1487 (Collaborator) commented
This seems quite similar to an onnx-related issue that another Remora user has encountered. Could you try the python commands in this onnx issue (microsoft/onnxruntime#10166) and report the results?
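The exact commands from the referenced onnxruntime issue are not reproduced in this thread. That issue concerns pthread_setaffinity_np failing when the process's allowed-CPU set is restricted (as a Slurm cgroup typically does), which matches the RuntimeError above. As a diagnostic sketch, not the referenced snippet, a stdlib-only check of the affinity mask looks like this:

```python
import os

# Compare the CPUs this process is allowed to run on (its affinity mask,
# which a Slurm cgroup typically restricts) against the machine's total
# CPU count. ONNX Runtime's default thread pool tries to pin one worker
# per CPU index, and pthread_setaffinity_np fails for indices outside
# the allowed set.
allowed = os.sched_getaffinity(0)  # Linux-only
total = os.cpu_count()
print(f"process may use {len(allowed)} of {total} CPUs: {sorted(allowed)}")
```

If the printed count is smaller than the total (e.g. a 16-CPU allocation on a larger node, as in the megalodon command above with --processes 16), that restriction is the likely trigger.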

benbfly (Author) commented Jan 24, 2022

Mine produces the same output and stalls:
/usr/local/hurcs/megalodon/2.4.1/bin/python3.7 snippet.py
2022-01-24 23:34:17.075188907 [I:onnxruntime:, inference_session.cc:273 operator()] Flush-to-zero and denormal-as-zero are off
2022-01-24 23:34:17.075230285 [I:onnxruntime:, inference_session.cc:280 ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true

Debian 10 (buster)

marcus1487 (Collaborator) commented
This issue should be resolved with the latest Remora release (0.1.2). Please re-open if this issue persists.
