Model loading is too slow with onnxruntime-gpu #5957
Comments
sess = rt.InferenceSession(onnx_fn) --> this waits for 50-200 seconds on the first call. With the same model, the amount of time varies even across sequential tries. Sometimes I have to wait for 5 minutes, and that makes the debugging process not very enjoyable. Any help is appreciated.
The yolov5x model took only 2 seconds to load. I was always uninstalling onnxruntime before installing onnxruntime-gpu. If this is specified in the documents, I'm sorry guys :) If not, I have no idea whether this is an important issue to resolve or not. You decide, and feel free to close this issue. Keep up the good work.
My previous comment is wrong. Uninstalling and reinstalling has nothing to do with this issue. I tried on another computer with an RTX 3080 GPU and an Intel(R) Core(TM) i7-10700F CPU @ 2.90GHz, and the same issue occurred. I have no idea what made the other computer load models in 0-1 seconds.
I'm not sure whether this issue is related to onnxruntime or not. These are the steps that resolved the issue in my tries on two computers:
I encounter the same problem.
@korhun is the issue resolved for you now (based on your earlier comment)? We'll be upgrading to CUDA 11 in the next release (coming soon).
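One plausible reason the CUDA 11 upgrade matters here (hedged; the thread itself never confirms the root cause) is that a wheel built against an older CUDA has no prebuilt binaries for newer GPUs such as the RTX 3080, so every kernel is PTX JIT-compiled at session creation. As an interim workaround, NVIDIA's JIT compilation cache can be enlarged so the compiled kernels are reused across runs; `your_inference_script.py` below is a hypothetical script name:

```shell
# Enlarge NVIDIA's JIT compilation cache (the default is small) so kernels
# JIT-compiled on the first run are cached and reused on later runs.
# 2147483648 bytes = 2 GiB; CUDA_CACHE_MAXSIZE is a documented CUDA
# toolkit environment variable.
export CUDA_CACHE_MAXSIZE=2147483648
python your_inference_script.py   # the very first run is still slow
```

This only helps across process restarts; upgrading to a CUDA 11 build of onnxruntime-gpu removes the JIT step entirely for Ampere GPUs.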
@pranavsharma yes, the issue has been resolved on 2 different computers after trying the steps mentioned in my previous comment. Thanks for your interest. I'm looking forward to your next release; keep up the good work.
Thanks to the work of @radu-matei, @dkim, @dllu, I generated a binding for Linux that works with ONNX 1.7 and CUDA 11. This avoids a performance issue with CUDA: microsoft/onnxruntime#5957
Thank you for providing the solution. I would like to know how I should solve this if my computer has no Nvidia GPU (MacBook Pro)?
@SystemErrorWang there was no issue when working on CPU in any of my tries. I recommend you uninstall onnxruntime-gpu and install the latest version of onnxruntime.
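For a CPU-only machine, the switch suggested above is just a package swap; a minimal sketch with pip, assuming a standard pip environment:

```shell
# Remove the GPU build and install the CPU-only package.
# Only one of the two packages should be installed at a time.
pip uninstall -y onnxruntime-gpu
pip install onnxruntime
```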
Thank you korhun, I finally found the problem: I used onnx-simplifier and it caused this error; the raw onnx model works fine.
Using the CPU version, I have no problem. Loading onnx models via "InferenceSession" with onnxruntime-gpu takes >102 seconds for the first model. If we load more than one model, the others take no time at all; we only wait on the first call of "onnxruntime.InferenceSession(onnx_fn)". I have tried with yolov3, yolov3-tiny, yolov4, yolov5, and ssd_mobilenet.
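The first-load behavior described above can be reproduced with a small timing harness. This is a sketch, not code from the thread; "model.onnx" is a placeholder path, and the `timed`/`measure_load_times` helpers are hypothetical names:

```python
import time

def timed(label, fn):
    """Run fn(), print the elapsed wall-clock seconds, return (result, elapsed)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f} s")
    return result, elapsed

def measure_load_times(onnx_path):
    # Deferred import so the timing helper above stays stdlib-only.
    import onnxruntime as rt
    # The first session pays any one-time GPU initialization cost;
    # later sessions in the same process reuse it and return quickly.
    timed("first load", lambda: rt.InferenceSession(onnx_path))
    timed("second load", lambda: rt.InferenceSession(onnx_path))
```

Calling `measure_load_times("model.onnx")` with onnxruntime-gpu installed should show the asymmetry reported here: a long first load, then a near-instant second one.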
System information
Is this normal? Should I try other versions of CUDA or cuDNN?