It takes way too long to run inference #1876
Unanswered
boadecea25 asked this question in Help!

I have several videos that are 3-4 hrs long, and each video takes around 9 hrs to run inference on. How do I decrease the time it takes? I am using a GPU to run inference.

Replies: 6 comments 2 replies
-
Hi @boadecea25,
Can you share some more info for context?
1. What is your model configuration?
2. Are you running inference from the GUI or CLI?
3. Do you have tracking enabled? Which settings?
4. What GPU are you using?
-
Hello and thank you for the response,
1. My model configuration is bottom-up.
2. I am running inference from the CLI on a remote cluster.
3. Tracking is enabled: simple tracking with IOU similarity and Hungarian matching.
4. My GPU is an A100.
I read that some people have reached triple-digit fps at inference, while I'm stuck at 5-10 fps.
Thanks!
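For readers following along: settings like these map onto `sleap-track` options roughly as follows. A minimal sketch, with a hypothetical video path and model directory:

```bash
# Hypothetical paths; the flags correspond to the settings described above:
# simple tracker, IOU similarity, Hungarian matching.
sleap-track video.mp4 \
    --model models/my_bottomup_model \
    --tracking.tracker simple \
    --tracking.similarity iou \
    --tracking.match hungarian \
    -o video.predictions.slp
```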
-
1. I have attached the training_config.json to this email.
2. The exact command:

```bash
python run_sleap_multigpu.py \
    --video_dir /scratch/users/ycwei/videos \
    --model /scratch/users/ycwei/models/240710_122929.multi_instance.n=270 \
    --output_folder /home/groups/nirao/sleapoutputs \
    --max_instances 2 \
    --num_workers 4
```

The num_workers here is the number of GPUs I am using; my Python script assigns one video to each GPU.
3. When I used the GUI, I would get 5-10 fps, which meant one video might take 10-12 hrs. The command line run above, however, has been taking even longer, and I am not sure if there is an error. How do I find the inference fps when running from the command line?
Thanks!
Varsha
On Mon, Jul 22, 2024 at 10:05 PM Talmo Pereira wrote:
> 1. Do you mind sharing the model configuration (training_config.json) or a screenshot from the GUI so we can see how the model itself is configured?
> 2. Can you share the exact CLI command?
> 3. Tracking is unfortunately not very parallelizable since it depends on previous frames, so ultimately this might be your bottleneck. Let's continue investigating with the above Q's, though.
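The wrapper script itself is not shown in the thread. A minimal sketch of the one-video-per-GPU pattern it describes, assuming it shells out to `sleap-track` and that four GPUs are visible, might look like:

```bash
#!/bin/bash
# Sketch only: round-robin one sleap-track process per GPU.
# Paths mirror the wrapper call above; the actual script is not shown in the thread.
MODEL=/scratch/users/ycwei/models/240710_122929.multi_instance.n=270
OUTDIR=/home/groups/nirao/sleapoutputs
VIDEOS=(/scratch/users/ycwei/videos/*.mp4)
NUM_GPUS=4

for i in "${!VIDEOS[@]}"; do
    gpu=$((i % NUM_GPUS))
    # Pin each process to a single GPU via CUDA_VISIBLE_DEVICES.
    # Note: with more videos than GPUs this oversubscribes; a real job
    # queue (e.g., the cluster scheduler) would be needed for larger batches.
    CUDA_VISIBLE_DEVICES=$gpu sleap-track "${VIDEOS[$i]}" \
        --model "$MODEL" \
        -o "$OUTDIR/$(basename "${VIDEOS[$i]}" .mp4).predictions.slp" &
done
wait  # Block until all background inference jobs finish.
```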
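On the question of seeing the inference rate from the command line: `sleap-track` has a `--verbosity` option; `rich` prints a progress bar with a running FPS estimate, and `json` emits progress records that can be logged and parsed. For example (hypothetical video filename, model path as above):

```bash
# The rich progress bar reports frames/sec as inference runs.
sleap-track /scratch/users/ycwei/videos/video1.mp4 \
    --model /scratch/users/ycwei/models/240710_122929.multi_instance.n=270 \
    --verbosity rich \
    -o predictions.slp
```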
-
Sorry for the error in the previous email; I have attached the training_config.json to this email.
-
Sure, please check if it is accessible now.
Thanks,
Varsha
On Tue, Jul 23, 2024 at 10:47 AM Talmo Pereira wrote:
> Hey @boadecea25,
> I don't think the attachments are coming through on GitHub. Maybe you could put them in a Gist (https://gist.github.com) or Google Drive?
> Thanks!
> Talmo
-
I just tried running the same length video without tracking and even increased the batch size to 64. The inference fps still stagnates at 11 in the GUI.
Thanks,
Varsha
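If the rate is flat with and without tracking, a quick sanity check is whether the GPU is actually busy during inference; persistently low utilization would point at a CPU or I/O bottleneck (e.g., video decoding) rather than the model itself:

```bash
# Refresh GPU stats every second while inference runs in another shell;
# near-0% GPU utilization suggests the GPU is waiting on the CPU/disk.
watch -n 1 nvidia-smi
```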
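Since the tracker is the serial part, one pattern worth noting (a sketch, assuming the SLEAP CLI's tracking-only mode works as documented, where passing a predictions file as the input runs the tracker without re-running the models) is to decouple the two passes:

```bash
# Pass 1: pure inference, no tracking (parallelizes freely across videos/GPUs).
sleap-track video.mp4 --model models/my_bottomup_model -o untracked.slp

# Pass 2: tracking-only, applied to the saved predictions file.
sleap-track untracked.slp \
    --tracking.tracker simple \
    --tracking.similarity iou \
    --tracking.match hungarian \
    -o tracked.slp
```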