I would like to run the YAMNet model on all of our audio files.
Greg made an initial attempt here: https://github.com/greg-landing/yamnet
I was told that there were a few problems:

- The model itself may be limited to working at batch size 1 only, which makes it hard to increase throughput (though I'm not sure I understand this part correctly, based on what others have said).
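For context on the batch-size point, here is a minimal sketch of single-clip inference with the TF Hub release of YAMNet. The hub URL and the 16 kHz mono float32 input contract come from the published model; the synthetic waveform is just a stand-in for a decoded MP3.

```python
import numpy as np
import tensorflow_hub as hub

# Load the published YAMNet model from TF Hub.
model = hub.load("https://tfhub.dev/google/yamnet/1")

# YAMNet takes a single 1-D float32 waveform at 16 kHz in [-1.0, 1.0];
# there is no batch dimension, so each call scores exactly one clip.
waveform = np.zeros(3 * 16000, dtype=np.float32)  # 3 s of silence as a stand-in

# Outputs: per-frame class scores, per-frame embeddings, and the
# log-mel spectrogram the scores were computed from.
scores, embeddings, log_mel = model(waveform)
print(scores.shape)  # (num_frames, 521) -- one score per AudioSet class
```

Since the signature accepts one unbatched waveform at a time, that presumably is the batch-size-1 limitation above; throughput would have to come from parallelism across clips rather than batching within a call.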
If we can get high enough throughput on a single node (IIRC the YAMNet architecture is specifically designed for efficient inference on constrained hardware), that is preferable. Otherwise, I may run this myself via Spark. That would let me learn how to load binary data such as MP3 files directly from Google Cloud Storage, without depending on gcsfuse, which we found to be buggy and prone to stalling (among other undesirable issues) the first time we built the dataset. It would also let me learn how to use GPUs effectively with Spark, in case T4 GPUs turn out to be necessary to achieve reasonable throughput (ideally I would like to analyze the full dataset within a few hours); a sketch of what this could look like follows below.
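If I do go the Spark route, a sketch of the GCS-loading piece might look like the following. It assumes the GCS Hadoop connector is on the classpath (so `gs://` paths resolve without gcsfuse) and uses Spark's built-in `binaryFile` source (Spark 3.0+). The bucket paths are hypothetical placeholders, and decoding MP3 bytes in memory with librosa is an assumption about the installed audio backend.

```python
import io

import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("yamnet-scoring").getOrCreate()

# Read raw MP3 bytes directly from GCS with Spark's built-in binaryFile
# source. gs:// paths go through the GCS Hadoop connector, so gcsfuse is
# never involved.
audio = (
    spark.read.format("binaryFile")
    .option("pathGlobFilter", "*.mp3")
    .load("gs://our-audio-bucket/")  # hypothetical bucket name
)
# Resulting columns: path, modificationTime, length, content (raw bytes).


def score_partition(batches):
    # Load the model once per task, not once per row. If T4s turn out to
    # be necessary, this is also where each task would use its assigned
    # GPU (requested via spark.task.resource.gpu.amount and friends).
    import librosa
    import tensorflow_hub as hub

    model = hub.load("https://tfhub.dev/google/yamnet/1")
    for pdf in batches:
        waveforms = [
            # Assumption: librosa's backend can decode MP3 from an
            # in-memory buffer into 16 kHz mono float32.
            librosa.load(io.BytesIO(raw), sr=16000, mono=True)[0]
            for raw in pdf["content"]
        ]
        yield pd.DataFrame(
            {
                "path": pdf["path"],
                # Max class score across frames, as a stand-in output.
                "top_score": [float(model(w)[0].numpy().max()) for w in waveforms],
            }
        )


scores = audio.mapInPandas(score_partition, schema="path string, top_score double")
scores.write.mode("overwrite").parquet("gs://our-audio-bucket/yamnet-scores/")
```

Using `mapInPandas` keeps model loading per task rather than per row, which matters since each YAMNet call only scores one clip; the per-file parallelism then comes from Spark partitions rather than batching.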