Improve Nemo-Serve's speed #14

Open
gaurav opened this issue Aug 31, 2022 · 3 comments

gaurav commented Aug 31, 2022

At the moment, the Nemo-Serve CLI takes around 0.8-1 sec per text. Ideally, we would like to speed it up, both so we can annotate all of split_11 (around 430k lines) and so it's generally easier to integrate Nemo-Serve into workflows.
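For a rough sense of scale, here is a back-of-the-envelope estimate of how long one pass over split_11 would take at the current rate. It assumes exactly 430,000 texts processed strictly one at a time, which is a simplification; the function name is only for illustration.

```python
# Back-of-the-envelope estimate; assumes 430,000 texts processed sequentially.
LINES = 430_000


def total_hours(seconds_per_text: float, n_texts: int = LINES) -> float:
    """Wall-clock hours needed to annotate n_texts one at a time."""
    return seconds_per_text * n_texts / 3600


for rate in (0.8, 1.0):
    hours = total_hours(rate)
    print(f"{rate:.1f} s/text -> {hours:.1f} hours ({hours / 24:.1f} days)")
```

At 0.8-1 sec per text, a single sequential pass works out to roughly four to five days.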


gaurav commented Sep 16, 2022

@YaphetKG pointed out that inference blocks the Uvicorn worker, so if too many requests arrive at once, no new HTTP requests are served until the earlier ones finish processing.
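A minimal sketch of the blocking problem and one common mitigation, using only the standard library. `run_inference` is a hypothetical stand-in for the model call, not Nemo-Serve's actual code, and the handlers are plain coroutines rather than real HTTP endpoints. The idea is that a blocking call made directly inside a coroutine stalls the event loop, while `asyncio.to_thread` (Python 3.9+) moves it to a worker thread so the loop can keep accepting requests.

```python
import asyncio
import time


def run_inference(text: str) -> str:
    """Hypothetical stand-in for the blocking model call."""
    time.sleep(0.2)  # simulate work that blocks the calling thread
    return text.upper()


async def annotate_blocking(text: str) -> str:
    # Calling the blocking function directly stalls the event loop,
    # so no other request can make progress until it returns.
    return run_inference(text)


async def annotate_offloaded(text: str) -> str:
    # Running it in a worker thread keeps the event loop free to accept
    # and schedule other requests in the meantime.
    return await asyncio.to_thread(run_inference, text)


async def timed(handler, n: int = 4) -> float:
    """Run n concurrent calls to handler and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(f"text {i}") for i in range(n)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"blocking:  {asyncio.run(timed(annotate_blocking)):.2f} s")
    print(f"offloaded: {asyncio.run(timed(annotate_offloaded)):.2f} s")
```

In this simulation the offloaded version finishes sooner because `time.sleep` releases the thread. Real GPU inference on a single device may still be serialized, so offloading would mainly keep the server responsive rather than reduce the time per text.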

gaurav moved this to Todo in Comparing NER engines Sep 16, 2022

YaphetKG commented

I did some testing on different GPU types with the following test text:

Scientific fraud: the McBride case--judgment. Dr W G McBride, who was a specialist obstetrician and gynaecologist and the first to publish on the teratogenicity of thalidomide, has been removed from the medical register after a four-year inquiry by the Medical Tribunal of New South Wales. Of the 44 medical practice allegations made against him by the Department of Health only one minor one was found proved but 24 of the medical research allegations were found proved. Of these latter, the most serious was that in 1982 he published a scientific journal, spurious results relating to laboratory experiments on pregnant rabbits dosed with scopolamine.

Pod memory was 10Gi for the following instances (the values are presumably response times in seconds):

| Run | nvidia.com/mig-1g.5gb | nvidia.com/mig-2g.10gb | nvidia.com/mig-3g.20gb |
|---|---|---|---|
| 1 | 4.520619392 | 0.8122918606 | 0.7158727646 |
| 2 | 5.052228928 | 0.7861666679 | 0.9437057972 |
| 3 | 5.666506767 | 0.9124851227 | 1.074194193 |
| 4 | 5.098900557 | 1.168374777 | 0.9899556637 |
| 5 | 4.109045029 | 0.9636256695 | 0.9441432953 |
| 6 | 4.500363827 | 0.8782594204 | 0.7792351246 |
| 7 | 3.967423439 | 1.006503105 | 0.6111810207 |
| 8 | 5.439374685 | 1.078202486 | 0.8769848347 |
| 9 | 5.601055145 | 1.105758667 | 0.6851010323 |
| 10 | 4.811736107 | 0.9951579571 | 1.264921904 |
| Average | 4.876725388 | 0.9706825733 | 0.888529563 |
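
To compare the three MIG profiles more directly, here is a small sketch that recomputes the averages from the timings above and adds the sample standard deviation and the speedup relative to the smallest slice. It assumes the values are seconds per request; the `summarize` helper is only for illustration.

```python
from statistics import mean, stdev

# Timings copied from the comment above (assumed to be seconds per request).
timings = {
    "mig-1g.5gb": [4.520619392, 5.052228928, 5.666506767, 5.098900557, 4.109045029,
                   4.500363827, 3.967423439, 5.439374685, 5.601055145, 4.811736107],
    "mig-2g.10gb": [0.8122918606, 0.7861666679, 0.9124851227, 1.168374777, 0.9636256695,
                    0.8782594204, 1.006503105, 1.078202486, 1.105758667, 0.9951579571],
    "mig-3g.20gb": [0.7158727646, 0.9437057972, 1.074194193, 0.9899556637, 0.9441432953,
                    0.7792351246, 0.6111810207, 0.8769848347, 0.6851010323, 1.264921904],
}


def summarize(samples):
    """Return (mean, sample standard deviation) for a list of timings."""
    return mean(samples), stdev(samples)


baseline = mean(timings["mig-1g.5gb"])
for profile, samples in timings.items():
    avg, sd = summarize(samples)
    print(f"{profile:12s} mean={avg:.3f} stdev={sd:.3f} "
          f"speedup vs 1g.5gb={baseline / avg:.2f}x")
```

Going from 1g.5gb to 2g.10gb is about a 5x improvement, while 3g.20gb lowers the mean by only about 8% relative to 2g.10gb. With ten samples each and overlapping ranges, that second difference may be within noise.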


gaurav commented Jan 16, 2023

My mds-import tool can use Nemo-Serve and SAPBERT to generate 7,569 annotations from 1,141 fields in about 20 mins, so around 57 fields a minute, which is pretty close to your estimate of around a second per text. This isn't too bad at all for HEAL purposes!
