I'm looking at the Efficiency Study paper and I'd like to replicate the query encoding numbers - could you please provide a pipeline or any other pointers so I can ensure my measurement is correct?
Thanks a lot!
At the time, I think I basically tokenized everything up front (so tokenization time was not counted) and then ran the queries one at a time on a single CPU core (set with SLURM). I can try spinning up a similar pipeline, but would love to hear your thoughts.
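A minimal sketch of the setup described above, not the original pipeline: queries are assumed to be tokenized beforehand, and each one is timed individually on a single core. The names `pin_to_single_core`, `measure_query_latency`, and `encode_fn` are hypothetical; `encode_fn` stands in for whatever runs the query encoder on one pre-tokenized query. The warmup count and the reported statistics are also assumptions.

```python
import os
import statistics
import time


def pin_to_single_core():
    """Best-effort: restrict this process to one CPU core (Linux only).

    Under SLURM the allocation itself can enforce this; with PyTorch one
    would typically also call torch.set_num_threads(1) (not done here).
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    try:
        # Pick one core from those the process is already allowed to use.
        core = min(os.sched_getaffinity(0))
        os.sched_setaffinity(0, {core})
        return True
    except OSError:
        return False


def measure_query_latency(encode_fn, pretokenized_queries, warmup=5):
    """Time encode_fn on each pre-tokenized query, one query at a time.

    Tokenization is assumed to have happened already, so it is excluded
    from the measurement.
    """
    # Warmup calls are not timed (they absorb lazy initialization, caches).
    for query in pretokenized_queries[:warmup]:
        encode_fn(query)

    latencies_ms = []
    for query in pretokenized_queries:
        start = time.perf_counter()
        encode_fn(query)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "n": len(latencies_ms),
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }
```

Calling `pin_to_single_core()` once before `measure_query_latency(...)` approximates the single-core constraint when SLURM is not available. Whether this matches the paper's numbers depends on details not stated here, such as hardware and model precision.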
For later papers I started using a benchmarking tool from Hugging Face, but I cannot find it right now. I can try digging deeper if needed.
Thanks Carlos! I was looking for the test setup for measuring inference latency for queries so I could replicate your numbers, but I think I can manage without it if you don't have it. I'll have a dig on HF and see if I can find anything. Thanks again!