Prompt eval time is counted twice #790
Comments
The timing computations look correct to me, tested with
It's still broken in The reason we do it like this is because when using You don't see this in |
I'm not sure this is the only problem. If I understand correctly, model loading is decoupled from context creation here and we can't access the timings from this function even when not using A simple workaround could be to remove the offending code in
Co-authored-by: Andrei <[email protected]>
This issue was closed because it has been inactive for 14 days since being marked as stale.
Creating a new issue so this doesn't get forgotten:
@KASR posted a CSV of processing times in #603 (comment)
But the times don't add up: If you take the total time, and subtract the partial times that are supposed to add up to it, the result is all over the place:
The clue lies in the comment by @ggerganov:
Originally posted by @ggerganov in #603 (comment)
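To make the "times don't add up" check concrete, here is a minimal sketch of the arithmetic in Python. The field names (`load_ms`, `sample_ms`, `prompt_eval_ms`, `eval_ms`, `total_ms`) and all numbers are made up for illustration and are not taken from the CSV in #603. It only shows that if one component is reported in two buckets, the residual (total minus the sum of the parts) moves away from zero by exactly that component.

```python
from dataclasses import dataclass


@dataclass
class Timings:
    """Hypothetical per-run timings in milliseconds (field names are assumptions)."""
    load_ms: float
    sample_ms: float
    prompt_eval_ms: float
    eval_ms: float
    total_ms: float


def residual_ms(t: Timings) -> float:
    """Total time minus the partial times that are supposed to add up to it."""
    return t.total_ms - (t.load_ms + t.sample_ms + t.prompt_eval_ms + t.eval_ms)


if __name__ == "__main__":
    # Consistent report: the parts sum to the total, so the residual is zero.
    consistent = Timings(load_ms=1000, sample_ms=50, prompt_eval_ms=400,
                         eval_ms=2000, total_ms=3450)

    # Same run, but prompt eval is also folded into another bucket.
    # Which bucket is chosen here (load_ms) is an arbitrary assumption.
    double_counted = Timings(load_ms=1000 + 400, sample_ms=50, prompt_eval_ms=400,
                             eval_ms=2000, total_ms=3450)

    print("consistent residual:", residual_ms(consistent))
    print("double-counted residual:", residual_ms(double_counted))
```

Since prompt eval time varies with prompt length, a residual that tracks it would look "all over the place" across runs rather than being a constant offset.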