Standardize on either GiB/MiB or GB/MB in logging #94
Comments
Good catch. Right now we are actually calculating GiB in the logging but (incorrectly) calling it GB, so we are about 7% off. That said, I think we should stay consistent with how Nvidia specs their GPUs, which at the moment is in GB rather than GiB, so I would vote for correcting the calculations to use 1000^3 rather than the current 1024^3.
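As a quick check of the "7% off" figure, the arithmetic can be sketched as follows. The constant names are illustrative only and are not taken from the project's code.

```python
# Bytes per unit under each convention (names are hypothetical).
GIB_BYTES = 1024 ** 3  # gibibyte, binary prefix
GB_BYTES = 1000 ** 3   # gigabyte, decimal prefix

# A value computed by dividing by 1024^3 but labelled "GB" understates
# the true decimal-GB figure by this factor.
ratio = GIB_BYTES / GB_BYTES
percent_off = (ratio - 1) * 100

print(f"{ratio:.6f}")         # 1.073742
print(f"{percent_off:.1f}%")  # 7.4%
```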
For some reason I cannot see the screenshot. Interestingly, in pytorch/pytorch#120172, the user suggested that Nvidia labels it as GB but actually means GiB 😆 (pytorch/pytorch#120172 (comment))
This PR updates the GPU metrics labels to GiB: we were calculating GiB but calling it GB. (Credit to @awgu for flagging this in issue #94.) Function names and member variables in metrics.py have been renamed from _gb to _gib for clarity, and the logging output now labels values as GiB: <img width="851" alt="Screenshot 2024-02-27 at 11 28 23 AM" src="https://github.com/pytorch/torchtrain/assets/46302957/85eb260a-77e9-4c49-be8a-b1aaa10dc3e2">
In my understanding, a gibibyte (GiB) is $1024^3$ bytes and a mebibyte (MiB) is $1024^2$ bytes, whereas a gigabyte (GB) is $1000^3$ bytes and a megabyte (MB) is $1000^2$ bytes.
I think we should standardize on one or the other, maybe GiB since the memory profiler went that direction: pytorch/pytorch#120172. In that case, I think we mainly need to change "GB" to "GiB" in our logging.
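A minimal sketch of what consistent labelling could look like, assuming a plain byte count as input. The helper names (`bytes_to_gib`, `bytes_to_gb`, `format_memory`) are hypothetical and are not the functions in metrics.py; the point is only that the divisor and the printed unit are chosen together.

```python
GIB = 1024 ** 3  # bytes per gibibyte
GB = 1000 ** 3   # bytes per gigabyte


def bytes_to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes (binary prefix)."""
    return num_bytes / GIB


def bytes_to_gb(num_bytes: int) -> float:
    """Convert a byte count to gigabytes (decimal prefix)."""
    return num_bytes / GB


def format_memory(num_bytes: int, binary: bool = True) -> str:
    """Format a byte count so the label always matches the divisor used."""
    if binary:
        return f"{bytes_to_gib(num_bytes):.2f} GiB"
    return f"{bytes_to_gb(num_bytes):.2f} GB"


sample = 8 * GIB  # 8589934592 bytes
print(format_memory(sample))                # 8.00 GiB
print(format_memory(sample, binary=False))  # 8.59 GB
```

Keeping the unit string next to the conversion makes it harder for the calculation and the label to drift apart again.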