Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use nvidia-ml-py and psutil to replace original command line call #20

Merged
merged 9 commits into from
Sep 5, 2017

Conversation

wookayin
Copy link
Owner

@wookayin wookayin commented Aug 17, 2017

This is a support and follow-up for #17 by @Stonesjtu, in which all nvidia-smi and ps command line calls are replaced by nvidia-ml-py and psutil, respectively. On top of #17, I have added unit tests and fixed several bugs (or different behavior) introduced in the PR.

Changes

  • Replace nvidia-smi call by nvidia-ml-py and psutil.
    • As a result, querying nvidia devices gets much faster (even without nvidia-smi daemon); moreover, we can obtain more detailed information on GPUs and processes.
  • Numerical information such as memory usage now will be given in bytes, not in mega-bytes (Breaking change).

Stonesjtu and others added 6 commits August 13, 2017 11:41
Use the python binding of NVML is more flexible and maintainable for
future expansion. But the Not Supported error is remained to be
considered in next commit.
Here comes many refactors in this commit.

The processes information is now integrated in the GPUStat class, viewed
as a property of GPU state.

The GPUCollection now contains a list of GPUStat, rather than a dict
indexed by GPU-UUID. But the UUID keyword is still remained for future
usage.
This commit is a support for PR #17.

- Fixes minor typos and improve code styles.

- Correct the usage of colors: wrong colors introduced from d9cc46d
  are fixed (e.g. bold_white, bold_black); and the case of --no-color
  is also set properly.

- Fix different behavior at python3 (including integer division,
  bytes to unicode decoding for name and uuid).

- `cmdline` names are simplified to their basename, as were in
  the previous versions (psutil returns full path of executable).

- Bring back the `new_query()` method.
All the pynvml/psutil objects interacting with gpustat are
replaced by mock; so we can unittest the gpustat behavior even
in the environment where nvidia cards are not present.

Also fixed minor bugs, e.g. improve 'not supported' handling.
@Stonesjtu
Copy link
Collaborator

LGTM

As we have changed the gpustat package to depend on non-builtin
packages (psutils, etc.), it is no more possible to import
gpustat from setup.py without having those libraries installed.

This commit is a part of #20.
@wookayin
Copy link
Owner Author

wookayin commented Sep 5, 2017

Thank you so much @Stonesjtu !!

try:
temperature = N.nvmlDeviceGetTemperature(handle, N.NVML_TEMPERATURE_GPU)
except N.NVMLError:
memory = None # Not supported
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a typo. Maybe temperature = None should be in except?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, you are correct. Fixed in 895e1f8. Thanks!

@wookayin wookayin added this to the 0.4 milestone Nov 2, 2017
@wookayin wookayin deleted the feature/use-pynvml branch September 10, 2018 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants