
Re-configure benchmarking devices & add markers to bench_cugraph_uniform_neighbor_sample #4561

Conversation

@nv-rliu (Contributor) commented on Jul 29, 2024

This PR re-enables the benchmarks/cugraph/pytest-based/bench_cugraph_uniform_neighbor_sample.py benchmark in the MNMG nightlies.

This benchmark was previously being skipped because the benchmark runs are configured to select only tests carrying the pytest markers "managedmem_off and poolallocator_on", which this benchmark was missing. Thanks to @jameslamb for spotting this in conftest.py.

By adding the missing markers to the test (and removing some DGX-machine-specific dask-client configs), the benchmark should now run properly in the nightly jobs.
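For illustration, a minimal sketch of the kind of marker change described above. The decorator placement, fixture name, and benchmark body are assumptions made for this sketch; the real benchmark attaches the markers to bench_cugraph_uniform_neighbor_sample and exercises the MNMG sampling code.

```python
# Hypothetical sketch -- not the actual cugraph benchmark code. It only shows how
# pytest markers make a benchmark selectable by a marker expression such as:
#   pytest -m "managedmem_off and poolallocator_on"
import pytest


@pytest.mark.managedmem_off
@pytest.mark.poolallocator_on
def bench_cugraph_uniform_neighbor_sample(benchmark):
    # The real benchmark runs cugraph's uniform_neighbor_sample on an MNMG graph;
    # a trivial callable stands in here so the sketch stays self-contained.
    benchmark(lambda: sum(range(1_000)))
```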

@nv-rliu added the bug (Something isn't working), non-breaking (Non-breaking change), graph-devops (Issues for the graph-devops team), and benchmarks labels on Jul 29, 2024
@nv-rliu added this to the 24.08 milestone on Jul 29, 2024
@nv-rliu requested review from @rlratzel and @jameslamb on Jul 29, 2024
@nv-rliu changed the base branch from branch-24.08 to branch-24.10 on Jul 29, 2024
@nv-rliu (Contributor, Author) commented on Jul 29, 2024

This PR is targeting 24.10 and should not be reviewed until the forward-merge has been completed.

@nv-rliu modified the milestones: 24.08 → 24.10 on Jul 29, 2024
@nv-rliu marked this pull request as ready for review on Aug 1, 2024
@jameslamb (Member) left a comment

I do think the pytest marker changes will help ensure the benchmark actually runs.

For my own understanding ... why is it desirable to remove the ability to configure which GPUs on the system are used? (this is just for my own curiosity... can answer this after you merge)

@nv-rliu (Contributor, Author) commented on Aug 5, 2024

> I do think the pytest marker changes will help ensure the benchmark actually runs.
>
> For my own understanding ... why is it desirable to remove the ability to configure which GPUs on the system are used? (this is just for my own curiosity... can answer this after you merge)

Hi James, thanks for the question.
When we run these benchmarks on our clustered machines, we run them in multiple configurations (2-GPU, 8-GPU, 32-GPU, 64-GPU, etc.) and ideally want to exercise the algorithm on all of the GPUs available to the job (as allocated by the job scheduler).
As for the old code that selected 4 GPUs (devices 1-4), my guess is that it was there for running this benchmark locally on the lab machines.
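For illustration only, a small dask-cuda sketch of the two setups being contrasted here. This is an assumed usage of LocalCUDACluster's CUDA_VISIBLE_DEVICES argument, not the code that was removed in this PR.

```python
# Illustrative sketch, not the removed benchmark code.
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

# Machine-specific pinning (e.g. a fixed 4-GPU subset of a DGX box):
# cluster = LocalCUDACluster(CUDA_VISIBLE_DEVICES="1,2,3,4")

# Scheduler-friendly setup: no explicit device list, so the cluster spans
# every GPU visible to the job (whatever the scheduler allocated).
cluster = LocalCUDACluster()
client = Client(cluster)
```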

@rlratzel (Contributor) commented on Aug 5, 2024

/merge

@rapids-bot merged commit 1be81a4 into rapidsai:branch-24.10 on Aug 5, 2024 (132 checks passed)
@nv-rliu deleted the reenable-bench-cugraph-unif-neighb-sample branch on Aug 5, 2024