Enable pytest-xdist for TVM CI #8576
Conversation
Changes here look good overall, just a couple of documentation/ordering questions.
function run_pytest() {
    local extra_args=( )
    if [ "$1" == "--parallel" ]; then
Since the documentation currently recommends using task_python_unittest.sh to run the unit tests, we should either make sure that location also lists pytest-xdist among the packages to install in order to run the tests, or check whether pytest-xdist is installed by wrapping this in `if python3 -c "import xdist" > /dev/null 2>&1; then`.
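The availability check suggested above can also be done from Python. A minimal sketch (the helper names here are hypothetical, not part of the actual CI scripts), relying on the fact that pytest-xdist installs a top-level `xdist` module:

```python
import importlib.util


def xdist_available():
    """Return True if pytest-xdist is importable in this environment."""
    # pytest-xdist installs the top-level "xdist" module, so probing for it
    # is equivalent to the `python3 -c "import xdist"` shell check.
    return importlib.util.find_spec("xdist") is not None


def pytest_parallel_args():
    # Only pass -n when the plugin is present, so plain `pytest` still works
    # in environments where pytest-xdist was never installed.
    return ["-n", "auto"] if xdist_available() else []
```

With this, a test runner can unconditionally extend its argument list with `pytest_parallel_args()` and degrade gracefully to serial execution.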
@@ -31,9 +31,9 @@ if [ -z "${TVM_UNITTEST_TESTSUITE_NAME:-}" ]; then
 fi

 # First run minimal test on both ctypes and cython.
-run_pytest ctypes ${TVM_UNITTEST_TESTSUITE_NAME}-platform-minimal-test tests/python/all-platform-minimal-test
 run_pytest cython ${TVM_UNITTEST_TESTSUITE_NAME}-platform-minimal-test tests/python/all-platform-minimal-test
+run_pytest --parallel ctypes ${TVM_UNITTEST_TESTSUITE_NAME}-platform-minimal-test tests/python/all-platform-minimal-test
Looks like this applies to everything in the unittest directory, including test_tvm_testing_features.py. I don't see the explicit ordering that will be required for some of those tests, so we should add it.
Confirmed this change is in place, looks good to me.
python/tvm/testing.py
Outdated
@@ -379,7 +379,7 @@ def _get_targets(target_str=None):
     if len(target_str) == 0:
         target_str = DEFAULT_TEST_TARGETS

-    target_names = set(t.strip() for t in target_str.split(";") if t.strip())
+    target_names = list(sorted(set(t.strip() for t in target_str.split(";") if t.strip())))
A thought from this morning: do we want to have the list in sorted order, or in the order specified by target_str? Either way would be deterministic, but I'd lean toward the latter so that we can have explicit control over the order of targets. We can get this behavior if we use a dict instead of a set for removing duplicates:

target_names = list({t.strip(): None for t in target_str.split(';') if t.strip()})
ah yeah probably in order of target_str
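The dict-based dedup discussed above preserves first-seen order (dicts keep insertion order as of Python 3.7), while the set-based version only becomes deterministic through sorting. A quick illustration with a made-up target string:

```python
# Hypothetical input; real values come from TVM_TEST_TARGETS / DEFAULT_TEST_TARGETS.
target_str = "llvm;cuda;llvm;vulkan"

# set-based dedup: deterministic only because of the explicit sort
sorted_names = list(sorted(set(t.strip() for t in target_str.split(";") if t.strip())))

# dict-based dedup: deterministic AND keeps the order given in target_str
ordered_names = list({t.strip(): None for t in target_str.split(";") if t.strip()})

print(sorted_names)   # ['cuda', 'llvm', 'vulkan']
print(ordered_names)  # ['llvm', 'cuda', 'vulkan']
```

The difference matters when the caller wants the first listed target to be tried first.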
Force-pushed from dfa2dd8 to 37c39f9.
LGTM
vta/python/vta/__init__.py
Outdated
# do not from tvm import topi when __main__ is in vta_onboard
# to maintain minimum Python dependencies on the board.
# TODO(vta-team): move board-only logic imported above into vta_onboard.
The introduction of a separate vta_board package is likely no longer needed after my update https://github.com/apache/tvm/blob/main/vta/python/vta/__init__.py#L36. We should be able to use the original vta as-is. It would be great to do so if that is the case.
ah oops this was a bad merge, fixing
* yeah we're gonna have to think about testing this approach.
Force-pushed from d4b95b1 to 5a1aa02.
This is really good work @areusch. I had a small question.
    PYTEST_NUM_CPUS=$(expr ${PYTEST_NUM_CPUS} - 1)  # Don't nuke interactive work.
fi

# Don't use >4 CPUs--in general, we only use 4 CPUs in testing, so we want to retain this
Why 4?
Also, would it be possible to specialize run_pytest for a certain subset of tests that we think could benefit from this, i.e. --parallel <some_number>?
4 just relates to how we currently allocate Jenkins executors to CI nodes. what i really want to do is make it possible to use $(nproc) here always (so that each node is maximally used). however, initial tests indicate that there is a threshold after which nodes become RAM-limited.
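The sizing logic described in this exchange (leave one CPU free for interactive work, cap at 4 to match how Jenkins executors are allocated) can be sketched as follows; the helper name is hypothetical, and only the cap and the interactive-work adjustment come from the shell script in this PR:

```python
import os

# Cap taken from the PR discussion: Jenkins currently allocates 4 executors
# per CI node, and higher worker counts were observed to become RAM-limited.
MAX_PYTEST_CPUS = 4


def pytest_num_cpus(interactive=False):
    """Pick a pytest-xdist worker count for this machine."""
    n = os.cpu_count() or 1
    if interactive:
        n = max(1, n - 1)  # don't nuke interactive work
    return min(n, MAX_PYTEST_CPUS)
```

The result would then be passed to pytest as `-n <count>`; making the cap an environment variable (as the PR later suggests with CI_PYTEST_NUM_CPUS) would let the Jenkinsfile control it per node type.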
can you clarify your second question? how would we leverage this better in the subset?
I assume this is superseded by #9834.
Enable pytest-xdist to parallelize CI jobs on worker nodes. The thought here is: right now our concurrency model in CI is to stick with 1 CPU per job to reduce debugging headaches and run $(nproc) workers per CI CPU node. Meanwhile, our CI is heterogeneous, so a long ci_cpu-bound step will negatively affect the overall runtime of all PRs. Going the other way, the idea is to run each CI job as fast as possible and let jobs pile into queues, where we can see the bottlenecks more clearly. This means that when the queues are drained, devs get faster response times from the CI, and the CPU nodes should still be used optimally (or perhaps get a slight boost, since caching may work better with fewer workloads).
Restricting to 2 CPUs since this is a test; in the future, CI_PYTEST_NUM_CPUS should be used to actually control from Jenkinsfile.
cc @tqchen @jroesch @Lunderberg