
[TEST] Remove llvm -device=arm_cpu and cuda -libs=cudnn from default enabled list #9905

Merged: 1 commit merged into apache:main on Jan 14, 2022

Conversation

@masahi (Member) commented Jan 11, 2022

Right now tvm.testing.enabled_targets() returns llvm -device=arm_cpu and cuda -libs=cudnn (if cudnn is enabled, which is the case in the CI environment) in addition to llvm and cuda. So all tests that loop over enabled targets via tvm.testing.enabled_targets() are essentially running their x86 and cuda cases twice.
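
For context, here is a minimal sketch (not taken from this PR) of the looping pattern in question, assuming the usual TVM testing setup; the test body is a placeholder:

```python
import tvm
import tvm.testing

def test_something():
    # enabled_targets() yields (target_string, device) pairs; each extra entry
    # re-runs the entire test body, which is why the arm_cpu/cudnn entries
    # effectively double the x86 and cuda runs.
    for target, dev in tvm.testing.enabled_targets():
        print("running placeholder workload on", target)
```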

I don't see a reason to run llvm -device=arm_cpu tests in addition to llvm. For cudnn, we have separate unit tests dedicated to it; we don't need to run the full integration tests with it.

I did this change for PyTorch in #9781 and it sped up local testing time significantly. It should make the CI situation a lot better.

@areusch @leandron @Lunderberg @driazati

@areusch (Contributor) commented Jan 11, 2022

cc @u99127 @Mousius @manupa-arm: this is an idea that came up internally, but it'd be great to get your feedback regarding test coverage. We could leave this enabled, e.g., on ci-arm.

@masahi (Member, Author) commented Jan 11, 2022

Does anyone know what exactly llvm -device=arm_cpu does on an x86 host? For proper ARM codegen, I thought there was a different flag that affects LLVM.

@Lunderberg (Contributor)

For history, llvm -device=arm_cpu was added in #6331, though I can't trace it back beyond there.

I added the cudnn target in #8383, as part of parametrizing tests for better visibility. Some unit tests that explicitly exercised a cudnn path could pass on a machine that had cudnn disabled, but fail when cudnn was enabled. Adding cudnn as a general target ensured that no test coverage was dropped as a result.
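
As an illustration, a minimal sketch of how a test can be pinned to the cudnn path explicitly rather than relying on the default enabled list, assuming the tvm.testing.parametrize_targets decorator; the test body is a placeholder:

```python
import tvm
import tvm.testing

# Parametrize this test over the cudnn-backed CUDA target only; on builds
# where cudnn is not enabled, the target is skipped and reported rather than
# silently dropped, which keeps the missing coverage visible.
@tvm.testing.parametrize_targets("cuda -libs=cudnn")
def test_conv2d_cudnn_path(target, dev):
    print("placeholder workload on", target)
```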

@masahi (Member, Author) commented Jan 11, 2022

Thanks @Lunderberg, I think testing external libs is outside the scope of the default enabled list. For complete coverage we would also need to exercise cublas, mkl, etc. with full integration tests, but that would be impractical.

@driazati (Member)

cudnn (and the others mentioned) sound like good targets for a main-only / nightly test run (where run time doesn't really matter) rather than something that runs on every PR.

@Lunderberg (Contributor) left a comment

LGTM

@u99127 (Contributor) commented Jan 12, 2022

> cc @u99127 @Mousius @manupa-arm this is an idea that came up internally but it'd be great to get your feedback regarding test coverage. we could leave this enabled e.g. on ci-arm, too.

I think we would need to leave it in place for testing on ci_arm. I am also not immediately sure what test coverage is lost by removing this - perhaps look at the difference in pytest-cov reports before taking this off, so that we make an informed decision.

regards
Ramana

@masahi (Member, Author) commented Jan 12, 2022

@u99127 @areusch I looked into what -device=arm_cpu does. It's only used at

`device = tgt.attrs.get("device", "")`

where it instructs TVM to fetch a pre-tuned topi schedule configuration for ARM targets. It only affects tile sizes etc. for conv2d, so it is perfectly valid to use it on an x86 host, although there is no point in doing so.

Anyway, running both llvm and llvm -device=arm_cpu has nothing to do with test coverage and it only increases test time for no good reason.
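
For illustration, a minimal sketch of inspecting that attribute (the print output is just for demonstration; this is not TVM's dispatch code):

```python
import tvm

for target_str in ["llvm", "llvm -device=arm_cpu"]:
    tgt = tvm.target.Target(target_str)
    # -device=arm_cpu is what populates the "device" attribute; plain "llvm"
    # leaves it empty, so the generic x86 schedules are picked.
    device = tgt.attrs.get("device", "")
    print(target_str, "->", device or "(no device attr)")
```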

@areusch (Contributor) commented Jan 12, 2022

@masahi I think it also enables schedules with the ".arm_cpu" suffix, so I don't think it's quite that simple. I think @u99127 is right that we may not need to run these tests anywhere other than ci-arm, but I think we need to diff the tests being run first before coming to that conclusion.

@masahi (Member, Author) commented Jan 12, 2022

OK, I missed that -device=arm_cpu also affects compute/schedule/strategy dispatch. But it still stands that having both

DEFAULT_TEST_TARGETS = [
    "llvm",
    "llvm -device=arm_cpu",

is odd, since plain llvm is not meant to be used on ci-arm. So it has to be the case that DEFAULT_TEST_TARGETS is never used on ci-arm, and that it is safe to remove llvm -device=arm_cpu from it. Tests that are intended to run on ARM hosts should always use tvm.target.arm_cpu() or manually specify the -mcpu flag.
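
For illustration, a minimal sketch of those two options; the -mtriple/-mcpu values below are placeholders for whatever board is actually being targeted:

```python
import tvm

# Preset ARM CPU target helper.
arm_target = tvm.target.arm_cpu()

# Or spell out the LLVM flags explicitly (values here are examples only).
explicit_target = tvm.target.Target("llvm -mtriple=aarch64-linux-gnu -mcpu=cortex-a72")

print(arm_target)
print(explicit_target)
```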

@masahi (Member, Author) commented Jan 13, 2022

I'm going to merge this unless there are further comments.

@u99127 (Contributor) commented Jan 14, 2022

I'd be happier with a comparison test run of, say, the topi tests with both llvm and llvm -device=arm_cpu on a ci_cpu machine to see what the difference is.

I've not had the chance to look at the difference, if any, in topi test results on aarch64 with llvm vs llvm -device=arm_cpu in a ci_arm environment. That's not on by default yet, but I was working on it, a bit slowly.

regards
Ramana

@masahi (Member, Author) commented Jan 14, 2022

Yeah, CI job time has been very unstable recently (some runs finish in 4 hours while others take more than 8), so it's difficult to assess how much this helps.

Note that "llvm" means x86 specifically, so you would never use the llvm target (without any options) on aarch64. But that's what is in DEFAULT_TEST_TARGETS, so, like I said, ci-arm shouldn't be using this list. So there would be no loss of distinct test coverage.

@masahi merged commit 4419241 into apache:main on Jan 14, 2022
crazydemo pushed a commit to crazydemo/tvm that referenced this pull request Jan 27, 2022
ylc pushed a commit to ylc/tvm that referenced this pull request Feb 16, 2022
junrushao pushed a commit that referenced this pull request Mar 6, 2022
…fault test target list (#10500)

After recent improvements in the GPU frontend tests, I found that `topi: GPU` has become a bottleneck. From the log https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-10391/14/pipeline/319, it is clear that topi tests are running on the targets `llvm -device=arm_cpu` and `cuda -libs=cudnn`, which I claim is completely redundant since we already run on the `llvm` and `cuda` targets.

In #9905, I already removed them from `DEFAULT_TEST_TARGETS`, but `topi: GPU` uses its own list of targets, which still includes `llvm -device=arm_cpu` and `cuda -libs=cudnn`. I propose to remove them from the topi test targets, which will hopefully cut topi GPU test time by half.
ziqiangxu8457 pushed a commit to ziqiangxu8457/tvm that referenced this pull request Mar 6, 2022
pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022