Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Hexagon][UnitTest] Disable flaky quantization test #16337

Merged
merged 2 commits into from
Jan 3, 2024

Conversation

Lunderberg
Copy link
Contributor

The test_pass_fq2i_avg_pool2d.py::test_avgpool_conv2d test is sensitive to rounding errors, and failed about a third of the time (42 / 100 tests). This was first noticed as CI failures in unrelated PRs (e.g. https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-hexagon/detail/PR-16184/6/tests). This commit marks the flaky portions of the test with pytest.mark.xfail, to avoid causing breaking CI for other PRs.

To minimize the extent of the disabled test cases, this commit breaks up each of the unit tests. Where previously a single test performed both hardware/simulation tests and relay graph comparisons, these are now done in separate test functions. The hardware/simulation tests use tvm.testing.assert_allclose and have a tolerance of 1e-02, while the graph-comparison tests use tvm.ir.structural_equal, and require identical floating-point values. Only the graph-comparison test is disabled here.

The other two test cases in test_pass_fq2i_avg_pool2d.py do not show this same sensitivity, with no failures seen in 100 executions.

The `test_pass_fq2i_avg_pool2d.py::test_avgpool_conv2d` test is
sensitive to rounding errors, and failed about a third of the time (42
/ 100 tests).  This was first noticed as CI failures in unrelated
PRs (e.g. https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-hexagon/detail/PR-16184/6/tests).
This commit marks the flaky portions of the test with
`pytest.mark.xfail`, to avoid causing breaking CI for other PRs.

To minimize the extent of the disabled test cases, this commit breaks
up each of the unit tests.  Where previously a single test performed
both hardware/simulation tests and relay graph comparisons, these are
now done in separate test functions.  The hardware/simulation tests
use `tvm.testing.assert_allclose` and
have a tolerance of `1e-02`, while the graph-comparison tests use
`tvm.ir.structural_equal`, and require identical floating-point
values.  Only the graph-comparison test is disabled here.

The other two test cases in `test_pass_fq2i_avg_pool2d.py` do not show
this same sensitivity, with no failures seen in 100 executions.
@Lunderberg
Copy link
Contributor Author

Lunderberg commented Jan 2, 2024

@rasagna-quic Can you take a look at this test case? It was introduced in #15599, and is failing about 1/3 of the time. This PR is a stopgap to avoid impacting other work, but it should have a better fix for the long-term.

@rasagna-quic
Copy link
Contributor

@Lunderberg Thank you for these changes, these look good to me. I will work create a new PR to fix this issue.

@rasagna-quic Can you take a look at this test case? It was introduced in #15599, and is failing about 1/3 of the time. This PR is a stopgap to avoid impacting other work, but it should have a better fix for the long-term.

Copy link
Member

@junrushao junrushao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@junrushao junrushao merged commit 42b4f21 into apache:main Jan 3, 2024
18 checks passed
@Lunderberg Lunderberg deleted the hexagon_disable_flaky_qnn_test branch January 3, 2024 17:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants