[Hexagon][UnitTest] Disable flaky quantization test #16337

Lunderberg · 2024-01-02T20:22:30Z

The test_pass_fq2i_avg_pool2d.py::test_avgpool_conv2d test is sensitive to rounding errors, and failed about a third of the time (42 / 100 tests). This was first noticed as CI failures in unrelated PRs (e.g. https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-hexagon/detail/PR-16184/6/tests). This commit marks the flaky portions of the test with pytest.mark.xfail, to avoid causing breaking CI for other PRs.

To minimize the extent of the disabled test cases, this commit breaks up each of the unit tests. Where previously a single test performed both hardware/simulation tests and relay graph comparisons, these are now done in separate test functions. The hardware/simulation tests use tvm.testing.assert_allclose and have a tolerance of 1e-02, while the graph-comparison tests use tvm.ir.structural_equal, and require identical floating-point values. Only the graph-comparison test is disabled here.

The other two test cases in test_pass_fq2i_avg_pool2d.py do not show this same sensitivity, with no failures seen in 100 executions.

The `test_pass_fq2i_avg_pool2d.py::test_avgpool_conv2d` test is sensitive to rounding errors, and failed about a third of the time (42 / 100 tests). This was first noticed as CI failures in unrelated PRs (e.g. https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-hexagon/detail/PR-16184/6/tests). This commit marks the flaky portions of the test with `pytest.mark.xfail`, to avoid causing breaking CI for other PRs. To minimize the extent of the disabled test cases, this commit breaks up each of the unit tests. Where previously a single test performed both hardware/simulation tests and relay graph comparisons, these are now done in separate test functions. The hardware/simulation tests use `tvm.testing.assert_allclose` and have a tolerance of `1e-02`, while the graph-comparison tests use `tvm.ir.structural_equal`, and require identical floating-point values. Only the graph-comparison test is disabled here. The other two test cases in `test_pass_fq2i_avg_pool2d.py` do not show this same sensitivity, with no failures seen in 100 executions.

Lunderberg · 2024-01-02T20:25:15Z

@rasagna-quic Can you take a look at this test case? It was introduced in #15599, and is failing about 1/3 of the time. This PR is a stopgap to avoid impacting other work, but it should have a better fix for the long-term.

rasagna-quic · 2024-01-03T04:00:26Z

@Lunderberg Thank you for these changes, these look good to me. I will work create a new PR to fix this issue.

@rasagna-quic Can you take a look at this test case? It was introduced in #15599, and is failing about 1/3 of the time. This PR is a stopgap to avoid impacting other work, but it should have a better fix for the long-term.

junrushao

LGTM!

Lunderberg mentioned this pull request Jan 2, 2024

[TIR][Transform] Implement InlinePrivateFunctions #16184

Merged

Disable pylint for pytest fixture names

e17316e

junrushao approved these changes Jan 3, 2024

View reviewed changes

junrushao merged commit 42b4f21 into apache:main Jan 3, 2024
18 checks passed

Lunderberg deleted the hexagon_disable_flaky_qnn_test branch January 3, 2024 17:30

ysh329 mentioned this pull request Apr 21, 2024

[Release] v0.16.0 Release Candidate Notes #16911

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Hexagon][UnitTest] Disable flaky quantization test #16337

[Hexagon][UnitTest] Disable flaky quantization test #16337

Lunderberg commented Jan 2, 2024

Lunderberg commented Jan 2, 2024 •

edited

Loading

rasagna-quic commented Jan 3, 2024

junrushao left a comment

[Hexagon][UnitTest] Disable flaky quantization test #16337

[Hexagon][UnitTest] Disable flaky quantization test #16337

Conversation

Lunderberg commented Jan 2, 2024

Lunderberg commented Jan 2, 2024 • edited Loading

rasagna-quic commented Jan 3, 2024

junrushao left a comment

Choose a reason for hiding this comment

Lunderberg commented Jan 2, 2024 •

edited

Loading