Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Update image tag to 20240917-153130-9f281758 #17397

Merged
merged 5 commits into from
Sep 23, 2024

Conversation

mshr-h
Copy link
Contributor

@mshr-h mshr-h commented Sep 20, 2024

As per title.
Part of #17346

@tqchen
Copy link
Member

tqchen commented Sep 20, 2024

looks like a flaky error, we should increas atol and rtol for the case

@mshr-h
Copy link
Contributor Author

mshr-h commented Sep 21, 2024

@tqchen
Copy link
Member

tqchen commented Sep 21, 2024

here is the error

[2024-09-21T07:06:36.974Z] self = <FixtureDef argname='test_case' scope='session' baseid='tests/python/tir-transform/test_tir_transform_simplify.py::TestMostRestrictiveConditional'>
[2024-09-21T07:06:36.974Z] request = <SubRequest 'test_case' for <Function test_compare[test_case2]>>
[2024-09-21T07:06:36.974Z] 
[2024-09-21T07:06:36.974Z]     def execute(self, request: SubRequest) -> FixtureValue:
[2024-09-21T07:06:36.974Z]         """Return the value of this fixture, executing it if not cached."""
[2024-09-21T07:06:36.974Z]         # Ensure that the dependent fixtures requested by this fixture are loaded.
[2024-09-21T07:06:36.974Z]         # This needs to be done before checking if we have a cached value, since
[2024-09-21T07:06:36.974Z]         # if a dependent fixture has their cache invalidated, e.g. due to
[2024-09-21T07:06:36.974Z]         # parametrization, they finalize themselves and fixtures depending on it
[2024-09-21T07:06:36.974Z]         # (which will likely include this fixture) setting `self.cached_result = None`.
[2024-09-21T07:06:36.974Z]         # See #4871
[2024-09-21T07:06:36.974Z]         requested_fixtures_that_should_finalize_us = []
[2024-09-21T07:06:36.974Z]         for argname in self.argnames:
[2024-09-21T07:06:36.974Z]             fixturedef = request._get_active_fixturedef(argname)
[2024-09-21T07:06:36.974Z]             # Saves requested fixtures in a list so we later can add our finalizer
[2024-09-21T07:06:36.974Z]             # to them, ensuring that if a requested fixture gets torn down we get torn
[2024-09-21T07:06:36.974Z]             # down first. This is generally handled by SetupState, but still currently
[2024-09-21T07:06:36.974Z]             # needed when this fixture is not parametrized but depends on a parametrized
[2024-09-21T07:06:36.974Z]             # fixture.
[2024-09-21T07:06:36.974Z]             if not isinstance(fixturedef, PseudoFixtureDef):
[2024-09-21T07:06:36.974Z]                 requested_fixtures_that_should_finalize_us.append(fixturedef)
[2024-09-21T07:06:36.974Z]     
[2024-09-21T07:06:36.974Z]         # Check for (and return) cached value/exception.
[2024-09-21T07:06:36.974Z]         if self.cached_result is not None:
[2024-09-21T07:06:36.974Z]             request_cache_key = self.cache_key(request)
[2024-09-21T07:06:36.974Z]             cache_key = self.cached_result[1]
[2024-09-21T07:06:36.974Z]             try:
[2024-09-21T07:06:36.974Z]                 # Attempt to make a normal == check: this might fail for objects
[2024-09-21T07:06:36.974Z]                 # which do not implement the standard comparison (like numpy arrays -- #6497).
[2024-09-21T07:06:36.974Z] >               cache_hit = bool(request_cache_key == cache_key)
[2024-09-21T07:06:36.974Z] E               TypeError: __bool__ should return bool, returned NotImplementedType

@tqchen
Copy link
Member

tqchen commented Sep 21, 2024

[2024-09-21T07:06:36.974Z] E TypeError: __bool__ should return bool, returned NotImplementedType

Seems like a bug in pytest or our fixture impl cc @Lunderberg

@tqchen
Copy link
Member

tqchen commented Sep 21, 2024

Some guess about what happens to the tests/python/tir-transform/test_tir_transform_simplify.py:TestMostRestrictiveConditional

This is the only test where all parameters appears as bool PrimExpr and also do not contain a bool constant, not sure how that impacts the comparison impl.

Update, i managed to create a case that hits the error

import tvm

i, j, k = [tvm.tir.Var(name, "int32") for name in "ijk"] 

((i == j), (i > j)) == ((i == j), (i > j))

raises

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __bool__ should return bool, returned NotImplementedType

I can also reproduce the error via env


Note that the same failure happens in my python 3.10 env

@tqchen
Copy link
Member

tqchen commented Sep 21, 2024

@yongwww can you help a bit with model binary

@mshr-h
Copy link
Contributor Author

mshr-h commented Sep 21, 2024

Python 3.8 is going to EoL in 2024-10, so maybe it's a good time to move to 3.9.

@tqchen
Copy link
Member

tqchen commented Sep 21, 2024

Update about my finding. I find that i can reproduce the same issue on my python3.10 (the tuple comparison issue) but the testcase did not fail, then i digged a bit into pytest code. The pytest of my python3.10 do not have the same code as bool(cached_key == requested_key)

Seems it is a versioning problem. The pytest version was too old. Likely the reliance of bool(cached_key == requested_key) was fixed by a later version of pytest. So the main reason here likely was pytest version issue. We can likely fix it by update the pytest to a later version

On a separate note, i agree it is good time to move to higher python version.

@mshr-h
Copy link
Contributor Author

mshr-h commented Sep 21, 2024

The test passes with pytest==8.2.2, but it fails with pytest==8.3.3.
We can temporarily work around this issue by pinning pytest.

@tqchen
Copy link
Member

tqchen commented Sep 21, 2024

Ah actually it has to do with newer pytest version. I see, then we should up find ways to fix the testcase

@tqchen
Copy link
Member

tqchen commented Sep 21, 2024

I think one solution would be to add a wrapper class for the parameters of this testcase and define our own equal operator, so it won't rely on tuple equal. @mshr-h do you mind try that? We can use structural equality and hash for the wrapper key

@mshr-h mshr-h marked this pull request as draft September 22, 2024 13:44
@tqchen
Copy link
Member

tqchen commented Sep 22, 2024

if we can work around the torch model issue by skipping first, we can aim to merge first

@mshr-h mshr-h force-pushed the upgrade-ci-image-20240917 branch from 46d7cfd to d19d28a Compare September 23, 2024 04:30
@mshr-h mshr-h force-pushed the upgrade-ci-image-20240917 branch from d19d28a to dfab291 Compare September 23, 2024 05:08
@mshr-h mshr-h marked this pull request as ready for review September 23, 2024 07:59
@mshr-h
Copy link
Contributor Author

mshr-h commented Sep 23, 2024

CI has passed.
Can you take a look at it again? Thanks! @tqchen @yongwww

@tqchen tqchen merged commit 9e2a75d into apache:main Sep 23, 2024
15 checks passed
@tqchen
Copy link
Member

tqchen commented Sep 23, 2024

cc @yongwww @vinx13 in case we also need to manually push to the tlcpack images from staging

@mshr-h mshr-h deleted the upgrade-ci-image-20240917 branch September 23, 2024 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants