Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bugfix][Relax] BlockBuilder may not assume unique input functions #16805

Merged
merged 1 commit into from
Apr 1, 2024

Conversation

Lunderberg
Copy link
Contributor

@Lunderberg Lunderberg commented Mar 27, 2024

Prior to this commit, the implementation of relax::BlockBuilder::AddFunction implicitly assumed that the input IRModule does not contain duplicate copies of the same function. This commit updates the implementation, removing the reliance on this assumption. This commit resolves the error by tracking all GlobalVar that map to the same function, rather than an just one.

A well-formed IRModule may contain duplicate function definitions. This is rare, as most functions can be disambiguated by the the function attribute tvm::attr::kGlobalSymbol. However, private functions do not have this attribute, and a well-formed IRModule may contain multiple copies of the same function.

The regression test added in this PR calls BlockBuilder::UpdateFunc and BlockBuilder::AddFunc in a specific order to reproduce this issue. In practice, this failure was sporadic, depending on the order in which a transformation pass visited functions in a module. This was first observed in VMShapeLower, with sporadic errors depending on the order of iteration over mod->functions.

Prior to this commit, the implementation of
`relax::BlockBuilder::AddFunction` implicitly assumed that the input
`IRModule` does not contain duplicate copies of the same function.
This commit updates the implementation, removing the reliance on this
assumption.  This commit resolves the error by tracking all
`GlobalVar` that map to the same function, rather than an just one.

A well-formed IRModule may contain duplicate function definitions.
This is rare, as most functions can be disambiguated by the the
function attribute `tvm::attr::kGlobalSymbol`.  However, private
functions do not have this attribute, and a well-formed IRModule
may contain multiple copies of the same function.

The regression test added in this PR calls `BlockBuilder::UpdateFunc`
and `BlockBuilder::AddFunc` in a specific order to reproduce this
issue.  In practice, this failure was sporadic, depending on the order
in which a transformation pass visited functions in a module.  This
was first observed in `VMShapeLower`, with sporadic errors depending
on the order of iteration over `mod->functions`.
@tqchen
Copy link
Member

tqchen commented Mar 31, 2024

In this case, I think we should perhaps fix the pass instead if that is a rare case. The main consideration is the added cost and complexity

@Lunderberg
Copy link
Contributor Author

I don't think there's any indication of this being a bug in the pass itself. I also don't know if it's a rare case, only that I encountered it within one specific pass. I think this is a bug in the general relax::BlockBuilder utility, as nothing in the API suggests that use of AddFunc and UpdateFunc within the same transform is unsupported. If we want to avoid complexity in the relax::BlockBuilder, then we should remove the function-level de-duplication from relax::BlockBuilder altogether, and handle it in a dedicated transform.

@Lunderberg
Copy link
Contributor Author

Rephrasing, I agree that we should minimize the complexity of implementations. However, I think that must be done by choosing which features to offer within an implementation. Since relax::BlockBuilder already provides de-duplication, and allows passes to freely use both AddFunc and UpdateFunc, the complexity in this implementation is complexity that we've already committed to.

I think we should merge this PR, as it resolves a bug in the current feature set provided by relax::BlockBuilder. To reduce the complexity of relax::BlockBuilder, a follow-up change can try moving the de-duplication of functions into a separate pass.

@tqchen tqchen merged commit 3f615dc into apache:main Apr 1, 2024
19 of 20 checks passed
@tqchen
Copy link
Member

tqchen commented Apr 1, 2024

OK, after looking through again i agree, so merging this

@Lunderberg Lunderberg deleted the bugfix_block_builder_deduplication branch April 1, 2024 16:01
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024
…pache#16805)

Prior to this commit, the implementation of
`relax::BlockBuilder::AddFunction` implicitly assumed that the input
`IRModule` does not contain duplicate copies of the same function.
This commit updates the implementation, removing the reliance on this
assumption.  This commit resolves the error by tracking all
`GlobalVar` that map to the same function, rather than an just one.

A well-formed IRModule may contain duplicate function definitions.
This is rare, as most functions can be disambiguated by the the
function attribute `tvm::attr::kGlobalSymbol`.  However, private
functions do not have this attribute, and a well-formed IRModule
may contain multiple copies of the same function.

The regression test added in this PR calls `BlockBuilder::UpdateFunc`
and `BlockBuilder::AddFunc` in a specific order to reproduce this
issue.  In practice, this failure was sporadic, depending on the order
in which a transformation pass visited functions in a module.  This
was first observed in `VMShapeLower`, with sporadic errors depending
on the order of iteration over `mod->functions`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants