Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AOT] Add AOTLowerMain pass to lower a Relay main into TIR #12550

Merged
merged 1 commit into from
Sep 14, 2022

Conversation

mbaret
Copy link
Contributor

@mbaret mbaret commented Aug 22, 2022

Tracking issue: #12548

This is a pass refactored out of the AOTExecutorCodegen. Instead of combining all of the functionality of the AOTExecutorCodegen into a single monolithic pass, this pass only handles the lowering of the Relay main function into TIR. Tests for the pass are included.

cc @Mousius @alanmacd @areusch

@mbaret
Copy link
Contributor Author

mbaret commented Aug 22, 2022

Copy link
Member

@Mousius Mousius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @mbaret! I'm assuming this is mostly a move and a clean up with additional test coverage? Based on that I think this is already much better than what we have but it'd be good to split out the ExprAllocator pass and test it independently - otherwise I think we can get this in to move forwards with the larger refactor

tests/python/relay/aot/test_pass_aot_lower_main.py Outdated Show resolved Hide resolved
tests/python/relay/aot/test_pass_aot_lower_main.py Outdated Show resolved Hide resolved
tests/python/relay/aot/test_pass_aot_lower_main.py Outdated Show resolved Hide resolved
@mbaret
Copy link
Contributor Author

mbaret commented Aug 24, 2022

I'm assuming this is mostly a move and a clean up with additional test coverage?

Yeah that's right, we were finding the the AOT codegen had become unmanageably complex and had very poor testing coverage so it was difficult to add features to. This is a first attempt at bringing it back from the brink :) There's some more context on where this is headed in the tracking issue.

It'd be good to split out the ExprAllocator pass and test it independently

I did consider this, but sided against it on the basis that ExprAllocator is an implementation detail of AOTMainLowerer. We should still be able to test all the behaviours implemented by ExprAllocator through the stable interface of AOTMainLowerer.

I accept there are some conflicting testing philosophies around this though (I think BDD vs. TDD being the relevant one here), and so am open to other views.

@Mousius
Copy link
Member

Mousius commented Aug 24, 2022

I'm assuming this is mostly a move and a clean up with additional test coverage?

Yeah that's right, we were finding the the AOT codegen had become unmanageably complex and had very poor testing coverage so it was difficult to add features to. This is a first attempt at bringing it back from the brink :) There's some more context on where this is headed in the tracking issue.

Makes sense, I reviewed it in this spirit rather than a complete rewrite 😸

It'd be good to split out the ExprAllocator pass and test it independently

I did consider this, but sided against it on the basis that ExprAllocator is an implementation detail of AOTMainLowerer. We should still be able to test all the behaviours implemented by ExprAllocator through the stable interface of AOTMainLowerer.

I accept there are some conflicting testing philosophies around this though (I think BDD vs. TDD being the relevant one here), and so am open to other views.

In either the BDD or the TDD case you'd have tests first so no code would appear without a test to cover it. I don't think that's the case here as the tests are retro-fitted to the existing code? If written tests first, I would expect to follow the method you've described by starting with tests for AOTMainLowerer and then write tests for the internal units where I can't test them as part of the higher level interface.

I don't think you can test all the behaviours of ExprAllocator through AOTMainLowerer due to cases such as:
https://github.com/apache/tvm/pull/12550/files#diff-fd79a3a236f19a9f85e5045e7c75ed47a09ad7d698042114a93f4f74dd17c75eR99-R101
Which is an internal optimisation inside of ExprAllocator that hopefully we don't remove but makes no difference to the outer behaviour of the AOT pass and is therefore difficult to assert is behaving correctly.

The TVM Code Quality Guidelines philosophy is that we should ensure test coverage, implying we should test these cases and also error cases which are equally part of the Passes contract.

@mbaret
Copy link
Contributor Author

mbaret commented Aug 24, 2022

Which is an internal optimisation inside of ExprAllocator that hopefully we don't remove but makes no difference to the outer behaviour of the AOT pass and is therefore difficult to assert is behaving correctly.

If that code is unreachable from AOTMainLowerer, it is fundamentally unreachable from the program that is 'TVM', and at the end of the day TVM is the component under test. If it is reachable, then we can test it through the AOTMainLowerer interface.

I think it is reachable because we can always construct a test case with a function marked kPrimitive for AOTMainLowerer. If that's the primary concern here I'll happily add in such a test. I would caution though that the TVM Code Quality Guidelines make no commitment to 100% coverage, so I think it's important we focus our testing effort wisely (I certainly haven't covered every corner case in the current tests).

@Mousius
Copy link
Member

Mousius commented Aug 24, 2022

Which is an internal optimisation inside of ExprAllocator that hopefully we don't remove but makes no difference to the outer behaviour of the AOT pass and is therefore difficult to assert is behaving correctly.

If that code is unreachable from AOTMainLowerer, it is fundamentally unreachable from the program that is 'TVM', and at the end of the day TVM is the component under test. If it is reachable, then we can test it through the AOTMainLowerer interface.

The line itself is not unreachable, it's harder to test and ensure the behaviour from within AOTMainLowerer as you're essentially testing "nothing happened" with the kPrimitive attribute added. There are always pieces of behaviour which are not explicitly user facing but are important for the overall behaviour of the program, else you could consider the existing AOT tests which invoke relay.build to be sufficient?

I think it is reachable because we can always construct a test case with a function marked kPrimitive for AOTMainLowerer. If that's the primary concern here I'll happily add in such a test. I would caution though that the TVM Code Quality Guidelines make no commitment to 100% coverage, so I think it's important we focus our testing effort wisely (I certainly haven't covered every corner case in the current tests).

Yip, you can interpret "test coverage" differently, I'd rather not gamble on whether a future change will be caught or not by the tests though 😸

@mbaret
Copy link
Contributor Author

mbaret commented Aug 24, 2022

At the end of the day, I think we're debating whether or not to test internal optimizations for what is essentially performance benefit. I'm especially wary to set precedent here because this very refactor would have been made a lot harder if extensive testing of internal implementation existed.

I appreciate this is effectively down to personal preference given no strict ruling in the code quality guidelines around acceptable coverage, and that makes discussions like this hard to progress. In the spirit of Consensus Building, perhaps we can wait and see if there's any other input given I think we've both presented the extent of our arguments at this stage.

Copy link
Contributor

@lhutton1 lhutton1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As someone who recently tried to make sense of the AOT codegen this is definitely a welcomed change :) I see that nothing was removed from aot_executor_codegen.cc, I'm assuming that this will come in a later PR?

python/tvm/relay/backend/utils.py Show resolved Hide resolved
src/relay/backend/aot/aot_lower_main.cc Show resolved Hide resolved
@mbaret
Copy link
Contributor Author

mbaret commented Aug 24, 2022

I see that nothing was removed from aot_executor_codegen.cc, I'm assuming that this will come in a later PR?

Yeah, aot_executor_codegen.cc only gets refactored itself in P4. The first 3 PRs are to extract functionality from aot_executor_codegen.cc as standalone passes, these then get combined to build the newly refactored codegen. I've got a draft PR demonstrating this if you want to see how that ends up looking: #12549.

@areusch
Copy link
Contributor

areusch commented Aug 25, 2022

I think it is reachable because we can always construct a test case with a function marked kPrimitive for AOTMainLowerer. If that's the primary concern here I'll happily add in such a test. I would caution though that the TVM Code Quality Guidelines make no commitment to 100% coverage, so I think it's important we focus our testing effort wisely (I certainly haven't covered every corner case in the current tests).

Yip, you can interpret "test coverage" differently, I'd rather not gamble on whether a future change will be caught or not by the tests though 😸

@Mousius regarding test coverage: generally I'm supportive of having more test coverage, but i'm not sure in this case it's necessary to block the PR over it; in this case, i think the regression is that we'd visit more of the main function than we'd need to in building expr_storage_map_. to me this falls a bit more on the "perfect" side of the don't-let-perfect-be-the-enemy-of-good argument. is there another kind of regression you can see here that is hard to test without explicitly testing ExprAllocator standalone?

Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one other minor comment here

src/relay/backend/aot/aot_lower_main.cc Outdated Show resolved Hide resolved
@mbaret
Copy link
Contributor Author

mbaret commented Aug 29, 2022

I'm keen to make progress with this issue and given the AOT codegen is largely untested, I believe this refactor represents a strict improvement over the current situation. The decisions I've made here aren't meant to be final and patches are always welcome to introduce further coverage, including adding specific test cases where community stakeholders care about particular functionality not regressing.

To give a more blunt personal view, there's an enormous amount of untested code in TVM with a lot of it poorly specified. If we make the barrier-to-entry for participating in improving that situation '100% coverage', rather than getting full coverage we'll just get no participation. That being said, if we want to update the project's code quality guidelines to be more explicit around testing expectations, I'd be fully supportive of that effort and would happily participate in any discussions arising from an RFC. We can then vote as a community on how we want to go forward rather than having the same debates on a PR-by-PR basis.

@lhutton1 could you request further changes/approve, thanks :)
@Mousius let me know what you think

Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @mbaret ! @Mousius could you take a look?

Copy link
Contributor

@ekalda ekalda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being late to the party - just some nits :) In general, I'm really happy to see this bit of codebase being cleaned up, so thanks a lot @mbaret !

Regarding to testing the ExprAllocator, what's the problem with testing it? Would it be very difficult to do? It has quite a bit of logic in it (and at least to me, everything that has visitors becomes somewhat unpredictable and hard to reason about), so my instincts align here with @Mousius. But if the gains from testing it are marginal compared to the effort, I'd rather not stop that PR going ahead because of this.

src/relay/backend/aot/aot_lower_main.cc Outdated Show resolved Hide resolved
src/relay/backend/aot/aot_lower_main.cc Outdated Show resolved Hide resolved
src/relay/backend/aot/aot_lower_main.cc Show resolved Hide resolved
src/relay/backend/aot/aot_lower_main.cc Show resolved Hide resolved
@mbaret
Copy link
Contributor Author

mbaret commented Aug 30, 2022

Regarding to testing the ExprAllocator, what's the problem with testing it? Would it be very difficult to do? It has quite a bit of logic in it (and at least to me, everything that has visitors becomes somewhat unpredictable and hard to reason about), so my instincts align here with @Mousius. But if the gains from testing it are marginal compared to the effort, I'd rather not stop that PR going ahead because of this.

The functional logic of ExprAllocator is tested because it's part of the implementation of AOTLowerMain (specifically it's working out what Allocates are needed). I've chosen not to test it as a standalone component here for the same reason I haven't tested things like CreateMainFunc - all the behaviours of interest can be tested through the interface of AOTLowerMain.

@Mousius raised the point that ExprAllocator has some logic that can't be easily tested through that interface, namely an optimization to avoid creating StorageInfo for functions AOTLowerMain doesn't care about. The current testing would allow this to regress with the consequence being a marginal (probably microsecond) performance regression.

@Mousius
Copy link
Member

Mousius commented Aug 30, 2022

I think it is reachable because we can always construct a test case with a function marked kPrimitive for AOTMainLowerer. If that's the primary concern here I'll happily add in such a test. I would caution though that the TVM Code Quality Guidelines make no commitment to 100% coverage, so I think it's important we focus our testing effort wisely (I certainly haven't covered every corner case in the current tests).

Yip, you can interpret "test coverage" differently, I'd rather not gamble on whether a future change will be caught or not by the tests though 😸

@Mousius regarding test coverage: generally I'm supportive of having more test coverage, but i'm not sure in this case it's necessary to block the PR over it; in this case, i think the regression is that we'd visit more of the main function than we'd need to in building expr_storage_map_. to me this falls a bit more on the "perfect" side of the don't-let-perfect-be-the-enemy-of-good argument. is there another kind of regression you can see here that is hard to test without explicitly testing ExprAllocator standalone?

I've requested a minimal amount of additional tests be added to this PR, which is attempting to address a broken window in the code base. Please don't get hung up on the singular example I found from a brief review, what troubles me more is the active resistance to investing effort in testing, this is not good and far from perfect. If we are aware the behaviour could regress, and we can solidify the behaviour with a test case, I don't understand the rationale for ignoring it? We're already ignoring a large amount of the overall logic by not testing error cases, which in itself means we can't guarantee the component fails safely.

@mbaret
Copy link
Contributor Author

mbaret commented Aug 30, 2022

To be clear, this isn't just a case of me refusing to put the test in 'just because it's effort' - I don't agree at a fundamental level with testing an implementation detail that has no functional impact on the behaviour of the component-under-test. Because of this, I don't want to establish precedent for what I believe to be bad practice (such tests have created nightmares for me in the past when refactoring). I accept you have a different view, and I've outlined a path by which the community can come to a consensus on testing best-practices. I don't think the comments on a PR with incredibly limited visibility is the forum for this though.

@mbaret mbaret force-pushed the aot-refactor-p1 branch 3 times, most recently from cad0932 to 11a4573 Compare September 6, 2022 19:03
@mbaret
Copy link
Contributor Author

mbaret commented Sep 6, 2022

cc @ekalda @lhutton1 could you take another look and see if you've got further comments or can approve?

This is a pass refactored out of the AOTExecutorCodegen. Instead of
combining all of the functionality of the AOTExecutorCodegen into a
single monolithic pass, this pass only handles the lowering of the
Relay main function into TIR. Tests for the pass are included.
@mbaret
Copy link
Contributor Author

mbaret commented Sep 9, 2022

@Mousius As this PR is being blocked and none of my proposals to achieve consensus have been acknowleged, I have done the following:

  • A new function, CreateStorage, has been created and exposed in the header to allow ExprAllocator to be subjected to testing.

  • A test has been added to confirm that StorageInfo is not allocated for nested functions as requested.

Because ExprAllocator is coupled to the implementation of AOTLowerMain, I've added a comment to indicate this in the header. Unfortunately, there was no programmatic way to enforce that this function isn't reused outside AOTLowerMain without making it untestable. Furthermore, the logic under test is unreachable from AOTLowerMain as AOT does not support nested functions and will check fail if they're encountered.

Lastly, I would prefer going forward that if committers intend to block PRs indefinitely, in parallel they engage with a consensus building process involving multiple members of the community. Without such engagement, contributors are disempowered to take part in good faith technical discussions which is a core tenet of the review process.

@lhutton1 @ekalda please approve or request changes explicitly.

Copy link
Contributor

@lhutton1 lhutton1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many apologies for the delay in taking another look at this - thanks @mbaret LGTM with regards to my comments. With regards to the testing, I don't really hold a strong opinion so will leave for others to take a look

Copy link
Member

@Mousius Mousius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mousius As this PR is being blocked and none of my proposals to achieve consensus have been acknowleged

It seems to me there was an active discussion on this PR with various interested parties participating which leads me to believe we are working towards consensus? The code review guidelines were also contributed via voting thread and explicitly state test coverage is expected for code contributions.

Because ExprAllocator is coupled to the implementation of AOTLowerMain, I've added a comment to indicate this in the header. Unfortunately, there was no programmatic way to enforce that this function isn't reused outside AOTLowerMain without making it untestable. Furthermore, the logic under test is unreachable from AOTLowerMain as AOT does not support nested functions and will check fail if they're encountered.

I think this is a great outcome of a code review, by attempting to find an appropriate way of testing this I believe we've found this code is actually unreachable and can be removed? This is a good reason to encourage more discussions to improve test coverage, as it helps to highlight code which is unreachable (or requiring a slightly more focussed test as was first thought). In this case, can we just remove the unreachable code? If it's re-added I would expect we can re-add with the appropriate tests.

Lastly, I would prefer going forward that if committers intend to block PRs indefinitely, in parallel they engage with a consensus building process involving multiple members of the community. Without such engagement, contributors are disempowered to take part in good faith technical discussions which is a core tenant of the review process.

It was never my intention to block this PR indefinitely, I raised my request for changes early and have been active in the technical discussion to work with you to move this forwards. As noted, the code review guidelines were voted in by the community, if you wish to make a change to reduce test coverage, I propose you raise a process RFC which we can discuss and vote on. Contributors should be aware that "High-quality code reviews prevent technical debt for long-term and are crucial to the success of the project", in which case there are likely to be discussions and changes requested - I'd hope a contributor feels empowered by an active review thread and welcome feedback on their contributions?

@ekalda
Copy link
Contributor

ekalda commented Sep 12, 2022

Apologies, I was away last week, so just caught up with the discussion here.

@mbaret I'm a bit confused by the addition of CreateStorage - you added handling (and testing) of nested functions, but we wouldn't reach that bit of code from AOTLowerMain since it doesn't support nested functions? It's not entirely clear to me where this came from.

My opinion regarding to the testing has not changed - I agree with @mbaret in general that we shouldn't test implementation details, but I'd test ExprAllocator in this case since besides being an implementation detail it is a chunky pass with complex logic. However, I don't perceive it as a fundamental issue, so I have no intention to block this PR. I also don't think the "ensure test coverage of the new feature" of the code quality guidelines is particularly helpful here since it is not defined what "test coverage" means. So I'd approve this PR once I'll understand better what is going on with the CreateStorage (since clearly I'm missing something).

@mbaret
Copy link
Contributor Author

mbaret commented Sep 12, 2022

@ekalda Thanks for taking another look and no worries for the delay.

The addition of CreateStorage and the associated test was to address the concern from @Mousius raised here:

I don't think you can test all the behaviours of ExprAllocator through AOTMainLowerer due to cases such as:
https://github.com/apache/tvm/pull/12550/files#diff-fd79a3a236f19a9f85e5045e7c75ed47a09ad7d698042114a93f4f74dd17c75eR99-R101
Which is an internal optimisation inside of ExprAllocator that hopefully we don't remove but makes no difference to the outer behaviour of the AOT pass and is therefore difficult to assert is behaving correctly.

I can't write a test against ExprAllocator directly without exposing it through the header because otherwise I can't include it in the test case. That's why I added CreateStorage, to provide an exposed function through which to probe ExprAllocator (like how we have AOTLowerMain and AOTMainLowerer). To me this is slightly problematic because it was never the intent to expose ExprAllocator, but I have no choice if we require tests are written against it directly.

The reason that particular test has been added is because it's one behaviour of ExprAllocator we can't test through the interface of AOTLowerMain. This is because AOTLowerMain will throw an exception if it encounters a nested function. However, AOTLowerMain will throw after ExprAllocator runs, so the code is still in a strict sense reachable.

I'd test ExprAllocator in this case since besides being an implementation detail it is a chunky pass with complex logic.

I think this is a fair point, and to be honest were I writing this from scratch I probably would have added some tests for ExprAllocator to help write it in the first place. In fact, if a bit more love and time were put into ExprAllocator it could probably be made into a reusable component (rather than an implementation detail), but I think that exceeds the scope of this refactor.

However, now we also have AOTLowerMain written, to test ExprAllocator entirely separately would require a lot of duplicated tests (the AOTLowerMain tests would be looking at the Allocates produced whereas the ExprAllocator tests would look at the StorageInfos). If we want to spend this effort, my personal view is it should be done in the context of promoting ExprAllocator to be a proper analysis pass rather than as part of this refactor.

I also don't think the "ensure test coverage of the new feature" of the code quality guidelines is particularly helpful here since it is not defined what "test coverage" means.

Agreed, and I think the fact there are 5 committers in this discussion all with differing interpretations of those guidelines is testament to that point 😅

@Mousius
Copy link
Member

Mousius commented Sep 12, 2022

The reason that particular test has been added is because it's one behaviour of ExprAllocator we can't test through the interface of AOTLowerMain. This is because AOTLowerMain will throw an exception if it encounters a nested function. However, AOTLowerMain will throw after ExprAllocator runs, so the code is still in a strict sense reachable.

Ah, good point, I guess it's expedited the error 😸

However, now we also have AOTLowerMain written, to test ExprAllocator entirely separately would require a lot of duplicated tests (the AOTLowerMain tests would be looking at the Allocates produced whereas the ExprAllocator tests would look at the StorageInfos). If we want to spend this effort, my personal view is it should be done in the context of promoting ExprAllocator to be a proper analysis pass rather than as part of this refactor.

I think testing the behaviour which can't be tested higher up is good here, we don't have to duplicate the complete test suite at every layer - agree that if it was actually its own pass we can test it in isolation fully.

I can't write a test against ExprAllocator directly without exposing it through the header because otherwise I can't include it in the test case. That's why I added CreateStorage, to provide an exposed function through which to probe ExprAllocator (like how we have AOTLowerMain and AOTMainLowerer). To me this is slightly problematic because it was never the intent to expose ExprAllocator, but I have no choice if we require tests are written against it directly.

Sorry, I misunderstood this originally (thought you meant you couldn't make it static rather than excluding it from the interface header), you can just put the prototype in the C++ test file, similar to how it's done for grabbing the Target hooks here:
https://github.com/apache/tvm/blob/main/src/relay/backend/contrib/cmsisnn/target.cc#L33-L34

Copy link
Member

@Mousius Mousius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move the CreateStorage function out of the header, but otherwise I appreciate your patience in building up our test coverage @mbaret 😸 Will leave that bit with @ekalda

Copy link
Contributor

@ekalda ekalda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification @mbaret! I hadn't realised the handling of nested function was there before. I'm happy with the current state of testing of this PR (I suppose you could move CreateStorage out of the header the way @Mousius suggests in a future patch). I agree that it would be more appropriate to extend the testing coverage when we make ExprAllocator a general component, so it's probably out of scope of this refactor.

Thanks again for all the effort! 🏅

@lhutton1 lhutton1 merged commit f7f2cda into apache:main Sep 14, 2022
@lhutton1
Copy link
Contributor

Thanks @mbaret @areusch @Mousius @ekalda!

xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
)

This is a pass refactored out of the AOTExecutorCodegen. Instead of
combining all of the functionality of the AOTExecutorCodegen into a
single monolithic pass, this pass only handles the lowering of the
Relay main function into TIR. Tests for the pass are included.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants