Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TVMScript] Disable black_format by default #15706

Merged
merged 1 commit into from
Sep 9, 2023

Conversation

Hzfengsy
Copy link
Member

@Hzfengsy Hzfengsy commented Sep 8, 2023

Black is a popular code formatter for Python. However, it is not suitable for TVM script. This PR tries to disable it by default.

This would change the default printing behavior, love to hear opinions from the community.

cc @tqchen @Lunderberg @junrushao @cyx-6

`Black` is a popular code formatter for Python. However, it is not
suitable for TVM script. This PR tries to disable it by default.
@junrushao
Copy link
Member

In most of the cases, formatting TVMScript results in a quite long python program spanning over a screen, so I suppose this is a better change by default.

@junrushao junrushao merged commit 7322769 into apache:main Sep 9, 2023
@Lunderberg
Copy link
Contributor

I've preferred having the auto-formatting of TVMScript, primarily to avoid having extremely long single lines (e.g. function declarations of end-to-end models), which would often be displayed incorrectly in paginators.

@Mousius
Copy link
Member

Mousius commented Sep 9, 2023

I've preferred having the auto-formatting of TVMScript, primarily to avoid having extremely long single lines (e.g. function declarations of end-to-end models), which would often be displayed incorrectly in paginators.

Agree.

This would change the default printing behavior, love to hear opinions from the community.

It would've been good to allow more time for Community Participation.

@junrushao
Copy link
Member

@Lunderberg it makes sense to have shorter lines in function signature, but my main worry is that it may lead to too fragmented line breakings like this: https://github.com/apache/tvm/blob/main/tests/python/unittest/test_tvmscript_roundtrip.py#L374, which happens a bit more frequently than long function signatures considering our arith analysis in most of the cases. Therefore, I usually turn off black_format when developing TIR schedules/passes.

I don’t have strong opinion on the default value of this flag though. Personally I always write “.show(black_format=False)” by default. It doesn’t bother me a lot, but considering there seem more cases it’s better to be turned off in the test file I shared, it might be better to do so by default.

junrushao pushed a commit to junrushao/tvm that referenced this pull request Sep 10, 2023
`Black` is a popular code formatter for Python. However, it is not
suitable for TVM script. This PR tries to disable it by default.
@Lunderberg
Copy link
Contributor

Lunderberg commented Sep 15, 2023

@junrushao True, it does tend to depend on what type of IR somebody tends to look at. Relax IR tends to be very flat, due to the restricted form of the DataflowBlock. The TIR examples tend to be more nested, especially with many loops over high-dimensionality buffers.

That said, from the example linked there are a number of improvements we could make that would make the TVMScript more readable, with the ease of black-auto-formatting coming along as a side effect.

  • Avoid extraneous parentheses. Currently, TVMScript gives explicit parentheses for operations, even when the parentheses are unnecessary. Edit: The example has extra parentheses that TVMScript now avoids. ((x_outer * 32768) + (x_c * 1024)) + (k_outer * 4) could instead be x_outer * 32768 + x_c * 1024 + k_outer * 4
  • The T.grid syntax is only used for serial loops. If it had an additional keyword-only argument for the loop types, other nested for x in T.parallel(0, 32): for y in T.serial(0, 1024): could instead be written as for x,y in T.grid(32, 1024, type='PS').
  • T.ramp and T.broadcast can be printed as python slices into a buffer, rather than explicitly written. This lets C_global[T.ramp((x_c_init * 32), 1, 32)] = T.broadcast(T.float32(0), 32) be written as C_global[x_c_init*32: x_c_init*32 + 32] = T.float32(0) Edit: The current TVMScript printer handles this case.

There are also lowering passes we could implement that would improve performance, with improved TVMScript readability as a side-effect.

  • Automatic loop fusion. If two nested loops have the same type, that type is not kVectorized, and the loop iterators are always used as the expression iter_outer*extent_inner + iter_inner, then the loops can be fused into a single loop.

Edit: It looks like a large number of the changes I was brainstorming on are already implemented, but are not included in all TVMScript examples in the unit tests. For curiosity, I re-printed the linked example, both with and without black-auto-formatting (link), and the improved printing makes the TVMScript more readable, and more amenable to auto-formatting.

@Lunderberg
Copy link
Contributor

That said, those are each longer plans. An easier solution for the moment may be to have an environment variable serving as the default, rather than a single default for all developers.

def show(..., black_format=None, ...):
    if black_format is None:
        black_format = bool(int(os.environ.get('TVM_BLACK_FORMAT','1')))

@Lunderberg
Copy link
Contributor

Lunderberg commented Sep 15, 2023

I submitted #15762, which determines the default from an environment variable.

@junrushao
Copy link
Member

Thanks @Lunderberg! 100% agreed with your comments, and particularly, feeling the same as you that the previous generation of TVMScript printer comes with many subtle issues making it not good looking or non-pythonic at all, as you mentioned, extra parentheses, non-fused loops, and ramp/broadcast syntax. The environment variable based approach is even better to me. Happy to get it in!

I have been thinking about this issue as well in the recent days, as it's something we all care about a lot. I think all of us agree with the formatting perspective, and my personal source of pain is mainly about the line width. How about @Hzfengsy? If line-width is the only thing we have different opinions, we can make it configurable using an environment variable.

Let's also move the discussion to @Lunderberg's new PR for more visbility!

Thanks again Eric for being always super thoughtful!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants