Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert the default-case optimisation to a MIR pass #33567

Closed
nagisa opened this issue May 11, 2016 · 8 comments
Closed

Convert the default-case optimisation to a MIR pass #33567

nagisa opened this issue May 11, 2016 · 8 comments
Labels
A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html

Comments

@nagisa
Copy link
Member

nagisa commented May 11, 2016

#33566 introduces an optimisation which speeds up LLVM greatly. My suspicion is that this optimisation is also very relevant to the MIR pipeline (i.e. MIR transformations ought to be slower to analyse and transform the switches with more branches). Thus we should port this optimisation over to a MIR pass down the line.

Moreover, the code in the PR relies on the assumption that the otherwise branch of the switches is always unreachable. This assumption has potential to become not true after we get more complex optimisations on the MIR going. Having such optimisation to be a plain MIR transformation would help with this too.

cc @dotdash

@nagisa nagisa added the A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html label May 11, 2016
@Aatch
Copy link
Contributor

Aatch commented May 19, 2016

The question is whether to change the Switch terminator to be non-exhaustive or make a new terminator instead. Making Switch non-exhaustive seems like the better option to me, as I'm not sure what the value of having two almost-identical instructions is.

Also, it being non-exhaustive means we can build the initial MIR a bit smarter. There are often cases where we can narrow down the candidates based on the fact that previous arms weren't taken:

enum Foo {
  A,
  B,
  C
}

match (x, y) {
  (_, A) => {},
  (A, _) => {} // We know that the `_` can't be A here because otherwise the earlier case would have been taken
}

I'm pretty sure the existing code tries to handle this already, I know old trans does.

@nagisa
Copy link
Member Author

nagisa commented May 20, 2016

The question is whether to change the Switch terminator to be non-exhaustive or make a new terminator instead. Making Switch non-exhaustive seems like the better option to me, as I'm not sure what the value of having two almost-identical instructions is.

IMO we should remove switch altogether. A big part of MIR is getting rid of things tied to language (e.g. we have no for, while etc loops, only a goto), but we (due to historical accidents, already!) have both the Switch and SwitchInt, where Switch is just a glorified version of SwitchInt that works on enum discriminants only.

What I think we ought to do, is:

  1. Add a Lvalue projection for the discriminant;
  2. Remove Switch and replace that with SwitchInt on discr(Lvalue).

The only downside I see – we would end up duplicating the values being Switched on with ones in AdtDef, but if we want to make Switch non-exhaustive that is necessary anyway.

@nagisa
Copy link
Member Author

nagisa commented May 20, 2016

Incidentally that should also decrease our reliance on trans::adt greatly.

@nagisa
Copy link
Member Author

nagisa commented May 20, 2016

also cc @nikomatsakis, in case you want to discuss this.

@nikomatsakis
Copy link
Contributor

Unsurprisingly, I do have an opinion:

  1. I definitely agree we should make the MIR translation for switches more intelligent. The primary goal with the current version was that it should be commented and (relatively) readable -- a tradition I hope we will continue, in strong contrast to the old trans version. :) But I think it's clear it doesn't generate the most optimal code.
  2. I am not sure about unifying Switch and SwitchInt. I see why it's appealing, and it may well make sense -- along these lines, maybe we should unify SwitchInt and If -- but I have two reservations:
    • I expect we will eventually want to read MIR as input, for example from different front-ends; in these cases, we'll need something to verify the various correctness assumptions that are currently assured "by construction". Having a more "semantic" switch (that clearly indicates it is testing the enum variant, rather than some random integer) will make this mildly easier. OTOH, we could address this in various other ways.
    • Does ADT translation always produce an integer to be tested? I guess probably it does; I was just imagining that sometimes (e.g., null vs non-null), this might introduce some silly transformations, though LLVM would probably optimize them away. (For example, the equivalent of let discrim = if x != null { 1 } else { 0 }; if discrim == 1 { ... } else { ... })

I'm not sure how much to weigh the first point. It seems like it's probably good to keep in mind, but bad to be overly constrained by this -- after all, if we want to change things later, we can do so, and it's clear that we don't know the full set of constraints right now.

Another way to look at unifying Switch and SwitchInt is that it's the kind of lowering that makes sense to do before trans for sure, but it's a semantic distinction I could imagine wanting to preserve during static checking.

Finally, I imagine we could bridge the gap just by making Switch kind of polymorphic over the type of discriminant:

  • integer, bool: the integer labels are the value itself
  • enum: the integer labels are variant numbers

In other words, we wouldn't generate code like:

TMP = LOAD_DISCRIMINANT(ENUM)
SWITCH_INT(TMP, ...)

but just

SWITCH_INT(ENUM, ...)

@nikomatsakis
Copy link
Contributor

So @scottcarr is starting up an internship and I was thinking this would make a good starter problem for him (presuming nobody else is hacking on it just now?). I was thinking specifically not of modifying the MIR in any particular way, but just modifying the MIR generation to not build a basic block for every variant, but instead to be more selective. (In other words, it would directly generate the IR that we wind up optimizing to.)

We could also change the IR, but the two things seem somewhat orthogonal to me.

@nikomatsakis
Copy link
Contributor

Here are some test cases (just for reference) that exhibit the N^2 blowup we observed (at least initially). Note that the one that uses integral constants does not: I think the simplest fix to the blowup is just to have the enum case use a similar pattern to what constants do, where it figures out which variants are "of interest" and creates an otherwise-like block to use as the target for the remainder (we might also change the IR; just seems orthogonal).

https://gist.github.com/nikomatsakis/dc3641659919d11b01a6fc14051831bd

bors added a commit that referenced this issue Jun 3, 2016
generate fewer basic blocks for variant switches

CC #33567
Adds a new field to TestKind::Switch that tracks the variants that are actually matched against.  The other candidates target a common "otherwise" block.
bors added a commit that referenced this issue Jun 5, 2016
generate fewer basic blocks for variant switches

CC #33567
Adds a new field to TestKind::Switch that tracks the variants that are actually matched against.  The other candidates target a common "otherwise" block.
@nagisa
Copy link
Member Author

nagisa commented Mar 12, 2017

I think this is done now after we switched to the SwitchInt everywhere.

@nagisa nagisa closed this as completed Mar 12, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html
Projects
None yet
Development

No branches or pull requests

3 participants