-
Notifications
You must be signed in to change notification settings - Fork 797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate tuple allocations in branching let binding rhs #11407
Conversation
As mentioned in the comments, the optimization does not kick in with a first class tuple use: let z (tuple: int * int) =
let a, b =
if cond () then
tuple
else
1, 2
a + b It's doable but I feel a lot more code would be required, plus I'm not sure if it's desirable in the first place. |
Ideas for other interesting tests, corner cases, would be appreciated. |
Any feedback? I don't know when the fsc feature cutoff point happens (or if there is such a thing), but it would be a shame if this didn't make it into the next F# compiler version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code generally looks good, and the tests are great. Would like to get @TIHan or @dsyme or @vzarytovskii to take a look as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great
The only change I'd really like is to delay the generation of the val refs etc. until we detemine they will be needed.
Also maybe an assert in IlxGen.fs that temporaries of the given name have always been eliminated by Optimizer.fs?
src/fsharp/Optimizer.fs
Outdated
| _ -> None | ||
| _ -> None | ||
|
||
let argTys = destRefTupleTy g v.Type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has the potential to execute very often - once for each tuple-typed let
. I guess the tuple type acts as a good filter but I think it would be great to delay this execution of computing vrefs
until we know it's really going to be necessary, i.e. we actually have a Expr.Op
at the end of every branch, or at least one branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would basically wrapping these 4 lines in a Lazy
be an acceptable solution? Annoyingly, the lazy value would have to be a 4-element tuple since argTys
, ves
and binds
are referred to further on too, not just used to create vrefs
for the dive
function. Or is it literally just vrefs
that you want delayed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dsyme, it's ready for another look.
src/fsharp/Optimizer.fs
Outdated
|
||
|
||
let MakeMutableStructuralBindingForTupleElement (v: Val) i (arg: Expr) argTy = | ||
let name = sprintf "%s_%d_%s" v.LogicalName i PrettyNaming.tempTupleElementAssignmentTargetName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, these temporaries should always be removed within Optimizer.fs, even in debug code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. Given this:
let x () =
let a, b =
sideEffect ()
if cond () then
let v = "yep"
let v2 = if cond () then 1 else 3
v, v2
else
"", f ()
a, b
no tuples are created on the right hand side, but those temporaries will still be used to create a tuple in the let body. Yes, the optimization is a bit pointless here, but I reckon it would be more expensive to detect whether the temporaries will be eliminated in the end, rather than just rewrite the right hand side.
public static Tuple<string, int> x()
{
TupleElimination.sideEffect();
string item;
int item2;
if (TupleElimination.cond())
{
int num = (!TupleElimination.cond()) ? 3 : 1;
item = "yep";
item2 = num;
}
else
{
item = "";
item2 = TupleElimination.f();
}
return new Tuple<string, int>(item, item2);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps not pointless at all. This is what happens today.
The more I think about it, I realize the temporaries are never removed by the optimizer (that must be why I had not originally appendend temp to the function name, like in MakeStructuralBindingTemp
), so the prefix in tempTupleElementAssignmentTargetName
is not the best. Maybe you had something else on your mind?
src/fsharp/Optimizer.fs
Outdated
@@ -2586,7 +2586,7 @@ and AddValEqualityInfo g m (v: ValRef) info = | |||
// when their address is passed to the method call. Another exception are mutable variables | |||
// created for tuple elimination in branching tuple bindings because they are assigned to | |||
// exactly once. | |||
if v.IsMutable && not (v.IsCompilerGenerated && (v.DisplayName.StartsWith(PrettyNaming.outArgCompilerGeneratedName) || IsMutableStructuralBindingForTupleElement v)) then | |||
if v.IsMutable && not (IsMutableStructuralBindingForTupleElement v) && not (v.IsCompilerGenerated && v.DisplayName.StartsWith(PrettyNaming.outArgCompilerGeneratedName)) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the not a && not b
is irking me, does inverting the conditionals / branch looks better?
Also wondering about repeated calls to v.IsCompilerGenerated that now occurs (one is introduced, unless mistaken).
As you are stress testing the inliner and bool expression optimizer, maybe you should also stress test the final reviewer by rewriting the whole condition from scratch to make it more readable 🙂.
Suggestion, since both current branches are very small expressions that could be copied:
In the comment, it says
an exception can be made for outArg values...
this would be one simple if condition, then it says
Another exception are mutable variables
created for tuple elimination in branching tuple bindings because they are assigned to
exactly once.
This would be another one, the final case is just returning the default case.
Sorry if bikeshedding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does inverting the conditionals / branch looks better
not (a || b)
? Hm, I don't feel strongly either way.
Also wondering about repeated calls to v.IsCompilerGenerated
I know, but it needs to be a part of IsMutableStructuralBindingForTupleElement
, otherwise If the caller does not also check IsCompilerGenerated
, then it could lead to bugs.
So what you're suggesting at the end is this?
if v.IsMutable && not (IsMutableStructuralBindingForTupleElement v) then
info
elif v.IsMutable && not (v.IsCompilerGenerated && v.DisplayName.StartsWith(PrettyNaming.outArgCompilerGeneratedName)) then
info
else
{ info with Info = MakeValueInfoForValue g m v info.Info }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kerams, thanks a lot for the feedback.
not (a || b)? Hm, I don't feel strongly either way.
if that's the end result, the best (to me) would be if a || b then
and inverting the branches expressions then (I don't like "not" for complex conditions, the same way I don't like isNull
, I'd prefer isNotNull
and not using "not" 🙂).
I know, but it needs to be a part of IsMutableStructuralBindingForTupleElement, otherwise If the caller does not also check IsCompilerGenerated, then it could lead to bugs.
Understood, it fits well with when I was proposing to extract a function, and it makes sense to have the redundant check to properly encapsulate the assumption.
It may also surface as a comment if you can expand on the deeper reason, to avoid someone trying to remove duplicate check to do it hastily.
So what you're suggesting at the end is this?
for the unsuspecting next maintainer dealing with this part (which seems sensitive for tuple elision, but it, hmm, escapes me how it actually works), I feel your proposed edit, with maybe a re-interleaving of the keywords (such as "outArg values", "temp mutable for tuples that can be elided") from the larger text above, would do a great job at improving the very expansive conditional in the code, and shouldn't add runtime cost, just a duplicate branch with info
.
So I see potential small wins for the clarity of the code, if this is not bikeshedding and you feel it makes it easier for next person following your journey, I wish to understand better the flow of Val
objects and do the work you do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which seems sensitive for tuple elision, but it, hmm, escapes me how it actually works
let a, b =
if System.Diagnostics.Debugger.IsAttached then
1, 3
else
3, 4
a + b
enters TryRewriteBranchingTupleBinding
in the form of
let patternInput = if System.Diagnostics.Debugger.IsAttached then 1, 3 else 3, 4
let b = patternInput.Item2
let a = patternInput.Item1
a + b
and exits as
let mutable patternInput_0_tupleElem = null
let mutable patternInput_1_tupleElem = null
let patternInput =
if System.Diagnostics.Debugger.IsAttached then
patternInput_0_tupleElem <- 1
patternInput_1_tupleElem <- 3
else
patternInput_0_tupleElem <- 3
patternInput_1_tupleElem <- 4
patternInput_0_tupleElem, patternInput_1_tupleElem
let b = patternInput.Item2
let a = patternInput.Item1
a + b
As you can see, there's still a let binding with a tuple (I copied that part from ExpandStructuralBindingRaw
, the 4 lines in the else
branch). It's only further on in OptimizeExprOpReductionsAfter
and TryOptimizeTupleFieldGet
that this tuple is eliminated. If you step through the code, you'll realize it's only possible when the equivalence of patternInput_0_tupleElem
and patternInput.Item1
is established based on the former's ExprValueInfo
. Had I not changed AddValEqualityInfo
to account for this mutable tupleElem
, there'd be no equality information (because normally you can't, for obvious reasons, equate a mutable variable with another one) and the result would look like:
public static int z()
{
int item;
int item2;
if (Debugger.IsAttached)
{
item = 1;
item2 = 3;
}
else
{
item = 3;
item2 = 4;
}
Tuple<int, int> tuple = new Tuple<int, int>(item, item2);
return tuple.Item1 + tuple.Item2;
}
If, on the other hand, I'd used a regular local in MakeMutableStructuralBindingForTupleElement
instead of mkMutableCompGenLocal
, patternInput_0_tupleElem
would be equated with that initial null
(0) forever, and the whole function would through a series of inlining and eliminations be reduced to this:
public static int z()
{
if (Debugger.IsAttached)
{
}
return 0 + 0;
}
Bump @dsyme |
let [<Literal>] suffixForVariablesThatMayNotBeEliminated = "$cont" | ||
|
||
/// Indicates a ValRef generated to facilitate tuple eliminations | ||
let [<Literal>] suffixForTupleElementAssignmentTarget = "tupleElem" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably name this $tupleElem
instead of tupleElem
because the user can never name a val
with a $
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TIHan, it works for me when double-ticked. The check is guarded with IsCompilerGenerated
, so it should be fine? outArg
does not use $
either. Let me know if you insist and I will of course change it;).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably fine then. I just want to make sure. :) Thank you for your work on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My pleasure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good and tests are great.
One minor comment on the suffix name.
Apologies for the late second review I'm concerned about the interaction with closure capture. The code: let f(a,b,c) =
let x,y =
if c then
printfn "hello"
a,b
else
b,a
(fun () -> x + y) will now promote We need to find a right approach to this that doesn't implicitly do this kind of thing I will revert this until we've sorted this out. |
While investigating I also noticed a significant change in the generated release code for let f(a,b) =
let x,y = printfn "hello"; b*a,a*b
(fun () -> x + y) Here the compiler eliminates the tuple binding and we end up capturing values names |
…net#11407)" This reverts commit e1c785d.
@kerams I suppose the right approach is to change let x,y = ... e1,e2 to let mutable x$tmp
let mutable y$tmp
...
x$tmp <- e1
y$tmp <- e2
let x = x$tmp // not mutable
let y = y$tmp // not mutable I'll see if I can make this change (I had actually thought the code was doing this) |
@dsyme, I think I'll need to recheck The second problem was most likely introduced (or rather surfaced in, because you get those pattern inputs in sharplab, without any of these PRs) in #11296, but I'm not sure how to fix that. (And I'm afraid the above will suffer from this too.) |
I see, yes, the problem may be simply with I'll take a deep look in about half an hour. I'd rather not revert of course, we should find a good fix |
|
@kerams Have you got time to chat on slack or teams or something? |
The capture boxing happens in autobox.fs I'm trying to understand why IsMutableStructuralBindingForTupleElement was needed |
It's needed in You also found something in #11267, to which I replied. |
On the question of the elimination of good user-facing values, yes, we have a problem This: let testFunction(a,b) =
let x,y = printfn "hello"; b*a,a*b
(fun () -> x + y) is becoming this: let mutable patternInput1 = <default>
let mutable patternInput2 = <default>
printfn "hello";
patternInput1 <- b*a
patternInput2 <- a*b
let patternInput = (patternInput1, patternInput2)
let x = patternInput1
let y = patternInput2
(fun () -> x + y) and the ValValue propagation is indeed being done for What happens if we just avoid doing this and have the multiple locals? Does assembly code get any worse? |
@kerams I think we should have a rule that whenever we have let f() =
...
let someCompilerGeneratedThing = expr
let someUserValue = someCompilerGeneratedThing
...someUserValue... and we use This is always valid and will always improve the debugging experience for release code |
Implements #8953.
becomes
instead of
More tests coming soon.