-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse lambda expressions with explicit return type #53827
Conversation
if (this.IsPossibleLambdaExpression(precedence)) | ||
{ | ||
return this.ParseLambdaExpression(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting ambiguity. if we have ref x () =>
is that a ref
to this lambda, or a lambda that returns by-ref. Given that the former seems fairly meaningless semantically (correct me if i'm wrong), it seems safe to presume this is the latter. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add comment on this what case this is matching?
if (foundParameterModifier) | ||
{ | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happened here? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case was previously used to treat (ref x,
as the start of an explicitly typed lambda. If we also allow those cases when there is an explicit return type, we'd treat F(ref x,
as the start of a lambda (with return type F
) rather than a call to F()
. The simplest change seemed to be look ahead for ) =>
in all cases as we were doing for cases with parameters with no modifiers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above, i'd put a comment to this effect in this method to help make it clearer for maintenance.
this.Reset(ref resetPoint); | ||
this.Release(ref resetPoint); | ||
|
||
return isAsync; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i skimmed. i def need to review this in depth later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewed in depth. i think this is correct. however, it is definitely on the knife's edge of being really hard to maintain and get right. this approach does have the virtue of being in line with how we did it before, with very little deviation, sot hat's a plus.
However, we could consider tearing this down (now or int he future) and just syaing: this code will presume that what it is seeing is a lambda, but will require the =>
to follow. And, if it doesn't have that, it's not a lambda. That would likely make this a ton simpler, and would allow us to not have to do all the contortions around identifying and and considering all the ambiguity cases. In a scannign world that was necessary. In a parsing world it is not (though we might generate more garbage in common cases) :-/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could consider tearing this down ...
Added comment to #53976 for now.
} | ||
} | ||
EOF(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add test for thsi case: RT LambdaName(int arg) => { ... }
. It's not legal, but people will likely try to write this out.
Note that we might want to make this legal at some point int eh future so we can have recursive lambdas. But that's obviously a future issue. The purpose of the test here is to juts validate that we fail and that hopefully we don't go too off the rails. #Resolved
542c0dd
to
9c480b1
Compare
Consider adding a more complicated function pointer scenario (with calling convention and such) In reply to: 856977525 In reply to: 856977525 In reply to: 856977525 Refers to: src/Compilers/CSharp/Test/Syntax/Parsing/LambdaReturnTypeParsingTests.cs:433 in 52dbe07. [](commit_id = 52dbe07, deletion_comment = False) |
break; | ||
} | ||
|
||
if (this.CurrentToken.Kind == SyntaxKind.EndOfFileToken) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this bailout? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously we were tracking whether we saw a ref
, out
, or in
modifier to return true
early without scanning for a =>
, and for this case when the partially complete parameter list with a modifier was at the end of the file. To handle any of these cases now, we'd need the caller to pass in whether there was an explicit return type because F (ref x,
may be either a lambda with return type F
or an invocation of method F
. It didn't seem worth the complexity to keep the early return optimization, and the case here, of a lambda with parameters that have modifiers at the end of the file, seemed unlikely.
if (this.CurrentToken.Kind == SyntaxKind.EqualsGreaterThanToken) | ||
{ | ||
// 1. `static =>` | ||
// 2. `async static =>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing a case for static async
#Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see you're handling it below. Consider mentioning that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've only checked for static
and async static
above so static async
is not applicable here. Will leave as is since the comment and the preceding checks were essentially unchanged.
@@ -11231,14 +11227,14 @@ private ExpressionSyntax ParseCastOrParenExpressionOrLambdaOrTuple(Precedence pr | |||
{ | |||
Debug.Assert(this.CurrentToken.Kind == SyntaxKind.OpenParenToken); | |||
|
|||
if (IsPossibleLambdaExpression(precedence) && this.TryParseLambdaExpression() is { } lambda) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Consider moving this check to the place that calls ParseCastOrParenExpressionOrLambdaOrTuple
. Seems more regular with other cases there. #Closed
Consider adding a test for In reply to: 856993252 In reply to: 856993252 In reply to: 856993252 Refers to: src/Compilers/CSharp/Test/Syntax/Parsing/LambdaReturnTypeParsingTests.cs:2616 in 52dbe07. [](commit_id = 52dbe07, deletion_comment = False) |
@@ -9768,7 +9772,7 @@ private bool IsPossibleExpression(bool allowBinaryExpressions, bool allowAssignm | |||
// expression (whether it is used as an identifier or a keyword). | |||
return this.IsTrueIdentifier() || (this.CurrentToken.ContextualKind == SyntaxKind.FromKeyword); | |||
default: | |||
return (IsPredefinedType(tk) && tk != SyntaxKind.VoidKeyword) | |||
return IsPredefinedType(tk) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: what's a scenario affected by this change? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change allows void
at the start of an expression - for instance in the argument value to [A(void () => { })] class C { }
.
// something lambda-like like: | ||
// | ||
// async a => ... // or | ||
// async (a) => ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what contexts are either of these not a lambda? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what contexts are either of these not a lambda?
It seems the existing comment is saying these are more likely to be lambdas than the example below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with review pass (iteration 3)
// | ||
// Or we could have something that isn't a lambda like: | ||
// | ||
// async (); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is bad code, right? Why would we assume that it isn't a lambda with a missing body for error recovery? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async ();
could be a method call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah. and this is a case where @jaredpar is correct. if we can presume async
is a keyword, this becomes like 10x simpler minimum. it's really this ambiguity which forces the insanity here.
note: a potentially cleaner approach here (since we already ahve the reset point) is to simply assume that async
is the keyword, and just ensrue we get to the =>
. if so, it's a lmbda. if not, we bail out.
case SyntaxKind.IdentifierToken: | ||
case SyntaxKind.StaticKeyword: | ||
case SyntaxKind.RefKeyword: | ||
case SyntaxKind.DelegateKeyword: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true? What if it's followed by a *
? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is just checking if async
is in a location where it should be considered an async
keyword in a lambda expression rather than an identifier. async delegate*...
is a such a case.
} | ||
} | ||
} | ||
} | ||
M(SyntaxKind.SemicolonToken); | ||
} | ||
N(SyntaxKind.LocalDeclarationStatement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I anticipate this is going to cause all sorts of hell for completion. Not sure what to do about it, but it won't be fun.
@@ -6422,7 +6423,8 @@ private ScanTypeFlags ScanType(ParseTypeMode mode, out SyntaxToken lastTokenOfTy | |||
{ | |||
case SyntaxKind.QuestionToken | |||
when lastTokenOfType.Kind != SyntaxKind.QuestionToken && // don't allow `Type??` | |||
lastTokenOfType.Kind != SyntaxKind.AsteriskToken: // don't allow `Type*?` | |||
lastTokenOfType.Kind != SyntaxKind.AsteriskToken && // don't allow `Type*?` | |||
!isFunctionPointer: // don't allow `delegate*<...>?` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behavior of ScanType()
is different today from ParseType()
for this case, making it difficult to recover from cases such as delegate*<void>? () => x
.
Why not? It's unambiguous what it means. It's bad, true, but unambiguous. I think it would make error recovery better.
Is delegate*<...>?
different in that regard from Type??
or Type*?
which we disallow here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is delegate*<...>? different in that regard from Type?? or Type*? which we disallow here?
I think so, yes. In either of those cases, Type
is not guaranteed to definitely be a type. delegate*
is 100% a type and cannot be anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behavior of ScanType() is different today from ParseType() for this case, making it difficult to recover from cases such as delegate*? () => x.
I'm not opposed to making the behavior of ParseType be the same here, parsing into an invalid nullable type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i agree with fred here. i'd far prefer that we parse these out as types (unless htat would be bad because it woudl actually break some level ternary-expression case). And then later on have binding/whatever error and say that a nullable func-ptr is not legal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done review pass (commit 3) In reply to: 857043793 |
This is a duplicate In reply to: 856977697 Refers to: src/Compilers/CSharp/Test/Syntax/Parsing/LambdaReturnTypeParsingTests.cs:2990 in 52dbe07. [](commit_id = 52dbe07, deletion_comment = False) |
return this.ParseCastOrParenExpressionOrLambdaOrTuple(precedence); | ||
if (IsPossibleLambdaExpression(precedence)) | ||
{ | ||
if (this.TryParseLambdaExpression() is { } lambda) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe merge two if's? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge two if's?
Unfortunately, that would mean the lambda
declaration is in scope for the entire switch
which conflicts with an earlier declaration. The options were merge the two if
s and wrap the if
in { }
or nest the if
s. Neither option seemed better than the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i personally like the merging plus surrounding {}
. but i agree it is all the same anyways. so no bggie.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine to leave as-is then. Thanks for the explanation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks (iteration 5)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (commit 5), with a bug to follow up on the function pointer parsing.
{ | ||
Debug.Assert(this.CurrentToken.Kind == SyntaxKind.OpenParenToken); | ||
|
||
var resetPoint = this.GetResetPoint(); | ||
try | ||
{ | ||
if (ScanParenthesizedImplicitlyTypedLambda(precedence)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider having this in hte code, but maybe with a debug.assert that this is not the case and that the caller should have checked for this and parsed out the lambda itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still probably a reasonable idea. but #nonblocking.
{ | ||
return true; | ||
} | ||
|
||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrlate to this PR, but man do i dislike this scanner. the core thing it does which i think is broken is that it's just a loop that accepts certain forms in any order. in other words, you could have type, then attributes, then comma, then attributes, the type, then ref/out, then etc. It makes is nigh impossible to gauge if it's accepting things it shouldn't accept. this is a case where just parsing the real shape and checking it (and following tokesn) for acceptance is much more sensible.
@@ -12420,6 +12423,22 @@ LambdaExpressionSyntax parseLambdaExpressionWorker() | |||
this.IsInAsync = true; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ParseLambdaExpression()
contains several statements including a call to the parseLambdaExpressionWorker()
local function where the actual changes are.
finally | ||
{ | ||
this.Release(ref resetPoint); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, this shoudl be a TryParseLambdaReturnType() helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only checked the parser. I'm feeling pretty confident about it. Most of my feedback was around potential small bits of reorg, a couple of questions on us missing cases, and a few potential thoughts on how to make this simpler in the future.
I'm only 50% confident on the X ? Y() => z : w
case though... but i can't come up with any cases that break this or that it goes the wrong path on... so let's go with this for now.
That said. can we throw a huge corpus against this and see if we generate a new tree in any cases of legal code before? that woudl raise my confidence level.
Based on your suggestion, I've tested parsing with and without this change on dotnet/roslyn and dotnet/runtime, and for the C# files that parsed without errors previously, there were no changes to SyntaxKind or FullSpan in the resulting trees. |
Proposal: lambda-improvements.md
Test plan: #52192
As a perf test, parsed the 1600 C# files beneath src/Compilers/CSharp, with and without this change. Reviewing in perfview, there seems to be no real difference in time spent in
ParseText()
, and the difference in total allocations withinParseText()
was 0.1%.