Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SynExpr.Sequential doesn't store presence or range of semicolon ; #16914

Closed
brianrourkeboll opened this issue Mar 20, 2024 · 3 comments · Fixed by #16981
Closed

SynExpr.Sequential doesn't store presence or range of semicolon ; #16914

brianrourkeboll opened this issue Mar 20, 2024 · 3 comments · Fixed by #16981
Labels
Area-Compiler-Syntax lexfilter, indentation and parsing Feature Improvement
Milestone

Comments

@brianrourkeboll
Copy link
Contributor

brianrourkeboll commented Mar 20, 2024

TL;DR

  • We should consider incorporating the range of the semicolon, if present, into SynExpr.Sequential's range.
  • We should consider adding syntax trivia to SynExpr.Sequential to hold the presence (and range) or absence of a semicolon.

Long, rambling description of why it would be useful

(I have a vague recollection that this has come up before. If it has, let me know.)

SynExpr.Sequential doesn't store whether a semicolon ; is present, nor, if one is, does it include the range of the semicolon in the range of the SynExpr.Sequential (even if the semicolon is on a different line!):

/// F# syntax: expr; expr
///
/// isTrueSeq: false indicates "let v = a in b; v"
| Sequential of
debugPoint: DebugPointAtSequential *
isTrueSeq: bool *
expr1: SynExpr *
expr2: SynExpr *
range: range

Parsing, however, may depend on the presence or absence of a semicolon.

In this example (see the AST), all of the expressions whose start column is leftward of that set by the first expression (i.e., column 4, the start of 1) would be considered offsides if not for the semicolons:

[
    1;
   (if true then 3 else 99) (* 👉;;;;;👈 *) ;
    100;
 (int) "lol 

          🤠
        👉;👈
         👢👢
        
        nope
".[0]
            
            ;
    4;
   (if true then 3 else 99) (* 👉;;;;;👈 *)
   ;6;
 10
   ;10
   ;10
]

The parentheses analyzer is running into this problem because such expressions can only exist when there are semicolons, but those expressions nonetheless may become offsides when the parentheses are removed.

In the example above, the parentheses analyzer will currently indicate that the parentheses are unnecessary around both the first and second if-then-else expressions. However, if we remove the parentheses in place, the first if (with the semicolon on the same line) suddenly becomes offsides, while the second if (with the semicolon on the next line) does not.

If the presence and range of semicolons were kept in the AST, it wouldn't be that hard to err on the side of caution and just say parens are required if the parent sequential expression has a semicolon.

My next thought was to look for semicolons in the source text, but, as the example above shows, that means essentially reinventing half a parser in order to deal with comments, strings, etc.

Alternatively, I could search all of a given sequential expression's ancestors and descendants and check whether any of them had a start column greater than the parenthesized expression's start column, but that would make the operation ~quadratic. E.g., if we wanted to know whether parens were required around each parenthesized expression in the following example (AST), we'd need to traverse all ancestral sequential expressions for each one to check whether any of them had a start column farther right:

[
    Unchecked.defaultof<_>; // If this line isn't here, we can remove parens below... But if it is, we can't.
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
   (while true do ());
]

Here's a real example of this phenomenon from this repo:

kindNested (* Table 41 *);
(if usingWhidbeyBeta1TableSchemeForGenericParam then kindGenericParam_v1_1 else kindGenericParam_v2_0); (* Table 42 *)
kindMethodSpec (* Table 43 *);

The parentheses analyzer wants to turn that into:

  kindNested               (* Table 41 *); 
 if usingWhidbeyBeta1TableSchemeForGenericParam then kindGenericParam_v1_1 else  kindGenericParam_v2_0);        (* Table 42 *) 
  kindMethodSpec         (* Table 43 *); 

but that won't parse. If we did remove parens, we'd need to insert a space before, i.e.:

  kindNested               (* Table 41 *); 
  if usingWhidbeyBeta1TableSchemeForGenericParam then kindGenericParam_v1_1 else  kindGenericParam_v2_0);        (* Table 42 *) 
  kindMethodSpec         (* Table 43 *); 
@edgarfgp
Copy link
Contributor

edgarfgp commented Apr 2, 2024

I found the other day the following code also generates a SynExpr.Sequential ast

new MyObject(A= 4;  B= 5)

new MyObject
    (A= 4;
    B= 5)

@brianrourkeboll
Copy link
Contributor Author

@edgarfgp Is that syntax (with semicolons instead of commas) ever valid?

@edgarfgp
Copy link
Contributor

edgarfgp commented Apr 2, 2024

@edgarfgp Is that syntax (with semicolons instead of commas) ever valid?

Not valid, but it is parsed and generates AST 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compiler-Syntax lexfilter, indentation and parsing Feature Improvement
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants