Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format logic and parenthesized patterns #1380

Merged
merged 6 commits into from
Feb 12, 2024
Merged

Format logic and parenthesized patterns #1380

merged 6 commits into from
Feb 12, 2024

Conversation

munificent
Copy link
Member

Trying to get infix patterns indenting correctly has proved surprisingly hard. This PR does it, but it does so by first doing some significant refactoring to CodeWriter, and then adding a new indentation mechanism that is only used by if-case. I'm not sure if it's the best approach.

The problem

The way if-case statements are styled in the new back end is a little unusual. Here are some examples:

if (object
    case first ||
        second ||
        third ||
        fourth) {
  ;
}

if (object
    case veryLongPrefix
        .longIdentifierName) {
  ;
}

if (obj
    case SomeLongTypeName
        longVariableName) {
  ;
}

if (obj
    case var // c
        x) {
  ;
}

In all four examples, the line after the case line is indented +4 relative to the previous line. The tricky part is: Where does that indentation come from?

Infix operators indent their subsequent operands, so in the first example, the natural answer would be that we let the infix pattern handle it and if-case doesn't need to add any indentation. Likewise the second example.

But in the third example, variable declarations don't increase indentation when they split before the type and name. Likewise in the fourth example. So in those cases, the if-case itself would be responsible for increasing the indentation.

If we make the if-case Piece always increase the indentation, the last two examples look right, but then you get double indentation on the first two:

if (object
    case first ||
            second ||
            third ||
            fourth) {
  ;
}

if (object
    case veryLongPrefix
            .longIdentifierName) {
  ;
}

If we make the if-case Piece never increase the indentation, then the last two examples are wrong:

if (obj
    case SomeLongTypeName
    longVariableName) {
  ;
}

if (obj
    case var // c
    x) {
  ;
}

We could make the if-case Piece look at the AST node of the pattern and determine if that node will increase the indentation. If not, then the if-case Piece will do it, otherwise it lets the pattern handle it. But given that constant patterns can contain any expression, that would mean looking through the whole expression grammar to try to figure out which expression types will indent when they split and which don't.

That seems extremely brittle to me.

Solution

The solution I implemented here is to add support for a kind of "collapsible" indentation. What we really want for if-case is to always indent +4 and only +4. If both if-case and the pattern try to indent, the two should be merged together, sort of like margin collapse in CSS.

I have that working and it does neatly fix the problem.

But it feels pretty strange to me for CodeWriter to have this funny little indentation mechanism that's only used here. It's possible we could use it a couple of other places that have similar styling. In particular, the old formatter has some logic to avoid redundant indentation for certain kinds of expression function bodies, like:

function() =>
    operand +
    another +
    third;

Note how another isn't indented further than operand like it usually would be. We might be able to use collapsible indentation here too.

So this could be an overall nice approach. But the mechanism does feel a little squishy to me and I don't love it. At the same time, I don't have any brilliant ideas for a better approach yet.

The PR

  • In order to implement collapsible indentation, I first refactored the CodeWriter's indent API to be more explicitly stack-based. That wasn't strictly necessary, but Kallen has asked for it to be more push-pop based already and it seemed like the right time to make the change.

    That's the first commit. The resulting API feels a little more explicit, which is good. But I did find it annoying sometimes to correctly keep track of when I need to pop a state, and had a couple of bugs where I forgot to pop or popped too many.

  • Likewise, I made a similar change for the "allow newline" API. That's the second commit.

  • The third commit just reorganizes ChainPiece.format() to be a little more redundant but I think easier to read.

  • The fourth commit adds support for logic patterns and parenthesized patterns. It also adds some additional tests for constant patterns and variable patterns. Note that adding the new patterns is trivial: the code is just like infix and parenthesized expressions. The only work needed to get the indentation correct is the one line change in AssignPiece (which is used for case <pattern> in if-case) to let its indentation collapse.

Overall, I'm a little on the fence with these changes. But they do work, and we need to do something to get the rest of patterns going.

While the Solver is working through Solutions, it keeps track of the
best Solution it's found so far. This will get returned if there is no
Solution that fits within the page width: when overflow is inevitable,
the formatter still tries to do the best it can.

The Solver is careful to not consider an invalid solution (one that
contains a newline where none is allowed) to be the best solution.
Except in one case: If the *initial* solution is invalid, the Solver
will still store it in best. Then, as long as no other solution has
less overflow than the initial invalid one, it ends up returning an
invalid solution.

This fixes that: any valid solution, regardless of overflow, will kick
out that initial invalid solution if there is one.
Prior to this change, CodeWriter maintained an implicit stack where
each Piece being formatted has its own indentation level which it can
mutate. When the Piece is done, whatever amount of relative indentation
it set is discarded.

This commit replaces that with an explicit stack that Pieces manipulate
by pushing and popping indentation levels onto it. Pieces that don't
care about indentation don't affect the stack at all. Pieces that need
multiple levels of indentation (like constructor initializers) can push
and pop multiple times.
As with the previous commit, instead of each Piece implicitly having
it's own context where this is stored, require pieces to explicitly
push and pop regions where newlines are allowed or disallowed.

With that change, there's almost nothing left in the _Options class, so
eliminate that too.
I think it's clearer to handle each state separately.
In order to get infix patterns (and split chain constant patterns, and
others) indenting correctly in if-case statements, I had to introduce
a notion of "collapsible" indent.

In this PR, it's only used for the indentation after if-case, but it
might be useful elsewhere (for example, with `=>` function bodies).
Copy link
Member

@kallentu kallentu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for helping me out on this!

@munificent munificent merged commit 25dc786 into main Feb 12, 2024
7 checks passed
@munificent munificent deleted the if-case branch February 12, 2024 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants