-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: Dafny format file.dfy and IDE extension #2399
Conversation
Source/Dafny/AST/TokenFormatter.dfy
Outdated
module {:extern @" Helpers { | ||
public class HelperString { | ||
public static readonly System.Text.RegularExpressions.Regex NewlineRegex = | ||
new System.Text.RegularExpressions.Regex(new string(new char[] | ||
{'(','?','m',')','(','^',')','\\','s','*'})); | ||
|
||
public static string Reindent(string input, string newIndent) { | ||
return NewlineRegex.Replace(input, newIndent); | ||
} | ||
} | ||
} /*/"} DefaultCode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for the record, this is super clever, but reminds me way too much of SQL injection attacks. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clearer about my opinion: I don't actually want to see any real Dafny code using this hack. It just illustrates how under-specified {:extern}
is and relies on behavior that could very easily break in the future. You're much better off putting target language source code in target language source code files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will surely do, until we have something like:
unsafe "csharp" {
// blablah
}
:-)
5491f43
to
b16651f
Compare
This reverts commit bd0451b.
SpaceTab
Because informally David told me he was fine now.
Test/dafny0/formattingStdin.dfy
Outdated
@@ -0,0 +1,2 @@ | |||
// RUN: %exits-with 2 %baredafny format --stdin 2> "%t" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should have an exit code of 1 as it is a command-line error.
@@ -12,7 +12,7 @@ Failing2: FAILED | |||
RunAllTestsOption.dfy(29,2): expectation violation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid the changes in RunAllTestsOption.dfy.expect and in contract-wrappers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It builds!! Yay. LGTM except a few changes in expect files
This PR makes it possible to "format" one or many Dafny files on the command-line, which for now means only changing the indentation of every single line. This is a step towards #1658
Supported syntax:
dafny format file1.dfy file2.dfy file3.dfy...
Format each file provided in arguments.dafny format --print file1.dfy
Format each file but instead of writing files, output them formatted in stdoutdafny format --check file1.dfy
Gives a summary of what remains to format, with a non-zero error code if not everything is formatted.dafny format .
Look up for all the Dafny files recursively in the current repository and format them.Insights
Coming up with an opinionated formatter wasn't an easy task. We want the guarantee that the reprinted program should be the same as the previous one, modulo whitespace.
Our approach in this PR consists of three phases:
Step 3. has been entirely written in Dafny, and offers the interesting guarantee that the final reprinted program will contain all the "val" of every token reachable from the first token provided in input.
What is the indentation of a "token"?
This PR considers that each token has to be associated with 3 types of indentation.
Because the token printer will traverse the tokens in order, the indentation 1. is used only for the trivia associated to the token in the leading whitespace of that token, like
/* Comment about X */
in the following example, which is in the leading trivia ofconst
Note that in this same example, the indentation of
Comment about D
is possible because of the indentation stored in 3.The indentation of 2. makes is possible to have the token to be a delimiter of two indented regions, like this:
How this PR has been tested?
Rules implemented:
Interesting cases
Assignments
The basic rule is that, after
:=
, there is indentation, except in one case that I found were quite common: Single calls on the RHS don't have extra indentation:However, if there are multiple calls, there is extra indentation so that we know where this finishes
Moreover, if there is a single RHS but it's on a separate line, it has to be indented.
And, if the RHS is a closure, it should still be indented,
as the arrow "=>" is not a delimiter of an end of region
If the first right-hand-side is one one line, all right-hand-side are aligned
However, if the first right-hand-side spans multiple lines, the subsequent ones do not need to be aligned, just indented
Comprehensions
Comprehensions like ``imap` make an extra effort to indent their inner expressions when the keyword is alone on its line.
Otherwise, they keep the surrounding indentation. In any case, they align ":=" and "::" as much as possible and indent their arguments, if they are placed at the beginning of the line.
Aligned
//
The formatter keeps the alignment of consecutive and already aligned comments starting with
//
, like the one above.Adding or removing one space in the indentation of
// Line 2
will make the formatter take the default indentation:Adding spaces to avoid changing blocks
Here
const
needs to be de-indented, so we need to add spaces in front of the/*
Commented cases and datatypes constructors
Normally, comments after a constructor are aligned with this constructor. But above, it's obvious that one comment is not about Child2, but a commented case.
Here we add the rule that comment starting with
//|
in such cases are indented like the other|
around, Same for commentedcase 3
, in the match statement above.Alignment of binary comparison operators
In many places, multi-line equality and comparison align their arguments. The formatter implements this behavior, so even if normally things in parentheses are a single block, there is an extra de-indent for these operators. This works for chaining expressions and calc statements as well.
Else alignment
Here we have two common cases: if-then-else where the "else" and the "then" are on two different lines, and they have contradictory indentation. We resolved this contradiction by deciding that, if the "then" is aligned with the "else" before, the formatter will keep this alignment no matter what. Otherwise, it will indent the "else" next to the "if"
Case with opening parentheses
The formatter assumes that
(
always defines a block in which everything is in (meaning, their column is equal or greater to the column of(
. But in the case above, it was obvious that the parentheses are there only to disambiguate.The formatter will not consider that
(
creates a full block if it's followed by a newline and it's preceeded by "=>"Clear disambiguation
The Dafny formatter helps distinguishing otherwise ambiguous programs, such as this one:
The Dafny formatter will indent it like this:
because the last
if
is parsed as part of the last case. Of course, it's always better to use the{}
for thematch
statement to avoid these ambiguities, but at least the formatter will help.Blocks and hints.
Usually, blocks delimited with
{}
have a prelude, so it makes sense that comments right before a{
are indented one extra level.Except in one case: Hints in calc statements
Expected follow-up PRs:
By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.