-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split string formatting to individual nodes #9058
Conversation
e54f469
to
2cfc547
Compare
|
f1e269d
to
6d7690d
Compare
Current dependencies on/for this PR:
This stack of pull requests is managed by Graphite. |
3afecc0
to
966af18
Compare
966af18
to
3e1635f
Compare
Would you mind including a comparision of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the formatting into the corresponding nodes helps to untangle some of the complexity.
It overall looks good, but there are some places where we can reduce the complexity by limiting the option types further.
My main feedback is not specifically related to this PR but to the string parts refactor.
I continue to find it extremelly difficult to keep the different string nodes apart. because we have ExprStringLiteral
and StringLiteral
. I made a naming recommendation on your most recent AST refactor and I believe that implementing it could greatly help with naming (including downstream logic like the formatting code here). Doing this renaming sooner rather than later will ensure that downstream crates use the terminology consistently.
My other feedback related to this refactor is that I find the StringLiteralValue
concept difficult to understand because it seems to be mainly an implementation detail. Reducing that additional concept (from the already complicated string nodes) by making value
and the StringLiteralValue
struct private, and moving all StringLiteralValue
methods to StringLiteralExpression
might help further reduce the complexity of our string nodes (at least the perceived complexity)
I don't expect us to do this refactor as part of this PR but I think it will be worthwile to improve the naming of the string nodes, because I fail to understand the structure even after having looked up their definitions a couple of time.
crates/ruff_python_formatter/src/expression/expr_string_literal.rs
Outdated
Show resolved
Hide resolved
crates/ruff_python_formatter/src/expression/expr_string_literal.rs
Outdated
Show resolved
Hide resolved
Thanks for the detailed review. I've made most of the requested changes:
Open points:
|
Yeah, thanks for providing your perspective on this. With the AST rename PR, I'm able to understand why this might be so. I'll probably do a follow-up next week to merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good to me but I think it would be helpful to avoid using StringOptions
(or remove it entirely) and instead use more specific option types. I'm saying this because I struggled hard to understand which combinations of StringOptions
are possible in the different formatting functions.
I commented on the PR where I think we can narrow the options but you can also use the following diff if you want all changes applied (requires some cleanup because I didn't always rename fields to match the type of the options).
Index: crates/ruff_python_formatter/src/expression/expr_f_string.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/expression/expr_f_string.rs b/crates/ruff_python_formatter/src/expression/expr_f_string.rs
--- a/crates/ruff_python_formatter/src/expression/expr_f_string.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/expression/expr_f_string.rs (date 1702518874243)
@@ -9,20 +9,19 @@
in_parentheses_only_group, NeedsParentheses, OptionalParentheses,
};
use crate::prelude::*;
-use crate::string::{AnyString, FormatStringContinuation, Quoting, StringOptions};
+use crate::string::{AnyString, FormatStringContinuation, Quoting};
#[derive(Default)]
pub struct FormatExprFString;
impl FormatNodeRule<ExprFString> for FormatExprFString {
fn fmt_fields(&self, item: &ExprFString, f: &mut PyFormatter) -> FormatResult<()> {
- let options =
- StringOptions::default().with_quoting(f_string_quoting(item, &f.context().locator()));
+ let quoting = f_string_quoting(item, &f.context().locator());
match item.value.as_slice() {
- [f_string_part] => f_string_part.format().with_options(options).fmt(f),
+ [f_string_part] => f_string_part.format().with_options(quoting).fmt(f),
_ => in_parentheses_only_group(
- &FormatStringContinuation::new(&AnyString::FString(item)).with_options(options),
+ &FormatStringContinuation::new(&AnyString::FString(item)).with_options(quoting),
)
.fmt(f),
}
Index: crates/ruff_python_formatter/src/expression/expr_string_literal.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/expression/expr_string_literal.rs b/crates/ruff_python_formatter/src/expression/expr_string_literal.rs
--- a/crates/ruff_python_formatter/src/expression/expr_string_literal.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/expression/expr_string_literal.rs (date 1702519544384)
@@ -6,18 +6,37 @@
use crate::expression::parentheses::{
in_parentheses_only_group, NeedsParentheses, OptionalParentheses,
};
+use crate::other::string_literal::StringLiteralKind;
use crate::prelude::*;
-use crate::string::{
- AnyString, FormatStringContinuation, StringOptions, StringPrefix, StringQuotes,
-};
+use crate::string::{AnyString, FormatStringContinuation, Quoting, StringPrefix, StringQuotes};
#[derive(Default)]
pub struct FormatExprStringLiteral {
- options: StringOptions,
+ options: ExprStringLiteralKind,
+}
+
+#[derive(Default, Copy, Clone, Debug)]
+pub enum ExprStringLiteralKind {
+ #[default]
+ String,
+ Docstring,
+}
+
+impl ExprStringLiteralKind {
+ const fn string_literal_kind(self) -> StringLiteralKind {
+ match self {
+ ExprStringLiteralKind::String => StringLiteralKind::String,
+ ExprStringLiteralKind::Docstring => StringLiteralKind::Docstring,
+ }
+ }
+
+ const fn is_docstring(self) -> bool {
+ matches!(self, Self::Docstring)
+ }
}
impl FormatRuleWithOptions<ExprStringLiteral, PyFormatContext<'_>> for FormatExprStringLiteral {
- type Options = StringOptions;
+ type Options = ExprStringLiteralKind;
fn with_options(mut self, options: Self::Options) -> Self {
self.options = options;
@@ -30,11 +49,15 @@
let ExprStringLiteral { value, .. } = item;
match value.as_slice() {
- [string_literal] => string_literal.format().with_options(self.options).fmt(f),
- _ => in_parentheses_only_group(
- &FormatStringContinuation::new(&AnyString::String(item)).with_options(self.options),
- )
- .fmt(f),
+ [string_literal] => string_literal
+ .format()
+ .with_options(self.options.string_literal_kind())
+ .fmt(f),
+ _ => {
+ assert!(!self.options.is_docstring());
+ in_parentheses_only_group(&FormatStringContinuation::new(&AnyString::String(item)))
+ .fmt(f)
+ }
}
}
Index: crates/ruff_python_formatter/src/other/f_string_part.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/other/f_string_part.rs b/crates/ruff_python_formatter/src/other/f_string_part.rs
--- a/crates/ruff_python_formatter/src/other/f_string_part.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/other/f_string_part.rs (date 1702519544305)
@@ -2,16 +2,17 @@
use ruff_python_ast::FStringPart;
use crate::other::f_string::FormatFString;
+use crate::other::string_literal::StringLiteralKind;
use crate::prelude::*;
-use crate::string::StringOptions;
+use crate::string::Quoting;
#[derive(Default)]
pub struct FormatFStringPart {
- options: StringOptions,
+ options: Quoting,
}
impl FormatRuleWithOptions<FStringPart, PyFormatContext<'_>> for FormatFStringPart {
- type Options = StringOptions;
+ type Options = Quoting;
fn with_options(mut self, options: Self::Options) -> Self {
self.options = options;
@@ -22,12 +23,11 @@
impl FormatRule<FStringPart, PyFormatContext<'_>> for FormatFStringPart {
fn fmt(&self, item: &FStringPart, f: &mut PyFormatter) -> FormatResult<()> {
match item {
- FStringPart::Literal(string_literal) => {
- string_literal.format().with_options(self.options).fmt(f)
- }
- FStringPart::FString(f_string) => {
- FormatFString::new(f_string, self.options.quoting()).fmt(f)
- }
+ FStringPart::Literal(string_literal) => string_literal
+ .format()
+ .with_options(StringLiteralKind::InFString(self.options))
+ .fmt(f),
+ FStringPart::FString(f_string) => FormatFString::new(f_string, self.options).fmt(f),
}
}
}
Index: crates/ruff_python_formatter/src/other/string_literal.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/other/string_literal.rs b/crates/ruff_python_formatter/src/other/string_literal.rs
--- a/crates/ruff_python_formatter/src/other/string_literal.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/other/string_literal.rs (date 1702519695435)
@@ -3,19 +3,40 @@
use ruff_text_size::Ranged;
use crate::prelude::*;
-use crate::string::{docstring, StringOptions, StringPart};
+use crate::string::{docstring, Quoting, StringPart};
use crate::QuoteStyle;
#[derive(Default)]
pub struct FormatStringLiteral {
- options: StringOptions,
+ layout: StringLiteralKind,
+}
+
+#[derive(Copy, Clone, Debug, Default)]
+pub enum StringLiteralKind {
+ #[default]
+ String,
+ Docstring,
+ InFString(Quoting),
+}
+
+impl StringLiteralKind {
+ pub(crate) const fn is_docstring(self) -> bool {
+ matches!(self, StringLiteralKind::Docstring)
+ }
+
+ fn quoting(self) -> Quoting {
+ match self {
+ StringLiteralKind::String | StringLiteralKind::Docstring => Quoting::CanChange,
+ StringLiteralKind::InFString(quoting) => quoting,
+ }
+ }
}
impl FormatRuleWithOptions<StringLiteral, PyFormatContext<'_>> for FormatStringLiteral {
- type Options = StringOptions;
+ type Options = StringLiteralKind;
fn with_options(mut self, options: Self::Options) -> Self {
- self.options = options;
+ self.layout = options;
self
}
}
@@ -25,24 +46,24 @@
let locator = f.context().locator();
let parent_docstring_quote_style = f.context().docstring();
- let quote_style = if self.options.is_docstring() {
+ let quote_style = match self.layout {
// Per PEP 8 and PEP 257, always prefer double quotes for docstrings
- QuoteStyle::Double
- } else {
- f.options().quote_style()
+ StringLiteralKind::Docstring => QuoteStyle::Double,
+ StringLiteralKind::String | StringLiteralKind::InFString(_) => {
+ f.options().quote_style()
+ }
};
let normalized = StringPart::from_source(item.range(), &locator).normalize(
- self.options.quoting(),
+ self.layout.quoting(),
&locator,
quote_style,
parent_docstring_quote_style,
);
- if self.options.is_docstring() {
- docstring::format(&normalized, f)
- } else {
- normalized.fmt(f)
+ match self.layout {
+ StringLiteralKind::Docstring => docstring::format(&normalized, f),
+ StringLiteralKind::String | StringLiteralKind::InFString(_) => normalized.fmt(f),
}
}
}
Index: crates/ruff_python_formatter/src/statement/suite.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/statement/suite.rs b/crates/ruff_python_formatter/src/statement/suite.rs
--- a/crates/ruff_python_formatter/src/statement/suite.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/statement/suite.rs (date 1702519544227)
@@ -9,9 +9,10 @@
leading_comments, trailing_comments, Comments, LeadingDanglingTrailingComments,
};
use crate::context::{NodeLevel, TopLevelStatementPosition, WithIndentLevel, WithNodeLevel};
+use crate::expression::expr_string_literal::ExprStringLiteralKind;
+use crate::other::string_literal::StringLiteralKind;
use crate::prelude::*;
use crate::statement::stmt_expr::FormatStmtExpr;
-use crate::string::StringOptions;
use crate::verbatim::{
suppressed_node, write_suppressed_statements_starting_with_leading_comment,
write_suppressed_statements_starting_with_trailing_comment,
@@ -609,7 +610,7 @@
leading_comments(node_comments.leading),
string_literal
.format()
- .with_options(StringOptions::docstring()),
+ .with_options(ExprStringLiteralKind::Docstring),
]
)?;
Index: crates/ruff_python_formatter/src/string/mod.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/string/mod.rs b/crates/ruff_python_formatter/src/string/mod.rs
--- a/crates/ruff_python_formatter/src/string/mod.rs (revision Staged)
+++ b/crates/ruff_python_formatter/src/string/mod.rs (date 1702519544492)
@@ -13,6 +13,7 @@
use crate::comments::{leading_comments, trailing_comments};
use crate::expression::parentheses::in_parentheses_only_soft_line_break_or_space;
use crate::other::f_string::FormatFString;
+use crate::other::string_literal::StringLiteralKind;
use crate::prelude::*;
use crate::QuoteStyle;
@@ -47,21 +48,29 @@
}
}
- fn parts(&self) -> Vec<AnyStringPart<'a>> {
+ fn parts(&self, quoting: Quoting) -> Vec<AnyStringPart<'a>> {
match self {
- Self::String(ExprStringLiteral { value, .. }) => {
- value.iter().map(AnyStringPart::String).collect()
- }
+ Self::String(ExprStringLiteral { value, .. }) => value
+ .iter()
+ .map(|part| AnyStringPart::String {
+ literal: part,
+ layout: StringLiteralKind::String,
+ })
+ .collect(),
Self::Bytes(ExprBytesLiteral { value, .. }) => {
value.iter().map(AnyStringPart::Bytes).collect()
}
Self::FString(ExprFString { value, .. }) => value
.iter()
.map(|f_string_part| match f_string_part {
- ast::FStringPart::Literal(string_literal) => {
- AnyStringPart::String(string_literal)
- }
- ast::FStringPart::FString(f_string) => AnyStringPart::FString(f_string),
+ ast::FStringPart::Literal(string_literal) => AnyStringPart::String {
+ literal: string_literal,
+ layout: StringLiteralKind::InFString(quoting),
+ },
+ ast::FStringPart::FString(f_string) => AnyStringPart::FString {
+ string: f_string,
+ quoting,
+ },
})
.collect(),
}
@@ -100,17 +109,23 @@
#[derive(Clone, Debug)]
enum AnyStringPart<'a> {
- String(&'a ast::StringLiteral),
+ String {
+ literal: &'a ast::StringLiteral,
+ layout: StringLiteralKind,
+ },
Bytes(&'a ast::BytesLiteral),
- FString(&'a ast::FString),
+ FString {
+ string: &'a ast::FString,
+ quoting: Quoting,
+ },
}
impl<'a> From<&AnyStringPart<'a>> for AnyNodeRef<'a> {
fn from(value: &AnyStringPart<'a>) -> Self {
match value {
- AnyStringPart::String(part) => AnyNodeRef::StringLiteral(part),
+ AnyStringPart::String { literal, .. } => AnyNodeRef::StringLiteral(literal),
AnyStringPart::Bytes(part) => AnyNodeRef::BytesLiteral(part),
- AnyStringPart::FString(part) => AnyNodeRef::FString(part),
+ AnyStringPart::FString { string, .. } => AnyNodeRef::FString(string),
}
}
}
@@ -118,101 +133,49 @@
impl Ranged for AnyStringPart<'_> {
fn range(&self) -> TextRange {
match self {
- Self::String(part) => part.range(),
+ Self::String { literal, .. } => literal.range(),
Self::Bytes(part) => part.range(),
- Self::FString(part) => part.range(),
+ Self::FString { string, .. } => string.range(),
+ }
+ }
+}
+
+impl Format<PyFormatContext<'_>> for AnyStringPart<'_> {
+ fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
+ match self {
+ AnyStringPart::String { literal, layout } => {
+ literal.format().with_options(*layout).fmt(f)
+ }
+ AnyStringPart::Bytes(bytes_literal) => bytes_literal.format().fmt(f),
+ AnyStringPart::FString { string, quoting } => {
+ FormatFString::new(string, *quoting).fmt(f)
+ }
}
}
}
#[derive(Copy, Clone, Debug, Default)]
-pub(crate) enum Quoting {
+pub enum Quoting {
#[default]
CanChange,
Preserve,
}
-#[derive(Default, Copy, Clone, Debug)]
-pub(crate) enum StringLayout {
- #[default]
- Default,
- DocString,
-}
-
-/// Resolved options for formatting any kind of string. This can be either a string,
-/// bytes or f-string.
-#[derive(Copy, Clone, Debug, Default)]
-pub struct StringOptions {
- quoting: Quoting,
- layout: StringLayout,
-}
-
-impl StringOptions {
- /// Creates a new context with the docstring layout.
- pub(crate) fn docstring() -> Self {
- Self {
- layout: StringLayout::DocString,
- ..Self::default()
- }
- }
-
- /// Returns a new context with the given [`Quoting`] style.
- pub(crate) const fn with_quoting(mut self, quoting: Quoting) -> Self {
- self.quoting = quoting;
- self
- }
-
- /// Returns the [`Quoting`] style to use for the string.
- pub(crate) const fn quoting(self) -> Quoting {
- self.quoting
- }
-
- /// Returns `true` if the string is a docstring.
- pub(crate) const fn is_docstring(self) -> bool {
- matches!(self.layout, StringLayout::DocString)
- }
-}
-
-struct FormatStringPart<'a> {
- part: &'a AnyStringPart<'a>,
- options: StringOptions,
-}
-
-impl<'a> FormatStringPart<'a> {
- fn new(part: &'a AnyStringPart<'a>, options: StringOptions) -> Self {
- Self { part, options }
- }
-}
-
-impl Format<PyFormatContext<'_>> for FormatStringPart<'_> {
- fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
- match self.part {
- AnyStringPart::String(string_literal) => {
- string_literal.format().with_options(self.options).fmt(f)
- }
- AnyStringPart::Bytes(bytes_literal) => bytes_literal.format().fmt(f),
- AnyStringPart::FString(f_string) => {
- FormatFString::new(f_string, self.options.quoting()).fmt(f)
- }
- }
- }
-}
-
pub(crate) struct FormatStringContinuation<'a> {
string: &'a AnyString<'a>,
- options: StringOptions,
+ quoting: Quoting,
}
impl<'a> FormatStringContinuation<'a> {
pub(crate) fn new(string: &'a AnyString<'a>) -> Self {
Self {
string,
- options: StringOptions::default(),
+ quoting: Quoting::default(),
}
}
- pub(crate) fn with_options(mut self, options: StringOptions) -> Self {
- self.options = options;
+ pub(crate) fn with_options(mut self, options: Quoting) -> Self {
+ self.quoting = options;
self
}
}
@@ -223,11 +186,11 @@
let mut joiner = f.join_with(in_parentheses_only_soft_line_break_or_space());
- for part in self.string.parts() {
+ for part in self.string.parts(self.quoting) {
joiner.entry(&format_args![
line_suffix_boundary(),
leading_comments(comments.leading(&part)),
- FormatStringPart::new(&part, self.options),
+ part,
trailing_comments(comments.trailing(&part))
]);
}
I recommend taking a second look where omitting a call to with_options
(and assuming the default) leads to invalid results and consider removing the AsFormat
and IntoFormat
implementations for these types (to force callers to specify all fields when cosntructing the Format*
.
Note: It seems GitHub (reviewed with VS code) lost a few of my comments or even misplaced them. I'll try to identify the missing ones but the patch above should include all places where we can replace
StringOptions
with more specific types (to the point where we can deleteStringOptions
all together)
crates/ruff_python_formatter/src/expression/expr_string_literal.rs
Outdated
Show resolved
Hide resolved
Alright, this is looking good. I've made the suggested changes and some more:
I'll merge this but feel free to add any further comments and I can take it up in a follow-up PR. |
806abae
to
79872e0
Compare
This PR splits the string formatting code in the formatter to be handled by the respective nodes.
Previously, the string formatting was done through a single
FormatString
interface. Now, the nodes themselves are responsible for formatting.The following changes were made:
StringLayout::ImplicitStringConcatenationInBinaryLike
and inline the call toFormatStringContinuation
. After the refactor, the binary like formatting would delegate toFormatString
which would then delegate toFormatStringContinuation
. This removes the intermediary steps.FStringPart
which delegates it to the respective string literal or f-string node.ExprStringLiteralKind
which is eitherString
orDocstring
. If it's a docstring variant, then the string expression would not be implicitly concatenated. This is guaranteed by theDocstringStmt::try_from_expression
constructor.StringLiteralKind
which is either aString
,Docstring
orInImplicitlyConcatenatedFString
. The last variant is for when the string literal is implicitly concatenated with an f-string ("foo" f"bar {x}"
).FormatString
.ExprFString
orFormatStringContinuation
).Formatter ecosystem result
This PR
main