Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update RUF001, RUF003 to check in f-strings #7477

Merged
merged 2 commits into from
Sep 19, 2023

Conversation

dhruvmanila
Copy link
Member

@dhruvmanila dhruvmanila commented Sep 18, 2023

Summary

This PR updates the rule RUF001 and RUF003 to check in f-strings using the
FStringMiddle token which contains the non-expression part of a f-string.

For reference,

Code Name Message
RUF001 ambiguous-unicode-character-string String contains ambiguous {}. Did you mean {}?
RUF003 ambiguous-unicode-character-comment Comment contains ambiguous {}. Did you mean {}?

Test Plan

cargo test

@dhruvmanila dhruvmanila added rule Implementing or modifying a lint rule python312 Related to Python 3.12 labels Sep 18, 2023
@dhruvmanila dhruvmanila changed the title Update RUF001-003 to check in f-strings Update RUF001, RUF003 to check in f-strings Sep 18, 2023
@dhruvmanila dhruvmanila linked an issue Sep 18, 2023 that may be closed by this pull request
33 | # Same test cases as above but using f-strings instead:
|

confusables.py:34:7: RUF001 String contains ambiguous `𝐁` (MATHEMATICAL BOLD CAPITAL B). Did you mean `B` (LATIN CAPITAL LETTER B)?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a new context to make this message say "F-string contains ..." instead of "String contains ..."?

ruff::rules::ambiguous_unicode_character(
&mut diagnostics,
locator,
range,
if tok.is_string() {
if tok.is_string() || tok.is_f_string_middle() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can fstrings be docstrings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Reference: https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals

Formatted string literals cannot be used as docstrings, even if they do not include expressions.

def foo():
    f"Not a docstring"

foo.__doc__ is None
# True

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would then rewrite it for better clarity

if tok.is_string() && is_docstring {
	Context::Docstring
} else if tok.is_comment() {
	Context::Comment
} else {
	Context::String
}

or even using a match

let context = match tok {
	Tok::String => if is_docstring { Context::Docstring } else { Context::String },
	Tok::FStringMiddle => Context::String,
	Tok::Comment => Context::Comment,
	_ => unreachable!()
};

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -45,12 +45,15 @@ pub(crate) fn check_tokens(
let mut state_machine = StateMachine::default();
for &(ref tok, range) in tokens.iter().flatten() {
let is_docstring = state_machine.consume(tok);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this "state machine" need any updating to correctly handle fstrings?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, as f-strings cannot be used as docstrings. The state machine is mainly used for docstring detection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. But does it need any updating to not get confused by the new tokens emitted by the lexer?

Copy link
Member Author

@dhruvmanila dhruvmanila Sep 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. No, it doesn't. The reason being that the detection mechanism resets the state if it finds any token other than the string token after a function / class / module.

@dhruvmanila dhruvmanila merged commit 50d1553 into dhruv/pep-701 Sep 19, 2023
@dhruvmanila dhruvmanila deleted the dhruv/issue-7294 branch September 19, 2023 06:29
dhruvmanila added a commit that referenced this pull request Sep 20, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 21, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 22, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 22, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 22, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 26, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 27, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 28, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 29, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
dhruvmanila added a commit that referenced this pull request Sep 29, 2023
## Summary

This PR updates the rule `RUF001` and `RUF003` to check in f-strings using the
`FStringMiddle` token which contains the non-expression part of a f-string.

For reference,
| Code | Name | Message|
| --- | --- | --- |
| RUF001 | ambiguous-unicode-character-string | String contains ambiguous {}. Did you mean {}? |
| RUF003 | ambiguous-unicode-character-comment | Comment contains ambiguous {}. Did you mean {}? |

## Test Plan

`cargo test`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python312 Related to Python 3.12 rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update RUF001-003 to check in f-strings
2 participants