-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass ParserOptions
to the parser
#16220
Conversation
|
What's the motivation for changing the function signature for I've some additional thoughts that are related to this:
Curious to hear @MichaReiser thoughts on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @dhruvmanila that I wouldn't make ParserOptions
generic or add ParserOptions
to parse_unchecked_source
. It unnecessarily complicates things internally.
We could consider adding a ParseSourceOptions
struct which is ParserOptions
with the source type prop omitted, but I'm not sure if it's worth exposing the options on parse_unchecked_source
. Instead, advanced use cases should use pare_unchecked
and then map/unwrap
on the call site.
I think it could make sense to expose the ParserOptions
to the Lexer
, or at least a subset of the options but I think this is something we can tackle separately.
I'll leave this one to Micha/Dhruv :-) |
The motivation for changing the signature of And the |
The motivation for this is that `parse_unchecked_source` relies on the fact that it's called with a `PySourceType` to know that it's safe to unwrap `try_into_module`. If we change the signature from ```rust pub fn parse_unchecked_source(source: &str, source_type: PySourceType) -> Parsed<ModModule> { ``` directly to ```rust pub fn parse_unchecked_source(source: &str, options: ParserOptions) -> Parsed<ModModule> { ``` we will lose this safety. Instead, to enforce this in the type system, I want to have to call `parse_unchecked_source` with a `ParserOptions<KnownSource>` and *not* with a `ParserOptions<UnknownSource>`. This is slightly more awkward because I also don't want to propagate the generic on `ParserOptions` to every `Parser` site, so I introduced the separate `AsParserOptions` trait so `Parser` could hold a `Box<dyn AsParserOptions>` instead of needing `SourceType` generics everywhere.
f9977cd
to
f486e76
Compare
I've reverted the generic stuff and put I also changed
I looked into this briefly, but I don't think the lexer currently has access to a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet. I recommend giving @dhruvmanila some time to review the PR too.
let parsed = parse(source, ParserOptions::from_mode(source_type.as_mode())) | ||
.expect("Expect source to be valid Python"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We could implement from_source_type
(or From<SourceType> for ParserOptions
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I was thinking about that too. Should I implement From<Mode> for ParseOptions
too? I guess I was thinking we might have multiple from_*
methods, but I'm realizing now that they won't actually conflict with each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gives me more to put in the separate options.rs
file anyway. I thought it looked a bit sparse before 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a love-hate relationship with From
and I've been very inconsistent in the past with having explicit from_
methods vs From
implementations...
I do like that from_mode
is explicit but it also is a bit verbose... 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know what you mean. I already switched to From
immediately after commenting, but I'm happy to go back to the explicit versions if you and/or Dhruv prefer. I think I tend toward from_
methods usually, so I can really go either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defer to you and @dhruvmanila
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think From
is fine. In the future, when there are more options and if it turns out a bit difficult to read, we could switch to from_
methods. Another option would be to use a builder pattern with the default impl so that it reads as ParseOptions::default().with_mode(mode).with_python_version(version)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I'd recommend updating the PR description to reflect the current state of the PR for future readers. We can keep the "Details" content as a collapsed section for historical purposes.
I forgot to update the |
## Summary This PR builds on the changes in #16220 to pass a target Python version to the parser. It also adds the `Parser::unsupported_syntax_errors` field, which collects version-related syntax errors while parsing. These syntax errors are then turned into `Message`s in ruff (in preview mode). This PR only detects one syntax error (`match` statement before Python 3.10), but it has been pretty quick to extend to several other simple errors (see #16308 for example). ## Test Plan The current tests are CLI tests in the linter crate, but these could be supplemented with inline parser tests after #16357. I also tested the display of these syntax errors in VS Code: data:image/s3,"s3://crabby-images/c2531/c253144488ab262d5070474f99cd628b88beaa69" alt="image" data:image/s3,"s3://crabby-images/c1da1/c1da1d1455bc19e559ac98c1a66fa5e0e6af4dbb" alt="image" --------- Co-authored-by: Alex Waygood <[email protected]>
Summary
This is part of the preparation for detecting syntax errors in the parser from #16090. As suggested in this comment, I started working on a
ParseOptions
struct that could be stored in the parser. For this initial refactor, I only made it hold the existingMode
option, but for syntax errors, we will also need it to have aPythonVersion
. For that use case, I'm picturing something like aParseOptions::with_python_version
method, so you can extend the current calls to something likeBut I thought it was worth adding
ParseOptions
alone without changing any other behavior first.Most of the diff is just updating call sites taking
Mode
to takeParseOptions::from(Mode)
or those takingPySourceType
s to takeParseOptions::from(PySourceType)
. The interesting changes are in the newparser/options.rs
file and smaller parts ofparser/mod.rs
andruff_python_parser/src/lib.rs
.Outdated implementation details for future reference
NOTE: as the
details
summary says, this does not correspond to the current implementation, which preserves the original signature ofparse_unchecked_source
and uses a much simpler representation ofParseOptions
. This is preserved just for historical purposes.The
ParserOptions
implementation is complicated a bit by wanting to preserve theSAFETY
comment inparse_unchecked_source
:ruff/crates/ruff_python_parser/src/lib.rs
Lines 289 to 295 in b5cd4f2
The only way I could see to enforce this while passing a
ParserOptions
instead of aPySourceType
was by adding a generic type state toParserOptions
, so thatparse_unchecked_source
would take aParserOptions<KnownSource>
constructed byParserOptions::from_source_type
, as opposed to theParserOptions<UnknownSource>
produced byParserOptions::from_mode
.On top of that, I didn't want to propagate the generics from
ParserOptions
to all uses ofParser
, so there's another indirection through theAsParserOptions
trait, which is used byParser
to hold aBox<dyn AsParserOptions>
instead of aParserOptions<S: SourceType>
.Finally, other public functions in the parser crate could be updated to take
ParserOptions
if we want, butparse_unchecked_source
is the variant that ruff and red-knot call where I want to integrate the syntax error checks, so it was my main priority.Test Plan
Existing tests, this should not change any behavior.