-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rust & Typescript Cursors for the CST #540
Rust & Typescript Cursors for the CST #540
Conversation
🦋 Changeset detectedLatest commit: 22128d5 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Cursor::new(self.clone()) | ||
} | ||
|
||
pub fn is_token_kind(&self, kinds: &[TokenKind]) -> Option<&Rc<TokenNode>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For is_token_kind()
and is_rule_kind()
, I wonder why do they take multiple kinds, rather than only one? is it an optimization for NAPI? does that mean that we won't do that if it was a native TS API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is for alignment with the Cursor API, where you can search for more than one at a time without having to do multiple searches over multiple Cursors. In TS it will be is_token_kind(TokenKind, ...)
.
Note that it also returns the matched TokenNode, meaning no match
is required, and it functions therefore as a typeguard.
Note also the following is_token_matching
which is also for alignment with the cursor API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is_XX()
usually indicates returning booleans, a cheap check that can be chained in a boolean condition with other conditions. Thoughts about something like as_token()
? which indicates a "cast" that can succeed/fail (thus returning an option).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, done
self.leaf.is_none() | ||
} | ||
|
||
pub fn node_or_none(&self) -> Option<Node> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regarding node()
and node_or_none()
, I might be missing something, but I would have expected the rust idiomatic way is to return Option<Node>
only/directly for things like this, and let the user expect()
/unwrap()
if needed? why the extra API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because there are many cases where the client know that node isn't none e.g. after a successful go_to_*
, and I wanted to not require .unwrap()
there, and instead catch it in one place. Happy to change the method names if you have a suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is idiomatic to let the user make that decision, and to make it explicit at the call site (a strong direction in rust). I think similar to how a vector/map/set would offer get() -> Option<T>
without (the often convenient) get_and_unwrap() -> T
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trait Index
provides []
for this, and it is implemented on Vec
, HashMap
, and BTreeMap
for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in any case, solved with an explicit completed state. Also allows the cursor to be reset
.text_range() | ||
} | ||
|
||
pub fn path_rule_nodes(&self) -> Vec<Rc<RuleNode>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understand the use case for path_rule_nodes()
. I see it is currently unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that you can access the parent RuleNode
s when using a cursor to navigate the tree.
} | ||
|
||
#[napi(getter, ts_return_type = "RuleNode | TokenNode | null")] | ||
pub fn node(&self, env: Env) -> Option<JsObject> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm missing something. This would convert it to a plain JS object, however, us typing it with RuleNode | TokenNode
means that it will have the usual APIs on it, like the ones that need round tripping to Rust. How does that work?
this is related to other to_js()
uses as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case we have to return JsObject
because the result is polymorphic. And fundamentally, JsObject
is what is passed even if the Rust type were explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So on the receiver side, working in NodeJS land, they would get a plain JS object.
It wouldn't have the native Rust APIs? not sure how would calling its methods would round trip.
... I think I will wait for this to be ready and play with it for a bit to understand more.
fn rule_exit( | ||
&mut self, | ||
node: &Rc<RuleNode>, | ||
cursor: &Cursor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TOL, I wonder if we should be exposing the cursor to the visitor API, even if it cannot be used/mutated by the caller? what is the use case here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For two reasons
- You can start a
Visitor
at anyCursor
position. It is nice in theVisitor
to be able to get to theCursor
and do some more internal iteration. Apart from being convenient, it makes internal and external iterators nicely symmetric. - Regardless of the symmetry, it is convenient in the visitor to be able to access things like the
text_range
from theCursor
, and also thepath_rule_nodes
. As theCursor
API is extended, theVisitor
implementations can use those facilities.
}, | ||
}; | ||
|
||
#[test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest splitting into three tests for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but because the 'test' nature of this is not actually for testing, but just to validate that the example is ok, I decided that the clarity I could gain by simply repeating a section of code with exactly the same result but using different API patterns was worth having the repeated functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is also for "testing", since the Rust test infra is the same in all projects, this is a quick way for someone to just copy a test locally, and edit/run it for experimenting.
It also indicates to readers that these are separate, instead of having side effects on later statements.
crates/solidity/outputs/cargo/tests/src/doc_examples/cursor_api.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few suggestions/concerns.
return Ok(VisitorEntryResponse::StepIn); | ||
if node.kind == RuleKind::ContractDefinition { | ||
if let Node::Token(token) = &node.children[2] { | ||
assert_eq!(token.kind, TokenKind::Identifier); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ensure!()
to be consistent with bail!()
instead of panicking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
) -> Result<VisitorEntryResponse> { | ||
if node.kind != RuleKind::ContractDefinition { | ||
return Ok(VisitorEntryResponse::StepIn); | ||
if node.kind == RuleKind::ContractDefinition { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I suggest quitting early when possible (like the existing visitor) for readability, and to decrease nesting. But not a blocker at all.
if !interesting {
return;
}
do_something();
rather than:
if interesting {
do_something();
}
return;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a functional programming fan, I prefer to have a single control flow, and therefore not use guards as often.
goToNthChild(childNumber: number): boolean; | ||
goToNextSibling(): boolean; | ||
goToPreviousSibling(): boolean; | ||
findRuleKind(kinds: Array<RuleKind>): RuleNode | null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TOL: thoughts about unifying findXXXX()
and findXXXXWithoutRecursing()
to a default argument findXXXX(recursive = true)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't want to add a recurse
param to find*
if the use cases were not equaly distributed, which I suspect they wouldn't be.
But in any case I've removed the *WithoutRecursing
forms because they cannot work - the movement to the next has to happen between calls to those methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've a few open questions, but will wait until the docs are ready to have a better understanding of the proposed API before continuing.
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and publish to npm yourself or [setup this action to publish automatically](https://github.com/changesets/action#with-publishing). If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated. # Releases ## @nomicfoundation/[email protected] ### Minor Changes - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add a Rust Cursor API and refactor the Rust Visitor API to run on top of it. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Move Visitor et al to node:: namespace, which is where Cursor is. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Rename `range` functions that return a TextRange to `text_range` ### Patch Changes - [#543](#543) [`7a34599`](7a34599) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Move `syntax::parser::ProductionKind` to `syntax::nodes` namespace. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add TokenNode.text to the TS API. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add first pass of Typescript binding to the Cursor API, but no TS Visitor yet. - [#545](#545) [`e73658a`](e73658a) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - render EBNF grammar on top of each `ProductionKind`, `RuleKind`, and `TokenKind`. - [#558](#558) [`95bbc50`](95bbc50) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Correct versioning for `SourceUnitMember` and `ContractMember` children. ## [email protected] ### Minor Changes - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add a Rust Cursor API and refactor the Rust Visitor API to run on top of it. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Move Visitor et al to node:: namespace, which is where Cursor is. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Rename `range` functions that return a TextRange to `text_range` ### Patch Changes - [#543](#543) [`7a34599`](7a34599) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Move `syntax::parser::ProductionKind` to `syntax::nodes` namespace. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add TokenNode.text to the TS API. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add first pass of Typescript binding to the Cursor API, but no TS Visitor yet. - [#545](#545) [`e73658a`](e73658a) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - render EBNF grammar on top of each `ProductionKind`, `RuleKind`, and `TokenKind`. - [#558](#558) [`95bbc50`](95bbc50) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Correct versioning for `SourceUnitMember` and `ContractMember` children. Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Does not include TS Visitor - waiting for Omar's TS/Rust codegen unification PR to land before completing this task.