-
Notifications
You must be signed in to change notification settings - Fork 566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an AST visitor #601
Add an AST visitor #601
Conversation
This adds an optional "derive-visitor" feature which lets the user visit and mutate the AST easily
This is cool! I hope this feature will merge soon, exactly what i need for my case. I've been trying to implement a similar thing but as a separate crate. And my use case is writing a restricted ANSI-like SQL query builder which will be used to build queries of a specified dialect (not only ANSI and with support for query parameters) after some post processing done with help of this feature. |
I think this is a great feature overall, but I have some questions about the approach:
Also, @tokarevart, you mentioned that you were trying to implement something like this as a separate crate. Curious, are there any challenges to it? |
@iskakaushik in case of my solution there were not many challenges, i defined a big enum with a variants for each I like the idea and implementation proposed by @lovasoa using About your questions i think i can answer them, @lovasoa can correct me if i'm wrong somewhere:
|
Yes,I would have answered the same as @tokarevart :)
let sql = "select a,b from t1";
// Table t1 has changed and its fields are now split between tables t1 and t2
let mut ast = Parser::parse_sql(&GenericDialect, sql).unwrap();
ast[0].drive_mut(&mut visitor_enter_fn_mut(
|table: &mut sqlparser::ast::TableWithJoins| {
let has_t1 = once(&table.relation)
.chain(table.joins.iter().map(|j| &j.relation))
.any(|r| {
matches!(r, TableFactor::Table {name: ObjectName(idents), ..}
if idents[0].value == "t1")
});
if has_t1 {
table.joins.push(Join {
relation: Table {
name: ObjectName(vec![Ident::from("t2")]),
alias: None,
args: None,
with_hints: vec![],
},
join_operator: JoinOperator::Inner(JoinConstraint::Using(vec![Ident::from(
"t_id",
)])),
});
}
},
));
assert_eq!(ast[0].to_string(), "SELECT a, b FROM t1 JOIN t2 USING(t_id)"); |
Hello :) Any news ? When do you think this can be merged ? |
Hey, can someone please review this, and merge this if possible ? |
I will review this today or tomorrow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @lovasoa -- this looks like a very cool feature and one I have wanted in my own projects
I am concerned about the new dependency of derive-vistor
(which has only been published for 14 days and is not particularly widely used yet). Are there any other options you considered ? What do you think about possibly moving the parts of derive-visitor
you want to use into sqlparser
itself (aka a little copying rather than dependency)?
I am also a bit concerned a little about the relatively few tests (as in I suspect I could remove most of the #[cfg_attr(feature = "derive-visitor", ...)]
lines and all the tests would still pass
The other thing that I noticed while reviewing this PR is that this feature may not be very discoverable (https://docs.rs/sqlparser/latest/sqlparser/ wouldn't show it) -- perhaps as a follow on we can add an example showing it.
I could indeed have done better for tests and documentation, you are right. I'll improve these. As for vendoring |
What I am thinking about is that the I think one of the great strengths of sqlparser is its relative simplicity -- it has few dependencies, the code is relatively straightforward and east to contribute to, and is fast to compile. The proc-macro implementation in this PR is elegant and avoids boiler plate code, but I worry about the additional complexity (and now there is that much more code to maintain in sqlparser, higher barrier to new contributors, and a new source of potential issues (if something goes wrong we have to debug it in other crates) What would you think about making a new crate that provided rewriting functionality (rather than in the sqlparser-rs crate itself)? I would be willing to help maintain this as we may end up using such functionality in my own project. Perhaps One example of how to implement a rewriter without changing sqlparser-rs is |
# Conflicts: # src/ast/value.rs
Fixes mistakenly removed annotations
Pull Request Test Coverage Report for Build 3453111794
💛 - Coveralls |
Hey @alamb ! Thanks for your comments. I added more tests, and documentation to make the feature more discoverable. Do you really think the added complexity to this repo is a lot to handle ? It is just annotations added to structs and attributes, and associated tests and documentation. On the other hand, I think the benefit is huge, because looking at code on github, it looks like many user of this crate end up wanting to visit the AST, and currently everyone hand-rolls their own code to do that, and ends up with verbose and fragile code, that often needs to be updated when the sqlparser AST changes. Manually implementing a visitor that is specific to sqlparser in a different crate would work, but this would be a very large amount of tedious and uninteresting work, both to write initially, and to maintain in the long term. It would be very easy for it to fall out of sync, and there would be no compiler guarantee that all fields are indeed visited. I do agree that adding a dependency on an external crate that is not widely used is not ideal for a crate such as sqlparser. But maybe we can mitigate the risks by asking some sqlparser maintainers to be added as maintainers of derive-visitor? I for myself would be happy to help maintaining derive-visitor. This way fixing potential issues in the visitor code would be a contribution to the entire rust community, and help a larger set of rust developers. What would you think, @nikis05 ? |
@lovasoa Sure, sounds good, I will be happy to add more maintainers. |
The short answer is yes. The longer answer is that every little feature addition / dependency addition in itself is typically not that complex. However, over time the libraries end up accumulating a large number of them. Also, sadly, I do not have the time to review the derive visitor crate's code carefully at this time
Yes I agree this is a commonly desired pattern.
I think it would be possible to write a code generator to generate visitors / rewriters rather than requiring a bunch of annotations to the
Thank you for the offer -- I unfortunately don't have time to maintain another crate. |
Perhaps @Dandandan or @andygrove have some opinions they could share? |
Just throwing in a possible option. There is a feature in Updating of said crate to match |
@nikis05 that idea sounds intriguing. |
@alamb but this option has its own problems / downsides,
This could be way too much effort compared to just adding In any case if you decide to somehow support |
In my mind this is the key tradeoff -- basically who will maintain the code and how much effort they are willing to expend to do so. As mentioned, I don't have a lot of time to spend maintaining sqlparser as it is, which is why I have this extreme aversion to making it more complicated, even though I understand the arguments and benefits that the total effort would be lower if the code was in the sqlparser crate. The key difference is who would be doing the effort |
Thank you for the offer. I have started a discussion on #643 where hopefully we can figure out a path forward |
After more thought, I am not willing to take on or subject users of this crate to the additional potential maintenance of a new dependency to sql-parser that is not used anywhere else. I am sure it is awesome but I sadly don't have the time to help debug or polish it Here is what I am looking at: https://crates.io/crates/derive-visitor/reverse_dependencies For this particular usecase, if/when I ever need it, I would likely create a new crate that automatically generates code for a visitor (rather than procedural macros in this crate) |
If other maintainers feel differently and are willing to approve and help maintain this feature, I am not opposed. I just don't have the time to devote now |
Overall, I like the I don't believe it is an issue for the One possible solution is for the enum Recursion {
Continue,
Stop
} An example of this approach can be observed in DataFusion, where the "pre_" functions return a status: I think it would be important to address this, as it will break consumers if such a change were introduced later. |
@stuartcarnie there is an issue currently open in I added some design questions in the issue discussion that I need help answering before implementing this. |
@alamb , I do understand that. However, I still feel like it's a shame to leave this crate in a "semi-abandoned" state when there are people who are actually ready to do the work ! We are not talking about some obscure feature here. We are talking about something that almost all users would benefit from, and that is implemented in a few hundred lines of code in an actively maintained external crate, the maintainer of which is willing to collaborate with sqlparser. |
I do agree faillible visitors are great to have in general, but in the context of sqlparser, I'm not sure handling infinitely large sql statements is a required feature. The parser does not and probably never should create recursive data structures ! |
If you are willing to do the work, I recommend taking an alternate strategy and either forking the project, adding the feature there, and then keeping up, or take the alternate path and figure out how to auto generate visitor code without having to modify the crate itself. Sorry I can't help more |
Closing in favor of #765 I'll make another PR to add a mutable visitor to the existing system |
This adds an optional "derive-visitor" feature which lets the user visit and mutate the AST easily
Related to #78, #114, and #189
This is not ready to be merged, as this currently depends on nikis05/derive-visitor#9, but it already works using my fork of the repoThe upstream PR has been merged, this is ready for review :)
This lets you write code like