Improve parsing speed by avoiding some clones in parse_identifier #1624

Merged Dec 29, 2024 (2 commits)

@alamb alamb commented Dec 28, 2024

I made a flamegraph of sqlparser_bench and found a few unnecessary clones:

[Image: flamegraph]

This PR avoids a clone (string copy) in parse_identifier.

I don't think this will make things much faster (yet), but it makes clear where the cloning is happening.
I hope to optimize the speed of parsing identifiers further in a follow-on PR.

@@ -970,15 +970,15 @@ impl<'a> Parser<'a> {
t @ (Token::Word(_) | Token::SingleQuotedString(_)) => {
if self.peek_token().token == Token::Period {
let mut id_parts: Vec<Ident> = vec![match t {
- Token::Word(w) => w.to_ident(next_token.span),
+ Token::Word(w) => w.into_ident(next_token.span),

w.to_ident cloned (copied) the string. into_ident simply reuses the string.
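
To illustrate the difference described in this comment, here is a minimal sketch. The `Word` and `Ident` structs below are simplified stand-ins (an assumption for illustration; the real sqlparser types carry more fields, such as the span passed in the diff), and `into_ident_reuses_buffer` is a hypothetical helper that exists only to make the move observable.

```rust
// Simplified stand-ins for the parser's types (assumption: the real
// `Word` and `Ident` have additional fields that are omitted here).
#[derive(Debug, Clone, PartialEq)]
pub struct Word {
    pub value: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Ident {
    pub value: String,
}

impl Word {
    /// Borrowing conversion: `&self` cannot give up its `String`,
    /// so the text has to be copied into a new allocation.
    pub fn to_ident(&self) -> Ident {
        Ident {
            value: self.value.clone(),
        }
    }

    /// Consuming conversion: takes `self` by value and moves the
    /// existing `String` into the `Ident` without copying the text.
    pub fn into_ident(self) -> Ident {
        Ident { value: self.value }
    }
}

/// Hypothetical helper: returns true when the `Ident` ends up holding
/// the same heap buffer the `Word` held before the conversion.
pub fn into_ident_reuses_buffer(word: Word) -> bool {
    let before = word.value.as_ptr();
    let ident = word.into_ident();
    before == ident.value.as_ptr()
}
```

The signature is the whole story here: a method taking `&self` can only produce an owned `String` by cloning, while a method taking `self` can hand over the buffer it already owns.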

@@ -1108,7 +1108,7 @@ impl<'a> Parser<'a> {
if dialect_of!(self is PostgreSqlDialect | GenericDialect) =>
{
Ok(Some(Expr::Function(Function {
- name: ObjectName(vec![w.to_ident(w_span)]),
+ name: ObjectName(vec![w.clone().into_ident(w_span)]),

In this case I couldn't figure out (yet) how to avoid this clone, given that &w is passed in.

This PR doesn't increase the number of clones, but it makes it more explicit where they happen.
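
A small sketch of the point above, again using simplified stand-in types rather than the real sqlparser definitions. The function names `ident_from_owned` and `ident_from_borrowed` are hypothetical and chosen only for this example.

```rust
// Simplified stand-ins (assumption: not the real sqlparser types).
#[derive(Debug, Clone, PartialEq)]
pub struct Word {
    pub value: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Ident {
    pub value: String,
}

impl Word {
    /// Consuming conversion: moves the `String`, no copy.
    pub fn into_ident(self) -> Ident {
        Ident { value: self.value }
    }
}

/// The caller owns the `Word`, so the string can be moved.
pub fn ident_from_owned(w: Word) -> Ident {
    w.into_ident()
}

/// The caller only has `&Word`. A copy is still required, but it is now
/// spelled out as `.clone()` at the call site instead of being hidden
/// inside a `&self` method.
pub fn ident_from_borrowed(w: &Word) -> Ident {
    w.clone().into_ident()
}
```

With only a consuming conversion available, every remaining copy shows up as a literal `.clone()` in the source, which makes the remaining call sites easy to find when looking for further optimizations.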

@@ -13475,13 +13477,23 @@ impl<'a> Parser<'a> {
}

impl Word {
#[deprecated(since = "0.55.0", note = "please use `into_ident` instead")]
pub fn to_ident(&self, span: Span) -> Ident {
Ident {
value: self.value.clone(),

Note that to_ident clones self.value.

src/parser/mod.rs (outdated)
@@ -13475,13 +13477,23 @@ impl<'a> Parser<'a> {
}

impl Word {
#[deprecated(since = "0.54.0", note = "please use `into_ident` instead")]
@alamb alamb Dec 28, 2024

I am trying to help any downstream crates that may also use this function with a deprecation notice rather than simply removing the function immediately


Sounds good!
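
For readers unfamiliar with the deprecation approach discussed in this thread, the following is a rough sketch of how it behaves for downstream code. The types are simplified stand-ins, `legacy_caller` is a hypothetical downstream function, and the `since` field is left out of the attribute here because it is optional.

```rust
// Simplified stand-ins (assumption: not the real sqlparser types).
pub struct Word {
    pub value: String,
}

pub struct Ident {
    pub value: String,
}

impl Word {
    /// Old API kept for compatibility. Callers get a compiler warning
    /// that points them at the replacement, but their code still builds.
    #[deprecated(note = "please use `into_ident` instead")]
    pub fn to_ident(&self) -> Ident {
        Ident {
            value: self.value.clone(),
        }
    }

    /// New API: consumes the word and moves its string.
    pub fn into_ident(self) -> Ident {
        Ident { value: self.value }
    }
}

/// Hypothetical downstream code that has not migrated yet. Without the
/// `allow` attribute this call compiles with a deprecation warning.
#[allow(deprecated)]
pub fn legacy_caller(w: &Word) -> Ident {
    w.to_ident()
}
```

Removing the function outright would turn that warning into a hard compile error for every downstream caller on upgrade, which is the situation the deprecation notice is meant to avoid.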

@alamb alamb marked this pull request as ready for review December 28, 2024 13:59
@alamb alamb changed the title from "Improve parsing speed by Avoid clone in parse_identifier" to "Improve parsing speed by avoiding some clones in parse_identifier" Dec 28, 2024
@iffyio iffyio left a comment

LGTM!

@iffyio iffyio merged commit 3db1b44 into apache:main Dec 29, 2024
8 checks passed
@alamb alamb deleted the alamb/faster_parse_prefix branch December 29, 2024 14:44