Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: Improve command resolution in telemetry #115992

Closed
costin opened this issue Oct 30, 2024 · 2 comments · Fixed by #120527
Closed

ESQL: Improve command resolution in telemetry #115992

costin opened this issue Oct 30, 2024 · 2 comments · Fixed by #120527
Assignees
Labels
:Analytics/ES|QL AKA ESQL >refactoring Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@costin
Copy link
Member

costin commented Oct 30, 2024

Description

The ESQL telemetry infrastructure expects each LogicalPlan to map uniquely to a command, (through LogicalPlan#commandName()) an assumption that can't be generalized.
Instead the mapping between the plan and the commands needs to be moved to a separate class / registry / map.
P.S. Furthermore, the telemetry might in time be moved to occur at parsing time to track queries that don't pass the grammar, before the logical AST is fully assembled.

Example: FROM index | INLINESTATS c = count() should count FROM and INLINESTATS but ends up counting FROM, INLINESTATS and STATS (due to the normalized plan).

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Oct 30, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@luigidellaquila
Copy link
Contributor

@costin I gave it a try when I implemented APM telemetry, and I ended up using commandName() and making it dynamic (see what we do for METRICS)
Even if we move the logic in a separate class, we'll still have to store some information inside the plan, that is the only thing we have after parsing.

P.S. Furthermore, the telemetry might in time be moved to occur at parsing time to track queries that don't pass the grammar, before the logical AST is fully assembled.

That's the best thing to do for sure, but it's not straight forward. If we have to invest on this thing though, IMHO this is the right approach.

bpintea added a commit to bpintea/elasticsearch that referenced this issue Jan 28, 2025
This implements an interface that export the names of the plan nodes and
functions that need to be counted in the metrics.

Also, the metrics are now counted from within the parser. This should
allow correct accounting for the cases where some nodes can appear both
standalone or part other nodes' children (like Aggregate being a child
of INLINESTATS, so no STATS counting should occur).

The functions counting now also validates that behind a name there is
actually a function registered.

Closes elastic#115992.

(cherry picked from commit a4482d4)
bpintea added a commit to bpintea/elasticsearch that referenced this issue Jan 28, 2025
This implements an interface that export the names of the plan nodes and
functions that need to be counted in the metrics.

Also, the metrics are now counted from within the parser. This should
allow correct accounting for the cases where some nodes can appear both
standalone or part other nodes' children (like Aggregate being a child
of INLINESTATS, so no STATS counting should occur).

The functions counting now also validates that behind a name there is
actually a function registered.

Closes elastic#115992.

(cherry picked from commit a4482d4)
bpintea added a commit that referenced this issue Jan 29, 2025
* ESQL: Implement a MetricsAware interface (#120527)

This implements an interface that export the names of the plan nodes and
functions that need to be counted in the metrics.

Also, the metrics are now counted from within the parser. This should
allow correct accounting for the cases where some nodes can appear both
standalone or part other nodes' children (like Aggregate being a child
of INLINESTATS, so no STATS counting should occur).

The functions counting now also validates that behind a name there is
actually a function registered.

Closes #115992.

(cherry picked from commit a4482d4)

* Drop the HashSet gating when counting commands

The telemetry accounting is no longer done in just one place in the parser,
but split, so that no HashSet is required to discard duplicate accounting of
the same node. This lowers the memory requirements.
bpintea added a commit to bpintea/elasticsearch that referenced this issue Jan 29, 2025
* ESQL: Implement a MetricsAware interface (elastic#120527)

This implements an interface that export the names of the plan nodes and
functions that need to be counted in the metrics.

Also, the metrics are now counted from within the parser. This should
allow correct accounting for the cases where some nodes can appear both
standalone or part other nodes' children (like Aggregate being a child
of INLINESTATS, so no STATS counting should occur).

The functions counting now also validates that behind a name there is
actually a function registered.

Closes elastic#115992.

(cherry picked from commit a4482d4)

* Drop the HashSet gating when counting commands

The telemetry accounting is no longer done in just one place in the parser,
but split, so that no HashSet is required to discard duplicate accounting of
the same node. This lowers the memory requirements.
elasticsearchmachine pushed a commit that referenced this issue Jan 29, 2025
* ESQL: Implement a MetricsAware interface (#120527)

This implements an interface that export the names of the plan nodes and
functions that need to be counted in the metrics.

Also, the metrics are now counted from within the parser. This should
allow correct accounting for the cases where some nodes can appear both
standalone or part other nodes' children (like Aggregate being a child
of INLINESTATS, so no STATS counting should occur).

The functions counting now also validates that behind a name there is
actually a function registered.

Closes #115992.

(cherry picked from commit a4482d4)

* Drop the HashSet gating when counting commands

The telemetry accounting is no longer done in just one place in the parser,
but split, so that no HashSet is required to discard duplicate accounting of
the same node. This lowers the memory requirements.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >refactoring Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants