[Explore]Ensure ad-hoc metrics are not named the same as an existing column #13426

john-bodley · 2021-03-03T01:28:46Z

We should prevent ad-hoc metrics from being named the same as existing columns in the datasource as depending on the engine the generated SQL could result in a projection issue given it's not evident which column is being referred to.

Expected results

Ensure all metric names are unique to prevent such an issue.

Actual results

Presto queries of the form (generated using an ad-hoc query with the metric named cost),

SELECT
    SUM(cost) AS "cost"
FROM
    ...
ORDER BY
    SUM(cost)

result in the database error Invalid reference to output projection attribute from ORDER BY aggregation.

Screenshots

If applicable, add screenshots to help explain your problem.

How to reproduce the bug

Go to explore
Create an ad-hoc metric which is named the same as an existing column in the datasource.
Add the same ad-hoc metric to the SORT BY.
Click RUN.
See the error.

Environment

(please complete the following information):

superset version: master
python version: 3.7
node.js version: node -v

Checklist

Make sure to follow these steps before submitting your issue - thank you!

I have checked the superset logs for python stacktraces and included it here as text if there are any.
I have reproduced the issue with at least the latest released version of superset.
I have checked the issue tracker for the same issue and I haven't found one similar.

The text was updated successfully, but these errors were encountered:

ktmud · 2021-03-03T03:30:51Z

Maybe the SQL query generator should prefix all adhoc metric/sort by with table names instead:

SELECT
    SUM(tbl.cost) AS "cost"
FROM tbl
ORDER BY
    SUM(tbl.cost)

villebro · 2021-03-03T05:13:25Z

Good idea @ktmud , should not be difficult to implement

john-bodley · 2021-03-03T06:27:34Z

I’m not sure I agree, i.e., one shouldn’t need to include a table alias in order for these queries to work. Additionally it might not be apparent to the user from the UI what they’re actually doing given there are multiple “columns” with the same name. Furthermore each engine has a perspective on how this should be treated (and how the ORDER BY clause should be composed, i.e., in this case it may have wanted the alias as opposed to the expression in the clause).

Finally this used to work and thus it seems there’s been a regression somewhere (possibly related to the other linked issue) and thus I think we should first try to track that down.

I guess the TL;DR from me is we should strive for defining concise SQL and let the various SQLAlchemy dialects deal with the SQL nuances.

ktmud · 2021-03-11T20:17:05Z

@john-bodley AdhocMetrics are always created from the datasource columns so I think from a user's perspective, it is expected that the aggregation should apply only to known columns of the underlying table (be it virtual or physical)---not another ad-hoc metric or derived column.

A table name prefix will be totally transparent to the UI and users would not need to do anything extra. It is needed to create a concise & correct SQL query because in this case we obviously didn't give SQLAlchemy enough hints to generate unambiguous SQL queries for all data engines.

john-bodley added the #bug Bug report label Mar 3, 2021

john-bodley mentioned this issue Mar 3, 2021

[Explore]SORT BY metric incorrectly re-added to SELECT clause #13423

Closed

3 tasks

john-bodley added the #bug:regression Bugs that are identified as regessions label Mar 3, 2021

zuzana-vej added explore:metrics Related to metrics of Explore and removed #bug Bug report labels Mar 3, 2021

junlincc changed the title ~~Ensure ad-hoc metrics are not named the same as an existing column~~ [Explore]Ensure ad-hoc metrics are not named the same as an existing column Mar 3, 2021

junlincc added the assigned:flexiana label Mar 3, 2021

This was referenced Mar 3, 2021

fix(query): order by adhoc metrics should trigger group by #13434

Merged

[Explore][Table] Adding sort by without metrics causing unexpected error #13465

Closed

ktmud mentioned this issue Mar 23, 2021

fix(sqla-query): order by aggregations in Presto and Hive #13739

Merged

6 tasks

ktmud closed this as completed in #13739 Apr 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Explore]Ensure ad-hoc metrics are not named the same as an existing column #13426

[Explore]Ensure ad-hoc metrics are not named the same as an existing column #13426

john-bodley commented Mar 3, 2021

ktmud commented Mar 3, 2021

villebro commented Mar 3, 2021

john-bodley commented Mar 3, 2021 •

edited

Loading

ktmud commented Mar 11, 2021

[Explore]Ensure ad-hoc metrics are not named the same as an existing column #13426

[Explore]Ensure ad-hoc metrics are not named the same as an existing column #13426

Comments

john-bodley commented Mar 3, 2021

Expected results

Actual results

Screenshots

How to reproduce the bug

Environment

Checklist

ktmud commented Mar 3, 2021

villebro commented Mar 3, 2021

john-bodley commented Mar 3, 2021 • edited Loading

ktmud commented Mar 11, 2021

john-bodley commented Mar 3, 2021 •

edited

Loading