fix: adds the ability to disallow SQL functions per engine #28639

Merged
dpgaspar merged 5 commits into apache:master from feat/disallow-sql-functions
May 29, 2024

dpgaspar
Member

@dpgaspar dpgaspar commented May 22, 2024

SUMMARY

Adds a new configuration key, DISALLOWED_SQL_FUNCTIONS, that defines which SQL functions are disallowed for each engine. These functions are disallowed in both SQL Lab and Charts.
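
As a rough illustration of the shape such a setting could take, here is a minimal sketch. The engine keys, the function names, and the `disallowed_for` helper are assumptions chosen for the example; they are not taken from this PR's defaults.

```python
# Illustrative values only: engine name -> set of function names to block.
DISALLOWED_SQL_FUNCTIONS: dict[str, set[str]] = {
    "postgresql": {"version", "query_to_xml"},
    "mysql": {"version"},
}


def disallowed_for(engine: str) -> set[str]:
    """Hypothetical helper: return the deny list for an engine, or an empty set."""
    return DISALLOWED_SQL_FUNCTIONS.get(engine, set())
```

Engines without an entry simply get an empty deny list in this sketch, so nothing is blocked for them.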

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API


codecov bot commented May 22, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 2 lines in your changes missing coverage. Please review.

Project coverage is 83.43%. Comparing base (76d897e) to head (396a8e0).
Report is 1094 commits behind head on master.

Files with missing lines            Patch %   Lines
superset/db_engine_specs/base.py    75.00%    1 Missing ⚠️
superset/exceptions.py              66.66%    1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #28639       +/-   ##
===========================================
+ Coverage   60.48%   83.43%   +22.94%     
===========================================
  Files        1931      523     -1408     
  Lines       76236    37605    -38631     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31377    -14737     
+ Misses      28017     6228    -21789     
+ Partials     2105        0     -2105     
Flag         Coverage Δ
hive         48.99% <28.57%> (-0.18%) ⬇️
javascript   ?
postgres     77.22% <71.42%> (?)
presto       53.56% <71.42%> (-0.24%) ⬇️
python       83.43% <92.85%> (+19.95%) ⬆️
sqlite       76.68% <71.42%> (?)
unit         58.95% <92.85%> (+1.33%) ⬆️


@pull-request-size pull-request-size bot added size/L and removed size/M labels May 22, 2024
@dpgaspar dpgaspar marked this pull request as ready for review May 22, 2024 15:19
@dosubot dosubot bot added data:databases Related to database configurations and connections sqllab Namespace | Anything related to the SQL Lab labels May 22, 2024
         logger.debug("Query %d: Running query: %s", query_id, sql)

         try:
-            cls.execute(cursor, sql, query.database)
+            with app.app_context():
+                cls.execute(cursor, sql, query.database)
Member

Oh, interesting.

    :param function_list: The list of functions to search for
    :param engine: The engine to use for parsing the SQL statement
    """
    return ParsedQuery(sql, engine=engine).check_functions_exist(function_list)
Member

We (probably me) will have to convert this to use sqlglot and the SQLStatement class (#26786) but I'm happy to do it, seems simple enough.

Member Author

Happy to do it myself. Can I just skip sqlparse and implement something using the same pattern as extract_tables_from_statement?
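
For readers following this thread, the kind of check being discussed can be approximated with a deliberately simplified, regex-based scan. This is not the PR's implementation (which goes through ParsedQuery) and not the sqlglot-based version proposed above; `find_disallowed_functions` is a hypothetical name, and the sketch ignores comments, quoted identifiers, and dialect-specific syntax.

```python
import re

# Single-quoted SQL string literals ('' is an escaped quote inside a literal).
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
# An identifier followed by an opening parenthesis, i.e. something that looks like a call.
_FUNCTION_CALL = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def find_disallowed_functions(sql: str, function_list: set[str]) -> set[str]:
    """Return the names in function_list that appear as calls in sql (simplified)."""
    # Blank out string literals so text such as 'version()' is not mistaken for a call.
    stripped = _STRING_LITERAL.sub("''", sql)
    # Compare case-insensitively, since SQL function names usually are.
    called = {name.lower() for name in _FUNCTION_CALL.findall(stripped)}
    return called & {name.lower() for name in function_list}
```

A real parser is preferable for anything security-sensitive, because a regex cannot reliably distinguish calls from look-alike text in comments or quoted identifiers.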

# A set of disallowed SQL functions per engine. This is used to restrict the use of
# unsafe SQL functions in SQL Lab and Charts. The keys of the dictionary are the engine
# names, and the values are sets of disallowed functions.
DISALLOWED_SQL_FUNCTIONS: dict[str, set[str]] = {
Member

I'm unsure where the best place for this deny list is, i.e., here in the configuration or within the extra JSON payload of the database.

Additionally, should this be engine (dialect) specific or database specific? If it's the latter, then maybe the extra JSON payload field is preferable.

Member Author

A JSON payload at the database level is more dynamic and would avoid having to change the config to add or remove disallowed functions. On the other hand, the user who actually registers the database could intend to "abuse" these functions.

@dpgaspar dpgaspar merged commit 5dfbab5 into apache:master May 29, 2024
34 checks passed
@dpgaspar dpgaspar deleted the feat/disallow-sql-functions branch May 29, 2024 09:51
EnxDev pushed a commit to EnxDev/superset that referenced this pull request May 31, 2024
@michael-s-molina michael-s-molina added the v4.0 Label added by the release manager to track PRs to be included in the 4.0 branch label Jun 26, 2024
@mistercrunch mistercrunch added 🍒 4.0.2 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels labels Jul 24, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels data:databases Related to database configurations and connections size/L sqllab Namespace | Anything related to the SQL Lab v4.0 Label added by the release manager to track PRs to be included in the 4.0 branch 🍒 4.0.2 🚢 4.1.0