fix: Use utils.json_iso_dttm_ser to dump jsons when async query execution #13830
If `encoding` is not `None`, the default encoder `utils.json_iso_dttm_ser` is not used for binary data; `json.dumps` instead tries to decode the bytes with the default 'utf-8' encoding. This fixes #13829
Codecov Report
@@ Coverage Diff @@
## master #13830 +/- ##
==========================================
+ Coverage 77.22% 79.40% +2.17%
==========================================
Files 935 939 +4
Lines 47266 47541 +275
Branches 5893 5938 +45
==========================================
+ Hits 36502 37750 +1248
+ Misses 10616 9670 -946
+ Partials 148 121 -27
/testenv up
superset/views/core.py
Outdated
@@ -2211,7 +2211,8 @@ def results_exec(  # pylint: disable=too-many-return-statements
     obj = apply_display_max_row_limit(obj, rows)

 return json_success(
-    json.dumps(obj, default=utils.json_iso_dttm_ser, ignore_nan=True)
+    json.dumps(obj, default=utils.json_iso_dttm_ser, ignore_nan=True,
+               encoding=None)
If we don't specify the encoding type, I guess `json.dumps` might use `ASCII` encoding, which might be a problem for Asian characters.
Hi!
Thanks for checking the PR. I looked into the encoding, and it seems that if a UTF-8 string is in the JSON as a binary string, it will pass through

superset/superset/utils/core.py, line 550 in b5c95c5:
return obj.decode("utf-8")

and be decoded as `utf-8`, and if the string is a `str` it will go through https://github.com/simplejson/simplejson/blob/8bef979ad8272cbc2903970f4b9992f603d50973/simplejson/encoder.py#L51 and also get encoded into `utf-8`.

There are two other places where the dump is done with `encoding=None`:

superset/superset/views/core.py, line 2302 in b5c95c5:
encoding=None,

superset/superset/views/core.py, line 2428 in b5c95c5:
encoding=None,

I also think that the other calls to `json.dumps` with `default=utils.*` should have `encoding` set to `None`.

I did a brief test of the change with non-ASCII strings and they were correctly displayed.
Is there some other test you think could help mitigate the risk of querying non ASCII chars?
Cheers :)
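
One possible check for the non-ASCII concern, sketched with the standard-library `json` module as a stand-in for simplejson (an assumption for portability; the standard library always routes `bytes` to `default`, which is what `encoding=None` asks simplejson to do). The `decode_bytes` hook is hypothetical and only mirrors the `obj.decode("utf-8")` line quoted above; it is not the real `utils.json_iso_dttm_ser`.

```python
import json  # stdlib stand-in for simplejson (assumption, see lead-in)


def decode_bytes(obj):
    """Hypothetical default hook: decode UTF-8 bytes, reject everything else."""
    if isinstance(obj, bytes):
        return obj.decode("utf-8")
    raise TypeError(f"Unserializable object of type {type(obj).__name__}")


# Non-ASCII text, once as a str and once as UTF-8 encoded bytes.
text = "数据库 データ 데이터"
row = {"as_str": text, "as_bytes": text.encode("utf-8")}

# The str value is serialized directly; the bytes value goes through the hook.
dumped = json.dumps(row, default=decode_bytes)

# Parsing the payload again should give back the original characters.
restored = json.loads(dumped)
```

With the default `ensure_ascii=True`, the serialized payload contains only ASCII escape sequences, yet parsing it yields the original characters for both the `str` and the `bytes` value.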

@zhaoyongjie Ephemeral environment creation is currently limited to committers.
@junlincc Ephemeral environment spinning up at http://54.191.32.199:8080. Credentials are
Hi @cabo40
Hi @cabo40, sorry, please disregard my earlier reply. I reproduced the problem, but it doesn't seem to be caused by the global async query. Here is my configuration for reproducing this issue
LGTM, thanks for this fix. Please run `tox -e pre-commit` to fix PEP8.
Ephemeral environment shutdown and build artifacts deleted.
* master: fix: Use utils.json_iso_dttm_ser to dump jsons when async query execution (apache#13830) feat: TrinoEngineSpec.adjust_database_uri (apache#14122) chore: bump package.json (apache#14222) Add superset helm repository (apache#14223) fix(cross-filters): Fix missed metadata (apache#14220)
…tion (apache#13830) * Use utils.json_iso_dttm_ser to dump jsons If encoding is not `None`, the default encoder `utils.json_iso_dttm_ser` is not used for binary data and instead tries to encode to the default 'utf-8'. This fixes apache#13829 * Change to comply with tox -m pre-commit
SUMMARY
If `encoding` is not `None`, the default encoder `utils.json_iso_dttm_ser` is not used for binary data; `json.dumps` instead tries to decode the bytes with the default 'utf-8' encoding. As a result, running `SELECT *` during data exploration on tables with blob columns throws an error.
This fixes #13829
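
To make the failure mode concrete, here is a minimal sketch. It uses the standard-library `json` module as a stand-in (an assumption for portability): the standard library never decodes `bytes` itself and always hands them to `default`, which is the behavior `encoding=None` requests from simplejson. The `bytes_safe_default` function and its `"[bytes]"` placeholder are hypothetical and chosen for illustration; they are not the actual body of `utils.json_iso_dttm_ser`.

```python
import json  # stdlib stand-in for simplejson (assumption, see lead-in)


def bytes_safe_default(obj):
    """Hypothetical default hook covering only the bytes case."""
    if isinstance(obj, bytes):
        try:
            return obj.decode("utf-8")
        except UnicodeDecodeError:
            # Assumed placeholder for binary data that is not valid UTF-8.
            return "[bytes]"
    raise TypeError(f"Unserializable object of type {type(obj).__name__}")


# Same payload as the query in the test plan: the four bytes DE AD BE EF.
blob = bytes.fromhex("DEADBEEF")

# An eager UTF-8 decode, as attempted when encoding is left at 'utf-8', fails.
try:
    blob.decode("utf-8")
    eager_decode_failed = False
except UnicodeDecodeError:
    eager_decode_failed = True

# Routing the bytes to the default hook lets serialization succeed.
payload = json.dumps({"data": [{"decode": blob}]}, default=bytes_safe_default)
```

The design point is that the hook decides how to represent undecodable binary data, so a single blob column no longer aborts serialization of the whole result set.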
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
before:

after:

TEST PLAN
Try executing `select decode('DEADBEEF', 'hex');` on the sample DB with async turned on.
ADDITIONAL INFORMATION
Asynchronous query execution must be enabled for the database.