-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix csv upload schema_name bug #5940
Conversation
8f75ecc
to
3a730be
Compare
superset/db_engine_specs.py
Outdated
|
||
hive_table_schema = Table(upload_path).infer() | ||
column_name_and_type = [] | ||
for column_info in hive_table_schema['fields']: | ||
column_name_and_type.append( | ||
'{} {}'.format( | ||
"'" + column_info['name'] + "'", | ||
'`' + column_info['name'] + '`', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these backticks be included in the format statement to improve readabily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
upload_path = config['UPLOAD_FOLDER'] + \ | ||
secure_filename(form.csv_file.data.filename) | ||
secure_filename(filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this line? It seems like filename
is merely used as a variable alias and only appears once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used once more and the variable name is long. This looks cleaner to me
superset/db_engine_specs.py
Outdated
table_name = form.name.data | ||
schema_name = form.schema.data | ||
full_table_name = '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed given it’s guaranteed to be defined in the if/else block and is scoped for the whole of the function.
|
||
filename = form.csv_file.data.filename | ||
bucket_path = config['CSV_TO_HIVE_UPLOAD_S3_BUCKET'] | ||
full_table_name = '{}.{}'.format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this is outside the scope of this PR but frontend form validation to prevent a period in the table name and enforcing a schema seems like a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I will follow up with adding frontend validation
filename = form.csv_file.data.filename | ||
bucket_path = config['CSV_TO_HIVE_UPLOAD_S3_BUCKET'] | ||
full_table_name = '{}.{}'.format( | ||
schema_name, table_name) if schema_name else table_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the schema_name
is None
and the table_name
doesn’t include a period? Does this simply create the table in the default schema? Again I sense improving the form validation could simplify the logic and improve the UX.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. I will work on improving the ux with some form validation
3a730be
to
a9efb3f
Compare
a9efb3f
to
d5ddb55
Compare
Codecov Report
@@ Coverage Diff @@
## master #5940 +/- ##
=========================================
+ Coverage 48.19% 48.2% +<.01%
=========================================
Files 393 393
Lines 23600 23597 -3
Branches 2630 2630
=========================================
Hits 11375 11375
+ Misses 12225 12222 -3
Continue to review full report at Codecov.
|
(cherry picked from commit 00c4c7e)
(cherry picked from commit 00c4c7e)
This PR fixes a few bugs in the upload CSV to hive flow. Most notably in the specification of the schema in the create table statement. Previously, we would accidentally reset the table_name variable back to just the table name after appending the schema. This PR fixes that.
@john-bodley
cc @graceguo-supercat