Upload CSV to Database (Trino) is broken after #28192 #31784

Open
devlatte opened this issue Jan 10, 2025 · 2 comments
Labels: data:connect:trino (Related to Trino), data:csv (Related to import/export of CSVs)

Comments

@devlatte

Bug description

The "Upload CSV to Database" feature for Trino is not working properly after the changes introduced in #28192.

This issue did not exist in version 4.0.2, where the feature was functioning correctly.

Screenshots/recordings

1. Go to the "Upload CSV to Database" feature in the Superset UI.
2. Select Trino as the target database.
3. Attempt to upload a CSV file.
4. Observe the error or malfunction.

If Table Already Exists: Fail

Traceback (most recent call last):
  File "/app/superset/commands/database/uploaders/base.py", line 114, in _dataframe_to_database
    database.db_engine_spec.df_to_sql(
  File "/app/superset/db_engine_specs/trino.py", line 515, in df_to_sql
    if database.has_table_by_name(table.table, table.schema):
AttributeError: 'Database' object has no attribute 'has_table_by_name'


If Table Already Exists: Replace

Traceback (most recent call last):
  File "/app/superset/commands/database/uploaders/base.py", line 114, in _dataframe_to_database
    database.db_engine_spec.df_to_sql(
  File "/app/superset/db_engine_specs/trino.py", line 546, in df_to_sql
    location=upload_to_s3(
  File "/app/superset/db_engine_specs/hive.py", line 85, in upload_to_s3
    bucket = s3.Bucket(bucket_path)
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 922, in __getattr__
    raise AttributeError(
AttributeError: 'S3' object has no attribute 'Bucket'



### Superset version

4.1.1

### Python version

3.10

### Node version

16

### Browser

Chrome

### Additional context

This functionality was working correctly in version 4.0.2.
The regression seems to have been introduced with the changes in [#28192](https://github.com/apache/superset/pull/28192).

I attempted to use the "Replace" setting as a workaround, but it also fails. 
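The issue appears to be caused by hive.py relying on an outdated version of boto3: the traceback shows the call `s3.Bucket(bucket_path)` in `upload_to_s3` failing inside botocore/client.py.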
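
The second traceback ends in botocore/client.py, which suggests that `s3` in `upload_to_s3` is a low-level client object rather than a resource object. In boto3, `Bucket` belongs to the resource interface, while clients expose `upload_file(Filename, Bucket, Key)` directly. Below is a minimal sketch of a helper that tolerates either kind of handle. It is not the actual Superset code: the helper name `upload_with_any_s3_handle` and the stub classes are hypothetical and exist only so the example runs without AWS access.

```python
class StubClient:
    """Hypothetical stand-in for a boto3 S3 client (has no `Bucket` attribute)."""

    def __init__(self):
        self.calls = []

    def upload_file(self, Filename, Bucket, Key):
        # Record the call instead of talking to S3.
        self.calls.append((Filename, Bucket, Key))


class StubBucket:
    """Hypothetical stand-in for a boto3 S3 Bucket resource."""

    def __init__(self, name, calls):
        self.name = name
        self.calls = calls

    def upload_file(self, Filename, Key):
        self.calls.append((Filename, self.name, Key))


class StubResource:
    """Hypothetical stand-in for a boto3 S3 service resource."""

    def __init__(self):
        self.calls = []

    def Bucket(self, name):
        return StubBucket(name, self.calls)


def upload_with_any_s3_handle(s3, filename, bucket_name, key):
    """Upload `filename` using whichever interface `s3` exposes."""
    if hasattr(s3, "Bucket"):
        # Resource-style handle: go through the Bucket object.
        s3.Bucket(bucket_name).upload_file(filename, key)
    else:
        # Client-style handle: the bucket name is passed as an argument.
        s3.upload_file(filename, bucket_name, key)
```

With real boto3, the `s3` argument would come from `boto3.client("s3")` or `boto3.resource("s3")`. Whether Superset should branch like this or settle on one interface is a design decision for the maintainers.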

### Checklist

- [X] I have searched Superset docs and Slack and didn't find a solution to my problem.
- [X] I have searched the GitHub issue tracker and didn't find a similar bug report.
- [X] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

dosubot bot commented Jan 10, 2025

Hey @devlatte! I'm a bot designed to help you with bug fixes, troubleshooting, and contributions. I'm here to assist you while waiting for a human maintainer. Let's squash some bugs together!

I found a similar unsolved discussion regarding Trino connection issues for CSV uploads in Superset version 4.1.0rc2. You can view the discussion here [1].

Regarding the errors you're encountering, it seems there are two main issues:

  1. Missing has_table_by_name Method: The error indicates that the Database object lacks the has_table_by_name method. This might be due to changes introduced in pull request #28192 (feat: new Columnar upload form and API). You may need to check whether this method was removed or renamed in the recent updates.

  2. S3 Bucket Attribute Error: The error 'S3' object has no attribute 'Bucket' suggests a potential issue with the boto3 library version or its usage in hive.py. The upload_to_s3 function uses boto3 to interact with S3, and there might be compatibility issues with the version of boto3 you are using. Ensure that your boto3 library is up-to-date and compatible with Python 3.10. Additionally, verify that the AWS credentials and permissions are correctly configured [2].
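
One quick way to check whether a method was removed or renamed is to list the attributes the object still exposes. The following is a minimal sketch that assumes nothing about Superset's actual API: `find_similar_attributes` is a hypothetical helper, and `StubDatabase` with its `has_table` and `has_view` methods is a placeholder. In practice you would pass the real `Database` instance from a Superset shell.

```python
def find_similar_attributes(obj, fragment):
    """Return the sorted public attribute names of `obj` that contain `fragment`."""
    return sorted(
        name
        for name in dir(obj)
        if fragment in name and not name.startswith("_")
    )


class StubDatabase:
    """Hypothetical stand-in; the real class may expose different names."""

    def has_table(self, table):
        return False

    def has_view(self, view):
        return False


# Look for anything resembling the missing `has_table_by_name` method.
print(find_similar_attributes(StubDatabase(), "has_"))
```

If a close match turns up on the real object, comparing its signature with the call in trino.py would show whether the caller simply needs updating.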

You might want to investigate these areas further to resolve the issues.


@dosubot dosubot bot added data:connect:trino Related to Trino data:csv Related to import/export of CSVs labels Jan 10, 2025
@rusackas
Member

Pinging @dpgaspar since he's the author of #28192
