Handle non-unicode encoded datafiles #421
Comments
Thanks for reporting this bug, @sp35! If possible, I'd like to avoid asking users for the encoding since we're aiming to be friendly to non-technical users. I think we should investigate a solution that tries different encoding options automatically.
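To make "tries different encoding options automatically" concrete, here is a minimal standard-library sketch. The function name `decode_with_fallback` and the candidate list are hypothetical and not taken from Mathesar. The order matters: strict UTF-8 goes first, and `latin-1` goes last because it accepts any byte sequence.

```python
from typing import Sequence, Tuple

# Hypothetical candidate list; a real deployment would tune this.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def decode_with_fallback(
    raw: bytes, encodings: Sequence[str] = CANDIDATE_ENCODINGS
) -> Tuple[str, str]:
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


# 0xae is not valid UTF-8 on its own, but it is the registered sign in cp1252.
text, encoding = decode_with_fallback(b"Product\xae,Price\n")
print(encoding, repr(text))
```

A caveat with this approach: single-byte encodings rarely raise, so a successful decode does not prove the guess was right. That is one reason a detection library may be preferable.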
I've used a small 2kb sample file I found on the web and it failed with this error:
Here's the file I used:
The grades file seems to cause the Edit: #572
Proposal
Use an encoding detection library such as Charset Normalizer or cChardet.
Issues
Blocking request: encoding detection is not instant and must be dealt with asynchronously. The same applies to creating data from a CSV (it will block for huge CSV files), so solving that should also solve this problem.
Remarks
I am using charset-normalizer and haven't had issues so far, so I am leaning towards using that library.
@silentninja That sounds good to me. If you're still interested, please feel free to send a PR. We've put off handling operations asynchronously for now in order to avoid adding additional dependencies such as a task queue, etc. for Mathesar. Let's handle this synchronously for now and create a separate issue for handling it asynchronously.
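A rough sketch of what synchronous detection with charset-normalizer could look like is below. This is not the code from any eventual PR. The helper name `detect_and_decode` is hypothetical, and the fallback to lossy UTF-8 decoding when the library is missing or finds no match is an assumption made only so the example runs anywhere.

```python
from typing import Tuple

try:
    from charset_normalizer import from_bytes
except ImportError:  # library not installed; use the lossy fallback below
    from_bytes = None


def detect_and_decode(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding), detecting the encoding when possible."""
    if from_bytes is not None:
        best = from_bytes(raw).best()  # None when no plausible match is found
        if best is not None:
            return str(best), best.encoding
    # Assumed fallback: decode as UTF-8, replacing undecodable bytes.
    return raw.decode("utf-8", errors="replace"), "utf-8"


sample = ("name,city\nZoë,Kraków\n" * 20).encode("utf-8")
text, encoding = detect_and_decode(sample)
print(encoding)
```

Because the call runs inline in the request, detection time grows with file size. Passing only a leading chunk of the file to the detector is one way to bound it, at some cost in accuracy.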
Description
Upload CSV feature fails for non utf-8 encoded CSVs. The exception is unhandled so the server returns 500.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xae in position 265: invalid start byte
Expected behavior
We can ask for the encoding of the CSV file in the UI and handle the errors if any.
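For illustration, the failure and a handled version of it can be reproduced with a few lines of plain Python. The function `read_csv_text` and the status codes it returns are hypothetical stand-ins for the real upload handler, which this thread does not show.

```python
from typing import Tuple


def read_csv_text(raw: bytes) -> Tuple[int, str]:
    """Decode an upload as UTF-8, reporting failures instead of crashing."""
    try:
        return 200, raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # A client error with a readable message, rather than an unhandled 500.
        return 400, f"Could not decode the file as UTF-8: {exc}"


# 265 ASCII bytes followed by 0xae mirrors the position in the reported error.
status, message = read_csv_text(b"a" * 265 + b"\xae")
print(status, message)
```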
To Reproduce
Upload a non utf-8 encoded CSV (e.g. unknown-8bit: https://sample-videos.com/csv/Sample-Spreadsheet-10-rows.csv)
Environment
Docker