'Upload parquet' option #14020
Comments
Hi @elenamereloaxesor, thanks for the suggestion! I took a look at Apache Parquet and it seems promising. However, we haven't received many requests like this, so it is not on our roadmap. Would you be interested in adding the support?
Parquet is awesome and I think it's something we should consider, along with an option to import a file directly from a remote server (e.g., S3). Though, if the data is already in Parquet format, it may make more sense to import it directly into the db and then connect the db to Superset. For these upload data options, Superset is really just acting as a broker for the target db. @elenamereloaxesor is the motivation here that Superset's upload UI is much friendlier than going directly to the db? Or is getting direct access to the db the challenge?
The motivation may well be both of the reasons you mention: ease of use, and the fact that I am having a bit of trouble accessing Parquet files on GCS, even though I have successfully connected Drill as a database. Thanks for the rapid response. I would of course love to help, but I'm afraid I'm quite new to this world and still getting a grasp of how everything works and where everything is, so I feel a bit at a loss as to how to contribute.
Hi, if Parquet does get added, I also recommend adding support for ORC files in the same process. Both formats are quite common in the Hadoop environment and in analytics engineering.
Is your feature request related to a problem? Please describe.
In my data mining department we read data from Parquet files, something that's quite common and widespread. However, Superset doesn't give the option to directly upload a Parquet file, just CSV or Excel.
Describe the solution you'd like
I would like there to be an 'upload parquet' option, alongside the existing ones.
Describe alternatives you've considered
I am already using Drill with Superset, since Drill supports many kinds of files. Ideally, Parquet upload would be an option in the same way that, when adding a database, Drill can be used through SQLAlchemy to connect to more databases than the default ones.