Joris Van den Bossche / @jorisvandenbossche:
You are using adlfs, which is an fsspec-compatible filesystem, so I would normally expect the pandas read_parquet call to convert the "abfss://data.parquet" URI to an fsspec filesystem and pass that to the underlying pyarrow function. We do support fsspec filesystems, which is how we can support filesystems that don't have native support inside Arrow C++, such as Azure at the moment.
So something is going wrong here. To start, can you indicate which versions of pyarrow, pandas, fsspec and adlfs you are using? (e.g. the output of pip list or conda list)
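If pip list or conda list is awkward to run inside a notebook, the same information can be gathered from Python. This is a minimal sketch using only the standard library; the helper name installed_versions is hypothetical, and it assumes the packages are installed under their usual distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# The four packages asked about in this thread.
PACKAGES = ("pyarrow", "pandas", "fsspec", "adlfs")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Running this in the same notebook or cluster where the error occurs should report the versions that are actually in use there.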
If you really want to use adlfs, this issue is definitely solvable with changes to the user code alone. However, I think it will also be solved by #39317, which will connect the new C++ AzureFileSystem to the Python side and provide much better performance and reliability than adlfs.
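One possible user-code change, offered as a sketch and not as a confirmed fix for this report, is to build the adlfs filesystem explicitly and hand it to pyarrow, so that nothing depends on the URI being resolved automatically. The function name, the credential parameters, and the example path below are assumptions for illustration; the original code and error message are not shown in this thread.

```python
def read_parquet_via_adlfs(path, account_name, account_key):
    """Read a Parquet file from Azure storage through an explicit adlfs filesystem.

    `path` is assumed to have the form "container/dir/file.parquet".
    """
    # Imported lazily so this sketch can be loaded without the optional packages.
    import adlfs
    import pyarrow.parquet as pq

    # Construct the fsspec-compatible filesystem directly.
    fs = adlfs.AzureBlobFileSystem(account_name=account_name, account_key=account_key)

    # pyarrow accepts fsspec filesystems through the `filesystem` argument.
    table = pq.read_table(path, filesystem=fs)
    return table.to_pandas()


# Hypothetical usage; replace the placeholders with real values:
# df = read_parquet_via_adlfs("mycontainer/data.parquet", "myaccount", "<account key>")
```

Passing the filesystem explicitly also makes it easier to tell whether a failure comes from URI resolution or from the read itself.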
I am running the commands below in Databricks.
When I try to read a file stored in ADLS using pandas:
I get the error below:
Reporter: Prakhar Sandhu
Note: This issue was originally created as ARROW-17672. Please see the migration documentation for further details.