df.write_excel does not work with file objects #18849

Closed
2 tasks done
littleblubber opened this issue Sep 23, 2024 · 4 comments · Fixed by #20638
Labels
A-io-spreadsheet (Area: reading/writing Excel/ODS files) · bug (Something isn't working) · good first issue (Good for newcomers) · P-low (Priority: low) · python (Related to Python Polars)

Comments

@littleblubber

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = pl.DataFrame({"a": [1, 2, 3, 4, 5]})

with open("dataframe.xlsx", "wb") as f:
    df.write_excel(f)

Log output

Traceback (most recent call last):
  File "<input>", line 6, in <module>
  File "~/.../polars/dataframe/frame.py", line 3363, in write_excel
    wb, ws, can_close = _xl_setup_workbook(workbook, worksheet)
  File "~/.../polars/io/spreadsheet/_write_utils.py", line 595, in _xl_setup_workbook
    file = Path("dataframe.xlsx" if workbook is None else workbook)
  File "/usr/lib/python3.9/pathlib.py", line 1072, in __new__
    self = cls._from_parts(args, init=False)
  File "/usr/lib/python3.9/pathlib.py", line 697, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/usr/lib/python3.9/pathlib.py", line 681, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not BufferedWriter

Issue description

_xl_setup_workbook incorrectly defaults to treating file objects like io.BufferedWriter, fsspec.implementations.local.LocalFileOpener, or s3fs.S3File as paths.

This is because the isinstance(workbook, BytesIO) condition is False for these objects.

By contrast, write_csv and write_json methods work fine with file-like objects such as the ones mentioned above.
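The type mismatch described above can be checked without Polars. The following is a minimal sketch, not Polars code: the helper name describe is hypothetical, and it only shows that a handle returned by open(..., "wb") is neither a BytesIO nor path-like, so passing it to Path() raises the TypeError seen in the log output.

```python
import io
import os
import tempfile
from pathlib import Path


def describe(obj):
    """Report how the two type checks discussed above classify obj (hypothetical helper)."""
    return {
        "is_bytesio": isinstance(obj, io.BytesIO),
        "is_path_like": isinstance(obj, (str, os.PathLike)),
    }


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "dataframe.xlsx")
    with open(target, "wb") as f:
        # f is an io.BufferedWriter: a file object, but not a BytesIO
        handle_checks = describe(f)
        try:
            Path(f)  # the same conversion that fails in the traceback
            path_error = None
        except TypeError as exc:
            path_error = exc

buffer_checks = describe(io.BytesIO())

print(handle_checks)
print(buffer_checks)
print(type(path_error).__name__)
```

The same reasoning should apply to the fsspec and s3fs file objects named above, although they are not exercised here.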

Expected behavior

Similar to write_csv or write_json, the expected behaviour in _xl_setup_workbook is to explicitly check that the workbook is a string/path-like object, otherwise defaulting to treating the workbook as file-like.

I think this could be accomplished by modifying the existing if-else statement along the following lines:

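if workbook is None or isinstance(workbook, (str, os.PathLike)):
    # None must stay on this branch so the default file name still applies
    file = Path("dataframe.xlsx" if workbook is None else workbook)
    # ...
else:
    # anything that is not None, a string, or path-like is treated as file-like
    wb, ws, can_close = Workbook(workbook, workbook_options), None, True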
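As a rough illustration of the dispatch order, here is a self-contained sketch that classifies the workbook argument without touching xlsxwriter. The function resolve_target, the constant DEFAULT_NAME, and the check for a write method are assumptions made for this example; they are not part of Polars. The sketch handles workbook=None explicitly, since the traceback shows that None falls back to "dataframe.xlsx".

```python
import io
import os
from pathlib import Path

DEFAULT_NAME = "dataframe.xlsx"  # default file name, as seen in the traceback


def resolve_target(workbook):
    """Classify workbook as a path or a file-like object (hypothetical helper).

    Returns ("path", Path) for None, str, or os.PathLike input,
    and ("file", obj) for any object exposing a callable write().
    """
    if workbook is None:
        return "path", Path(DEFAULT_NAME)
    if isinstance(workbook, (str, os.PathLike)):
        return "path", Path(workbook)
    # Assumption: anything with a write() method counts as file-like.
    if not callable(getattr(workbook, "write", None)):
        raise TypeError(f"unsupported workbook target: {type(workbook).__name__}")
    return "file", workbook


print(resolve_target(None))
print(resolve_target("report.xlsx"))
print(resolve_target(io.BytesIO())[0])
```

Checking for path-like input first, rather than for specific file classes, means a new file-like type needs no change to the dispatch.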

Installed versions

pl.show_versions()
--------Version info---------
Polars:              1.7.1
Index type:          UInt32
Platform:            Linux-5.15.0-122-generic-x86_64-with-glibc2.31
Python:              3.9.5 (default, Nov 23 2021, 15:27:38) 
[GCC 9.3.0]
----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            0.11.5
fsspec               2024.3.1
gevent               <not installed>
great_tables         <not installed>
matplotlib           3.8.0
nest_asyncio         <not installed>
numpy                1.26.2
openpyxl             3.1.5
pandas               2.1.1
pyarrow              16.1.0
pydantic             2.5.3
pyiceberg            <not installed>
sqlalchemy           1.4.49
torch                <not installed>
xlsx2csv             0.8.3
xlsxwriter           3.2.0

littleblubber added the bug, needs triage, and python labels on Sep 23, 2024

@v-wei40680

In Polars, you don't need to use with open() to open a file for writing. Polars provides methods that write directly to a file, such as df.write_excel() (and its counterparts for other formats), which handle the file path themselves.

The correct usage is:

df.write_excel("dataframe.xlsx")

This way, Polars opens and closes the file automatically, so wrapping the call in with open() is unnecessary.

Translated with DeepL.com (free version)

@mcrumiller
Contributor

@v-wei40680 sorry, can you post in English if possible? Here is a translation of your post (using Google Translate):

In Polars, you don't need to use with open() to open files for writing. Polars provides methods for writing files directly, such as df.write_excel() (or other formats), which directly handle file paths.

The correct usage is:

df.write_excel("dataframe.xlsx")

@littleblubber
Author

Thanks both. That's correct; however, that solution does not work if I'm using fsspec/s3fs file system classes to open the file and write to it.

As mentioned above, this means that df.write_excel behaves inconsistently with the other write methods, which do support file objects opened through file system classes.
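On affected versions, one possible workaround is to write the workbook into an in-memory BytesIO (which the issue description indicates is already handled) and then copy the bytes to the real file object. The helper write_via_buffer below is a hypothetical sketch, and the demo uses a stand-in writer instead of Polars so that it runs on its own.

```python
import io
import shutil


def write_via_buffer(write_fn, fileobj):
    """Call write_fn on a BytesIO, then copy the result into fileobj.

    write_fn is any callable that accepts a BytesIO target (hypothetical
    helper; intended use would be passing df.write_excel).
    Returns the number of bytes copied.
    """
    buffer = io.BytesIO()
    write_fn(buffer)
    buffer.seek(0)  # rewind so the copy starts at the first byte
    shutil.copyfileobj(buffer, fileobj)
    return buffer.tell()  # position after the copy equals the total size


# Stand-in for df.write_excel: writes a few bytes to the target it is given.
def fake_writer(target):
    target.write(b"demo")


sink = io.BytesIO()  # stands in for an fsspec/s3fs file object
copied = write_via_buffer(fake_writer, sink)
print(copied, sink.getvalue())
```

With Polars, the call would presumably look like write_via_buffer(df.write_excel, f) inside the with open(...) block. The trade-off is that the whole workbook is held in memory before it reaches the destination.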

deanm0000 added the good first issue, P-low, and A-io-spreadsheet labels and removed the needs triage label on Sep 26, 2024
github-project-automation bot moved this to Ready in Backlog on Sep 26, 2024

@not-so-rabh

Hello, I'm new here. Can I take this up?

Projects
Status: Done