Unnecessary lines with starfile=0.5.10 #79

Closed
hanjinliu opened this issue Jan 7, 2025 · 9 comments · Fixed by #80

Comments

@hanjinliu
Collaborator

  • starfile version: 0.5.10
  • Python version: 3.11.9
  • Operating System: Windows 11

Description

Since 0.5.10, writing a DataFrame to a star file results in additional lines for each row.
The test workflow of one of my repos that depends on starfile fails only on Windows, so this may be a Windows-specific problem.

What I Did

import tempfile
from pathlib import Path

import pandas as pd
import starfile

df = pd.DataFrame({"a": [1, 2, 3]})
with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "x.star"
    # Write the DataFrame as a STAR file, then print the file contents
    starfile.write(df, path)
    print(path.read_text())
# Created by the starfile Python package (version 0.5.10) at 22:02:00 on 07/01/2025


data_

loop_
_a #1
1

2

3




@alisterburt
Collaborator

Really weird, thanks for the report - I'll take a look today 🙂

@alisterburt
Collaborator

Unable to reproduce on macOS, so it seems Windows-specific as you suggested.

# Created by the starfile Python package (version 0.5.10) at 11:15:19 on 07/01/2025


data_

loop_
_a #1
1
2
3

@alisterburt
Collaborator

is this pandas version specific? does up/downgrading pandas affect the behavior?

@hanjinliu
Collaborator Author

Thank you for your reply, @alisterburt !
I was using the latest pandas=2.2.3, and pandas=2.1.4 resulted in the same problem. pandas<2.1 did not work at all because the method map is not defined in the older versions.
I think it is easier for me to fix this bug because I use Windows. Do you have any idea of which commits are probably related between 0.5.8 and 0.5.10?

@alisterburt
Collaborator

thanks for taking a look yourself
#60 (812c6bb) was v0.5.8 so everything more recent than May 24th 2024 https://github.com/teamtomo/starfile/commits/main/

I suspect maybe @jahooker's updates to the writing internals are the issue but I'm not sure...

@jojoelfe
Collaborator

jojoelfe commented Jan 8, 2025

Maybe try replacing '\n' in

).split('\n'):

with os.linesep
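
To illustrate why splitting on '\n' can matter here, a minimal sketch follows. It assumes the text being split uses Windows-style '\r\n' line endings; the thread does not confirm exactly where those come from. The helper names are hypothetical and not part of starfile, and str.splitlines() is shown only as one portable alternative, not necessarily the fix adopted in #80.

```python
def split_on_lf(text: str) -> list[str]:
    """Split on '\n' only; a '\r' from a CRLF ending stays attached to each line."""
    return text.split("\n")


def split_portably(text: str) -> list[str]:
    """Split on any common line ending ('\n', '\r\n', '\r')."""
    return text.splitlines()


# Hypothetical input: three data rows terminated with Windows-style CRLF
crlf_text = "1\r\n2\r\n3\r\n"

print(split_on_lf(crlf_text))     # ['1\r', '2\r', '3\r', '']
print(split_portably(crlf_text))  # ['1', '2', '3']
```

If lines that still carry a trailing '\r' are later written out with their own newline appended, each row can end up with two line breaks instead of one.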

@jojoelfe
Collaborator

jojoelfe commented Jan 8, 2025

@alisterburt was there a reason why macos and windows are disabled here:

platform: [ubuntu-latest, ] # ...macos-latest, windows-latest]

@alisterburt
Collaborator

@jojoelfe I remember looking at the amount of free CI, how big the matrix was and thinking it might be too much... open to turning them back on :-)

@jahooker
Contributor

jahooker commented Jan 8, 2025

Hi! I think @hanjinliu's changes will fix things. I did assume that the line separator would always be '\n'. My bad!
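
A rough sketch of how that assumption could produce the blank lines in the report, assuming the writer emits lines that already end in '\r\n' to a file opened in text mode on Windows. io.TextIOWrapper with newline='\r\n' is used here to mimic Windows newline translation on any platform; simulate_windows_text_write is a hypothetical helper, not starfile code.

```python
import io


def simulate_windows_text_write(text: str) -> bytes:
    """Write text through a wrapper that translates '\n' to '\r\n',
    as text-mode files do by default on Windows, and return the raw bytes."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline="\r\n")
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # keep the wrapper from closing the buffer on cleanup
    return data


# Assumed input: rows that already carry CRLF endings before being written
written = simulate_windows_text_write("1\r\n2\r\n3\r\n")

print(written)                             # b'1\r\r\n2\r\r\n3\r\r\n'
print(written.decode("utf-8").splitlines())  # ['1', '', '2', '', '3', '']
```

Each '\n' is expanded to '\r\n' on write, so a pre-existing '\r\n' becomes '\r\r\n', which readers interpret as a row followed by an empty line.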
