Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Group By no longer supported in Rust WASM #17192

Closed
2 tasks done
brettsaunders21 opened this issue Jun 25, 2024 · 8 comments · Fixed by #20186 · May be fixed by #17793
Closed
2 tasks done

Group By no longer supported in Rust WASM #17192

brettsaunders21 opened this issue Jun 25, 2024 · 8 comments · Fixed by #20186 · May be fixed by #17793
Assignees
Labels
accepted Ready for implementation bug Something isn't working P-low Priority: low regression Issue introduced by a new release rust Related to Rust Polars

Comments

@brettsaunders21
Copy link

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

df.group_by(&[col("category"), col("type")]).head(1)

image

Log output

only supported on 64bit arch

Issue description

With the changes to Polars to make group by use row encoding, it seems that this required 64bit support which Web Assembly doesn't. It seems like it isn't possible to use group_by anymore in WASM

Expected behavior

Carry out group_by operations when compiling polars (with default feature false, lazy true) for WASM and use it in browser with wasm-bindgen

Installed versions

0.41.2

polars = { version = "0.41.2", default-features = false, features = [ 'fmt_no_tty', 'lazy', ] }
@brettsaunders21 brettsaunders21 added bug Something isn't working needs triage Awaiting prioritization by a maintainer rust Related to Rust Polars labels Jun 25, 2024
@deanm0000 deanm0000 added regression Issue introduced by a new release breaking rust Change that breaks backwards compatibility for the Rust crate P-medium Priority: medium labels Jun 25, 2024
@github-project-automation github-project-automation bot moved this to Ready in Backlog Jun 25, 2024
@deanm0000 deanm0000 removed needs triage Awaiting prioritization by a maintainer breaking rust Change that breaks backwards compatibility for the Rust crate labels Jun 25, 2024
@ritchie46 ritchie46 added P-low Priority: low and removed P-medium Priority: medium labels Jun 26, 2024
@ritchie46
Copy link
Member

Hmm... The usize should be changed to u64.

@daBlesr
Copy link

daBlesr commented Jul 22, 2024

Hi @ritchie46, not sure if I understand, usize is architecture agnostic right? Which 'problematic' use of usize are you exactly referring to in the code?

@daBlesr
Copy link

daBlesr commented Jul 22, 2024

I see that the offsets vector in RowsEncoded is casted later to i64 (after checking the types) and this causes the problem. Is your suggestion to change the type of offsets to Vec<u64>, such that they take the same number of bytes on memory?

@daBlesr
Copy link

daBlesr commented Jul 22, 2024

I quickly checked the impact of the above, and the code becomes a bit less appealing using u64 as it will also require changes on encode.rs casting usize to u64. What do you think of the following, where offset<usize> (naturally) remains as-is:

let offsets = if std::mem::size_of::<usize>() % std::mem::size_of::<i64>() == 0 {
    // We can safely convert usize to i64
    bytemuck::cast_vec::<usize, i64>(offsets)
} else {
    offsets.iter().map(|&i| i as i64).collect()
};

It checks if it can simply convert, or in worst-case (for wasm) it will have some performance impact because it needs to cast the encoded rows.

@brettsaunders21
Copy link
Author

Any update on this issue?

@daBlesr
Copy link

daBlesr commented Sep 16, 2024

I created a PR which waits for a review

@daBlesr
Copy link

daBlesr commented Oct 21, 2024

Hi @ritchie46, please take some time to review the PR. Let me know what you are unhappy with. Without this PR WASM does not work.

@chtotut
Copy link

chtotut commented Dec 6, 2024

Any update on this issue?
@daBlesr do you continue to work for PR or have you lost interest? (just a friendly ping)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation bug Something isn't working P-low Priority: low regression Issue introduced by a new release rust Related to Rust Polars
Projects
Archived in project
7 participants