Code example in keep_name does not show functionality #5378

Closed
2 tasks done
mcrumiller opened this issue Oct 30, 2022 · 3 comments
Labels
documentation Improvements or additions to documentation python Related to Python Polars

Comments

@mcrumiller
Contributor

mcrumiller commented Oct 30, 2022

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

In the documentation for keep_name, the example's output is identical whether or not keep_name is used:

Examples

A groupby aggregation often changes the name of a column. With keep_name we can keep the original name of the column

>>> df = pl.DataFrame(
...     {
...         "a": [1, 2, 3],
...         "b": ["a", "b", None],
...     }
... )
>>> df.groupby("a").agg(pl.col("b").list()).sort(by="a")
shape: (3, 2)
┌─────┬───────────┐
│ a   ┆ b         │
│ --- ┆ ---       │
│ i64 ┆ list[str] │
╞═════╪═══════════╡
│ 1   ┆ ["a"]     │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ 2   ┆ ["b"]     │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ 3   ┆ [null]    │
└─────┴───────────┘

Keep the original column name:

>>> df.groupby("a").agg(pl.col("b").list().keep_name()).sort(by="a")
shape: (3, 2)
┌─────┬───────────┐
│ a   ┆ b         │
│ --- ┆ ---       │
│ i64 ┆ list[str] │
╞═════╪═══════════╡
│ 1   ┆ ["a"]     │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ 2   ┆ ["b"]     │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┤
│ 3   ┆ [null]    │
└─────┴───────────┘

Reproducible example

n/a

Expected behavior

The example should show a change in behavior when keep_name is used. The first example needs to be a case whereby the column name changes.

I can't seem to find such an instance, but if someone wants to point me to an example where the column name changes, I can create a PR.

Installed versions

Replace this line with the output of pl.show_versions(), leave the backticks in place
@mcrumiller mcrumiller added bug Something isn't working python Related to Python Polars labels Oct 30, 2022
@ritchie46
Member

Right! Can you think of one? Maybe one where we do multi-column arithmetic, e.g. (6 * pl.all()).keep_name()

@mcrumiller
Contributor Author

mcrumiller commented Oct 30, 2022

Multi-column arithmetic doesn't seem to change the name either. A simple example would be something like:

import polars as pl

df = pl.DataFrame({
    'a': [1, 2, 3, 4, 5]
})
df.select(pl.col('a').alias('b'))
shape: (5, 1)
┌─────┐
│ b   │
│ --- │
│ i64 │
╞═════╡
│ 1   │
├╌╌╌╌╌┤
│ 2   │
├╌╌╌╌╌┤
│ 3   │
├╌╌╌╌╌┤
│ 4   │
├╌╌╌╌╌┤
│ 5   │
└─────┘

With keep_name:

df.select(pl.col('a').alias('b').keep_name())
shape: (5, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 1   │
├╌╌╌╌╌┤
│ 2   │
├╌╌╌╌╌┤
│ 3   │
├╌╌╌╌╌┤
│ 4   │
├╌╌╌╌╌┤
│ 5   │
└─────┘

However, it's not the greatest example.

@ritchie46
Member

See: #3773 (comment)

@zundertj zundertj added documentation Improvements or additions to documentation and removed bug Something isn't working labels Oct 31, 2022
ghuls added a commit to ghuls/polars that referenced this issue Nov 3, 2022
@zundertj zundertj closed this as completed Nov 6, 2022
3 participants