pl.Expr.list.get raises error on null values. #17252

Closed
2 tasks done
dalejung opened this issue Jun 27, 2024 · 5 comments · Fixed by #17262 or #17276
Labels: A-dtype-list/array, accepted, bug, P-high, python, regression

Comments

@dalejung

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

    import polars as pl

    df = pl.DataFrame({
        'name': ['BOB-3', 'BOB', None],
    })

    split = df.with_columns(
        pl.col('name').str.split('-')
    )
    res = split.with_columns(pl.col('name').list.get(0))

Log output

No response

Issue description

Previously, pl.Expr.list.get propagated null list values to the output. On 1.0.0-rc.2 it raises ComputeError: get index is out of bounds when a list value is null.

Expected behavior

I would expect pl.Expr.list.get to not error on null list values.
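The expected behavior can be sketched in plain Python. This is only a model of the semantics being asked for, not Polars' implementation: the helper `list_get` is hypothetical, and `IndexError` stands in for the `ComputeError` that Polars raises.

```python
def list_get(rows, index, null_on_oob=False):
    """Model of the expected semantics (hypothetical helper, not Polars code)."""
    out = []
    for row in rows:
        if row is None:
            # A null list propagates as null; it is never an out-of-bounds error.
            out.append(None)
        elif -len(row) <= index < len(row):
            out.append(row[index])
        elif null_on_oob:
            # Index genuinely out of bounds, caller asked for null instead of an error.
            out.append(None)
        else:
            raise IndexError("get index is out of bounds")
    return out


names = ["BOB-3", "BOB", None]
# Mirror of str.split('-'): a null string yields a null list.
split = [None if n is None else n.split("-") for n in names]

first = list_get(split, 0)
print(first)  # ['BOB', 'BOB', None]
```

Under this model only a non-null list that is too short can trigger the out-of-bounds path, so index 0 on the example data succeeds while index 1 would still raise unless `null_on_oob=True`.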

Installed versions

--------Version info---------
Polars:               1.0.0-rc.2
Index type:           UInt32
Platform:             Linux-6.9.6-arch1-1-x86_64-with-glibc2.39
Python:               3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:38:13) [GCC 12.3.0]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               2024.5.0
gevent:               <not installed>
great_tables:         <not installed>
hvplot:               0.9.2.post8+g4cb29ba
matplotlib:           3.8.4
nest_asyncio:         1.6.0
numpy:                1.26.4
openpyxl:             3.1.2
pandas:               3.0.0.dev0+756.ge8e6be071c
pyarrow:              17.0.0.dev225+g72e7e4c6c
pydantic:             2.7.2
pyiceberg:            <not installed>
sqlalchemy:           <not installed>
torch:                <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
@dalejung dalejung added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Jun 27, 2024
@dalejung dalejung changed the title pl.Expr.list.get raises error on null values. pl.Expr.list.get raises error on null values. Jun 27, 2024
@lscheilling

Wouldn't this error be related to the new null_on_oob parameter in pl.Expr.list.get?

Source Code

@mcrumiller
Contributor

mcrumiller commented Jun 27, 2024

I don't think that null values should raise on oob even when the parameter is False. If they did, raising on out-of-bounds would only be usable on series without nulls; with any nulls present the parameter would have to be set to True, because otherwise every call would fail as soon as a single null value is present.
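To make that argument concrete, here is a small plain-Python comparison of the two possible policies for null rows when out-of-bounds indices raise. The function `strict_get` and its `nulls_count_as_oob` flag are hypothetical names invented for this sketch; they are not part of Polars.

```python
def strict_get(rows, index, nulls_count_as_oob):
    """Raise-on-out-of-bounds lookup under two policies for null rows (sketch only)."""
    out = []
    for row in rows:
        if row is None:
            if nulls_count_as_oob:
                # Policy A: a null row is treated like an out-of-bounds index.
                raise IndexError("get index is out of bounds")
            # Policy B: a null row simply propagates.
            out.append(None)
            continue
        if not -len(row) <= index < len(row):
            raise IndexError("get index is out of bounds")
        out.append(row[index])
    return out


rows = [["BOB", "3"], ["BOB"], None]

# Policy A: fails even though index 0 is valid for every non-null list.
try:
    strict_get(rows, 0, nulls_count_as_oob=True)
    failed = False
except IndexError:
    failed = True

# Policy B: the same call succeeds and keeps the null.
propagated = strict_get(rows, 0, nulls_count_as_oob=False)
print(failed, propagated)  # True ['BOB', 'BOB', None]
```

With policy A, a single null anywhere in the column makes the strict mode unusable, which is the situation described in this issue.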

@mcrumiller
Contributor

Also, note that this is not the case for arrays:

    s = pl.Series([None, [1]], dtype=pl.Array(pl.Int32, 1))
    s.arr.get(0, null_on_oob=False)  # returns [null, 1]
    s.arr.get(0, null_on_oob=True)   # returns [null, 1]

@stinodego stinodego added regression Issue introduced by a new release P-high Priority: high A-dtype-list/array Area: list/array data type and removed needs triage Awaiting prioritization by a maintainer labels Jun 28, 2024
@github-project-automation github-project-automation bot moved this to Ready in Backlog Jun 28, 2024
@stinodego stinodego added this to the 1.0.0 milestone Jun 28, 2024
@stinodego stinodego moved this from Ready to Next in Backlog Jun 28, 2024
@stinodego
Member

Thanks for the report. This is most probably a side-effect of #16841

@stinodego
Member

Re-opening as this wasn't fixed completely - see #17276

@github-project-automation github-project-automation bot moved this from Ready to Done in Backlog Jun 29, 2024
@c-peters c-peters added the accepted Ready for implementation label Jul 1, 2024
6 participants