Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Ensure type object of inputs for cached any-value conversion functions are kept alive #19866

Merged
merged 3 commits into from
Nov 20, 2024

Conversation

nameexhaustion
Copy link
Collaborator

@nameexhaustion nameexhaustion commented Nov 19, 2024

Fixes a non-deterministic bug in AnyValue conversion from Python, where the incorrect object conversion function would be used when constructing from a Python object. This occurs when there are multiple dynamically created classes being used.

We have a cache static LUT for get_conversion_function keyed by the usize pointer address of the Python type class. This didn't account for dynamically constructed type objects that may be freed during execution. If a different type object gets allocated to the same location as a previously freed type object, we would incorrectly use the conversion_function meant for the previous type object.

Traceback

This is a traceback that I believe to be caused by the bug:

2024-11-19T11:31:25.2828286Z ================================== FAILURES ===================================
2024-11-19T11:31:25.2829036Z __________________________ test_err_transpose_object __________________________
2024-11-19T11:31:25.2830362Z [gw1] win32 -- Python 3.12.7 D:\a\polars\polars\py-polars\.venv\Scripts\python.exe
2024-11-19T11:31:25.2831180Z tests\unit\operations\test_transpose.py:207: in test_err_transpose_object
2024-11-19T11:31:25.2831788Z     pl.DataFrame([CustomObject()]).transpose()
2024-11-19T11:31:25.2832264Z polars\dataframe\frame.py:375: in __init__
2024-11-19T11:31:25.2832703Z     self._df = sequence_to_pydf(
2024-11-19T11:31:25.2833235Z polars\_utils\construction\dataframe.py:461: in sequence_to_pydf
2024-11-19T11:31:25.2833795Z     return _sequence_to_pydf_dispatcher(
2024-11-19T11:31:25.2834395Z C:\hostedtoolcache\windows\Python\3.12.7\x64\Lib\functools.py:907: in wrapper
2024-11-19T11:31:25.2835180Z     return dispatch(args[0].__class__)(*args, **kw)
2024-11-19T11:31:25.2835882Z polars\_utils\construction\dataframe.py:534: in _sequence_to_pydf_dispatcher
2024-11-19T11:31:25.2836493Z     return to_pydf(**common_params)
2024-11-19T11:31:25.2837416Z polars\_utils\construction\dataframe.py:749: in _sequence_of_elements_to_pydf
2024-11-19T11:31:25.2838539Z     pl.Series(
2024-11-19T11:31:25.2839362Z polars\series\series.py:289: in __init__
2024-11-19T11:31:25.2839822Z     self._s = sequence_to_pyseries(
2024-11-19T11:31:25.2840576Z polars\_utils\construction\series.py:285: in sequence_to_pyseries
2024-11-19T11:31:25.2841192Z     srs = PySeries.new_from_any_values(name, values, strict)
2024-11-19T11:31:25.2843104Z E   pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'AttributeError'>, value: AttributeError("'CustomObject' object has no attribute 'tzinfo'"), traceback: Some(<traceback object at 0x00000192592C8940>) }
2024-11-19T11:31:25.2844496Z ---------------------------- Captured stderr call -----------------------------
2024-11-19T11:31:25.2845236Z thread '<unnamed>' panicked at crates\polars-python\src\conversion\any_value.rs:231:18:
2024-11-19T11:31:25.2846649Z called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'AttributeError'>, value: AttributeError("'CustomObject' object has no attribute 'tzinfo'"), traceback: Some(<traceback object at 0x00000192592C8940>) }
2024-11-19T11:31:25.2847697Z stack backtrace:
2024-11-19T11:31:25.2848202Z note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
2024-11-19T11:31:25.2848867Z =========================== short test summary info ===========================
2024-11-19T11:31:25.2850669Z FAILED tests/unit/operations/test_transpose.py::test_err_transpose_object - pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'AttributeError'>, value: AttributeError("'CustomObject' object has no attribute 'tzinfo'"), traceback: Some(<traceback object at 0x00000192592C8940>) }
2024-11-19T11:31:25.2852293Z ===== 1 failed, 13563 passed, 40 skipped, 2 xfailed in 254.30s (0:04:14) ======
2024-11-19T11:31:25.6771944Z ##[error]Process completed with exit code 1.
2024-11-19T11:31:25.6996072Z Post job cleanup.
2024-11-19T11:31:25.9725140Z [command]"C:\Program Files\Git\bin\git.exe" version
2024-11-19T11:31:25.9968353Z git version 2.47.0.windows.1
2024-11-19T11:31:26.0032882Z Temporarily overriding HOME='D:\a\_temp\09b7a1b4-b0e1-44b1-b48b-64a41195c908' before making global git config changes
2024-11-19T11:31:26.0034380Z Adding repository directory to the temporary git global config as a safe directory
2024-11-19T11:31:26.0044176Z [command]"C:\Program Files\Git\bin\git.exe" config --global --add safe.directory D:\a\polars\polars
2024-11-19T11:31:26.0352376Z [command]"C:\Program Files\Git\bin\git.exe" config --local --name-only --get-regexp core\.sshCommand
2024-11-19T11:31:26.0574454Z [command]"C:\Program Files\Git\bin\git.exe" submodule foreach --recursive "sh -c \"git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :\""
2024-11-19T11:31:26.5085175Z [command]"C:\Program Files\Git\bin\git.exe" config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
2024-11-19T11:31:26.5274100Z http.https://github.com/.extraheader
2024-11-19T11:31:26.5309604Z [command]"C:\Program Files\Git\bin\git.exe" config --local --unset-all http.https://github.com/.extraheader
2024-11-19T11:31:26.5537462Z [command]"C:\Program Files\Git\bin\git.exe" submodule foreach --recursive "sh -c \"git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :\""
2024-11-19T11:31:26.9699703Z Cleaning up orphan processes
2024-11-19T11:31:27.0051797Z Terminate orphan process: pid (1268) (vctip)
2024-11-19T11:31:27.0203632Z Terminate orphan process: pid (7152) (MSBuild)
2024-11-19T11:31:27.0339223Z Terminate orphan process: pid (5824) (conhost)

@nameexhaustion nameexhaustion changed the title poc (wip) Nov 19, 2024
Copy link

codecov bot commented Nov 19, 2024

Codecov Report

Attention: Patch coverage is 90.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 79.43%. Comparing base (03ba07a) to head (406decd).
Report is 32 commits behind head on main.

Files with missing lines Patch % Lines
crates/polars-python/src/conversion/any_value.rs 90.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19866      +/-   ##
==========================================
- Coverage   79.44%   79.43%   -0.01%     
==========================================
  Files        1550     1551       +1     
  Lines      215192   215243      +51     
  Branches     2447     2452       +5     
==========================================
+ Hits       170952   170977      +25     
- Misses      43683    43708      +25     
- Partials      557      558       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@nameexhaustion nameexhaustion changed the title (wip) fix: Ensure type object of target for cached any-value conversion function is kept alive Nov 19, 2024
@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars and removed title needs formatting labels Nov 19, 2024
@nameexhaustion nameexhaustion changed the title fix: Ensure type object of target for cached any-value conversion function is kept alive fix: Ensure type object of inputs for cached any-value conversion functions are kept alive Nov 19, 2024

Python::with_gil(|py| {
let conversion_function = get_conversion_function(ob, py, allow_object)?;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix a regression from ebd5302#diff-778ed52e7b4a402304a62e51974c92208369baf61c50ad78eca715b7f6f18d89R478-R485 that caused us to always call get_conversion_function regardless of whether it existed in the cache.

#[derive(Debug, Clone)]
pub struct TypeObjectKey {
#[allow(unused)]
type_object: Arc<Py<PyType>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use PyObject here? I believe we don't need the extra Arc alloc that way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed - I was able to remove the Arc

Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one. Probably was an interesting debug. ;)

@nameexhaustion nameexhaustion marked this pull request as draft November 20, 2024 08:29
@nameexhaustion nameexhaustion marked this pull request as ready for review November 20, 2024 08:59
@ritchie46 ritchie46 merged commit 5f61d70 into pola-rs:main Nov 20, 2024
25 checks passed
@c-peters c-peters added the accepted Ready for implementation label Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation fix Bug fix python Related to Python Polars rust Related to Rust Polars
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants