-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: prevent exponential memory growth in UnionArray #3119
fix: prevent exponential memory growth in UnionArray #3119
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpivarski - it looks like the tests on Mac are segfaulting and not only for this PR. I'm checking locally.
@jpivarski I agree with your reasoning. I haven't given much more thought to alternative solutions, but we can always revisit this in future! |
I can confirm that I can reproduce the segfault locally on MacOS 11.6 with Changing back to |
Although we could hold back |
I uninstalled both % python
Python 3.11.5 (main, Sep 11 2023, 08:19:27) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
zsh: segmentation fault python The % lldb python test_pyarrow.py
(lldb) target create "python"
Current executable set to '/Users/yana/anaconda3/bin/python' (x86_64).
(lldb) settings set -- target.run-args "test_pyarrow.py"
(lldb) run
Process 34536 launched: '/Users/yana/anaconda3/bin/python' (x86_64)
Process 34536 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
frame #0: 0x0000000129c8685e libarrow.1601.dylib`malloc_conf_init_helper + 302
libarrow.1601.dylib`malloc_conf_init_helper:
-> 0x129c8685e <+302>: movzbl (%rbx), %eax
0x129c86861 <+305>: testb %al, %al
0x129c86863 <+307>: je 0x129c8936e ; <+11326>
0x129c86869 <+313>: movq %rbx, %rcx
(lldb)
|
That reproducer must depend on the environment because it didn't segfault immediately on import for me. I also noticed in the output that this is x86-64. My laptop is ARM (M2). That's likely relevant. Could you send as much information as you can to the Arrow developers? Since we also need a short-term solution, we can put a version cap on Arrow in our tests for now. There are a few pull requests that need it. |
I've opened an issue apache/arrow#41696 |
Awesome! It looks like they're dealing with it right away. |
This PR introduces massive data inefficiencies for HEP analyses that involve combinations of multiple flavors of leptons, or generally gluing multiple tables together. We should reconsider. I would propose we revert this PR for now, there doesn't seem to be an issue this fixes, but rather one that might happen. I appreciate that, but we should be more careful and reconcile this with the needs of the primary user base. |
return self._contents[index]._carry(nextcarry, True) | ||
return self._contents[index]._carry(nextcarry, False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced lazy projection with eager projection here.
I decided to just make
UnionArray.simplified
not allow lazy carries. It should always return correct results and it can only have a performance impact on wide RecordArrays that go throughUnionArray.simplified
(perhaps viaak.concatenate
).The new test would fail for the old code: for 5 loops, the NumpyArray length gets up to 32, and we assert
<= 2
.I only had to change one test, which was hard-coded to a particular Form. It was for testing
allow_noncanonical_form=True
inak.from_buffers
. I think that feature is still being tested with the updated test.@agoose77, am I missing anything here? Would you do anything differently?