Extend __getitem__ to include jagged and masked arrays in slices. #67
Comments
Here is an example of the pyarrow behavior:
In [1]: import pyarrow as pa
In [2]: pa.array(range(5))
Out[2]:
<pyarrow.lib.Int64Array object at 0x112289c90>
[
0,
1,
2,
3,
4
]
In [3]: pa.array(range(5)).take(pa.array([1, None, 2]))
Out[3]:
<pyarrow.lib.Int64Array object at 0x1122dd130>
[
1,
null,
2
]
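The take behavior above can be modeled without pyarrow. The following is a minimal pure-Python sketch of the semantics only; the helper name `take_with_nulls` is hypothetical and is not part of pyarrow or Awkward Array.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def take_with_nulls(data: Sequence[T], indices: Sequence[Optional[int]]) -> List[Optional[T]]:
    """Hypothetical helper: pick data[i] for each index; a None index yields None."""
    return [None if i is None else data[i] for i in indices]


# Same inputs as the pyarrow session above.
result = take_with_nulls(list(range(5)), [1, None, 2])
print(result)  # [1, None, 2]
```

The output has one entry per index, so a missing index produces a missing value rather than being dropped.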
pyarrow doesn't support it, but a logical extension should also do this:
>>> pa.array(range(5)).compress(pa.array([False, True, None, None, True]))
[
1,
null,
null,
4
]
Of course, "compress" is a terrible name, and pyarrow's compress function does the more logical thing: lossless compression. However, when these are used in
Step 1 is done (in PR #111):
>>> ak.Array(range(5))[ak.Array([1, None, 2])]
<Array [1, None, 2] type='3 * ?int64'>
Step 2 is done (also in PR #111):
>>> ak.Array(range(5))[ak.Array([False, True, None, None, True])]
<Array [1, None, None, 4] type='4 * ?int64'>
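To make the boolean case concrete, here is a small pure-Python model of the result shown for Step 2. It is only a sketch of the observed semantics, not Awkward Array's implementation, and `mask_with_nulls` is a hypothetical name.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def mask_with_nulls(data: Sequence[T], mask: Sequence[Optional[bool]]) -> List[Optional[T]]:
    """Hypothetical helper: True keeps the value, False drops it, None emits None."""
    if len(data) != len(mask):
        raise ValueError("mask must have the same length as data")
    out: List[Optional[T]] = []
    for value, keep in zip(data, mask):
        if keep is None:
            out.append(None)   # a missing mask entry becomes a missing value
        elif keep:
            out.append(value)  # True keeps the value; False contributes nothing
    return out


masked = mask_with_nulls(list(range(5)), [False, True, None, None, True])
print(masked)  # [1, None, None, 4]
```

Unlike integer indexing, the output length here depends on the mask: it equals the number of True entries plus the number of None entries.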
And all the jagged slices:
>>> array = ak.Array([[1.1, 2.2, 3.3], [], [4.4, 5.5], [6.6], [7.7, 8.8, 9.9]])
>>> ak.tolist(array[[[0, -1], [], [], [0, 0, 0], [-1, -2, -3]]])
[[1.1, 3.3], [], [], [6.6, 6.6, 6.6], [9.9, 8.8, 7.7]]
>>> ak.tolist(array[[[0, None, -1], [None], [], [0, None, 0], [-1, -2, -3]]])
[[1.1, None, 3.3], [None], [], [6.6, None, 6.6], [9.9, 8.8, 7.7]]
>>> ak.tolist(array[[[0, -1], None, [], [], None, [0, 0, 0], [-1, -2, -3]]])
[[1.1, 3.3], None, [], [], None, [6.6, 6.6, 6.6], [9.9, 8.8, 7.7]]
>>> ak.tolist(array[[[0, None, -1], None, [None], [], None, [0, 0, 0], [-1, -2, -3]]])
[[1.1, None, 3.3], None, [None], [], None, [6.6, 6.6, 6.6], [9.9, 8.8, 7.7]]
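The jagged integer slices above can be reproduced on plain nested lists. This is a sketch of the behavior inferred from the outputs in this comment, not the library's code; `jagged_take` is a hypothetical name. One detail read off the examples: a None at the outer level inserts a None into the result without consuming a row of the array, so only the non-None entries of the index have to line up with the rows.

```python
from typing import Any, List, Optional, Sequence


def jagged_take(array: Sequence[Sequence[Any]],
                index: Sequence[Optional[Sequence[Optional[int]]]]) -> List[Any]:
    """Hypothetical helper modeling jagged integer slicing with missing values."""
    non_missing = sum(1 for sub in index if sub is not None)
    if non_missing != len(array):
        raise ValueError("non-None index entries must match the number of rows")
    rows = iter(array)
    out: List[Any] = []
    for sub in index:
        if sub is None:
            out.append(None)  # outer None: emit None, do not advance to the next row
        else:
            row = next(rows)
            # inner None: emit None in place; integers index into this row
            out.append([None if i is None else row[i] for i in sub])
    return out


data = [[1.1, 2.2, 3.3], [], [4.4, 5.5], [6.6], [7.7, 8.8, 9.9]]
both_levels = jagged_take(
    data, [[0, None, -1], None, [None], [], None, [0, 0, 0], [-1, -2, -3]]
)
print(both_levels)
```

Run on the last example in this comment, the sketch gives the same nested list that `ak.tolist` reports there.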
And jagged mask (almost forgot the most important case!):
>>> ak.tolist(array[[[False, False, True], [], [True, True], [False], [True, False, True]]])
[[3.3], [], [4.4, 5.5], [], [7.7, 9.9]]
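A jagged mask can be modeled the same way on nested lists. The sketch below is an assumption-laden model of the outputs shown in this thread, including the missing-value variants in the comments that follow; `jagged_mask` is a hypothetical name and this is not Awkward Array's implementation. It deliberately does not check inner lengths, because one of the thread's own examples pairs an empty list with the mask `[None]` and gets `[None]`.

```python
from typing import Any, List, Optional, Sequence


def jagged_mask(array: Sequence[Sequence[Any]],
                mask: Sequence[Optional[Sequence[Optional[bool]]]]) -> List[Any]:
    """Hypothetical helper modeling jagged boolean masking with missing values."""
    non_missing = sum(1 for sub in mask if sub is not None)
    if non_missing != len(array):
        raise ValueError("non-None mask entries must match the number of rows")
    rows = iter(array)
    out: List[Any] = []
    for sub in mask:
        if sub is None:
            out.append(None)  # outer None: emit None without consuming a row
            continue
        row = next(rows)
        kept: List[Any] = []
        for j, keep in enumerate(sub):
            if keep is None:
                kept.append(None)    # inner None: emit None at this position
            elif keep:
                kept.append(row[j])  # True keeps the element at the same position
        out.append(kept)
    return out


data = [[1.1, 2.2, 3.3], [], [4.4, 5.5], [6.6], [7.7, 8.8, 9.9]]
plain = jagged_mask(
    data, [[False, False, True], [], [True, True], [False], [True, False, True]]
)
print(plain)  # [[3.3], [], [4.4, 5.5], [], [7.7, 9.9]]
```

Inner None entries are positional in this model: `[None, True]` applied to `[4.4, 5.5]` gives `[None, 5.5]`, which matches the thread's output for that row.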
This can also have
>>> ak.tolist(array[[[False, False, True], None, [], None, [True, True], [False], [True, False, True]]])
[[3.3], None, [], None, [4.4, 5.5], [], [7.7, 9.9]]
Getting
>>> ak.tolist(array[[[False, True, None], [None], [None, True], [False], [True, False, True]]])
[[2.2, None], [None], [None, 5.5], [], [7.7, 9.9]]
You can even do them at both levels.
>>> ak.tolist(array[[[False, True, None], None, [None], None, [None, True], [False], [True, False, True]]])
[[2.2, None], None, [None], None, [None, 5.5], [], [7.7, 9.9]]
So this issue is closed. The
Relies upon #66.
Follow pyarrow.Array's behavior for slicing with masked arrays (IndexedOptionArray, BitMaskedArray, and eventually ByteMaskedArray).
Will need to extend Slice hierarchy and add jagged and masked cases to Content::getitem_*.