
Support writing nested lists to parquet #1746

Merged · 10 commits · May 27, 2022

Conversation

@tustvold (Contributor) commented May 25, 2022

Which issue does this PR close?

Closes #1744.

Rationale for this change

The existing logic to handle nested types was incredibly convoluted, performed a lot of unnecessary work when writing non-nested types, and it was unclear how it could be made to handle more than one layer of nested lists.

What changes are included in this PR?

This PR therefore reworks the levels computation logic to hopefully be easier to understand, and support arbitrary nesting of lists.

Are there any user-facing changes?

No
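
For context on what this enables, here is a minimal sketch (not from the PR itself) of writing a doubly-nested list column through the ArrowWriter. It assumes the arrow/parquet 15-era builder APIs, in which the append methods return Result; the column and file names are arbitrary:

```rust
use std::fs::File;
use std::sync::Arc;

use arrow::array::{ArrayRef, Int32Builder, ListBuilder};
use arrow::record_batch::RecordBatch;
use parquet::arrow::ArrowWriter;

fn write_nested_list() -> Result<(), Box<dyn std::error::Error>> {
    // Build a List<List<Int32>> column with values [[1, 2], [3]], null, [[]]
    let mut builder = ListBuilder::new(ListBuilder::new(Int32Builder::new(0)));

    builder.values().values().append_value(1)?;
    builder.values().values().append_value(2)?;
    builder.values().append(true)?; // close inner [1, 2]
    builder.values().values().append_value(3)?;
    builder.values().append(true)?; // close inner [3]
    builder.append(true)?; // close outer [[1, 2], [3]]

    builder.append(false)?; // a null outer element

    builder.values().append(true)?; // an empty inner list
    builder.append(true)?; // close outer [[]]

    let array = Arc::new(builder.finish()) as ArrayRef;
    let batch = RecordBatch::try_from_iter([("col", array)])?;

    // Writing a list nested more than one level deep previously failed (#1744)
    let file = File::create("nested.parquet")?;
    let mut writer = ArrowWriter::try_new(file, batch.schema(), None)?;
    writer.write(&batch)?;
    writer.close()?;
    Ok(())
}
```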

@github-actions bot added the parquet label (Changes to the parquet crate) May 25, 2022
@tustvold (Contributor Author)

This actually has an issue with nulls in struct arrays with non-null slices in child list arrays... Working on fixing...

@tustvold force-pushed the rework-writer-levels branch 2 times, most recently from 94b30f0 to b50ba82 on May 26, 2022 15:41
@tustvold force-pushed the rework-writer-levels branch from b50ba82 to 9ede644 on May 26, 2022 15:42
@tustvold marked this pull request as ready for review May 26, 2022 15:46
@tustvold (Contributor Author)

I think this is now ready for review. I'm going to work on adding a few more esoteric tests, but the meat of it is ready.

@tustvold changed the title from "Support writing arbitrarily nested arrow arrays" to "Support writing nested lists to parquet" May 26, 2022

#[test]
fn test_calculate_array_levels_twitter_example() {
@tustvold (Contributor Author)

I'm currently working through porting these tests

@codecov-commenter commented May 26, 2022

Codecov Report

Merging #1746 (8b3c37f) into master (722fcfc) will increase coverage by 0.18%.
The diff coverage is 97.49%.

❗ Current head 8b3c37f differs from pull request most recent head abe5c36. Consider uploading reports for the commit abe5c36 to get more accurate results

@@            Coverage Diff             @@
##           master    #1746      +/-   ##
==========================================
+ Coverage   83.27%   83.46%   +0.18%     
==========================================
  Files         195      196       +1     
  Lines       55896    55895       -1     
==========================================
+ Hits        46549    46653     +104     
+ Misses       9347     9242     -105     
Impacted Files Coverage Δ
parquet/src/arrow/levels.rs 96.69% <95.90%> (+11.89%) ⬆️
parquet/src/arrow/arrow_writer.rs 97.77% <97.56%> (+0.11%) ⬆️
arrow/src/compute/kernels/aggregate.rs 75.26% <100.00%> (+2.03%) ⬆️
arrow/src/compute/kernels/concat_elements.rs 100.00% <100.00%> (ø)
arrow/src/ipc/reader.rs 90.74% <100.00%> (+2.01%) ⬆️
arrow/src/datatypes/datatype.rs 65.42% <0.00%> (-0.38%) ⬇️
parquet_derive/src/parquet_field.rs 65.75% <0.00%> (-0.23%) ⬇️
arrow/src/array/transform/mod.rs 86.85% <0.00%> (+0.11%) ⬆️
parquet/src/encodings/encoding.rs 93.65% <0.00%> (+0.19%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 722fcfc...abe5c36.

@alamb (Contributor) left a comment

That certainly seems to have made the code simpler, as measured by lines ;) Thank you for the efforts @tustvold

I also read all the code carefully. While I can't say I got every line, I would feel much better about debugging and maintaining the code after this PR than before.

There appear to be some non-trivial changes to the tests and the cases that are covered -- I wonder if this was intentional or if you are still working to port the remaining tests?

Also, I don't really have a sense of what end-to-end testing we have. For example, do we have tests that round-trip StructArrays to and from parquet? That might be worth writing if we don't have them yet (as a follow-on PR?).

cc @nevi-me and @helgikrs in case you have time or interest in reviewing this code

I wonder if @ahmedriza has time to try his reproducer out with the code in this PR?
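
For the round-trip testing suggested above, a minimal sketch of what such a test could look like, assuming the parquet 15-era reader API (SerializedFileReader plus ParquetFileArrowReader); the batch size and the final equality check are illustrative:

```rust
use std::fs::File;
use std::sync::Arc;

use arrow::record_batch::RecordBatch;
use parquet::arrow::{ArrowReader, ArrowWriter, ParquetFileArrowReader};
use parquet::file::reader::SerializedFileReader;

fn round_trip(batch: RecordBatch, path: &str) -> Result<(), Box<dyn std::error::Error>> {
    // Write the batch (which may contain StructArrays, nested lists, etc.)
    let file = File::create(path)?;
    let mut writer = ArrowWriter::try_new(file, batch.schema(), None)?;
    writer.write(&batch)?;
    writer.close()?;

    // Read it back
    let reader = SerializedFileReader::new(File::open(path)?)?;
    let mut arrow_reader = ParquetFileArrowReader::new(Arc::new(reader));
    let read: Vec<RecordBatch> = arrow_reader
        .get_record_reader(1024)?
        .collect::<Result<Vec<_>, _>>()?;

    // What is read out should be logically equal to what was written in
    assert_eq!(read.len(), 1);
    assert_eq!(read[0], batch);
    Ok(())
}
```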

let indices = levels.filter_array_indices();
// Slice array according to computed offset and length
let column = column.slice(levels.offset, levels.length);
// TODO: Avoid filtering if no need
Contributor
Should this be a follow-on ticket to track? It doesn't seem like this PR is any worse in this department.

@tustvold (Contributor Author) · May 26, 2022

This is actually a holdover from an intermediate version of this PR. You can only avoid doing the "take" operation if there are no nulls, which is relatively rare; will remove.
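
For reference, a hypothetical sketch of the optimisation being discussed (skip the gather when the definition levels show every leaf slot is populated); max_def_level and the u32 index type are assumptions made for illustration, not the crate's actual internals:

```rust
use arrow::array::{ArrayRef, UInt32Array};
use arrow::compute::take;
use arrow::error::Result;

fn filter_leaves(
    column: ArrayRef,   // already sliced to levels.offset / levels.length
    def_levels: &[i16],
    max_def_level: i16,
    indices: Vec<u32>,  // e.g. the result of filter_array_indices()
) -> Result<ArrayRef> {
    if def_levels.iter().all(|&d| d == max_def_level) {
        // No nulls or masked slots anywhere: write the sliced column as-is
        Ok(column)
    } else {
        // Some slots are null/masked: gather only the values to be written
        take(column.as_ref(), &UInt32Array::from(indices), None)
    }
}
```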

/// i.e. a leaf array with no children
fn is_leaf(data_type: &DataType) -> bool {
matches!(
data_type,
Contributor

I wonder if this would be more future-proof if it were !matches!(DataType::Dictionary....) -- aka explicitly list out the types that are handled in the rest of this module rather than trying to enumerate the converse.

I am just thinking about what would happen if someone added a new non-leaf type to arrow's DataType.

@tustvold (Contributor Author) · May 26, 2022

I'm not sure I follow? The intent is that we can match just dictionaries where the value type is a leaf, as this can be handled transparently. We don't support complex dictionary value types (and for what it is worth neither does Arrow C++). We used to simply assume that dictionary types were primitive; this is completing a TODO I encountered in the code: https://github.com/apache/arrow-rs/pull/1746/files/b104aba76a6868e4296eb0f5a7b4fd0b8960eb62#diff-4b715628e2e0ae6f66e590227d9587cd5f2155055a59ad6b7b0dc7b1914ad8edL316

Edit: Oh I see what you're saying, I think if a new arrow type were added we would likely need additional work in the parquet writer to support it, and so I think not automatically supporting it is probably safer.

Contributor

> Edit: Oh I see what you're saying, I think if a new arrow type were added we would likely need additional work in the parquet writer to support it, and so I think not automatically supporting it is probably safer.

Yes, thank you -- this is what I was trying to say 💯

Contributor

> We don't support complex dictionary value types (and for what it is worth neither does Arrow C++)

I didn't realize that. TIL!

/// A helper to construct [`LevelInfo`] from a potentially nested [`Field`]
enum LevelInfoBuilder {
Primitive(LevelInfo),
List(Box<LevelInfoBuilder>, LevelContext),
Contributor

It might help to add some docstrings here about what the Box<LevelInfoBuilder> and Vec<LevelBuilder> represent

Comment on lines 312 to 314
for _ in 0..len {
def_levels.push(ctx.def_level - 1);
}
Contributor

I wonder if this would be more "idiomatic" if it were written like:

def_levels.extend(std::iter::repeat(ctx.def_level - 1).take(len))

? Though I admit the current code seems more readable 👍

let mut last_non_null_idx = None;
let mut last_null_idx = None;

// TODO: BitChunkIterator
Contributor

Is this a TODO for this PR or for a follow-on?

};
let array_offsets = vec![0, 2, 2, 4, 8, 11];
let array_mask = vec![true, false, true, true, true];
let leaf_array = Int32Array::from_iter([0, 0, 2, 2, 3, 3, 3, 3, 4, 4, 4]);
Contributor

we seem to have lost a substantial number of verifications in this test. Is that intended?

@tustvold (Contributor Author)

The diff is being unhelpful; once I've ported the remaining tests it should sort itself out.

Contributor

Will check it again when you are done porting the test

offset: 0,
length: 5,
fn test_struct_mask_list() {
// Test a struct array masking a list
Contributor

I don't understand what "masking a list" means in this case

@tustvold (Contributor Author)

I've reworded to hopefully make it clearer

@nevi-me self-requested a review May 26, 2022 19:43
@ahmedriza commented May 26, 2022

This is fantastic. Checked my test cases against this PR and they work perfectly now. Also checked a few other combinations and they are all handled correctly now.

  • Empty arrays in bids column:
+---+----------------------+----+
|id |prices                |bids|
+---+----------------------+----+
|t1 |[[GBP, 3.14, [[LON]]]]|[]  |
|t2 |[[USD, 4.14, [[NYC]]]]|[]  |
+---+----------------------+----+
  • Nulls in bids column:
+---+----------------------+----+
|id |prices                |bids|
+---+----------------------+----+
|t1 |[[GBP, 3.14, [[LON]]]]|null|
|t2 |[[USD, 4.14, [[NYC]]]]|null|
+---+----------------------+----+
  • Mixture of non-null and null in bids column
+---+----------------------+----------------------+
|id |prices                |bids                  |
+---+----------------------+----------------------+
|t1 |[[GBP, 3.14, [[LON]]]]|[[GBP, 3.04, [[LON]]]]|
|t2 |[[USD, 4.14, [[NYC]]]]|null                  |
+---+----------------------+----------------------+

Amazing work. Thanks for the quick fix.

// 2 3 [4] are 0
// 4 5 6 7 [8] are 1 (defined at level 1 only)
// 8 9 10 [11] are 2 (defined at both levels)
definition: vec![0, 0, 1, 0, 0, 3, 3, 3, 3, 3, 3, 3],
@tustvold (Contributor Author) · May 26, 2022

This test is actually wrong; the levels are incoherent. This can be clearly seen from the first two levels.

The definition levels are [0, 0], implying it is null at the root for both, and yet it has repetition levels of [0, 1], which suggests a non-null slice of two elements; these are clearly not compatible.

I suspect the old code is not correctly handling a struct array masking a non-empty list slice of a child, and is producing a level for each entry in the child slice, instead of a single level for the struct's null.
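
To make the point concrete, here is a dependency-free toy sketch (with its own assumed schema of optional struct<optional list<optional int>>, not the crate's algorithm) showing that a null struct contributes a single (def, rep) pair regardless of how many values its masked child list holds:

```rust
// Assumed def levels for optional struct { optional list { optional int } }:
//   0 = struct null, 1 = list null, 2 = list empty, 3 = value null, 4 = value
fn levels(rows: &[Option<Option<Vec<i32>>>]) -> (Vec<i16>, Vec<i16>) {
    let (mut defs, mut reps) = (Vec::new(), Vec::new());
    for row in rows {
        match row {
            // Null struct: one pair for the whole row, however many values
            // the (masked) child list buffer happens to contain
            None => {
                defs.push(0);
                reps.push(0);
            }
            Some(None) => {
                defs.push(1);
                reps.push(0);
            }
            Some(Some(list)) if list.is_empty() => {
                defs.push(2);
                reps.push(0);
            }
            Some(Some(list)) => {
                for (i, _value) in list.iter().enumerate() {
                    defs.push(4);
                    reps.push(if i == 0 { 0 } else { 1 });
                }
            }
        }
    }
    (defs, reps)
}

fn main() {
    // Row 0: null struct (its child list buffer may still hold two values)
    // Row 1: struct { list: [2, 2] }
    let (defs, reps) = levels(&[None, Some(Some(vec![2, 2]))]);
    assert_eq!(defs, vec![0, 4, 4]);
    assert_eq!(reps, vec![0, 0, 1]);
}
```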

Contributor

What I was trying to express here was the list's values even though the struct is null at that point. Hence you see d: [0, 0], r: [0, 1] to say that the value has 2 slots, which are all null due to the struct being null at that point.

Perhaps my oversight was not considering that this might not matter at all, and it's safer/better to do d: [0], r: [0].
What limited me a lot was my indexing logic, because given a child array of 11 values, producing fewer than 11 repetition levels would lose the offset information from the Arrow side.

Looking at your solution, you have 10 values vs my 12 (11 values and an extra value to encode the offset [2, 2] in the list). When slicing into the child list doesn't cause issues, your solution is fine/better.

I suppose the root problem is Arrow's logical null rules. You could have a 1GB record with all sorts of values, but if you wrap that against a null struct (e.g. struct<all the values>), then you get different behaviour:

  • IPC roundtrip would preserve the values
  • Writing to and reading from Parquet would drop all those null values.

I was trying to preserve data, but maybe while looking at the minor details, I ended up missing the bigger picture.

@tustvold (Contributor Author)

Yeah, handling structs masking lists tripped me up the first time also, I had to completely rewrite my rewrite to support it 😆

I think the important property is that what you read out is "logically equal" to what you wrote in, even if it doesn't have the exact same data in the backing buffers. StructArrays are one example where this would occur, but a sliced array would be another where the underlying arrow buffers are not preserved. This is in fact critical to how the ArrowWriter batches row groups by slicing arrays.

> IPC roundtrip would preserve the values

TBH I view this as a potentially undesirable implementation quirk rather than a desirable property; see #208.

> I suppose the root problem is Arrow's logical null rules

I think you could be more general and say arrow has a relaxed approach to the backing representation of an array. This allows it to avoid copies, but comes at the cost of things like the never-ending bug-source that is ArrayData::offset() 😆

Contributor

Yup, that's true. We also had a lot more logical-null bugs (esp slicing) at the time that I wrote the level logic. So it was harder to avoid the undesirable quirks at the time.

If one has to materialise data anyway, they might as well remove data that's been made null due to masking, so the new approach is more desirable.
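
As an aside on the "logically equal" framing above, a tiny illustration (plain arrow, no parquet involved) of why a sliced array's backing buffers differ from what a reader would reconstruct, even though the logical values are the same:

```rust
use arrow::array::{Array, Int32Array};

fn main() {
    let parent = Int32Array::from(vec![1, 2, 3, 4, 5]);
    let sliced = parent.slice(1, 3); // logically [2, 3, 4], offset 1 into the parent's buffer

    assert_eq!(sliced.len(), 3);
    assert_eq!(sliced.offset(), 1);
    // After a parquet round trip one would expect to read back [2, 3, 4]
    // with offset 0: different buffers, the same logical content.
}
```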

// 3: [[108, 109], [110, 111], [112, 113], [114, 115]]
// 4: [[116, 117], [118, 119], [120, 121]]
definition: vec![
0, 0, 0, 0, 1, 0, 0, 0, 0, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
@tustvold (Contributor Author)

Again, these definition levels are wrong. In the structure above the only non-max definition level will come from the empty slice.

@tustvold (Contributor Author)

OK, I think I've ported all the tests. There are some omissions because we no longer compute the levels for non-leaf arrays (whatever that actually means), and test_calculate_array_levels_2 was just wrong, but otherwise everything matches 🎉

@tustvold merged commit 8e1666a into apache:master May 27, 2022
@alamb (Contributor) commented May 27, 2022

🎉 arrow 15 is going to be a great release

@nevi-me (Contributor) commented May 27, 2022

I was still going through it. It looks great, and I hadn't seen anything that stood out, so I'm happy with the merge.

@tustvold (Contributor Author)

Oh sorry, please let me know if you see anything and I'll be more than happy to address it in a follow-on PR 😅

@nevi-me (Contributor) commented May 27, 2022

> Oh sorry, please let me know if you see anything and I'll be more than happy to address it in a follow-on PR 😅

I only added a comment explaining why the def/rep for <struct<list>> appeared the way they did in the old test. Thanks for fixing the logic :D

@nevi-me (Contributor) commented May 27, 2022

Worth advertising, @tustvold @alamb: there is at least a 30% improvement in the writer due to this PR. At least that's on an M1 Pro (now I'm curious about x86).

write_batch primitive/4096 values primitive                                                                             
                        time:   [1.1237 ms 1.1270 ms 1.1303 ms]
                        thrpt:  [156.08 MiB/s 156.54 MiB/s 156.99 MiB/s]
                 change:
                        time:   [-29.494% -28.991% -28.501%] (p = 0.00 < 0.05)
                        thrpt:  [+39.863% +40.828% +41.831%]
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe
write_batch primitive/4096 values primitive non-null                                                                            
                        time:   [979.35 us 982.76 us 986.33 us]
                        thrpt:  [178.86 MiB/s 179.51 MiB/s 180.13 MiB/s]
                 change:
                        time:   [-30.346% -29.722% -29.002%] (p = 0.00 < 0.05)
                        thrpt:  [+40.849% +42.291% +43.567%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe
write_batch primitive/4096 values bool                                                                            
                        time:   [62.749 us 63.655 us 64.860 us]
                        thrpt:  [17.527 MiB/s 17.859 MiB/s 18.116 MiB/s]
                 change:
                        time:   [-33.845% -32.730% -31.281%] (p = 0.00 < 0.05)
                        thrpt:  [+45.520% +48.655% +51.159%]
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
write_batch primitive/4096 values bool non-null                                                                             
                        time:   [40.799 us 40.848 us 40.901 us]
                        thrpt:  [27.793 MiB/s 27.830 MiB/s 27.863 MiB/s]
                 change:
                        time:   [-45.958% -45.620% -45.294%] (p = 0.00 < 0.05)
                        thrpt:  [+82.796% +83.891% +85.043%]
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe
write_batch primitive/4096 values string                                                                            
                        time:   [627.92 us 634.23 us 642.61 us]
                        thrpt:  [123.85 MiB/s 125.49 MiB/s 126.75 MiB/s]
                 change:
                        time:   [-23.875% -23.318% -22.771%] (p = 0.00 < 0.05)
                        thrpt:  [+29.486% +30.408% +31.362%]
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  8 (8.00%) high mild
  3 (3.00%) high severe
write_batch primitive/4096 values string non-null                                                                            
                        time:   [617.66 us 618.60 us 619.56 us]
                        thrpt:  [128.46 MiB/s 128.66 MiB/s 128.86 MiB/s]
                 change:
                        time:   [-26.558% -25.760% -25.150%] (p = 0.00 < 0.05)
                        thrpt:  [+33.601% +34.699% +36.162%]
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe

Benchmarking write_batch nested/4096 values primitive list: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, enable flat sampling, or reduce sample count to 60.
write_batch nested/4096 values primitive list                                                                             
                        time:   [1.2680 ms 1.2695 ms 1.2709 ms]
                        thrpt:  [128.87 MiB/s 129.01 MiB/s 129.16 MiB/s]
                 change:
                        time:   [-33.645% -33.420% -33.197%] (p = 0.00 < 0.05)
                        thrpt:  [+49.694% +50.195% +50.705%]
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  1 (1.00%) low severe
  6 (6.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe
Benchmarking write_batch nested/4096 values primitive list non-null: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.2s, enable flat sampling, or reduce sample count to 50.
write_batch nested/4096 values primitive list non-null                                                                             
                        time:   [1.6240 ms 1.6347 ms 1.6475 ms]
                        thrpt:  [117.49 MiB/s 118.41 MiB/s 119.19 MiB/s]
                 change:
                        time:   [-29.010% -28.463% -27.890%] (p = 0.00 < 0.05)
                        thrpt:  [+38.676% +39.788% +40.866%]
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  8 (8.00%) high mild
  5 (5.00%) high severe

@nevi-me (Contributor) commented May 27, 2022

amd64, 3900X on a PCIe 4 SSD; less dramatic but still great.

Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.0s, enable flat sampling, or reduce sample count to 50.
write_batch primitive/4096 values primitive
                        time:   [1.6179 ms 1.6298 ms 1.6462 ms]
                        thrpt:  [107.17 MiB/s 108.24 MiB/s 109.04 MiB/s]
                 change:
                        time:   [-31.768% -19.351% -7.9981%] (p = 0.00 < 0.05)
                        thrpt:  [+8.6934% +23.994% +46.559%]
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
Benchmarking write_batch primitive/4096 values primitive non-null: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.1s, enable flat sampling, or reduce sample count to 50.
write_batch primitive/4096 values primitive non-null
                        time:   [1.3544 ms 1.3650 ms 1.3777 ms]
                        thrpt:  [128.05 MiB/s 129.25 MiB/s 130.25 MiB/s]
                 change:
                        time:   [-20.261% -14.076% -10.278%] (p = 0.00 < 0.05)
                        thrpt:  [+11.455% +16.382% +25.409%]
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe
write_batch primitive/4096 values bool
                        time:   [98.617 us 99.677 us 101.01 us]
                        thrpt:  [11.254 MiB/s 11.405 MiB/s 11.527 MiB/s]
                 change:
                        time:   [-42.532% -30.107% -17.429%] (p = 0.00 < 0.05)
                        thrpt:  [+21.108% +43.076% +74.009%]
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) high mild
  8 (8.00%) high severe
write_batch primitive/4096 values bool non-null
                        time:   [51.438 us 51.798 us 52.269 us]
                        thrpt:  [21.749 MiB/s 21.946 MiB/s 22.100 MiB/s]
                 change:
                        time:   [-36.532% -34.935% -33.116%] (p = 0.00 < 0.05)
                        thrpt:  [+49.513% +53.693% +57.560%]
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
write_batch primitive/4096 values string
                        time:   [771.00 us 790.87 us 832.66 us]
                        thrpt:  [95.586 MiB/s 100.64 MiB/s 103.23 MiB/s]
                 change:
                        time:   [-8.6599% -2.6996% +8.1085%] (p = 0.73 > 0.05)
                        thrpt:  [-7.5003% +2.7745% +9.4810%]
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe
write_batch primitive/4096 values string non-null
                        time:   [780.70 us 785.66 us 793.50 us]
                        thrpt:  [100.30 MiB/s 101.30 MiB/s 101.95 MiB/s]
                 change:
                        time:   [-11.689% -9.2061% -6.3254%] (p = 0.00 < 0.05)
                        thrpt:  [+6.7525% +10.140% +13.236%]
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  8 (8.00%) high mild
  3 (3.00%) high severe

Benchmarking write_batch nested/4096 values primitive list: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.0s, enable flat sampling, or reduce sample count to 50.
write_batch nested/4096 values primitive list
                        time:   [1.5257 ms 1.5327 ms 1.5410 ms]
                        thrpt:  [106.28 MiB/s 106.86 MiB/s 107.35 MiB/s]
                 change:
                        time:   [-18.698% -10.026% +0.4958%] (p = 0.04 < 0.05)
                        thrpt:  [-0.4934% +11.143% +22.998%]
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe
write_batch nested/4096 values primitive list non-null
                        time:   [1.9924 ms 2.0106 ms 2.0334 ms]
                        thrpt:  [95.194 MiB/s 96.274 MiB/s 97.150 MiB/s]
                 change:
                        time:   [-11.857% -10.947% -9.8362%] (p = 0.00 < 0.05)
                        thrpt:  [+10.909% +12.292% +13.452%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe

Linked issue: Parquet write failure (from record batches) when data is nested two levels deep