Nullable field with nested not nullable map in json #3900

Closed
spebern opened this issue Mar 22, 2023 · 1 comment · Fixed by #3906
Labels
arrow Changes to the arrow crate bug

Comments

@spebern
Contributor

spebern commented Mar 22, 2023

Which part is this question about

New raw json decoder.

Describe your question

While trying to use the new json decoder in delta I stumbled over the following issue:

#[test]
fn test_delta_checkpoint() {
    let json = "{\"protocol\":{\"minReaderVersion\":1,\"minWriterVersion\":2}}";
    let schema = Arc::new(Schema::new(vec![
        Field::new(
            "protocol",
            DataType::Struct(vec![
                Field::new("minReaderVersion", DataType::Int32, true),
                Field::new("minWriterVersion", DataType::Int32, true),
            ]),
            true,
        ),
        Field::new(
            "add",
            DataType::Struct(vec![Field::new(
                "partitionValues",
                DataType::Map(
                    Box::new(Field::new(
                        "key_value",
                        DataType::Struct(vec![
                            Field::new("key", DataType::Utf8, false),
                            Field::new("value", DataType::Utf8, true),
                        ]),
                        false,
                    )),
                    false,
                ),
                false, // <-- when this is true the test passes
            )]),
            true,
        ),
    ]));

    let batches = do_read(json, 1024, true, schema);
    assert_eq!(batches.len(), 1);
}

This fails with:

thread 'raw::tests::test_delta_checkpoint' panicked at 'called `Result::unwrap()` on an `Err` value: JsonError("expected { got null")', arrow-json/src/raw/mod.rs:389:18

It is related to the nested map being non-nullable (see the comment in the code snippet). I'd expect that if a field (in the example, "add") is null, the non-nullable fields nested inside it should not get checked.

Does this work as expected or could this be a bug?
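
The expectation described above can be sketched with a small standalone model. This is only an illustration, not the arrow-json decoder: `Value`, `ToyField`, `check`, `object` and `example_schema` are hypothetical names, the map column is simplified to a leaf, and a missing key is treated as null.

```rust
use std::collections::HashMap;

/// Toy JSON value: just enough shapes for this illustration.
#[derive(Debug, Clone)]
enum Value {
    Null,
    Scalar,
    Object(HashMap<String, Value>),
}

/// Toy schema field (hypothetical, not arrow's `Field`).
#[derive(Debug, Clone)]
struct ToyField {
    name: String,
    nullable: bool,
    children: Vec<ToyField>, // empty for leaf fields
}

impl ToyField {
    fn new(name: &str, nullable: bool, children: Vec<ToyField>) -> Self {
        ToyField { name: name.to_string(), nullable, children }
    }
}

/// Checks `value` against `field`. When a nullable field is null, its
/// children are skipped entirely, so their own nullability is never consulted.
fn check(field: &ToyField, value: &Value) -> Result<(), String> {
    match value {
        Value::Null if field.nullable => Ok(()),
        Value::Null => Err(format!("field '{}' is not nullable", field.name)),
        Value::Scalar => Ok(()), // type checking of leaves is omitted here
        Value::Object(map) => {
            let missing = Value::Null; // assumption: absent key == null
            for child in &field.children {
                check(child, map.get(&child.name).unwrap_or(&missing))?;
            }
            Ok(())
        }
    }
}

/// Builds an object value from key/value pairs.
fn object(pairs: Vec<(&str, Value)>) -> Value {
    Value::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Schema shaped like the one in the report: a nullable "add" struct
/// containing a non-nullable "partitionValues" child.
fn example_schema() -> ToyField {
    ToyField::new("root", false, vec![
        ToyField::new("protocol", true, vec![
            ToyField::new("minReaderVersion", true, vec![]),
            ToyField::new("minWriterVersion", true, vec![]),
        ]),
        ToyField::new("add", true, vec![
            ToyField::new("partitionValues", false, vec![]),
        ]),
    ])
}
```

In this model, a row that contains only "protocol" is accepted because the absent "add" is a null on a nullable field, while a row where "add" is present but lacks "partitionValues" is rejected.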

@spebern spebern added the question Further information is requested label Mar 22, 2023
@tustvold
Contributor

This is a bug, and I understand what causes it. Thank you for the report. Working on a fix.

@tustvold tustvold added bug and removed question Further information is requested labels Mar 22, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 22, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue Mar 22, 2023
tustvold added a commit that referenced this issue Mar 24, 2023
* Enforce struct nullability in JSON raw reader (#3900) (#3904)

* Fix tests

* Review feedback
@tustvold tustvold added the arrow Changes to the arrow crate label Mar 25, 2023