
Added toggle for Parquet V1 and V2 formats #9497

Closed
wants to merge 2 commits

Conversation

joshthoward
Member

This PR creates a toggle to write data in the Parquet V1 format using the native writer. The default version remains unchanged to avoid introducing compatibility issues.

With this change:

  • Hive cannot read Parquet V1 files created with the native Trino writer; all tested Hive versions fail with org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.hive.serde2.io.HiveVarcharWritable (there are mixed errors for Parquet V2 files)
  • Spark can read Parquet V2 files created with the native Trino writer with Spark's vectorized reader turned off
  • Spark can read Parquet V1 files created with the native Trino writer with Spark's vectorized reader turned on or off

Why does this seem to work? (TBH I wouldn't have expected it to)

  • ValuesWriter is injected into PrimitiveColumnWriter here.
  • valuesWriterFactory creates a new ValuesWriter here.
  • Trino's native parquet writer still uses DefaultValuesWriterFactory here.
  • DefaultValuesWriterFactory respects the Parquet V1 or V2 flag when constructing ValuesWriters here (see the sketch after this list).
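
A minimal sketch of the parquet-mr calls involved (not Trino's actual code; valuesWriterFactoryFor is a hypothetical helper named only for illustration):

import org.apache.parquet.column.ParquetProperties;
import org.apache.parquet.column.ParquetProperties.WriterVersion;
import org.apache.parquet.column.values.factory.DefaultValuesWriterFactory;
import org.apache.parquet.column.values.factory.ValuesWriterFactory;

// Hypothetical helper: build a factory whose ValuesWriters follow the requested format version.
static ValuesWriterFactory valuesWriterFactoryFor(boolean writeV2)
{
    ParquetProperties properties = ParquetProperties.builder()
            .withWriterVersion(writeV2 ? WriterVersion.PARQUET_2_0 : WriterVersion.PARQUET_1_0)
            .build();
    ValuesWriterFactory factory = new DefaultValuesWriterFactory();
    factory.initialize(properties);
    // factory.newValuesWriter(columnDescriptor) will now construct V1- or V2-style encodings.
    return factory;
}

If that chain holds, flipping the WriterVersion passed to ParquetProperties is what switches the value encodings, which would explain why the toggle appears to work.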

This PR is still a WIP, but I wanted to post this for discussion.

@findepi
Member

findepi commented Oct 5, 2021

Hive cannot read Parquet V1 files created with the native Trino writer; all tested Hive versions fail with org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.hive.serde2.io.HiveVarcharWritable (there are mixed errors for Parquet V2 files)

This sounds like an important piece of solving #6377

This PR creates a toggle to write data in the Parquet V1 format using the native writer.

It seems like we didn't make a very deliberate decision to use file format version V2 and we later realized this is causing problems (#7953)

We should switch to v1 for now, unless we can name actual benefits of having a toggle.
Requiring users to switch the toggle so that Trino writes data that can be read by others is not a good deal for me, but maybe I am missing something.

@findepi
Member

findepi commented Oct 5, 2021

per @anjalinorwood #7953 (comment)

changing the withWriterVersion might not be enough.

@anjalinorwood
Member

+1000

It seems like we didn't make a very deliberate decision to use file format version V2 and we later realized this is causing problems (#7953)

We should switch to v1 for now, unless we can name actual benefits of having a toggle. Requiring users to switch the toggle so that Trino writes data that can be read by others is not a good deal for me, but maybe I am missing something.

@martint
Member

martint commented Oct 5, 2021

We should switch to v1 for now, unless we can name actual benefits of having a toggle.

Yes, I agree that we should switch to V1 since that's what's supported more broadly. I think we should keep V2 as an experimental flag. It's useful to be able to write data in the new format so that others can test compatibility as they develop support in their engines.

@joshthoward
Member Author

@anjalinorwood Based on your comments in #7953, did you have an explicit test case where something else was needed to support the V1 spec?

Writing DataPageHeaderV1 in PrimitiveColumnWriter#flushCurrentPageToBuffer (example below) results in Spark thinking that the page is compressed... I think this is a bug with Spark missing the compression flag.

parquetMetadataConverter.writeDataPageV1Header((int) uncompressedSize,
                    (int) compressedSize,
                    currentPageRowCount,
                    Encoding.RLE, // Encoding.RLE matches rlEncoding value of RunLengthBitPackingHybridEncoder
                    Encoding.RLE, // Encoding.RLE matches dlEncoding value of RunLengthBitPackingHybridEncoder
                    primitiveValueWriter.getEncoding(),
                    pageHeaderOutputStream);
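
For context on the compression confusion: in the parquet-format Thrift definitions, the V1 DataPageHeader has no is_compressed field, so readers decide whether to decompress based on the column chunk's codec, while DataPageHeaderV2 carries an explicit is_compressed flag (default true). A rough sketch of that reader-side decision (hypothetical method, not Spark's actual code; names mirror the parquet-format Thrift model):

import org.apache.parquet.format.CompressionCodec;
import org.apache.parquet.format.PageHeader;

// Hypothetical reader-side check, not Spark's implementation.
static boolean shouldDecompressPage(PageHeader pageHeader, CompressionCodec chunkCodec)
{
    if (pageHeader.isSetData_page_header_v2()) {
        // V2 headers state explicitly whether the page body is compressed
        return pageHeader.getData_page_header_v2().isIs_compressed();
    }
    // V1 headers have no such flag; compression is implied by the chunk codec alone
    return chunkCodec != CompressionCodec.UNCOMPRESSED;
}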

@findepi
Member

findepi commented Oct 5, 2021

so that others can test compatibility as they develop support in their engines.

It would be awesome if Trino could become the reference implementation for the Parquet v2 format, but I would assume parquet-mr is going to be that no matter what we do.
We don't know whether the current implementation is 100% compliant with the Parquet spec, do we?

Toggles are not something to add haphazardly. There is a cost to them in terms of code and cognitive complexity that affects future maintainability and ability to safely evolve the system.

@anjalinorwood Based on your comments in #7953, did you have an explicit test case where something else was needed to support the V1 spec?

Good question. While we look for the answer to that, I think we should also take a look at what it would take to produce files that are readable by Hive.

@alexjo2144
Member

If the other PR gets merged first, we'll have a small merge conflict with https://github.com/trinodb/trino/pull/9569/files/fd0c06389c6977de7ecc96a70ec6959086f90446#r725263578

But it should just be another case of switching a "v2" to a "v1".
