Converted type is None according to Parquet Tools when utilizing logical types #3017
Comments
I've narrowed this down to pyarrow not being able to read the converted type correctly.
However, fastparquet is able to read the converted type, as it is correctly encoded in the thrift definition
I also tried https://github.com/xitongsys/parquet-go/tree/master/tool/parquet-tools, which resulted in
This leads me to think that this is actually a bug in pyarrow, and therefore in the C++ arrow implementation. Perhaps you might like to raise a bug there?
Hi, thanks for investigating this. So, in the context of the steps to reproduce, you think the bug lies with
Hey, I found a (for me at least) new thing: If I set
I can state categorically that the converted type is always being written to the thrift metadata payload in the parquet file. As shown above, readers not based on arrow C++ correctly find and read it, and as shown previously, manually decoding the metadata shows it is there. For some reason, whether intentional or a bug, the C++ arrow reader is ignoring this in certain cases. You will need to take this up with them.
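For readers who want to check the metadata payload themselves, here is a minimal sketch of how the thrift-encoded footer can be located using only the Python standard library. It relies on the Parquet file layout, in which a file ends with the footer, a 4-byte little-endian footer length, and the magic bytes PAR1. The function name read_footer_bytes is made up for illustration and is not part of any library; the sketch only extracts the raw bytes and does not decode the thrift structures.

```python
import struct

MAGIC = b"PAR1"  # magic bytes at the start and end of an unencrypted Parquet file


def read_footer_bytes(path):
    """Return the raw thrift-encoded file metadata of a Parquet file.

    Hypothetical helper for illustration only; it does not decode thrift.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)  # jump to the end to learn the file size
        size = f.tell()
        # Smallest possible file: leading magic + footer length + trailing magic.
        if size < 12:
            raise ValueError("file too small to be a Parquet file")
        # The last 8 bytes are a 4-byte little-endian footer length plus the magic.
        f.seek(size - 8)
        tail = f.read(8)
        (footer_len,) = struct.unpack("<I", tail[:4])
        if tail[4:] != MAGIC:
            raise ValueError("missing PAR1 magic at end of file")
        if footer_len > size - 12:
            raise ValueError("footer length exceeds file size")
        # The footer sits directly before the length field.
        f.seek(size - 8 - footer_len)
        return f.read(footer_len)
```

Calling read_footer_bytes("tmp.par") would return the bytes that a thrift compact protocol decoder needs in order to show whether the converted type field is present in the schema elements. That decoding step is out of scope for this sketch.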
Hello @tustvold, thanks for investigating. I will close the ticket and open a ticket in
Cheers, Markus
Describe the bug
This regards the output written by the parquet crate. Declaring a column to contain a timestamp of microseconds using a LogicalType causes the written file to not have a converted type, at least according to parquet-tools.
To Reproduce
Write a file tmp.par with a single column of type Timestamp of Microseconds, using a logical type.
Install parquet-tools in a virtual environment and inspect the file.
The resulting output indicates no converted type.
Expected behavior
I would have expected the converted type to show up in the Metainformation emitted by parquet-tools.
Additional context
Triggered by upstream odbc2parquet issue pacman82/odbc2parquet#284. Azure cannot seem to handle the output since the migration to LogicalType.
I previously misdiagnosed this as the converted type not being set correctly in the schema information; this, however, does happen. See: #2984.
Thanks, any help is appreciated!