-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix read! for types with size not a power of 2 #41711
Conversation
Currently, the fast path in `read!(::IO, ::AbstractArray{T})` checks for `isbitstype(T)` and whether the array is some nice type, then calls `unsafe_read` with `sizeof` for the number of bytes. This works great in the majority of cases, unless `sizeof(T)` is not a power of 2. The aligned size is what's used in `sizeof`, which overestimates the number of bytes required to read from the stream. To fix this, we can adjust the condition of the fast path to also determine whether `sizeof` matches the aligned size.
Of course this had to break tests I didn't run locally... The error is
It must be because |
Since `sizeof(UInt48)` is not a power of 2, the change in the previous commit makes `read!(::IO, ::AbstractVector{UInt48})` hit the non-fast-path branch that does successive calls to `read` rather than using `unsafe_read`. In order for that to work, we need a `read` method defined.
At least one of the following is true:
It may well be both but it's certainly not neither. |
I think in this case we assume the disk and memory representations are the same, i.e. that the alignment padding also exists on disk. So this is not necessarily a bug, unless there is some other code that makes a different assumption. |
@@ -588,6 +588,7 @@ end | |||
primitive type UInt48 48 end | |||
UInt48(x::UInt64) = Core.Intrinsics.trunc_int(UInt48, x) | |||
UInt48(x::UInt32) = Core.Intrinsics.zext_int(UInt48, x) | |||
Base.read(io::IO, ::Type{UInt48}) = only(reinterpret(UInt48, read(io, sizeof(UInt48)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This must be what's causing the failure --- it reads too few bytes, failing to skip the alignment padding.
I ran into this when trying to read data encoded on disk as 24-bit integers as part of the BDF format. If you have an |
I think this is the critical context I was missing here. I guess that means that the case I mentioned above is more of a niche case that would need to be handled explicitly/differently rather than relying on the default behavior of |
I will assume that is the case and close |
Currently, the fast path in
read!(::IO, ::AbstractArray{T})
checks forisbitstype(T)
and whether the array is some nice type, then callsunsafe_read
withsizeof
for the number of bytes. This works great in the majority of cases, unlesssizeof(T)
is not a power of 2. The aligned size is what's used insizeof
, which overestimates the number of bytes required to read from the stream. To fix this, we can adjust the condition of the fast path to also determine whethersizeof
matches the aligned size.Note that I've marked this for backport to 1.6 and 1.7.