Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read on uninitialized buffer may cause UB ('uu_od' crate) #1729

Closed
JOE1994 opened this issue Feb 18, 2021 · 10 comments
Closed

Read on uninitialized buffer may cause UB ('uu_od' crate) #1729

JOE1994 opened this issue Feb 18, 2021 · 10 comments

Comments

@JOE1994
Copy link

JOE1994 commented Feb 18, 2021

Hello 🦀,
we (Rust group @sslab-gatech) found a memory-safety/soundness issue in this crate while scanning Rust code on crates.io for potential vulnerabilities.

Issue Description

let mut bytes: Vec<u8> = Vec::with_capacity(buf_size);
unsafe {
bytes.set_len(buf_size);
}
while self.skip > 0 {
let skip_count = cmp::min(self.skip, buf_size);
match self.inner.read_exact(&mut bytes[..skip_count]) {
Err(e) => return Err(e),
Ok(()) => self.skip -= skip_count,
}
}

<partialreader::PartialReader<R> as std::io::Read>::read() method creates an uninitialized buffer and passes it to user-provided Read implementation. This is unsound, because it allows safe Rust code to exhibit an undefined behavior (read from uninitialized memory).

This part from the Read trait documentation explains the issue:

It is your responsibility to make sure that buf is initialized before calling read. Calling read with an uninitialized buf (of the kind one obtains via MaybeUninit<T>) is not safe, and can lead to undefined behavior.

How to fix the issue?

The Naive & safe way to fix the issue is to always zero-initialize a buffer before lending it to a user-provided Read implementation. Note that this approach will add runtime performance overhead of zero-initializing the buffer.

As of Feb 2021, there is not yet an ideal fix that works with no performance overhead. Below are links to relevant discussions & suggestions for the fix.

@Arcterus
Copy link
Collaborator

Arcterus commented Feb 18, 2021

Is this actually an issue when the container’s value type is u8? Like, I understand how this would be an issue with something that implements Drop, but while this is perhaps technically undefined, there is no real alternative as you have said.

Of course, if someone has a way to fix this without wastefully initializing the array, I’d be very happy to do so.

@Arcterus
Copy link
Collaborator

I just took a solid look at the function, and I think it could actually be fixed pretty easily to avoid any allocations at all without using set_len(), so I guess I'll do that later anyway. I think there are more instances of set_len() in the code though, so my previous question still stands.

@JOE1994
Copy link
Author

JOE1994 commented Feb 18, 2021

Is this actually an issue when the container’s value type is u8? Like, I understand how this would be an issue with something that implements Drop, but while this is perhaps technically undefined, there is no real alternative as you have said.

Of course, if someone has a way to fix this without wastefully initializing the array, I’d be very happy to do so.

Thank you for your feedback :) It can be an issue even when the container's value type is u8.
If an arbitrary Read implementation that peeks(or reads) from the buffer before writing,
the compiler can propagate the uninitialized data of the buffer to the parts of the arbitrary Read implementation.

Here is a link to a document that explains the issue well:
https://paper.dropbox.com/doc/IO-Buffer-Initialization-MvytTgjIOTNpJAS6Mvw38

@Arcterus
Copy link
Collaborator

So, IIUC, in the case of u8, the concern is more so leaking potentially important information such as passwords, data structure values, etc. (since any uninitialized pattern can be replicated in safe code).

I don’t think these are actually concerns for the utilities here, but I’ll see if I can find a decent solution, as I’d prefer to use unsafe less anyway.

@Qwaz
Copy link

Qwaz commented Feb 19, 2021

Data leakage is not the only consequence of using an uninitialized memory. The Rust compiler optimizes the code based on the assumption that the buffer is fully initialized. Thus, an uninitialized buffer can lead to a miscompilation and breaks the Rust's safety guarantee ("safe Rust code never causes memory safety error"). Such examples are provided in Ralf Jung's blog post "What The Hardware Does" is not What Your Program Does: Uninitialized Memory and "But how bad are undefined values really?" section of RFC 2930.

@Arcterus
Copy link
Collaborator

I don’t see how that matters in this case as all u8 patterns are valid.

@Arcterus
Copy link
Collaborator

In terms of the links you gave, it seems that as long as the reader doesn't do something bizarre such as read from the buffer it is supposed to write to several times or simply not write anything into the buffer (neither of which are issues in our code AFAIK, although I may need to verify the latter in cases where the input is empty), things are alright. So, basically, I suppose the functions/readers that use Vec::set_len() with u8 should be marked as unsafe for now.

@Qwaz
Copy link

Qwaz commented Feb 19, 2021

If set_len() is not easily removable, zero-initializing the allocated region will also resolve the problem. Thank you for the quick fix and being responsive :)

@Arcterus
Copy link
Collaborator

I believe I've removed all the usages of Vec::set_len() that exhibit undefined behavior with the last od PR. My issue wasn't really with removing set_len(), but removing it in a way that wouldn't impact performance or would provide some other benefit (e.g. avoiding allocations). Unless something else comes up, I'll close this issue after the shred and od PRs are merged.

BTW, if you happen to find any other usage of unsafe that exhibits undefined behavior or is otherwise incorrect (or could simply be removed, as some of the code predates Rust 1.0), please feel free to open another issue.

@Arcterus
Copy link
Collaborator

This should be resolved now that the linked PRs have been merged. Please let me know if I missed something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants