-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix handling of malicious Readers in read_to_end #80895
Conversation
r? @shepmaster (rust-highfive has picked a reviewer for you, use r? to override) |
The job Click to see the possible cause of the failure (guessed by this bot)
|
5f45b80
to
966d071
Compare
library/std/src/io/mod.rs
Outdated
ret = Ok(g.len - start_len); | ||
break; | ||
} | ||
Ok(0) => return Ok(g.len - start_len), | ||
Ok(n) => g.len += n, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we check for an overflow here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but I would rather keep the codepath oriented as much as possible towards non-malicious implementations. Overflow can only be caused by bogus return values, and just means that a nonsensical result will be returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Malicious Read
can shrink g.len
with an overflow, which in turn can lead to a violation of append_to_string()
's second requirement. (and append_to_string()
should probably be marked as an unsafe function)
rust/library/std/src/io/mod.rs
Lines 320 to 329 in c97f11a
// The unsafety in this function is twofold: | |
// | |
// 1. We're looking at the raw bytes of `buf`, so we take on the burden of UTF-8 | |
// checks. | |
// 2. We're passing a raw buffer to the function `f`, and it is expected that | |
// the function only *appends* bytes to the buffer. We'll get undefined | |
// behavior if existing bytes are overwritten to have non-UTF-8 data. | |
fn append_to_string<F>(buf: &mut String, f: F) -> Result<usize> | |
where | |
F: FnOnce(&mut Vec<u8>) -> Result<usize>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the context:
rust/library/std/src/io/mod.rs
Lines 707 to 718 in c97f11a
fn read_to_string(&mut self, buf: &mut String) -> Result<usize> { | |
// Note that we do *not* call `.read_to_end()` here. We are passing | |
// `&mut Vec<u8>` (the raw contents of `buf`) into the `read_to_end` | |
// method to fill it up. An arbitrary implementation could overwrite the | |
// entire contents of the vector, not just append to it (which is what | |
// we are expecting). | |
// | |
// To prevent extraneously checking the UTF-8-ness of the entire buffer | |
// we pass it to our hardcoded `read_to_end` implementation which we | |
// know is guaranteed to only read data into the end of the buffer. | |
append_to_string(buf, |b| read_to_end(self, b)) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right. We'll need to replace that with a checked_add then.
This kind of mess is one of the things that the ReadBuf proposal is designed to avoid, so things should get less janky when that's implemented.
2c14deb
to
ebe402d
Compare
Updated to assert that the returned length is actually valid for the input buffer. |
library/std/src/io/mod.rs
Outdated
// particular, that could break read_to_string if the shortened buffer doesn't end on a UTF-8 boundary. | ||
// The minimal check would just be a checked_add, but this assert is a bit more precise and should be | ||
// just about the same cost. | ||
assert!(n <= g.buf.len() - g.len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to slice the slice that's been passed into the read
call in the first place and then take its length? As in:
let buffer = &mut g.buf[g.len..];
match r.read(buffer) {
// ...
Ok(n) => g.len += buffer[..n].len(),
}
or something along those lines? It has an added benefit of also detecting when returned n
is greater than the length of the length of the buffer provided, and may result in slightly less code generated overall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check already is looking for n being greater than the length of the buffer, but agreed that structuring it like that is a bit more clear. Will update later today.
r? @m-ou-se |
cc @Mark-Simulacrum not sure if you think we should to backport to beta? |
I think that seems fine - it's not really necessary IMO given this has been like this for a long time now, but giving it's a soundness fix seems appropriate. I'm going to unilaterally beta-accept but will wait till it lands on master before backporting. |
Yeah, makes sense to me, particularly since the patch is so small. |
@bors r+ |
📌 Commit e6c07b0 has been approved by |
Fix handling of malicious Readers in read_to_end A malicious `Read` impl could return overly large values from `read`, which would result in the guard's drop impl setting the buffer's length to greater than its capacity! ~~To fix this, the drop impl now uses the safe `truncate` function instead of `set_len` which ensures that this will not happen. The result of calling the function will be nonsensical, but that's fine given the contract violation of the `Read` impl.~~ ~~The `Guard` type is also used by `append_to_string` which does not pass untrusted values into the length field, so I've copied the guard type into each function and only modified the one used by `read_to_end`. We could just keep a single one and modify it, but it seems a bit cleaner to keep the guard code close to the functions and related specifically to them.~~ To fix this, we now assert that the returned length is not larger than the buffer passed to the method. For reference, this bug has been present for ~2.5 years since 1.20: rust-lang@ecbb896. Closes rust-lang#80894.
Rollup of 17 pull requests Successful merges: - rust-lang#79982 (Add missing methods to unix ExitStatusExt) - rust-lang#80017 (Suggest `_` and `..` if a pattern has too few fields) - rust-lang#80169 (Recommend panic::resume_unwind instead of panicking.) - rust-lang#80217 (Add a `std::io::read_to_string` function) - rust-lang#80444 (Add as_ref and as_mut methods for Bound) - rust-lang#80567 (Add Iterator::intersperse_with) - rust-lang#80829 (Get rid of `DepConstructor`) - rust-lang#80895 (Fix handling of malicious Readers in read_to_end) - rust-lang#80966 (Deprecate atomic::spin_loop_hint in favour of hint::spin_loop) - rust-lang#80969 (Use better ICE message when no MIR is available) - rust-lang#80972 (Remove unstable deprecated Vec::remove_item) - rust-lang#80973 (Update books) - rust-lang#80980 (Fixed incorrect doc comment) - rust-lang#80981 (Fix -Cpasses=list and llvm version print with -vV) - rust-lang#80985 (Fix stabilisation version of slice_strip) - rust-lang#80990 (llvm: Remove the unused context from CreateDebugLocation) - rust-lang#80991 (Fix formatting specifiers doc links) Failed merges: - rust-lang#80944 (Use Option::map_or instead of `.map(..).unwrap_or(..)`) r? `@ghost` `@rustbot` modify labels: rollup
…ulacrum [beta] backports This backports: * Update RLS and Rustfmt rust-lang#81027 * bump rustfmt to v1.4.32 rust-lang#81093 * Fix handling of malicious Readers in read_to_end rust-lang#80895 * Fix broken ./x.py install rust-lang#80514 * Fix x.py install not working with relative prefix rust-lang#80797 * [security] Update mdbook rust-lang#80688 * rustdoc: Render visibilities succinctly rust-lang#80368 r? `@ghost`
Clarify docs for Read::read's return value Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards. I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this. The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895. I hope that clarifying the docs might help others to avoid this issue.
Clarify docs for Read::read's return value Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards. I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this. The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895. I hope that clarifying the docs might help others to avoid this issue.
Clarify docs for Read::read's return value Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards. I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this. The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895. I hope that clarifying the docs might help others to avoid this issue.
Clarify docs for Read::read's return value Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards. I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this. The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895. I hope that clarifying the docs might help others to avoid this issue.
Clarify docs for Read::read's return value Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards. I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this. The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895. I hope that clarifying the docs might help others to avoid this issue.
Clarify docs for Read::read's return value Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards. I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this. The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see rust-lang#80895. I hope that clarifying the docs might help others to avoid this issue.
A malicious
Read
impl could return overly large values fromread
, which would result in the guard's drop impl setting the buffer's length to greater than its capacity!To fix this, the drop impl now uses the safetruncate
function instead ofset_len
which ensures that this will not happen. The result of calling the function will be nonsensical, but that's fine given the contract violation of theRead
impl.TheGuard
type is also used byappend_to_string
which does not pass untrusted values into the length field, so I've copied the guard type into each function and only modified the one used byread_to_end
. We could just keep a single one and modify it, but it seems a bit cleaner to keep the guard code close to the functions and related specifically to them.To fix this, we now assert that the returned length is not larger than the buffer passed to the method.
For reference, this bug has been present for ~2.5 years since 1.20: ecbb896.
Closes #80894.