Stabilize Rng and SystemRng #157168
Force-pushed from 18dd02e to eb9d7c8.
|
If it’s only the first call that can fail, could we put |
|
@jdahlstrom That would force every caller to deal with it, albeit only once. If we (in the future) provide a fallible |
|
I'm un-marking this as a draft. Based on experiments with As for |
|
rustbot has assigned @Mark-Simulacrum.
|
|
Personally, I do not support this stabilization. The most pressing needs can be alleviated by stabilizing a free-standing (potentially panicking)
IMO they should be named
I don't think that added
This does not apply to HW-based RNGs used in cryptography. Not only are they IO-based, but they also commonly use internal security checks. The same applies, to some extent, to RNGs built into CPUs. For example, RDRAND may in theory fail at any moment, and some buggy AMD CPUs are known to produce bad values (e.g. after hibernation), which are guarded against with runtime checks. In some niche cases it's also important to prove the absence of panics, and the suggested potentially panicking behavior would be an annoying hindrance. Checking for errors could also be useful in scenarios where we mix entropy from different sources and the failure of one source does not stop the system. |
/// A source of randomness.
#[unstable(feature = "random", issue = "130703")]
#[stable(feature = "random_source", since = "CURRENT_RUSTC_VERSION")]
pub trait RandomSource {
Regarding next_u32/next_u64, while I really want DefaultRandomSource.fill_bytes (in some form) stabilized ASAP, I have reservations about leaving it at "it's not clear we need them for performance and we can add them later". Personally I'd rather err on the side of adding these methods, unless we're quite sure we will never need them, or it's clear that we can't resolve the question in a reasonable time frame.
Adding the methods after stabilization has a cost (even besides opportunity cost). As @dhardy pointed out in the past, adding provided method later means existing implementers that want to offer reproducibility (as in stability of produced values) can't override the provided methods without breaking reproducibility for users who started using those methods. And for libraries that use RNGs to sample some distribution and want to promise reproducibility of that sampling, the same problem applies if they're first written against fill_bytes and later want to use next_uN.
Another (smaller) reason to err on the side of including these is to ease the ecosystem's transition from rand traits (which have always had next_u32/u64) to the std trait. If std doesn't have the methods at first and adds them later, that's two unnecessary transitions (rand::Rng::next_uN -> fill_bytes + uN::from_*e_bytes -> RandomSource::next_uN). Stabilizing some subset of distributions would avoid this, but the distributions are far from ready for stabilization.
Finally, while the benchmarks in #157193 and on Zulip don't have a smoking gun that the methods are necessary for performance, it's also not clear that we won't want them. Even those benchmarks show a benefit for dyn RandomSource (the only argument is whether you consider that compatible with "cares about performance"), and @dhardy previously mentioned that rand has benchmarks justifying the methods in rand's context. At minimum we should look at those benchmarks as well and see if the fill_bytes semantics (which I think matches rand's) actually works for those benchmarks as well.
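The reproducibility concern above can be sketched in a few lines. Everything here is a hypothetical stand-in for illustration, not the std API under discussion: the `ByteRng` trait, the toy `SplitMix64` generator, and the `Plain`/`Tuned` wrappers are assumptions. The sketch shows that an implementer who later overrides a provided `next_u32` with a "native" path (here, the high half of a word) changes the values seen by users who were already calling the provided method.

```rust
/// Hypothetical stand-in for a byte-oriented RNG trait (not the std trait).
trait ByteRng {
    fn fill_bytes(&mut self, dest: &mut [u8]);

    /// A provided method of the kind that could be added after
    /// stabilization: it derives a `u32` from `fill_bytes`.
    fn next_u32(&mut self) -> u32 {
        let mut buf = [0u8; 4];
        self.fill_bytes(&mut buf);
        u32::from_le_bytes(buf)
    }
}

/// Toy 64-bit generator (SplitMix64 step). Not cryptographically secure.
#[derive(Clone)]
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn next_word(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Writes little-endian words into `dest`, truncating the last word.
    fn fill(&mut self, dest: &mut [u8]) {
        for chunk in dest.chunks_mut(8) {
            let word = self.next_word().to_le_bytes();
            let n = chunk.len();
            chunk.copy_from_slice(&word[..n]);
        }
    }
}

/// Relies on the provided `next_u32` (low 32 bits of one word).
struct Plain(SplitMix64);

impl ByteRng for Plain {
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        self.0.fill(dest)
    }
}

/// Overrides `next_u32` with a "native" path using the high 32 bits.
struct Tuned(SplitMix64);

impl ByteRng for Tuned {
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        self.0.fill(dest)
    }

    fn next_u32(&mut self) -> u32 {
        (self.0.next_word() >> 32) as u32
    }
}

/// Returns the first `next_u32` output of both variants for the same seed.
fn first_outputs(seed: u64) -> (u32, u32) {
    let base = SplitMix64 { state: seed };
    let mut plain = Plain(base.clone());
    let mut tuned = Tuned(base);
    (plain.next_u32(), tuned.next_u32())
}
```

Calling `first_outputs(0)` yields two different values from the same seed, which is the kind of break in value stability that a late override would cause.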
IIRC, it's possible to use inlining and https://doc.rust-lang.org/std/intrinsics/fn.is_val_statically_known.html to perform these optimisations without needing the API surface.
Inlining doesn't work for dyn RandomSource. And is_val_statically_known only helps when the two implementations have the same behavior, but in this case, some potentially desirable optimizations change behavior. (Also, the intrinsic doesn't seem to have a clear path to being exposed on stable.)
|
I oppose this stabilization, as I've mentioned before I don't think we are at a point where we want to stabilize traits or anything that represents or implies a canonical "way to do random number generation". The current proposal with There is one real need from the standard library: a (no_std overridable) source of random bytes. This should simply be a function without further baggage or API precedent. Only once we have a clear view of what an opinionated |
To quote the
This is rather vague. Would a report of a defective implementation be considered a security issue? E.g. But my biggest concern is what happens on unsupported platforms, e.g. |
This was my understanding as well, but when I sat down and worked through it, I couldn’t come up with a benchmark that shows a difference (between Maybe this has changed over time as LLVM has improved? The way rand derives |
The specific problem with |
|
@hanna-kruppe I tried benchmarking Xoshiro256++, Sfc32 and Sfc64 using If there's desire to use only a single method, I would consider using |
|
Are there any code examples in the docs? If not it'd be great to add them before stabilisation. |
|
From zulip discussion, there seems to be some tension between goals
|
An alternative to panicking is to seed an infallible RNG from a fallible RNG. This at least defers the error condition to something that happens once up-front, and is avoidable thereafter. I'd probably only recommend that for bare metal embedded use cases though. Anywhere you have a proper kernel entropy pool (and potentially have to worry about forking) you're better off using that. |
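A minimal sketch of the "seed once, then infallible" idea follows. All names are hypothetical (`TryEntropy`, `EntropyError`, `SeededRng`, and the two demo sources), and the xorshift64* generator is only a placeholder: it is not cryptographically secure, and a real design would seed a CSPRNG instead.

```rust
/// Hypothetical error for an entropy source that can fail.
#[derive(Debug)]
struct EntropyError;

/// Hypothetical fallible source, e.g. a hardware RNG on bare metal.
trait TryEntropy {
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), EntropyError>;
}

/// Infallible generator seeded once from a fallible source.
/// Placeholder algorithm (xorshift64*); NOT cryptographically secure.
struct SeededRng {
    state: u64,
}

impl SeededRng {
    /// The only place where failure can surface.
    fn from_entropy<E: TryEntropy>(source: &mut E) -> Result<Self, EntropyError> {
        let mut seed = [0u8; 8];
        source.try_fill_bytes(&mut seed)?;
        let mut state = u64::from_le_bytes(seed);
        if state == 0 {
            // xorshift must not start from an all-zero state.
            state = 0x9E37_79B9_7F4A_7C15;
        }
        Ok(Self { state })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.state = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// Never fails once the generator exists.
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for chunk in dest.chunks_mut(8) {
            let word = self.next_u64().to_le_bytes();
            let n = chunk.len();
            chunk.copy_from_slice(&word[..n]);
        }
    }
}

/// Demo source that always succeeds with a fixed seed.
struct FixedSource([u8; 8]);

impl TryEntropy for FixedSource {
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), EntropyError> {
        for (d, s) in dest.iter_mut().zip(self.0.iter().cycle()) {
            *d = *s;
        }
        Ok(())
    }
}

/// Demo source that always fails.
struct BrokenSource;

impl TryEntropy for BrokenSource {
    fn try_fill_bytes(&mut self, _dest: &mut [u8]) -> Result<(), EntropyError> {
        Err(EntropyError)
    }
}
```

The caller handles `Err` exactly once, at `SeededRng::from_entropy`; every later `fill_bytes` call has no error path. As the comment above notes, this trade-off mainly suits targets without a kernel entropy pool.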
Can you provide the benchmarking code? I'd love to see if we can optimize that in the style of hanna-kruppe/chacha8rand#1 . |
|
@joshtriplett here's a diff against rand_pcg code. |
Force-pushed from eb9d7c8 to 44519e1.
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
Title changed: "RandomSource and DefaultRandomSource" to "Rng and SystemRng"
|
@rust-rfcbot merge libs-api |
|
I find it strange we're proposing to stabilize this as an alternative for
If a crate were to switch from Most of these limitations are due to the fact that the current implementation tries to make "easy to use Everything is sacrificed for ease of use of the common case, when no such sacrifices need to be made if these two use-cases are separated in dedicated solutions. |
This provides enough of an interface for people to obtain random bytes. The `Distribution` trait and the `random` function remain unstable; those don't need to block stabilization of `Rng` and `SystemRng`. Similarly, this leaves a `fill_buf` function using `BorrowedCursor` as future work.
Force-pushed from 44519e1 to 5119e06.
It does allow for a custom implementation via a trait. It specifically doesn't provide a "default" random source since nobody can agree on what properties this should have.
Fallible random sources don't really exist in practice, and it would make the API worse if users have to handle a condition that never occurs in practice. On the other hand I agree that unavailable random sources are something that users may want to handle. This could be handled either with
This will be added separately and in a backwards-compatible way once the read_buf API is stabilized. |
|
@Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
|
@rfcbot concern unavailable SystemRng |
|
Concerns that came up in @rust-lang/libs-api discussion: We should document that @rfcbot concern document-systemrng-will-fail-if-no-secure-randomness We should add a module-level example in @rfcbot concern random-module-level-example |
As has already been said a few times, this is less useful in practice because leveraging the trait would require passing a (generic or dyn) value into any code path which might want to use a random source — e.g. any code using |
It is precisely this what It's why It means a crate like I think randomness from the environment is such a core piece of the ecosystem it should have a similar system to
Precisely. And this includes any code path which might want to use a |
|
If we stabilise |
I'd be a little worried if this mechanism were 100% ambient and there weren't something that needed to be explicitly configured outside of your crate dependencies to elect which crate is providing your secure ambient RNG, e.g. in something like On an embedded platform, you could imagine someone adding a crate that provides a poison ambient RNG, and then later adding a different crate implementing cryptography that calls into the ambient system RNG expecting it to be secure, but getting a poison RNG instead. Since everything got wired up automatically, this can potentially go unnoticed by the developer pulling in these dependencies. All that is to say: I think this deserves some careful design and wouldn't like to see it block a |
|
I think that's trivially preventable by only allowing the root crate (e.g. the binary or I also don't think the attacker model you're describing makes any sense. If you're pulling in a malicious dependency it's over anyway. They're not going to bother with 'subtly' (I don't think there's anything subtle about installing a poisoned global RNG in the first place) trying to poison your RNG source in the hopes of you generating an insecure key, considering they already have arbitrary code execution. |
|
@orlp It doesn't have to be malice; it's easy to go wrong just due to differing requirements. Scenario: you're writing a game. You find the mechanism for setting a default RNG, and that looks convenient. You set up an RNG as the default that lets you have seeded games and daily runs and similar. Later, in another part of the game, you use HTTPS, which uses the RNG for critical TLS material. The danger here is in allowing substitution of an RNG (e.g. via EII) that isn't suitable for secure purposes. If we ever allow doing that, then everyone who needs a secure RNG can no longer safely use the system RNG. I'm not saying we should never have an interface for that; there's value for testing, and there's value for embedded systems. (Though personally I think we should make it easier for embedded systems to more easily use a custom target without hacking on rustc and std.) But we do need to address the possibility of conflicting requirements, and that will require a non-trivial amount of design and care. |
|
@joshtriplett Those are good points, I agree with those. I think that's an indicator
We're discussing this on the stabilization PR of Considering this is being stabilized to provide, and I quote, "a common need in the ecosystem, currently fulfilled via the
Nothing inherently no, but:
|
Noisy attacks get noticed. The thing about a poison RNG is that it can be used for stealthy kleptographic attacks, in a way that its code might even withstand a fair degree of scrutiny. The best example of an attack like this is probably the ScreenOS backdoor. Using a specially designed RNG cascade (using the infamous Anyway, that's all again to say that I wouldn't be too hasty to try to stabilize a mechanism for overriding the system RNG, specifically because it is a juicy target for these sorts of attacks. |
The trait has a single function
But all those things are compatible with the infallible trait because panicking can be used to make the fallible infallible. I think designing the override mechanism itself will present far more difficulty than any extension to the rng API. |
|
The
The former is hard for the standard library to provide, both because of the aforementioned issues with this global resource specifically and because the story for adding more global hooks without ad-hoc language/compiler extensions hasn't been fully figured out yet. The trait proposed for stabilization here is not an adequate substitute for reasons discussed above. What's up for stabilization here is the latter part of I do see a risk that providing only a trait and not a global source encourages the ecosystem to switch away from the |
|
To be clear on my personal position, I do still think a But there doesn't seem to be any libs-api consensus on that and I don't think the trait will tie std's hands. So if this is what can be broadly agreed upon for now then I'm happy to go with it. |
Plenty of sources may fail in practice. And it's not only IO-based generators, even RDRAND is technically fallible and it was encountered in practice (sure, it was a buggy CPU, but still). Just replace "random sources" with "allocators" in your comment to see the potential mess you intend to bake in. With distinct Finally, I believe it does not make sense to introduce just the |
|
The hardware may be fallible in those cases, but the OS normally retries the hardware source later. When you ask for random numbers in the meantime, it either blocks or keeps serving from the already seeded CSPRNG, and it adds entropy from other sources that are lower quality but guaranteed to exist, such as interrupt jitter (if interrupts stop, your system is completely broken and no userspace apps run anyway). |
|
Firstly, you cannot assume what the OS does. Most OSes do not make the infallibility guarantee (modern Windows and Fuchsia are exceptions that do make such a guarantee). For example, (IIRC) Hermit was simply forwarding RDRAND without any entropy mixing. On top of that, in the future we may have a way to override the system source (e.g. a cryptographic application may use an external IO-based certified RNG). Secondly, RNG traits defined in |
|
On Linux, the only error conditions glibc documents for getrandom fall into four groups: unconditional ones (the call is not supported by the kernel, which would effectively mean running the binary on a kernel not supported by libstd at all); ones that only occur if you ask for non-blocking behavior (which libstd doesn't); ones that should be immediately retried (EINTR); and ones caused by passing bad arguments to getrandom (in which case you have a memory safety bug). So for all practical purposes it will not fail on Linux.
I would expect the Rng trait in libstd to correspond to CryptoRng in rand_core. |
Stabilization report
This partial stabilization provides enough of an interface for people to obtain random bytes, which is a common need in the ecosystem, currently fulfilled via the `getrandom` crate.
There have been many requests for a `fill_bytes` interface in the standard library. Per previous libs-api discussions, `SystemRng.fill_bytes` can serve that function, rather than adding a separate free function.
Alternatives and Future Work
Uninitialized buffers
We're likely to add a `fill_buf` function to fill a `BorrowedCursor<'_, u8>`. We can do so once `BorrowedBuf`/`BorrowedCursor` is stable. Deferring this means we will need to support trait impls that provide `fill_bytes` but not `fill_buf`, which we might not need to if we waited until after `BorrowedBuf`/`BorrowedCursor` is stable. However, that isn't any worse of a problem than we already have with `io::Read`, and we don't necessarily want to couple the stabilization of `BorrowedBuf`/`BorrowedCursor` with
Distributions
The `Distribution` trait and the `random` function remain unstable; those don't need to block stabilization of `Rng` and `SystemRng`.
Optimized paths for `u32`/`u64`
Some RNGs can provide faster results for generating a whole `u32`/`u64` rather than individual bytes.
The definition and documentation of `fill_bytes` says:
We hope that this will allow RNGs that can generate whole words to do so efficiently as a fast path in `fill_bytes`/`fill_buf`. If dedicated `next_u32`/`next_u64` functions still end up being substantially faster, we can always add them as optional trait methods in the future.
Some experimentation suggests that it's possible to match the performance.
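As an illustration of what a word-oriented fast path inside `fill_bytes` could look like, here is a minimal sketch. It assumes, without confirmation from the `fill_bytes` documentation (which is not reproduced in this text), that an implementation may consume one whole word to serve a partial tail. The `WordSource` trait, the `fill_bytes_wordwise` function and the `Counter` generator are hypothetical names for this sketch only.

```rust
/// Hypothetical generator that natively produces 64-bit words.
trait WordSource {
    fn next_word(&mut self) -> u64;
}

/// Fills `dest` one whole word at a time, then serves any tail
/// shorter than 8 bytes from the leading bytes of one more word.
fn fill_bytes_wordwise<W: WordSource>(src: &mut W, dest: &mut [u8]) {
    let mut chunks = dest.chunks_exact_mut(8);
    for chunk in &mut chunks {
        // Fast path: one generator step per 8 output bytes.
        chunk.copy_from_slice(&src.next_word().to_le_bytes());
    }
    let tail = chunks.into_remainder();
    if !tail.is_empty() {
        let word = src.next_word().to_le_bytes();
        let n = tail.len();
        tail.copy_from_slice(&word[..n]);
    }
}

/// Derives a `u64` through the byte interface; with the fast path
/// above this costs exactly one generator step.
fn next_u64_via_bytes<W: WordSource>(src: &mut W) -> u64 {
    let mut buf = [0u8; 8];
    fill_bytes_wordwise(src, &mut buf);
    u64::from_le_bytes(buf)
}

/// Deterministic toy source for demonstration: returns the current
/// counter value, then increments it. Not random at all.
struct Counter(u64);

impl WordSource for Counter {
    fn next_word(&mut self) -> u64 {
        let out = self.0;
        self.0 = self.0.wrapping_add(1);
        out
    }
}
```

With this shape, an 8-byte request round-trips to the same value the generator would have returned as a word, which is the property a `fill_bytes`-only trait needs if word-sized requests are to stay cheap after inlining. It does not address the `dyn` case raised in the review comments.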
`Result` versus panicking
There's been extensive discussion about whether the function should return a `Result` rather than panicking, or providing an additional such function. The previous conclusion from libs-api was that while it's possible for the first such call to fail (e.g. because the OS or sandbox provides no access to randomness at all), subsequent calls should never fail, and user code will not be prepared to deal with such failure.
Furthermore, an API returning `Result` would propagate throughout higher-level calls, forcing operations as simple as "roll a d20" to either return `Result` or call `expect`/`unwrap`. And even providing a `try` variant will lead to higher-level APIs having to consider which variant to call. We should, instead, make the guarantee that a well-behaved underlying OS won't panic after the first call.
Note, in particular, that `HashMap` already fails via panic if it can't get data from its `RandomState`.
If there's a need to allow error recovery for the "no OS/sandbox support" case, we could provide a one-time call to check for an error. Or, such users could continue using `getrandom` or the underlying OS APIs.
If we did want to make every call fallible, we have the capability, using upcoming language features ("supertrait auto impl"), to add a `TryRng` supertrait without breaking backwards compatibility.
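The trade-off described above can be sketched with two locally defined traits. These are assumptions for illustration only: the `Rng` and `TryRng` traits below are not the std definitions (only the name `TryRng` is taken from the report, its shape is guessed), and `Unavailable`, `PanicOnError`, `roll_d20` and `Lcg` are hypothetical. The sketch shows a panicking adapter that turns a fallible source into an infallible one, so that a high-level helper such as "roll a d20" does not have to return `Result`.

```rust
/// Hypothetical error for a source with no randomness available.
#[derive(Debug)]
struct Unavailable;

/// Hypothetical fallible trait; the shape is assumed for this sketch.
trait TryRng {
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Unavailable>;
}

/// Hypothetical infallible trait, local to this sketch.
trait Rng {
    fn fill_bytes(&mut self, dest: &mut [u8]);
}

/// Adapter that converts failure into a panic.
struct PanicOnError<R>(R);

impl<R: TryRng> Rng for PanicOnError<R> {
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        self.0
            .try_fill_bytes(dest)
            .expect("random source unavailable");
    }
}

/// Rolls a d20 without exposing `Result` to the caller.
/// Rejection sampling avoids modulo bias: 240 is the largest
/// multiple of 20 not exceeding 256, so bytes 240..=255 are redrawn.
fn roll_d20<R: Rng>(rng: &mut R) -> u8 {
    loop {
        let mut byte = [0u8; 1];
        rng.fill_bytes(&mut byte);
        if byte[0] < 240 {
            return byte[0] % 20 + 1;
        }
    }
}

/// Deterministic demo source (a 64-bit LCG); it never fails and is
/// not suitable for anything security-related.
struct Lcg(u64);

impl TryRng for Lcg {
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Unavailable> {
        for d in dest.iter_mut() {
            self.0 = self
                .0
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Use the high byte, which has the best statistical quality.
            *d = (self.0 >> 56) as u8;
        }
        Ok(())
    }
}
```

A caller that needs error recovery can keep using the `TryRng` side directly, while code written against `Rng` stays free of `Result` plumbing at the cost of a possible panic.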