From 3d2a06b6324166ff5e3796984a9ae41e9d81d395 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Thu, 20 Nov 2025 17:02:21 -0800 Subject: [PATCH 01/42] Add the unnamed_variants RFC --- text/0000-unnamed-variants.md | 1885 +++++++++++++++++++++++++++++++++ 1 file changed, 1885 insertions(+) create mode 100644 text/0000-unnamed-variants.md diff --git a/text/0000-unnamed-variants.md b/text/0000-unnamed-variants.md new file mode 100644 index 00000000000..35032741c10 --- /dev/null +++ b/text/0000-unnamed-variants.md @@ -0,0 +1,1885 @@ +- Feature Name: (`unnamed_variants`) +- Start Date: +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: + [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary + +Enable ranges of enum discriminants to be reserved ahead of time, requiring +all users of that enum to consider those values as valid. This includes within +the declaring crate. + +`_ = RANGE` is an _unnamed variant_ definition. It specifies that enum +discriminants in `RANGE` are valid. It is sound to construct unnamed variants +with `unsafe`, and to handle them over FFI. If there is no invalid discriminant +for an enum, it becomes an _open enum_. If it is [unit-only], it can then be +`as` cast from its explicit underlying integer. + +[unit-only]: https://doc.rust-lang.org/reference/items/enumerations.html#r-items.enum.unit-only + +# Motivation + +Enums in Rust have a _closed_ representation, meaning the only valid +representations of that type are the variants listed, with any violation of this +being [Undefined Behavior][ub]. This is the right default for Rust, since it +enables niche optimization and ensures values have a known state, limiting +unnecessary or dangerous code paths. + +[ub]: https://doc.rust-lang.org/reference/behavior-considered-undefined.html + +However, a closed enum is not always the best choice for systems programming. +The issue lies with compatibility between existing binaries. 
There are many +cases in which code is expected to handle not-yet-known enum values as a +non-error. + +Consider a complex system that initially uses this `TaskState` enum to +communicate: + +```rust +#[repr(u8)] +#[derive(Clone, Copy, Debug, PartialEq, Eq)] +#[non_exhaustive] +/// `TaskState` v1 +pub enum TaskState { + Stopped = 0, + Running = 1, +} +``` + +`non_exhaustive` is specified for forwards compatibility, since it should be a +non-breaking change for variants to be added to `TaskState`. This works by +requiring foreign crates to include a wildcard branch when `match`ing. Once a +new `Paused` variant is added to `TaskState`, any code that previously compiled +when using `TaskState` will continue to do so. However, if any part of the +system is _not_ recompiled, that old code will see the `Paused` variant as +invalid. + +```rust +#[repr(u8)] +#[derive(Clone, Copy, Debug, PartialEq, Eq)] +#[non_exhaustive] +/// `TaskState` v2 +pub enum TaskState { + Stopped = 0, + Running = 1, + // A new valid discriminant for `TaskState` has been introduced! + Paused = 2, +} +``` + +What if it isn't feasible to recompile **every** part of the system that uses +the enum in order to avoid the breaking change? + +```rust +/// `TaskState` v1 reserves discriminants instead of using `non_exhaustive`. +#[repr(u8)] +#[derive(Clone, Copy, Debug, PartialEq, Eq)] +pub enum TaskState { + Stopped = 0, + Running = 1, + // There are reserved variants for the rest of the discriminants: + // The `_` resembles a wildcard seen when `match`ing. + _ = .., +} +``` + +If every binary is built against this definition, adding `Paused = 2` later is +not a breaking change for binaries that were compiled before the addition. The +`_ = ..` has already required _every_ exhaustive `match` of `TaskState`, +including in the defining crate, to handle the case where it's not one of the +currently-named variants. 
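
For comparison, here is a minimal sketch of what a v1 binary has to do in today's Rust, without unnamed variants. The `TryFrom<u8>` impl is an illustrative assumption and not part of this RFC: the raw byte must be validated before it becomes a `TaskState`, and a value written by a newer binary (such as `2` for `Paused`) can only be rejected.

```rust
/// `TaskState` v1 as a closed enum: only 0 and 1 are valid discriminants.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[non_exhaustive]
pub enum TaskState {
    Stopped = 0,
    Running = 1,
}

// Hypothetical helper (not from the RFC): checked decoding of a raw byte
// received from another, possibly newer, binary.
impl TryFrom<u8> for TaskState {
    type Error = u8;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(TaskState::Stopped),
            1 => Ok(TaskState::Running),
            // A v1 binary cannot represent `Paused = 2`. Transmuting the
            // byte instead of rejecting it would be Undefined Behavior.
            unknown => Err(unknown),
        }
    }
}
```

With this closed definition, `TaskState::try_from(2u8)` returns `Err(2)` in the old binary, so the unknown state becomes an error path instead of a value that can be stored and passed along.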
+ +## Protobuf + +Protocol Buffers (Protobuf), a language-neutral serialization mechanism, is +designed to be forwards and backwards compatible when extending a schema. +Initially, it defined all of its enums as closed. However, this caused confusing +and often incorrect behavior with `repeated` enums, and so the `proto3` syntax +[switched to open enums][protobuf-history]. Handling unknown values +transparently comes up often in microservices where incremental rollouts cause +schema version skew. + +[protobuf-history]: https://protobuf.dev/programming-guides/enum/#history + +Protobuf generates code for target languages from a schema. On C++, it can +directly generate an `enum` - C++ enums are open since it's valid to +`static_cast` an `enum` from its backing integer. However, on Rust, the current +implementation simulates an open enum by using an integer newtype with +associated constants for each variant. + +While this allows Protobuf enums in Rust to be used _mostly_ like enums, this is +a suboptimal experience. + +### Newtype integers are bad for enumeration + +When the point of a type is to give an integer a set of well-known names (like +in C++), a newtype integer isn't as ergonomic to use as an `enum`: + +- It is arduous to read the generated definition - the variants are inside of an + `impl` instead of next to the name. It hides the type's nature as an enum. +- It's invalid to `use` the pseudo-variants like with `use EnumName::*`. +- `Debug` derives are less useful. +- The third-party macro ecosystem built around enums can't be used. +- Rust is a systems language that can move data around efficiently, and so + first-class support for named integers is valuable for embedded programmers. +- Code analysis and lints specific to enums are unavailable. + - No "fill match arms" in rust-analyzer. + - The [`non-exhaustive patterns` error][E0004] lists only integer values, and + cannot suggest the named variants of the enum. 
+ - The unstable `non_exhaustive_omitted_patterns` lint has no way to work with + this enum-alike, even though treating it like a `non_exhaustive` enum would + be more helpful. +- Generated rustdoc is less clear (the pseudo-enum is grouped with `struct`s). +- In order for a pseudo-variant name to match the normal style for an enum + variant name, `allow(non_upper_case_globals)` is required. + +[E0004]: https://doc.rust-lang.org/stable/error-index.html#E0004 + +If Protobuf instead declared generated Rust enums with a `_ = ..` variant, users +could have a first-class enum experience with compatible open semantics. + +## C interop + +A closed `#[repr(C)]` field-less `enum` is [hazardous][repr-c-field-less] to +use when interoperating with C, mostly because it is so easy to trigger +Undefined Behavior when unknown values appear. In C, it is idiomatic to do an +unchecked cast from integer to enum. So, even if one ensures that the C and Rust +libraries are compiled at the same time, they must also audit the C source to +ensure that unknown values cannot be exposed to Rust. + +[repr-c-field-less]: https://doc.rust-lang.org/reference/type-layout.html#reprc-field-less-enums + +With unnamed variants, the current guidance surrounding sharing enums with C can +thus be simplified greatly: add a `_ = ..` variant and UB from invalid values +is no longer a concern. + +`bindgen` has [multiple ways][bindgen-enum-variation] to generate Rust that +corresponds to a C enum, the default being to define a series of `const` items. +Its best-effort logic to determine the backing integer type for a C enum does +not always match that of `repr(C)` on a Rust `enum`. A future version of +`bindgen` could use this feature to add a `_ = ..` variant to a Rust `enum` by +default, instead of exposing a less-effective `non_exhaustive` attribute. 
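
As a point of reference, the integer-newtype workaround discussed in the Protobuf section can be sketched in today's Rust roughly as follows. The names `Color` and `color_name` are illustrative assumptions, not taken from any generated code.

```rust
/// Open "enum" emulated with an integer newtype: every `u32` is a valid value.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Color(pub u32);

// Needed so the pseudo-variants can use enum-style names.
#[allow(non_upper_case_globals)]
impl Color {
    pub const Red: Color = Color(0);
    pub const Green: Color = Color(1);
}

/// Names a color, with a fallback for values this build does not know about.
pub fn color_name(color: Color) -> &'static str {
    match color {
        Color::Red => "red",
        Color::Green => "green",
        // Required: the compiler sees a `u32`, not a list of variants.
        _ => "unknown",
    }
}
```

This is sound for any value C hands over, but it shows the costs listed earlier: the pseudo-variants live in an `impl`, the derived `Debug` output is `Color(0)` instead of `Red`, and tooling treats the type as a `struct`.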
+ +[bindgen-enum-variation]: https://docs.rs/bindgen/0.72.1/bindgen/enum.EnumVariation.html + +## Dynamic Linking + +Dynamically linked libraries, Rust or otherwise, are prone to ABI compatibility +breakage. + +Ensuring ABI compatibility when extending a library requires extra care. While +`non_exhaustive` grants API compatibility as variants are added, it [does _not_ +provide ABI compatibility][non-exhaustive-ub]. By reserving discriminants for +future extensions to an enum, libraries can choose to remain ABI +forwards-compatible as new variants are added. + +Projects like Redox and relibc would use this feature for this reason among +others listed. + +[non-exhaustive-ub]: https://github.com/rust-lang/rust-bindgen/issues/1763 + +## Embedded syscalls + +TockOS is an embedded OS with a separate user space and kernel space. Its +syscall ABI defines that kernel error codes are between 1 and 1024. It's highly +desirable to keep the `0` niche available for `Result<(), ErrorCode>`, so the +user space library defines an [`ErrorCode` enum][libtock-errorcode] with 14 +normal variants and 1010 "reserved" variants that will eventually be renamed. +This has drawbacks: + +- It clutters the enum definition. +- rust-analyzer's "Fill match arms" inserts a new match arm for each of the + reserved names, even though a single wildcard branch would be more + appropriate. +- Since the reserved discriminants have named variants, there's nothing + preventing users from using the reserved name. There is no perfect way to + claim a reserved discriminant without breaking the API. + - Declaring an associated `const` is the way to prevent an API breakage. + - Moving a reserved variant name like `N00014` to a `deprecated` associated + `const` is better for readability, but breaks any user that wrote + `use ErrorCode::N00014`. 
+ - Declaring the new variant name as an associated `const` is harder to read, + doesn't interact with code analyzers, and doesn't let users write + `use ErrorCode::NewVariant`. + +[libtock-errorcode]: https://github.com/tock/libtock-rs/blob/master/platform/src/error_code.rs#L30-L33 + +## Zero-copy deserialization + +A common pattern on embedded systems is to read data structures directly from +a `[u8]`, facilitated by libraries like +[`zerocopy`][zerocopy-frombytes-derive] or +[`bytemuck`][bytemuck-checkedbitpattern]. In order to do this, the bytes in +flash must always be validated to be one of the known discriminants. + +This scales poorly for performance and code bloat as more enums are added to be +deserialized in a message. It is more flexible to defer wildcard branches for +unknown discriminants to the point when the enum is `match`ed on, rather than +up-front during deserialization. When these checks are undesirable, ergonomics +must be sacrificed for compatibility and performance by using an integer +newtype. + +[bytemuck-checkedbitpattern]: https://docs.rs/bytemuck/latest/bytemuck/checked/trait.CheckedBitPattern.html +[zerocopy-frombytes-derive]: https://docs.rs/zerocopy/0.6.1/zerocopy/derive.FromBytes.html + +## Restricted range integers + +Unnamed variants can be used to define integers that are statically restricted +to a particular range, including with niches. + +```rust +macro_rules! 
make_ranged_int { + ($name:ident : $repr:ty; $($range:tt)*) => { + #[repr($repr)] + enum $name { + _ = $($range)*, + } + impl TryFrom<$repr> for $name { + type Error = (); + fn try_from(val: $repr) -> Result<$name, ()> { + match val { + // SAFETY: `val` is a valid discriminant for `$name` + $($range)* => Ok(unsafe { mem::transmute(val) }), + _ => Err(()), + } + } + } + impl From<$name> for $repr { + fn from(val: $name) -> $repr { + val as $repr + } + } + }; +} +make_ranged_int!(FuelLevel: u32; 0..=100); + +assert!(size_of::<FuelLevel>() == size_of::<Option<FuelLevel>>()); +assert_eq!(FuelLevel::try_from(10).unwrap() as u32, 10); +assert!(FuelLevel::try_from(101).is_err()); +``` + +With other extensions, this could even be generic: + +```rust +trait EnumDiscriminant { + type Ranged<const RANGE: RangeInclusive<Self>>; +} + +impl EnumDiscriminant for u32 { + type Ranged<const RANGE: RangeInclusive<u32>> = RangedU32<RANGE>; +} + +#[repr(u32)] +enum RangedU32<const RANGE: RangeInclusive<u32>> { + _ = RANGE, +} + +type Ranged<T, const RANGE: RangeInclusive<T>> = <T as EnumDiscriminant>::Ranged<RANGE>; + +type FuelLevel = Ranged<u32, { 0..=100 }>; +``` + +[Pattern types][pattern types] are a more direct way to express this. + +[pattern types]: https://github.com/rust-lang/rust/pull/107606 + +# Guide-level explanation + +Enums have a _closed_ representation by default, meaning that any enum value +must be represented by one of the listed variants. Constructing any enum value +with an unassigned discriminant is immediate [Undefined Behavior][ub]: + +```rust +#[repr(u32)] // Fruit is represented with specific discriminants of `u32`. +enum Fruit { + Apple, // Apple is represented with 0u32. + Orange, // Orange is represented with 1u32. + Banana = 4, // Banana is represented with 4u32. +} +// Undefined Behavior: 5 is not a valid discriminant for `Fruit`! +let fruit: Fruit = unsafe { core::mem::transmute(5u32) }; + +// Rust utilizes these invalid discriminants for compiler-dependent +// optimization: +assert_eq!(unsafe { mem::transmute::<Option<Fruit>, u32>(None) }, 2u32); +``` + +However, by declaring an **unnamed variant**, the discriminant `5` is _reserved_ +and becomes sound to transmute from. 
+ +```rust +#[repr(u32)] // An explicit repr is required to declare an unnamed variant. +enum Fruit { + Apple, // Apple is represented with 0u32. + Orange, // Orange is represented with 1u32. + Banana = 4, // Banana is represented with 4u32. + _ = 5, // Some future variant will be represented with 5u32. +} +// SAFETY: 5 is a reserved discriminant for `Fruit`. +let fruit: Fruit = unsafe { core::mem::transmute(5u32) }; + +// `fruit` is not any of the named variants. +assert!(!matches!(fruit, Fruit::Apple | Fruit::Orange | Fruit::Banana)); + +// These are both rejected: `Fruit::_` cannot be used to construct a reserved +// discriminant or to pattern match on one. +// assert!(!matches!(fruit, Fruit::_)); +// let fruit = Fruit::_; +``` + +By introducing this special variant, all users of `Fruit` must include a +wildcard branch when `match`ing, including within the declaring crate. Think of +the `_ = 5` as declaring that "discriminant `5` goes in the `_` branch when +`match`ing". There's no safe way to construct a `Fruit` from a `5`, but it can +be `transmute`d or received over FFI. + +```rust +match fruit { + Fruit::Apple | Fruit::Orange | Fruit::Banana => println!("Known fruit"), + // Must be included, even in the crate that defines `Fruit`. + x => println!("Unknown fruit: {}", x as u32), +} +``` + +An unnamed variant accepts a range as its discriminant expression, which ensures +each discriminant in the range is reserved and valid to use. + +```rust +#[repr(u32)] // Fruit is represented with specific discriminants of `u32`. +enum Fruit { + Apple, // Apple is represented with 0u32. + Orange, // Orange is represented with 1u32. + Banana = 4, // Banana is represented with 4u32. + _ = 3..=10, // 3 through 10 inclusive are valid discriminants for `Fruit`. +} +// SAFETY: 7 is a reserved discriminant for `Fruit` +let fruit: Fruit = unsafe { core::mem::transmute(7u32) }; +``` + +By using `..` as an unnamed variant range, all bit patterns for the enum become +valid. 
It is now an _open enum_ and can be constructed from its underlying +representation via `as` cast: + +```rust +#[derive(PartialEq, PartialOrd)] +#[repr(u32)] // Fruit is represented by any `u32` - it is an *open enum*. +enum Fruit { + Apple, // Apple is represented with 0u32. + Orange, // Orange is represented with 1u32. + Banana = 4, // Banana is represented with 4u32. + _ = .., // The rest of the discriminants in `u32` are reserved. +} +// Using an `as` cast from `u32`. +let fruit = 3 as Fruit; + +// Does not match any of the known variants. +assert!(!matches!(fruit, Fruit::Apple | Fruit::Orange | Fruit::Banana)); + +// `fruit` preserves its value casting back to `u32`. +assert_eq!(fruit as u32, 3); + +// `derive(PartialOrd, PartialEq)` works by discriminant as usual: +assert!(5 as Fruit > fruit); +assert!(3 as Fruit == fruit); +assert!(1 as Fruit == Fruit::Orange); + +// error: incompatible cast: `Fruit` must be cast from a `u32` +// help: to convert from `isize`, perform a conversion to `u32` first: +// let fruit2 = u32::try_from(5isize).unwrap() as Fruit; +let fruit2 = 5isize as Fruit; +``` + +This open enum is much like a `struct Fruit(u32)`, except it is treated as an +enum by IDEs and developers. + +## Interaction with `#[non_exhaustive]` + +An enum declared both `non_exhaustive` and with an unnamed variant is rejected. +On a field-less enum, it is not a breaking change to replace a +`#[non_exhaustive]` declared on the enum with a contained unnamed variant. +Unnamed variants and `#[non_exhaustive]` both declare that future variants of an +enum may be added as the type evolves. + +`non_exhaustive` affects API semver compatibility: + +- It is flexible in how new variants are represented. +- It does _not_ affect what discriminants are currently valid to represent. +- Crates must be recompiled to use new enum variants. +- It affects _only_ foreign crates. 
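
The last point can be checked in today's Rust. In this minimal sketch (the `Mode` enum and `mode_name` function are illustrative names), the `match` sits in the same crate as the `non_exhaustive` enum, so it compiles without a wildcard arm:

```rust
#[non_exhaustive]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    Fast,
    Slow,
}

/// Compiles without a wildcard arm because it lives in the defining crate.
/// A foreign crate writing the same `match` would be rejected.
pub fn mode_name(mode: Mode) -> &'static str {
    match mode {
        Mode::Fast => "fast",
        Mode::Slow => "slow",
    }
}
```

Adding a variant to `Mode` later would therefore break `mode_name` at compile time, which is acceptable because the defining crate is recompiled together with the enum.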
+ +By contrast, an unnamed variant affects API _and_ ABI semver compatibility: + +- It reserves specific ranges of discriminants. +- These reserved discriminants are valid to represent without naming the future + variants that use them. +- Crates can manipulate these unnamed enum variants without recompilation. +- It affects all crates, including the declaring one. + +For enums that have relevant discriminant values, an unnamed variant may be the +better choice. This is often the case for enums declaring an explicit `repr`. + +# Reference-level explanation + +## Unnamed variants + +An **unnamed variant** is an enum variant with `_` declared for its name. It is +assigned to one or many **reserved discriminants**. These discriminants are +valid for the enum, and may be assigned to a named variant in the future. It is +valid to `transmute` to an enum type from a reserved discriminant. + +An unnamed variant does not declare an identifier scoped under the enum name, +unlike a named variant. `EnumName::_` remains an invalid expression and pattern. + +An unnamed variant may be specified more than once on the same enum. It is valid +to reserve multiple ranges of discriminants. Those ranges may be discontiguous. + +An explicit `repr(Int)` is required on an enum to declare an unnamed variant. +`Int` is one of the primitive integers or `C`. If it is `C`, then `Int` below is +`isize`. An unnamed variant must specify a discriminant expression with one of +these types: + +- `Int` + - Reserves a particular discriminant value. + - The discriminant must not be assigned to another variant of the enum - + whether named or unnamed. + + ```rust + // error: discriminant value `1` assigned more than once + #[repr(u32)] + enum Color { + Red, + Green, + Blue, + _ = 1, + } + ``` + +- `start..end` (`core::ops::Range`) or\ + `start..=end` (`core::ops::RangeInclusive`) + - Ensures every discriminant value in the range is reserved. 
+ - Named variants have higher precedence than unnamed variants when assigning + discriminants to variants. + + ```rust + #[repr(u32)] + enum HttpStatusCode { + Ok = 200, + NotFound = 404, + // Ensures the discriminants in 100..=599 are valid for Self. + // Actually reserves 100..=199, 201..=403, and 405..=599. + _ = 100..=599, + } + ``` + + - The range must not overlap with discriminants assigned to unnamed variants. + Multiple unnamed variants have equal claim to a discriminant value. + + ```rust + #[repr(u8)] + // error: discriminant value `10` assigned more than once + enum Foo { + X = 0, + _ = 1..=10, + _ = 10, + } + + // error: discriminant values `10..=14` assigned more than once + #[repr(u8)] + enum Bar { + X = 0, + _ = 1..20, + Y = 20, + _ = 10..15, + } + ``` + + - The range must be non-empty. + + ```rust + #[repr(u8)] + enum Foo { + X = 0, + // error: empty range assigned to `_` variant + // help: variant has discriminant range `0..1` + _ = Self::X..Self::Y, + Y = 1, + } + + #[repr(isize)] + enum Bar { + X = 2, + // error: empty range assigned to `_` variant + // help: variant has discriminant range `2..0` + _ = Self::X..Self::Y, + Y = 0, + } + ``` + + - It is usually a mistake to specify an empty range. + - Allowing this would enable this peculiar situation: + + ```rust + #[repr(u8)] + enum Foo { + X = CONST1, + _ = (CONST1 + 1)..CONST2, + Y = CONST2, + } + ``` + + If `CONST1 + 1 >= CONST2`, then `Foo` has no reserved discriminants, and + thus no wildcard arm is needed in a `match`, even though an unnamed + variant is syntactically present. + - An empty or negative range could accidentally cause UB if fewer + discriminants are reserved than expected. + - If edge cases are found that necessitate allowing this, this can be made + a `deny`-by-default lint in the future. + - There must be at least one discriminant available to reserve in the range. 
+ + ```rust + #[repr(u8)] + enum Foo { + X, + Y, + // error: all discriminants in range `0..=1u8` already assigned + // help: `0` is assigned here: `X` + // help: `1` is assigned here: `Y` + _ = 0..2, + } + ``` + + - Thus, an unnamed variant cannot be specified on an enum that is already + open: + + ```rust + #[repr(u8)] + enum NamedU8 { + James = 0, + Fernando = 1, + Sally = 2, + // ... Named variant for every other u8 ... + Jolene = 255, + + // error: all discriminants in range `0..=255` already assigned + // help: `0` is assigned here: `James = 0` + // help: `255` is assigned here: `Jolene = 255` + _ = .., + } + ``` + +- `start..` (`core::ops::RangeFrom`) + - Equivalent to `start..=Int::MAX`. +- `..end` (`core::ops::RangeTo`) + - Equivalent to `Int::MIN..end`. +- `..=end` (`core::ops::RangeToInclusive`) + - Equivalent to `Int::MIN..=end`. +- `..` (`core::ops::RangeFull`) + - Equivalent to `Int::MIN..=Int::MAX`. + - Reserves the rest of the discriminants for `Int`. This always makes an enum + open without consideration for named variants' discriminants. + - Because unnamed variants cannot have conflicting discriminants, this is the + only unnamed variant allowed on the enum when used. It is called the enum's + _open variant_. + + ```rust + // error: discriminant value `1` assigned more than once + // help: an `_` variant assigned to `..` forbids other `_` variants + #[repr(u8)] + enum Foo { + X = 0, + _ = 1, + Y = 2, + _ = .., + } + ``` + +### Type Inference + +The discriminant expression for an unnamed variant has its type inferred as if +it were an argument to a generic function accepting the valid types for the +representation integer: + +```rust +#[repr(u32)] +enum X { + // {integer} infers as `u32`, `{integer}..{integer}` as `Range<u32>`, etc. + _ = validate::<u32, _>(10), + _ = validate::<u32, _>(10..20), + _ = validate::<u32, _>(20..=30), + // ... +} +const fn validate<Int, T: ReserveDiscriminants<Int>>(x: T) -> T { x } +trait ReserveDiscriminants<Int> {} +impl ReserveDiscriminants<u32> for u32 {} +// ... 
impl ReserveDiscriminants<Int> for Int {} ... +impl<Int> ReserveDiscriminants<Int> for Range<Int> {} +impl<Int> ReserveDiscriminants<Int> for RangeInclusive<Int> {} +impl<Int> ReserveDiscriminants<Int> for RangeFrom<Int> {} +impl<Int> ReserveDiscriminants<Int> for RangeTo<Int> {} +impl<Int> ReserveDiscriminants<Int> for RangeToInclusive<Int> {} +impl<Int> ReserveDiscriminants<Int> for RangeFull {} +``` + +### `repr(C)` behavior + +`repr(C)` enums have special semantics in Rust because the discriminant +expression type, `isize`, is not the same as the actual backing integer. These +enums are ordinarily backed by a `ffi::c_int`, but if any of the assigned +discriminants cannot fit, a larger backing integer is chosen that can represent +all of them. + +> Since this behavior is fraught with ABI mismatches, this is going to change +> to [forbid enums larger than `c_int` or `c_uint`][enum-size-constrain]. + +Sometimes this is overridden by the system's ABI. On some rarer platforms, +`repr(C)` enums start as small as 1 byte, smaller than the C `int`. The behavior +is otherwise the same. + +[enum-size-constrain]: https://github.com/rust-lang/rust/pull/147017 + +The same rules apply for discriminants assigned to unnamed variants: + +```rust +#[repr(C)] +enum Small { + X = 1, + _ = 2..10, +} + +// Named and unnamed variants can both grow a `repr(C)` enum. +#[repr(C)] +enum Big1 { + X = 1, + _ = isize::MAX, +} + +#[repr(C)] +enum Big2 { + X = 1, + _ = 2, + Y = isize::MAX, +} + +// On x86_64-unknown-linux-gnu. +const _: () = assert!( + size_of::<Small>() == 4 && + size_of::<Big1>() == 8 && + size_of::<Big2>() == 8 +); +``` + +The unbounded end of a discriminant range never affects the backing integer of a +`repr(C)` enum. When a range with an unbounded end (`start..`, `..end`, +`..=end`, `..`) is used as an unnamed variant's discriminant expression in a +`repr(C)` enum, the set of discriminants that is reserved by that unbounded end +is dependent on the other variants' discriminants. + +```rust +#[repr(C)] +enum SmallNonnegative { + X = 0, + // Reserves `1..=c_int::MAX`. 
+ _ = 1.., +} + +#[repr(C)] +enum BigOpen1 { + X = isize::MAX, + // Reserves `isize::MIN..isize::MAX`. + _ = .., +} + +#[repr(C)] +enum BigOpen2 { + // Reserves `isize::MIN..0`. + _ = ..0, + _ = 0..=isize::MAX, +} + +// On x86_64-unknown-linux-gnu. +const _: () = assert!( + size_of::<SmallNonnegative>() == 4 && + size_of::<BigOpen1>() == 8 && + size_of::<BigOpen2>() == 8 +); +``` + +This behavior means that it is sound to expose a C enum defined like this: + +```c +enum Foo { + Name1 = Value1, + Name2 = Value2, + // etc. +}; +``` + +as this Rust enum, regardless of the discriminant values assigned: + +```rust +#[repr(C)] +enum Foo { + Name1 = Value1, + Name2 = Value2, + // etc. + + // The rest of the discriminants for an enum with the named variants + // are reserved and valid. Unchecked casts can't invoke UB. + _ = .., +} +``` + +### Grammar changes + +[EnumVariant] is extended to allow an underscore instead of a variant's name: + +```text +EnumVariant -> + OuterAttribute* Visibility? + (IDENTIFIER | `_`) ( EnumVariantTuple | EnumVariantStruct )? + EnumVariantDiscriminant? +``` + +[EnumVariant]: https://doc.rust-lang.org/reference/items/enumerations.html#grammar-EnumVariant + +### No field data + +This RFC only defines adding unnamed variants to field-less enums, leaving +unnamed variants on enums with field data as future work. + +### `non_exhaustive` + +The `non_exhaustive` attribute on enums and unnamed variants are mutually +exclusive: + +```rust +#[non_exhaustive] +#[repr(u8)] +enum Color { + Red = 0, + Green = 1, + // error: An `_` variant cannot be specified on a `non_exhaustive` enum. + // help: remove the `#[non_exhaustive]` + _ = 2, +} +``` + +An unnamed variant is more impactful than `non_exhaustive`, since it also +affects the declaring crate - the enum is "universally non-exhaustive". + +### Compatibility + +Given enum versions A and B with some change between them: + +- A change is forwards-compatible if a library designed for enum version A can + use A or B. 
+- A change is backwards-compatible if a library designed for enum version B can + use A or B. +- A change is fully-compatible if it is both forwards and backwards compatible. +- A change is API compatible if the change does not affect static compilation + using a single enum source, either A or B. +- A change is ABI compatible if the change does not affect dynamically linked + libraries compiled using enum versions A and B (with the same Rust compiler). + +It is an API and ABI fully-compatible change to: + +- Add a named variant to a field-less enum using a discriminant that was + previously reserved. + - When doing this, removing the last unnamed variant may cause warnings + for unused code in client libraries, as a wildcard branch is no longer + required. This can be avoided by then adding `#[non_exhaustive]` to the + enum. + +It is an API fully-compatible and ABI backwards-compatible change to: + +- Replace `#[non_exhaustive]` on an enum with an unnamed variant. + - This may require changes to the defining crate to add wildcard branches. +- Add another reserved discriminant, if an unnamed variant already exists on the + enum. + +It is an API and ABI backwards-compatible change to: + +- Add an unnamed variant to an enum without `#[non_exhaustive]` or another + unnamed variant. The same caveat regarding unused wildcard branches applies. + +### Applicable lints + +#### Truncatable ranges + +A new `warn`-by-default lint is produced if an unnamed variant's discriminant +range can be shortened to avoid overlapping with named variants. + +Let `start..=end` be the range of discriminants that an unnamed variant +definition is assigned to, regardless of the actual range type used. An +`overlong_discriminant_ranges` lint is produced if all of the below are true: + +- The bound is specified as a range expression in the variant's discriminant + expression, and not as an identifier or block. 
+- Every discriminant in some prefix or suffix of the range is already assigned. + That is, there exists some `n ≥ 0` such that the sub-range + `start..=(start + n)` or `(end - n)..=end` has every discriminant in that + range assigned to a named variant. Let either sub-range for which this is + true be called an "overlong side". +- An overlong side is specified with a literal integer, and not implicitly + defined by an unbounded range. +- The prefix is an overlong side _or_ the following variant, if any, has an + explicit discriminant. + +```rust +#[repr(u32)] +enum LeftSide { + X, + Y, + Z, + // warning: discriminant range for variant can be shortened + // help: shorten the range: `3..` + // note: `#[warn(overlong_discriminant_ranges)]` on by default + _ = 0.., +} +#[repr(u32)] +enum RightSide { + X, + Y = 9, + Z, + // warning: discriminant range for variant can be shortened + // help: shorten the range: `(Self::X as u32)..9` + // note: `#[warn(overlong_discriminant_ranges)]` on by default + _ = (Self::X as u32)..10, +} +#[repr(u32)] +enum BothSides { + X, + Y, + Z = 10, + // warning: discriminant range for variant can be shortened + // help: shorten the range: `2..=9` + // note: `#[warn(overlong_discriminant_ranges)]` on by default + _ = 0..=10, +} +#[repr(u32)] +enum NonLiteralOverlongSide { + X, + // A warning is not produced, as the overlong side is a non-literal. + // This was likely intended. + _ = (Self::X as u32)..10, +} +#[repr(u32)] +enum UnboundedOverlongSide { + X = 0, + // A warning is not produced, as the overlong side is unbounded. + _ = ..10, +} +#[repr(u32)] +enum ImplicitNextDiscriminant { + // A warning is not produced, as the following named variant depends on the + // overlong side's discriminant. 
+ _ = 5..=10, + X, // 11 + Y = 10, +} +``` + +#### Gap of length one caused by an exclusive range + +The existing [`non_contiguous_range_endpoints`] lint is produced if: + +[`non_contiguous_range_endpoints`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_CONTIGUOUS_RANGE_ENDPOINTS.html + +- There exists some unnamed variant assigned to a `start..end` or `..end` + discriminant expression, and +- `end` is not a valid discriminant for the enum, and +- `end + 1` is a valid discriminant for the enum. + +```rust +#[repr(u32)] +enum Foo { + // warning: multiple ranges are one apart + // help: this range doesn't match `100` because `..` is an exclusive range + // help: use an inclusive range instead: `80..=100` + _ = 80..100, + X = 101, + // ^ this could appear to continue range `80..100`, but `100` isn't included + // by either of them +} + +#[repr(u32)] +enum Bar { + // warning: multiple ranges are one apart + // help: this range doesn't match `99` because `..` is an exclusive range + // help: use an inclusive range instead: `..=99` + _ = ..99, + _ = 100..200, + // ^ this could appear to continue range `..99`, but `99` isn't included + // by either of them +} +``` + +#### Forgot to mention a named variant + +The unstable [`non_exhaustive_omitted_patterns`] `allow`-by-default lint is +produced if a `match` on an enum with reserved discriminants mentions some, but +not all, of the named variants. + +[`non_exhaustive_omitted_patterns`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_EXHAUSTIVE_OMITTED_PATTERNS.html + +This uses the same name as the similar lint for `non_exhaustive` because it is +burdensome to require developers to remember two different lints for such +similar use cases. This requires updating the documentation of the lint to +reference unnamed variants as well as `non_exhaustive`. + +It may also be prudent to rename the lint before stabilization to include +unnamed variants. 
+ +```rust +#[repr(u32)] +enum Bar { + A, + B, + _ = .., +} +let b = Bar::A; + +// warning: some variants are not matched explicitly +// pattern `Bar::B` not covered +// help: ensure that all variants are matched explicitly by adding the +// suggested match arms +// note: the matched value is of type `Bar` and the +// `non_exhaustive_omitted_patterns` attribute was found +#[warn(non_exhaustive_omitted_patterns)] +let name = match b { + Bar::A => "A", + _ => "unknown", +}; +``` + +### Next variant's implicit discriminant + +When a named variant without an explicit discriminant follows an unnamed +variant, the assigned implicit discriminant is the next integer after the +declared discriminant range for that unnamed variant. If the unnamed variant is +assigned to a single integer, it is the integer after that one. + +```rust +#[repr(u32)] +enum Foo { + X, + _ = 5, + Y, +} +assert_eq!(Foo::Y as u32, 6); + +#[repr(u32)] +enum Bar { + _ = ..10, + X, + Y = 9, +} +assert_eq!(Bar::X as u32, 10); + +#[repr(u32)] +enum Baz { + _ = 2..=10, + X, +} +assert_eq!(Baz::X as u32, 11); + +#[repr(u8)] +enum Overflow { + _ = 10.., + // error: enum discriminant overflowed + // overflowed on value after 255 + X, +} +``` + +### Non-literal discriminant expression + +A non-literal range or integer is allowed for an unnamed variant. + +```rust +const VALID_FOO: Range<u32> = 10..100; + +#[repr(u32)] +enum Foo { + X = 10, + Y = 20, + Z = 30, + _ = VALID_FOO, +} +// SAFETY: `15` is a valid discriminant in range `VALID_FOO`. +let _: Foo = unsafe { mem::transmute(15u32) }; +``` + +### Only variant + +An unnamed variant may be the only variant for an enum. In this case, an `as` +cast or `transmute` is the only way to construct an enum value. + +```rust +#[repr(u32)] +#[derive(PartialEq, PartialOrd)] +enum NothingYet { _ = .. } +(10 as NothingYet > 5 as NothingYet) +``` + +## Open enum conversion + +An _open enum_ is defined as an `enum` for which every value of its backing +integer is a valid discriminant.
+ +- An open enum always has an explicit `repr` backing integer, or is `repr(C)`. +- An enum is open if every discriminant value for that integer is associated + with a named variant or is reserved with an unnamed variant. + - For a field-less enum, this means every initialized bit pattern is valid. +- A [unit-only] open enum may be `as` cast from its backing integer: + `2u8 as Color`. See below for `repr(C)` behavior. + - Casting from other integer types is rejected. +- If an expression with the `{integer}` inference variable type is used as the + source for an `as` cast to an open enum, it is uniquely constrained to the + backing integer type. + + ```rust + #[repr(u8)] + enum Foo { _ = .. } + let x = 10; + + // `x` must be a `u8` to be cast to `Foo` + let _ = x as Foo; + + // error: mismatched types, expected `u32`, found `u8` + // let _: u32 = x; + ``` + +### `repr(C)` open enum behavior + +The actual backing integer type for a `repr(C)` enum changes based on the +variants' numeric discriminant values as described above. + +A `repr(C)` unit-only open enum may be `as` cast from: + +- `const` expressions of type `isize`. This is so a `repr(C)` enum may always + be `as` cast from the same discriminant expression assigned to a variant. +- Any primitive explicit-width integer that is capable of representing all + variants' discriminants and does not exceed the size of the enum for the + platform. Thus any signedness cast performed to the backing integer has no + visible effect. + - This means that authors who don't know or care about short-enum platforms + can cast from `c_int` and `c_uint` to most `repr(C)` open enums, while + preventing those unexpected truncations when necessary. + +```rust +const TEN: isize = 10; + +// Must be able to represent `u8::MAX`: `u8` or `c_int` or `c_uint`. +#[repr(C)] +enum SmallUnsigned { + X = 0, + Y = TEN, + Z = 255, + _ = .., +} + +// May be backed by `c_int` or `c_uint` or `i8` or `u8`. 
+#[repr(C)] +enum Small { + X = 0, + Y = 10, + _ = .., +} + +// Must be able to represent negative numbers: `i8` or `c_int`. +#[repr(C)] +enum SmallSigned { + X = 0, + Y = 10, + Z = -10, + _ = .., +} + +// Must be able to hold `isize::MIN..=isize::MAX` which may exceed `c_int`. +#[repr(C)] +enum Big { + X = 0, + Y = TEN, + Z = 255, + _ = isize::MIN..=isize::MAX, +} + +assert!(matches!(TEN as Big, Big::Y)); +assert!(matches!(TEN as SmallUnsigned, SmallUnsigned::Y)); +assert!(matches!(TEN as SmallSigned, SmallSigned::Y)); +assert!(matches!(TEN as Small, Small::Y)); + +let zero: c_int = 0; +assert!(matches!(zero as Big, Big::X)); +// On thumbv7m-none-eabi: +// error: truncating cast to `repr(C)` open enum +// note: `SmallUnsigned` is backed by `u8`, which fallibly converts +// from `i32` +// help: try converting to `u8` first: +// `u8::try_from(zero).unwrap() as SmallUnsigned` +assert!(matches!(zero as SmallUnsigned, SmallUnsigned::X)); + +let ten: isize = 10; +assert!(matches!(ten as Big, Big::Y)); +// On x86_64-unknown-linux-gnu: +// error: truncating cast to `repr(C)` open enum +// note: `SmallUnsigned` is backed by `i32`, which fallibly converts +// from `isize` +// note: a `repr(C)` open enum may be cast from constant `isize` +// help: try converting to `i32` first: +// i32::try_from(ten).unwrap() as SmallUnsigned +assert!(matches!(ten as SmallUnsigned, SmallUnsigned::Y)); + +let byte: u8 = 255; +assert!(matches!(byte as Big, Big::Z)); +assert!(matches!(byte as SmallUnsigned, SmallUnsigned::Z)); +_ = byte as Small; +// On thumbv7m-none-eabi: +// error: truncating cast to `repr(C)` open enum +// note: `SmallSigned` is backed by `i8`, which fallibly converts +// from `u8` +// help: try converting to `i8` first: +// `i8::try_from(byte).unwrap() as SmallSigned` +_ = byte as SmallSigned; + +let signed_byte: i8 = 10; +assert!(matches!(signed_byte as Big, Big::Y)); +// On thumbv7m-none-eabi: +// error: truncating cast to `repr(C)` open enum +// note: `SmallUnsigned` is 
backed by `u8`, which fallibly converts +// from `i8` +// help: try converting to `u8` first: +// `u8::try_from(signed_byte).unwrap() as SmallUnsigned` +assert!(matches!(signed_byte as SmallUnsigned, SmallUnsigned::Y)); +_ = signed_byte as Small; +_ = signed_byte as SmallSigned; +``` + +## Interaction with the standard library + +- `derive(Debug)` formats as `EnumName(X)` when `X` is a reserved discriminant. + A change in `Debug` output is not considered an API-breaking change. +- `Default` forbids `#[default]` from being specified on an unnamed variant, + but this may change in the future. +- The derives `Clone`, `Copy`, `Eq`, `Hash`, `Ord`, `PartialEq`, and + `PartialOrd` are unaffected by unnamed variants on a field-less enum. + They all operate on discriminants, and this includes reserved discriminants. +- `mem::Discriminant` continues to operate as before, always treating + field-less enum values with the same discriminant integers as equal and + those with different discriminant integers as non-equal. + +# Drawbacks + +- The mutual exclusion with `non_exhaustive` despite having similar motivations + could be confusing to explain to new users. +- Every new feature in Rust is another thing to maintain and for users to + learn. +- Rust has not put significant efforts towards ABI compatibility in language + constructs in the past. + +## Flag enums + +It is possible to define `bitflags` style enums using `enum` syntax with +unnamed variants. However, if `BitOr` is defined on such an enum, then, rather +confusingly, `!matches!(Enum::A | Enum::B, Enum::A | Enum::B)`. This problem +exists for `bitflags` or integer newtypes that `derive(PartialEq)` today, which +is why the library defines a `bitflags_match!` macro that avoids it. + +As future work, a lint could trigger when `|` is used in a pattern with a +non-integer type that defines `BitOr` and has structural equality.
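
The pitfall described under "Flag enums" can be reproduced on stable Rust today with an integer newtype. The following is a minimal sketch, not part of the proposal: `Flags`, `pattern_matches_union`, and `equals_union` are hypothetical names chosen for illustration.

```rust
use std::ops::BitOr;

/// Hypothetical newtype flag set, standing in for a flag-style open enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Flags(pub u8);

impl Flags {
    pub const A: Flags = Flags(0b01);
    pub const B: Flags = Flags(0b10);
}

impl BitOr for Flags {
    type Output = Flags;

    fn bitor(self, rhs: Flags) -> Flags {
        Flags(self.0 | rhs.0)
    }
}

/// In pattern position, `Flags::A | Flags::B` is an or-pattern ("A or B"),
/// not a call to `BitOr::bitor`.
pub fn pattern_matches_union(value: Flags) -> bool {
    matches!(value, Flags::A | Flags::B)
}

/// In expression position, `Flags::A | Flags::B` calls the `BitOr` impl.
pub fn equals_union(value: Flags) -> bool {
    value == (Flags::A | Flags::B)
}
```

Passing `Flags::A | Flags::B` to `pattern_matches_union` compares the combined value against each constant separately, so the result is `false`, while `equals_union` returns `true` for the same input.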
+ +# Rationale and alternatives + +Unnamed variants enable a large range of discriminants to be reserved for an +enum, whether it's all or some of them. `NonZero`, and an `enum` spelling out +each discriminant are the only other ways to achieve this in stable Rust today. + +The open enum conversion from backing integer is an ergonomic benefit that is +made possible by unnamed variants. + +## Do nothing + +Why not just use an integer newtype or macro? + +The best way to write a field-less open enum in Rust today is the "newtype enum" +pattern that uses associated constants for variants. So, to make this enum open: + +```rust +enum Color { + Red, + Blue, + Black, +} +``` + +the author can write this: + +```rust +#[repr(transparent)] // Optional, but often useful +#[derive(PartialEq, Eq)] // In order to work in a `match` +struct Color(pub i8); // Alternatively, make the inner private and `impl From` + +#[allow(non_upper_case_globals)] // Enum variants are CamelCase +impl Color { + pub const Red: Color = Color(0); + pub const Blue: Color = Color(1); + pub const Black: Color = Color(2); +} +``` + +With this syntax, users of an open enum can use these variant names inside a +`match` with _mostly_ the same syntax as they would with a regular closed enum, +except there must _always_ be a wildcard branch for handling unknown values. +This syntax also provides grouping of related values and associated methods, an +advantage over module-level `const` items. + +However, this pattern has some distinct disadvantages when used to emulate +an open enum, as described in the +[Motivation](#newtype-integers-are-bad-for-enumeration) section above. + +[Pattern types][pattern types] can contrain the valid values for an integer +newtype, but do not help with the enum ergonomics issue. 
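
To make the wildcard requirement of the newtype pattern concrete, here is a small stable-Rust sketch that matches on the `Color` newtype from this section. The `color_name` function is a hypothetical helper added only for illustration.

```rust
/// The "newtype enum" from this section, with the inner integer public.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Color(pub i8);

#[allow(non_upper_case_globals)] // Enum variants are CamelCase
impl Color {
    pub const Red: Color = Color(0);
    pub const Blue: Color = Color(1);
    pub const Black: Color = Color(2);
}

/// Hypothetical helper: names the known colors and tolerates unknown ones.
pub fn color_name(color: Color) -> &'static str {
    match color {
        Color::Red => "red",
        Color::Blue => "blue",
        Color::Black => "black",
        // Required: the compiler sees an arbitrary `i8` here, not a closed
        // list of variants, so it cannot prove the arms above are exhaustive.
        _ => "unknown",
    }
}
```

Because the compiler only knows about the wrapped integer, it also cannot report which associated constants a `match` forgot to mention.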
+ +## As an `enum` attribute + +An enum could be made open by specifying it as part of its `repr`: + +```rust +#[repr(open, u8)] // requires an explicit `repr(Int)` +enum Color { + Red, + Blue, + Black +} +use Color::*; +// or an unsafe `transmute` +assert!(!matches!(3u8 as Color, Red | Blue | Black)); +``` + +This has the same interaction with `#[non_exhaustive]`. The drawbacks: + +- It's not as clear what the attribute does, in contrast to the `_ = ..` syntax + mirroring known concepts: we're introducing new valid values, `_` means + "unnamed/wildcard", and `..` means "the rest" as the discriminants. +- Unnamed variants meld well with unnamed fields in `struct`/`union` for ABI + stability. +- It is not clear why a `repr` would affect `match`/`as` behavior, even though + this does affect how it is valid to represent the type. + - There are many alternative syntaxes for this, such as + `#[non_exhaustive(repr)]` or `#[open]` / `#[open(Range)]`. All should require + a `repr(Int)` be specified. +- Allowing for a reservation of particular ranges instead of a full opening + could be done with a pattern-type-like syntax, but this is less discoverable: + + ```rust + #[repr(u8 in 1..=100)] + pub enum NonZeroU8 { + One = 1, + Two = 2, + } + ``` + +## Unbounded ranges select discriminants based on surrounding variants + +```rust +#[repr(u32)] +enum Foo { + X, + // Reserves `1..=4`. + _ = .., + Y = 5, + // Reserves `6..=10`. + _ = ..=10, +} + +enum Bar { + _ = .., + X, + _ = .., + Y = 5, +} +``` + +- This prevents the highly desirable one-line declaration that every + discriminant is valid. +- Ordinarily a variant with an explicit discriminant expression is not sensitive + to the discriminants of surrounding variants. + +Consider this enum being processed by a derive macro: + +```rust +// How does a derive-macro make this enum have no niches?
+#[repr(u8)] +enum Foo { + X = CONST1, // non-literal expressions defined elsewhere + Y = CONST2, +} +``` + +How does that macro make the `Foo` enum open? The macro developer might try to +surround the variants with `_ = ..`: + +```rust +#[repr(u8)] +enum Foo { + _ = .., + X = CONST1, // non-literal expressions defined elsewhere + _ = .., + Y = CONST2, + _ = .., +} +``` + +But what if `CONST1 > CONST2`? If this compiles then the range of discriminants +`(CONST2 + 1)..CONST1` is invalid and it's not an open enum! If it errors out, +then there's no clear way one is supposed to write the opening-macro. +Complicating the macro further can make it work, so long as empty discriminant +ranges are allowed: + +```rust +#[repr(u8)] +enum Foo { + _ = ..CONST1, + X = CONST1, + + // You need to provide your *own* `max` and `min` since it's unstable + // in `const`. + _ = min(CONST1, CONST2)..=max(CONST1, CONST2), + + Y = CONST2, + _ = .., +} +``` + +## Forbid unnamed variants' discriminants from overlapping named ones + +```rust +#[repr(u32)] +// error: discriminant `200` assigned more than once +enum HttpStatusCode { + Ok = 200, + _ = 100..=599, +} +``` + +This makes it entirely unambiguous which discriminant is assigned to which +variant, without precedence rules. However, `_ = ..` to "make it open" is still +desirable. + +- Forbidding named variant overlaps with `_ = ..` makes it nearly useless, since + it then must be the only variant for the enum. +- Giving `..` special behavior to reserve "the rest" of the variants is then + inconsistent with other ranges' behavior. + - There is precedent for `..` acting differently than other ranges, such as + when `match`ing a number or `char`. This `..`, however, is an expression and + not a pattern. + - It cannot reasonably be equivalent to `Int::MIN..=Int::MAX` without that + range allowing named variant overlap.
+ +## Declare niches instead of reserving values + +If an enum selects its discriminants such that a desirable niche exists, like +`0`, perhaps it is better to declare ranges of niches rather than reserving +discriminants? + +It can be very confusing to mix positive and negative assertions, and this would +be doing that for enum discriminants in likely a different syntax than variant +declaration. + +Unnamed variants use the same syntax to assign discriminants, except they do not +have to have a name and thus can be assigned to discontiguous ranges. + +## Discriminant ranges for named variants instead of unnamed variants + +What if instead this were valid? + +```rust +enum IpProto { + Tcp = 6, + Udp = 17, + Other = .., +} +``` + +This is not mutually exclusive with unnamed variants, but this RFC chooses to +leave reserved ranges of discriminants as anonymous to keep the feature simple. + +- It is ambiguous what value should be chosen when `IpProto::Other` is used in + an expression. +- Even with an arbitrary rule to choose a discriminant, a consistently + performant `derive(PartialEq)` that compares discriminants instead of ranges + of discriminants will result in + `matches!(o, IpProto::Other) && o != IpProto::Other`. + - A reasonable but less useful alternative is to reject expression usage as + well as `derive(PartialEq)`. +- If discontiguous ranges are allowed as above, the performance of + `matches!(o, EnumName::Variant)` gets worse as the number of variants grows. +- Adding an `Icmp = 1` variant affects `matches!(1 as IpProto, IpProto::Other)`: + it is an API-breaking change. +- A `derive` can be used to determine whether an enum's discriminant is assigned + to a named variant. +- Anonymous discriminant values are useful on their own for enum evolution. + +This can be left as future work for the language. + +## `..` at the end + +```rust +#[repr(u8)] +enum IpProto { + Tcp = 6, + Udp = 17, + + // "the rest of the variants exist" + .. 
+} +``` + +- This is less flexible than `_ = ..`, and is awkward to restrict to smaller or + discontiguous ranges. +- This resembles the [rest pattern] more than the [full range expression] that + discriminants are assigned to and the [wildcard pattern] that it requires. + +[full range expression]: https://doc.rust-lang.org/reference/expressions/range-expr.html#grammar-RangeFullExpr +[rest pattern]: https://doc.rust-lang.org/reference/patterns.html?#rest-patterns +[wildcard pattern]: https://doc.rust-lang.org/reference/patterns.html?#wildcard-pattern + +## An "other" variant carries unknown discriminants like a tuple variant + +An alternative way to specify a field-less open enum could be to write this: + +```rust +#[repr(u32)] +enum IpProto { + Tcp = 6, + Udp = 17, + + // bikeshed syntax + Other(0..6 | 7..17 | 18..=u32::MAX), +} +``` + +This would mean that the `Other` variant is a named way to refer to unlisted +values and works in pattern matching naturally, while being a zero-cost +representation: + +```rust +if let IpProto::Other(x) = proto { + // `proto` was *not* `Tcp` or `Udp`; its integer value is in `x`. +} +``` + +However, this has some problems. For one, it's peculiar for a tuple variant +syntax to not carry a payload, but a discriminant. It is also possible to +build the variant with a discriminant value, which means that it would need +to be constrained by a [pattern type][pattern types] - one that may end up +far more complicated if it overlaps with named variants. It is also an API +breaking change to move the discriminant `2` to a new named variant, since +it breaks anyone passing `2` into an `IpProto::Other` expression. + +```rust +if let IpProto::Other(x) = IpProto::Other(6) { + // This branch is not taken, since it's actually an `IpProto::Tcp`! 
+} +``` + +Instead, to get this behavior with this RFC's proposed syntax, the +author could use a third-party derive to check against the named variants, +and an `unsafe` transmute or `as` cast to construct the enum value from +an integer. This makes it clear that declaring a new named variant with an +unnamed variant's discriminant will affect the method's return value. + +```rust +#[repr(u32)] +#[derive(IsNamedVariant)] +enum IpProto { + Tcp = 6, + Udp = 17, + _ = .., +} + +assert!(!(3u32 as IpProto).is_named_variant()); +assert!((6u32 as IpProto).is_named_variant()); +``` + +## Require `non_exhaustive`, don't forbid it + +Perhaps an unnamed variant could _require_ `#[non_exhaustive]`, rather than +forbid it? This RFC opts against that, with the following considerations: + +Pros: + +- `non_exhaustive` already implies adding another wildcard branch. This could + make it easier to explain to new users by fitting the idea of "needs wildcard + branch" into one mental bucket. +- This would make the unstable allow-by-default + `non_exhaustive_omitted_patterns` lint more obviously correct to apply to + enums with unnamed variants. + +Cons: + +- It expands the scope of `non_exhaustive`: the wildcard branch required by + unnamed variants applies to the defining crate as well as foreign crates. This + could make it harder to explain to newer users. +- The variant name being an underscore _already_ implies that a wildcard branch + is needed. +- It always requires two lines to achieve ABI non-exhaustiveness. +- Consider this enum: + + ```rust + #[repr(u8)] + #[non_exhaustive] + enum OpenEnum { + X000 = 0, + X001 = 1, + // XNNN = N, + X254 = 254, + _ = 255, + } + ``` + + When adding `X255`, the `non_exhaustive` _should_ also be removed, but as of + today, an open enum gives no warning if it is `non_exhaustive`. This is even + though it would necessarily be an API and ABI-breaking change to add a new + variant by changing the `repr`.
This is non-obvious and can be avoided by + forbidding `non_exhaustive` when a valid unnamed variant exists. + +# Prior art + +_Open_ and _closed_ enums are [pre-existing industry terms][acord-xml]. + +## Enum openness in other languages + +- C++'s [scoped enumerations][cpp-scoped-enums] and C enums are both open + enums. +- C# uses [open enums][cs-open-enums], with a [proposal][cs-closed-enums] to + add closed enums for guaranteed exhaustiveness. +- Java uses closed enums. +- [Protobuf][protobuf-enum] uses closed enums with the `proto2` syntax, treating + unlisted enum values as unknown fields, and changed the semantics to open + enums with the `proto3` syntax. This was in part because of lessons learned + from protocol evolution and service deployment as described above. +- Swift uses both closed and open enums, depending on whether the enum is + `@frozen`. An `@unknown default` branch is required for open enums, the + `@unknown` being another way to achieve the design goals of the + `non_exhaustive_omitted_patterns` lint. + +[acord-xml]: https://docs.oracle.com/cd/B40099_02/books/ConnACORDFINS/ConnACORDFINSApp_DataType10.html +[cpp-scoped-enums]: https://en.cppreference.com/w/cpp/language/enum#Scoped_enumerations +[cs-open-enums]: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/enum#conversions +[cs-closed-enums]: https://github.com/dotnet/csharplang/issues/3179 +[protobuf-enum]: https://developers.google.com/protocol-buffers/docs/reference/cpp-generated#enum + +## Other crates that use open enums + +Users today are simulating open enums with other language constructs, but it's +a suboptimal experience: + +- [open-enum], written by the author of this RFC. It's a procedural + macro which converts any field-less `enum` definition to an equivalent + newtype integer with associated constants. +- Bindgen is [aware of the problem][bindgen-ub] with FFI and closed enums, and + avoids creating Rust enums from C/C++ enums because of this.
It provides an + option for newtype enums directly. +- ICU4X uses newtype enums for [certain properties][icu4x-props] which must be + forwards compatible with future versions of the enum. +- OpenTitan's [`with_unknown!`] macro also uses this pattern to create + "C-like enums". +- `winapi-rs` defines an [`ENUM`][winapi-enum] macro which generates plain + integers for simple `enum` definitions. + +The [`newtype-enum` crate][newtype-enum-crate] is an entirely different pattern +than what is described here. + +[bindgen-ub]: https://github.com/rust-lang/rust/issues/36927 +[icu4x-props]: https://github.com/unicode-org/icu4x/blob/ff1d4b370b834281e3524118fb41883341a7e2bd/components/properties/src/props.rs#L56-L106 +[newtype-enum-crate]: https://crates.io/crates/newtype-enum +[open-enum]: https://crates.io/crates/open-enum +[`with_unknown!`]: https://github.com/lowRISC/opentitan/blob/06584dc620c633e88631f97f1fc1e22c1980c21c/sw/host/ot_hal/src/util/unknown.rs#L7-L48 +[winapi-enum]: https://github.com/retep998/winapi-rs/blob/77426a9776f4328d2842175038327c586d5111fd/src/macros.rs#L358-L380 + +## `bitflags` + +The bitflags crate also uses [an unnamed value][bitflags-unnamed] with `_` to +specify valid bits without assigning a name to them. + +[bitflags-unnamed]: https://docs.rs/bitflags/latest/bitflags/macro.bitflags.html#named-and-unnamed-flags + +## `abi_stable` + +[`abi_stable::NonExhaustive`] uses an associated type to hold a typed raw +discriminant for an enum. It is not ergonomic to `match` on discriminant values +directly, but another macro could improve this. + +[`abi_stable::NonExhaustive`]: https://docs.rs/abi_stable/0.11.3/abi_stable/nonexhaustive_enum/struct.NonExhaustive.html + +## Unnamed fields + +The [Unnamed fields] RFC reserves space for future extension in a `struct` or +`union` for FFI purposes, allowing ABI to be planned ahead of time. Unnamed +variants have similar motivations, but no great workaround. 
The future work +proposed below to allow `_(payload) = discriminants` further unifies these +concepts by reserving space for `payload` to be held in the enum. + +[Unnamed fields]: https://github.com/rust-lang/rfcs/blob/master/text/2102-unnamed-fields.md + +## `repr(open)` RFC + +There's an [unmerged RFC][enum-repr-open] that defines a `repr(open)` syntax as +described in the Alternatives section above. + +[enum-repr-open]: https://github.com/madsmtm/rfcs/blob/enum-repr-no-niches/text/3803-enum-repr-open.md + +# Unresolved questions + +None. + +# Future possibilities + +## Discriminant ranges for named variants + +A future extension could allow for named variants to specify ranges as +discriminants. This bikeshed syntax avoids many of the drawbacks in the +related Alternatives section above. + +```rust +#[repr(u8)] +enum Color { + Red = 0, + Green = 1, + // Must specify a non-overlapping range, + // including if `..` is the discriminant. + Unknown = 2..=50, +} + +// This is fine. +assert_eq!(Color::Red as u8, 0); + +// error: ambiguous discriminant for `Color::Unknown` +// help: specify a discriminant with `2 as Color::Unknown` +// let c = Color::Unknown; + +// Use an `as` cast to construct `Color::Unknown` safely. +let c = 3 as Color::Unknown; +assert_eq!(c as u8, 3); + +// error: invalid discriminant for `Color::Unknown` +// help: `Color::Unknown` has the discriminant range `2..=50` +// let c = 0 as Color::Unknown; + +let d = 10u8; +// error: non-constant expression used for enum ranged variant cast +// let c = d as Color::Unknown; + +// This is fine.
+let c = const { 1 + 1 } as Color::Unknown; + +// Pattern types could extend this further: +let e = match d { + x @ 2..=50 => x as Color::Unknown, + _ => unreachable!(), +}; +assert_eq!(e as u8, 10); +``` + +## Unnamed variants on enums with field data + +Unnamed variants on enums with field data would allow library authors to plan +for future ABI compatibility by reserving discriminants and data space for an +enum. This requires significantly more documentation and care regarding ABI +stability before this can be stabilized. + +For example: + +```rust +#[repr(u32)] +pub enum Shape { + Circle { radius: f32 } = 0, + Rectangle { width: f32, height: f32 } = 1, + _ = 2..=10, +} +``` + +- This reserves discriminants `2..=10` as valid for the `Shape` enum. It's not + an ABI-breaking change to add new variants with data to `Shape` using these + discriminants, so long as it doesn't affect the layout of the `Shape`. +- `Drop` glue is forbidden for field data (for a similar reason as `union`). +- The payload bytes of `Shape` are treated as opaque and never as padding. + +By putting field data in an unnamed variant, `Shape` can specifically +reserve the size and alignment needed to hold future variants' fields: + +```rust +#[repr(u32)] +pub enum Shape { + Circle { radius: f32 } = 0, + Rectangle { width: f32, height: f32 } = 1, + + // This reserves discriminants `2..=10` and the layout to hold a + // thin pointer without breaking ABI. It's as if there were a variant + // for `&'static ()` in the enum's internal `union`. + _(&'static ()) = 2..=10, + + // Because of the above, it's not an ABI-breaking change to add this + // variant since the layout won't be affected: + // FromInfo { name: &'static ShapeInfo } = 2, +} +``` + +## Tuple-like syntax for `repr` enums + +A very useful thing this RFC enables is that replacing this: + +```rust +// The "newtype integer enum" pattern. 
+#[derive(PartialEq, Eq)] +pub struct Color(u32); +impl Color { + pub const Red: Color = Color(0); + pub const Blue: Color = Color(1); + pub const Green: Color = Color(2); +} +``` + +with this: + +```rust +#[derive(PartialEq, Eq)] +#[repr(u32)] +pub enum Color { + Red, + Blue, + Green, + _ = .., +} +``` + +is a non-breaking change for client crates. + +However, if the library initially exposed the discriminant field as `pub`, as +`bindgen`, `icu4x`, and `open-enum` do, then the migration to an open `enum` +requires that `Color(discriminant)` and `color.0` also function as originally. + +These each have their own independent utility: + +### Tuple constructor + +The enum name is a constructor `fn(Repr) -> Enum`: + +```rust +assert_eq!(Color(1), Color::Blue); +assert!(matches!( + [0, 3, 2].map(Color), + [Color::Red, _, Color::Green] +)); +``` + +- This is valid for any open enum, the same as the `as` cast from an integer. +- This mirrors the `derive(Debug)` format, is ergonomic, and is clear at the + call site. Thus it may be worth adding to Rust even if `.0` isn't. +- When should one prefer the constructor over the `as` cast? Always? + +### Discriminant field access + +`.0` provides direct access to the discriminant value: + +```rust +let mut c = Color::Blue; +assert_eq!(c.0, 1); +c.0 += 1; +assert!(matches!(c, Color::Green)); +assert_eq!(c.0, 2); +``` + +This is subjectively ugly and undiscoverable syntax to access the discriminant +of an `enum`. One possibility: when introduced, treat as deprecated and throw a +warning to recommend a better syntax than `.0` but still allow the desired +non-breaking migration. + +There are a few distinct advantages compared to `as` casting: + +- It is possible to get a reference directly to the discriminant, which can be + useful when performing lifetime-constrained zero-copy serialization. +- The type of `.0` is exactly the `repr`, and doesn't require the user to specify + a type to `as` cast to and possibly truncate.
Currently, there's no language + feature in Rust that does this - it requires a macro or codegen to guarantee. + This can cause subtle bugs, especially for `repr(C)`: + + ```rust + #[repr(C)] + enum Oops { + // On any platform where this is more than `c_int::MAX`. + TooBig = 2_147_483_649, + } + assert_eq!(Oops::TooBig as core::ffi::c_int, -2_147_483_647); + ``` + + Instead, `.0` accesses the discriminant without fear of truncation: + + ```rust + assert_eq!(Oops::TooBig.0, 2_147_483_649); + // mismatched types, expected `i32`, got `i64` + // let _: c_int = Oops::TooBig.0; + ``` + +This could be supported for _any_ enum with an explicit `repr(Int)` by having +closed enums be `unsafe` to mutate through `.0` - it's an [unsafe field]. + +```rust +#[repr(u32)] +enum X { + A = 0, + B = 1, +} +let mut x = X::A; +assert_eq!(x.0, 0); + +// SAFETY: 1 is a valid discriminant for `X` +unsafe { x.0 += 1; } + +assert!(matches!(x, X::B)); +``` + +[unsafe field]: https://rust-lang.github.io/rust-project-goals/2025h2/unsafe-fields.html + +## Extracting the integer value of the discriminant for fielded enums + +A fielded enum with `#[repr(Int)]` and/or `#[repr(C)]` is guaranteed to have its +implicit discriminant values starting from 0. However, for any given value of +that enum, there's no built-in way to safely extract the integer value of the +discriminant. The unsafe mechanism is `(&thenum as *const _ as *const Int).read()`. +For open fielded enums, this would be even more valuable, since the discriminant +could be entirely unknown and the programmer may want to know its value. + +Perhaps this uses the same `.0` syntax as above, or an extension to +`mem::Discriminant`?
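
The unsafe mechanism mentioned above can be sketched on stable Rust as follows. This is an illustrative example under stated assumptions: the `Shape` enum and `discriminant_value` function are hypothetical, and the pointer cast relies on the documented layout of `repr(u8)` enums with fields.

```rust
/// Hypothetical fielded enum with an explicit `u8` representation.
#[repr(u8)]
pub enum Shape {
    Circle { radius: f32 } = 0,
    Rectangle { width: f32, height: f32 } = 7,
}

/// Reads the integer discriminant of a `repr(u8)` fielded enum.
pub fn discriminant_value(shape: &Shape) -> u8 {
    // SAFETY: `Shape` is `repr(u8)`, so it is laid out like a `repr(C)` union
    // of `repr(C)` structs that each begin with the `u8` discriminant. The
    // first byte of any valid `Shape` is therefore its discriminant.
    unsafe { *(shape as *const Shape).cast::<u8>() }
}
```

The cast is only sound because the target integer type matches the declared `repr`; choosing the wrong type compiles without complaint, which is the hazard a built-in accessor would remove.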
+ +## `match` on ranges of enums + +```rust +#[repr(u32)] +enum HttpStatusCode { + Ok = 200, + NoContent = 204, + NotFound = 404, + Internal = 500, + Unavailable = 503, + _ = 100..=599, +} +let code = unsafe { transmute(301u32) }; +let name = match code { + HttpStatusCode::Ok => "ok", + HttpStatusCode::NotFound => "not found", + + // Matches on discriminants 500..=503. + HttpStatusCode::Internal..=HttpStatusCode::Unavailable => + "lower server error", + + // Explicit `repr` allows matching on the discriminant value. + 100..=199 => "info", + 200..=299 => "success", + 300..=399 => "redirection", + 400..=499 => "client error", + 500..=599 => "server error", + + // Exhaustive match, no wildcard branch needed. +}; +``` From c4d811608770f3462336805614e0d2965910e1cf Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Tue, 9 Dec 2025 11:32:54 -0800 Subject: [PATCH 02/42] Add PR number, start date --- text/{0000-unnamed-variants.md => 3894-unnamed-variants.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-unnamed-variants.md => 3894-unnamed-variants.md} (99%) diff --git a/text/0000-unnamed-variants.md b/text/3894-unnamed-variants.md similarity index 99% rename from text/0000-unnamed-variants.md rename to text/3894-unnamed-variants.md index 35032741c10..ae3b21d16ac 100644 --- a/text/0000-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1,6 +1,6 @@ - Feature Name: (`unnamed_variants`) -- Start Date: -- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Start Date: 2025-12-09 +- RFC PR: [rust-lang/rfcs#3894](https://github.com/rust-lang/rfcs/pull/3894) - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) From d77969897013ca61786a65c1a486310742dd7fbb Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Tue, 9 Dec 2025 12:21:10 -0800 Subject: [PATCH 03/42] Fix typo, qualify as-cast type inference --- text/3894-unnamed-variants.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git 
a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index ae3b21d16ac..8d3a9408fdc 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1034,7 +1034,7 @@ integer is a valid discriminant. - Casting from other integer types is rejected. - If an expression with the `{integer}` inference variable type is used as the source for an `as` cast to an open enum, it is uniquely constrained to the - backing integer type. + explicit backing integer type. This excludes `repr(C)`; see below. ```rust #[repr(u8)] @@ -1063,7 +1063,7 @@ A `repr(C)` unit-only open enum may be `as` cast from: visible effect. - This means that authors who don't know or care about short-enum platforms can cast from `c_int` and `c_uint` to most `repr(C)` open enums, while - preventing those unexpected truncations when necessary. + preventing unexpected truncations when necessary. ```rust const TEN: isize = 10; @@ -1236,7 +1236,7 @@ However, this pattern has some distinct disadvantages when used to emulate an open enum, as described in the [Motivation](#newtype-integers-are-bad-for-enumeration) section above. -[Pattern types][pattern types] can contrain the valid values for an integer +[Pattern types][pattern types] can constrain the valid values for an integer newtype, but do not help with the enum ergonomics issue. ## As an `enum` attribute From 81008ebe3d40a3f49860e1227ff37a8781f21244 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Tue, 9 Dec 2025 12:36:44 -0800 Subject: [PATCH 04/42] Clarify deserialization of bytes in embedded There is no reason for these bytes to be in flash memory - it is merely one of the applicable use cases. 
--- text/3894-unnamed-variants.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 8d3a9408fdc..ad922209a28 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -206,11 +206,10 @@ This has drawbacks: ## Zero-copy deserialization -A common pattern on embedded systems is to read data structures directly from -a `[u8]`, facilitated by libraries like -[`zerocopy`][zerocopy-frombytes-derive] or -[`bytemuck`][bytemuck-checkedbitpattern]. In order to do this, the bytes in -flash must always be validated to be one of the known discriminants. +A common pattern on embedded systems is to read data structures directly from a +`[u8]`, facilitated by libraries like [`zerocopy`][zerocopy-frombytes-derive] or +[`bytemuck`][bytemuck-checkedbitpattern]. In order to do this, the bytes must +always be validated to be one of the known discriminants. This scales poorly for performance and code bloat as more enums are added to be deserialized in a message. It is more flexible to defer wildcard branches for From a1e9b59fc2f66865ff504b9a5ce5a2caa1d75eda Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Tue, 9 Dec 2025 17:52:34 -0800 Subject: [PATCH 05/42] Add `enum` to feature name, clarify zero-copy further unnamed_variants is an unambiguous feature name that describes what the author is now able to spell. However, it takes more mental effort to remember what a variant is, while "enum variant" is immediately clear. This also clarifies the zero-copy deserialization motivation language further. 
--- text/3894-unnamed-variants.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index ad922209a28..9d38bfe265a 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1,4 +1,4 @@ -- Feature Name: (`unnamed_variants`) +- Feature Name: (`unnamed_enum_variants`) - Start Date: 2025-12-09 - RFC PR: [rust-lang/rfcs#3894](https://github.com/rust-lang/rfcs/pull/3894) - Rust Issue: @@ -208,15 +208,15 @@ This has drawbacks: A common pattern on embedded systems is to read data structures directly from a `[u8]`, facilitated by libraries like [`zerocopy`][zerocopy-frombytes-derive] or -[`bytemuck`][bytemuck-checkedbitpattern]. In order to do this, the bytes must -always be validated to be one of the known discriminants. - -This scales poorly for performance and code bloat as more enums are added to be -deserialized in a message. It is more flexible to defer wildcard branches for -unknown discriminants to the point when the enum is `match`ed on, rather than -up-front during deserialization. When these checks are undesirable, ergonomics -must be sacrificed for compatibility and performance by using an integer -newtype. +[`bytemuck`][bytemuck-checkedbitpattern]. In order to do this, the bytes for an +enum must always be validated to be one of the known discriminants. + +This scales poorly for performance and code bloat as more enums and variants are +added to be deserialized in a message. It is more flexible to defer wildcard +branches for unknown discriminants to the point when the enum is `match`ed on, +rather than up-front during deserialization. When these checks are undesirable, +ergonomics must be sacrificed for compatibility and performance by using an +integer newtype. 
[bytemuck-checkedbitpattern]: https://docs.rs/bytemuck/latest/bytemuck/checked/trait.CheckedBitPattern.html [zerocopy-frombytes-derive]: https://docs.rs/zerocopy/0.6.1/zerocopy/derive.FromBytes.html From bf548641ff7bc26137a075aabbc7779442f9af0c Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 10 Dec 2025 11:43:35 -0800 Subject: [PATCH 06/42] Increase heading levels See f17e862 for reasoning. --- text/3894-unnamed-variants.md | 116 +++++++++++++++++----------------- 1 file changed, 58 insertions(+), 58 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 9d38bfe265a..931e941702a 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -4,7 +4,7 @@ - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) -# Summary +## Summary Enable ranges of enum discriminants to be reserved ahead of time, requiring all users of that enum to consider those values as valid. This includes within @@ -18,7 +18,7 @@ for an enum, it becomes an _open enum_. If it is [unit-only], it can then be [unit-only]: https://doc.rust-lang.org/reference/items/enumerations.html#r-items.enum.unit-only -# Motivation +## Motivation Enums in Rust have a _closed_ representation, meaning the only valid representations of that type are the variants listed, with any violation of this @@ -89,7 +89,7 @@ existing binaries using this definition to add `Paused = 2`. The `_ = ..` has required _every_ exhaustive `match` of `TaskState`, including in the defining crate, to handle the case where it's not one of the currently-named variants. -## Protobuf +### Protobuf Protocol Buffers (Protobuf), a language-neutral serialization mechanism, is designed to be forwards and backwards compatible when extending a schema. @@ -110,7 +110,7 @@ associated constants for each variant. While this allows Protobuf enums in Rust to be used _mostly_ like enums, this is a suboptimal experience. 
-### Newtype integers are bad for enumeration +#### Newtype integers are bad for enumeration When the point of a type is to give an integer a set of well-known names (like in C++), a newtype integer isn't as ergonomic to use as an `enum`: @@ -138,7 +138,7 @@ in C++), a newtype integer isn't as ergonomic to use as an `enum`: If Protobuf instead declared generated Rust enums with a `_ = ..` variant, users could have a first-class enum experience with compatible open semantics. -## C interop +### C interop A closed `#[repr(C)]` field-less `enum`s is [hazardous][repr-c-field-less] to use when interoperating with C, mostly because it is so easy to trigger @@ -162,7 +162,7 @@ default, instead of a exposing a less-effective `non_exhaustive` attribute. [bindgen-enum-variation]: https://docs.rs/bindgen/0.72.1/bindgen/enum.EnumVariation.html -## Dynamic Linking +### Dynamic Linking Dynamically linked libraries, Rust or otherwise, are prone to ABI compatibility breakage. @@ -178,7 +178,7 @@ others listed. [non-exhaustive-ub]: https://github.com/rust-lang/rust-bindgen/issues/1763 -## Embedded syscalls +### Embedded syscalls TockOS is an embedded OS with a separate user space and kernel space. Its syscall ABI defines that kernel error codes are between 1 and 1024. It's highly @@ -204,7 +204,7 @@ This has drawbacks: [libtock-errorcode]: https://github.com/tock/libtock-rs/blob/master/platform/src/error_code.rs#L30-L33 -## Zero-copy deserialization +### Zero-copy deserialization A common pattern on embedded systems is to read data structures directly from a `[u8]`, facilitated by libraries like [`zerocopy`][zerocopy-frombytes-derive] or @@ -221,7 +221,7 @@ integer newtype. 
[bytemuck-checkedbitpattern]: https://docs.rs/bytemuck/latest/bytemuck/checked/trait.CheckedBitPattern.html [zerocopy-frombytes-derive]: https://docs.rs/zerocopy/0.6.1/zerocopy/derive.FromBytes.html -## Restricted range integers +### Restricted range integers Unnamed variants can be used to define integers that are statically restricted to a particular range, including with niches. @@ -282,7 +282,7 @@ type FuelLevel = Ranged; [pattern types]: https://github.com/rust-lang/rust/pull/107606 -# Guide-level explanation +## Guide-level explanation Enums have a _closed_ representation by default, meaning that any enum value must be represented by one of the listed variants. Constructing any enum value @@ -391,7 +391,7 @@ let fruit2 = 5isize as Fruit; This open enum is much like a `struct Fruit(u32)`, except it is treated as an enum by IDEs and developers. -## Interaction with `#[non_exhaustive]` +### Interaction with `#[non_exhaustive]` An enum declared both `non_exhaustive` and with an unnamed variant is rejected. On a field-less enum, it is not a breaking change to replace a @@ -417,9 +417,9 @@ By contrast, an unnamed variant affects API _and_ ABI semver compatibility: For enums that have relevant discriminant values, an unnamed variant may be the better choice. This is often the case for enums declaring an explicit `repr`. -# Reference-level explanation +## Reference-level explanation -## Unnamed variants +### Unnamed variants An **unnamed variant** is an enum variant with `_` declared for its name. It is assigned to one or many **reserved discriminants**. 
These discriminants are @@ -591,7 +591,7 @@ these types: } ``` -### Type Inference +#### Type Inference The discriminant expression for an unnamed variant has its type inferred as if it were an argument to a generic function accepting the valid types for the @@ -618,7 +618,7 @@ impl ReserveDiscriminants for RangeToInclusive {} impl ReserveDiscriminants for RangeFull {} ``` -### `repr(C)` behavior +#### `repr(C)` behavior `repr(C)` enums have special semantics in Rust because the discriminant expression type, `isize`, is not the same as the actual backing integer. These @@ -725,7 +725,7 @@ enum Foo { } ``` -### Grammar changes +#### Grammar changes [EnumVariant] is extended to allow an underscore instead of a variant's name: @@ -738,12 +738,12 @@ EnumVariant -> [EnumVariant]: https://doc.rust-lang.org/reference/items/enumerations.html#grammar-EnumVariant -### No field data +#### No field data This RFC only defines adding unnamed variants to field-less enums, leaving this as future work. -### `non_exhaustive` +#### `non_exhaustive` The `non_exhaustive` attribute on enums and unnamed variants are mutually exclusive: @@ -763,7 +763,7 @@ enum Color { An unnamed variant is more impactful than `non_exhaustive`, since it affects the declaring crate - the enum is "universally non-exhaustive". -### Compatibility +#### Compatibility Given enum versions A and B with some change between them: @@ -798,9 +798,9 @@ It is an API and ABI backwards-compatible change to: - Add an unnamed variant to an enum without `#[non_exhaustive]` or another unnamed variant. The same caveat regarding unused wildcard branches applies. -### Applicable lints +#### Applicable lints -#### Truncatable ranges +##### Truncatable ranges A new `warn`-by-default lint is produced if an unnamed variant's discriminant range can be shortened to avoid overlapping with named variants. 
@@ -875,7 +875,7 @@ enum ImplicitNextDiscriminant { } ``` -#### Gap of length one caused by an exclusive range +##### Gap of length one caused by an exclusive range The existing [`non_contiguous_range_endpoints`] lint is produced if: @@ -911,7 +911,7 @@ enum Bar { } ``` -#### Forgot to mention a named variant +##### Forgot to mention a named variant The unstable [`non_exhaustive_omitted_patterns`] `allow`-by-default lint is produced if a `match` on an enum with reserved discriminants mentions some, but @@ -949,7 +949,7 @@ let name = match b { }; ``` -### Next variant's implicit discriminant +#### Next variant's implicit discriminant When a named variant without an implicit discriminant follows an unnamed variant, the assigned implicit discriminant is the next integer after the @@ -989,7 +989,7 @@ enum Overflow { } ``` -### Non-literal discriminant expression +#### Non-literal discriminant expression A non-literal range or integer is allowed for an unnamed variant. @@ -1007,7 +1007,7 @@ enum Foo { let _: Foo = unsafe { mem::transmute(15u32) }; ``` -### Only variant +#### Only variant An unnamed variant may be the only variant for an enum. In this case, an `as` cast or `transmute` is the only way to construct an enum value. @@ -1019,7 +1019,7 @@ enum NothingYet { _ = .. } (10 as NothingYet > 5 as NothingYet) ``` -## Open enum conversion +### Open enum conversion An _open enum_ is defined as an `enum` for which every value of its backing integer is a valid discriminant. @@ -1047,7 +1047,7 @@ integer is a valid discriminant. // let _: u32 = x; ``` -### `repr(C)` open enum behavior +#### `repr(C)` open enum behavior The actual backing integer type for a `repr(C)` enum changes based on the variants' numeric discriminant values as described above. 
@@ -1153,7 +1153,7 @@ _ = signed_byte as Small; _ = signed_byte as SmallSigned; ``` -## Interaction with the standard library +### Interaction with the standard library - `derive(Debug)` formats as `EnumName(X)` when `X` is a reserved discriminant. A `Debug` format changing is not considering an API-breaking change. @@ -1166,7 +1166,7 @@ _ = signed_byte as SmallSigned; field-less enum values with the same discriminant integers as equal and those with different discriminant integers as non-equal. -# Drawbacks +## Drawbacks - The mutual-exclusion with `non_exhaustive` despite having similar motivations could be confusing to explain to new users. @@ -1175,7 +1175,7 @@ _ = signed_byte as SmallSigned; - Rust has not put significant efforts towards ABI compatibility in language constructs in the past. -## Flag enums +### Flag enums It is possible to define `bitflags` style enums using `enum` syntax with unnamed variants. However, if `BitOr` is defined on such an enum, then, rather @@ -1186,7 +1186,7 @@ is why the library defines a `bitflags_match!` macro that avoids it. As future work, a lint could trigger when `|` is used in a pattern with a non-integer type that defines `BitOr` and has structural equality. -# Rationale and alternatives +## Rationale and alternatives Unnamed variants enable a large range of discriminants to be reserved for an enum, whether it's all or some of them. `NonZero`, and an `enum` spelling out @@ -1195,7 +1195,7 @@ each discriminant are the only other ways to achieve this in stable Rust today. The open enum conversion from backing integer is an ergonomic benefit that is made possible by unnamed variants. -## Do nothing +### Do nothing Why not just use an integer newtype or macro? @@ -1238,7 +1238,7 @@ an open enum, as described in the [Pattern types][pattern types] can constrain the valid values for an integer newtype, but do not help with the enum ergonomics issue. 
-## As an `enum` attribute +### As an `enum` attribute An enum could be made open by specifying it as part of its `repr`: @@ -1277,7 +1277,7 @@ This has the same interaction with `#[non_exhaustive]`. The drawbacks: } ``` -## Unbounded ranges select discriminants based on surrounding variants +### Unbounded ranges select discriminants based on surrounding variants ```rust #[repr(u32)] @@ -1349,7 +1349,7 @@ enum Foo { } ``` -## Forbid unnamed variants' discriminants from overlapping named ones +### Forbid unnamed variants' discriminants from overlapping named ones ```rust #[repr(u32)] @@ -1374,7 +1374,7 @@ desirable. - It cannot be reasonably be equivalent to `Int::MIN..=Int::MAX` without that range allowing named variant overlap. -## Declare niches instead of reserving values +### Declare niches instead of reserving values If an enum selects its discriminants such that a desirable niche exists, like `0`, perhaps it is better to declare ranges of niches rather than reserving @@ -1387,7 +1387,7 @@ declaration. Unnamed variants use the same syntax to assign discriminants, except they do not have to have a name and thus can be assigned to discontiguous ranges. -## Discriminant ranges for named variants instead of unnamed variants +### Discriminant ranges for named variants instead of unnamed variants What if instead this were valid? @@ -1420,7 +1420,7 @@ leave reserved ranges of discriminants as anonymous to keep the feature simple. This can be left as future work for the language. 
-## `..` at the end +### `..` at the end ```rust #[repr(u8)] @@ -1442,7 +1442,7 @@ enum IpProto { [rest pattern]: https://doc.rust-lang.org/reference/patterns.html?#rest-patterns [wildcard pattern]: https://doc.rust-lang.org/reference/patterns.html?#wildcard-pattern -## An "other" variant carries unknown discriminants like a tuple variant +### An "other" variant carries unknown discriminants like a tuple variant An alternative way to specify a field-less open enum could be to write this: @@ -1500,7 +1500,7 @@ assert!(!(3u32 as IpProto).is_named_variant()); assert!((6u32 as IpProto).is_named_variant()); ``` -## Require `non_exhaustive`, don't forbid it +### Require `non_exhaustive`, don't forbid it Perhaps an unnamed variant could _require_ `#[non_exhaustive]`, rather than forbid it? This RFC opts against that, with the following considerations: @@ -1542,15 +1542,15 @@ Cons: variant by changing the `repr`. This is non-obvious and can be avoided by forbidding `non_exhaustive` when a valid unnamed variant exists. -# Prior art +## Prior art _Open_ and _closed_ enums are [pre-existing industry terms][acord-xml]. -## Enum openness in other languages +### Enum openness in other languages - C++'s [scoped enumerations][cpp-scoped-enums] and C enums are both open enums. -- C# uses [open enums][cs-open-enums], with a [proposal][cs-closed-enums] to +- C# uses [open enums][cs-open-enums], with a [proposal][cs-closed-enums] to add closed enums for guaranteed exhaustiveness. - Java uses closed enums. - [Protobuf][protobuf-enum] uses closed enums with the `proto2` syntax, treating @@ -1568,7 +1568,7 @@ _Open_ and _closed_ enums are [pre-existing industry terms][acord-xml].
[cs-closed-enums]: https://github.com/dotnet/csharplang/issues/3179 [protobuf-enum]: https://developers.google.com/protocol-buffers/docs/reference/cpp-generated#enum -## Other crates that use open enums +### Other crates that use open enums Users today are simulating open enums with other language constructs, but it's a suboptimal experience: @@ -1596,14 +1596,14 @@ than what is described here. [`with_unknown!`]: https://github.com/lowRISC/opentitan/blob/06584dc620c633e88631f97f1fc1e22c1980c21c/sw/host/ot_hal/src/util/unknown.rs#L7-L48 [winapi-enum]: https://github.com/retep998/winapi-rs/blob/77426a9776f4328d2842175038327c586d5111fd/src/macros.rs#L358-L380 -## `bitflags` +### `bitflags` The bitflags crate also uses [an unnamed value][bitflags-unnamed] with `_` to specify valid bits without assigning a name to them. [bitflags-unnamed]: https://docs.rs/bitflags/latest/bitflags/macro.bitflags.html#named-and-unnamed-flags -## `abi_stable` +### `abi_stable` [`abi_stable::NonExhaustive`] uses an associated type to hold a typed raw discriminant for an enum. It is not ergonomic to `match` on discriminant values @@ -1611,7 +1611,7 @@ directly, but another macro could improve this. [`abi_stable::NonExhaustive`]: https://docs.rs/abi_stable/0.11.3/abi_stable/nonexhaustive_enum/struct.NonExhaustive.html -## Unnamed fields +### Unnamed fields The [Unnamed fields] RFC reserves space for future extension in a `struct` or `union` for FFI purposes, allowing ABI to be planned ahead of time. Unnamed @@ -1621,20 +1621,20 @@ concepts by reserving space for `payload` to be held in the enum. [Unnamed fields]: https://github.com/rust-lang/rfcs/blob/master/text/2102-unnamed-fields.md -## `repr(open)` RFC +### `repr(open)` RFC There's an [unmerged RFC][enum-repr-open] that defines a `repr(open)` syntax as described in the Alternatives section above. 
[enum-repr-open]: https://github.com/madsmtm/rfcs/blob/enum-repr-no-niches/text/3803-enum-repr-open.md -# Unresolved questions +## Unresolved questions None. -# Future possibilities +## Future possibilities -## Discriminant ranges for named variants +### Discriminant ranges for named variants A future extension could allow for named variants to specify ranges as discriminants. This bikeshed syntax avoids many of the drawbacks in the @@ -1680,7 +1680,7 @@ let e = match d { assert_eq!(e as u8, 10); ``` -## Unnamed variants on enums with field data +### Unnamed variants on enums with field data Unnamed variants on enums with field data would allow library authors to plan for future ABI compatibility by reserving discriminants and data space for an @@ -1724,7 +1724,7 @@ pub enum Shape { } ``` -## Tuple-like syntax for `repr` enums +### Tuple-like syntax for `repr` enums A very useful thing this RFC enables is that replacing this: @@ -1760,7 +1760,7 @@ requires that `Color(discriminant)` and `color.0` also function as originally. These each have their own independent utility: -### Tuple constructor +#### Tuple constructor The enum name is a constructor `fn(Repr) -> Enum`: @@ -1777,7 +1777,7 @@ assert!( callsite. Thus it may be worth adding to Rust even if `.0` isn't. - When should one prefer the constructor over the `as` cast? Always? -### Discriminant field access +#### Discriminant field access `.0` provides direct access to the discriminant value: @@ -1840,7 +1840,7 @@ assert!(matches!(x, X::B)); [unsafe field]: https://rust-lang.github.io/rust-project-goals/2025h2/unsafe-fields.html -## Extracting the integer value of the discriminant for fielded enums +### Extracting the integer value of the discriminant for fielded enums A fielded enum with `#[repr(Int)]` and/or `#[repr(C)]` is guaranteed to have its discriminant values starting from 0. 
However, for any given value of that enum, @@ -1852,7 +1852,7 @@ could be entirely unknown and the programmer may want to know its value. Perhaps this uses the same `.0` syntax as above, or an extension to `mem::Discriminant`? -## `match` on ranges of enums +### `match` on ranges of enums ```rust #[repr(u32)] From 19c4228403c398e5ba0234fa8b517c7b2390bc23 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 10 Dec 2025 14:34:00 -0800 Subject: [PATCH 07/42] Allow no-effect unnamed variants as a lint, add alternative The `taken_discriminant_ranges` and `empty_discriminant_ranges` lints catch the situation in which an unnamed variant has no effect. The latter is much more dangerous than the former, so it is `deny`-by-default. However, both should be allowed for certain codegen and macro cases as described. This also moves around and expands some language regarding `derive` difficulty in open-enum, and adds a suggested alternative. --- text/3894-unnamed-variants.md | 302 +++++++++++++++++++++------------- 1 file changed, 187 insertions(+), 115 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 931e941702a..49835e88d63 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -118,7 +118,6 @@ in C++), a newtype integer isn't as ergonomic to use as an `enum`: - It is arduous to read the generated definition - the variants are inside of an `impl` instead of next to the name. It hides the type's nature as an enum. - It's invalid to `use` the pseudo-variants like with `use EnumName::*`. -- `Debug` derives are less useful. - The third-party macro ecosystem built around enums can't be used. - Rust is a systems language that can move data around efficiently, and so first-class support for named integers is valuable for embedded programmers. @@ -126,12 +125,18 @@ in C++), a newtype integer isn't as ergonomic to use as an `enum`: - No "fill match arms" in rust-analyzer.
- The [`non-exhaustive patterns` error][E0004] lists only integer values, and cannot suggest the named variants of the enum. - - The unstable `non_exhaustive_omitted_patterns` lint has no way to work with - this enum-alike, even though treating it like a `non_exhaustive` enum would - be more helpful. + - The unstable [`non_exhaustive_omitted_patterns`] lint has no easy way to + work with this enum-alike, even though treating it like a `non_exhaustive` + enum would be more helpful. - Generated rustdoc is less clear (the pseudo-enum is grouped with `struct`s). - In order for a pseudo-variant name to match the normal style for an enum variant name, `allow(non_uppercase_globals)` is required. +- `derive`s that work with names are less useful. The built-in `derive(Debug)` + can't know the variant names to list. The `open-enum` crate, which provides an + attribute macro to construct newtype integers from an `enum` declaration, + requires a distinct `derive` ecosystem for operations like `TryFrom`, + `Debug`, `IsKnownVariant`, ser/de, etc. - a worse experience than if all + derives were capable of reading a first-class open `enum` definition. [E0004]: https://doc.rust-lang.org/stable/error-index.html#E0004 @@ -421,8 +426,8 @@ better choice. This is often the case for enums declaring an explicit `repr`. ### Unnamed variants -An **unnamed variant** is an enum variant with `_` declared for its name. It is -assigned to one or many **reserved discriminants**. These discriminants are +An **unnamed variant** is an enum variant with `_` declared for its name. It +is assigned to a set of **reserved discriminants**. These discriminants are valid for the enum, and may be assigned to a named variant in the future. It is valid to `transmute` to an enum type from a reserved discriminant. @@ -492,79 +497,12 @@ these types: } ``` - - The range must be non-empty.
- - ```rust - #[repr(u8)] - enum Foo { - X = 0, - // error: empty range assigned to `_` variant - // help: variant has discriminant range `0..1` - _ = Self::X..Self::Y, - Y = 1, - } - - #[repr(isize)] - enum Bar { - X = 2, - // error: empty range assigned to `_` variant - // help: variant has discriminant range `-2..0` - _ = Self::X..Self::Y, - Y = 0, - } - ``` - - - It is usually a mistake to specify an empty range. - - Allowing this would enable this peculiar situation: - - ```rust - #[repr(u8)] - enum Foo { - X = CONST1, - _ = (CONST1 + 1)..CONST2, - Y = CONST2, - } - ``` - - If `CONST1 + 1 >= CONST2`, then `E` has no reserved discriminants, and - thus no wildcard arm is needed in a `match`, even though an unnamed - variant is syntactically present. - - An empty or negative range could accidentally cause UB if fewer - discriminants are reserved than expected. - - If edge cases are found that necessitate allowing this, this can be made - a `deny`-by-default lint in the future. - - There must be at least one discriminant available to reserve in the range. - - ```rust - enum Foo { - X, - Y, - // error: all discriminants in range `0..=1u8` already assigned - // help: `0` is assigned here: `X` - // help: `1` is assigned here: `Y` - _ = 0..2, - } - ``` - - - Thus, an unnamed variant cannot be specified on an enum that is already - open: - - ```rust - #[repr(u8)] - enum NamedU8 { - James = 0, - Fernando = 1, - Sally = 2, - // ... Named variant for every other u8 ... - Jolene = 255, - - // error: all discriminants in range `0..=255` already assigned - // help: `0` is assigned here: `James = 0` - // help: `255` is assigned here: `Jolene = 255` - _ = .., - } - ``` - + - The range should be non-empty. A + [`deny`-by-default lint](#empty-discriminant-ranges) is produced if this is + violated. + - There should be at least one discriminant available to reserve in the range. + A [`warn`-by-default lint](#taken-discriminant-ranges) is produced if this + is violated. 
- `start..` (`core::ops::RangeFrom`) - Equivalent to `start..=Int::MAX`. - `..end` (`core::ops::RangeTo`) @@ -713,6 +651,9 @@ enum Foo { as this Rust enum, regardless of the discriminant values assigned: ```rust +// `allow` effective when there are 256 variants within the `u8`/`i8` range on +// a short-enum platform. Only macros/codegen like bindgen bother with this. +#[allow(taken_discriminant_ranges)] #[repr(C)] enum Foo { Name1 = Value1, @@ -800,14 +741,92 @@ It is an API and ABI backwards-compatible change to: #### Applicable lints +##### Empty discriminant ranges + +`empty_discriminant_ranges` is a new `deny`-by-default lint. It should be +produced if the discriminant range assigned to an unnamed variant is empty. + +```rust +#[repr(u8)] +enum Foo { + X = 0, + // error: empty range assigned to `_` variant + // help: variant has discriminant range `0..1` + _ = Self::X..Self::Y, + Y = 1, +} + +#[repr(isize)] +enum Bar { + X = 2, + // error: empty range assigned to `_` variant + // help: variant has discriminant range `2..0` + _ = Self::X..Self::Y, + Y = 0, +} +``` + +- It is usually a mistake to specify an empty range. +- An empty or negative range could accidentally cause UB if certain + discriminants are expected to be reserved but are not due to reversing the + `start` and `end` of the range. +- If `allow`ed, the unnamed variant declaration has no effect. +- There are rare use cases involving macro or non-literal discriminants + in which it may be intentional to declare an empty variant in order to + avoid complex discriminant analysis. + +##### Taken discriminant ranges + +`taken_discriminant_ranges` is a new `warn`-by-default lint. It should be +produced if every discriminant in the range assigned to an unnamed variant is +already assigned to a named variant. Thus, the unnamed variant does not +introduce any reserved discriminants and has no effect on the enum.
+ +```rust +#[repr(u8)] +enum Foo { + X, + Y, + // warning: all discriminants in range `0..=1` already assigned + // help: remove the `_` variant; it has no effect + // help: `0` is assigned here: `X` + // help: `1` is assigned here: `Y` + _ = 0..2, +} +``` + +This warning should thus be produced when specifying an unnamed variant on an +enum that is already open. Any macro or codegen that intends to make an enum +open can ignore this lint when adding `_ = ..`: + +```rust +// Say bindgen generated this from a C enum. +// It shouldn't have to count the number of variants and compare that against +// the `repr` to know if the enum's already open and must avoid placing the +// `_ = ..`. It can just allow the warning. +#[allow(taken_discriminant_ranges)] +#[repr(u8)] +enum NamedU8 { + James = 0, + Fernando = 1, + Sally = 2, + // ... Named variant for every other u8 ... + Jolene = 255, + + _ = .., +} +``` + ##### Truncatable ranges -A new `warn`-by-default lint is produced if an unnamed variant's discriminant -range can be shortened to avoid overlapping with named variants. +`overlong_discriminant_ranges` is a new `warn`-by-default lint. It should be +produced if an unnamed variant's discriminant range can be shortened to avoid +overlapping with named variants. Let `start..=end` be the range of discriminants that an unnamed variant definition is assigned to, regardless of the actual range type used. An -`overlong_discriminant_ranges` lint is produced if all of the below are true: +`overlong_discriminant_ranges` lint should be produced if all of the below are +true: - The bound is specified as a range expression in the variant's discriminant expression, and not as an identifier or block. @@ -820,6 +839,7 @@ definition is assigned to, regardless of the actual range type used. An defined by an unbounded range. - The prefix is an overlong side _or_ the following variant, if any, has an explicit discriminant. 
+- The `taken_discriminant_ranges` lint is not produced for this unnamed variant. ```rust #[repr(u32)] @@ -877,7 +897,7 @@ enum ImplicitNextDiscriminant { ##### Gap of length one caused by an exclusive range -The existing [`non_contiguous_range_endpoints`] lint is produced if: +The existing [`non_contiguous_range_endpoints`] lint should be produced if: [`non_contiguous_range_endpoints`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_CONTIGUOUS_RANGE_ENDPOINTS.html @@ -913,9 +933,9 @@ enum Bar { ##### Forgot to mention a named variant -The unstable [`non_exhaustive_omitted_patterns`] `allow`-by-default lint is -produced if a `match` on an enum with reserved discriminants mentions some, but -not all, of the named variants. +The unstable [`non_exhaustive_omitted_patterns`] `allow`-by-default lint should +be produced if a `match` on an enum with reserved discriminants mentions some, +but not all, of the named variants. [`non_exhaustive_omitted_patterns`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_EXHAUSTIVE_OMITTED_PATTERNS.html @@ -1238,6 +1258,41 @@ an open enum, as described in the [Pattern types][pattern types] can constrain the valid values for an integer newtype, but do not help with the enum ergonomics issue. 
+### Attribute to improve diagnostic behavior for associated `const` + +Newtype integers could improve the ergonomics for a "fill match arms" analyzer +capabilities and other diagnostics with an attribute placed on pseudo-variants: + +```rust +#[repr(transparent)] +#[derive(PartialEq, Eq)] +struct Color(pub i8); + +#[allow(non_upper_case_globals)] +impl Color { + // Tells rust-analyzer "this is like an enum variant" + #[diagnostic::enum_variant] + pub const Red: Color = Color(0); + + #[diagnostic::enum_variant] + pub const Blue: Color = Color(1); + + #[diagnostic::enum_variant] + pub const Black: Color = Color(2); +} +``` + +However: + +- Open enums require even more typing for the desired semantics. +- `derive`s cannot be easily written with enum variant names. In order to avoid + duplicating the names, a `derive` macro must directly inter-operate with + another macro that generates these pseudo-variants like `open-enum`. +- This is less discoverable than a user trying to `as` cast to an enum and + having the compiler inform them of `_ = ..` as an option. +- It's not clear how this would relate to the functionality of the + [`non_exhaustive_omitted_patterns`] lint. + ### As an `enum` attribute An enum could be made open by specifying it as part of its `repr`: @@ -1259,8 +1314,6 @@ This has the same interaction with `#[non_exhaustive]`. The drawbacks: - It's not as clear what the attribute does, in contrast to the `_ = ..` syntax mirroring known concepts: we're introducing new valid values, `_` means "unnamed/wildcard", and `..` means "the rest" as the discriminants. -- Unnamed variants meld well with unnamed fields in `struct`/`union` for ABI - stability. - It is not clear why a `repr` would affect `match`/`as` behavior, even though this does affect how it is valid to represent the type. - There are many alternative syntaxes for this, such as @@ -1277,6 +1330,9 @@ This has the same interaction with `#[non_exhaustive]`. 
The drawbacks: } ``` +- Unnamed variants meld well with [unnamed fields] in `struct`/`union` for ABI + stability, if that is ever stabilized. + ### Unbounded ranges select discriminants based on surrounding variants ```rust @@ -1349,31 +1405,6 @@ enum Foo { } ``` -### Forbid unnamed variants' discriminants from overlapping named ones - -```rust -#[repr(u32)] -// error: discriminant `200` assigned more than once -enum HttpStatusCode { - Ok = 200, - _ = 100..=599, -} -``` - -This makes it entirely unambiguous which discriminant is assigned to which -variant, without precedence rules. However, `_ = ..` to "make it open" is still -desirable. - -- Forbidding named variant overlaps with `_ = ..` makes it nearly useless, since - it then must be the only variant for the enum. -- Giving `..` special behavior to reserve "the rest" of the variants is then - inconsistent with other ranges' behavior. - - There is precedent for `..` acting differently than other ranges, such as - when `match`ing a number or `char`. This `..`, however, is an expression and - not a pattern. - - It cannot be reasonably be equivalent to `Int::MIN..=Int::MAX` without that - range allowing named variant overlap. - ### Declare niches instead of reserving values If an enum selects its discriminants such that a desirable niche exists, like @@ -1500,6 +1531,47 @@ assert!(!(3u32 as IpProto).is_named_variant()); assert!((6u32 as IpProto).is_named_variant()); ``` +### Forbid unnamed variants' discriminants from overlapping named ones + +```rust +#[repr(u32)] +// error: discriminant `200` assigned more than once +enum HttpStatusCode { + Ok = 200, + _ = 100..=599, +} +``` + +This makes it entirely unambiguous which discriminant is assigned to which +variant, without precedence rules. However, `_ = ..` to "make it open" is still +desirable. + +- Forbidding named variant overlaps with `_ = ..` makes it nearly useless, since + it then must be the only variant for the enum. 
+- Giving `..` special behavior to reserve "the rest" of the variants is then
+  inconsistent with other ranges' behavior.
+  - There is precedent for `..` acting differently than other ranges, such as
+    when `match`ing a number or `char`. This `..`, however, is an expression and
+    not a pattern.
+  - It cannot reasonably be equivalent to `Int::MIN..=Int::MAX` without that
+    range allowing named variant overlap.
+
+### Require an unnamed variant reserve at least one discriminant
+
+It is a desirable property for an unnamed variant to always introduce a reserved
+discriminant.
+
+This would mean that an unnamed variant's presence in an enum always requires a
+wildcard branch when `match`ing. Otherwise, a peculiar situation is possible in
+which an enum definition declares an unnamed variant, but does not have any
+reserved discriminants and thus no wildcard branch is needed.
+
+However, upholding this requirement prevents `_ = ..` from always working to
+mean "ensure this enum is open". In order for macros or codegen like `bindgen`
+to ensure an enum is open, they would need to handle the particular edge case of
+an enum with 256 variants and an 8-bit discriminant and leave out the variant.
+Instead, the lints can be `allow`ed for carefully-considered macros/codegen.
+
 ### Require `non_exhaustive`, don't forbid it

 Perhaps an unnamed variant could _require_ `#[non_exhaustive]`, rather than
@@ -1613,13 +1685,13 @@ directly, but another macro could improve this.

 ### Unnamed fields

-The [Unnamed fields] RFC reserves space for future extension in a `struct` or
+The [unnamed fields] RFC reserves space for future extension in a `struct` or
 `union` for FFI purposes, allowing ABI to be planned ahead of time. Unnamed
 variants have similar motivations, but no great workaround. The future work
 proposed below to allow `_(payload) = discriminants` further unifies these
 concepts by reserving space for `payload` to be held in the enum.
-[Unnamed fields]: https://github.com/rust-lang/rfcs/blob/master/text/2102-unnamed-fields.md +[unnamed fields]: https://github.com/rust-lang/rfcs/blob/master/text/2102-unnamed-fields.md ### `repr(open)` RFC From adc745b8cda1b8f2901e8e92322c5b1b90187e9a Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 10 Dec 2025 15:03:35 -0800 Subject: [PATCH 08/42] Extend drawbacks of open enum via attribute Based on this comment by @dhardy: https://github.com/rust-lang/rfcs/pull/3803#issuecomment-325722825 --- text/3894-unnamed-variants.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 49835e88d63..2d5fac6196b 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1332,6 +1332,11 @@ This has the same interaction with `#[non_exhaustive]`. The drawbacks: - Unnamed variants meld well with [unnamed fields] in `struct`/`union` for ABI stability, if that is ever stabilized. +- An `#[repr(u8)] enum E { A, B }` has two possible values, but an open enum + would instead have 256. Attributes are not typically used to adjust a type's + validity to this degree. `#[non_exhaustive]` is barely an exception; it merely + prevents exhaustive matches. Therefore, something stronger than an attribute + should be required to open an enum. 
### Unbounded ranges select discriminants based on surrounding variants From a7f30fce2c20bf6f75ce732e5e5c7f10b97162d1 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 10 Dec 2025 15:16:22 -0800 Subject: [PATCH 09/42] Fix typo in empty discriminant ranges --- text/3894-unnamed-variants.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 2d5fac6196b..b809ce369e2 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -760,16 +760,16 @@ enum Foo { enum Bar { X = 2, // error: empty range assigned to `_` variant - // help: variant has discriminant range `-2..0` + // help: variant has discriminant range `2..0` _ = Self::X..Self::Y, Y = 0, } ``` -- It is usually a mistake to specify an empty range. +- It is almost always a mistake to specify an empty range. - An empty or negative range could accidentally cause UB if certain discriminants are expected to be reserved but are not due to reversing the - `start` and `end` of the range. + `start` and `end` of the range. Thus, it is `deny`-by-default. - If `allow`ed, the unnamed variant declaration has no effect. - There are rare use cases involving macro or non-literal discriminants in which in may be intentional to declare an empty variant in order to From dbc719b581a74b1d903f649c170c717c715d2e75 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 10 Dec 2025 16:12:25 -0800 Subject: [PATCH 10/42] Fix empty_discriminant_ranges and C# typos --- text/3894-unnamed-variants.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index b809ce369e2..23975702055 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -751,8 +751,9 @@ produced if the discriminant range assigned to an unnamed variant is empty. 
enum Foo { X = 0, // error: empty range assigned to `_` variant - // help: variant has discriminant range `0..1` - _ = Self::X..Self::Y, + // note: this variant has no effect + // help: variant has discriminant range `1..1` + _ = (Self::X + 1)..Self::Y, Y = 1, } @@ -760,8 +761,9 @@ enum Foo { enum Bar { X = 2, // error: empty range assigned to `_` variant - // help: variant has discriminant range `2..0` - _ = Self::X..Self::Y, + // note: this variant has no effect + // help: variant has discriminant range `3..0` + _ = (Self::X + 1)..Self::Y, Y = 0, } ``` @@ -1627,7 +1629,7 @@ _Open_ and _closed_ enums are [pre-existing industry terms][acord-xml]. - C++'s [scoped enumerations][cpp-scoped-enums] and C enums are both open enums. -- C## uses [open enums][cs-open-enums], with a [proposal][cs-closed-enums] to +- C♯ uses [open enums][cs-open-enums], with a [proposal][cs-closed-enums] to add closed enums for guaranteed exhaustiveness. - Java uses closed enums. - [Protobuf][protobuf-enum] uses closed enums with the `proto2` syntax, treating From 10225c384283190dd0f57d42fe3fd4a5af4602ad Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 10 Dec 2025 18:15:56 -0800 Subject: [PATCH 11/42] Fix implicit/explicit swap --- text/3894-unnamed-variants.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 23975702055..cd75f16d852 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -973,7 +973,7 @@ let name = match b { #### Next variant's implicit discriminant -When a named variant without an implicit discriminant follows an unnamed +When a named variant without an explicit discriminant follows an unnamed variant, the assigned implicit discriminant is the next integer after the declared discriminant range for that unnamed variant. If the unnamed variant is assigned to an integer, it is the next integer. 
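The `empty_discriminant_ranges` diagnostics corrected in this part of the series report ranges such as `1..1` and `3..0` as empty. As a rough illustration only, the sketch below uses stable Rust's `Range::is_empty` to show why those exclusive ranges cover no discriminants. The helper names `claims_nothing` and `covered` are hypothetical and are not part of the RFC or of any compiler API.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the condition behind `empty_discriminant_ranges`:
/// an exclusive range covers nothing when `start >= end`.
fn claims_nothing(range: &Range<i64>) -> bool {
    range.is_empty()
}

/// Hypothetical helper: lists the discriminants an exclusive range would cover.
fn covered(range: Range<i64>) -> Vec<i64> {
    range.collect()
}
```

Under these assumptions, both `1..1` and the reversed `3..0` cover no values, which is the situation the lint is meant to flag, while `0..2` covers `0` and `1`.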
From 0b7283dd9c003ff184a9ffff5003d03c789afaaa Mon Sep 17 00:00:00 2001
From: Alyssa Haroldsen
Date: Wed, 24 Dec 2025 20:43:03 -0800
Subject: [PATCH 12/42] Add rationale rejecting implicit discriminants and no-repr

Based on [this comment](https://github.com/rust-lang/rfcs/pull/3894#issuecomment-3682414869).

---
 text/3894-unnamed-variants.md | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md
index cd75f16d852..eb86ac84afc 100644
--- a/text/3894-unnamed-variants.md
+++ b/text/3894-unnamed-variants.md
@@ -1621,6 +1621,31 @@ Cons:
   variant by changing the `repr`. This is non-obvious and can be avoided by
   forbidding `non_exhaustive` when a valid unnamed variant exists.

+### Allow an implicit discriminant expression for unnamed variants
+
+Consider:
+
+```rust
+#[repr(u32)]
+enum Color {
+    Red,
+    Green,
+    _, // this is an unnamed variant, but covering what discriminant(s)?
+}
+```
+
+Ordinarily, a variant's implicit discriminant is one more than the previous
+variant's. However, a common usage of an unnamed variant is to open the entire
+enum, and so it is ambiguous what exactly the variant does. It is also not a
+particularly large burden to require an explicit discriminant expression.
+
+### Allow usage without `repr`
+
+Consider if an unnamed variant could be present without a `repr`. It could be
+equivalent to `#[non_exhaustive]`. However, this is confusing for a syntax that
+describes ranges of variants: what does the range `_ = ..` actually cover? Is
+there still ABI compatibility?
+
 ## Prior art

 _Open_ and _closed_ enums are [pre-existing industry terms][acord-xml].
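For readers less familiar with the rule this patch leans on, here is a minimal sketch in stable Rust, with no unnamed variants involved, of how implicit discriminants are assigned today: a variant without an explicit discriminant takes the previous variant's value plus one. The specific values and the `discriminants` helper are illustrative assumptions, not text from the RFC.

```rust
// Illustrative only: plain, stable Rust with no unnamed variants.
#[repr(u32)]
#[derive(Clone, Copy)]
enum Color {
    Red,       // implicit: 0, as the first variant
    Green = 5, // explicit
    Blue,      // implicit: previous discriminant + 1 = 6
}

/// Hypothetical helper that reads each discriminant through an `as` cast.
fn discriminants() -> [u32; 3] {
    [Color::Red as u32, Color::Green as u32, Color::Blue as u32]
}
```

A bare `_` has no comparably obvious single value in the common "open the whole enum" case, which appears to be the ambiguity the patch cites when it asks for an explicit discriminant expression.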
From 295b740d353488a764f52eb7045f796371102f44 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 21 Jan 2026 12:03:10 -0800 Subject: [PATCH 13/42] Clarify Swift prior art --- text/3894-unnamed-variants.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index eb86ac84afc..836d0338d3a 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1661,15 +1661,18 @@ _Open_ and _closed_ enums are [pre-existing industry terms][acord-xml]. unlisted enum values as unknown fields, and changed the semantics to open enums with the `proto3` syntax. This was in part because of lessons learned from protocol evolution and service deployment as described above. -- Swift uses both closed and open enums, based on if it is `@frozen`. An - `@unknown default` branch is required for open enums, the `@unknown` being - another way to achieve the design goals of the - `non_exhaustive_omitted_range_patterns` lint. +- Swift uses both closed and open enums for enums with data, based on if it's + compiled in library evolution mode and marked `@frozen`. A `default` branch is + required when [`switch`ing on a _nonfrozen enumeration_][swift-open-enums], + and an `@unknown default` emits a warning if there are named enumeration cases + that utilize that branch. This achieves the same goal as the + `non_exhaustive_omitted_range_patterns` lint in a different manner. 
[acord-xml]: https://docs.oracle.com/cd/B40099_02/books/ConnACORDFINS/ConnACORDFINSApp_DataType10.html [cpp-scoped-enums]: https://en.cppreference.com/w/cpp/language/enum#Scoped_enumerations [cs-open-enums]: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/enum#conversions [cs-closed-enums]: https://github.com/dotnet/csharplang/issues/3179 +[swift-open-enums]: https://docs.swift.org/swift-book/documentation/the-swift-programming-language/statements/#Switching-Over-Future-Enumeration-Cases [protobuf-enum]: https://developers.google.com/protocol-buffers/docs/reference/cpp-generated#enum ### Other crates that use open enums From e939f1d4c18412d60688c28a30b9f60107cd449f Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 21 Jan 2026 13:49:02 -0800 Subject: [PATCH 14/42] Reword some alternative discussion --- text/3894-unnamed-variants.md | 102 ++++++++++++++++++++++------------ 1 file changed, 65 insertions(+), 37 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 836d0338d3a..6d8426ffe58 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1286,10 +1286,12 @@ impl Color { However: -- Open enums require even more typing for the desired semantics. -- `derive`s cannot be easily written with enum variant names. In order to avoid - duplicating the names, a `derive` macro must directly inter-operate with - another macro that generates these pseudo-variants like `open-enum`. +- Open enums require even more typing for the desired semantics. They remain + a degraded experience compared to closed enums. +- It's very hard to compose macros that use this pattern. Macros cannot easily + manipulate enum variant names, especially if a macro is responsible for + generating the pseudo-variants. A bespoke attribute must be generated and + recognized by other macros that support open enums to use. 
- This is less discoverable than a user trying to `as` cast to an enum and having the compiler inform them of `_ = ..` as an option. - It's not clear how this would relate to the functionality of the @@ -1430,6 +1432,7 @@ have to have a name and thus can be assigned to discontiguous ranges. What if instead this were valid? ```rust +#[repr(u8)] enum IpProto { Tcp = 6, Udp = 17, @@ -1439,24 +1442,66 @@ enum IpProto { This is not mutually exclusive with unnamed variants, but this RFC chooses to leave reserved ranges of discriminants as anonymous to keep the feature simple. +It can be left as [future work](#discriminant-ranges-for-named-variants) +for the language. Some of the issues are: -- It is ambiguous what value should be chosen when `IpProto::Other` is used in - an expression. -- Even with an arbitrary rule to choose a discriminant, a consistently - performant `derive(PartialEq)` that compares discriminants instead of ranges - of discriminants will result in - `matches!(o, IpProto::Other) && o != IpProto::Other`. - - A reasonable but less useful alternative is to reject expression usage as - well as `derive(PartialEq)`. -- If discontiguous ranges are allowed as above, the performance of - `matches!(o, EnumName::Variant)` gets worse as the number of variants grows. - Adding an `Icmp = 1` variant affects `matches!(1 as IpProto, IpProto::Other)`: - it is an API-breaking change. -- A `derive` can be used to determine whether an enum's discriminant is assigned - to a named variant. -- Anonymous discriminant values are useful on their own for enum evolution. + it is an API-breaking change. Unnamed variants are more useful for enum + evolution - a key design goal. +- It is ambiguous what value should be chosen when `IpProto::Other` is used in + an expression. Some reasonable ways to avoid that are: + - Define an arbitrary rule to choose a discriminant for an `IpProto::Other` + expression. + - Forbid direct construction of `IpProto::Other`. 
It can only be constructed + via `unsafe` or, for open enums, `as`-cast from the backing integer to + `IpProto`. There's no check that the discriminant represents an `Other` + variant. + - A discriminant that is valid for `IpProto::Other` must be provided when + constructing the variant. Bikeshed syntax: `x as IpProto::Other`. + - A simple implementation requires that the discriminant `x` be a `const` + value to be checked at compile time as a valid discriminant for + `IpProto::Other`. + - To support dynamic values, this would either have to be a fallible enum + constructor or use pattern types to ensure that the input value is valid + for the `Other` variant. +- Even if the expression ambiguity issue is resolved, it is not clear how + `derive(PartialEq)` should function. Currently it always compares discriminant + values, but if that is kept, then it's possible for + `matches!(o, IpProto::Other) && o != IpProto::Other`. If `derive(PartialEq)` + treats all `IpProto::Other` as equal, then it may drastically reduce the + performance of the `derive` without an obvious opt-in by the author. + - If named variants' ranges can overlap other named variants as shown above, + then the performance of `matches!(o, IpProto::Other)` degrades as further + variants are added and the set of discriminants representing `Other` becomes + more sparse. No other pattern has this characteristic, where the performance + of matching a pattern is affected by unmentioned properties of the matched + type. + +Most of the utility provided by named variant discriminant ranges can be +provided by replacing it with unnamed variants and using a macro to determine +whether an enum's discriminant is assigned to a named variant. This makes it +clear that declaring a new named variant with an unnamed variant's discriminant +will affect the method's return value. For example: -This can be left as future work for the language. 
+```rust
+#[repr(transparent, u32)]
+#[derive(IsNamedVariant)]
+enum IpProto {
+    Tcp = 6,
+    Udp = 17,
+    _ = ..,
+}
+
+/// Equivalent to fallibly building `IpProto::Other` from `x`.
+fn build_unknown_proto(x: u32) -> Option<u32> {
+    (!(x as IpProto).is_named_variant()).then_some(x as u32)
+}
+
+assert!(!(3u32 as IpProto).is_named_variant());
+assert!(build_unknown_proto(3).is_some());
+assert!((6u32 as IpProto).is_named_variant());
+assert!(build_unknown_proto(6).is_none());
+```

 ### `..` at the end

@@ -1519,24 +1564,7 @@ if let IpProto::Other(x) = IpProto::Other(6) {
 }
 ```

-Instead, to get this behavior with this RFC's proposed syntax, the
-author could use a third-party derive to check against the named variants,
-and an `unsafe` transmute or `as` cast to construct the enum value from
-integer. This makes it clear that declaring a new named variant with an
-unnamed variant's discriminant will affect the method's return value.
-
-```rust
-#[repr(transparent, u32)]
-#[derive(IsNamedVariant)]
-enum IpProto {
-    Tcp = 6,
-    Udp = 17,
-    _ = ..,
-}
-
-assert!(!(3u32 as IpProto).is_named_variant());
-assert!((6u32 as IpProto).is_named_variant());
-```
+A `derive(IsNamedVariant)` macro as shown above could replace this behavior.

 ### Forbid unnamed variants' discriminants from overlapping named ones

From cb9eefc50ebb13f9f3683173d3539b288759af8f Mon Sep 17 00:00:00 2001
From: Alyssa Haroldsen
Date: Fri, 23 Jan 2026 09:53:02 -0800
Subject: [PATCH 15/42] Mention default discriminant attribute for Other-variant alt

---
 text/3894-unnamed-variants.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md
index 6d8426ffe58..411a7e424af 100644
--- a/text/3894-unnamed-variants.md
+++ b/text/3894-unnamed-variants.md
@@ -1452,6 +1452,8 @@ for the language. Some of the issues are:
   an expression. Some reasonable ways to avoid that are:
   - Define an arbitrary rule to choose a discriminant for an `IpProto::Other`
     expression.
+ - The enum author uses an attribute to specify the "default" discriminant for + an `IpProto::Other` expression. - Forbid direct construction of `IpProto::Other`. It can only be constructed via `unsafe` or, for open enums, `as`-cast from the backing integer to `IpProto`. There's no check that the discriminant represents an `Other` From b0c64e36b246e79c9352343a77193b169e89264f Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Fri, 23 Jan 2026 18:44:15 -0800 Subject: [PATCH 16/42] Clean up unnamed variant vs. declaration Distinguish between unnamed variant declarations (the actual syntax), and the individual unnamed variants that each discriminant represents in the declared range. Also clean up some other language. --- text/3894-unnamed-variants.md | 104 ++++++++++++++++++---------------- 1 file changed, 54 insertions(+), 50 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 411a7e424af..55c2b82e288 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -325,8 +325,8 @@ let fruit: Fruit = unsafe { core::mem::transmute(5u32) }; // `fruit` is not any of the named variants. assert!(!matches!(fruit, Fruit::Apple | Fruit::Orange | Fruit::Banana)); -// These are both rejected: an unnamed variant can't construct reserved -// discriminants or patttern match on them. +// These are both rejected: unnamed variants can't be constructed with +// a variant expression, nor pattern match directly on them. // assert!(!matches!(fruit, Fruit::_)); // let fruit = Fruit::_; ``` @@ -431,16 +431,17 @@ is assigned to a set of **reserved discriminants**. These discriminants are valid for the enum, and may be assigned to a named variant in the future. It is valid to `transmute` to an enum type from a reserved discriminant. -An unnamed variant does not declare an identifier scoped under the enum name, +An unnamed variant does not declare a constructor scoped under the enum name, unlike a named variant. 
`EnumName::_` remains an invalid expression and pattern. -An unnamed variant may be specified more than once on the same enum. It is valid -to reserve multiple ranges of discriminants. Those ranges may be discontiguous. +An unnamed variant declaration may be specified more than once on the same enum. +It is valid to claim multiple ranges of discriminants. Those ranges may be +discontiguous. -An explicit `repr(Int)` is required on an enum to declare an unnamed variant. +To declare an unnamed variant, the `enum` must have an explicit `repr(Int)`. `Int` is one of the primitive integers or `C`. If it is `C`, then `Int` below is -`isize`. An unnamed variant must specify a discriminant expression with one of -these types: +`isize`. An unnamed variant declaration must specify a discriminant expression +with one of these types: - `Int` - Reserves a particular discriminant value. @@ -462,7 +463,9 @@ these types: `start..=end` (`core::ops::RangeInclusive`) - Ensures every discriminant value in the range is reserved. - Named variants have higher precedence than unnamed variants when assigning - discriminants to variants. + discriminants to variants. Thus, the set of discriminants claimed by an + unnamed variant declaration may be a discontiguous subset of the specified + range. ```rust #[repr(u32)] @@ -499,23 +502,24 @@ these types: - The range should be non-empty. A [`deny`-by-default lint](#empty-discriminant-ranges) is produced if this is - violated. - - There should be at least one discriminant available to reserve in the range. + violated and the unnamed variant declaration does not introduce an unnamed + variant. + - There should be at least one discriminant available to claim in the range. A [`warn`-by-default lint](#taken-discriminant-ranges) is produced if this - is violated. + is violated and the unnamed variant declaration does not introduce an + unnamed variant. - `start..` (`core::ops::RangeFrom`) - - Equivalent to `start..=Int::MAX`. 
+ - Equivalent to `start..=Int::MAX` for non-`repr(C)` enums. - `..end` (`core::ops::RangeTo`) - - Equivalent to `Int::MIN..end`. + - Equivalent to `Int::MIN..end` for non-`repr(C)` enums. - `..=end` (`core::ops::RangeToInclusive`) - - Equivalent to `Int::MIN..=end`. + - Equivalent to `Int::MIN..=end` for non-`repr(C)` enums. - `..` (`core::ops::RangeFull`) - - Equivalent to `Int::MIN..=Int::MAX`. + - Equivalent to `Int::MIN..=Int::MAX` for non-`repr(C)` enums. - Reserves the rest of the discriminants for `Int`. This always makes an enum open without consideration for named variants' discriminants. - Because unnamed variants cannot have conflicting discriminants, this is the - only unnamed variant allowed on the enum when used. It is called the enum's - _open variant_. + only unnamed variant declaration allowed on the enum when used. ```rust // error: discriminant value `1` assigned more than once @@ -602,11 +606,11 @@ const _: () = assert!( ); ``` -The unbounded end of a discriminant range never affects the backing integer of a -`repr(C)` enum. When a range with an unbounded end (`start..`, `..end`, -`..=end`, `..`) is used as an unnamed variant's discriminant expression in a -`repr(C)` enum, the set of discriminants that is reserved by that unbounded end -is dependent on the other variants' discriminants. +The unbounded end of a discriminant range **never** affects the backing integer +of a `repr(C)` enum. For a `repr(C)` enum, when a range with an unbounded end +(`start..`, `..end`, `..=end`, `..`) is used as an unnamed variant declaration's +discriminant expression, the effective bound of the claimed range is dependent +on what the backing integer would be if no unnamed variants were declared. ```rust #[repr(C)] @@ -644,11 +648,11 @@ This behavior means that it is sound to expose a C enum defined like this: enum Foo { Name1 = Value1, Name2 = Value2, - // etc. + // ... 
}; ``` -as this Rust enum, regardless of the discriminant values assigned: +as this Rust open enum, regardless of the discriminant values assigned: ```rust // `allow` effective when there are 256 variants within the `u8`/`i8` range on @@ -658,10 +662,7 @@ as this Rust enum, regardless of the discriminant values assigned: enum Foo { Name1 = Value1, Name2 = Value2, - // etc. - - // The rest of the discriminants for an enum with the named variants - // are reserved and valid. Unchecked casts can't invoke UB. + // ... _ = .., } ``` @@ -781,8 +782,10 @@ enum Bar { `taken_discriminant_ranges` is a new `warn`-by-default lint. It should be produced if every discriminant in the range assigned to an unnamed variant is -already assigned to a named variant. Thus, the unnamed variant does not -introduce any reserved discriminants and has no effect on the enum. +already assigned to a named variant. This results in the unnamed variant +definition having no effect. While an unnamed variant is syntactically present, +no unnamed variant is introduced to the `enum` as it has no discriminants to +claim. ```rust #[repr(u8)] @@ -936,8 +939,8 @@ enum Bar { ##### Forgot to mention a named variant The unstable [`non_exhaustive_omitted_patterns`] `allow`-by-default lint should -be produced if a `match` on an enum with reserved discriminants mentions some, -but not all, of the named variants. +be produced if a `match` on an enum with unnamed variants mentions some, but not +all, of the named variants. 
 [`non_exhaustive_omitted_patterns`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_EXHAUSTIVE_OMITTED_PATTERNS.html

@@ -973,8 +976,8 @@ let name = match b {

 #### Next variant's implicit discriminant

-When a named variant without an explicit discriminant follows an unnamed
-variant, the assigned implicit discriminant is the next integer after the
+When a named variant without an explicit discriminant follows an unnamed variant
+declaration, the assigned implicit discriminant is the next integer after the
 declared discriminant range for that unnamed variant. If the unnamed variant is
 assigned to an integer, it is the next integer.

@@ -1013,7 +1016,7 @@ enum Overflow {

 #### Non-literal discriminant expression

-A non-literal range or integer is allowed for an unnamed variant.
+A non-literal range or integer is allowed for an unnamed variant declaration.

 ```rust
 const VALID_FOO: Range<u32> = 10..100;
@@ -1031,8 +1034,9 @@ let _: Foo = unsafe { mem::transmute(15u32) };

 #### Only variant

-An unnamed variant may be the only variant for an enum. In this case, an `as`
-cast or `transmute` is the only way to construct an enum value.
+An unnamed variant declaration may be the only variant declaration for an enum.
+In this case, an `as` cast or `transmute` is the only way to construct an enum
+value.

 ```rust
 #[repr(u32)]
@@ -1048,11 +1052,10 @@ integer is a valid discriminant.

 - An open enum always has an explicit `repr` backing integer, or is `repr(C)`.
 - An enum is open if every discriminant value for that integer is associated
-  with a named variant or is reserved with an unnamed variant.
+  with a named or unnamed variant.
   - For a field-less enum, this means every initialized bit pattern is valid.
-- A [unit-only] open enum may be `as` cast from its backing integer:
+- A [unit-only] open enum may be `as` cast from its backing integer _only_:
   `2u8 as Color`. See below for `repr(C)` behavior.
-  - Casting from other integer types is rejected.
- If an expression with the `{integer}` inference variable type is used as the source for an `as` cast to an open enum, it is uniquely constrained to the explicit backing integer type. This excludes `repr(C)`; see below. @@ -1177,16 +1180,17 @@ _ = signed_byte as SmallSigned; ### Interaction with the standard library -- `derive(Debug)` formats as `EnumName(X)` when `X` is a reserved discriminant. - A `Debug` format changing is not considering an API-breaking change. -- `Default` forbids `#[default]` from being specified on an unnamed variant, - but this may change in the future. +- `derive(Debug)` formats as `EnumName(X)` when formatting an unnamed variant: + `X` is its claimed discriminant. A `Debug` format changing is not considered + an API-breaking change. +- `Default` forbids `#[default]` from being specified on an unnamed variant, but + this may change in the future. - The derives `Clone`, `Copy`, `Eq`, `Hash`, `Ord`, `PartialEq`, and - `PartialOrd` are unaffected by unnamed variants on a field-less enum. - They all operate on discriminants, and this includes reserved discriminants. -- `mem::Discriminant` continues to operate as before, always treating - field-less enum values with the same discriminant integers as equal and - those with different discriminant integers as non-equal. + `PartialOrd` are unaffected by unnamed variants on a field-less enum. They all + operate on discriminants, including those assigned to unnamed variants. +- `mem::Discriminant` continues to operate as before, always treating field-less + enum values with the same discriminant integers as equal and those with + different discriminant integers as non-equal.
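The standard-library behavior listed above can be approximated on stable Rust with the newtype pattern that the RFC cites as prior art. The block below is only a sketch under that assumption: `Color`, its constants, and the hand-written `Debug` impl are hypothetical stand-ins, not the proposed `derive(Debug)` expansion. It shows a named value formatting by name and any other discriminant formatting as `Color(X)`, with equality operating purely on the discriminant.

```rust
use std::fmt;

// Hypothetical stand-in for an open enum such as
// `#[repr(u8)] enum Color { Red = 0, Green = 1, Blue = 2, _ = .. }`,
// emulated with a transparent newtype because the RFC syntax is not
// available on stable Rust.
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Color(u8);

#[allow(non_upper_case_globals)]
impl Color {
    pub const Red: Color = Color(0);
    pub const Green: Color = Color(1);
    pub const Blue: Color = Color(2);
}

impl fmt::Debug for Color {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            // Named "variants" print their name.
            Color::Red => f.write_str("Red"),
            Color::Green => f.write_str("Green"),
            Color::Blue => f.write_str("Blue"),
            // Every other discriminant prints as `Color(X)`, mirroring the
            // `EnumName(X)` format described for unnamed variants.
            Color(other) => write!(f, "Color({other})"),
        }
    }
}
```

Because the comparison derives act on the wrapped integer, two values holding the same unnamed discriminant compare equal, which matches the rule that these derives operate on discriminants.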
## Drawbacks From 40c60c4caa1492b76fb28fef8b591e8b25563925 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Thu, 22 Jan 2026 13:46:29 -0800 Subject: [PATCH 17/42] Use 'claim' instead of 'reserve' for discriminants The term 'reserve' can be confusing to those familiar with networking protocols, as it often means "was used in the past and can no longer be used any more" as in Protobuf's `reserved` keyword. This commit switches to using 'claim' instead, and avoids giving a particular name to the set of discriminants that are solely assigned to unnamed variants. --- text/3894-unnamed-variants.md | 133 ++++++++++++++++++---------------- 1 file changed, 69 insertions(+), 64 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 55c2b82e288..997a1e307ee 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -6,11 +6,10 @@ ## Summary -Enable ranges of enum discriminants to be reserved ahead of time, requiring -all users of that enum to consider those values as valid. This includes within -the declaring crate. +Enable ranges of enum discriminants to be considered valid by all users ahead +of time. This includes within the declaring crate. -`_ = RANGE` is an _unnamed variant_ definition. It specifies that enum +`_ = RANGE` is an _unnamed variant_ declaration. It specifies that enum discriminants in `RANGE` are valid. It is sound to construct unnamed variants with `unsafe`, and to handle them over FFI. If there is no invalid discriminant for an enum, it becomes an _open enum_. If it is [unit-only], it can then be @@ -72,13 +71,13 @@ What if it isn't feasible to recompile **every** part of the system that uses the enum in order to avoid the breaking change? ```rust -/// `TaskState` v1 reserves discriminants instead of using `non_exhaustive`. +/// `TaskState` v1 is an open enum instead of using `non_exhaustive`. 
#[repr(u8)] #[derive(Clone, Copy, Debug, PartialEq, Eq)] pub enum TaskState { Stopped = 0, Running = 1, - // There are reserved variants for the rest of the discriminants: + // There are unnamed variants for the rest of the discriminants: // The `_` resembles a wildcard seen when `match`ing. _ = .., } @@ -174,7 +173,7 @@ breakage. Ensuring ABI compatibility when extending a library requires extra care. While `non_exhaustive` grants API compatibility as variants are added, it [does _not_ -provide ABI compatibility][non-exhaustive-ub]. By reserving discriminants for +provide ABI compatibility][non-exhaustive-ub]. By claiming discriminants for future extensions to an enum, libraries can choose to remain ABI forwards-compatible as new variants are added. @@ -242,7 +241,7 @@ macro_rules! make_ranged_int { type Error = (); fn try_from(val: $repr) -> Result<$name, ()> { match val { - // SAFETY: `val` is a valid discriminant for `$name` + // SAFETY: `val` is a valid discriminant for `$name`. $($range)* => Ok(unsafe { mem::transmute(val) }), _ => Err(()), } @@ -308,8 +307,8 @@ let fruit: Fruit = unsafe { core::mem::transmute(5u32) }; assert_eq!(mem::transmute(Option::::None, 2u32)); ``` -However, by declaring an **unnamed variant**, the discriminant `5` is _reserved_ -and becomes sound to transmute from. +However, by declaring an **unnamed variant**, the discriminant `5` is _claimed_ +by `Fruit` and becomes sound to transmute from. ```rust #[repr(u32)] // An explicit repr is required to declare an unnamed variant. @@ -319,7 +318,7 @@ enum Fruit { Banana = 4, // Banana is represented with 4u32. _ = 5, // Some future variant will be represented with 5u32. } -// SAFETY: 5 is a reserved discriminant for `Fruit`. +// SAFETY: 5 is a valid discriminant for `Fruit`. let fruit: Fruit = unsafe { core::mem::transmute(5u32) }; // `fruit` is not any of the named variants. 
@@ -346,7 +345,7 @@ match fruit { ``` An unnamed variant accepts a range as its discriminant expression, which ensures -each discriminant in the range is reserved and valid to use. +each discriminant in the range is claimed and valid to use. ```rust #[repr(u32)] // Fruit is represented with specific discriminants of `u32`. @@ -354,9 +353,10 @@ enum Fruit { Apple, // Apple is represented with 0u32. Orange, // Orange is represented with 1u32. Banana = 4, // Banana is represented with 4u32. - _ = 3..=10, // 3 through 10 inclusive are valid discriminants for `Fruit`. + _ = 3..=10, // Unnamed variants in `Fruit` are represented with + // discriminants 3 through 10 inclusive. } -// SAFETY: 7 is a reserved discriminant for `Fruit` +// SAFETY: 7 is a valid discriminant for `Fruit`. let fruit: Fruit = unsafe { core::mem::transmute(7u32) }; ``` @@ -371,7 +371,8 @@ enum Fruit { Apple, // Apple is represented with 0u32. Orange, // Orange is represented with 1u32. Banana = 4, // Banana is represented with 4u32. - _ = .., // The rest of the discriminants in `u32` are reserved. + _ = .., // Unnamed variants in `Fruit` are represented with the + // remaining discriminants in `u32`. } // Using an `as` cast from `u32`. let fruit = 3 as Fruit; @@ -413,8 +414,8 @@ enum may be added as the type evolves. By contrast, an unnamed variant affects API _and_ ABI semver compatibility: -- It reserves specific ranges of discriminants. -- These reserved discriminants are valid to represent without naming the future +- It claims specific ranges of discriminants. +- These claimed discriminants are valid to represent without naming the future variants that use them. - Crates can manipulate these unnamed enum variants without recompilation. - It affects all crates, including the declaring one. @@ -426,10 +427,12 @@ better choice. This is often the case for enums declaring an explicit `repr`. ### Unnamed variants -An **unnamed variant** is an enum variant with `_` declared for its name. 
It -is assigned to a set of **reserved discriminants**. These discriminants are -valid for the enum, and may be assigned to a named variant in the future. It is -valid to `transmute` to an enum type from a reserved discriminant. +An **unnamed variant declaration** is an `enum` variant declaration with `_` as +the variant's name. It is assigned a set of **claimed discriminants**, each +element of that set representing a single **unnamed variant** of the `enum`. +These unnamed variants are valid for the enum and their claimed discriminants +may be reassigned to named variants in the future. It is valid to `transmute` to +an enum type from a claimed discriminant. An unnamed variant does not declare a constructor scoped under the enum name, unlike a named variant. `EnumName::_` remains an invalid expression and pattern. @@ -444,7 +447,7 @@ To declare an unnamed variant, the `enum` must have an explicit `repr(Int)`. with one of these types: - `Int` - - Reserves a particular discriminant value. + - Claims a particular discriminant value. - The discriminant must not be assigned to another variant of the enum - whether named or unnamed. @@ -461,7 +464,7 @@ with one of these types: - `start..end` (`core::ops::Range`) or\ `start..=end` (`core::ops::RangeInclusive`) - - Ensures every discriminant value in the range is reserved. + - Ensures every discriminant value in the range is claimed. - Named variants have higher precedence than unnamed variants when assigning discriminants to variants. Thus, the set of discriminants claimed by an unnamed variant declaration may be a discontiguous subset of the specified @@ -473,13 +476,14 @@ with one of these types: Ok = 200, NotFound = 404, // Ensures the discriminants in 100..=599 are valid for Self. - // Actually reserves 100..=199, 201..=403, and 405..=599. + // Actually claims 100..=199, 201..=403, and 405..=599. _ = 100..=599, } ``` - - The range must not overlap with discriminants assigned to unnamed variants. 
- Multiple unnamed variants have equal claim to a discriminant value. + - The range must not overlap with discriminants claimed by other unnamed + variants. Multiple unnamed variant declarations have equal claim to a + discriminant value. ```rust #[repr(u8)] @@ -516,7 +520,7 @@ with one of these types: - Equivalent to `Int::MIN..=end` for non-`repr(C)` enums. - `..` (`core::ops::RangeFull`) - Equivalent to `Int::MIN..=Int::MAX` for non-`repr(C)` enums. - - Reserves the rest of the discriminants for `Int`. This always makes an enum + - Claims the rest of the discriminants for `Int`. This always makes an enum open without consideration for named variants' discriminants. - Because unnamed variants cannot have conflicting discriminants, this is the only unnamed variant declaration allowed on the enum when used. @@ -548,16 +552,16 @@ enum X { _ = validate::(20..=30), // ... } -const fn validate>(x: T) -> T { x } -trait ReserveDiscriminants {} -impl ReserveDiscriminants for u32 {} -// ... impl ReserveDiscriminants for Int {} ... -impl ReserveDiscriminants for Range {} -impl ReserveDiscriminants for RangeInclusive {} -impl ReserveDiscriminants for RangeFrom {} -impl ReserveDiscriminants for RangeTo {} -impl ReserveDiscriminants for RangeToInclusive {} -impl ReserveDiscriminants for RangeFull {} +const fn validate>(x: T) -> T { x } +trait ClaimDiscriminants {} +impl ClaimDiscriminants for u32 {} +// ... impl ClaimDiscriminants for Int {} ... +impl ClaimDiscriminants for Range {} +impl ClaimDiscriminants for RangeInclusive {} +impl ClaimDiscriminants for RangeFrom {} +impl ClaimDiscriminants for RangeTo {} +impl ClaimDiscriminants for RangeToInclusive {} +impl ClaimDiscriminants for RangeFull {} ``` #### `repr(C)` behavior @@ -616,20 +620,20 @@ on what the backing integer would be if no unnamed variants were declared. #[repr(C)] enum SmallNonnegative { X = 0, - // Reserves `1..=c_int::MAX`. + // Claims `1..=c_int::MAX`. 
_ = 1.., } #[repr(C)] enum BigOpen1 { X = isize::MAX, - // Reserves `isize::MIN..isize::MAX`. + // Claims `isize::MIN..isize::MAX`. _ = .., } #[repr(C)] enum BigOpen2 { - // Reserves `isize::MIN..0`. + // Claims `isize::MIN..0`. _ = ..0, _ = 0..=isize::MAX, } @@ -722,7 +726,7 @@ Given enum versions A and B with some change between them: It is an API and ABI fully-compatibile change to: - Add a named variant to a field-less enum using a discriminant that was - previously reserved. + previously claimed. - When doing the this, removing the last unnamed variant may cause warnings for unused code in client libraries, as a wildcard branch is no longer required. This can be avoided by then adding `#[non_exhaustive]` to the @@ -732,7 +736,7 @@ It is an API fully-compatible and ABI backwards-compatible change to: - Replace `#[non_exhaustive]` on an enum with an unnamed variant. - This may require changes to the defining crate to add wildcard branches. -- Add another reserved discriminant, if an unnamed variant already exists on the +- Add another claimed discriminant, if an unnamed variant already exists on the enum. It is an API and ABI backwards-compatible change to: @@ -771,7 +775,7 @@ enum Bar { - It is almost always a mistake to specify an empty range. - An empty or negative range could accidentally cause UB if certain - discriminants are expected to be reserved but are not due to reversing the + discriminants are expected to be claimed but are not due to reversing the `start` and `end` of the range. Thus, it is `deny`-by-default. - If `allow`ed, the unnamed variant declaration has no effect. - There are rare use cases involving macro or non-literal discriminants @@ -1214,7 +1218,7 @@ non-integer type that defines `BitOr` and has structural equality. 
## Rationale and alternatives -Unnamed variants enable a large range of discriminants to be reserved for an +Unnamed variants enable a large range of discriminants to be claimed for an enum, whether it's all or some of them. `NonZero`, and an `enum` spelling out each discriminant are the only other ways to achieve this in stable Rust today. @@ -1327,8 +1331,8 @@ This has the same interaction with `#[non_exhaustive]`. The drawbacks: - There are many alternative syntaxes for this, such as `#[non_exhaustive(repr)]` or `[open]` / `#[open(Range)]`. All should require a `repr(Int)` be specified. -- Allowing for a reservation of particular ranges instead of a full opening - could be done with a pattern-type-like syntax, but this is less discoverable: +- Allowing a claim of particular ranges instead of a full opening could be done + with a pattern-type-like syntax, but this is less discoverable: ```rust #[repr(u8 in 1..=100)] @@ -1352,10 +1356,10 @@ This has the same interaction with `#[non_exhaustive]`. The drawbacks: #[repr(u32)] enum Foo { X, - // Reserves `1..=4`. + // Claims `1..=4`. _ = .., Y = 5, - // Reserves `6..=10`. + // Claims `6..=10`. _ = ..=10, } @@ -1418,10 +1422,10 @@ enum Foo { } ``` -### Declare niches instead of reserving values +### Declare niches instead of claiming discriminants If an enum selects its discriminants such that a desirable niche exists, like -`0`, perhaps it is better to declare ranges of niches rather than reserving +`0`, perhaps it is better to declare ranges of niches rather than claiming discriminants? It can be very confusing to mix positive and negative assertions, and this would @@ -1445,9 +1449,9 @@ enum IpProto { ``` This is not mutually exclusive with unnamed variants, but this RFC chooses to -leave reserved ranges of discriminants as anonymous to keep the feature simple. -It can be left as [future work](#discriminant-ranges-for-named-variants) -for the language. 
Some of the issues are: +leave claimed ranges of discriminants as anonymous to keep the feature simple. +It can be left as [future work](#discriminant-ranges-for-named-variants) for the +language. Some of the issues are: - Adding an `Icmp = 1` variant affects `matches!(1 as IpProto, IpProto::Other)`: it is an API-breaking change. Unnamed variants are more useful for enum @@ -1589,7 +1593,7 @@ desirable. - Forbidding named variant overlaps with `_ = ..` makes it nearly useless, since it then must be the only variant for the enum. -- Giving `..` special behavior to reserve "the rest" of the variants is then +- Giving `..` special behavior to claim "the rest" of the variants is then inconsistent with other ranges' behavior. - There is precedent for `..` acting differently than other ranges, such as when `match`ing a number or `char`. This `..`, however, is an expression and @@ -1597,15 +1601,16 @@ desirable. - It cannot be reasonably be equivalent to `Int::MIN..=Int::MAX` without that range allowing named variant overlap. -### Require an unnamed variant reserve at least one discriminant +### Require an unnamed variant claim at least one discriminant -It is a desirable property for an unnamed variant to always introduce a reserved -discriminant. +It is a desirable property for an unnamed variant declaration to always +claim at least one discriminant. -This would mean that an unnamed variant's presence in an enum always requires a +This would mean that an unnamed variant declaration in an enum always requires a wildcard branch when `match`ing. Otherwise, a peculiar situation is possible in -which an enum definition declares an unnamed variant, but does not have any -reserved discriminants and thus no wildcard branch is needed. +which an enum definition declares unnamed variants, but since the set of claimed +discriminants is empty, does not actually define any unnamed variants and thus +no wildcard branch is needed. 
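To make the "empty set of claimed discriminants" case concrete, here is a minimal sketch of how the claimed set could be computed from a declared range and the named variants' discriminants. The function `claimed_ranges` is a hypothetical helper written for illustration and is not part of the RFC or the compiler; it assumes a `u32` backing integer and an inclusive declared range.

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper: returns the sub-ranges of `declared` that are not
/// already assigned to a named variant, i.e. the discriminants an unnamed
/// variant declaration would actually claim.
fn claimed_ranges(declared: RangeInclusive<u32>, named: &[u32]) -> Vec<RangeInclusive<u32>> {
    // Keep only named discriminants inside the declared range, in order.
    let mut taken: Vec<u32> = named
        .iter()
        .copied()
        .filter(|d| declared.contains(d))
        .collect();
    taken.sort_unstable();
    taken.dedup();

    let mut out = Vec::new();
    // Next candidate discriminant; `None` once we have stepped past `u32::MAX`.
    let mut start = Some(*declared.start());
    for d in taken {
        let Some(s) = start else { break };
        if d > s {
            out.push(s..=d - 1);
        }
        start = d.checked_add(1);
    }
    if let Some(s) = start {
        if s <= *declared.end() {
            out.push(s..=*declared.end());
        }
    }
    out
}
```

With the `HttpStatus`-style input `100..=599` and named discriminants `200` and `404`, this yields the three sub-ranges quoted earlier in the RFC text. When every discriminant in the range is already named, the result is empty, which is the situation in which no wildcard branch would be needed.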
However, upholding this requirement prevents `_ = ..` from always working to mean "ensure this enum is open". In order for macros or codegen like `bindgen` @@ -1824,7 +1829,7 @@ assert_eq!(e as u8, 10); ### Unnamed variants on enums with field data Unnamed variants on enums with field data would allow library authors to plan -for future ABI compatibility by reserving discriminants and data space for an +for future ABI compatibility by claiming discriminants and data space for an enum. This requires significantly more documentation and care regarding ABI stability before this can be stabilized. @@ -1839,7 +1844,7 @@ pub enum Shape { } ``` -- This reserves discriminants `2..=10` as valid for the `Shape` enum. It's not +- This claims discriminants `2..=10` as valid for the `Shape` enum. It's not an ABI-breaking change to add new variants with data to `Shape` using these discriminants, so long as it doesn't affect the layout of the `Shape`. - `Drop` glue is forbidden for field data (for a similar reason as `union`). @@ -1854,7 +1859,7 @@ pub enum Shape { Circle { radius: f32 } = 0, Rectangle { width: f32, height: f32 } = 1, - // This reserves discriminants `2..=10` and the layout to hold a + // This claims discriminants `2..=10` and reserves the layout to hold a // thin pointer without breaking ABI. It's as if there were a variant // for `&'static ()` in the enum's internal `union`. _(&'static ()) = 2..=10, @@ -1973,7 +1978,7 @@ enum X { let mut x = X::A; assert_eq!(x.0, 0); -// SAFETY: 1 is a valid discriminant for `X` +// SAFETY: 1 is a valid discriminant for `X`. unsafe { x.0 += 1; } assert!(matches!(x, X::B)); From eb42cd8ea096b4b14b818f8aaf2770cbf3c02a69 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 11 Mar 2026 12:24:37 -0700 Subject: [PATCH 18/42] Clarify explicit `repr` is required for discriminant field access It's too implicit if the section is taken out of context. 
--- text/3894-unnamed-variants.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 997a1e307ee..aba3eb2950e 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1925,7 +1925,8 @@ assert!( #### Discriminant field access -`.0` provides direct access to the discriminant value: +`.0` provides direct access to the discriminant value of `enum`s with an +explicit representation: ```rust let mut c = Color::Blue; @@ -1940,6 +1941,9 @@ of an `enum`. One possibility: when introduced, treat as deprecated and throw a warning to recommend a better syntax than `.0` but still allow the desired non-breaking migration. +As with unnamed variants, the `enum` must not be `repr(Rust)` in order to +guarantee that an integer is used as the discriminant. + There are a few distinct advantages compared to `as` casting: - It is possible to get a reference directly to the discriminant, which can be From 2889e7a02d1633b9719e8da9f5676b7b4ef453cf Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 11 Mar 2026 12:24:55 -0700 Subject: [PATCH 19/42] Link other proposals for discriminant field access --- text/3894-unnamed-variants.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index aba3eb2950e..3a7f77cac3b 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1990,6 +1990,12 @@ assert!(matches!(x, X::B)); [unsafe field]: https://rust-lang.github.io/rust-project-goals/2025h2/unsafe-fields.html +There are also existing proposals to [read][rfc-3607] and [write][rfc-3727] this +discriminant directly with different syntax. 
+ +[rfc-3607]: https://github.com/rust-lang/rfcs/pull/3607 +[rfc-3727]: https://github.com/rust-lang/rfcs/pull/3727 + ### Extracting the integer value of the discriminant for fielded enums A fielded enum with `#[repr(Int)]` and/or `#[repr(C)]` is guaranteed to have its From 28dae0be825e9cd36cc8bef58e48efd06a3d2aeb Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Thu, 12 Mar 2026 09:05:40 -0700 Subject: [PATCH 20/42] Describe syntax of enum discriminant alternatives Also use a more stable link for unmerged/proposed RFCs. --- text/3894-unnamed-variants.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 3a7f77cac3b..c80dcd83001 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1769,10 +1769,10 @@ concepts by reserving space for `payload` to be held in the enum. ### `repr(open)` RFC -There's an [unmerged RFC][enum-repr-open] that defines a `repr(open)` syntax as +There's an [RFC proposal][enum-repr-open] that defines a `repr(open)` syntax as described in the Alternatives section above. -[enum-repr-open]: https://github.com/madsmtm/rfcs/blob/enum-repr-no-niches/text/3803-enum-repr-open.md +[enum-repr-open]: https://github.com/rust-lang/rfcs/pull/3803 ## Unresolved questions @@ -1991,7 +1991,9 @@ assert!(matches!(x, X::B)); [unsafe field]: https://rust-lang.github.io/rust-project-goals/2025h2/unsafe-fields.html There are also existing proposals to [read][rfc-3607] and [write][rfc-3727] this -discriminant directly with different syntax. +discriminant directly. They propose alternative syntax, with +`.enum#discriminant` rather than `.0` and `discriminant_of!`/`set_discriminant` +built-ins respectively. 
[rfc-3607]: https://github.com/rust-lang/rfcs/pull/3607 [rfc-3727]: https://github.com/rust-lang/rfcs/pull/3727 From 93c963f1ee0de82805ef7b8c6a44c06a97238db8 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Thu, 2 Apr 2026 08:55:36 -0700 Subject: [PATCH 21/42] Clarify "No field data" section --- text/3894-unnamed-variants.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index c80dcd83001..c56170cc0c4 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -686,8 +686,8 @@ EnumVariant -> #### No field data -This RFC only defines adding unnamed variants to field-less enums, leaving this -as future work. +This RFC only defines adding unnamed variants to field-less enums, leaving +unnamed variants in enums with fields as future work. #### `non_exhaustive` From edede1741359808f7842cb275588e4a67b72c923 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 22 Apr 2026 07:35:26 -0700 Subject: [PATCH 22/42] Rewrite the Compatibility section and clarify terminology Rewritten to clarify the valid conversion from `struct` and describe CFI. s/forwards/forward/ compatibility s/backwards/backward/ compatibility s/foreign/downstream/ --- text/3894-unnamed-variants.md | 163 ++++++++++++++++++++++++++-------- 1 file changed, 125 insertions(+), 38 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index c56170cc0c4..fef365f38dd 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -46,9 +46,9 @@ pub enum TaskState { } ``` -`non_exhaustive` is specified for forwards compatibility, since it should be a +`non_exhaustive` is specified for forward compatibility, since it should be a non-breaking change for variants to be added to `TaskState`. This works by -requiring foreign crates to include a wildcard branch when `match`ing. 
Once a +requiring downstream crates to include a wildcard branch when `match`ing. Once a new `Paused` variant is added to `TaskState`, any code that previously compiled when using the `TaskState` will continue to do so. However, if any part of the system is _not_ recompiled, that old code will see the `Paused` variant as @@ -91,7 +91,7 @@ crate, to handle the case where it's not one of the currently-named variants. ### Protobuf Protocol Buffers (Protobuf), a language-neutral serialization mechanism, is -designed to be forwards and backwards compatible when extending a schema. +designed to be forward and backward compatible when extending a schema. Initially, it defined all of its enums as closed. However, this caused confusing and often incorrect behavior with `repeated` enums, and so the `proto3` syntax [switched to open enums][protobuf-history]. Handling unknown values @@ -175,7 +175,7 @@ Ensuring ABI compatibility when extending a library requires extra care. While `non_exhaustive` grants API compatibility as variants are added, it [does _not_ provide ABI compatibility][non-exhaustive-ub]. By claiming discriminants for future extensions to an enum, libraries can choose to remain ABI -forwards-compatible as new variants are added. +forward compatible as new variants are added. Projects like Redox and relibc would use this feature for this reason among others listed. @@ -410,7 +410,7 @@ enum may be added as the type evolves. - It is flexible in how new variants are represented. - It does _not_ affect what discriminants are currently valid to represent. - Crates must be recompiled to use new enum variants. -- It affects _only_ foreign crates. +- It affects _only_ downstream crates. By contrast, an unnamed variant affects API _and_ ABI semver compatibility: @@ -711,38 +711,124 @@ declaring crate - the enum is "universally non-exhaustive". 
#### Compatibility -Given enum versions A and B with some change between them: - -- A change is forwards-compatible if a library designed for enum version A can - use A or B. -- A change is backwards-compatible if a library designed for enum version B can - use A or B. -- A change is fully-compatible if it is both forwards and backwards compatible. -- A change is API compatible if the change does not affect static compilation - using a single enum source, either A or B. -- A change is ABI compatible if the change does not affect dynamically linked - libraries compiled using enum versions A and B (with the same Rust compiler). - -It is an API and ABI fully-compatibile change to: - -- Add a named variant to a field-less enum using a discriminant that was - previously claimed. - - When doing the this, removing the last unnamed variant may cause warnings - for unused code in client libraries, as a wildcard branch is no longer - required. This can be avoided by then adding `#[non_exhaustive]` to the - enum. +Given an enum in crate version _A_, published first, and version _B_ introducing +some change to the enum: + +- An enum in _A_ or _B_ may be a `repr(Int)` true `enum` with unnamed variants + as described by this RFC or a newtype `struct` wrapping `Int` with `pub` + associated constants for each named variant. +- A change is **API compatible** if idiomatic downstream code designed for + version _A_ behaves correctly when it's upgraded to version _B_ and statically + recompiled. + - This excludes breaking changes due to glob imports and other discouraged + behavior. + - Compatibility is required in only one direction: downstream source code + written with _A_ must continue to compile with _B_, not vice versa. + - This corresponds to a **minor change** as defined by [RFC 1105] and the + [Cargo SemVer Reference]; we use "API compatible" here to distinguish from + ABI concerns. 
By contrast, a **major change** requires non-trivial + changes to be made in downstream source code to accommodate it. +- The enum in _A_ is **ABI forward compatible** with _B_ if code that is + compiled to receive enum values of version _A_ functions correctly when it + receives values with the ABI of version _B_. +- The enum in _B_ is **ABI backward compatible** with _A_ if code that is + compiled to receive enum values of version _B_ functions correctly when it + receives values with the ABI of version _A_. This is relevant to separate + compilation of interoperating systems, such as with plugins or microservices. +- A change is **ABI compatible** if dynamically linked libraries compiled with + _A_ and/or _B_ interoperate correctly. Both directions of compatibility must + be considered for ABI: a library compiled with _A_ may produce values that are + then passed to code expecting _B_, and vice versa. + - This requires that the enum versions are backward and forward ABI compatible + with each other. + - This requires all code be compiled with the same version of Rust or to use a + stable `repr` and calling convention where ABI compatibility is expected. +- [Control Flow Integrity](#control-flow-integrity) (CFI) introduces further + constraints when considering ABI compatibility. + +[RFC 1105]: https://rust-lang.github.io/rfcs/1105-api-evolution.html +[Cargo SemVer Reference]: https://doc.rust-lang.org/cargo/reference/semver.html + +These changes are **API and ABI compatible**: + +- Replace a `repr(transparent)` newtype `struct` wrapping a non-`pub` `Int` with + a `repr(Int)` open `enum` of the same name and defining the same variant + names. + - This breaks the defining crate's usage of `.0`. + - Associated constants may represent multiple variants with the same + discriminant. + - For `repr(C)`, the `Int` must be ABI compatible with the target's chosen + integer type for a C `enum` with an equivalent definition. This is usually + `core::ffi::c_int`. 
+ - This defines a `repr(Int)` `enum` as having the same ABI as `Int`. See + [Control Flow Integrity](#control-flow-integrity) for ABI caveats. +- Given an `enum` in _A_ with an unnamed variant claiming discriminant _D_, add + a named variant in _B_ claiming discriminant _D_. + - This replaces the unnamed variant, although the unnamed variant declaration + may remain unchanged if _D_ is contained in its discriminant range. + - Removing the last unnamed variant may warn for `unreachable_patterns` in + downstream crates, as a wildcard branch is no longer required. This can be + avoided by adding `#[non_exhaustive]` to the enum when removing the last + unnamed variant. -It is an API fully-compatible and ABI backwards-compatible change to: +These changes are **API compatible** and produce a _B_ that is +**ABI backward compatible** with _A_: -- Replace `#[non_exhaustive]` on an enum with an unnamed variant. +- Replace `#[non_exhaustive]` with an unnamed variant on an `enum`. - This may require changes to the defining crate to add wildcard branches. -- Add another claimed discriminant, if an unnamed variant already exists on the - enum. - -It is an API and ABI backwards-compatible change to: - -- Add an unnamed variant to an enum without `#[non_exhaustive]` or another - unnamed variant. The same caveat regarding unused wildcard branches applies. + - _B_ may produce values that are invalid if passed to code compiled with _A_. +- Given an `enum` in _A_ that has an invalid discriminant _D_ and is either + `#[non_exhaustive]` or contains unnamed variants, add an unnamed variant in _B_ + claiming discriminant _D_. + - _B_ may produce values that are invalid if passed to code compiled with _A_. 
+ +These changes are **ABI compatible** but break API compatibility, and are +particularly sensitive to [CFI](#control-flow-integrity): + +- Replace a `repr(transparent)` newtype `struct` wrapping a `pub` `Int` with a + `repr(Int)` open `enum` of the same name and defining the same variant names. + - This breaks downstream source code using `.0` to access the discriminant. + - This breaks downstream source code using the tuple constructor to build a + value with a given discriminant. +- Replace a `repr(Int)` open `enum` with a `repr(transparent)` newtype `struct` + wrapping a non-`pub` `Int`. + - This breaks downstream source code that writes `use Enum::Variant` because + associated constants cannot be imported. + - If the `struct` field in B is instead `pub`, it is a possibly-breaking API + change due to breaking source code that defines a `fn` with the same name as + the enum. + +This change produces a _B_ that is **ABI backward compatible** with _A_ +but breaks API compatibility: + +- Given an `enum` in _A_ that contains no unnamed variants and isn't + `#[non_exhaustive]`, add an unnamed variant. + - This breaks exhaustive `match` downstream and in the defining crate when + _B_ is substituted. + +##### Control Flow Integrity + +Control Flow Integrity describes a set of checks inserted into a compiled +program to make it harder to exploit bugs. One such check validates indirect +jumps, such as function pointer invocations, by requiring the argument types +passed by the caller to match the parameter types expected by the callee. CFI +treats a mismatch as erroneous and aborts the program. + +The [`cfi_encoding`] attribute overrides the symbol that distinguishes types for +CFI. Depending on [how the C enums were compiled][libc-5066] and how CFI is +configured, it may be necessary to set an explicit `cfi_encoding` to avoid +causing CFI errors, like when replacing a `repr(transparent)` `struct` with an +`enum`. 
+ +`repr(Int)` `enum`s are defined as ABI compatible with `Int` and `repr(C)` +`enum`s as ABI compatible with the target's chosen integer type for the enum. +The presence of an unnamed variant in an `enum` does not affect its CFI +encoding. This RFC does not otherwise define +[how `repr(Int)` enums should interact with CFI][ucg-489]. + +[ucg-489]: https://github.com/rust-lang/unsafe-code-guidelines/issues/489 +[`cfi_encoding`]: https://doc.rust-lang.org/nightly/unstable-book/language-features/cfi-encoding.html +[libc-5066]: https://github.com/rust-lang/libc/issues/5066 #### Applicable lints @@ -1635,8 +1721,8 @@ Pros: Cons: - It expands the scope of `non_exhaustive`: the wildcard branch required by - unnamed variants apply to the defining crate as well as foreign crates. This - could make it harder to explain to newer users. + unnamed variants applies to the defining crate as well as downstream crates. + This could make it harder to explain to newer users. - The variant name being an underscore _already_ implies that a wildcard branch is needed. - It always requires two lines to achieve ABI non-exhaustiveness. @@ -1726,7 +1812,7 @@ a suboptimal experience: avoids creating Rust enums from C/C++ enums because of this. It provides an option for newtype enums directly. - ICU4X uses newtype enums for [certain properties][icu4x-props] which must be - forwards compatible with future versions of the enum. + forward compatible with future versions of the enum. - OpenTitan's [`with_unknown!`] macro also uses this pattern to create "C-like enums". - `winapi-rs` defines an [`ENUM`][winapi-enum] macro which generates plain @@ -1776,7 +1862,8 @@ described in the Alternatives section above. ## Unresolved questions -None. +Is the Control Flow Integrity encoding of types the only blocker for `repr(Int)` +`enum` to be ABI compatible with `Int`?
## Future possibilities From 876011d14c697ba32cda289ba3df7e6a88c4d34a Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 22 Apr 2026 07:37:46 -0700 Subject: [PATCH 23/42] Add a syntax "sugar" section The "sugar" is in quotes due to this being able to desugar to enums with an impossibly large number of variants. --- text/3894-unnamed-variants.md | 36 +++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index fef365f38dd..cb919b33f54 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -423,6 +423,42 @@ By contrast, an unnamed variant affects API _and_ ABI semver compatibility: For enums that have relevant discriminant values, an unnamed variant may be the better choice. This is often the case for enums declaring an explicit `repr`. +### Syntax "sugar" + +An unnamed variant declaration can be thought of as optimized syntax sugar for +declaring a variant with an unwritable name for each unused discriminant in the +declared range. + +```rust +// This: +#[repr(u32)] +enum Fruit { + Apple, + Orange, + Banana = 4, + _ = 2..=5, +} + +// Is like syntax sugar for: +#[repr(u32)] +enum Fruit { + Apple = 0, + Orange = 1, + Banana = 4, + #[doc(hidden)] _Unnamed2 = 2, + #[doc(hidden)] _Unnamed3 = 3, + // No `_Unnamed4` because that's claimed by `Banana`. + #[doc(hidden)] _Unnamed5 = 5, +} +``` + +However, not even the defining crate can write `Fruit::_Unnamed2`, unlike a +[private enum variant][private-variants-rfc]. There's also no limit to the +number of unnamed variants an enum can allocate, so the entire range of `u32` +can be declared as valid discriminants with `_ = ..`. 
+ +[private-variants-rfc]: https://github.com/rust-lang/rfcs/issues/3506 + ## Reference-level explanation ### Unnamed variants From 60c14b79a70cb8bba5972d8c9b661de61a86b13a Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 22 Apr 2026 07:40:54 -0700 Subject: [PATCH 24/42] Re-introduce `..` at the end as an alternative --- text/3894-unnamed-variants.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index cb919b33f54..05c01a175d7 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1557,6 +1557,28 @@ declaration. Unnamed variants use the same syntax to assign discriminants, except they do not have to have a name and thus can be assigned to discontiguous ranges. +### `..` at the end + +```rust +#[repr(u8)] +enum IpProto { + Tcp = 6, + Udp = 17, + + // "the rest of the variants exist" + .. +} +``` + +- This is less flexible than `_ = ..`, is awkward to restrict to smaller or + discontiguous ranges, and introduces a larger syntax change. +- This resembles the [rest pattern] more than the [full range expression] that + discriminants are assigned to and the [wildcard pattern] that it requires. + +[full range expression]: https://doc.rust-lang.org/reference/expressions/range-expr.html#grammar-RangeFullExpr +[rest pattern]: https://doc.rust-lang.org/reference/patterns.html?#rest-patterns +[wildcard pattern]: https://doc.rust-lang.org/reference/patterns.html?#wildcard-pattern + ### Discriminant ranges for named variants instead of unnamed variants What if instead this were valid? 
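The desugaring from the "Syntax sugar" patch can be approximated on stable Rust today with the newtype-integer pattern. The following is a minimal sketch under stated assumptions, not part of the RFC. The `Fruit` struct, `from_u32`, `is_named`, and `raw` are hypothetical names chosen for illustration. It mirrors the example where `Apple = 0`, `Orange = 1`, and `Banana = 4` are named and `_ = 2..=5` claims the remaining discriminants.

```rust
// Hypothetical stand-in for the RFC's `Fruit` example. It uses the
// newtype-integer pattern because `_ = 2..=5` is not valid Rust today.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct Fruit(u32);

#[allow(non_upper_case_globals)]
impl Fruit {
    pub const Apple: Fruit = Fruit(0);
    pub const Orange: Fruit = Fruit(1);
    pub const Banana: Fruit = Fruit(4);

    /// Checked constructor (illustrative, not an RFC API): accepts exactly
    /// the discriminants claimed by the named constants or by `2..=5`.
    pub fn from_u32(raw: u32) -> Option<Fruit> {
        match raw {
            // Discriminants claimed by named variants.
            0 | 1 | 4 => Some(Fruit(raw)),
            // Discriminants claimed only by the unnamed range
            // (4 was already matched by the arm above).
            2..=5 => Some(Fruit(raw)),
            // Everything else stays invalid, as in a closed enum.
            _ => None,
        }
    }

    /// True when a named constant claims this value.
    pub fn is_named(self) -> bool {
        matches!(self.0, 0 | 1 | 4)
    }

    /// The underlying integer, standing in for an `as u32` cast.
    pub fn raw(self) -> u32 {
        self.0
    }
}
```

Under these assumptions, `Fruit::from_u32(3)` yields a valid but unnamed value, while `Fruit::from_u32(6)` is rejected because no declaration claims discriminant `6`.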
From c13a0b11fb9c730cf6945ec37e6d05fac08a189a Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 22 Apr 2026 07:42:58 -0700 Subject: [PATCH 25/42] Clarify some alternatives, move tuple discriminant access there --- text/3894-unnamed-variants.md | 429 +++++++++++++++++----------------- 1 file changed, 213 insertions(+), 216 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 05c01a175d7..b045736fb25 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1595,11 +1595,15 @@ enum IpProto { This is not mutually exclusive with unnamed variants, but this RFC chooses to leave claimed ranges of discriminants as anonymous to keep the feature simple. It can be left as [future work](#discriminant-ranges-for-named-variants) for the -language. Some of the issues are: - -- Adding an `Icmp = 1` variant affects `matches!(1 as IpProto, IpProto::Other)`: - it is an API-breaking change. Unnamed variants are more useful for enum - evolution - a key design goal. +language. Some of the concerns are: + +- It is a possibly-breaking change to add `Icmp = 1`. It affects the result of + `matches!(1 as IpProto, IpProto::Other)`; changing the semantics of a pattern + match in downstream code is a major (API-breaking) change for all current Rust + features. However, upstream code or the [Cargo SemVer Reference] _could_ warn + downstream users that it's not appropriate to depend upon the valid + discriminants for `Other` remaining the same across a minor version bump + because it is a `..` variant. - It is ambiguous what value should be chosen when `IpProto::Other` is used in an expression. Some reasonable ways to avoid that are: - Define an arbitrary rule to choose a discriminant for an `IpProto::Other` @@ -1624,101 +1628,202 @@ language. Some of the issues are: `matches!(o, IpProto::Other) && o != IpProto::Other`. 
If `derive(PartialEq)` treats all `IpProto::Other` as equal, then it may drastically reduce the performance of the `derive` without an obvious opt-in by the author. - - If named variants' ranges can overlap other named variants as shown above, - then the performance of `matches!(o, IpProto::Other)` degrades as further - variants are added and the set of discriminants representing `Other` becomes - more sparse. No other pattern has this characteristic, where the performance - of matching a pattern is affected by unmentioned properties of the matched - type. - -Most of the utility provided by named variant discriminant ranges can be -provided by replacing it with unnamed variants and using a macro to determine -whether an enum's discriminant is assigned to a named variant. This makes it -clear that declaring a new named variant with an unnamed variant's discriminant -will affect the method's return value. For example: +- If named variants' ranges can overlap other named variants as shown above, + then the performance of `matches!(o, IpProto::Other)` degrades as further + variants are added and the set of discriminants representing `Other` becomes + more sparse. No other pattern has this characteristic where the performance + of matching a pattern is affected by unmentioned properties of the matched + type. This is not great for a systems language. -```rust -#[repr(transparent, u32)] -#[derive(IsNamedVariant)] -enum IpProto { - Tcp = 6, - Udp = 17, - _ = .., -} +The ability to distinguish a known and unknown discriminant granted by this +feature can be substituted with unnamed variants and a +[derive macro](#isnamedvariant-derive). -/// Equivalent to fallibly building `IpProto::Other` from `x`. 
-fn build_unknown_proto(x: u32) -> Option<u32> { - (!(x as IpProto).is_named_variant()).then_some(x as u32) -} - -assert!(!(3u32 as IpProto).is_named_variant()); -assert!(build_unknown_proto(3).is_some()); -assert!((6u32 as IpProto).is_named_variant()); -assert!(build_unknown_proto(6).is_none()); -``` - -### `..` at the end - -```rust -#[repr(u8)] -enum IpProto { - Tcp = 6, - Udp = 17, - - // "the rest of the variants exist" - .. -} -``` - -- This is less flexible than `_ = ..`, and is awkward to restrict to smaller or - discontiguous ranges. -- This resembles the [rest pattern] more than the [full range expression] that - discriminants are assigned to and the [wildcard pattern] that it requires. - -[full range expression]: https://doc.rust-lang.org/reference/expressions/range-expr.html#grammar-RangeFullExpr -[rest pattern]: https://doc.rust-lang.org/reference/patterns.html?#rest-patterns -[wildcard pattern]: https://doc.rust-lang.org/reference/patterns.html?#wildcard-pattern - -### An "other" variant carries unknown discriminants like a tuple variant +### A "wildcard" tuple variant with an unknown discriminant field An alternative way to specify a field-less open enum could be to write this: ```rust -#[repr(u32)] +#[repr(u32, bikeshed_niche_optimize)] enum IpProto { Tcp = 6, Udp = 17, - // bikeshed syntax - Other(0..6 | 7..17 | 18..=u32::MAX), + // A private enum variant with a pattern type. + priv Other(u32 is 0..6 | 7..17 | 18..=u32::MAX), } ``` +Because the only valid representations of the field in `Other` are invalid +representations for the other variants, this could be optimized to be the size +of a `u32` and thus an `unsafe` transmute of `0` to `IpProto` results in +`IpProto::Other(0)`. Then, if the variant is declared +[private][private-variants-rfc], it is a minor change to add a new variant to +`IpProto`.
+ +This would mean that the `Other` variant is a named way to refer to unlisted -values and works in pattern matching naturally, while being a zero-cost +values and works in pattern matching directly, all while being a zero-cost representation: ```rust +// Changes behavior when the pattern type in `Other` changes. if let IpProto::Other(x) = proto { // `proto` was *not* `Tcp` or `Udp`; its integer value is in `x`. } +assert_eq!(mem::size_of::<IpProto>(), mem::size_of::<u32>()); +assert!(matches!( + unsafe { mem::transmute::<u32, IpProto>(0) }, + // Breaks if we add a variant with discriminant `0`. + IpProto::Other(0) +)); +assert!(matches!( + unsafe { mem::transmute::<u32, IpProto>(6) }, + IpProto::Tcp +)); ``` -However, this has some problems. For one, it's peculiar for a tuple variant -syntax to not carry a payload, but a discriminant. It is also possible to -build the variant with a discriminant value, which means that it would need -to be constrained by a [pattern type][pattern types] - one that may end up -far more complicated if it overlaps with named variants. It is also an API -breaking change to move the discriminant `2` to a new named variant, since -it breaks anyone passing `2` into an `IpProto::Other` expression. +Some concerns: + +- `repr(u32)` currently _disables_ niche optimization of the enum based on its + variants' fields, so, for consistency, a new `repr` that enables niche + optimization in a stabilized manner should be defined before this is exposed + to users. +- This requires [pattern types] to function, which is a larger and more complex + feature to implement and for Rust users to learn than unnamed variants. +- Without [private variants][private-variants-rfc], it is a possibly-breaking + change to replace a discriminant from the `Other` variant with a new named + variant. This breaks code that tries to construct an `Other` with that + discriminant and affects the type signature of the field.
It can also affect + the semantic behavior of a pattern match, which is considered a major change + for all current Rust features. However, upstream code _could_ warn downstream + users that it's not appropriate to depend upon the pattern type within `Other` + remaining the same across minor version bumps, with documentation or a new + attribute. +- As a result, this feature would have a long critical path to stabilization. +- The wildcard variant optimization shown here has no clear way to extend to + `enum`s with fields in the future. +- It requires extra effort to set up correctly, since the pattern type cannot + overlap the other field-less variants. +- It requires extra effort to ensure this optimization is actually taking + place. Every addition of a variant requires a change to the pattern type of the + "wildcard variant" field. This would require a static assertion and likely + a macro to ensure that all expected variants are covered. +- This has the same issue regarding the complexity of `match` as the type + evolves as the `Other = ..` alternative above. +- This would introduce a new concept to Rust users: that a tuple variant field + can carry a discriminant directly rather than a payload (which can + discriminate). + +Like with `Other = ..`, the utility of determining if the discriminant is known +can be provided [with a macro](#isnamedvariant-derive), and an `as` cast +accesses the discriminant value. + +### Forward compatibility with newtype `struct`s + +As described in [Compatibility](#compatibility), it is a minor change to replace +a `repr(transparent)` newtype `struct` wrapping a non-`pub` `Int` with an open +`enum` using unnamed variants. However, this would require the following +non-trivial changes to `repr(Int/C)` enums: + +- The enum name is a constructor `fn(Repr) -> Enum`: -```rust -if let IpProto::Other(x) = IpProto::Other(6) { - // This branch is not taken, since it's actually an `IpProto::Tcp`!
-} -``` + ```rust + assert_eq!(Color(1), Color::Blue); + assert!( + [0, 3, 2].map(Color), + matches!([Color::Red, _, Color::Green]) + ) + ``` + + - This is valid for any open enum, the same as the `as` cast from integer. + - This mirrors the `derive(Debug)` format, is ergonomic, and is clear at + callsite. Thus it may be worth adding to Rust even if `.0` isn't. + - When should one prefer the constructor over the `as` cast? Always? + +- `.0` provides direct access to the discriminant value of `enum`s with an + explicit representation: + + ```rust + let mut c = Color::Blue; + assert_eq!(c.0, 1); + c.0 += 1; + assert!(matches!(c, Color::Green)); + assert_eq!(c.0, 2); + ``` -A `derive(IsNamedVariant)` macro as shown above could replace this behavior. + - This could be supported for _any_ enum with an explicit `repr(Int/C)` by + having closed enums be `unsafe` to mutate through `.0` - it's an + [unsafe field]. + +There are some clear benefits: + +- It is possible to get a reference directly to the discriminant, which can be + useful when performing lifetime-constrained zero-copy serialization. +- The type of `.0` is exactly the `repr`, and doesn't require the user specify a + type to `as` cast to and possibly truncate. Currently, there's no language + feature in Rust that does this - it requires a macro or codegen to guarantee. + This can cause subtle bugs, especially for `repr(C)`: + + ```rust + #[repr(C)] + enum Oops { + // On any platform where this is more than `c_int::MAX`. 
+ TooBig = 2_147_483_649, + } + assert_eq!(Oops::TooBig as core::ffi::c_int, -2_147_483_647); + ``` + + Instead, `.0` accesses the discriminant without fear of truncation: + + ```rust + assert_eq!(Oops::TooBig.0, 2_147_483_649); + // mismatched types, expected `i32`, got `i64` + // let _: c_int = Oops::TooBig.0; + ``` + +- Some discriminant-manipulating operations are simpler than with `as` casts: + + ```rust + #[repr(u32)] + enum X { + A = 0, + B = 1, + } + let mut x = X::A; + assert_eq!(x.0, 0); + + // SAFETY: 1 is a valid discriminant for `X`. + unsafe { x.0 += 1; } + + assert!(matches!(x, X::B)); + ``` + +- A fielded enum with `#[repr(Int)]` and/or `#[repr(C)]` is guaranteed to have + its discriminant values starting from 0. However, for any given value of that + enum, there's no built-in way to extract what the integer value of the + discriminant is safely. The `unsafe` mechanism is + `(&thenum as *const _ as *const Int).read()`. For open fielded enums, some + direct access to the discriminant would be even more valuable, since the + discriminant could be entirely unknown and the user may want to know its + value. + +[unsafe field]: https://rust-lang.github.io/rust-project-goals/2025h2/unsafe-fields.html + +However, this is a subjectively ugly and undiscoverable syntax to access the +discriminant of an `enum`. One possibility: when introduced, treat these forms +as deprecated and throw a warning to recommend a better syntax than `.0` but +still allow the desired migration to be a minor change across the ecosystem. + +There are also existing proposals to [read][rfc-3607] and [write][rfc-3727] the +discriminant directly. They propose alternative syntax, with +`.enum#discriminant` rather than `.0` and `discriminant_of!`/`set_discriminant` +built-ins respectively. + +[rfc-3607]: https://github.com/rust-lang/rfcs/pull/3607 +[rfc-3727]: https://github.com/rust-lang/rfcs/pull/3727 + +So, in order for that to work, `.0` would be necessary.
However, this is too +confusing of a syntax for an `enum` to access the discriminant. ### Forbid unnamed variants' discriminants from overlapping named ones @@ -1925,6 +2030,38 @@ Is the Control Flow Integrity encoding of types the only blocker for `repr(Int)` ## Future possibilities +### `IsNamedVariant` derive + +There are certain cases in which it's useful to distinguish between known/named +and unknown/unnamed discriminants. Since unnamed variants cannot do that +syntactically, a `derive` macro can read the definition and generate a +`fn(&self) -> bool` that determines if an enum value represents a named +discriminant. This fits the Rust precedent of requiring an opt-in macro for a +minor change to a type definition to result in a change in semantics. While this +can be provided by third parties, it may be better for the ecosystem to have a +standard library solution. + +```rust +#[repr(transparent, u32)] +#[derive(IsNamedVariant)] +enum IpProto { + Tcp = 6, + Udp = 17, + _ = .., +} + +// Equivalent to fallibly building `IpProto::Other` from `x` in the +// "wildcard variant" alternatives above. +fn build_unknown_proto(x: u32) -> Option<u32> { + (!(x as IpProto).is_named_variant()).then_some(x as u32) +} + +assert!(!(3u32 as IpProto).is_named_variant()); +assert!(build_unknown_proto(3).is_some()); +assert!((6u32 as IpProto).is_named_variant()); +assert!(build_unknown_proto(6).is_none()); +``` + ### Discriminant ranges for named variants A future extension could allow for named variants to specify ranges as @@ -2015,146 +2152,6 @@ pub enum Shape { } ``` -### Tuple-like syntax for `repr` enums - -A very useful thing this RFC enables is that replacing this: - -```rust -// The "newtype integer enum" pattern.
-#[derive(PartialEq, Eq)] -pub struct Color(u32); -impl Color { - const Red: Color = Color(0); - const Blue: Color = Color(1); - const Green: Color = Color(2); -} -``` - -with this: - -```rust -#[derive(PartialEq, Eq)] -#[repr(u32)] -pub enum Color { - Red, - Blue, - Green, - _ = .., -} -``` - -is a non-breaking change for client crates. - -However, if the library initially exposed the discriminant field as `pub`, as -`bindgen`, `icu4x`, and `open-enum` do, then the migration to an open `enum` -requires that `Color(discriminant)` and `color.0` also function as originally. - -These each have their own independent utility: - -#### Tuple constructor - -The enum name is a constructor `fn(Repr) -> Enum`: - -```rust -assert_eq!(Color(1), Color::Blue); -assert!( - [0, 3, 2].map(Color), - matches!([Color::Red, _, Color::Green]) -) -``` - -- This is valid for any open enum, the same as the `as` cast from integer. -- This mirrors the `derive(Debug)` format, is ergonomic, and is clear at - callsite. Thus it may be worth adding to Rust even if `.0` isn't. -- When should one prefer the constructor over the `as` cast? Always? - -#### Discriminant field access - -`.0` provides direct access to the discriminant value of `enum`s with an -explicit representation: - -```rust -let mut c = Color::Blue; -assert_eq!(c.0, 1); -c.0 += 1; -assert!(matches!(c, Color::Green)); -assert_eq!(c.0, 2); -``` - -This is subjectively ugly and undiscoverable syntax to access the discriminant -of an `enum`. One possibility: when introduced, treat as deprecated and throw a -warning to recommend a better syntax than `.0` but still allow the desired -non-breaking migration. - -As with unnamed variants, the `enum` must not be `repr(Rust)` in order to -guarantee that an integer is used as the discriminant. 
- -There are a few distinct advantages compared to `as` casting: - -- It is possible to get a reference directly to the discriminant, which can be - useful when performing lifetime-constrained zero-copy serialization. -- The type of `.0` is exactly the `repr`, and doesn't require the user specify a - type to `as` cast to and possibly truncate. Currently, there's no language - feature in Rust that does this - it requires a macro or codegen to guarantee. - This can cause subtle bugs, especially for `repr(C)`: - - ```rust - #[repr(C)] - enum Oops { - // On any platform where this is more than `c_int::MAX`. - TooBig = 2_147_483_649, - } - assert_eq!(Oops::TooBig as core::ffi::c_int, -2_147_483_647); - ``` - - Instead, `.0` accesses the discriminant without fear of truncation: - - ```rust - assert_eq!(Oops::TooBig.0, 2_147_483_649); - // mismatched types, expected `i32`, got `i64` - // let _: c_int = X::V.0; - ``` - -This could be supported for _any_ enum with an explicit `repr(Int)` by having -closed enums be `unsafe` to mutate through `.0` - it's an [unsafe field]. - -```rust -#[repr(u32)] -enum X { - A = 0, - B = 1, -} -let mut x = X::A; -assert_eq!(x.0, 0); - -// SAFETY: 1 is a valid discriminant for `X`. -unsafe { x.0 += 1; } - -assert!(matches!(x, X::B)); -``` - -[unsafe field]: https://rust-lang.github.io/rust-project-goals/2025h2/unsafe-fields.html - -There are also existing proposals to [read][rfc-3607] and [write][rfc-3727] this -discriminant directly. They propose alternative syntax, with -`.enum#discriminant` rather than `.0` and `discriminant_of!`/`set_discriminant` -built-ins respectively. - -[rfc-3607]: https://github.com/rust-lang/rfcs/pull/3607 -[rfc-3727]: https://github.com/rust-lang/rfcs/pull/3727 - -### Extracting the integer value of the discriminant for fielded enums - -A fielded enum with `#[repr(Int)]` and/or `#[repr(C)]` is guaranteed to have its -discriminant values starting from 0. 
However, for any given value of that enum, -there's no built-in way to extract what the integer value of the discriminant is -safely. The unsafe mechanism is `(&thenum as *const _ as *const Int).read()`. -For open fielded enums, this would be even more valuable, since the discriminant -could be entirely unknown and the programmer may want to know its value. - -Perhaps this uses the same `.0` syntax as above, or an extension to -`mem::Discriminant`? - ### `match` on ranges of enums ```rust From b192036688c5b56272cdab5299cf775cdcac2792 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 22 Apr 2026 07:43:47 -0700 Subject: [PATCH 26/42] Add `repr(transparent, Int)` alternative for ABI compat --- text/3894-unnamed-variants.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index b045736fb25..d094b8bf2c1 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1934,6 +1934,29 @@ equivalent to `#[non_exhaustive]`. However, this is confusing for a syntax that describes ranges of variants: what does the range `_ = ..` actually cover? Is there still ABI compatibility? +### Require `repr(transparent, Int)` on `enum` for ABI compatibility with `Int` + +This RFC defines ABI compatibility between `repr(Int/C)` enums and their +representing integers, which matches the C standard (C23 §6.7.3.3). However, +[CFI](#control-flow-integrity) may treat these as incompatible types and abort. + +`repr(Int)` on an `enum` specifies an explicit discriminant, but does not have +to imply that it is ABI compatible with `Int`. What if `repr(transparent, Int)` +could be specified to make it explicit that ABI compatibility with `Int` is +required, including by CFI? 
+ +The reason the RFC does not choose this is because `#[cfi_encoding]` and +compiler flags can predictably override CFI behavior for cases where the +distinction between `repr(transparent)` `struct` and open `enum` may matter. + +### Don't introduce a new `as` cast + +This RFC introduces a new `as` cast from integer to `enum` that _cannot_ cause +data loss. While it would be excellent for Rust to provide a non-`as` mechanism +to convert from integer to `enum` such as a `TryFrom` `derive`, such a mechanism +should be provided for all `enum`s, not just those with unnamed variants as +defined by this RFC. + ## Prior art _Open_ and _closed_ enums are [pre-existing industry terms][acord-xml]. From feeaf50f6d37892f11f0bab516b0957b000d86a8 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 22 Apr 2026 07:44:45 -0700 Subject: [PATCH 27/42] Various nits --- text/3894-unnamed-variants.md | 37 ++++++++++------------------------- 1 file changed, 10 insertions(+), 27 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index d094b8bf2c1..6e3b9047991 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -117,7 +117,7 @@ in C++), a newtype integer isn't as ergonomic to use as an `enum`: - It is arduous to read the generated definition - the variants are inside of an `impl` instead of next to the name. It hides the type's nature as an enum. - It's invalid to `use` the pseudo-variants like with `use EnumName::*`. -- The third-party macro ecosystem built around enums can't be used. +- The third-party macro ecosystem built around enums can't be simply used. - Rust is a systems language that can move data around efficiently, and so first-class support for named integers is valuable for embedded programmers. - Code analysis and lints specific to enums are unavailable. @@ -153,9 +153,8 @@ ensure that unknown values cannot be exposed to Rust.
[repr-c-field-less]: https://doc.rust-lang.org/reference/type-layout.html#reprc-field-less-enums -With unnamed variants, the current guidance surrounding sharing enums with C can -thus be simplified greatly: add a `_ = ..` variant and UB from invalid values -aren't a concern. +With unnamed variants, interoperating with a C `enum` is very simple: add +`#[repr(C)]` and `_ = ..` to a Rust enum and it's compatible with C. `bindgen` has [multiple ways][bindgen-enum-variation] to generate Rust that correspond to a C enum, the default being to define a series of `const` items. @@ -164,6 +163,10 @@ not always match that of `repr(C)` on a Rust `enum`. A future version of `bindgen` could use this feature to add a `_ = ..` variant to a Rust `enum` by default, instead of a exposing a less-effective `non_exhaustive` attribute. +Today, Rust for Linux configures `bindgen` to generate newtype integers. It +would switch to using a first-class `enum` type if they were sound to use with +an evolving C `enum`. + [bindgen-enum-variation]: https://docs.rs/bindgen/0.72.1/bindgen/enum.EnumVariation.html ### Dynamic Linking @@ -261,28 +264,8 @@ assert_eq!(FuelLevel::try_from(10).unwrap() as u32, 10); assert!(FuelLevel::try_from(21).is_err()); ``` -With other extensions, this could even be generic: - -```rust -trait EnumDiscriminant { - type Ranged>; -} - -impl EnumDiscriminant for u32 { - type Ranged> = RangedU32; -} - -#[repr(u32)] -enum RangedU32> { - _ = RANGE, -} - -type Ranged> = ::Ranged; - -type FuelLevel = Ranged; -``` - -[Pattern types][pattern types] are a more direct way to express this. +[Pattern types][pattern types] are a more direct and flexible way to express +this. [pattern types]: https://github.com/rust-lang/rust/pull/107606 @@ -2108,7 +2091,7 @@ assert_eq!(Color::Red as u8, 0); // help: specify a discriminant with `2 as Color::Unknown` // let c = Color::Unknown; -// Use an `as` cast to construct `Color::Unknown` safely. 
+// Use an `as` cast to construct `Color::Unknown` without data loss. let c = 3 as Color::Unknown; assert_eq!(c as u8, 3); From 64db5166d304034c8cd9d260e53c9357e2531f0f Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 22 Apr 2026 08:13:25 -0700 Subject: [PATCH 28/42] Correct current usage of enums in Rust for Linux --- text/3894-unnamed-variants.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 6e3b9047991..037896fb3d3 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -163,9 +163,10 @@ not always match that of `repr(C)` on a Rust `enum`. A future version of `bindgen` could use this feature to add a `_ = ..` variant to a Rust `enum` by default, instead of a exposing a less-effective `non_exhaustive` attribute. -Today, Rust for Linux configures `bindgen` to generate newtype integers. It -would switch to using a first-class `enum` type if they were sound to use with -an evolving C `enum`. +Today, Rust for Linux configures `bindgen` to generate newtype integers and raw +integers, based on if the enums are mapped to a `typedef` of the underlying +integer type. It would switch to using a first-class `enum` type where +possible if they were sound to use with an evolving C `enum`. 
[bindgen-enum-variation]: https://docs.rs/bindgen/0.72.1/bindgen/enum.EnumVariation.html From e1f2b9158b29b2868c65eb18c8356c38b33eb8ae Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 22 Apr 2026 09:00:35 -0700 Subject: [PATCH 29/42] Improve CFI section --- text/3894-unnamed-variants.md | 44 ++++++++++++++++++++++++----------- 1 file changed, 30 insertions(+), 14 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 037896fb3d3..dca97139ee5 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -828,23 +828,35 @@ but breaks API compatibility: ##### Control Flow Integrity -Control Flow Integrity describes a set of checks inserted into a compiled +Control Flow Integrity (CFI) describes a set of checks inserted into a compiled program to make it harder to exploit bugs. One such check validates indirect -jumps, such as function pointer invocations, by requiring the argument types -passed by the caller to match the parameter types expected by the callee. CFI -treats a mismatch as erroneous and aborts the program. +jumps, such as function pointer invocations, by aborting if the caller and +callee disagree on the type signature of the function. + +CFI treats C enums as a different type from their backing integer type, so +transmuting `fn(MyEnum)` to `fn(c_int)` and calling the function will lead to an +abort even if `MyEnum` is backed by `c_int`. Similarly, transmuting +`fn(MyEnum1)` to `fn(MyEnum2)` and calling the function also leads to an abort. + +When using CFI, `repr(Int/C)` `enum`s are ABI compatible with C `enum` of the +same name and backing integer type, while [_not_ being compatible][ucg-489] with +the backing integer type directly. If the C source uses a [`typedef`][libc-5066] +instead of `enum`, then it already uses the same CFI encoding as the relevant +integer. 
+ +CFI compares the _name_ of the `enum` when validating a function call signature, +so for compatibility between Rust and C over FFI, Rust must declare the open +enum with exactly the same CFI name as the C enum for them to be ABI compatible. +The presence of an unnamed variant in an `enum` does not affect its CFI +encoding. -The [`cfi_encoding`] attribute overrides the symbol that distinguishes types for -CFI. Depending on [how the C enums were compiled][libc-5066] and how CFI is -configured, it may be necessary to set an explicit `cfi_encoding` to avoid -causing CFI errors, like when replacing a `repr(transparent)` `struct` with an -`enum`. +When using CFI, a `#[repr(transparent)]` newtype `struct` is ABI compatible with +the underlying integer type, and not with any `enum` types. -`repr(Int)` `enum`s are defined as ABI compatible with `Int` and `repr(C)` -`enum`s as ABI compatible with the target's chosen integer type for the enum. -The presence of an unnamed variant in an `enum` does not affect its CFI -encoding. This RFC does not otherwise define -[how `repr(Int)` enums should interact with CFI][ucg-489]. +The [`cfi_encoding`] attribute overrides the symbol that distinguishes types for +CFI and can indicate whether the enum is meant to interoperate with a C enum of +the same name or the backing integer. This allows for an ABI compatible switch +from newtype `struct` to open `enum`. [ucg-489]: https://github.com/rust-lang/unsafe-code-guidelines/issues/489 [`cfi_encoding`]: https://doc.rust-lang.org/nightly/unstable-book/language-features/cfi-encoding.html @@ -1929,6 +1941,10 @@ to imply that it is ABI compatible with `Int`. What if `repr(transparent, Int)` could be specified to make it explicit that ABI compatibility with `Int` is required, including by CFI? +This could also be inverted: `repr(Int)` is treated as CFI compatible with `Int` +but `repr(C, Int)` is CFI compatible with a C `enum`. 
This would be confusing +for `enum`s with fields, where the `C` also changes the layout of the type. + The reason the RFC does not choose this is because `#[cfi_encoding]` and compiler flags can predictably override CFI behavior for cases where the distinction between `repr(transparent)` `struct` and open `enum` may matter. From 7eb6b99507c519b51095c88166a0b2542957584f Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Sat, 2 May 2026 12:35:40 -0700 Subject: [PATCH 30/42] Clean up repr(C) behavior language --- text/3894-unnamed-variants.md | 45 ++++++++++++++++++++++------------- 1 file changed, 29 insertions(+), 16 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index dca97139ee5..d9ccf64de31 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -461,10 +461,13 @@ An unnamed variant declaration may be specified more than once on the same enum. It is valid to claim multiple ranges of discriminants. Those ranges may be discontiguous. -To declare an unnamed variant, the `enum` must have an explicit `repr(Int)`. -`Int` is one of the primitive integers or `C`. If it is `C`, then `Int` below is -`isize`. An unnamed variant declaration must specify a discriminant expression -with one of these types: +To declare an unnamed variant, the `enum` must have an explicit `repr(Int)` to +indicate a **backing integer** for the `enum`. `Int` is one of the primitive +integers or `C`. If it is `C`, then the `Int` for the discriminant expression +below is `isize` and the declaration has further [nuances](#reprc-behavior). + +An unnamed variant declaration must specify a discriminant expression with one +of these types: - `Int` - Claims a particular discriminant value. @@ -592,15 +595,16 @@ enums are ordinarily backed by a `ffi::c_int`, but if any of the assigned discriminants cannot fit, a larger backing integer is chosen that can represent all of them. 
-> Since this behavior is fraught with ABI mismatches, this is going to change -> to [forbid enums larger than `c_int` or `c_uint`][enum-size-constrain]. +> Since this behavior is fraught with mismatches on different compiler +> platforms, allowing enums larger than `c_int` or `c_uint` is currently being +> [phased out][enum-size-constrain] via a Future Compatibility Warning. + +[enum-size-constrain]: https://github.com/rust-lang/rust/pull/147017 Sometimes this is overridden by the system's ABI. On some rarer platforms, `repr(C)` enums start as small as 1 byte, smaller than the C `int`. The behavior is otherwise the same. -[enum-size-constrain]: https://github.com/rust-lang/rust/pull/147017 - The same rules apply for discriminants assigned to unnamed variants: ```rust @@ -611,18 +615,21 @@ enum Small { } // Named and unnamed variants can both grow a `repr(C)` enum. +// Emits FCW `repr_c_enums_larger_than_int` for `Big1` and `Big2`. +#[repr(C)] enum Big1 { X = 1, _ = isize::MAX, } +#[repr(C)] enum Big2 { X = 1, _ = 2, Y = isize::MAX, } -// On x86_64-unknown-linux-gnu. +// On x86_64-unknown-linux-gnu: const _: () = assert!( size_of::() == 4 && size_of::() == 8 && @@ -637,6 +644,8 @@ discriminant expression, the effective bound of the claimed range is dependent on what the backing integer would be if no unnamed variants were declared. ```rust +// On x86_64-unknown-linux-gnu: + #[repr(C)] enum SmallNonnegative { X = 0, @@ -658,7 +667,6 @@ enum BigOpen2 { _ = 0..=isize::MAX, } -// On x86_64-unknown-linux-gnu. const _: () = assert!( size_of::() == 4 && size_of::() == 8 && @@ -679,14 +687,16 @@ enum Foo { as this Rust open enum, regardless of the discriminant values assigned: ```rust -// `allow` effective when there are 256 variants within the `u8`/`i8` range on -// a short-enum platform. Only macros/codegen like bindgen bother with this. -#[allow(taken_discriminant_ranges)] #[repr(C)] enum Foo { Name1 = Value1, Name2 = Value2, // ... 
+
+    // This `allow` is effective when there are 256 variants for `u8`/`i8`
+    // or 65536 variants for `u16`/`i16` on a short-enum platform.
+    // Only macros/codegen like bindgen bother with this.
+    #[allow(taken_discriminant_ranges)]
     _ = ..,
 }
 ```
@@ -1211,10 +1221,12 @@ A `repr(C)` unit-only open enum may be `as` cast from:
   can cast from `c_int` and `c_uint` to most `repr(C)` open enums, while
   preventing unexpected truncations when necessary.
 
+Examples:
+
 ```rust
 const TEN: isize = 10;
 
-// Must be able to represent `u8::MAX`: `u8` or `c_int` or `c_uint`.
+// Must be able to represent `u8::MAX`: backed by `u8` or `c_int` or `c_uint`.
 #[repr(C)]
 enum SmallUnsigned {
     X = 0,
@@ -1231,7 +1243,7 @@ enum Small {
     _ = ..,
 }
 
-// Must be able to represent negative numbers: `i8` or `c_int`.
+// Must be able to represent negative numbers: backed by `i8` or `c_int`.
 #[repr(C)]
 enum SmallSigned {
     X = 0,
@@ -1240,7 +1252,8 @@ enum SmallSigned {
     _ = ..,
 }
 
-// Must be able to hold `isize::MIN..=isize::MAX` which may exceed `c_int`.
+// Must be able to hold `isize::MIN..=isize::MAX` which may exceed `c_int`:
+// may be backed by `isize`, but could be `c_int` if `c_int` is larger.
 #[repr(C)]
 enum Big {
     X = 0,

From 3b99dbbbf48f9e418e0084a2b56ea81f680f8997 Mon Sep 17 00:00:00 2001
From: Alyssa Haroldsen
Date: Sat, 2 May 2026 12:39:17 -0700
Subject: [PATCH 31/42] Improve CFI section further

---
 text/3894-unnamed-variants.md | 106 ++++++++++++++++++++++++----------
 1 file changed, 74 insertions(+), 32 deletions(-)

diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md
index d9ccf64de31..6d4ee7b0539 100644
--- a/text/3894-unnamed-variants.md
+++ b/text/3894-unnamed-variants.md
@@ -838,40 +838,82 @@ but breaks API compatibility:
 
 ##### Control Flow Integrity
 
-Control Flow Integrity (CFI) describes a set of checks inserted into a compiled
-program to make it harder to exploit bugs.
One such check validates indirect -jumps, such as function pointer invocations, by aborting if the caller and -callee disagree on the type signature of the function. - -CFI treats C enums as a different type from their backing integer type, so -transmuting `fn(MyEnum)` to `fn(c_int)` and calling the function will lead to an -abort even if `MyEnum` is backed by `c_int`. Similarly, transmuting -`fn(MyEnum1)` to `fn(MyEnum2)` and calling the function also leads to an abort. - -When using CFI, `repr(Int/C)` `enum`s are ABI compatible with C `enum` of the -same name and backing integer type, while [_not_ being compatible][ucg-489] with -the backing integer type directly. If the C source uses a [`typedef`][libc-5066] -instead of `enum`, then it already uses the same CFI encoding as the relevant -integer. - -CFI compares the _name_ of the `enum` when validating a function call signature, -so for compatibility between Rust and C over FFI, Rust must declare the open -enum with exactly the same CFI name as the C enum for them to be ABI compatible. -The presence of an unnamed variant in an `enum` does not affect its CFI -encoding. - -When using CFI, a `#[repr(transparent)]` newtype `struct` is ABI compatible with -the underlying integer type, and not with any `enum` types. - -The [`cfi_encoding`] attribute overrides the symbol that distinguishes types for -CFI and can indicate whether the enum is meant to interoperate with a C enum of -the same name or the backing integer. This allows for an ABI compatible switch -from newtype `struct` to open `enum`. - -[ucg-489]: https://github.com/rust-lang/unsafe-code-guidelines/issues/489 -[`cfi_encoding`]: https://doc.rust-lang.org/nightly/unstable-book/language-features/cfi-encoding.html +[Control Flow Integrity][llvm-cfi] (CFI) describes a set of checks inserted into +a compiled program to make it harder to exploit bugs. 
One such check validates +function pointer calls by aborting if the function type signatures of the +dynamic caller and static callee are considered incompatible by the check. + +[llvm-cfi]: https://clang.llvm.org/docs/ControlFlowIntegrity.html + +This function signature is encoded into the binary and compared at runtime. To +compose this string, Clang/Rust use the mangled name of an `enum`, referred +to below as a type's _CFI encoding_. If two types share a CFI encoding, +CFI considers them compatible for the purposes of function pointer casts. + +These all share the same encoding and are compatible for CFI signature checking +when used as parameters or return values in a function using the C ABI +(`extern "C" fn`): + +- `enum foo { ... }` in C/C++ global namespace +- `typedef enum { ... } foo` in C/C++ global namespace +- `typedef enum foo { ... } bar` in C/C++ global namespace using `enum foo` or + `bar`. A `typedef` name is only encoded when the type it names is anonymous. +- `enum class foo { ... }` in C++ global namespace +- `#[repr(C)] enum foo` in Rust (ignoring modules) +- `enum foo : uint16_t` in C and `enum class foo : uint16_t` in C++ also share + this encoding, since Clang doesn't encode the backing integer for the `enum`. + It remains ABI incompatible with the above types. + +These share a different CFI encoding: + +- `int` +- [`typedef int foo`][libc-5066] + [libc-5066]: https://github.com/rust-lang/libc/issues/5066 +This RFC proposes that `#[repr(Int)] enum foo` encode the same as +`enum foo : CEquivalentOfInt` / `enum class foo : CppEquivalentOfInt` when used +in an `extern "C" fn` signature. As of writing, compatibility with C/C++ +encoding is only attempted for `repr(C)` `enum` in `extern "C" fn`. 
+ +Because an `enum` and its backing integer don't share the same encoding, this +triggers an abort when using CFI: + +```rust +// On x86_64-unknown-linux-gnu: +#[repr(C)] +#[derive(Debug)] +enum Foo { + X, Y, Z +} +// Also aborts with `extern "C" fn`, `repr(i32)` enum, and reversed conversion. +let f: fn(Foo) = |x: Foo| println!("{x:?}"); +let g: fn(ffi::c_int) = unsafe { mem::transmute(f) }; +// As of writing, Miri identifies f / g as ABI-incompatible and aborts as well. +(g)(2); +``` + +The [`cfi_encoding`] attribute overrides a type's identifier for CFI. It uses +[Itanium C++ ABI mangling][itanium-mangle] to name the type. For example, +`#[repr(C)] enum foo` encodes as `3foo` and `i32` as `u3i32`. To avoid CFI +aborts, this attribute can be used today to: + +- Make a `repr(transparent)` newtype `struct` encode the same as a C/C++ `enum`. +- Make a `repr(C)` Rust `enum` encode the same as a C/C++ `typedef int foo` + as well as a Rust `repr(transparent)` newtype `struct` wrapping `c_int`. +- Make a `repr(Int)` Rust `enum foo` encode the same as a C/C++ fixed-integer + `enum foo : CEquivalentOfInt`. + +[`cfi_encoding`]: https://doc.rust-lang.org/nightly/unstable-book/language-features/cfi-encoding.html +[itanium-mangle]: https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangle.source-name + +By default, Clang and Rust do not encode integers of the same size in the same +way: C `int` and `long` may encode differently even when they're the same size. +The Clang `-fsanitize-cfi-icall-experimental-normalize-integers` and Rust +`-Zsanitizer-cfi-normalize-integers` flags normalize integer encoding across the +languages so that a C `int` encodes the same as the signed Rust integer of the +same bit width. 
+ #### Applicable lints ##### Empty discriminant ranges From 4b8bbe244ca903801c9ca04b21e06adaae1dbff1 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Sat, 2 May 2026 12:40:22 -0700 Subject: [PATCH 32/42] Replace `repr(transparent, Int)` enum with better alts --- text/3894-unnamed-variants.md | 79 +++++++++++++++++++++++++---------- 1 file changed, 57 insertions(+), 22 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 6d4ee7b0539..77f03364bac 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1769,7 +1769,7 @@ Like with `Other = ..`, the utility of determining if the discriminant is known can be provided [with a macro](#isnamedvariant-derive), and an `as` cast accesses the discriminant value. -### Forward compatibility with newtype `struct`s +### Forward compatibility with all newtype `struct`s As described in [Compatibility](#compatibility), it is a minor change to replace a `repr(transparent)` newtype `struct` wrapping a non-`pub` `Int` with an open @@ -1876,6 +1876,29 @@ built-ins respectively. So, in order for that to work, `.0` would be necessary. However, this is too confusing of a syntax for an `enum` to access the discriminant. +### Require `repr(C, Int)` for compatibility with fixed-type C/C++ `enum` + +This RFC [proposes](#control-flow-integrity) that a `repr(Int)` `enum` be +compatible with a C/C++ `enum` specifying the same fixed underlying type. + +Instead, it could be required that `repr(C)` also be included on a `repr(Int)` +field-less `enum` in order to guarantee compatibility with an equivalent C/C++ +definition. Specifying `repr(C, Int)` on a field-less `enum` is currently +rejected. + +This approach has these disadvantages: + +- `repr(C)` affects the layout of `enum`s with fields; `repr(C, Int)` would mean + very different things for field-less and fielded `enum`s. 
+- `repr(Int)` on an `enum` with fields is defined as compatible with a + `union`-of-`struct`s where each `struct`'s first field is a C++ + `enum class : CppEquivalentOfInt`. It's inconsistent to have compatibility + _without_ spelling `C` for enums with fields and require `C` for compatibility + of field-less `repr(Int)` enums with their C/C++ counterparts. +- It is reasonable for users to expect that `repr(Int)` `enum` be compatible + with a C enum using the same fixed underlying type, whereas `repr(C)` `enum` + exists to be compatible with a default C definition. + ### Forbid unnamed variants' discriminants from overlapping named ones ```rust @@ -1985,25 +2008,6 @@ equivalent to `#[non_exhaustive]`. However, this is confusing for a syntax that describes ranges of variants: what does the range `_ = ..` actually cover? Is there still ABI compatibility? -### Require `repr(transparent, Int)` on `enum` for ABI compatibility with `Int` - -This RFC defines ABI compatibility between `repr(Int/C)` enums and their -representing integers, which matches the C standard (C23 §6.7.3.3). However, -[CFI](#control-flow-integrity) may treat these as incompatible types and abort. - -`repr(Int)` on an `enum` specifies an explicit discriminant, but does not have -to imply that it is ABI compatible with `Int`. What if `repr(transparent, Int)` -could be specified to make it explicit that ABI compatibility with `Int` is -required, including by CFI? - -This could also be inverted: `repr(Int)` is treated as CFI compatible with `Int` -but `repr(C, Int)` is CFI compatible with a C `enum`. This would be confusing -for `enum`s with fields, where the `C` also changes the layout of the type. - -The reason the RFC does not choose this is because `#[cfi_encoding]` and -compiler flags can predictably override CFI behavior for cases where the -distinction between `repr(transparent)` `struct` and open `enum` may matter. 
-
 ### Don't introduce a new `as` cast
 
 This RFC introduces new a `as` cast from integer to `enum` that _cannot_ cause
@@ -2103,8 +2107,10 @@ described in the Alternatives section above.
 
 ## Unresolved questions
 
-Is the Control Flow Integrity encoding of types the only blocker for `repr(Int)`
-`enum` to be ABI compatibile with `Int`?
+Is the Control Flow Integrity encoding of types the only [blocker][ucg-489] for
+`repr(Int)` `enum` to be ABI compatible with `Int`?
+
+[ucg-489]: https://github.com/rust-lang/unsafe-code-guidelines/issues/489
 
 ## Future possibilities
 
@@ -2260,3 +2266,32 @@ let name = match code {
     // Exhaustive match, no wildcard branch needed.
 }
 ```
+
+### Improved control over CFI encoding
+
+This RFC defines ABI compatibility between `repr(Int/C)` enums and their
+underlying types, which matches the C standard (C23 §6.7.3.3). However,
+[CFI](#control-flow-integrity) treats these as incompatible types and aborts.
+Two `enum`s with the same underlying type are incompatible: compatibility isn't
+transitive.
+
+The [`cfi_encoding`] attribute allows the name of a type for CFI to be directly
+controlled, but it has downsides:
+
+- The CFI encoding of integers is dependent on compiler flags, and so a manual
+  override can be valid for one set of flags but not another.
+- It requires extra knowledge of name mangling.
+- The mangling is technically platform dependent: Clang on Windows uses MSVC
+  mangling for CFI.
+ +A `cfi_encoding_of` attribute could instead be used to copy the encoding of +another type: + +```rust +#[repr(i32)] +// Compatible with `typedef int32_t Foo` instead of `enum Foo: int32_t` +#[cfi_encoding_of(i32)] +enum Foo { + X, Y, Z +} +``` From 6bee9d59e9fd24c3ccf631aa1d962d5f7e448884 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Sat, 2 May 2026 12:42:16 -0700 Subject: [PATCH 33/42] Clean up forward compat with newtype struct section --- text/3894-unnamed-variants.md | 110 +++++++++++++++++----------------- 1 file changed, 54 insertions(+), 56 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 77f03364bac..28f9a1cd119 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1773,8 +1773,8 @@ accesses the discriminant value. As described in [Compatibility](#compatibility), it is a minor change to replace a `repr(transparent)` newtype `struct` wrapping a non-`pub` `Int` with an open -`enum` using unnamed variants. However, this would require the following -non-trivial changes to `repr(Int/C)` enums: +`enum` using unnamed variants. In order to prevent this, it would require the +following non-trivial changes to `repr(Int/C)` enums: - The enum name is a constructor `fn(Repr) -> Enum`: @@ -1806,76 +1806,74 @@ non-trivial changes to `repr(Int/C)` enums: having closed enums be `unsafe` to mutate through `.0` - it's an [unsafe field]. -There are some clear benefits: + There are some clear benefits: -- It is possible to get a reference directly to the discriminant, which can be - useful when performing lifetime-constrained zero-copy serialization. -- The type of `.0` is exactly the `repr`, and doesn't require the user specify a - type to `as` cast to and possibly truncate. Currently, there's no language - feature in Rust that does this - it requires a macro or codegen to guarantee. 
- This can cause subtle bugs, especially for `repr(C)`: + - It is possible to get a reference directly to the discriminant, which can be + useful when performing lifetime-constrained zero-copy serialization. + - The type of `.0` is exactly the `repr`, and doesn't require the user specify a + type to `as` cast to and possibly truncate. Currently, there's no language + feature in Rust that does this - it requires a macro or codegen to guarantee. + This can cause subtle bugs, especially for `repr(C)`: - ```rust - #[repr(C)] - enum Oops { - // On any platform where this is more than `c_int::MAX`. - TooBig = 2_147_483_649, - } - assert_eq!(Oops::TooBig as core::ffi::c_int, -2_147_483_647); - ``` + ```rust + #[repr(C)] + enum Oops { + // On any platform where this is more than `c_int::MAX`. + TooBig = 2_147_483_649, + } + assert_eq!(Oops::TooBig as core::ffi::c_int, -2_147_483_647); + ``` - Instead, `.0` accesses the discriminant without fear of truncation: + Instead, `.0` accesses the discriminant without fear of truncation: - ```rust - assert_eq!(Oops::TooBig.0, 2_147_483_649); - // mismatched types, expected `i32`, got `i64` - // let _: c_int = X::V.0; - ``` + ```rust + assert_eq!(Oops::TooBig.0, 2_147_483_649); + // mismatched types, expected `i32`, got `i64` + // let _: c_int = X::V.0; + ``` -- Some discriminant-manipulating operations are simpler than with `as` casts: + - Some discriminant-manipulating operations are simpler than with `as` casts: - ```rust - #[repr(u32)] - enum X { - A = 0, - B = 1, - } - let mut x = X::A; - assert_eq!(x.0, 0); + ```rust + #[repr(u32)] + enum X { + A = 0, + B = 1, + } + let mut x = X::A; + assert_eq!(x.0, 0); - // SAFETY: 1 is a valid discriminant for `X`. - unsafe { x.0 += 1; } + // SAFETY: 1 is a valid discriminant for `X`. 
+ unsafe { x.0 += 1; } - assert!(matches!(x, X::B)); - ``` + assert!(matches!(x, X::B)); + ``` -- A fielded enum with `#[repr(Int)]` and/or `#[repr(C)]` is guaranteed to have - its discriminant values starting from 0. However, for any given value of that - enum, there's no built-in way to extract what the integer value of the - discriminant is safely. The `unsafe` mechanism is - `(&thenum as *const _ as *const Int).read()`. For open fielded enums, some - direct access to the discriminant would be even more valuable, since the - discriminant could be entirely unknown and the user may want to know its - value. + - A fielded enum with `#[repr(Int)]` and/or `#[repr(C)]` is guaranteed to have + its discriminant values starting from 0. However, for any given value of that + enum, there's no built-in way to extract what the integer value of the + discriminant is safely. The `unsafe` mechanism is + `(&thenum as *const _ as *const Int).read()`. For open fielded enums, some + direct access to the discriminant would be even more valuable, since the + discriminant could be entirely unknown and the user may want to know its + value. -[unsafe field]: https://rust-lang.github.io/rust-project-goals/2025h2/unsafe-fields.html + [unsafe field]: https://rust-lang.github.io/rust-project-goals/2025h2/unsafe-fields.html -However, this is a subjectively ugly and undiscoverable syntax to access the -discriminant of an `enum`. One possibility: when introduced, treat these forms -as deprecated and throw a warning to recommend a better syntax than `.0` but -still allow the desired migration be a minor change across the ecosystem. + However, this is a subjectively ugly and undiscoverable syntax to access the + discriminant of an `enum`. Perhaps when introduced, these forms could begin + as deprecated and throw a warning to recommend a better syntax than `.0` but + still allow the desired forward compatibility for `struct` newtype to open + `enum`. 
-There are also existing proposals to [read][rfc-3607] and [write][rfc-3727] the -discriminant directly. They propose alternative syntax, with -`.enum#discriminant` rather than `.0` and `discriminant_of!`/`set_discriminant` -built-ins respectively. + This better syntax could resemble the existing proposals to [read][rfc-3607] + and [write][rfc-3727] a discriminant directly. They propose alternative + syntax, with an `.enum#discriminant` field rather than `.0` and + `discriminant_of!`/`set_discriminant` built-ins respectively. [rfc-3607]: https://github.com/rust-lang/rfcs/pull/3607 [rfc-3727]: https://github.com/rust-lang/rfcs/pull/3727 -So, in order for that to work, `.0` would be necessary. However, this is too -confusing of a syntax for an `enum` to access the discriminant. - ### Require `repr(C, Int)` for compatibility with fixed-type C/C++ `enum` This RFC [proposes](#control-flow-integrity) that a `repr(Int)` `enum` be From b9829fd2d48241a5c177056b2b2fe0ed89c70475 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Sat, 2 May 2026 12:43:34 -0700 Subject: [PATCH 34/42] Various nits --- text/3894-unnamed-variants.md | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 28f9a1cd119..3b955050408 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -1219,7 +1219,7 @@ enum NothingYet { _ = .. } (10 as NothingYet > 5 as NothingYet) ``` -### Open enum conversion +### Open enum casting An _open enum_ is defined as an `enum` for which every value of its backing integer is a valid discriminant. @@ -1228,6 +1228,8 @@ integer is a valid discriminant. - An enum is open if every discriminant value for that integer is associated with a named or unnamed variant. - For a field-less enum, this means every initialized bit pattern is valid. + - `_ = ..` makes any enum open. 
This should apply for + [enums with](#unnamed-variants-on-enums-with-field-data) and without fields. - A [unit-only] open enum may be `as` cast from its backing integer _only_: `2u8 as Color`. See below for `repr(C)` behavior. - If an expression with the `{integer}` inference variable type is used as the @@ -1246,7 +1248,7 @@ integer is a valid discriminant. // let _: u32 = x; ``` -#### `repr(C)` open enum behavior +#### `repr(C)` open enum casting The actual backing integer type for a `repr(C)` enum changes based on the variants' numeric discriminant values as described above. @@ -1762,7 +1764,7 @@ Some concerns: - This has the same issue regarding the complexity of `match` as the type evolves as the `Other = ..` alternative above. - This would introduce a new concept to Rust users: that a tuple variant field - can carry a discriminant directly rather than a payload (which can + can carry a discriminant directly rather than a payload (which can be used to discriminate). Like with `Other = ..`, the utility of determining if the discriminant is known @@ -2011,8 +2013,7 @@ there still ABI compatibility? This RFC introduces new a `as` cast from integer to `enum` that _cannot_ cause data loss. While it would be excellent for Rust to provide a non-`as` mechanism to convert from integer to `enum` such as a `TryFrom` `derive`, such a mechanism -should be provided for all `enum`s, not just those with unnamed fields as -affected defined by this RFC. +should be provided for all `enum`s, not just those with unnamed variants. ## Prior art @@ -2124,7 +2125,7 @@ can be provided by third parties, it may be better for the ecosystem to have a standard library solution. ```rust -#[repr(transparent, u32)] +#[repr(u32)] #[derive(IsNamedVariant)] enum IpProto { Tcp = 6, @@ -2155,9 +2156,12 @@ related Alternatives section above. enum Color { Red = 0, Green = 1, - // Must specify a non-overlapping range, - // including if `..` is the discriminant. 
-    Unknown = 2..=50,
+    // Must specify a non-overlapping range, including with `Unknown = ..`.
+    // Optional: make this variant private, which prevents downstream users
+    // from breaking when a new variant is added and the valid discriminants
+    // for `Unknown` change.
+    // Also: Should e.g. `Unknown = 2..=50 | 60` be allowed?
+    priv Unknown = 2..=50,
 }
 
 // This is fine.

From e2ce3a83a0951fe502a70b837cc6e580638f9419 Mon Sep 17 00:00:00 2001
From: Alyssa Haroldsen
Date: Sat, 2 May 2026 12:46:39 -0700
Subject: [PATCH 35/42] Add future work section for pattern type casts

---
 text/3894-unnamed-variants.md | 74 +++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md
index 3b955050408..d8a54891694 100644
--- a/text/3894-unnamed-variants.md
+++ b/text/3894-unnamed-variants.md
@@ -2238,6 +2238,80 @@ pub enum Shape {
 }
 ```
 
+### Advanced casting with pattern types
+
+The `as` cast from `enum` to `Int` could implicitly receive the
+[pattern type][pattern types] `Int is P`, where `P` matches every possible
+discriminant for the source expression. For a literal or `const` enum variant
+source, `P` is just that variant's discriminant. For dynamic source values, it
+matches every valid discriminant for the `enum`. The source `enum` may also be a
+pattern type, which constrains `P`.
+
+Conversely, the `as` cast from `Int` to `enum` type that is defined in this RFC
+could be extended to support more situations by requiring the source `Int` type
+be coercible to `Int is P`, where `P` matches every possible discriminant for
+the destination type. The destination may itself be a pattern type of `enum`.
+
+While it could be required that the `enum` type have an explicit `repr(Int)` or
+`repr(C)`, it is not technically necessary nor a breaking change from this RFC's
+more conservative proposal. All valid discriminants for an `enum` have a known
+but possibly different value when cast to the backing integer for the `enum`,
+which can then be integrated into `P` for a static lossless check.
+
+Example syntax:
+
+```rust
+#[derive(Debug, PartialEq)]
+enum Color {
+    Red,
+    Green,
+    Blue,
+    Yellow = 5,
+}
+
+// Compatible with `repr(Rust)` since `1` is statically known to be valid
+// for any type chosen for the `{integer}`.
+let c = 1 as Color;
+assert_eq!(c, Color::Green);
+
+// error: invalid discriminant for `Color`
+// help: `Color` has valid discriminants in `0..=2 | 5`
+// let c = 3 as Color;
+
+fn some_color() -> Color { Color::Red }
+
+// No check necessary; casts are bidirectionally infallible.
+fn exact_cast(x: u8 is 0..=2 | 5) -> Color {
+    x as Color
+}
+assert_eq!(exact_cast(some_color() as u8), Color::Red);
+
+// No check necessary; only returns `Red` or `Yellow`.
+// Bidirectional casts require a pattern return type.
+fn subset_cast(x: u8 is 0 | 5) -> Color is Color::Red | Color::Yellow {
+    x as Color
+}
+// Compatible because `Color::Red as u8` has the type `u8 is 0`.
+assert_eq!(subset_cast(Color::Red as u8), Color::Red);
+
+// Compatible because `subset_cast(0) as u8` has the type `u8 is 0 | 5`.
+assert_eq!(subset_cast(subset_cast(0) as u8), Color::Red);
+
+// The below *may* be valid if `some_color` is made `const` and keeps returning
+// `Color::Red` or `Color::Yellow`.
+// error: incompatible pattern type coercion
+// help: the source pattern is `0..=2 | 5`
+// help: the destination pattern is `0 | 5`
+// help: `1..=2` is incompatible
+// assert_eq!(subset_cast(some_color() as u8), Color::Red);
+
+// error: incompatible source pattern type for cast
+// help: `Color` has valid discriminants in `0..=2 | 5`
+// help: the source pattern type is `0..=5`
+// help: `Color` is incompatible with `3..=4`
+// fn superset_cast(x: u8 is 0..=5) -> Color { x as Color }
+```
+
 ### `match` on ranges of enums
 
 ```rust

From 53227821ae7d4908f575a5e34beb414b8e6d9b2d Mon Sep 17 00:00:00 2001
From: Alyssa Haroldsen
Date: Fri, 15 May 2026 14:17:38 -0700
Subject: [PATCH 36/42] Use "underlying" instead of "backing" integer

While "representing" may be better, "underlying integer" matches the
language of the C standard.
---
 text/3894-unnamed-variants.md | 59 ++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 28 deletions(-)

diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md
index d8a54891694..4cae36f05ba 100644
--- a/text/3894-unnamed-variants.md
+++ b/text/3894-unnamed-variants.md
@@ -102,8 +102,8 @@ schema version skew.
 
 Protobuf generates code for target languages from a schema. On C++, it can
 directly generate an `enum` - C++ enums are open since it's valid to
-`static_cast` an `enum` from its backing integer. However, on Rust, the current
-implementation simulates an open enum by using an integer newtype with
+`static_cast` an `enum` from its underlying integer. However, on Rust, the
+current implementation simulates an open enum by using an integer newtype with
 associated constants for each variant.
While this allows Protobuf enums in Rust to be used _mostly_ like enums, this is @@ -158,9 +158,9 @@ With unnamed variants, interoperating with a C `enum` is very simple: add `bindgen` has [multiple ways][bindgen-enum-variation] to generate Rust that correspond to a C enum, the default being to define a series of `const` items. -Its best-effort logic to determine the backing integer type for a C enum does +Its best-effort logic to determine the underlying integer type for a C enum does not always match that of `repr(C)` on a Rust `enum`. A future version of -`bindgen` could use this feature to add a `_ = ..` variant to a Rust `enum` by +`bindgen` could use this feature to add a `_ = ..` variant to a Rust `enum` by default, instead of a exposing a less-effective `non_exhaustive` attribute. Today, Rust for Linux configures `bindgen` to generate newtype integers and raw @@ -462,9 +462,10 @@ It is valid to claim multiple ranges of discriminants. Those ranges may be discontiguous. To declare an unnamed variant, the `enum` must have an explicit `repr(Int)` to -indicate a **backing integer** for the `enum`. `Int` is one of the primitive -integers or `C`. If it is `C`, then the `Int` for the discriminant expression -below is `isize` and the declaration has further [nuances](#reprc-behavior). +indicate a fixed **underlying integer** for its discriminant. `Int` is one of +the primitive integers or `C`. If it is `C`, then the `Int` for the discriminant +expression below is `isize` and the declaration has further +[nuances](#reprc-behavior). An unnamed variant declaration must specify a discriminant expression with one of these types: @@ -590,10 +591,10 @@ impl ClaimDiscriminants for RangeFull {} #### `repr(C)` behavior `repr(C)` enums have special semantics in Rust because the discriminant -expression type, `isize`, is not the same as the actual backing integer. 
These -enums are ordinarily backed by a `ffi::c_int`, but if any of the assigned -discriminants cannot fit, a larger backing integer is chosen that can represent -all of them. +expression type, `isize`, is not the same as the actual underlying integer. +These enums ordinarily share a layout with `ffi::c_int`, but if any of the +assigned discriminants cannot fit, a larger underlying integer is chosen that +can represent all of them. > Since this behavior is fraught with mismatches on different compiler > platforms, allowing enums larger than `c_int` or `c_uint` is currently being @@ -637,11 +638,12 @@ const _: () = assert!( ); ``` -The unbounded end of a discriminant range **never** affects the backing integer -of a `repr(C)` enum. For a `repr(C)` enum, when a range with an unbounded end -(`start..`, `..end`, `..=end`, `..`) is used as an unnamed variant declaration's -discriminant expression, the effective bound of the claimed range is dependent -on what the backing integer would be if no unnamed variants were declared. +The unbounded end of a discriminant range **never** affects the underlying +integer of a `repr(C)` enum. For a `repr(C)` enum, when a range with an +unbounded end (`start..`, `..end`, `..=end`, `..`) is used as an unnamed variant +declaration's discriminant expression, the effective bound of the claimed range +is dependent on what the underlying integer would be if no unnamed variants were +declared. ```rust // On x86_64-unknown-linux-gnu: @@ -861,8 +863,8 @@ when used as parameters or return values in a function using the C ABI - `enum class foo { ... }` in C++ global namespace - `#[repr(C)] enum foo` in Rust (ignoring modules) - `enum foo : uint16_t` in C and `enum class foo : uint16_t` in C++ also share - this encoding, since Clang doesn't encode the backing integer for the `enum`. - It remains ABI incompatible with the above types. + this encoding, since Clang doesn't encode the underlying integer for the + `enum`. 
It remains ABI incompatible with the above types. These share a different CFI encoding: @@ -876,7 +878,7 @@ This RFC proposes that `#[repr(Int)] enum foo` encode the same as in an `extern "C" fn` signature. As of writing, compatibility with C/C++ encoding is only attempted for `repr(C)` `enum` in `extern "C" fn`. -Because an `enum` and its backing integer don't share the same encoding, this +Because an `enum` and its underlying integer don't share the same encoding, this triggers an abort when using CFI: ```rust @@ -1221,20 +1223,21 @@ enum NothingYet { _ = .. } ### Open enum casting -An _open enum_ is defined as an `enum` for which every value of its backing +An _open enum_ is defined as an `enum` for which every value of its underlying integer is a valid discriminant. -- An open enum always has an explicit `repr` backing integer, or is `repr(C)`. +- An open enum always has an explicit `repr` underlying integer, or is + `repr(C)`. - An enum is open if every discriminant value for that integer is associated with a named or unnamed variant. - For a field-less enum, this means every initialized bit pattern is valid. - `_ = ..` makes any enum open. This should apply for [enums with](#unnamed-variants-on-enums-with-field-data) and without fields. -- A [unit-only] open enum may be `as` cast from its backing integer _only_: +- A [unit-only] open enum may be `as` cast from its underlying integer _only_: `2u8 as Color`. See below for `repr(C)` behavior. - If an expression with the `{integer}` inference variable type is used as the source for an `as` cast to an open enum, it is uniquely constrained to the - explicit backing integer type. This excludes `repr(C)`; see below. + explicit underlying integer type. This excludes `repr(C)`; see below. ```rust #[repr(u8)] @@ -1250,7 +1253,7 @@ integer is a valid discriminant. 
#### `repr(C)` open enum casting -The actual backing integer type for a `repr(C)` enum changes based on the +The actual underlying integer type for a `repr(C)` enum changes based on the variants' numeric discriminant values as described above. A `repr(C)` unit-only open enum may be `as` cast from: @@ -1259,7 +1262,7 @@ A `repr(C)` unit-only open enum may be `as` cast from: be `as` cast from the same discriminant expression assigned to a variant. - Any primitive explicit-width integer that is capable of representing all variants' discriminants and does not exceed the size of the enum for the - platform. Thus any signedness cast performed to the backing integer has no + platform. Thus any signedness cast performed to the underlying integer has no visible effect. - This means that authors who don't know or care about short-enum platforms can cast from `c_int` and `c_uint` to most `repr(C)` open enums, while @@ -1397,7 +1400,7 @@ Unnamed variants enable a large range of discriminants to be claimed for an enum, whether it's all or some of them. `NonZero`, and an `enum` spelling out each discriminant are the only other ways to achieve this in stable Rust today. -The open enum conversion from backing integer is an ergonomic benefit that is +The open enum conversion from underlying integer is an ergonomic benefit that is made possible by unnamed variants. ### Do nothing @@ -1664,7 +1667,7 @@ language. Some of the concerns are: - The enum author uses an attribute to specify the "default" discriminant for an `IpProto::Other` expression. - Forbid direct construction of `IpProto::Other`. It can only be constructed - via `unsafe` or, for open enums, `as`-cast from the backing integer to + via `unsafe` or, for open enums, `as`-cast from the underlying integer to `IpProto`. There's no check that the discriminant represents an `Other` variant. - A discriminant that is valid for `IpProto::Other` must be provided when @@ -2255,7 +2258,7 @@ the destination type. 
The destination may itself be a pattern type of `enum`. While it could be required that the `enum` type have an explicit `repr(Int)` or `repr(C)`, it is not technically necessary nor a breaking change from this RFC's more conservative proposal. All valid discriminants for an `enum` have a known -but possibly different value when cast to the backing integer for the `enum`, +but possibly different value when cast to the underlying integer for the `enum`, which can then be integrated into `P` for a static lossless check. Example syntax: From 966a996ac48ecabc392df70e96628efab45fb97e Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Fri, 15 May 2026 15:37:08 -0700 Subject: [PATCH 37/42] Lint instead of reject non_exhaustive and unnamed variants `non_exhaustive` on an enum with unnamed variants is instead considered "unused", which is a warning by default. --- text/3894-unnamed-variants.md | 73 ++++++++++++++++++++--------------- 1 file changed, 42 insertions(+), 31 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 4cae36f05ba..2d97c3ed95a 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -719,27 +719,8 @@ EnumVariant -> #### No field data This RFC only defines adding unnamed variants to field-less enums, leaving -unnamed variants in enums with fields as future work. - -#### `non_exhaustive` - -The `non_exhaustive` attribute on enums and unnamed variants are mutually -exclusive: - -```rust -#[non_exhaustive] -#[repr(u8)] -enum Color { - Red = 0, - Green = 1, - // error: An `_` variant cannot be specified on a `non_exhaustive` enum. - // help: remove the `#[non_exhaustive]` - _ = 2, -} -``` - -An unnamed variant is more impactful than `non_exhaustive`, since it affects the -declaring crate - the enum is "universally non-exhaustive". +unnamed variants in enums with fields as +[future work](#unnamed-variants-on-enums-with-field-data). #### Compatibility @@ -918,6 +899,29 @@ same bit width. 
#### Applicable lints +##### Unused `non_exhaustive` + +The existing [`unused_attributes`] lint also detects the `#[non_exhaustive]` +attribute present on an `enum` with unnamed variants. + +[`unused_attributes`]: https://doc.rust-lang.org/rustc/lints/listing/warn-by-default.html#unused-attributes + +```rust +// warning: `non_exhaustive` has no effect on an enum with unnamed variants +// help: `_ = 2` makes this enum match non-exhaustively in all contexts +// note: `#[warn(unused_attributes)]` (part of `#[warn(unused)]`) on by default +#[non_exhaustive] +#[repr(u8)] +enum Color { + Red = 0, + Green = 1, + _ = 2, +} +``` + +An unnamed variant is more impactful than `non_exhaustive` since it affects the +declaring crate as well - the enum is "universally non-exhaustive". + ##### Empty discriminant ranges `empty_discriminant_ranges` is a new `deny`-by-default lint. It should be @@ -1080,7 +1084,7 @@ enum ImplicitNextDiscriminant { The existing [`non_contiguous_range_endpoints`] lint should be produced if: -[`non_contiguous_range_endpoints`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_CONTIGUOUS_RANGE_ENDPOINTS.html +[`non_contiguous_range_endpoints`]: https://doc.rust-lang.org/rustc/lints/listing/warn-by-default.html#non-contiguous-range-endpoints - There exists some unnamed variant assigned to a `start..end` or `..end` discriminant expression, and @@ -1944,17 +1948,19 @@ to ensure an enum is open, they would need to handle the particular edge case of an enum with 256 variants and an 8-bit discriminant and leave out the variant. Instead, the lints can be `allow`ed for carefully-considered macros/codegen. -### Require `non_exhaustive`, don't forbid it +### Require `non_exhaustive` rather than lint if it's there -Perhaps an unnamed variant could _require_ `#[non_exhaustive]`, rather than -forbid it? 
This RFC opts against that, with the following considerations: +An `enum` with unnamed variants [lints](#unused-non_exhaustive) when the +`#[non_exhaustive]` attribute is present. Perhaps an unnamed variant could +instead _require_ `#[non_exhaustive]`? This RFC opts against that, with the +following considerations: Pros: - `non_exhaustive` already implies adding another wildcard branch. This could make it easier to explain to new users by fitting the idea of "needs wildcard branch" into one mental bucket. -- This would make the unstable allow-by-default +- This would make the unstable `allow`-by-default `non_exhaustive_omitted_patterns` lint more obviously correct to apply to enums with unnamed variants. @@ -1965,7 +1971,7 @@ Cons: This could make it harder to explain to newer users. - The variant name being an underscore _already_ implies that a wildcard branch is needed. -- It always requires two lines to achieve ABI non-exhaustiveness. +- An author must recite two special lines to make an enum open instead of one. - Consider this enum: ```rust @@ -1981,10 +1987,15 @@ Cons: ``` When adding `X255`, the `non_exhaustive` _should_ also be removed, but as of - today, an open enum gives no warning if it is `non_exhaustive`. This is even - though it would necessarily be an API and ABI-breaking change to add a new - variant by changing the `repr`. This is non-obvious and can be avoided by - forbidding `non_exhaustive` when a valid unnamed variant exists. + today, a `repr(u8)` enum with 256 variants gives no warning if it is + `non_exhaustive`. This is even though it would necessarily be an API and + ABI-breaking change to add a new variant by changing the `repr`. This is + non-obvious and can be avoided by warning against `non_exhaustive` when an + enum has an unnamed variant. + +Rust could also reject `non_exhaustive` entirely rather than lint, but this is a +stricter approach than Rust otherwise takes for attributes that unambiguously +have no effect. 
### Allow an implicit discriminant expression for unnamed variants From b2c511d8c29e818acf032ac6195ae6f7cd9dbd7d Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Fri, 15 May 2026 15:58:36 -0700 Subject: [PATCH 38/42] Mention Ipv6MulticastScope in Motivation --- text/3894-unnamed-variants.md | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 2d97c3ed95a..3693c7f8e5b 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -186,7 +186,22 @@ others listed. [non-exhaustive-ub]: https://github.com/rust-lang/rust-bindgen/issues/1763 -### Embedded syscalls +### Enums with reserved discriminants + +Enums designed against specific protocols may have reserved values that +shouldn't be directly used but must be cleanly handled if encountered. Two +practical examples: + +#### `Ipv6MulticastScope` + +The unstable [`Ipv6MulticastScope`] enum defines two `doc(hidden)` and +perma-unstable attributes to reserve discriminants with unnameable variants. It +could instead define `_ = 0x0, _ = 0xF` and `#[repr(u8)]` to reserve these +discriminants. + +[`Ipv6MulticastScope`]: https://doc.rust-lang.org/nightly/std/net/enum.Ipv6MulticastScope.html + +#### System call interfaces TockOS is an embedded OS with a separate user space and kernel space. Its syscall ABI defines that kernel error codes are between 1 and 1024. It's highly From a939a7468d487c3af77ed6ac786c7673dfe58da7 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Fri, 15 May 2026 16:12:21 -0700 Subject: [PATCH 39/42] Use "detect" rather than "produce" for lints This mirrors the language used by official docs. 
--- text/3894-unnamed-variants.md | 40 +++++++++++++++++------------------ 1 file changed, 19 insertions(+), 21 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 3693c7f8e5b..d533f29761c 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -939,8 +939,8 @@ declaring crate as well - the enum is "universally non-exhaustive". ##### Empty discriminant ranges -`empty_discriminant_ranges` is a new `deny`-by-default lint. It should be -produced if the discriminant range assigned to an unnamed variant is empty. +`empty_discriminant_ranges` is a new `deny`-by-default lint. It detects when the +discriminant range assigned to an unnamed variant is empty. ```rust #[repr(u8)] @@ -975,12 +975,11 @@ enum Bar { ##### Taken discriminant ranges -`taken_discriminant_ranges` is a new `warn`-by-default lint. It should be -produced if every discriminant in the range assigned to an unnamed variant is -already assigned to a named variant. This results in the unnamed variant -definition having no effect. While an unnamed variant is syntactically present, -no unnamed variant is introduced to the `enum` as it has no discriminants to -claim. +`taken_discriminant_ranges` is a new `warn`-by-default lint. It detects when +every discriminant in the range assigned to an unnamed variant is already +assigned to a named variant. This results in the unnamed variant definition +having no effect. While an unnamed variant is syntactically present, no unnamed +variant is introduced to the `enum` as it has no discriminants to claim. ```rust #[repr(u8)] @@ -1019,14 +1018,13 @@ enum NamedU8 { ##### Truncatable ranges -`overlong_discriminant_ranges` is a new `warn`-by-default lint. It should be -produced if an unnamed variant's discriminant range can be shortened to avoid -overlapping with named variants. +`overlong_discriminant_ranges` is a new `warn`-by-default lint. 
It detects when +an unnamed variant's discriminant range can be shortened to avoid overlapping +with named variants. Let `start..=end` be the range of discriminants that an unnamed variant -definition is assigned to, regardless of the actual range type used. An -`overlong_discriminant_ranges` lint should be produced if all of the below are -true: +definition is assigned to, regardless of the actual range type used. The +`overlong_discriminant_ranges` lint detects when all of the below are true: - The bound is specified as a range expression in the variant's discriminant expression, and not as an identifier or block. @@ -1039,7 +1037,7 @@ true: defined by an unbounded range. - The prefix is an overlong side _or_ the following variant, if any, has an explicit discriminant. -- The `taken_discriminant_ranges` lint is not produced for this unnamed variant. +- The `taken_discriminant_ranges` lint doesn't detect this unnamed variant. ```rust #[repr(u32)] @@ -1097,7 +1095,7 @@ enum ImplicitNextDiscriminant { ##### Gap of length one caused by an exclusive range -The existing [`non_contiguous_range_endpoints`] lint should be produced if: +The existing [`non_contiguous_range_endpoints`] lint also detects when: [`non_contiguous_range_endpoints`]: https://doc.rust-lang.org/rustc/lints/listing/warn-by-default.html#non-contiguous-range-endpoints @@ -1133,8 +1131,8 @@ enum Bar { ##### Forgot to mention a named variant -The unstable [`non_exhaustive_omitted_patterns`] `allow`-by-default lint should -be produced if a `match` on an enum with unnamed variants mentions some, but not +The unstable [`non_exhaustive_omitted_patterns`] `allow`-by-default lint also +detects when a `match` on an enum with unnamed variants mentions some, but not all, of the named variants. 
[`non_exhaustive_omitted_patterns`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_EXHAUSTIVE_OMITTED_PATTERNS.html @@ -1156,9 +1154,9 @@ enum Bar { } let b = Bar::A; -// warning: some variants are not matched explicitly +// warning: some named variants are not matched explicitly // pattern `Bar::B` not covered -// help: ensure that all variants are matched explicitly by adding the +// help: ensure that all named variants are matched explicitly by adding the // suggested match arms // note: the matched value is of type `Bar` and the // `non_exhaustive_omitted_patterns` attribute was found @@ -2136,7 +2134,7 @@ described in the Alternatives section above. ## Unresolved questions Is the Control Flow Integrity encoding of types the only [blocker][ucg-489] for -`repr(Int)` `enum` to be ABI compatibile with `Int`? +`repr(Int)` `enum` to be ABI compatible with `Int`? [ucg-489]: https://github.com/rust-lang/unsafe-code-guidelines/issues/489 From ec9d36bae12e8b42308a90e889c673fa6f5a8a13 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 20 May 2026 12:05:55 -0700 Subject: [PATCH 40/42] Add rust-lang/rust tracking issue for unnamed variants --- text/3894-unnamed-variants.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index d533f29761c..43b8c4f5ca5 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -2,7 +2,7 @@ - Start Date: 2025-12-09 - RFC PR: [rust-lang/rfcs#3894](https://github.com/rust-lang/rfcs/pull/3894) - Rust Issue: - [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + [rust-lang/rust#156628](https://github.com/rust-lang/rust/issues/156628) ## Summary From 32bd72e70755b1645daeaa61e2f01ae4537b23c0 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 20 May 2026 12:29:06 -0700 Subject: [PATCH 41/42] Use `-` not `_` for lint reference Again, to be consistent with official docs. 
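Note for readers: the hyphenated spelling matches how the rustc lint listing names lints, and it is accepted by command-line flags such as `rustc -W`, but lint attributes in source still need underscores. A minimal sketch using the existing `non_contiguous_range_endpoints` lint on an ordinary `match` (the `bucket` function is hypothetical and not part of the RFC; exclusive range patterns assume Rust 1.80 or later):

```rust
// Hypothetical helper, not from the RFC. The attribute must use the
// underscore spelling; `#[allow(non-contiguous-range-endpoints)]` is
// rejected by the compiler.
#[allow(non_contiguous_range_endpoints)]
fn bucket(x: u8) -> &'static str {
    match x {
        // Exclusive upper bound: covers 0 through 99.
        0..100 => "small",
        // Starts at 101, so 100 falls into the one-value gap the lint flags.
        101..=255 => "large",
        // Only 100 reaches this arm.
        _ => "gap",
    }
}
```

The same convention would presumably apply to the new lints this RFC proposes, e.g. `empty_discriminant_ranges` in an attribute versus `empty-discriminant-ranges` in prose.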
--- text/3894-unnamed-variants.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index 43b8c4f5ca5..ccb34daa0b4 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -916,10 +916,10 @@ same bit width. ##### Unused `non_exhaustive` -The existing [`unused_attributes`] lint also detects the `#[non_exhaustive]` +The existing [`unused-attributes`] lint also detects the `#[non_exhaustive]` attribute present on an `enum` with unnamed variants. -[`unused_attributes`]: https://doc.rust-lang.org/rustc/lints/listing/warn-by-default.html#unused-attributes +[`unused-attributes`]: https://doc.rust-lang.org/rustc/lints/listing/warn-by-default.html#unused-attributes ```rust // warning: `non_exhaustive` has no effect on an enum with unnamed variants @@ -939,7 +939,7 @@ declaring crate as well - the enum is "universally non-exhaustive". ##### Empty discriminant ranges -`empty_discriminant_ranges` is a new `deny`-by-default lint. It detects when the +`empty-discriminant-ranges` is a new `deny`-by-default lint. It detects when the discriminant range assigned to an unnamed variant is empty. ```rust @@ -975,7 +975,7 @@ enum Bar { ##### Taken discriminant ranges -`taken_discriminant_ranges` is a new `warn`-by-default lint. It detects when +`taken-discriminant-ranges` is a new `warn`-by-default lint. It detects when every discriminant in the range assigned to an unnamed variant is already assigned to a named variant. This results in the unnamed variant definition having no effect. While an unnamed variant is syntactically present, no unnamed @@ -1018,13 +1018,13 @@ enum NamedU8 { ##### Truncatable ranges -`overlong_discriminant_ranges` is a new `warn`-by-default lint. It detects when +`overlong-discriminant-ranges` is a new `warn`-by-default lint. 
It detects when an unnamed variant's discriminant range can be shortened to avoid overlapping with named variants. Let `start..=end` be the range of discriminants that an unnamed variant definition is assigned to, regardless of the actual range type used. The -`overlong_discriminant_ranges` lint detects when all of the below are true: +`overlong-discriminant-ranges` lint detects when all of the below are true: - The bound is specified as a range expression in the variant's discriminant expression, and not as an identifier or block. @@ -1037,7 +1037,7 @@ definition is assigned to, regardless of the actual range type used. The defined by an unbounded range. - The prefix is an overlong side _or_ the following variant, if any, has an explicit discriminant. -- The `taken_discriminant_ranges` lint doesn't detect this unnamed variant. +- The `taken-discriminant-ranges` lint doesn't detect this unnamed variant. ```rust #[repr(u32)] @@ -1095,9 +1095,9 @@ enum ImplicitNextDiscriminant { ##### Gap of length one caused by an exclusive range -The existing [`non_contiguous_range_endpoints`] lint also detects when: +The existing [`non-contiguous-range-endpoints`] lint also detects when: -[`non_contiguous_range_endpoints`]: https://doc.rust-lang.org/rustc/lints/listing/warn-by-default.html#non-contiguous-range-endpoints +[`non-contiguous-range-endpoints`]: https://doc.rust-lang.org/rustc/lints/listing/warn-by-default.html#non-contiguous-range-endpoints - There exists some unnamed variant assigned to a `start..end` or `..end` discriminant expression, and @@ -1131,11 +1131,11 @@ enum Bar { ##### Forgot to mention a named variant -The unstable [`non_exhaustive_omitted_patterns`] `allow`-by-default lint also +The unstable [`non-exhaustive-omitted-patterns`] `allow`-by-default lint also detects when a `match` on an enum with unnamed variants mentions some, but not all, of the named variants. 
-[`non_exhaustive_omitted_patterns`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_EXHAUSTIVE_OMITTED_PATTERNS.html +[`non-exhaustive-omitted-patterns`]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint_defs/builtin/static.NON_EXHAUSTIVE_OMITTED_PATTERNS.html This uses the same name as the similar lint for `non_exhaustive` because it is burdensome to require developers to remember two different lints for such From 1b2c28cf50a6ba05545189b5e8bf17b96dba6809 Mon Sep 17 00:00:00 2001 From: Alyssa Haroldsen Date: Wed, 3 Jun 2026 22:29:14 -0700 Subject: [PATCH 42/42] Update guide-level `non_exhaustive` section --- text/3894-unnamed-variants.md | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/text/3894-unnamed-variants.md b/text/3894-unnamed-variants.md index ccb34daa0b4..9f8bdfac8a0 100644 --- a/text/3894-unnamed-variants.md +++ b/text/3894-unnamed-variants.md @@ -398,29 +398,33 @@ enum by IDEs and developers. ### Interaction with `#[non_exhaustive]` -An enum declared both `non_exhaustive` and with an unnamed variant is rejected. -On a field-less enum, it is not a breaking change to replace a -`#[non_exhaustive]` declared on the enum with a contained unnamed variant. -Unnamed variants and `#[non_exhaustive]` both declare that future variants of an -enum may be added as the type evolves. +`#[non_exhaustive]` on an `enum` and an unnamed variant in an `enum` similarly +affect how `match` behaves for that type. -`non_exhaustive` affects API semver compatibility: +`non_exhaustive` affects source code only: -- It is flexible in how new variants are represented. +- It is flexible in how new variants are represented. E.g. it allows adding + variants with fields. - It does _not_ affect what discriminants are currently valid to represent. - Crates must be recompiled to use new enum variants. - It affects _only_ downstream crates. 
-By contrast, an unnamed variant affects API _and_ ABI semver compatibility: +By contrast, an unnamed variant affects what bit patterns are valid for the type: - It claims specific ranges of discriminants. - These claimed discriminants are valid to represent without naming the future variants that use them. - Crates can manipulate these unnamed enum variants without recompilation. -- It affects all crates, including the declaring one. +- It affects _all_ crates, including the declaring one. -For enums that have relevant discriminant values, an unnamed variant may be the -better choice. This is often the case for enums declaring an explicit `repr`. +Because of this, declaring `#[non_exhaustive]` on an enum with unnamed variants +emits a warning that the attribute is unused. An unnamed variant makes an enum +"universally non-exhaustive" already. + +For enums where the discriminant value is important, an unnamed variant may be +a better choice than `#[non_exhaustive]`. This is often the case for enums +declaring an explicit `repr`. It's a non-breaking change to replace +`#[non_exhaustive]` on an `enum` with at least one unnamed variant. ### Syntax "sugar"