Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add num_variants intrinsic for enums and safe wrapper in std::reflect. #13860

Closed
wants to merge 1 commit into from

Conversation

kemurphy
Copy link
Contributor

This adds an intrinsic to query the number of variants in an enum, along with a safe wrapper for it in std::reflect:

use std::reflect::num_variants;

enum Shape {
    Triangle,
    Square,
    Circle,
}

fn main() {
    let num_shapes: uint = num_variants::<Shape>();
    println!("{}", num_shapes); // prints 3
}

If the type parameter is not an enum, the result is 0.

@kemurphy
Copy link
Contributor Author

This is useful for avoiding the classic C workaround:

enum Shape {
    Triangle = 0u,
    Square,
    Circle,
    NUM_SHAPES
}

@kemurphy
Copy link
Contributor Author

This should really be a macro so it can be used as a constant, but the necessary information isn't available from the macro expander, it looks like. Open to suggestions.

@lilyball
Copy link
Contributor

This sort of thing is really only useful if you can then iterate the range [0,max) and get every variant (or [min,max), as you may start from something besides 0).

Maybe a better solution would be an intrinsic that evaluates to &[Shape], containing every variant in turn? With the caveat that it's only defined for C-like enums.

@eddyb
Copy link
Member

eddyb commented Apr 30, 2014

Intrinsic? Why not a derived trait (or anonymous impl)?

@lilyball
Copy link
Contributor

Good point. Derived trait makes more sense. Bounded would make sense, to give both min_value() and max_value(). I think there should also be a trait that covers "give me the next value" and "give me the previous value" but I don't know offhand of any pre-existing trait that provides this (e.g. range() uses Add and One, but that doesn't work here).

@jdm
Copy link
Contributor

jdm commented Apr 30, 2014

This sort of thing is really only useful if you can then iterate the range [0,max) and get every variant (or [min,max), as you may start from something besides 0).

That's not true; I could use this in Servo today for things like creating a vector holding the buckets for different profiling categories, along with several other similar use cases.

@kemurphy
Copy link
Contributor Author

Yeah, so my use case is actually something like this:

enum NS {
    StructNS,
    ValNS,
    TypeNS,
    ModNS,
}

struct Subscope {
    table: SmallIntMap<NS, DefId>,
}

impl Subscope {
    fn new() -> Subscope {
        Subscope { table: SmallIntMap::with_capacity(num_variants::<NS>()) }
    }
}

That SmallIntMap could easily be a vector. It's more useful as a constant than as a slice, and I think a derived trait is far too much for use cases like that, when all you need to know is how many different distinct values you're going to need.

@lilyball
Copy link
Contributor

A derived trait is much cleaner than a specialized intrinsic. And being able to derive Bounded would produce exactly what you need.

let ns: NS = Bounded::max_value();
Subscope { table: SmallIntMap::with_capacity(ns as uint) }

@kemurphy
Copy link
Contributor Author

Bounded certainly is no good here -- presumably max_value() gives back 3, while I really want 4, and min_value() is totally useless here. Do you mean deriving a trait manually or the compiler doing it automatically? If the former, that's exactly what I'm trying to avoid. If the latter, the compiler needs to inject the extra impl next to the enum in the compilation unit, which hardly sounds cleaner.

@lilyball
Copy link
Contributor

@kemurphy Good point about off-by-one, but you can trivially add 1 to that. And min_value() is only useless if your enum starts at 0. And I mean the compiler doing it using #[deriving(Bounded)] just like it does for e.g. #[deriving(Eq)].

@lilyball
Copy link
Contributor

@kemurphy Here's what the code would look like with that:

#[deriving(Bounded)]
enum NS {
    StructNS = 3, // because maybe it's C compat?
    ValNS,
    TypeNS,
    ModNS,
}

struct Subscope {
    table: SmallIntMap<NS, DefId>,
}

impl Subscope {
    fn new() -> Subscope {
        let (min, max): (NS, NS) = (Bounded::min_value(), Bounded::max_value());
        let count = max as uint - min as uint + 1;
        Subscope { table: SmallIntMap::with_capacity(count) }
    }
}

Note that your proposed intrinsic won't handle this case, because the enum now starts at 3.

@kemurphy
Copy link
Contributor Author

A deriving would be fine, but at the time those are expanded there's no way to associated an enum's type name with its variants. Bounded is still not the answer, in any case.

let (min, max): (NS, NS) = (Bounded::min_value(), Bounded::max_value());
let count = max as uint - min as uint + 1;

That is the most hilariously unfriendly snippet given the task it's trying to accomplish.

Note that your proposed intrinsic won't handle this case, because the enum now starts at 3.

Uh, what? num_variants doesn't care about the values, it just returns the number of variants. Read the patch, please.

@lilyball
Copy link
Contributor

@kemurphy

at the time those are expanded there's no way to associated an enum's type name with its variants

What do you mean? Deriving knows what the variants of the enum it's deriving for are. And even if it can't ask for the actual discriminant values, all it needs is the AST to figure out which variant is smallest or largest.

That is the most hilariously unfriendly snippet given the task it's trying to accomplish.

How is that unfriendly? That's precisely how you'd handle the same enum in C. The classic C solution doesn't need this because the classic C solution has the enums start at 0, but that's not always the case. I've certainly seen C code that looks something like

enum {
    SECTION_A_START = 0x100,
    A1 = SECTION_A_START,
    A2,
    A3,
    SECTION_A_END = A3,

    SECTION_B_START = 0x200,
    B1 = SECTION_B_START,
    B2,
    B3
    SECTION_B_END = B3
    // ...
}

and it needs the same solution to count the number of elements in a given section.

Basically, your num_variants() solution works only when the C-like enum starts at 0. It's unusable otherwise. I explained this before, but you've chosen to completely ignore that.

@kemurphy
Copy link
Contributor Author

I am familiar with the C idiom.

That's precisely how you'd handle the same enum in C.

You just answered your own question ;)

Basically, your num_variants() solution works only when the C-like enum starts at 0. It's unusable otherwise. I explained this before, but you've chosen to completely ignore that.

Again, that's false. I don't care about the values or even the range of values in the enum; I care about how many distinct values there are. Please, please, read the patch -- the intrinsic returns ty::enum_variants(ccx, def_id).len(). It's completely value-agnostic.

@lilyball
Copy link
Contributor

Besides Bounded and my proposed trait that provides .next_value() and .prev_value(), there is actually a use case for a Countable trait that provides a count() method that would be equivalent to your num_values(). However, this would actually be a super trait of Bounded, and its only use case is for when the value has holes.

@kemurphy
Copy link
Contributor Author

Countable would be just fine but it's impossible to implement currently (as is Bounded) -- derivings happen at syntax expansion time, so there's no AST map available to introspect which variants belong to a given enum.

@lilyball
Copy link
Contributor

@kemurphy

I have read the patch. What you seem to be ignoring is the fact that you're trying to add an intrinsic for a niche use-case. That's a bad idea, and it's made worse by the fact that there is a better generalized solution that can be implemented that covers a lot more use cases, that works just fine for you. But you're rejecting it because you consider it to make your code slightly uglier.

@kemurphy
Copy link
Contributor Author

I would prefer the generalized solution if it were possible to implement. This is hardly a niche use case though -- preallocating the necessary size of a map or vector is fairly common practice in the systems world. Refer to @jdm's comment, for example.

@kemurphy
Copy link
Contributor Author

(That is to say, I would prefer Countable for this particular problem.)

@lilyball
Copy link
Contributor

@kemurphy

there's no AST map available to introspect which variants belong to a given enum.

What do you mean? Of course Deriving knows what variants your enum has. It can't possibly work otherwise! What exactly do you think #[deriving(Eq)] is doing? It's not evaluating to

impl Eq for Foo {
    fn eq(&self, other: &Foo) -> bool {
        intrinsics::eq(self, other)
    }
}

It's evaluating to

impl Eq for Foo {
    fn eq(&self, other: &Foo) -> bool {
        match self {
            Foo1 => match other {
                Foo1 => true,
                _ => false
            },
            Foo2 => match other {
                Foo2 => true,
                _ => false
            }
            // ...
        }
    }
}

This very clearly knows what the variants are.

@kemurphy
Copy link
Contributor Author

It looked like derivings happen at the same phase as macro expansion, and it's not available there...?

Is that really what Eq expands to? I guess that would make sense since at that point we have a notion of meta-items and items and how they're attached. If so then I guess it is possible there.

Countable being a super-trait of Bounded though is still wrong.

enum Thing {
    Foo = 42,
    Bar = 1337,
    Quack = 116,
}

And I'm fairly sure 1337 - 42 isn't 3. ;)

@lilyball
Copy link
Contributor

@kemurphy

Countable being a super-trait of Bounded though is still wrong

Only if count() is defined in terms of Bounded. I'm suggesting something like

pub trait Countable : Bounded {
    fn count() -> uint;
}

There could be a default implementation like

pub trait Countable : Bounded {
    fn count() -> uint {
        let min: Self = Bounded::min_value();
        let max: Self = Bounded::max_value();
        max - min + 1
    }
}

which would then be reimplemented for the types that actually do have holes, but I don't know if that's actually a good idea. It would mean that, for a type without holes, you can implement it using impl Countable for Foo {} as long as you already implemented Bounded, but whether or not that's a good idea, I can't answer.

The only real issue with Countable (besides the fact that it's a once-off trait) is the fact that it cannot be implemented on uint or u64, because the count would overflow (even if it's defined to return u64, this would still be a problem). This is actually why I think it might not be a great idea, and why it might be better simply to have Bounded and let you calculate the count yourself (which would only work for types without holes). Alternatively, Countable could just not be implemented on the numeric primitives (after all, it's not particularly useful there). But then it would be a libstd trait that has zero implementing types in libstd, which seems pretty odd.

@kemurphy
Copy link
Contributor Author

I disagree that Countable would be undefined for integer types. Countable would count the range of representable values -- for u64 and i64 both, there are u64::max_value() possible values. The advantage to num_variants is it works on any enum without having to opt into it, but it is totally usable to just have #[deriving(Countable)] enum Foo and later Foo::count().

@kemurphy
Copy link
Contributor Author

I will work on Countable as an alternative to this since apparently the knee-jerk reaction to an intrinsic is that it's dirty and a bad idea. (Not to say that an intrinsic is the best solution here, but it's exactly the kind of thing an intrinsic should do.)

@lilyball
Copy link
Contributor

@kemurphy

or u64 and i64 both, there are u64::max_value() possible values

Nope, there are u64::max_value() + 1 possible values. You forgot about 0. And unfortunately, u64::max_value() + 1 wraps around to 0. We'd need a u128 type for this, except then you'd have the same problem implementing Countable for u128.

@kemurphy
Copy link
Contributor Author

Ah, indeed. Good point.

@lilyball
Copy link
Contributor

@kemurphy Also, as a thought exercise, what would the impl of Countable for f32 do? There is technically a finite number of f32s, but it's not particularly useful to count them. And you also have to consider whether you want to include subnormals in your count, or consider +0/-0 to be distinct.

@kemurphy
Copy link
Contributor Author

Presumably f32 wouldn't implement Countable. It's probably not particularly useful for the integer types to implement Countable, either, which would indeed make it weird since it seems like no other libstd types are likely to derive it.... Which brings us back to the intrinsic.

@lilyball
Copy link
Contributor

I think the most reasonable way to think about Countable is in terms of my hypothetical trait that provides next()/prev(), where count() returns the number of values that can be returned from this trait. Or to put it another way, if you call .next() on Bounded::min_value() count-1 times you'll get a value, but calling it count times will return None. This definition actually does provide a precise answer for how you'd implement it on f32 (although that doesn't mean it should be implemented on f32, I think that's rather pointless). Granted, there's no way to provide a default implementation given this definition. Also, it occurs to me that my suggested default impl using Bounded doesn't work either because you can't coerce arbitrary Bounded values to uint (it works in a concrete impl for a C-like enum because you can cast enum variants to uint).

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

@alexcrichton: ping? Who else should be in on this discussion? If this isn't desired can you or one of the other core devs suggest an approach that is not "implement #[deriving(Bounded)]" or "count your variants manually" that would be accepted instead? Perhaps implementing #[deriving(NumVariants)] or somesuch?

@lilyball
Copy link
Contributor

lilyball commented May 2, 2014

@kemurphy What is your opposition to "implement #[deriving(Bounded)]"? Do you truly feel that saying

let max: MyEnum = Bounded::max_value();
let min: MyEnum = Bounded::min_value();
let count = max as uint - min as uint + 1;

is not a workable solution? I expect that implementing #[deriving(Bounded)] is non-controversial (for C-like enums at least, though non-C-like would work if all fields implement Bounded too). And once it's done, then adding my proposed .next_value()/.prev_value() trait or the Countable trait will make more sense.

As a side note, if we do implement #[deriving(Bounded)] then I think at least the description has to change. Right now it explicitly refers to numbers, but this would generalize it to values that have an upper and lower bound.

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

@kballard: My opposition is that this is not a bounds problem, it is simply a number of variants problem. It doesn't typecheck, one might say. :)

let max: MyEnum = Bounded::max_value();
let min: MyEnum = Bounded::min_value();
let count = max as uint - min as uint + 1;

While this works for my particular use case, in my opinion it is very unwieldy and overly complex. Deriving Bounded also does not work for C-like enums with holes in them, for 2 reasons:

  • You lose the ability to count the variants unless you know where the holes are.
  • Bounded implies the type can take any value between min and max, whereas an enum with holes only takes a subset of those.

Also, generally, deriving Bounded for an enum to me would imply that you could take any old uint and {interpret it as,cast it to} that enum and have things Just Work, which a) violates type safety more than I think is acceptable, and b) also falls apart for enums with holes. Plus the whole fact that the description of Bounded would have to change. It just sounds like an incorrect approach overall to me.

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

in my opinion it is very unwieldy and overly complex.

To elaborate on this, at the very least the expression I'm looking for would be a single function call that doesn't have to do any math (regardless of whether or not it gets constant-folded away anyway). Something like MyEnum::num_variants() is immediately much clearer to my eyes than the let count = ... business above, and I would be very surprised to learn that I were alone in thinking this way.

@lilyball
Copy link
Contributor

lilyball commented May 2, 2014

@kemurphy In a dense C-like enum, calculating the number of variants is identical to calculating the bounds.

Regarding C-like enums with holes:

You lose the ability to count the variants unless you know where the holes are.

True. But is this a use-case that matters right now?

Bounded implies the type can take any value between min and max, whereas an enum with holes only takes a subset of those.

I disagree. Bounded only implies that the type has bounds. It says nothing about whether the type is dense. And in fact I don't think it even makes sense to talk about holes in a Bounded type, because you haven't defined what a hole is. It certainly isn't a representable value that's missing from the type, because by definition, if it's missing from the type, it's not representable. We can talk about holes in a C-like enum because we're actually talking about the discriminant value, which has a larger range than the enum does. But that only holds for C-like enums.

Consider the type u32. You would say it has no holes. I might say it can represent the value 1 and the value 2, but not the value 1.5, therefore it has holes. In fact, by this logic, it has a hole in between every single representable value. I hope you agree that this is a rather absurd way of defining holes, but it's actually what we're doing when talking about C-like enums.

For reference, I think Bounded should be implemented on char. When interpreted as a u32, char is considered to have holes, but when you consider char by itself (and not as a specialized u32) then the idea of a hole disappears.

I guess to make it clearer, the concept of a hole in a type only exists if you also have the concept of addition with that type. You can then define holes as addition operations that result in illegal values (either wrapping or saturating on overflow, to avoid that being considered a hole). But of course addition isn't defined for C-like enums (by default).

@lilyball
Copy link
Contributor

lilyball commented May 2, 2014

@kemurphy Basically what I'm trying to say is, I think #[deriving(Bounded)] is a non-controversial solution to your concrete issue. I think #[deriving(Countable)] is a more elegant solution, but is perhaps more controversial. What I'm pushing for right now is to go ahead and do #[deriving(Bounded)] and to live with the uglier code for the time being, and to consider the more elegant Countable + whatever trait has .next_value()/.prev_value() as something to do later.

Granted, if you can get buy-in to do #[deriving(Countable)] right now, then that's fine. I just think it's not as clearly correct (mostly because we don't have the trait at the moment, and nothing in libstd would even use it).

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

In a dense C-like enum, calculating the number of variants is identical to calculating the bounds.

Agreed.

True. But is this a use-case that matters right now?

No, but ideally the solution here would be identical to and work for such enums. If it solves my use case, it should solve all other use cases too. I'd like to preserve at least a modicum of generality across all C-like enums. :)

I disagree. Bounded only implies that the type has bounds. It says nothing about whether the type is dense. And in fact I don't think it even makes sense to talk about holes in a Bounded type, because you haven't defined what a hole is. It certainly isn't a representable value that's missing from the type, because by definition, if it's missing from the type, it's not representable.

This explanation is heavily dependent on that being the way you define Bounded, but I agree on all fronts.

I think #[deriving(Countable)] is a more elegant solution, but is perhaps more controversial.

I also agree, but will add that I don't think I should be required to derive Bounded in order to derive Countable, when I don't care about the bounds (and again we get back to how that solution still doesn't work for enums with holes).

@lilyball
Copy link
Contributor

lilyball commented May 2, 2014

No, but ideally the solution here would be identical to and work for such enums.

The need for calculating the number of variants of a C-like enum that has holes is much less obvious than the need for calculating the number of variants of a dense enum. A C-like enum that has holes has them because the discriminant has been explicitly set for various variants, and such an enum seems unlikely to be used in a scenario where the variant count alone is enough information (which, honestly, seems really only useful for setting the capacity of a key-value map).

For comparison's sake, in C, needing the number of variants is generally used because the code wants to calculate all the variant values.

I don't think I should be required to derive Bounded in order to derive Countable

Any type that is Countable should also be able to be Bounded. But I suppose there's no real need to make Countable : Bounded.


Here's another suggestion. A derivable trait that provides a method that returns a &'static [T] containing all the variants. This would only work for C-like enums (whereas #[deriving(Bounded)] would work for all enums as long as the fields also implement Bounded), but it would provide not only your desired number of variants, but also a way of iterating over all the distinct variant values.

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

A C-like enum that has holes has them because the discriminant has been explicitly set for various variants, and such an enum seems unlikely to be used in a scenario where the variant count alone is enough information (which, honestly, seems really only useful for setting the capacity of a key-value map).

While a fair point, and utility aside, it still feels wrong to me to provide a "get the number of variants" solution that a) depends on the uint range of values and b) only works for some C-like enums and not others.

Any type that is Countable should also be able to be Bounded.

I should clarify -- I agree with the Bounded bound on Countable; I would like to see a #[deriving(Foo)] where Foo behaves like Countable for all C-like enums without having to specify Bounded.

Here's another suggestion. A derivable trait that provides a method that returns a &'static [T] containing all the variants.

This is something I can get behind. What would the name of such a trait be?

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

Another alternative may be a separate attribute entirely like #[num_variants=NUM_SHAPES], which would emit a static NUM_SHAPES: uint = 3;, or whatever. The benefit here is now this can be used in other statics since it is a compile-time constant and not the result of a function call.

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

Something like that also means we don't have to add an unused trait to libstd.

@lilyball
Copy link
Contributor

lilyball commented May 2, 2014

Associated values is the real solution here, so you can just say Foo::NUM_VALUES. But we don't support that. (I believe this associated value would still need to be #[derive(..)]d, presumably by declaring it on some trait).

Until such time as we get associated values, I think trying to emit a static is a bad API.

So right now, I think the best solution is the trait that returns &[T]. But the question of naming is a tough one. Maybe something like

pub trait EnumValues {
    fn enum_values() -> &'static [Self];
}

Although since this only works for C-like enums, it would be nice to put CLike in the name. But I think CLikeEnumValues is just a bit too unwieldy.

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

Yeah, associated values is the real answer here. For the name of the trait, what about #[deriving(Discrete)]?

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

pub trait Discrete {
    pub fn domain() -> &'static [Self]; // or pub fn values()
}

@lilyball
Copy link
Contributor

lilyball commented May 2, 2014

Why Discrete? I can't tell from that name alone what it's supposed to represent. My best guess is it represents a type that has discrete values, but isn't that all types? The trait we're talking about here really only makes sense for C-like enums, and I think the name should reflect that somehow.

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

Hm, yeah Discrete probably isn't the right word I was looking for something like Finite, I guess? With Discrete I was thinking like "discrete set of values"; the integral types also apply but in that case the trait would look more like

pub trait Discrete {
    pub fn values<T:Iterator>() -> T<Self>;
}

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

Enumerable, maybe?

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

pub trait Enumerable {
    fn values() -> &'static Self;
    fn iter<T: Iterator>() -> T<Self>;
}

@lilyball
Copy link
Contributor

lilyball commented May 2, 2014

That iter() function doesn't work. We don't have higher-kinded types. You'd have to say

pub trait Enumerable<T: Iterator<Self>> {
    fn values() -> &'static Self;
    fn iter() -> T;
}

and then the implementation would have to provide the concrete iterator type.

I don't think this is worthwhile. The same iterator can be accomplished just by calling .iter().map(|&x| x) on the result of values().

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

Oh, derp, I knew that. Oops. Yeah only values() is necessary.

@alexcrichton
Copy link
Member

The more that this progresses into "let's add some more deriving modes" territory, the more this is progressing into "this needs an RFC" territory. There are many ideas being discussed here, and I don't want to have something slide in without general agreement.

Is the opinion so far that the intrinsic may not be the best route to go?

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

@alexcrichton: My opinion is that for things like my use case, the deriving is more tasteful and probably the preferable solution. That being said, I think the intrinsic is still useful from a reflection perspective, but I'm okay with letting it go in favor of the trait.

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

I can throw together a PR for the trait/deriving, and if comments seem to indicate that an RFC is necessary I can write one up. Does that work?

@alexcrichton
Copy link
Member

The scope of this new deriving mode sounds large enough to require going through the RFC process first, but you're welcome to sketch it out and see how it goes!

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

Sure thing. As for this one, r? for reflection or should I just close it?

@alexcrichton
Copy link
Member

It sounds like the opinion is that the deriving-based solution should be attempted first to see how it compares to the intrinsic-based solution.

If it's ok with you as well, I'm going to close this for now, but feel free to reopen if the deriving-based solution doesn't work out!

@kemurphy
Copy link
Contributor Author

kemurphy commented May 2, 2014

Works for me, thanks!

arcnmx pushed a commit to arcnmx/rust that referenced this pull request Jan 9, 2023
fix a bunch of clippy lints

fixes a bunch of clippy lints for fun and profit

i'm aware of this repo's position on clippy. The changes are split into separate commits so they can be reviewed separately
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants