Efficient single inheritance #142

Closed
wants to merge 3 commits into from

nrc
Member

@nrc nrc commented Jun 26, 2014

Efficient single inheritance via virtual structs and unification and nesting of structs and enums.

@emberian
Member

I think this is a relatively conservative but coherent way of adding single inheritance.

/cc @steveklabnik

@pepp-cz

pepp-cz commented Jun 26, 2014

Please, no virtual structs, they are so un-Rustic! Let me explain ...

While reading the Swift tutorial, I realized a very important difference between Rust and Swift. In Swift, structs and classes behave differently. Structs are passed by value and use static dispatch for their methods. Classes are constructed on the heap, passed by reference, and use dynamic dispatch. The behaviour is chosen by the creator of the definition, and the user has no way to change it. Rust, on the other hand, leaves these decisions to the user. The user can create a struct on the stack or the heap, and use Rc, Arc, Gc or anything else to manage the lifetime of the object. The user can use type parameters for static dispatch and trait objects for dynamic dispatch. Structs, instantiation and dispatch are perfectly orthogonal to each other. But virtual structs are not.
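
To make the orthogonality argument concrete, here is a minimal sketch in present-day Rust (not the 2014 syntax used in this thread). The names `Drawable`, `Circle`, `draw_static` and `draw_dynamic` are made up for illustration. The same plain struct is placed on the stack, in a `Box` and in an `Rc`, and is called through both static and dynamic dispatch, all chosen at the use site.

```rust
use std::rc::Rc;

// Hypothetical trait and type, for illustration only.
trait Drawable {
    fn draw(&self) -> String;
}

struct Circle {
    radius: f64,
}

impl Drawable for Circle {
    fn draw(&self) -> String {
        format!("circle r={}", self.radius)
    }
}

// Static dispatch: the function is monomorphised for each concrete T.
fn draw_static<T: Drawable>(item: &T) -> String {
    item.draw()
}

// Dynamic dispatch: the call goes through the trait object's vtable.
fn draw_dynamic(item: &dyn Drawable) -> String {
    item.draw()
}

fn main() {
    // The user of Circle, not its author, picks the storage.
    let on_stack = Circle { radius: 1.5 };
    let on_heap = Box::new(Circle { radius: 2.5 });
    let shared = Rc::new(Circle { radius: 3.5 });

    println!("{}", draw_static(&on_stack));
    println!("{}", draw_dynamic(&*on_heap));
    println!("{}", draw_dynamic(&*shared));
}
```

Nothing in the definition of `Circle` commits it to a particular allocation strategy or dispatch mode, which is the property the comment above is defending.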

EDIT: I have found out that similar approach was suggested in #9 so the rest of the comment is obsolete.

I have an idea for how to deal with this, but I have not written an RFC yet because I do not know how DST should work.

The idea is that virtual structs are basically useful only for DOM-like data structures (generally, polymorphic graph data structures). So we provide an object wrapper and a type-erasing reference type in std that can be used to build such structures. Then you could add any type implementing a particular trait to the structure, even primitive types such as int or float. The decision is in the hands of the user.

Very schematically:

struct<TRAIT, TYPE : TRAIT> PolyObject {
  vtable : *c_void; // pointer to TRAIT's vtable for TYPE
  obj : TYPE;
}

struct<TRAIT> PolyRef {
  ref : *c_void;
}

Then you can wrap instances in a PolyObject, which can build a PolyRef to itself, and store the PolyRefs in the data structure. Use a PolyRef to build a trait object when you need to work with the object.

The tricky part is making this play nicely with smart pointers. Here I lack knowledge about DST, which can probably be used for that. For example, if PolyObject is a type-erased DST wrapper of an object, it can be boxed into a smart pointer, and PolyRef would not be needed. However, I do not know if this can be done.

The code could look like this:

trait Drawable {
  fn draw();
}

impl Drawable for Circle {...}
impl Drawable for Square {...}

let mut v = Vec::<Rc<PolyObject<Drawable>>>::new();
v.insert(Rc::new(PolyObject::new::<Drawable>(Circle::new())));
v.insert(Rc::new(PolyObject::new::<Drawable>(Square::new())));

for o in v {
  o.deref().draw();
}

With some typedefs and Unref impl it can be made more readable.

@emberian
Member

Type erasure is what you don't want here. What you described is exactly
trait objects and DST. That's orthogonal to inheritance. Inheritance, as
described here, does not remove the decision from the user.
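
For readers coming to this thread later: the `PolyObject` sketch from the earlier comment corresponds closely to what trait objects behind smart pointers provide in present-day Rust. The following is a minimal sketch under that assumption; `Drawable`, `Circle`, `Square`, `scene` and `draw_all` are illustrative names, and the syntax (`dyn`, `push`) is modern rather than what existed in 2014.

```rust
use std::rc::Rc;

// Hypothetical trait and types, for illustration only.
trait Drawable {
    fn draw(&self) -> String;
}

struct Circle;
struct Square;

impl Drawable for Circle {
    fn draw(&self) -> String {
        "circle".to_string()
    }
}

impl Drawable for Square {
    fn draw(&self) -> String {
        "square".to_string()
    }
}

// Rc<dyn Drawable> plays the role of Rc<PolyObject<Drawable>>:
// a reference-counted, type-erased value with a vtable in the pointer.
fn scene() -> Vec<Rc<dyn Drawable>> {
    let mut v: Vec<Rc<dyn Drawable>> = Vec::new();
    v.push(Rc::new(Circle));
    v.push(Rc::new(Square));
    v
}

fn draw_all(items: &[Rc<dyn Drawable>]) -> Vec<String> {
    items.iter().map(|o| o.draw()).collect()
}

fn main() {
    for line in draw_all(&scene()) {
        println!("{}", line);
    }
}
```

Note that `Circle` and `Square` share no base type here; the only thing relating them is the trait.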

@pepp-cz

pepp-cz commented Jun 26, 2014

Do trait objects carry the underlying object by value? If so, then they are what I have described. But the core message of my comment was different: do not introduce virtual methods on structs; use trait objects instead.

If I read it correctly, (virtual) structs have a vtable pointer attached to them. That is exactly what I meant to advise against. If you use such a struct in a monomorphic data structure, then the pointer is of no use. Here the creator of the struct has decided, and the user cannot change the decision. If the struct were just a regular struct, the user could use it directly in a monomorphic structure or via a trait object in a polymorphic one. You could build such a structure from unrelated types of objects, no inheritance required. It should be more flexible. Inheritance would be just a tool for code reuse.
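
The cost argument can be observed with `std::mem::size_of` in present-day Rust. This is a sketch with made-up names (`Drawable`, `Circle`); the sizes asserted hold on current compilers, where a reference to a trait object is two pointers wide, but the language reference only promises that such pointers are at least pointer-sized.

```rust
use std::mem::size_of;

// Hypothetical trait and type, for illustration only.
trait Drawable {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

impl Drawable for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

fn main() {
    // A plain struct stores only its fields: no hidden vtable pointer.
    println!("Circle:         {} bytes", size_of::<Circle>());
    // A reference to the concrete type is a thin pointer.
    println!("&Circle:        {} bytes", size_of::<&Circle>());
    // The vtable pointer lives in the reference, and only when the user
    // opts into dynamic dispatch.
    println!("&dyn Drawable:  {} bytes", size_of::<&dyn Drawable>());

    let c = Circle { radius: 1.0 };
    let d: &dyn Drawable = &c;
    println!("area via vtable: {}", d.area());
}
```

A `Vec<Circle>` therefore pays nothing for polymorphism it does not use, whereas a struct with an embedded vtable pointer would carry it in every element.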


Open question: should we use `()` or `{}` when instantiating items with a mix of
named and unnamed fields? Or allow either? Or forbid items having both kinds of
fields.
Contributor

I think we should constrain ourselves to {} when it comes to sort of named arguments. In general I was thinking something like this:

let x: E2 = E2 { f: 34, Variant2(23) };

That is, the current struct literal would be extended to a general data type literal. The list in the braces would then contain zero or more fields (<id>: <expr>), zero or one variant (<path> or <path>(...) or <path> { ... }), and zero or one struct extension (..<expr>), in this exact order. The existing simple variant, say, Some(42), would be shorthand for Option { Some(42) }. This is consistent with how the struct extension syntax would work with such struct-like enums (let y = E2 { Variant, ..x }; let z = E2 { f: 42, ..x };).

@CloudiDust
Contributor

IMHO, making enums and structs "unified but subtly different" is very confusing and not orthogonal at all. Also "enums with fields" are no longer enums, if we ever go down this road (which I am against), "classes" would be a somewhat better name (and we don't actually want a second C++, do we?)

Inheritance should only be used in a handful of scenarios as a last resort, and I don't suggest making such changes to the language only for it. How much can we strip out of this to get down to the bare minimum of changes necessary?

@dobkeratops

copying c++ single inheritance doesn't seem bad to me; rust fixes the serious problems of unsafety and headers.
It wouldn't need multiple inheritance because it's got traits, and enums handle an orthogonal set of cases.

increasing the subset of existing C++ code you can interact with directly would be a good thing, IMO, for getting greater adoption.

I'm surprised there wasn't more interest in the idea of generalizing how trait objects work though... leverage the existing way of representing vtables- provide some safe sugar for the internal vtable hack that is currently possible in rust.

Example (in pseudo-code):

```
class Element {
```
Member

This example is ancient, confusing, and potentially misleading. It should just be removed, or updated to look like a real language in existence.

@CloudiDust
Contributor

@dobkeratops : The problem is, if Rust provides inheritance in the core language, people will use and inevitably overuse it for things where better solutions are available. (And for most use cases, inheritance is not the best answer.) Maybe keeping this behind a feature gate (even after it becomes stable) can make things better?

> and enums handle an orthogonal set of cases.

But then, inheritance as proposed in this RFC seems to require extending and unifying enums and structs, which would then differ mainly in how they would be passed, and how much space they would take. That doesn't seem very orthogonal to me. Yes, they will be "typically" used for different cases, but there would be no guarantee.

Rust used to have a condition system, but it was later removed to simplify the language. One of the reasons for the change, I believe, was that conditions were not quite orthogonal to Options and Results. (Correct me if I am wrong.)

@emberian
Member

The condition system wasn't part of the language. It was a macro plus some
clever code. They were removed because they were unused and a pain to
maintain, as well as being hard to explain and usually inferior to the
type-based approach.

@pepp-cz

pepp-cz commented Jun 26, 2014

@dobkeratops
Are you referring to #9 ? The last comment shows a lot of interest.

@ben0x539

This looks much more complicated than what I had in mind when people talked about unifying structs and enums. Also I'm surprised there's vtables involved but it's a completely distinct system from traits.

Drop is marked `inherit`.

It is the programmer's responsibility to call `drop()` for outer-items from the
impl for the inner item, if necessary.
Contributor

And when they forget?

@bill-myers

Using "struct" and "enum" to denote being sized or not seems far less natural than using "struct" to denote instantiable leaf nodes, and "enum" to denote non-instantiable non-leaf nodes, and using the "unsized" keyword to denote unsizedness if needed.

Note also that you can represent enum variants using their minimal size just fine, as long as copies from a dereferenced enum pointer first determine the actual variant size instead of doing a blind memcpy of the whole enum size.
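
For comparison, this is how enum sizing behaves in Rust as it exists today: every value of an enum type occupies enough space for the largest variant, regardless of which variant it holds. The sketch below uses a made-up `Shape` enum and only asserts a lower bound, since the exact layout and discriminant placement are left to the compiler.

```rust
use std::mem::size_of;

// Hypothetical enum, for illustration only.
#[allow(dead_code)]
enum Shape {
    Point,                          // no payload
    Circle(f64),                    // 8 bytes of payload
    Rectangle(f64, f64, f64, f64),  // 32 bytes of payload
}

fn main() {
    // Every Shape value, even Shape::Point, is at least as large as the
    // largest variant's payload.
    println!("size_of::<Shape>() = {}", size_of::<Shape>());
    println!("largest payload    = {}", 4 * size_of::<f64>());
}
```

The suggestion in the comment above would instead let a pointed-to `Shape::Point` occupy only its own minimal size, at the cost of copies having to inspect the variant first.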

The real semantic issue of "unsized" vs "sized" happens when you can't represent something as constant-sized because the size of variants is unbounded: this is the case if you allow to introduce generic parameters in variants (and then declare a field of that type), and if you allow to inherit outside the crate.

Also, having a vtable pointer or an enum discriminant is semantically equivalent (you can match on the vtable pointer value, and you can lookup virtual functions in an array with an enum discriminant), so I don't think this should be a fundamental part of the semantics.
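
The equivalence claimed here can be sketched in present-day Rust by writing the same operation twice: once as a `match` on an enum discriminant and once as a call through a trait object's vtable. All names (`Area`, `Circle`, `Rect`, `Shape`, `area_enum`, `area_dyn`) are invented for this illustration.

```rust
// Open set of types: dispatch through a vtable.
trait Area {
    fn area(&self) -> f64;
}

struct Circle(f64);
struct Rect(f64, f64);

impl Area for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

impl Area for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

fn area_dyn(shape: &dyn Area) -> f64 {
    shape.area()
}

// Closed set of variants: dispatch by matching on the discriminant.
enum Shape {
    Circle(f64),
    Rect(f64, f64),
}

fn area_enum(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Rect(w, h) => w * h,
    }
}

fn main() {
    println!("{}", area_dyn(&Rect(2.0, 3.0)));
    println!("{}", area_enum(&Shape::Rect(2.0, 3.0)));
    println!("{}", area_dyn(&Circle(1.0)));
    println!("{}", area_enum(&Shape::Circle(1.0)));
}
```

Both forms select code based on a per-value tag; they differ mainly in whether the set of cases is open or closed, and in where the tag is stored.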

Making non-leaf structs both overridable and instantiable seems problematic, especially if you make them unsized, since it means that obvious code like let x = Struct {...} is invalid, since Struct is unsized, despite the fact that let x = box Struct {...} would be valid.

Overall, I think the struct/enum design in #11 is better, although I think the method dispatch in #11 vs the traditional Java-like method dispatch (with the ability to override implemented methods in addition to abstract ones) proposed here and elsewhere is more a matter of taste.

Do we need multiple inheritance? We _could_ add it, but there are lots of design
and implementation issues. The use case for multiple inheritance (from bz) is
that some DOM nodes require mixin-style use of classes which currently use
multiple inheritance, e.g., nsIConstraintValidation.
Contributor

And if we decide not to implement anything akin to multiple inheritance, are we screwed? I sure hope the DOM isn't that tightly coupled to C++-specific design decisions.

@bstrie
Contributor

bstrie commented Jun 26, 2014

I would like to see @pcwalton weigh in on how this proposal would satisfy Servo's requirements. Furthermore, it would be very premature to accept any proposal of this magnitude until @nikomatsakis gets back from vacation and has a chance to consider it.

@bstrie
Contributor

bstrie commented Jun 26, 2014

One thing that I don't see mentioned concretely: is this change entirely backwards-compatible? I'm not asking from a wait-till-after-1.0 standpoint (though keeping it gated till afterwards would be nice, if possible), I'm asking because I want to know whether or not our current tutorial material regarding structs and enums will continue to be relevant. It would be nice if the current "division" between structs and enums would suffice to teach people the semantic differences (as well as introduce them to tagged unions with a minimum of noise), and only once they grok these two concepts do they need to be introduced to this advanced material.

@bstrie
Contributor

bstrie commented Jun 26, 2014

One more thing: I'm concerned that this is a lot of complexity in order to support such an allegedly narrow use case. In my gut I feel like there's more simplification that could be done here, though unhelpfully I don't have any suggestions.

@emberian
Member

Structs and enums don't change very much from their current meaning, they are just extended to newer meanings. I agree that this seems to be a lot of complexity though.

@rpjohnst

This adds a lot of complexity and I'm not sure it would actually be useful anywhere. I'd think something like #91, combined with #9, is more the Right Thing for Rust. It completely satisfies Servo's requirements and is much simpler.

@Ericson2314
Contributor

The only use case repeatedly brought up is the DOM, which I don't think is a good motivating example. The DOM is not a fundamental problem; it is an existing solution designed around other languages' strengths and weaknesses. The goal for Rust should be to have some solution to the problems it tackles, not to support every other language's methods of solving those problems.

Trait objects already let one express heterogeneous data structures like the DOM, the problem is just performance. I think a series of optimizations and additional representation choices would be enough to get what Servo needs out of trait objects, for example:

  • fat objects, as in the RFC for "fat objects" for DSTs (#9)
  • optimizing getters and setters into actual field offsets if all types implementing the trait have the field at the same offset
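
The pattern these optimizations would target can be sketched in present-day Rust as a trait whose accessor exposes a shared "base" struct embedded in each implementor. This is only an illustration with hypothetical names (`NodeBase`, `Node`, `Element`, `Text`, `ids`); it is not Servo's actual DOM, and Rust as it stands calls `base()` through the vtable rather than turning it into a fixed field offset.

```rust
// Hypothetical DOM-like types, for illustration only.
struct NodeBase {
    id: u32,
}

trait Node {
    // Accessor standing in for an inherited field.
    fn base(&self) -> &NodeBase;
    fn kind(&self) -> &'static str;
}

struct Element {
    base: NodeBase,
    tag: &'static str,
}

#[allow(dead_code)]
struct Text {
    base: NodeBase,
    content: String,
}

impl Node for Element {
    fn base(&self) -> &NodeBase {
        &self.base
    }
    fn kind(&self) -> &'static str {
        self.tag
    }
}

impl Node for Text {
    fn base(&self) -> &NodeBase {
        &self.base
    }
    fn kind(&self) -> &'static str {
        "#text"
    }
}

// Works over any mix of node types through trait objects.
fn ids(nodes: &[Box<dyn Node>]) -> Vec<u32> {
    nodes.iter().map(|n| n.base().id).collect()
}

fn main() {
    let nodes: Vec<Box<dyn Node>> = vec![
        Box::new(Element { base: NodeBase { id: 1 }, tag: "div" }),
        Box::new(Text { base: NodeBase { id: 2 }, content: "hello".to_string() }),
    ];
    for n in &nodes {
        println!("{} {}", n.base().id, n.kind());
    }
    println!("{:?}", ids(&nodes));
}
```

The heterogeneous structure is already expressible; what the list above asks for is a cheaper representation of it.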

Maybe macros or a compiler plugin could be used to fake inheritance on top of this, but I'd worry that putting it in the language itself will add a lot of complexity to the language, and be something new users flock to out of familiarity when there are better ways to get the job done.

@lilyball
Contributor

@Ericson2314

> The only use case repeatedly brought up is the DOM, which I don't think is a good example motivating situation.

As far as Mozilla is concerned, the primary goal of Rust is to be a platform for Servo. Servo cares very much about making the DOM as efficient as it can. This leads to the conclusion that the DOM is in fact the only motivating situation that truly matters for this case.

@bstrie
Contributor

bstrie commented Jun 26, 2014

@kballard, I don't disagree. It's very important to cater to issues that Servo has encountered. However, by failing to look for other use cases, we may inadvertently be designing ourselves into a corner. It would be valuable to consider other domains where this feature may be desirable, and ensure that our design supports those use cases cleanly (or not, if we'd prefer that they use some other Rust mechanism instead).

That said, if this feature is literally useful only because of the need for Servo to support the DOM, and no other Rust code ever written could conceivably ever ever want to use this, then I'd prefer a hacky gated solution to something far-reaching and complex.

@zwarich

zwarich commented Jun 26, 2014

A few thoughts:

  1. It doesn't make sense to limit inheritance to a single module for efficiency of compilation reasons when Rust's compilation unit is a crate. This would help for Servo's use case, because the DOM is defined in separate modules with a fairly flat hierarchy in a single crate. To initialize a struct with private fields in a different module, you would be forced to export a constructor function, but isn't abstraction over representation the reason you wanted to make the fields private in the first place?

  2. Is there really going to be no need for subtyping in the DOM or any other use case of this feature? Past experience with web browsers suggests that coercions will not be enough, but maybe the number of unsafe downcasts won't be excessive? At some point I would think that a covariant return type would be desired.

    If trait objects are going to have subtyping in the future then it would be strange for these objects to not have subtyping. Also, despite my personal aversion to subtyping, we already have to deal with it in the type checker for lifetimes, so exploiting that for elimination of fallible downcasts might be a good idea.

  3. I feel like this proposal pushes the enum / struct syntax past its logical breaking-point. The whole point of that syntax is to be reminiscent of other 'curly braces' family languages. This proposal makes it so dissimilar from its antecedents that it might be better to come up with something new.

  4. Having a system where destructors are not called (either implicitly, or explicitly but with a check) on parents will mean that this feature is more unsafe (in terms of memory leaks) than the equivalent C++ code. That is undesirable. Anything inherited that will be used polymorphically needs a virtual destructor, so if we are singling out inheritance we should force anything that is inherited from to have a virtual destructor.
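
As a point of comparison for the destructor concern in item 4, this is how present-day Rust handles nested data when a "base" is embedded by composition: after the outer value's `Drop::drop` runs, each field is dropped automatically, so the inner destructor cannot be forgotten. This is a sketch with hypothetical names (`Base`, `Derived`, `drop_order`), and it shows composition, not the inheritance mechanism proposed in the RFC.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log so the drop order can be observed after the fact.
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Base {
    log: Log,
}

impl Drop for Base {
    fn drop(&mut self) {
        self.log.borrow_mut().push("Base");
    }
}

struct Derived {
    log: Log,
    // The "parent" is an ordinary field here.
    #[allow(dead_code)]
    base: Base,
}

impl Drop for Derived {
    fn drop(&mut self) {
        // No explicit call to drop `base` is needed or allowed.
        self.log.borrow_mut().push("Derived");
    }
}

fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _d = Derived {
            log: log.clone(),
            base: Base { log: log.clone() },
        };
        // _d goes out of scope here: Derived::drop runs, then its fields
        // are dropped, which runs Base::drop.
    }
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order());
}
```

An inheritance design that requires the programmer to call the parent's destructor manually would give up this guarantee.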

@Ericson2314
Contributor

@kballard @bstrie Yeah, the Servo team should never need to sacrifice performance. But if they are doing something very esoteric, which the DOM may or may not count as, better that they (or any single project, for that matter) sacrifice elegance than make everybody else do so.

@zwarich

zwarich commented Jun 27, 2014

Now that I think about it some more, the subtyping question can probably be set aside for the moment, since extending this proposal with subtyping is backwards compatible.

in pointer). enum values have the size of the largest variant plus a
discriminator (modulo optimisation). Struct values have their minimal size. For
example,


So, enums are C# structs and structs are C# classes? This seems extremely confusing to me. It would also be very strange for me to see code like e.g.:

enum Vec3 { x : f32, y : f32, z : f32 }

Contributor

Sorry I should have left a line note not a general comment. The names would indeed be very confusing if unification happens, as structs and enums would be more like haskell's data, not their namesakes from the C-derived languages.

Contributor

@CloudiDust, I agree that it's not intuitive, but consider this: Swift adopted our same distinction between enums and structs, and this time next year Swift will be the fifth-most-popular programming language in the world. I agree that data would be philosophically cleaner, but we'd still need to provide a way to opt into either enum-sizing or struct-sizing, and it may be a misguided effort if Swift sets a precedent that we can exploit.

Contributor

@bstrie, Swift doesn't try to unify them I guess? And if we do the unification, our "enums" and "structs" would not be like Swift's (or anyone else's) any more. Principle of least surprise, right?

I think we can just say "data is like Swift's enums and structs rolled into one and you can naturally nest them." (Or tell the reader to learn Haskell and write his/her own monad tutorial ;)

Using enum and struct to differentiate sized-ness is also weird. I think a dedicated unsized keyword would be a better choice. (And this keyword would be introduced by DST anyway.)

Or, can we leave enum and struct as-is, and find some other way to do inheritance?

Is it possible to provide building blocks in the language, but put inheritance support in the library? Like with smart pointers and tasks? People are talking about #9 and #91, are they something like that?


> Is it possible to provide building blocks in the language, but put inheritance support in the library?

you can already hack internal vtables now with unsafe code,

I liked the idea of adding some safety around that e.g. a get_vtable(),a 'VTable' intrinsic type, make_vcallable(vtable_ptr,data_ptr) /* with compatibility check the same as for trait objects */ .. This might be a nice gradual change that would let people handle many cases better , and allow experimenting with unusual things ... a single object with multiple interfaces, using the vtable pointer as state information (a table of event handlers), or "class objects", one vtable associated with a collection of several instances, or 'compressed vtable information' imagine a u8 index to consult an array of known types, if you don't need it completely open.. whatever.

But, i think it would also be good for rust to just copy C++ single inheritance/internal vtable classes , functions defined inside the 'class' its' vtable layout, also allow classes as type-bounds, ... in turn increase the subset of C++ that it's easy to bind to and gain more easy adoption.

Just look how Swift has instant mainstream popularity due to compatibility with existing objective-C frameworks.

Contributor

@dobkeratops Being able to bind to C++ libraries more easily is very nice, but we would face the risk of people overusing inheritance in other code unrelated to its best use cases. If we can put inheritance support in the library, like how we have Cell in the library, then we can send the message that inheritance is a valuable tool for exceptional situations.

There is always a trade off.

Can a library-based inheritance solution be used to simulate C++'s semantics to some extent, and bind to C++ libraries, with the help of a binding generator/macro? C++ also has default arguments and ad-hoc function overloading, which Rust has not, so at least for now, I am of the opinion that we need a bindgen anyway.

@bstrie, I think @nick29581 actually prefers unifying enum and struct under the data banner, maybe he has such a proposal (at least in his mind)?

EDIT: And I admit that I was fighting misleading names with a misleading keyword. ;)

Contributor

@rpjohnst That sounds very neat.

Contributor

I'd even be OK with these:

  1. forall A, B: B has exactly A's first N fields => Coerce<Fat<U, A>, Fat<U, B>>
  2. forall A, B, U: Coerce<A, B> => Coerce<Fat<U, A>, Fat<U, B>>
  3. Any DST with a vtable (fat pointer or Fat) should be able to support downcasting (to an option).

I don't think "forall A, B: B has exactly A's first N fields => Coerce<A, B>" is safe, because the polymorphic instances (like my 2nd one, Coerce<A, B> => Coerce<&A, &B>, etc.) rely on A and B having the same size.
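
Item 3 above, downcasting to an option, can be sketched with what the standard library offers today: `std::any::Any::downcast_ref` returns `Option<&T>`. The `Node` trait, its `as_any` helper, and the `Element` and `Text` types are hypothetical names for this illustration, and this is not the `Fat`/`Coerce` machinery discussed in the thread.

```rust
use std::any::Any;

// Hypothetical trait: as_any exposes the value as &dyn Any so that
// callers can attempt a checked downcast.
trait Node {
    fn as_any(&self) -> &dyn Any;
}

struct Element {
    tag: &'static str,
}

#[allow(dead_code)]
struct Text {
    content: &'static str,
}

impl Node for Element {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Node for Text {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Fallible downcast: Some(tag) for elements, None for anything else.
fn tag_of(node: &dyn Node) -> Option<&'static str> {
    node.as_any().downcast_ref::<Element>().map(|e| e.tag)
}

fn main() {
    println!("{:?}", tag_of(&Element { tag: "div" }));
    println!("{:?}", tag_of(&Text { content: "hi" }));
}
```

The check is done at run time against the value's type identity, so a wrong guess yields `None` rather than undefined behaviour.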


@bstrie Swift may be fifth-most-popular because of strong backing by Apple and because it targets a different kind of application than Rust does. Rust is a low-level language, and the user should have maximum control over everything. The user should also pay only for the features they need. Different behaviour for structs and classes is IMO a bad feature of Swift and definitely something that should not be copied to Rust.

Why?

  1. It dictates to the user of a type how it must be created. You cannot have a Swift class by value on the stack, ever. You pay for what you do not need. Personally, I would be very disappointed to find out that a Rust library declares a simple type as a struct and therefore I am forced to allocate it on the heap.

  2. It is not clear whether a parameter is passed by value or by reference from looking at a function call or a function definition. That can be a source of nasty bugs where you alter only a local copy by accident. However, Rust would probably not share this weakness with Swift, as structs would still be explicitly boxed.

  3. Rust has a polymorphism system orthogonal to types, yet this proposal ties polymorphism to the type hierarchy. A struct would be suited to one particular polymorphic data structure but not to any other. You have to build a type hierarchy for every data structure from the ground up.

Proposal #9 is a strictly superior solution.

  1. You can create any struct or enum on the stack or the heap, or turn it into a fat object if you want to.
  2. You can turn even an int or a float into a fat object, and you can do that for different traits in different places. You can mix completely independent types in a polymorphic collection. You have the freedom to do that anywhere, and you have the power to do that for every type. And it is completely independent of inheritance, which you may or may not use.
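
The second point can be illustrated with trait objects in present-day Rust, which already allow primitives and unrelated types to sit side by side in one polymorphic collection, and allow the same type to be viewed through different traits in different places. This sketch uses `Box<dyn Trait>` rather than the fat objects of proposal #9, and `render` is a made-up helper name.

```rust
use std::fmt::{Debug, Display};

// Renders every item through the Display vtable.
fn render(items: &[Box<dyn Display>]) -> Vec<String> {
    items.iter().map(|item| item.to_string()).collect()
}

fn main() {
    // Unrelated types, including primitives, in one collection.
    let shown: Vec<Box<dyn Display>> = vec![
        Box::new(42_i32),
        Box::new(2.5_f64),
        Box::new("text"),
    ];
    println!("{:?}", render(&shown));

    // The same i32 type viewed through a different trait elsewhere.
    let debugged: Vec<Box<dyn Debug>> = vec![
        Box::new(42_i32),
        Box::new(vec![1, 2, 3]),
    ];
    for d in &debugged {
        println!("{:?}", d);
    }
}
```

None of these types was declared with polymorphism in mind; the choice is made where the collection is built.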

Contributor

+1
On Jun 28, 2014, at 3:54 AM, "pepp-cz" wrote:

In 0000-virtual.md:

+Non-leaf struct values are unsized, that is they follow the rules for DSTs. You
+cannot use non-leaf structs as value types, only pointers to such types. E.g.,
+(given the definition of S1 above) one can write x: S2 (since it is a leaf
+struct), x: &S2, and x: &S1, but not x: S1. Struct values have their
+minimal size (i.e., their size does not take into account other variants). This
+is also current behaviour. Pointers to structs are DST pointers, but are not
+fat. They point to a pointer to a vtable, followed by the data in the struct.
+The vtable pointer allows identification of the struct variant.
+
+To summarise the important differences between enums and structs: enum objects
+may be passed by value where an outer enum type is expected. Struct objects may
+only be passed by reference (borrowed reference, or any kind of smart or built-
+in pointer). enum values have the size of the largest variant plus a
+discriminator (modulo optimisation). Struct values have their minimal size. For
+example,
+

@bstrie https://github.com/bstrie Swift may be fifth-most-popular
because of strong backing by Apple and because it is targeted on different
kind of applications than Rust is. Rust is low-level language and user
should have maximum control on everything. Also user should pay only for
features he needs. Different behaviour for structs and classes is IMO a bad
feature that Swift has got and definitely something what should not be
copied to Rust.

Why?

  1. It dictates the user of the a type how it must be created. You cannot
    have Swift's class by-value on a stack, ever. You pay what you do not need.
    Personally, I would be very disappointed to find out that a Rust library
    declares a simple type as a struct and therefore I am forced to allocate it
    on heap.

  2. It is not clear when the parameter is passed by value or ref by looking
    at a function call or a function definition. It can be source of naughty
    bugs where you alter only local copy by accident. However, there is
    probably no way how would Rust share this weakness with Swift as structs
    would be still explicitly boxed.

  3. Rust has polymorphism system orthogonal to types, yet this proposal
    ties polymorphism to type hierarchy. A struct would be suited to one
    particular polymorphic data-structure but not to any other. You have to
    build type-hierarchy for every data-structure from the ground.

The proposal #9 #9 it is strictly
superior solution.

  1. You can create any struct or enum on the stack or the heap, or turn it
    into a fat object if you want to.
  2. You can turn even an int or a float into a fat object, and you can do
    that for different traits in different places. You can mix completely
    independent types in a polymorphic collection. You have the freedom to do
    that anywhere and the power to do that for every type. And it is
    completely independent of inheritance, which you may or may not use.
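
As a rough illustration of the second point, here is a sketch in present-day Rust that uses ordinary trait objects rather than the mechanism from proposal #9. The helper `describe_all` is a hypothetical name. Primitive and unrelated types are mixed in one collection through the standard `Display` trait.

```rust
use std::fmt::Display;

// Hypothetical helper: formats every element through dynamic dispatch.
fn describe_all(items: &[Box<dyn Display>]) -> Vec<String> {
    items.iter().map(|item| item.to_string()).collect()
}

fn main() {
    // An integer, a float and a string slice share one collection,
    // each behind a fat pointer (data pointer plus vtable).
    let items: Vec<Box<dyn Display>> = vec![
        Box::new(42_i32),
        Box::new(1.5_f64),
        Box::new("text"),
    ];

    for line in describe_all(&items) {
        println!("{}", line);
    }
}
```

None of the element types is related to the others by inheritance; they only share a trait implementation.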



@nielsle

nielsle commented Jun 30, 2014

How about allowing an enum to use an existing (Sized?) struct definition as a variant? (Edited)

struct Sprite{ x:int, y: int,  gif: ... stuff goes here.. }
#[unsized] 
enum Shape { 
    struct SpriteVar = Sprite,
    Circle(int,int,int),
    Rectangle(int,int,int,int)
}
let x: SpriteVar = Sprite {x: 2, y:2, gif: ... stuff goes here.. }
match x {
    s @ SpriteVar           => { .... }
    Circle(x,y,r)            => { .... }
    Rectangle(x1,y1,x2,y2)   => { .... }
}

That allows you to use the same syntax for struct objects and enum variants, so you can change back and forth without having to rewrite all your code. Furthermore the same struct can belong to several enums allowing for easy subtyping (if the memory layout permits).

The match statements should allow you to match for struct-variants that implement a certain trait.

trait Drawable { ... };
impl Drawable for Sprite { ... };
match x {
    s @ Drawable             => { .... }
    ....
}

Matching for Deref<T>, could allow you to specify that some variants share fields.

Edit: The enum has a different memory layout than the struct, so Rust should not allow you to coerce a struct such as 'Sprite' into an enum value without copying. Hopefully this will not be too confusing.
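
For comparison, present-day Rust can get close to this by wrapping an existing struct in a tuple variant. The sketch below is not the syntax proposed above; the `gif` field is left out and `describe` is a hypothetical helper. The struct is moved into the enum value, which has its own layout.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Sprite {
    x: i32,
    y: i32,
}

#[derive(Debug, Clone, PartialEq)]
enum Shape {
    // Reuses the existing struct definition as the variant's payload.
    Sprite(Sprite),
    Circle(i32, i32, i32),
    Rectangle(i32, i32, i32, i32),
}

// Hypothetical helper that matches on every variant.
fn describe(shape: &Shape) -> String {
    match shape {
        Shape::Sprite(s) => format!("sprite at ({}, {})", s.x, s.y),
        Shape::Circle(x, y, r) => format!("circle at ({}, {}) radius {}", x, y, r),
        Shape::Rectangle(x1, y1, x2, y2) => {
            format!("rectangle ({}, {})-({}, {})", x1, y1, x2, y2)
        }
    }
}

fn main() {
    let sprite = Sprite { x: 2, y: 2 };
    // The struct value is moved into the enum; no in-place coercion happens.
    let shapes = vec![
        Shape::Sprite(sprite),
        Shape::Circle(0, 0, 5),
        Shape::Rectangle(0, 0, 4, 3),
    ];
    for shape in &shapes {
        println!("{}", describe(shape));
    }
}
```

The same `Sprite` struct could be wrapped by a second enum in the same way, at the cost of one extra constructor name per enum.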

@bill-myers

@nikomatsakis Does the distinction between sized/unsized enum/struct really need to be made when declaring the type?

Isn't it possible to make enums behave both as unsized and as sized depending on context without making too many compromises? (when declaring a field of enum type it would be sized, but when declaring an &Enum it would be unsized)

Or alternatively, adding syntax to make any enum unsized? E.g. "None" would be a zero-sized data type, but "Option[None]" would be an Option data type constrained to be None, and "&mut Option[*]" would be a DST pointer, while "&mut Option" is a pointer to Option.

@sinistersnare

As someone who has not been really following this discussion, can someone explain what the difference between structs and enums will be if this RFC is implemented?

It seems as if structs are gaining variants and enums are gaining fields, so is there going to be a difference? It seems non-orthogonal and I feel the current way is better to separate logic.

@Ericson2314
Contributor

@nikomatsakis @bill-myers Yeah unsized layout and open datatypes are orthogonal, so I'd be wary to have the former motivate the latter.

Since ASTs were brought up: I like the "two-level types" approach described in http://blog.ezyang.com/2013/05/the-ast-typing-problem/ , which makes it possible to add more variants or more fields common to all variants. I'd guess that unsized layout, plus having the type that ties the knot not box the core AST, plus packing the discriminants together, would be all that's needed to make this just as performant.

#![unsized_fat_enum, packed_discriminants]

enum LValue<LV, RV> { ... }

enum Statement<LV, RV> {
    Mutate(Box<LV>, Box<RV>),
    ...
}

enum RValue<LV, RV> {
    Block([Box<Statement<LV, RV>>]),
    LValue(LV),
    App(Box<RV>, [Box<RV>]),
    ...
}

mod Typed {
    struct Typed<T>(Type, T);
    struct LValue(Typed<super::LValue<LValue, RValue>>);
    struct RValue(Typed<super::RValue<LValue, RValue>>);
}

mod Plugin {
    trait CustomLValue { ... }
    trait CustomRValue { ... }
    enum Custom<S, C> { Std(S), Custom(C) }
    struct LValue(Custom<super::LValue<LValue, RValue>, CustomLValue>);
    struct RValue(Custom<super::RValue<LValue, RValue>, CustomRValue>);
}
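
A much smaller version of the two-level idea can be written in present-day Rust, without the hypothetical `unsized_fat_enum` and `packed_discriminants` attributes. All names below (`ExprF`, `Plain`, `Typed`, `annotate`, `eval`) are illustrative assumptions. The first level describes one node generically; the second level ties the knot and decides what extra data every node carries.

```rust
// First level: the shape of a single node, generic over the child type `R`.
#[derive(Debug)]
enum ExprF<R> {
    Lit(i64),
    Add(Box<R>, Box<R>),
}

// Second level, variant A: a plain tree with no extra data per node.
#[derive(Debug)]
struct Plain(ExprF<Plain>);

#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    Int,
}

// Second level, variant B: every node additionally carries a type.
#[derive(Debug)]
struct Typed {
    ty: Type,
    node: ExprF<Typed>,
}

// Converts a plain tree into a typed one; everything is an Int here.
fn annotate(expr: Plain) -> Typed {
    let node = match expr.0 {
        ExprF::Lit(n) => ExprF::Lit(n),
        ExprF::Add(l, r) => ExprF::Add(Box::new(annotate(*l)), Box::new(annotate(*r))),
    };
    Typed { ty: Type::Int, node }
}

// Evaluates a typed tree by recursing through the shared node shape.
fn eval(expr: &Typed) -> i64 {
    match &expr.node {
        ExprF::Lit(n) => *n,
        ExprF::Add(l, r) => eval(l) + eval(r),
    }
}

fn main() {
    let plain = Plain(ExprF::Add(
        Box::new(Plain(ExprF::Lit(1))),
        Box::new(Plain(ExprF::Lit(2))),
    ));
    let typed = annotate(plain);
    println!("{:?} evaluates to {}", typed.ty, eval(&typed));
}
```

Adding a field common to all nodes only touches the second-level struct; adding a variant only touches `ExprF`.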

@nikomatsakis
Contributor

On Mon, Jun 30, 2014 at 06:38:29AM -0700, bill-myers wrote:

Isn't it possible to make enums behave both as unsized and as sized
depending on context without making too many compromises? (when
declaring a field of enum type it would be sized, but when declaring
an &Enum it would be unsized)

It is perhaps possible but presents numerous complications. For one
thing, the enum type (e.g., Option<T>) would have a maximal size,
but we wouldn't necessarily know the size of any given instance of
Option<T>. So if you wrote some code like:

fn foo(x: &Option<int>) {
    let y = *x; // how many bytes to copy?
}

we'd have to generate some complex code that checks whether this
particular option is a None or Some and only copies the payload
appropriately. (Note that if the base type were unsized, this copy
would be illegal.)
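
For reference, this is how a sized enum behaves in present-day Rust: every value of the type occupies the same number of bytes, so the copy is a fixed-size one. The sketch uses `i64` instead of the old `int`, and `copy_out` is a hypothetical name.

```rust
use std::mem::{size_of, size_of_val};

// With a sized enum the copy is always size_of::<Option<i64>>() bytes,
// whichever variant is actually stored.
fn copy_out(x: &Option<i64>) -> Option<i64> {
    *x
}

fn main() {
    let none: Option<i64> = None;
    let some: Option<i64> = Some(7);

    // Both variants occupy the full size of the enum type.
    assert_eq!(size_of_val(&none), size_of_val(&some));
    assert_eq!(size_of_val(&none), size_of::<Option<i64>>());

    // The enum must at least be able to hold its largest payload.
    assert!(size_of::<Option<i64>>() >= size_of::<i64>());

    println!("{:?} {:?}", copy_out(&none), copy_out(&some));
}
```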

For another, we need to distinguish lvalues that will be mutated from
those that will not. Ideally we'd allocate just enough space for the variant
in the case where mutation does not occur. But there are plenty of ambiguous
cases where mutation may or may not occur. The most common will be something
like Box<Option<T>>:

let x = box None;

Here, you might think that we could allocate a very small box for x,
because it is only storing None. But someone might move this into a
mutable location and reassign it:

let mut y = x;
*y = Some(...);

(It is true that the contents of an Rc box cannot be reassigned, so
it may be possible that box(Rc) None could be sized precisely to
fit.)

They have a tag and are the size of the largest variant plus the tag. A pointer
or reference to an enum object is a thin pointer to a regular enum
object. Nested variants should use a single tag and the 'largest variant' must
take into account nesting. Event if we know the static type restricts us to a

s/Event/Even/

@nikomatsakis
Contributor

@nrc I was talking with @jdm on IRC and we think we have a way to initialize private fields of superstructs without exposing them (and also to modularize constructors, to some extent). The basic idea is to leverage FRU (functional record update) by allowing one to write:

struct SuperType { ... }
struct SubType : SuperType { ... }
...
let x = SubType { subtypefield: x, ..SuperType::new(...) };

Essentially, it is permitted for you to write .. with a superstruct type, in which case you are only required to initialize the other fields.
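
As background, functional record update as it exists in Rust today only accepts a base expression of the same struct type; the idea above would extend it to a superstruct. A minimal sketch of the existing form, with the hypothetical names `Widget` and `named`:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Widget {
    name: String,
    width: u32,
    height: u32,
}

impl Widget {
    // Constructor that supplies defaults for every field.
    fn new() -> Widget {
        Widget {
            name: String::from("unnamed"),
            width: 80,
            height: 24,
        }
    }
}

// Only `name` is written out; the remaining fields are taken
// from the base expression after `..`.
fn named(name: &str) -> Widget {
    Widget {
        name: name.to_string(),
        ..Widget::new()
    }
}

fn main() {
    let w = named("sidebar");
    println!("{:?}", w);
}
```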

@nikomatsakis
Contributor

@nick29581 in that case, we could remove the module requirement and just limit subtypes to the same compilation unit.

@nick29581 also, I've been thinking that this proposal is really not about making a specialized variation of trait objects, nor object-oriented programming per se, as it is about making a generalized version of enums to cover more use cases (like refinement and being able to allocate instances of exact size that cannot change variants). I think we should change our nomenclature somewhat appropriately, though I'm not 100% sure what to change it to.

@nikomatsakis
Contributor

@nick29581 just remembered that super types are unsized, so returning them doesn't work. Still, if we could find some scheme like that, it would be good.

@jdm

jdm commented Aug 8, 2014

My only remaining concern after the hierarchy/inherited field initialization stuff is addressed is calling inherited methods without requiring explicit casting. @nikomatsakis tells me that that's going to be discussed, so I'm pretty happy with this proposal.

@CloudiDust
Contributor

@nick29581 @nikomatsakis

This RFC proposes that all variants of an enum/struct are usable as types, and the names don't form a hierarchy, but are "flattened" and poured into the enclosing scope. And when doing a match, we can use any names from any level of nesting to cover all inner levels.

This may be a breaking change.

Currently it is possible to write code like this:

enum Foo {
    Foo,
    Bar
}

match Foo {
  Foo => println!("This is the Foo variant!"),
  _ => println!("This is the Bar variant!")
}

That's because currently we cannot match against the enum name itself, so the compiler knows that Foo in a match is the variant, not the enum.

But if this RFC is implemented, the two Foos would clash.

So may I suggest that we forbid enums having namesake variants, now?
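
For comparison, present-day Rust avoids this particular clash in a different way: variants are namespaced under their enum, so a namesake variant is written `Foo::Foo` and cannot be confused with the type. A small sketch, where `describe` is a hypothetical helper:

```rust
#[derive(Debug, PartialEq)]
enum Foo {
    Foo,
    Bar,
}

// The variant is always spelled with the enum name as a prefix.
fn describe(value: &Foo) -> &'static str {
    match value {
        Foo::Foo => "This is the Foo variant!",
        Foo::Bar => "This is the Bar variant!",
    }
}

fn main() {
    println!("{}", describe(&Foo::Foo));
    println!("{}", describe(&Foo::Bar));
}
```

This says nothing about how the RFC itself would have resolved the ambiguity, since it proposes making variants usable as types.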

However I'd like there to be an exception, that is when this namesake variant is the only variant of the enum:

enum Foo { Foo(Bar) }

In this case, only a single type Foo should be defined.

That's much like Haskell's

data Foo = Foo Bar

And to deal with the weirdness that "a Rust enum can have fields", I think we can forbid fields at the "top level" of the enum, and instead require a "wrapper variant". So,

enum Foo {
    bar: Bar,
    Variant1 {
        baz: Baz,
        qux: Qux,
        Variant2
    },
    Variant3(int, int)
}

becomes:

enum Foo {
    Foo {
        bar: Bar,
        Variant1 {
            baz: Baz,
            qux: Qux,
            Variant2
        },
        Variant3(int, int)
    }
}

More typing, but no one would think that Foo is not an enum.

Besides, are variants required to come after the fields? (I think so, but this fact is not clear in the RFC.)

Variant1,
Variant2(int),
VariantNest {
Variant4,
Member


When I first saw this nested enum syntax, I was worried that it conflicted with struct-style enum variants; i.e., how does one distinguish the case shown here from enum Names { Variant1, Variant2(int), VariantNest { x: int } }?

But on reflection, I think part of the reason you have picked the syntax shown here and above is precisely that it supports (unifies) the struct-variant syntax with your named-field syntax shown above.

You allude to this in the presentation above when you say that this RFC makes struct-variants and tuple-structs less ad-hoc.

I think you should go further and actually add some concrete examples showing how the struct-variant syntax looks and how it is subtly distinguished from (or, if you prefer, "unified with") the nested enum syntax.

@nrc nrc self-assigned this Sep 4, 2014
nrc added a commit to nrc/rfcs that referenced this pull request Sep 17, 2014
Efficient single inheritance via virtual structs and unification and nesting of structs and enums. In contrast to rust-lang#142, this RFC uses traits for virtual dispatch rather than a custom system around impls. The parts of the RFC about nested structs and enums are identical to rust-lang#142.
@Stebalien
Contributor

This may have been mentioned somewhere but I didn't see it. Instead of virtual functions, how about just using traits? Basically, use traits to describe functionality that must be implemented by children.

First, add the ability to specify that a struct(/enum) must implement a trait (this might also be useful for compile-time optimizations):

struct X: Y; // There is a struct X that must implement trait Y
impl Y for X {
    // ... A required implementation
}

This is consistent with the current trait syntax because the trait T: S is essentially a dependency specification (trait T depends on trait S). In this case, struct X depends on trait Y.

Given this feature, one could write the following:

enum Node {
    children: Vec<Box<Node>>,
    parent: Box<Node>,

    Element: AttributeHooks {
        id: &str,
        attrs: HashMap<&str, &str>,

        HTMLImageElement,
        HTMLVideoElement {
            cross_origin: bool
        }
    },
    TextNode {
        content: &str,
    }
}

trait AttributeHooks {
    fn before_set_attr(&self, key: &str, value: &str);
    fn after_set_attr(&self, key: &str, value: &str);
}

impl AttributeHooks for HTMLImageElement {
    fn before_set_attr(&self, key: &str, value: &str) {
        if key == "src" {
            //
        }
    }
}
impl AttributeHooks for HTMLVideoElement {
    fn after_set_attr(&self, key: &str, value: &str) {
        if key == "crossOrigin" {
            cross_origin = value == "true";
        }
    }
}

Due to the Element: AttributeHooks dependency, all leaves of Element need to implement AttributeHooks. However, for simplicity, I would let the programmer omit empty trait implementations; that is, I wouldn't force the programmer to write impl AttributeHooks for SomeElement {}.
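
Part of this can be approximated in present-day Rust with default method bodies, so that each type overrides only the hooks it cares about. The sketch below is not the proposed `Element: AttributeHooks` syntax. The struct definitions, the `src_changes` counter and the `set_attr` function are illustrative assumptions, and the hooks take `&mut self` so they can update fields.

```rust
trait AttributeHooks {
    // Default bodies do nothing, so implementors may skip either hook.
    fn before_set_attr(&mut self, _key: &str, _value: &str) {}
    fn after_set_attr(&mut self, _key: &str, _value: &str) {}
}

struct HtmlImageElement {
    src_changes: u32,
}

struct HtmlVideoElement {
    cross_origin: bool,
}

struct TextNode;

impl AttributeHooks for HtmlImageElement {
    fn before_set_attr(&mut self, key: &str, _value: &str) {
        if key == "src" {
            self.src_changes += 1;
        }
    }
}

impl AttributeHooks for HtmlVideoElement {
    fn after_set_attr(&mut self, key: &str, value: &str) {
        if key == "crossOrigin" {
            self.cross_origin = value == "true";
        }
    }
}

// An empty impl is still required; only the method bodies can be omitted.
impl AttributeHooks for TextNode {}

// Hypothetical caller; storing the attribute itself is left out.
fn set_attr(element: &mut dyn AttributeHooks, key: &str, value: &str) {
    element.before_set_attr(key, value);
    element.after_set_attr(key, value);
}

fn main() {
    let mut image = HtmlImageElement { src_changes: 0 };
    let mut video = HtmlVideoElement { cross_origin: false };
    set_attr(&mut image, "src", "cat.gif");
    set_attr(&mut video, "crossOrigin", "true");
    set_attr(&mut TextNode, "id", "ignored");
    println!("{} {}", image.src_changes, video.cross_origin);
}
```

Unlike the suggestion above, today's Rust still needs the one-line `impl AttributeHooks for TextNode {}`.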

The primary reason I prefer this over virtual functions is that it is consistent with the current use of traits. The secondary reason is that it doesn't allow for method override chains (which, in my experience, tends to lead to unexpected behavior). If you need to "inject" some functionality in the middle of an inheritance chain, you can just create another trait:

enum First : FirstTrait {
    a: int,
    Second : SecondTrait {
        b: int,
        Third {
            c: int
        }
    }
}
trait FirstTrait {
    fn test(&self);
}
trait SecondTrait {
    fn handle_test(&self);
}
impl FirstTrait for Second {
    fn test(&self) {
        println!("testing");
        self.handle_test();
    }
}
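
A comparable hook chain can be written in present-day Rust without nested enums. The sketch below swaps `impl FirstTrait for Second` for a blanket impl over every `SecondTrait` implementor. The methods return `String` instead of printing, and the `Third` struct is a stand-in for the nested variant.

```rust
trait FirstTrait {
    fn test(&self) -> String;
}

trait SecondTrait {
    fn handle_test(&self) -> String;
}

// Blanket impl: the middle layer implements `test` once and
// delegates the customisable part to `handle_test`.
impl<T: SecondTrait> FirstTrait for T {
    fn test(&self) -> String {
        format!("testing: {}", self.handle_test())
    }
}

struct Third {
    c: i32,
}

// The leaf type supplies only the injected hook.
impl SecondTrait for Third {
    fn handle_test(&self) -> String {
        format!("third with c = {}", self.c)
    }
}

fn main() {
    let leaf = Third { c: 3 };
    println!("{}", leaf.test());
}
```

The leaf type cannot override `test` itself, which matches the stated goal of avoiding method override chains.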

@CloudiDust
Contributor

@Stebalien, there is a variation of this proposal that uses trait for method dispatch: #245.

And for pure trait-based inheritance (which enhances traits but does not modify structs/enums in any way other than optional syntactic sugar), you may be interested in this alternative proposal: #250, associated field inheritance.

There are also #9, #91, #223, which provide special structs/traits to do inheritance but do not directly enhance normal structs/enums/traits.

@CloudiDust
Contributor

Well #250 does provide field mapping and some attributes which can be seen as enhancements to structs.

@nrc
Member Author

nrc commented Sep 23, 2014

Closing in favour of other RFCs which address the same problem (see http://discuss.rust-lang.org/t/summary-of-efficient-inheritance-rfcs/494), in particular #245.
