Proposal: Tuple-based construction and deconstruction of immutable types #9411

Closed
MadsTorgersen opened this issue Mar 2, 2016 · 32 comments

Comments

@MadsTorgersen
Contributor

Tuple-based construction and deconstruction of immutable types

In #9330 I listed some of the recent alternatives we've explored for API patterns in support of immutable object initializers, with-expressions and positional deconstruction in C#.

Here is an alternative approach to these patterns, involving a variant of the builder pattern based on tuples.

The builder pattern and tuples

The builder pattern is the approach of enabling initialization of an immutable object by creating it from a mutable companion object, a "builder object".

A traditional downside of builders is that they require an extra object to be allocated. This can be ameliorated by making the builder type a struct.

Another downside of builders is that they require the declaration of an extra type for this purpose. This can be mitigated by using tuples as the builder types.

Tuples are proposed in #347 and are one of our top features for C# 7. They will be mutable structs, and have optional names associated with each element. This makes them ideal as builders for immutable object construction.

Tuples and name/position matching

The core problem in #9330 is the mapping that each language feature requires between positions and property names, in order to generate code for the feature.

Tuple-based API patterns address that need for a mapping, because a tuple is already both position- and name-based. Each tuple member has a position and a name, and can be accessed through both.
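To make the dual access concrete, here is a minimal sketch. It assumes tuples behave like the mutable value tuples described in #347 (and as they later shipped in C# 7.0); the `Program` class and `MakeTuple` method are hypothetical names used only for illustration.

```csharp
using System;

public static class Program
{
    // Returns a tuple whose elements carry both a name and a position.
    public static (string FirstName, string LastName) MakeTuple()
    {
        (string FirstName, string LastName) t = ("Mickey", "Mouse");
        t.Item2 = "M."; // write the second element by position
        return t;
    }

    public static void Main()
    {
        var t = MakeTuple();
        Console.WriteLine(t.FirstName); // read the first element by name
        Console.WriteLine(t.Item1);     // read the same element by position
        Console.WriteLine(t.LastName);  // the positional write is visible by name
    }
}
```

Both access paths refer to the same underlying element, which is why no separate name-to-position mapping is needed.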

API patterns and consuming code

Each of the features in #9330 can be supported using a tuple-based API pattern.

Let's again assume an immutable type like this:

public class Person
{
  public string FirstName { get; }
  public string LastName { get; }

  public Person(string firstName, string lastName)
  {
    FirstName = firstName;
    LastName = lastName;
  }
}

and consuming code like this:

var p = new Person { FirstName = "Mickey", LastName = "Mouse" }; // object initializer
if (p is Person("Mickey", *)) // positional deconstruction
{
  return p with { FirstName = "Minnie" }; // with-expression
}

Object initializers

The expression

new Person { FirstName = "Mickey", LastName = "Mouse" }

Gets generated as

var builder = default((string FirstName, string LastName));
builder.FirstName = "Mickey";
builder.LastName = "Mouse";
...
new Person(builder)

Note how the generated code is nicely equivalent to that of today's mutating object initializers, except it mutates the builder object rather than the resulting object itself.

To facilitate this, the Person type needs a constructor overload that takes a builder tuple:

public class Person
{
  ...
  public Person((string FirstName, string LastName) builder) { ... }
}
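As a rough illustration of the pattern above, the following sketch hand-writes what the compiler would generate for the object initializer. It assumes C# 7.0-style value tuples; the body of the builder constructor is one possible implementation (the proposal elides it), and `Demo.Build` is a hypothetical helper.

```csharp
using System;

public class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // Builder-taking overload (assumed body): forwards the tuple's elements.
    public Person((string FirstName, string LastName) builder)
        : this(builder.FirstName, builder.LastName) { }
}

public static class Demo
{
    // Hand-written equivalent of:
    //   new Person { FirstName = "Mickey", LastName = "Mouse" }
    public static Person Build()
    {
        var builder = default((string FirstName, string LastName));
        builder.FirstName = "Mickey";
        builder.LastName = "Mouse";
        return new Person(builder);
    }

    public static void Main()
    {
        var p = Build();
        Console.WriteLine($"{p.FirstName} {p.LastName}");
    }
}
```

The resulting `Person` is never mutated; only the tuple is.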

With-expressions

With expressions work similarly to object initializers, except that the builder comes from an existing object, and object creation happens through a call to a With method (which can ensure that the runtime type and any state not known about at compile time get copied over).

The expression

p with { FirstName = "Minnie" }

Gets generated as:

var builder = p.GetValues();
builder.FirstName = "Minnie";
...
p.With(builder);

To facilitate this, the Person class needs two new methods:

public class Person
{
  ...
  public (string FirstName, string LastName) GetValues() { ... }
  public Person With((string FirstName, string LastName) builder) { ... }
}
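A minimal sketch of how those two methods might be implemented, again assuming C# 7.0-style value tuples. The method bodies are assumptions (the proposal leaves them out), and `Demo.Rename` is a hypothetical helper that spells out the generated code by hand.

```csharp
using System;

public class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // Snapshot of the current state as a named tuple (assumed body).
    public (string FirstName, string LastName) GetValues() => (FirstName, LastName);

    // Virtual so that a derived type could preserve its runtime type (assumed body).
    public virtual Person With((string FirstName, string LastName) builder) =>
        new Person(builder.FirstName, builder.LastName);
}

public static class Demo
{
    // Hand-written equivalent of: p with { FirstName = "Minnie" }
    public static Person Rename(Person p)
    {
        var builder = p.GetValues();
        builder.FirstName = "Minnie";
        return p.With(builder);
    }

    public static void Main()
    {
        var p = new Person("Mickey", "Mouse");
        var q = Rename(p);
        Console.WriteLine($"{q.FirstName} {q.LastName}"); // the copy
        Console.WriteLine($"{p.FirstName} {p.LastName}"); // the original, unchanged
    }
}
```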

There are alternatives to this. For instance, the With method could instead take a lambda that initializes the builder. Then we wouldn't need the separate GetValues method. However, that's going to be useful below as well.
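One way the lambda-based alternative could look is sketched below. This is only a guess at the shape: the `BuilderInit` delegate and its `ref` parameter are assumptions made here so that the lambda can mutate the struct builder in place; the proposal does not specify a signature.

```csharp
using System;

// Hypothetical delegate: receives the builder by reference so it can be mutated.
public delegate void BuilderInit(ref (string FirstName, string LastName) builder);

public class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // With seeds the builder from the current state, so no GetValues call is needed.
    public Person With(BuilderInit init)
    {
        (string FirstName, string LastName) builder = (FirstName, LastName);
        init(ref builder);
        return new Person(builder.FirstName, builder.LastName);
    }
}

public static class Demo
{
    public static void Main()
    {
        var p = new Person("Mickey", "Mouse");
        var q = p.With((ref (string FirstName, string LastName) b) => b.FirstName = "Minnie");
        Console.WriteLine($"{q.FirstName} {q.LastName}");
    }
}
```

The trade-off is a delegate invocation (and possibly an allocation) per with-expression, which is part of why a separate GetValues method may still be preferable.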

Positional deconstruction

The expression

p is Person("Mickey", *)

Gets generated as

var values = p.GetValues();
...
values.Item1 == "Mickey"

This uses the same GetValues method as the with-expression, but consumes it in a positional way (through the underlying Item1...Itemn properties that tuples will have).

Discussion

The benefit to this approach is that no magic is needed: No case-insensitive name-matching, no special default values.

One downside is that the features won't apply to unchanged existing types: they require specific methods. GetValues and With can be added as extension methods (unless you want With to be virtual), but the builder-taking constructor cannot.

Another downside is that the generated code for the features is a little more involved. That seems like an unavoidable consequence of not letting the compiler rely on "magic knowledge".

Finally, it relies heavily on the exact design of tuples. That is pretty much agreed on, so maybe not a big deal, but this is one new language feature relying strongly on another.

@dominicpease

Is positional deconstruction for cases like this pattern matching pretty much set in stone? It was in #9330 as well, and I can't say I like it as applied to class instances. I understand it for tuples (which might not have any other option, and is I assume the driving force?), but it seems like it builds fragile and obscure relationships into a design, and perhaps unnecessarily.

For example, what if we went the other direction, and had an operator or indexer-style syntax that could be used either for property or positional access? This would give us the ability for such things to participate in expressions, too. We could have

p is Person(#FirstName == "Mickey", *)
p is Person(#1 == "Mickey")
p is Person(Math.Abs(#3) > q#3)

In order for this to work without something before the # (the abbreviated syntax), you'd need to be able to rely on type/referent inference, of course, which I assume in the case of is wouldn't be a problem; otherwise using p#1 doesn't seem too burdensome. The implicit use could perhaps be useful in other scenarios, too, but you'd need to be able to establish a type context in advance.

Note that with a named property it would be equivalent (as far as my imagination takes me) to the . operator, aside from the option to use the abbreviated syntax.

@MadsTorgersen
Contributor Author

@dominicpease Nothing is set in stone. For pattern matching we do intend for there to be a name-based equivalent. Currently the envisioned syntax is:

p is Person { FirstName is "Mickey" }

The only thing positional matching (for anything but tuples) adds over that is brevity. There's a strong tradition for positional matching in languages that do have pattern matching, but we could certainly live without it.

What you see is us in a mode of saying "how will we do it if we do it". This is important, because if we decide to allow the feature, we want records to support it, and we then need them to generate the corresponding API patterns.

@alrz
Member

alrz commented Mar 2, 2016

Apparently these features depend on "mutable tuples". I thought mutability is not being considered at this point at least, and who wants mutable tuples?

public Person((string FirstName, string LastName) builder) { ... }

If we're going to use this constructor pattern to be able to use object initializers, perhaps records should also generate this constructor and utilize splatting? That's one ugly constructor. Besides, if using object initializers is that important I still prefer #9234 and exact name matching.

public Person(this.FirstName, this.LastName) {}

I don't see why all this trouble just to use an alternative way of creating an object that is not meant to be created that way.

public Person With((string FirstName, string LastName) builder) { ... }

Quoting yourself from #347:

I doubt that a method would commonly be declared directly to just take a tuple.

That's not really helping! Please do not bring this code style to C#.

@dominicpease

@MadsTorgersen Let me preface this by saying that despite my counter-suggestions I am leery of jumping too quickly to new syntax to solve problems. However, I do think an operator-based method could work at least as well as the positional deconstruction proposal, even with records and completely hand-written tuple-like classes; and especially in those cases I'd rather see it done with an explicit (but compact) expression syntax rather than the dangling values proposed here. Perhaps something like public string FirstName #1 { get; } becomes a shorthand for [PositionalAccessor(1)] etc., and public class Record## becomes a shorthand for [PositionalAccess(AccessorOrder.DefaultConstructor)] etc.?

This may be a horrible idea, but I like the possibility of more expressivity of intent and flexibility of usage than what I've seen so far.

@mburbea

mburbea commented Mar 2, 2016

@MadsTorgersen, personally, at least in the examples displayed, I find myself leaning towards the following

if(o is Person p && p.FirstName == "Mickey")

From a perspective of readability it is terse, and does not add too much extra complexity. The is pattern matching that allows assignment is perfect.

However, the with syntax is a bit odd to look at especially based on how people decide to format it.

if(o is Person { FirstName is "Mickey", LastName is null})
{
....
}

I don't really like using is here for equivalency. I mean certainly I like the sql-esque is null.
Not to sound like Bill Clinton, but that depends on what the definition of is means.
How is this equality checking done? Is it calling .Equals, is it calling EqualityComparer<T>.Default, or is it applying the == operator? Can I specify my own equality comparison?

This matters as double.NaN == double.NaN is false, but double.NaN.Equals(double.NaN) is true.
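The difference can be checked directly. The sketch below compares the three candidate mechanisms named above on NaN; `EqualityDemo` and its method names are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class EqualityDemo
{
    // The == operator follows IEEE 754: NaN is not equal to anything.
    public static bool ByOperator(double a, double b) => a == b;

    // Double.Equals treats two NaN values as equal.
    public static bool ByEquals(double a, double b) => a.Equals(b);

    // The default comparer for double delegates to Double.Equals.
    public static bool ByComparer(double a, double b) =>
        EqualityComparer<double>.Default.Equals(a, b);

    public static void Main()
    {
        double nan = double.NaN;
        Console.WriteLine(ByOperator(nan, nan)); // False
        Console.WriteLine(ByEquals(nan, nan));   // True
        Console.WriteLine(ByComparer(nan, nan)); // True
    }
}
```

So whichever mechanism a pattern uses is observable, at least for floating-point values.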

@bbarry

bbarry commented Mar 2, 2016

Apparently these features depend on "mutable tuples". I thought mutability is not being considered at this point at least, and who wants mutable tuples?

The mutable tuple issue is easily bypassed:

The expression

new Person { FirstName = "Mickey", LastName = "Mouse" }

Gets generated as

var _1 = "Mickey"; //necessary to do these in order to 
var _2 = "Mouse";  //preserve evaluation order of initializer
var _builder = new ValueTuple<string, string>(_1, _2);
new Person(_builder)

@alrz
Member

alrz commented Mar 2, 2016

@bbarry What about this one.

var builder = p.GetValues();
builder.FirstName = "Minnie";
...
p.With(builder);

I can think of this,

builder = (FirstName: "Minnie", builder.LastName);

But is it worth the trouble?

@bbarry

bbarry commented Mar 3, 2016

The expression

p with { FirstName = "Minnie" }

Gets generated as:

var _deconstruct = p.GetValues();
var _FirstName = "Minnie";
var _builder = new ValueTuple<string, string>(_FirstName, _deconstruct.Item2);
p.With(_builder);

Very straightforward codegen.

My question would be if

p with { FirstName = "Minnie", LastName = "Monster" }

can be optimized (the compiler knows these are all the properties on that tuple) to:

var _1 = "Minnie";
var _2 = "Monster"; 
var _builder = new ValueTuple<string, string>(_1, _2);
new Person(_builder)

(does GetValues() need to be called by spec?)

I think the answer is no (GetValues() must be called if you use with) because the method may have side effects.

Another downside to this proposal is that these methods make adding properties to the type a breaking change. You cannot extend Person here with MiddleName for example and make the necessary changes to these methods because it changes the return type of GetValues (an overload of the With method can be added; only the GetValues method has this problem).

@alrz
Member

alrz commented Mar 3, 2016

@bbarry OK, consider a class with one property. You'll need to return a oneple from GetValues then? And the idea of passing GetValues return value to With is not safe. You must remember to change both at the same time. At first I thought utilizing tuples would be nice for this scenario, but there will be some shortcomings. See #9005 for discussion.

@HaloFour

HaloFour commented Mar 3, 2016

@mburbea

Per the pattern matching spec (so far):

Constant Pattern

A constant pattern tests the runtime value of an expression against a constant value. The constant may be any constant expression, such as a literal, the name of a declared const variable, or an enumeration constant.

An expression e matches a constant pattern c if object.Equals(e, c) returns true.

constant-pattern
     : constant-expression
     ;

It is a compile-time error if the static type of e is not pattern compatible with the type of the constant.

So the comparisons are pretty limited. I also don't believe that constant patterns (or any simple patterns) are permitted to be used in simple is expressions, so if (x is "Foo") would not be legal.

Update: The linked spec is newer and disposes of the numbering scheme. Updated comment.
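To see what the quoted object.Equals(e, c) rule implies in practice, here is a small sketch. `ConstantPatternDemo.Matches` is a hypothetical stand-in for the rule as quoted, not the compiler's actual implementation.

```csharp
using System;

public static class ConstantPatternDemo
{
    // Mirrors the quoted rule: e matches constant c if object.Equals(e, c) is true.
    public static bool Matches(object e, object c) => object.Equals(e, c);

    public static void Main()
    {
        Console.WriteLine(Matches("Mickey", "Mickey"));     // True: string value equality
        Console.WriteLine(Matches(double.NaN, double.NaN)); // True: Double.Equals, unlike ==
        Console.WriteLine(Matches(1, 1L));                  // False: boxed int vs boxed long
        Console.WriteLine(Matches(null, null));             // True: both references are null
    }
}
```

Under this rule NaN would match a NaN constant, and values of different numeric types would not match each other even when == would consider them equal.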

@alrz
Member

alrz commented Mar 3, 2016

@HaloFour So far.

@ufcpp
Contributor

ufcpp commented Mar 3, 2016

I don't like the approach which needs extra methods like GetValues and With. I always struggle to reduce binary size of our apps. These methods may increase the size.

@bbarry

bbarry commented Mar 3, 2016

And the idea of passing GetValues return value to With is not safe.

I'm not sure what you mean by this.

I'm not sure tuples cannot be used here, but I am pretty sure mutability is not an issue.

Given:

public class Person
{
  public string FirstName { get; }
  public string LastName { get; }

  public Person(string firstName, string lastName)
  {
    FirstName = firstName;
    LastName = lastName;
  }
}

One could write extension methods GetValues and With such that deconstruction-pattern and with-expression syntax elements can be compiled into working code.

Since GetValues return type is used here in these examples, one must write successive implementations of it as extension methods since the act of adding or removing a property to the Person type involves changing the return type of this method and that is a change that breaks all usages. Further one must write new With overloads that are type compatible with the return types of GetValues (edit: well not exactly type compatible, you need not mutate the return value from GetValues and pass the result into With but could instead create a new variable in between; still they need to be structurally compatible in some semantic sense).

For example I might write these extension methods for a deconstruction to work with Person:

public static class PersonExtensions
{
  public static A GetValues(this Person p) { ... }
  public static Person With(this Person p, A builder) { ... }
}

Suppose that code ships. In the next version the company decides Person::MiddleName is needed. How can this be added such that it doesn't break existing code? What are the semantics of a not-recompiled line in some library:

p with { FirstName = "Minnie" }

Does it change MiddleName?

@alrz
Member

alrz commented Mar 3, 2016

I'm not sure what you mean by this.

I meant this code,

var builder = p.GetValues();
builder.FirstName = "Minnie";
...
p.With(builder);

There. The two methods are tightly coupled, and you must remember that if you change one you must change the other.

Does it change MiddleName?

Why is it a concern? You didn't mention MiddleName in with { ... } so it wouldn't. Also, as long as you didn't include this property in With and GetValues, I think you cannot use it in your with expression. And once you do, as I said, you'll need to change With as well.

I don't know what problem this proposal is actually trying to solve, to the point that it suggests also implementing With with tuples, when default values like this would have just worked.

public Person With(string firstName = this.FirstName,  string lastName = this.LastName) =>
  new Person(firstName, lastName);

@chrisaut

chrisaut commented Mar 3, 2016

if(o is Person { FirstName is "Mickey", LastName is null})
{
....
}

I don't really like using is here for equivalency. I mean certainly I like the sql-esque is null.
Not to sound like Bill Clinton, but that depends on what the definition of is means.
How is this equality checking done? Is it calling .Equals, is it calling EqualityComparer<T>.Default, or is it applying the == operator? Can I specify my own equality comparison?

This matters as double.NaN == double.NaN is false, but double.NaN.Equals(double.NaN) is true

I strongly agree with this, please don't use "is" in this way, == is exactly the same length and makes it clear what code is being generated.

Can I call code like this

if(o is Person { string.Equals(FirstName , "Mickey", StringComparison.OrdinalIgnoreCase), LastName is null})
 {
 ....
 }

or is it only strictly "is" (whatever that actually generates)?

@alrz
Member

alrz commented Mar 3, 2016

@chrisaut Perhaps you're looking for #8457.

@DavidArno

I want to 👍 @alrz here with his comment:

Apparently these features depend on "mutable tuples". I thought mutability is not being considered at this point at least, and who wants mutable tuples?

I read "Tuples are proposed in #347 and are one of our top features for C# 7. They will be mutable structs" (emphasis mine) and pretty much gave up in despair. Please, please, please do not implement tuples as mutable types.

@HaloFour

HaloFour commented Mar 3, 2016

@DavidArno @alrz

Per the next part of the same spec:

Should tuples be mutable or immutable? The nice thing about them being structs is that the user can choose. If a reference to the tuple is readonly then the tuple is readonly.

@DavidArno

@HaloFour,

I don't follow the phrase "If a reference to the tuple is readonly then the tuple is readonly". If tuples are structs, then they are values, not references! Guess I've missed something...

@HaloFour

HaloFour commented Mar 3, 2016

@DavidArno

I know, I noticed that too. I assume that they used "reference" as a bad term for any of fields, locals and parameters.
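Reading "reference" as "field, local or parameter", the behavior the spec text seems to describe can be sketched as follows. This assumes tuples are mutable structs as proposed; `Holder` and `Demo.CopyAndRename` are hypothetical names used only for this example.

```csharp
using System;

public class Holder
{
    // A readonly field of a mutable struct type: its elements cannot be
    // assigned through the field outside of a constructor or initializer.
    public readonly (string FirstName, string LastName) Fixed = ("Mickey", "Mouse");
}

public static class Demo
{
    public static (string FirstName, string LastName) CopyAndRename(Holder h)
    {
        // h.Fixed.FirstName = "Minnie"; // does not compile: Fixed is readonly
        var copy = h.Fixed;              // struct assignment copies the value
        copy.FirstName = "Minnie";       // mutates only the local copy
        return copy;
    }

    public static void Main()
    {
        var h = new Holder();
        var copy = CopyAndRename(h);
        Console.WriteLine(h.Fixed.FirstName); // Mickey
        Console.WriteLine(copy.FirstName);    // Minnie
    }
}
```

In other words, mutability is a property of the storage location holding the tuple rather than of the tuple type itself.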

@HaloFour

HaloFour commented Mar 3, 2016

@chrisaut Per the pattern matching spec you couldn't use an arbitrary expression as a subpattern to be used within a property pattern. So, with pattern matching, you'd have the following options:

if (o is Person { FirstName is var firstName, LastName is null }
        && string.Equals(firstName, "Mickey", StringComparison.OrdinalIgnoreCase))
{
    // firstName variable is in scope here
}

if (o is Person { LastName is null } p
        && string.Equals(p.FirstName, "Mickey", StringComparison.OrdinalIgnoreCase))
{
    // p variable is in scope here
}

if (o is Person p
        && p.LastName == null
        && string.Equals(p.FirstName, "Mickey", StringComparison.OrdinalIgnoreCase))
{
    // p variable is in scope here
}

@bondsbw

bondsbw commented Mar 3, 2016

I also vote against using is.

equals is already a contextual keyword that has this same meaning in other contexts.

p is Person { FirstName equals "Mickey" }

seems more consistent.

But frankly, I don't care for either. I understand why equals is needed in LINQ, so that the full join clause can be executed at the target SQL instance, but I'm not sure I see its purpose here other than to introduce a new language element for the sake of having something new.

@HaloFour

HaloFour commented Mar 3, 2016

@bondsbw

That might make sense specifically with constant patterns, but what if the subpattern is another form of pattern? Especially another form of type pattern which is already an extension of the is operator?

p is Student { 
    Course is OnlineCourse course,
    Teacher is TenuredProfessor {
        Name is var professorName
    }
}

@bondsbw

bondsbw commented Mar 3, 2016

@HaloFour
Sure, is makes sense for those patterns. It is a synonym for the concept has-type-of.

But I'm not sure if I can buy that FirstName has-type-of "Mickey", that doesn't make much sense to me.

@alrz
Member

alrz commented Mar 3, 2016

equals makes sense for join because each side turns to a lambda but if you're going to special case it in patterns I prefer #8457. Not particularly related to this proposal though.

@alrz
Member

alrz commented Mar 3, 2016

@bondsbw FirstName is "Mickey" is the consequence of overloading is with pattern-matching capabilities and "Mickey" being a constant-pattern. I agree that it doesn't feel right in this particular context.

@bondsbw

bondsbw commented Mar 3, 2016

@alrz Agreed.

So what would be the impact if is does not work with constant patterns, in general? Is it just a simple question of preference?

@HaloFour

HaloFour commented Mar 3, 2016

@alrz @bondsbw Probably a conversation for #206 and not here. While I think it might read better I think it would also be weird if different patterns used different verbs.

property-subpattern
    : identifier 'is' complex-pattern
    | identifier 'is' type
    | identifier 'equals' constant-pattern
    | identifier 'into' variable-identifier
    ;

@bondsbw

bondsbw commented Mar 3, 2016

I mean, I can deal with the idea that is has evolved to pattern matching instead of just type matching. But just as I don't like the fact that join only blesses equality (with merit), this seems to be doing the same (without merit). I'm not sure the best solution but I like where #8457 is going.

I'll move to that thread for discussion.

@alrz
Member

alrz commented Mar 3, 2016

@bondsbw The direct correspondence of linq operators to extension methods caused this limitation, using operators like == could have been ambiguous in that context. But this is not the case for property patterns, and the == operator wouldn't translate to something else here. That's why I prefer to use operators.

@HaloFour I don't know how linq and patterns are related that makes you suggest the same semantics of into in patterns. In my opinion, if anything, it should be syntactically similar to a relational-expression.

@HaloFour

HaloFour commented Mar 3, 2016

@alrz Purely spaghetti. Not a proposal/suggestion as much as illustration.

@gafter
Member

gafter commented Apr 21, 2017

Issue moved to dotnet/csharplang #466 via ZenHub

@gafter gafter closed this as completed Apr 21, 2017