Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F.18: Confused about difference between "will-move-from", "forward" and "in" #1407

Closed
russellmcc opened this issue Apr 16, 2019 · 11 comments
Closed
Assignees

Comments

@russellmcc
Copy link
Contributor

russellmcc commented Apr 16, 2019

Hi! I'm writing in the hope that we can can clarify the cases in which F.16 applies vs. the cases in which F.18 or F.19 applies. Without having the terms "forward", "will-move-from", and "in" defined, I'm honestly not sure what the guidelines are recommending in certain cases. Just as one example, consider the following toy:

    struct A {
      vector<int> data;
    };
    
    struct B {
      vector<int> data;
    };
    
    B convert(A a) {
      return B{std::move(a.data)};
    }

Here, which rule applies to the parameter a in convert? Is it an "in" parameter? Is it a "will-move-from" parameter? Is it a "forward" parameter? I'm not sure how to make the distinction, and if pressed I could create arguments for all three options. This is important because without understanding which case we're in, it's impossible to know how I'm supposed to write the convert function in accordance with the guidelines.

Please note, I'm not asking for help specifically writing this convert function, this is just one example. I'm confused about these guidelines in numerous other cases as well. I think for this set of guidelines to be truly useful to me, I need actual definitions for "in", "will-move-from", and "forward".

@jwakely
Copy link
Contributor

jwakely commented Apr 16, 2019

The answer for this example seems pretty obvious. It's not an "in" parameter, because you're not just providing a value as read-only input to the function. It's not a "forward" parameter, because the function is making use of it not just forwarding it to another function. It's obviously a "will move from" parameter, because you're moving from it!

You could write the return statement as

return B{std::move(a).data};

which should make it even more obvious.

F.18 says the parameter should be A&&.

This doesn't seem like a compelling case for additional definitions.

@jwakely
Copy link
Contributor

jwakely commented Apr 16, 2019

To write code that is in accordance with the guidelines you can look at the enforcement rules. If your code would be flagged by a checker that enforces the rule, it's not in accordance. Passing A&& and not moving it or forwarding it? Probably wrong. Passing an expensive-to-copy object by value then not doing anything with the copy that couldn't have been done with a const reference to the original? Probably wrong.

@russellmcc
Copy link
Contributor Author

Not at all disputing that F.18 applies here (as I said, I don't quite understand the guidelines as written), but I still think it would be helpful to clarify the terms. What if I had written it like this?

    struct A {
      vector<int> data;
    };
    
    struct B {
      vector<int> data;
    };
    
    B convert(const A& a) {
      return B{a.data};
    }

In this case, I didn't move from it, so it's not "obviously" a will-move-from parameter.

@jwakely
Copy link
Contributor

jwakely commented Apr 17, 2019

It's obviously not a will-move-from parameter if you aren't moving from it.

I'm struggling to see what the problem is.

The type of parameter is dictated by what the function does with it. If you want to modify the parameter (e.g. by moving it) then obviously don't pass it by reference-to-const.

F.16 in particular has been standard practice for decades, and is just common sense. If you don't want to modify it (e.g. just copy it as in your second example) then don't pass by value unless it's cheap to copy (like an int), and don't pass by reference-to-non-const (neither A& nor A&&) because those suggest to users of the API that you'll modify the parameter, and also unnecessarily restricts what callers can pass to the function (A& means they can't pass non-const lvalues or rvalues, and A&& means they can't pass lvalues).

F.18 deals with cases where you might be trying to choose between defining void sink(T); or void sink(T&&);. The guideline says that if you intend to move from a parameter, pass by rvalue reference to make that intention explicit, rather than passing by value and causing a copy/move at the call site. Some other coding styles prefer to pass by value, so that the function can be called with lvalues or rvalues, but this can lead to inadvertent copies at the call site. By taking T&& the caller has to be explicit about passing an rvalue, e.g. by calling sink(std::move(x)) not just sink(x), or explicitly saying sink(T{x}) if they want to force a copy. The exception is when T is move-only and cheap to move, like unique_ptr, because then there's no risk of sink(x) making an inadvertent copy (you can't do that with a move-only type) and so defining the function as void sink(T) is fine.

@russellmcc
Copy link
Contributor Author

russellmcc commented Apr 17, 2019

Thanks for your help with this - sorry I'm not doing a good job of adequately explaining my confusion - confusion is by nature hard to explain. I really do appreciate your patience in bearing with me.

I'm now starting to believe that perhaps I now understand the guidelines as written, but I wish the guidelines offered more guidance about how to author functions/overload sets such as convert above.

You mentioned, "the type of parameter is dictated by what the function does with it", however I don't think it's that simple: if I'm trying to write a new function/overload set, I can't simply look at what it's doing with the parameter, because the function doesn't exist yet. In the specific case of convert, my goal is to create a B using the data of an A in an efficient way. I'm looking to the guidelines for help translating this concept into the mechanics of C++.

Based on my current (still not totally clear) understanding, the following are all valid ways to write convert:

Choice A: input is an "in parameter"

This is covered by the chart above F.16 in "normal parameter passing", and in the rule F.16 if I decide I'm not in an "advanced use".

B convert(const A& a) {
 return B{a.data};
}

Choice B: input is an "in parameter" but I want to optimize it for advanced uses

This option is called "in & retain copy" in the chart above F.16 in the "advanced" section.

B convert(const A& a) {
 return B{a.data};
}

B convert(A&& a) {
 return B{std::move(a).data};
}

Choice C: input is a "will-move-from" parameter

This is covered by F.18 and "in & move from" in the chart in the advanced section.

B convert(A&& a) {
 return B{std::move(a).data};
}

Choice D: input is a "forward" parameter

This does not appear on the chart, but seems to be allowed by F.19

template <typename A>
B convert(A&& a) {
 return B{std::forward<A>(a).data};
}

So, based on my current understanding of the guidelines, all four of these are in accordance.

However, my goal as an implementor of convert hasn't changed in all four of these options - I just want to provide callers a convenient way of converting between A and B. I'm often in a situation where I'm considering all of these as available options. So, maybe my concern is this: since this choice is subtle and has pretty major performance implications, it would be nice if the guidelines offered a more concrete recommendation on which choice to make. In an ideal world, I'd only have one option recommended by the guidelines.

I guess, getting more specific, the thing that is confusing me is what seem to be overlapping guidelines for "in & retain copy" (F.16) and "in & move from" (F.18) - it seems that the cases where these are appropriate are really, really similar. The only situations I can think of where one would be appropriate but not the other is when a type is either slow-to-move or noncopyable. Most of the time, if one is appropriate, they both would be. So, how do I as the implementor of a function decide to follow the "in & retain copy" guideline (in which case I provide separate T&& and const T& overloads), or to follow the "in & move from" guideline (in which case I provide only a T&&)? I don't see how I as the implementor of a function am supposed to make this choice.

If you're asking me (which I know you aren't!), the best option for when you want to retain an argument like this is providing only a T&& overload, since this avoids hard-to-audit copies at the callsite, which can be major performance problems.

@jwakely
Copy link
Contributor

jwakely commented Apr 17, 2019

So, based on my current understanding of the guidelines, all four of these are in accordance.

I agree.

And I also agree that there's little difference between some of the options you show for this example, and that the choice may be fairly arbitrary (or based on other concerns not described by F.16 etc).

As I see it the main point of the guidelines is to say don't do one of these versions:

// this is not an in-out parameter, should not use A&
B convert(A& a) {
 return B{a.data};
}

// definitely don't move from lvalue references, that's just rude!
B convert(A& a) {
 return B{std::move(a).data};
}

// this is not a cheap to copy type, should not use A
B convert(A a) {
 return B{a.data};
}

// avoids a copy within the function, but might cause one at the call-site
B convert(A a) {
 return B{std::move(a).data};
}

// don't use a forwarding reference when not forwarding
template <typename T>
B convert(T&& a) {
 return B{a.data};
}

Any of your almost-equivalent options would be OK, but these versions would not be.

Maybe the confusion was just that you've already internalized similar rules to F.16 etc. and were looking for expert-level advice, when actually they're trying to offer beginner- and intemediate-level advice.

@russellmcc
Copy link
Contributor Author

Thanks so much for the help. Sounds like we're agreed on the facts, and I totally agree that the guidelines as written do provide value as they disallow some non-optimal approaches that you've laid out.

I'm still feeling that the choice between providing both const T& and T&& overloads and a T&& overload alone is one that could in principle be made in the guidelines, since it feels to me like an arbitrary choice. This would help with codebase consistency, and it may be more clear to beginners to provide a single way of doing things rather than many options. If this doesn't make sense to you or others, I'm happy to drop it as I feel that my confusion has been resolved.

Points in favor of only a T&& overload:

  • Avoids accidental copies at callsite
  • Less boilerplate

Points in favor of both const T& and T&&:

  • Better performance in the copy case - just copies the input rather than copy followed by move

@russellmcc
Copy link
Contributor Author

Another thing that could have helped resolve this confusion earlier for me is a note somewhere that the guidelines provide a lot of freedom on how to write functions. In particular, the chart above F.16 appeared to me to be a sort of decision tree on how to write an overload set; but however as we've discussed there's still plenty of freedom given in exactly how to take parameters.

@jwakely
Copy link
Contributor

jwakely commented Apr 17, 2019

I'm still feeling that the choice between providing both const T& and T&& overloads and a T&& overload alone is one that could in principle be made in the guidelines, since it feels to me like an arbitrary choice.

It is there, tucked away in F.16:

For advanced uses (only), where you really need to optimize for rvalues passed to "input-only" parameters:

  • If the function is going to unconditionally move from the argument, take it by &&. See F.18.
  • If the function is going to keep a copy of the argument, in addition to passing by const& (for lvalues), add an overload that passes the parameter by && (for rvalues) and in the body std::moves it to its destination. Essentially this overloads a "will-move-from"; see F.18.

Another thing that could have helped resolve this confusion earlier for me is a note somewhere that the guidelines provide a lot of freedom on how to write functions.

That's a good suggestion. Could you submit a pull request?

@russellmcc
Copy link
Contributor Author

russellmcc commented Apr 17, 2019

That's a good suggestion. Could you submit a pull request?

Sure! I'll get to it in the next few days.

It is there, tucked away in F.16:

Hrm. Maybe I'm still a bit puzzled. Going back to my convert example, I'd like to disallow either choice B or choice C, as I think the decision to write it either way is arbitrary and making a single choice could increase consistency and reduce confusion/bikeshedding for beginners. However, it seems like both are supported by the note in F.16 - choice C works because it unconditionally moves, and choice B works because it's "keeping a copy" and overloads a will-move-from. What I'm suggesting is we put guidance in to choose one or the other. If it were up to me, I'd suggest something like "If you are consuming data, prefer taking arguments by rvalue reference, and do not provide a const lvalue reference overload".

@hsutter hsutter self-assigned this Apr 25, 2019
@hsutter
Copy link
Contributor

hsutter commented Apr 25, 2019

Editors call: Thanks, looking forward to the PR.

@hsutter hsutter closed this as completed Apr 25, 2019
russellmcc added a commit to russellmcc/CppCoreGuidelines that referenced this issue Apr 28, 2019
As I discussed in isocpp#1407, I found the existing wording confusing as I
expected that the guidelines would provide me with one recommended way
to author function arguments in all situations.  However, the
discussion on the issue clarified that the intent was to allow many
ways to author functions, and only to disallow specific clearly
non-optimal cases.

This is my attempt to reduce that confusion, by explicitly calling out
when such freedoms exist.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants