Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup move semantics and other misc cleanups #3348

Closed
wants to merge 2 commits into from

Conversation

seelabs
Copy link
Collaborator

@seelabs seelabs commented Apr 10, 2020

Clean up some code flagged by a static analysis run

@codecov-io
Copy link

codecov-io commented Apr 10, 2020

Codecov Report

Merging #3348 into develop will decrease coverage by 0.00%.
The diff coverage is 68.42%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3348      +/-   ##
===========================================
- Coverage    70.22%   70.22%   -0.01%     
===========================================
  Files          683      683              
  Lines        54621    54625       +4     
===========================================
+ Hits         38360    38362       +2     
- Misses       16261    16263       +2     
Impacted Files Coverage Δ
src/ripple/app/ledger/Ledger.cpp 78.99% <ø> (ø)
src/ripple/app/misc/TxQ.h 96.66% <ø> (ø)
src/ripple/app/misc/impl/TxQ.cpp 95.98% <ø> (-0.02%) ⬇️
src/ripple/nodestore/impl/Shard.cpp 0.00% <0.00%> (ø)
src/ripple/overlay/impl/OverlayImpl.cpp 29.13% <0.00%> (ø)
src/ripple/overlay/impl/PeerImp.cpp 0.00% <0.00%> (ø)
src/ripple/rpc/impl/ShardArchiveHandler.cpp 62.84% <0.00%> (-0.29%) ⬇️
src/ripple/rpc/impl/TransactionSign.cpp 89.04% <ø> (ø)
src/ripple/shamap/SHAMap.h 96.96% <ø> (ø)
src/ripple/shamap/SHAMapTreeNode.h 100.00% <ø> (ø)
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 023f570...6a69340. Read the comment docs.

@@ -55,7 +55,7 @@ GRPCServerImpl::CallData<Request, Response>::CallData(
, responder_(&ctx_)
, bindListener_(std::move(bindListener))
, handler_(std::move(handler))
, requiredCondition_(std::move(requiredCondition))
, requiredCondition_(requiredCondition)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intended for wider team review:

I'm uncomfortable removing a std::move in a context like this because it will cause future readers to wonder if the missing move on this one argument is perhaps an oversight and should be added. Future readers will continually have to look up the underlying type to discover that a move on it will neither help nor hurt.

Move semantics was designed from day 1 to interoperate with "copy only" types with no negative consequences so that programmers would not have to worry about this issue so much. Especially in generic code, but more generally, even in situations like this.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1377.htm#Cast%20to%20rvalue

The "request to move" semantics turn out to be very handy in generic code. One can request that a type move itself without having to know whether or not the type is really movable. If the type is movable it will move, else if the type is copyable, it will copy, else you will get a compile-time error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure there's a good universal answer for this. I agree that uses of std::move in copy-only situations are benign. I think there are two additional considerations however.

  1. Does the presence of the std::move cause an otherwise useful static checker to produce enough noise that the static checker becomes less useful? [E.g., it generatesstd::move has no effect]. We have to think about keeping our tools useful.

  2. It is possible for there to be a situation where if a move took place there would be a use after move, but that move doesn't currently happen because the type is only copyable. (That's not the situation right here.) So leaving that move in place would become a bug if the type actually becomes moveable in the future.

I think consideration 1 is the most important. If consideration 1 can be worked around then tools and code reviews will probably manage issue 2 for us? But consideration 2 would add risk to making a type moveable if it hasn't been before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scott D tells me that this warning can be disabled, without disabling a warning about use after move, or a warning about moves that cause the code to become more expensive.

I agree that case 2 is a bug that should be fixed (except when that use has no preconditions, e.g. assign-to). And I believe that the tools will catch that.

Another aspect that has come up between Scott and I in private conversations is that this warning is flagging cases where the client signals his intention that he no longer needs the value with move. But the std::move is only a request to move, not a requirement. It is up to the far away client code whether to actually pilfer the object or not. And that can change with the implementation of the far away code, without needing to survey clients. Clients silently get the optimization when the far away code changes, with no semantic or syntax change to the client code. Example:

06740e6#diff-9647085a1d5d0d2cf7b0b835b0e8c742R147

In this case the client is saying I don't need the value of ledger any more, but the RCLValidatedLedger constructor doesn't move from it. But in the RCLValidatedLedger might change to move from it. If that happens, the client gets the optimization automatically and silently. And in the mean time, the move in the client code causes no harm, and signals to the reader the intent: last use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit torn here. On the one hand, I understand that std::move is just a request, and programmers can't expect it to actually mean that the given instance is being moved from. On the other hand, specifying it, especially in a non-generic-programming context, when you know it's not applicable only serves to confuse.

Either way, a future programmer has to go looking at the code to attempt to decipher what's going on and if this is a case where the move was missed or explicitly omitted. In that context, I wonder if it makes sense to use a marker similar to std::move that expressly indicates that the programmer expliictly chose to not use std::move?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the vast majority of cases move is used for two things: 1) as an optimization technique; 2) To pass ownership of unique resources.

Code that's missing move (that compiles) is almost never a bug. I don't think anyone is going to look at code where a statement has a missing move and wonder if there's a bug (although they may wondering if there's a missed optimization opportunity).

Up to now, we haven't put move in code that didn't have a benefit (at least intentionally). If we see a move, we can infer that the type would benefit from a move. Seeing move(requiredCondition), gives a false signal about the type of requiredCondition. I would not have guessed that someone would put a move around an enum (of course, that's a circular argument - it sends a signal about the type because that's how we've used it; you are proposing to change that). The other "cost" of putting in a unneeded move is readability suffers. foo(std::move(a), std::move(b), std::move(c)) is more cluttered foo(a, b, c). If we put a move, I'd like it to have a potential benefit more than the surrounding code uses move.

The move in the RCLValidatedLedger(ledger) that you pointed out is a different case. It would not send the wrong signal about the ledger type, and there is a realistic chance that a future change would take advantage of the move.

My vote is I don't want the move for the enum. I'm weakly against the move for RCLValidatedLedger, but only weakly (I'd prefer moves only where it makes a difference); I do see legitimate advantages of putting it there.

Of course, Howard has thought about move semantics move deeply that any of us. I think I'll add the move back in to the RCLValidatedLedger call - Howard's vote should count move than my "weakly against".

I'll also don't think putting a move around an enum is that big a deal. The "costs" I've mentioned are all minor. While I'm more strongly against that change, I'm OK doing it if you (Howard) feel strongly.

Note: I'm not worried about making the static analyzer happy. It would be a large task and I suspect the code would be worse for the effort.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seelabs Maybe I am misunderstanding how move works, but can't we do something like this?: cjcobb23@92a60d5

Based on my understanding of move semantics, the actual move is only going to happen in the callback at NetworkOPs.cpp:2273; specifically, instead of copying the Blobs into the vector, the data will be moved. For the callback that doesn't move the parameters ( NetworkOPs.cpp:2234), no actual move will occur; the arguments will be passed by reference. So, the callback that does not move will not behave any differently. Feel free to correct me if I am wrong (@HowardHinnant ).

Re: reallocation within soci. The parameters used by soci are actually the txnData and txnMeta variables, which have type soci::blob( link). We then, call the convert function (defined here), to copy data from a soci::blob into a Blob. This convert function first performs a resize on the Blob, and then copies the data from soci::blob into Blob.
The move I am proposing would then move those Blobs into the vector (which is captured by reference in the lambdas), instead of copying into the vector.

Let's consider each case. First, without the move: The Blob objects are being reused, which means that the resize in convert might not perform a reallocation; the object will already own some heap storage from the previous iteration of the loop. Then, a copy is performed (inside convert). Then, the Blob is copied into the result vector, resulting in a copy as well as an allocation; the allocation occurs during the copy constructor of Blob. In summary, there is a possible but unlikely reallocation, followed by a copy, another copy and an allocation.

Now, with the move: The Blob objects that are being reused will not own any heap storage, since at the end of each iteration of the loop, the Blobs will be moved from. So, the resize in convert will always perform an allocation. Then, a copy is performed (inside convert). Then, Blob is moved into the result vector, meaning no copy and no allocation.

Comparing the two:
Without move: possible but unlikely allocation, copy, copy, allocation
With move: allocation, copy
To keep things simple, we can ignore the possible but unlikely allocation in the first case. In this way, the two approaches are identical except for an additional copy if the move is not used.

Sorry in advance if I am mistaken and this is all wrong. I would really like the move to occur in this situation, since the account_tx call is rather expensive, and it would be nice to increase the performance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjcobb23 Very nice! I did not consider passing rvalue references to the callback (I almost always pass sink parameters as values - that way callers can choose to move or copy at their discretion). In this case there's only one caller, and we know we can move there. Love it! (And smacking myself on the head for not thinking of it myself).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjcobb23 Just a note: For the case of the callback on NetworkOPs.cpp:2234, if we did pass by value, that change would be pessimizing for that case (but of course, not for pass by rvalue ref). So we really wouldn't want to do this if you didn't suggest passing by rvalue ref.

Copy link
Collaborator Author

@seelabs seelabs Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjcobb23 fixed in 334821850 [fold] Use rvalue refs on accountTxPage callbacks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the only person who doesn't find std::move around a small value type confusing. I'd be more confused by seeing a list of constructor initializers where some but not all arguments are moved.

I think Howard has already said everything I feel:

  • A std::move that doesn't move doesn't hurt anything, and could help if, one day, it starts to move.
  • A std::move serves as a good indicator of "I'm done with this value", even if it doesn't move.
  • A std::move is not a guarantee of a move.

In my opinion, any confusion surrounding an appearance of std::move is likely caused by the reader's misinterpretation of std::move as an expression of "this value will be moved" instead of "I'm done with this value, and don't care if it is moved or not". Perhaps we could define an alias like give.

nbougalis
nbougalis previously approved these changes Apr 11, 2020
@@ -1726,7 +1727,7 @@ int ApplicationImp::fdRequired() const
int needed = 128;

// 1.5 times the configured peer limit for peer connections:
needed += static_cast<int>(0.5 + (1.5 * overlay_->limit()));
needed += lround(1.5 * overlay_->limit());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just do away with the floating point math here and just use 2 * overlay_->limit() instead.

Configuring additional file descriptors doesn't actually hurt us, with one exception: we hit the hard rlimit and if that happens, we produce a useful message and exit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in bd4cc1711 [fold] Misc fixes:

@@ -55,7 +55,7 @@ GRPCServerImpl::CallData<Request, Response>::CallData(
, responder_(&ctx_)
, bindListener_(std::move(bindListener))
, handler_(std::move(handler))
, requiredCondition_(std::move(requiredCondition))
, requiredCondition_(requiredCondition)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit torn here. On the one hand, I understand that std::move is just a request, and programmers can't expect it to actually mean that the given instance is being moved from. On the other hand, specifying it, especially in a non-generic-programming context, when you know it's not applicable only serves to confuse.

Either way, a future programmer has to go looking at the code to attempt to decipher what's going on and if this is a case where the move was missed or explicitly omitted. In that context, I wonder if it makes sense to use a marker similar to std::move that expressly indicates that the programmer expliictly chose to not use std::move?

@@ -55,7 +55,7 @@ GRPCServerImpl::CallData<Request, Response>::CallData(
, responder_(&ctx_)
, bindListener_(std::move(bindListener))
, handler_(std::move(handler))
, requiredCondition_(std::move(requiredCondition))
, requiredCondition_(requiredCondition)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move semantics was designed from day 1 to interoperate with "copy only" types with no negative consequences so
that programmers would not have to worry about this issue so much.

This is generally how I feel. I see the use of std::move as a request to move, written as an optimization. If the move doesn't happen, its not usually a bug or a problem, unless I really, really need that optimization. For this reason, I am hesitant to remove a move.

However, I agree that a move around an enum, or any type that clearly has no heap storage (int, bool, etc), is confusing to the reader; an actual move could never happen, so using move makes it seem like the programmer made a mistake, makes the intentions of the code unclear and might cause the reader to question whether they know what is going on. However, I think this situation is somewhat of an exception; the other arguments to the constructor are being moved from, so the intention seems clear to me: "Move each argument, if possible". I'm fine if you want to remove the move here, though I don't think its necessary to do so.

However, I am confused why the move is being removed from here: 06740e6#diff-3f6df86ad506fd7e45db86b186e102e2R2273 .
This seems like a perfect use case for move. Why is this being removed? Is the move not actually happening? If so, instead of removing the std::move, we should figure out how to make the move actually happen.

cjcobb23
cjcobb23 previously approved these changes Apr 14, 2020
Copy link
Contributor

@cjcobb23 cjcobb23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@nbougalis nbougalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments. I'd like to see the two comments about addGiveUpdate and addUpdateItem addressed. Beyond that, it's at your discretion.

src/ripple/app/ledger/Ledger.cpp Outdated Show resolved Hide resolved
src/ripple/app/ledger/Ledger.cpp Outdated Show resolved Hide resolved
src/ripple/app/ledger/Ledger.cpp Outdated Show resolved Hide resolved
src/ripple/app/misc/NetworkOPs.cpp Outdated Show resolved Hide resolved
src/ripple/app/misc/impl/AccountTxPaging.cpp Outdated Show resolved Hide resolved
src/ripple/app/misc/impl/TxQ.cpp Outdated Show resolved Hide resolved
@@ -1726,7 +1727,7 @@ int ApplicationImp::fdRequired() const
int needed = 128;

// 1.5 times the configured peer limit for peer connections:
needed += static_cast<int>(0.5 + (1.5 * overlay_->limit()));
needed += lround(1.5 * overlay_->limit());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️

@HowardHinnant
Copy link
Contributor

@HowardHinnant
Copy link
Contributor

Here, https://github.com/ripple/rippled/blob/develop/src/ripple/shamap/impl/SHAMapTreeNode.cpp#L99 , item should now be moved (in 12 similar places).

@seelabs
Copy link
Collaborator Author

seelabs commented Apr 22, 2020

@HowardHinnant I can't comment on top level comments, but all of those top level comments should be fixed in 4e58b1edc [fold] Add some missing std::moves

BTW, I know this needs to be rebased. I expect there will be a new beta tomorrow (the re-formatting beta), so I'll rebase after that beta gets merged.

HowardHinnant
HowardHinnant previously approved these changes Apr 22, 2020
@seelabs
Copy link
Collaborator Author

seelabs commented Apr 24, 2020

Squashed and rebased onto b3

@seelabs
Copy link
Collaborator Author

seelabs commented Apr 28, 2020

rebased onto b4

nbougalis
nbougalis previously approved these changes Apr 28, 2020
@seelabs seelabs added the Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. label Apr 28, 2020
@carlhua carlhua added Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. and removed Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. labels Apr 28, 2020
@seelabs seelabs mentioned this pull request Apr 29, 2020
* Make sure variables are always initialized
* Use lround instead of adding .5 and casting
* Remove some unneeded vars
* Check for null before calling strcmp
* Remove redundant if conditions
* Remove make_TxQ factory function
@seelabs
Copy link
Collaborator Author

seelabs commented May 8, 2020

rebased on b5

@HowardHinnant HowardHinnant self-requested a review May 8, 2020 20:35
This was referenced May 21, 2020
@codecov-commenter
Copy link

Codecov Report

Merging #3348 into develop will decrease coverage by 0.00%.
The diff coverage is 68.42%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3348      +/-   ##
===========================================
- Coverage    70.44%   70.44%   -0.01%     
===========================================
  Files          682      682              
  Lines        54392    54396       +4     
===========================================
+ Hits         38315    38317       +2     
- Misses       16077    16079       +2     
Impacted Files Coverage Δ
src/ripple/app/ledger/Ledger.cpp 78.99% <ø> (ø)
src/ripple/app/misc/TxQ.h 96.66% <ø> (ø)
src/ripple/app/misc/impl/TxQ.cpp 95.98% <ø> (-0.02%) ⬇️
src/ripple/nodestore/impl/Shard.cpp 0.00% <0.00%> (ø)
src/ripple/overlay/impl/OverlayImpl.cpp 28.76% <0.00%> (ø)
src/ripple/overlay/impl/PeerImp.cpp 0.00% <0.00%> (ø)
src/ripple/rpc/impl/ShardArchiveHandler.cpp 62.84% <0.00%> (-0.29%) ⬇️
src/ripple/rpc/impl/TransactionSign.cpp 89.04% <ø> (ø)
src/ripple/shamap/SHAMap.h 96.96% <ø> (ø)
src/ripple/shamap/SHAMapTreeNode.h 100.00% <ø> (ø)
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9771210...ef7c9da. Read the comment docs.

@manojsdoshi manojsdoshi mentioned this pull request May 27, 2020
@seelabs seelabs deleted the cleanups branch May 27, 2020 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants