Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve online delete backend lock #3342

Closed
wants to merge 3 commits into from

Conversation

miguelportilla
Copy link
Contributor

Addresses a possibility in the online delete process where one or more backend shared pointer references may become invalid during rotation.

@nbougalis
Copy link
Contributor

@miguelportilla, this branch is based off of the wrong 1.6.0-b1; your commit needs to apply on top of 4f422f6.

Copy link
Collaborator

@seelabs seelabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice changes, this improves the lock situation quite a bit. Left two minor comments.

@@ -102,19 +148,49 @@ DatabaseRotatingImp::sweep()
std::shared_ptr<NodeObject>
DatabaseRotatingImp::fetchFrom(uint256 const& hash, std::uint32_t seq)
{
Backends b = getBackends();
auto nObj = fetchInternal(hash, b.writableBackend);
auto backends = [&] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good place for a structured binding (here and in the similar lambda in for_each). But fine as-is as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, thanks!

src/ripple/app/misc/SHAMapStoreImp.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@nbougalis nbougalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 but holding off on official signoff for a response to https://github.com/ripple/rippled/pull/3342/files#r409703104

seelabs
seelabs previously approved these changes Apr 17, 2020
Copy link
Collaborator

@seelabs seelabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

nbougalis
nbougalis previously approved these changes Apr 20, 2020
savedState.lastRotated = lastRotated_;
state_db_.setState(savedState);

JLOG(journal_.warn()) << "finished rotation " << validatedSeq;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous locking behavior needs to be there for rotateBackends() and for clearCaches()--the same lock needs to cover them both. For the fullbelow cache, each call to clear() increments an integer called "m_gen" (generation). The generation determines if the cache entry is valid for the existing backends, or if it applies to backends that just rotated.

{
auto oldBackend {std::move(archiveBackend_)};
std::lock_guard lock(mutex_);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be injected by the caller, described above.

savedState.archiveDb = dbRotating_->rotateBackends(
std::move(newBackend));
savedState.lastRotated = lastRotated_;
state_db_.setState(savedState);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setState() should come before rotateBackends(), as in previous behavior. In the event of process crash after setState() but before rotateBackends(), then upon process startup, the correct Backends would be in place because the state database would reflect that, even if they hadn't been rotated within the process before the crash. However, with the new behavior, if the process crashes after rotation, but before the Backend names (filepthss) are persisted, then upon process restart, rippled will expect to see the old Backends.

As for the lock, in previous behavior, setState() and rotateBackend() were both called within a lock. It looks safe to now call setState() without being locked together with rotateBackend().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious: could we "wrap" this logic inside a TRANSACTION?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean a SQLIte transaction, no. Because rotateBackends() is not involved with SQLite, but affects the Backend objects that write to the nodestore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtrippled Good point, addressed.

@mtrippled mtrippled requested a review from nbougalis April 21, 2020 00:47
Copy link
Contributor

@nbougalis nbougalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditional approval, based on @mtrippled's concerns being addressed.

@miguelportilla miguelportilla dismissed stale reviews from nbougalis and seelabs via f66e258 April 30, 2020 17:32
@miguelportilla miguelportilla force-pushed the rotate_fix branch 2 times, most recently from f66e258 to 0d262e6 Compare April 30, 2020 17:37
@codecov-io
Copy link

Codecov Report

Merging #3342 into develop will decrease coverage by 0.01%.
The diff coverage is 46.55%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3342      +/-   ##
===========================================
- Coverage    70.23%   70.21%   -0.02%     
===========================================
  Files          683      683              
  Lines        54630    54648      +18     
===========================================
+ Hits         38370    38372       +2     
- Misses       16260    16276      +16     
Impacted Files Coverage Δ
src/ripple/nodestore/DatabaseRotating.h 100.00% <ø> (ø)
src/ripple/nodestore/impl/DatabaseRotatingImp.h 66.66% <ø> (+20.95%) ⬆️
src/ripple/nodestore/impl/DatabaseRotatingImp.cpp 52.22% <34.04%> (-22.29%) ⬇️
src/ripple/app/misc/SHAMapStoreImp.cpp 74.59% <100.00%> (+0.13%) ⬆️
src/ripple/server/impl/BaseWSPeer.h 68.10% <0.00%> (-0.55%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2bf3b19...c2541ad. Read the comment docs.

Copy link
Collaborator

@mtrippled mtrippled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Collaborator

@seelabs seelabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rotateWithLock with a callback is a decent solution to the lock problem. 👍

@miguelportilla miguelportilla added the Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. label May 1, 2020
@nbougalis nbougalis mentioned this pull request May 3, 2020
@miguelportilla miguelportilla deleted the rotate_fix branch June 2, 2020 20:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants