Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve deterministic transaction sorting in TxQ #4077

Closed
wants to merge 1 commit into from

Conversation

ximinez
Copy link
Collaborator

@ximinez ximinez commented Jan 19, 2022

High Level Overview of Change

This is a follow-up to #4022. That PR added functionality to order the transaction queue by fee descending, then by transaction ID ascending. This PR makes the transaction ordering less predictable by XORing the transaction ID with the parent ledger hash. This is similar to the mechanism used in CanonicalTXSet to randomize the order of accounts, so no account has an advantage. This change randomizes the order of transactions paying the same fee, so no transaction has an advantage.

After weighing several options, I decided that the simplest solution was to leave the data structure intact and only change the comparison function - basically the same type of change made to transaction queue ordering in #4022. The trick to making it work is to reorder the queue whenever the parent ledger hash changes, which happens in TxQ::accept. Again, to keep things simple, the queue data structure (byFee_) is emptied and rebuilt. That structure is intrusive - the objects are actually scoped in the byAccount_ structure - so there is zero risk of anything getting dropped. It turns out this method is fast, too.

As validators upgrade to the version with this change, if the network is under high load, such that there are a lot of transactions in the queues, their initial proposals will diverge, but that won't be a problem because the issues around transactions not being properly deferred have been fixed.

No amendment is needed for this change.

Type of Change

  • [X ] New feature (non-breaking change which adds functionality)

Test Plan

  • Reproduce the conditions that were tested in Address elevated transaction fees. #4022 on a test network.
  • Migrate the test validators one at a time to this version.
  • Verify that there is no additional congestion, and that validators stay in sync.

Copy link
Collaborator

@scottschurr scottschurr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this. The approach looks good. I appreciate the time you took to try out different ways of re-sorting the queued transactions.

I left a few comments for your consideration. I'll be interested to hear your take on them.

@@ -1740,7 +1775,7 @@ TxQ::getTxRequiredFeeAndSeq(
}

std::vector<TxQ::TxDetails>
TxQ::getAccountTxs(AccountID const& account, ReadView const& view) const
TxQ::getAccountTxs(AccountID const& account) const
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for removing the unused argument.

@@ -1761,7 +1796,7 @@ TxQ::getAccountTxs(AccountID const& account, ReadView const& view) const
}

std::vector<TxQ::TxDetails>
TxQ::getTxs(ReadView const& view) const
TxQ::getTxs() const
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spotting! Thanks for removing the unused argument.

Comment on lines 1540 to 1542
// std::cerr << "Changing parent hash for ledger " <<
// view.info().seq << " from " << parentHash_ <<
// " to " << parentHash << std::endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe remove the commented out writes to cerr or turn it into a JLOG?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh geez. I completely forgot to take those out. Whoops. 🤦

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines +650 to +654
return (lhs.txID ^ MaybeTx::parentHashComp) <
(rhs.txID ^ MaybeTx::parentHashComp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doxygen comment preceding this change probably needs to be updated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// In the absence of a "re-sort the list in place" function, this
// was the fastest method tried to repopulate the list.
// Other methods included: create a new list and moving items over one at a
// time, create a new list and merge the old list into it.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate you trying the various techniques and leaving a comment that you did so. That leaves me less inclined to do my own experiments.

BEAST_EXPECT(metrics.minProcessingFeeLevel == expectedMin);
BEAST_EXPECT(metrics.medFeeLevel == expectedMed);
using namespace std::string_literals;
BEAST_EXPECTS(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great change. It's really useful for the error message to display the expected vs actual values.

Since we're in here improving error reporting, I'd like to suggest we take it a step further. We could also report the line that the error comes from. Line 55 (this actual line) is not very useful since there are dozens of lines in the tests that call this error checking facility. We'd like to know which calling line had the error.

That can be accomplished by having the caller pass in their line number. It might look something like...

    checkMetrics(
        int line,
        jtx::Env& env,
        std::size_t expectedCount,
        std::optional<std::size_t> expectedMaxCount,
        std::size_t expectedInLedger,
        std::size_t expectedPerLedger,
        std::uint64_t expectedMinFeeLevel,
        std::uint64_t expectedMedFeeLevel = 256 * 500)

Then we explicitly add the line information to the fail() call, like:

        metrics.txCount == expectedCount
            ? pass()
            : fail(
                  "txCount: "s + std::to_string(metrics.txCount) + "/" +
                      std::to_string(expectedCount),
                  __FILE__,
                  line);

Then callers need to pass the line number in, like...

        checkMetrics(__LINE__, env, 0, flagMaxQueue, 0, flagPerLedger, 256);

You can see what I have in mind in the top-most commit here: https://github.com/scottschurr/rippled/commits/ed-txq-order-lcl

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea, and I'm totally stealing it.

/*
std::cerr << "Acct: " << tx.txn->at(sfAccount) <<
", Seq: " << tx.txn->at(sfSequence) << std::endl;
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why this was put in. But I don't think we want to leave the write to cerr in the code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines 1467 to 1491
/*
std::cerr << "Queued:" << std::endl;
for (auto const& pair : qTxCount1)
{
std::cerr << "\t( " << pair.first << ", " << pair.second <<
" )" << std::endl;
}
std::cerr << std::endl;
std::cerr << "alice: " << alice.id() << ", seq: " << aliceSeq <<
", q: " << qTxCount1[alice.id()] << std::endl;
std::cerr << "bob: " << bob.id() << ", seq: " << bobSeq <<
", q: " << qTxCount1[bob.id()] << std::endl;
std::cerr << "charlie: " << charlie.id() << ", seq: " << charlieSeq <<
", q: " << qTxCount1[charlie.id()] << std::endl;
std::cerr << "daria: " << daria.id() << ", seq: " << dariaSeq <<
", q: " << qTxCount1[daria.id()] << std::endl;
std::cerr << "elmo: " << elmo.id() << ", seq: " << elmoSeq <<
", q: " << qTxCount1[elmo.id()] << std::endl;
std::cerr << "fred: " << fred.id() << ", seq: " << fredSeq <<
", q: " << qTxCount1[fred.id()] << std::endl;
std::cerr << "gwen: " << gwen.id() << ", seq: " << gwenSeq <<
", q: " << qTxCount1[gwen.id()] << std::endl;
std::cerr << "hank: " << hank.id() << ", seq: " << hankSeq <<
", q: " << qTxCount1[hank.id()] << std::endl;
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to remove the commented out writes to cerr.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines 1520 to 1527
/*
std::cerr << "Queued:" << std::endl;
for (auto const& pair : qTxCount1)
{
std::cerr << "\t( " << pair.first << ", " << pair.second <<
" )" << std::endl;
}
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to remove the commented out writes to cerr.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines 1540 to 1547
/*
std::cerr << "Queued:" << std::endl;
for (auto const& pair : qTxCount2)
{
std::cerr << "\t( " << pair.first << ", " << pair.second <<
" )" << std::endl;
}
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to remove the commented out writes to cerr.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Collaborator

@scottschurr scottschurr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks great! Thanks for doing this.

@ximinez ximinez added the Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required. label Feb 25, 2022
@ximinez ximinez force-pushed the txq-order-lcl branch 3 times, most recently from 01f56e2 to a491a84 Compare February 25, 2022 22:14
@ximinez
Copy link
Collaborator Author

ximinez commented Feb 26, 2022

Rebased on to 1.8.5 and squashed.

@ximinez ximinez force-pushed the txq-order-lcl branch 2 times, most recently from d63702d to 69b40a1 Compare February 26, 2022 04:55
* Txs with the same fee level will sort by TxID XORed with the parent
  ledger hash.
* The TxQ is re-sorted after every ledger.
* Attempt to future-proof the TxQ tie breaking test
@manojsdoshi manojsdoshi mentioned this pull request Mar 30, 2022
@ximinez ximinez deleted the txq-order-lcl branch March 31, 2022 18:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready to merge *PR author* thinks it's ready to merge. Has passed code review. Perf sign-off may still be required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants