Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.0] PH: Reliability improvements #1814

Merged
merged 10 commits into from
Nov 2, 2023
Merged

[5.0] PH: Reliability improvements #1814

merged 10 commits into from
Nov 2, 2023

Conversation

heifner
Copy link
Member

@heifner heifner commented Oct 24, 2023

  • Performance harness tests were failing because the last few trxs from the trx_generator were being dropped. Modified the trx_generator to wait for all writes to finish before terminating.
  • Modify the performance harness to wait for the complete test time for the trxs in blocks.
  • Update the performance harness to --print-missing-transactions by default as this is very useful for debugging why the test failed.
  • Updated Cluster.py to support --produce-block-offset-ms 0 for older versions of nodeos.

Still need to run a complete test run on a quiescent machine.

Issue #1690

…sync_write instead of send. Wait for all write callbacks to finish before exit.
…e --print-missing-transactions=True the default as it is useful when there are missing trxs.
@heifner heifner requested review from greg7mdp and linh2931 October 24, 2023 17:01
@heifner heifner added the OCI Work exclusive to OCI team label Oct 24, 2023
@@ -495,7 +495,7 @@ def configureConnections():
scrapeTrxGenTrxSentDataLogs(trxSent, self.trxGenLogDirPath, self.ptbConfig.quiet)
if len(trxSent) != self.ptbConfig.expectedTransactionsSent:
print(f"ERROR: Transactions generated: {len(trxSent)} does not match the expected number of transactions: {self.ptbConfig.expectedTransactionsSent}")
blocksToWait = 2 * self.ptbConfig.testTrxGenDurationSec + 10
blocksToWait = self.data.startBlock + (2 * self.ptbConfig.testTrxGenDurationSec) + 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocksToWait is meant to be the number of blocks to wait after a given block. so the startBlock was not included on purpose. You should see the startBlock being added into the call to waitForTrasactionsInBlockRange in performance_test_basic.py and then in Node.py waitForTransactionsInBlockRange -> overallEndBlock = startBlock + maxFutureBlocks

Copy link
Member Author

@heifner heifner Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I looked at that wrong. It must have been failing because it took more than 10 blocks for the trx generators to fire up as it was consistently failing on not getting the trxs from the blocks when they were there. Since the function is called waitForTransactionsInBlockRange, I think I'll change it to take a range instead of start and offset.

@@ -39,7 +39,7 @@ class TpsTestConfig:
numTrxGensUsed: int = 0
targetTpsPerGenList: List[int] = field(default_factory=list)
quiet: bool = False
printMissingTransactions: bool=False
printMissingTransactions: bool=True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason this was not defaulted to True initially was that if the transaction generators initially fail to connect completely, then the log file is riddled with potentially tens of thousands of log messages for dropped transactions.

@@ -461,6 +461,11 @@ def connectGroup(group, producerNodes, bridgeNodes) :
if "--plugin eosio::history_api_plugin" in args:
argsArr.append("--is-nodeos-v2")
break

# Handle common case of specifying no block offset for older versions
if "v2" in self.nodeosVers or "v3" in self.nodeosVers or "v4" in self.nodeosVers:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a little better, no need to change though.

Suggested change
if "v2" in self.nodeosVers or "v3" in self.nodeosVers or "v4" in self.nodeosVers:
if any(s in self.nodeosVers for s in ["v2", "v3", "v4"]):

@heifner heifner merged commit dafd740 into release/5.0 Nov 2, 2023
@heifner heifner deleted the GH-1690-perf-5.0 branch November 2, 2023 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCI Work exclusive to OCI team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants