tools: pingpong total latency #4757
Conversation
case st := <-pps.sentTxid:
	if len(txidList) < txidLatencySampleSize {
		index := len(txidList)
		txidList = append(txidList, st.txid)
		byTxid[st.txid] = txidSendTimeIndexed{
			st,
			index,
		}
	} else {
		// random replacement
		evict := rand.Intn(len(txidList))
		delete(byTxid, txidList[evict])
		txidList[evict] = st.txid
		byTxid[st.txid] = txidSendTimeIndexed{
			st,
			evict,
		}
	}
I really like this random replacement scheme; just thinking out loud -- if your sample size is smaller than the number of data points, why not just do a circular buffer? Advantages I see would be that the datapoints would still be well-ordered and you would not be missing any data for the range of time the sample was collected. The way it works now makes it so that the most recent datapoints are most likely to be included and the least recent datapoints are least likely to be included, which would also be the case with a circular buffer.
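For illustration, a minimal sketch of that circular-buffer alternative; txidRing, newTxidRing, and add are hypothetical names, not code from the PR:

// Hypothetical sketch of the circular-buffer alternative suggested above.
// The ring keeps the most recent sampleSize txids, overwriting the oldest
// entry in arrival order, so the retained window is contiguous in time.
type txidRing struct {
	txids []string
	next  int
	full  bool
}

func newTxidRing(sampleSize int) *txidRing {
	return &txidRing{txids: make([]string, sampleSize)}
}

// add stores txid and reports which txid was evicted, if any, so a caller
// could keep a byTxid map in sync the way the PR code does.
func (r *txidRing) add(txid string) (evicted string, ok bool) {
	if r.full {
		evicted, ok = r.txids[r.next], true
	}
	r.txids[r.next] = txid
	r.next = (r.next + 1) % len(r.txids)
	if r.next == 0 {
		r.full = true
	}
	return evicted, ok
}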
if the rate is larger than the buffer, then a circular buffer could lose almost all of the data. With a buffer of 10_000 but 26_000 transactions in a block, it would only know about the most recent transactions and only measure their latency. Better to measure over a longer duration.
sorry why not make it 26000 then?
Old habit from working in RAM-scarce environments. And to make up some more justification: maybe I don't even want to log all of the txns, but just a sample, because we also don't need to process a full 6000 TPS of this data.
shared/pingpong/pingpong.go
Outdated
func (pps *WorkerState) txidLatencyBlockWaiter(ctx context.Context, ac *libgoal.Client) {
	done := ctx.Done()
restart:
	// I wish Go had macros
// something something vim your way to Go Macros :)
shared/pingpong/pingpong.go
Outdated
	fmt.Fprintf(os.Stderr, "block waiter w: %v", err)
	time.Sleep(5 * time.Second)
	goto restart
Looks like this loop feeds blocks to the latencyBlocks handling, which in turn calls time.Now to figure out the latency from the recorded time to block time.
But, looking at this bit here, is it possible that this loop will be sleeping when goal publishes a new block for consumption? If that happens, the time.Now used for the calculation would contain that delay, right?
Since this is error handling, I suspect that we don't really expect small temporary errors like that, but wanted to check anyhow.
reduced the err restart time to 1 second; maybe the error condition will go away and we'll restart the API calls and not oversleep the round change.
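A minimal sketch of that retry shape after the change; blockWaiterLoop, waitForBlock, and handleBlock are placeholder names, not the PR's actual API:

// Hypothetical sketch of the block-waiter retry loop after the change.
// On error we back off for only 1 second, so a transient failure is less
// likely to make the waiter oversleep a round change and inflate the
// measured latency.
func blockWaiterLoop(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			return
		default:
		}
		block, err := waitForBlock(ctx) // placeholder for the real API call
		if err != nil {
			fmt.Fprintf(os.Stderr, "block waiter: %v", err)
			time.Sleep(1 * time.Second) // was 5 * time.Second
			continue
		}
		handleBlock(block) // placeholder: feeds the latencyBlocks handling
	}
}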
Codecov Report
@@ Coverage Diff @@
## master #4757 +/- ##
==========================================
- Coverage 53.63% 52.88% -0.76%
==========================================
Files 432 432
Lines 54058 54166 +108
==========================================
- Hits 28996 28647 -349
- Misses 22813 23243 +430
- Partials 2249 2276 +27
--aout append path
after a bunch of updates I think this is again ready for review and it would be good to get this extra measurement into any new tests
out := pps.latencyOuts[len(pps.latencyOuts)-1]
for {
	select {
	case st := <-pps.sentTxid:
true sampling should be done on the pps.sentTxid writer side. Otherwise 10k samples will be fully overwritten in a few rounds under full TPS.
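For comparison, a minimal sketch of uniform reservoir sampling (Algorithm R) done on the writer side; txidReservoir and offer are hypothetical names, and this assumes math/rand is imported -- it is not code from the PR:

// Hypothetical sketch of uniform reservoir sampling (Algorithm R).
// Every txid offered so far stays in the sample with equal probability
// sampleSize/seen, so the sample is not biased toward recent rounds.
type txidReservoir struct {
	sample []string
	seen   int
}

func (r *txidReservoir) offer(txid string, sampleSize int) {
	r.seen++
	if len(r.sample) < sampleSize {
		r.sample = append(r.sample, txid)
		return
	}
	// keep the new txid with probability sampleSize/seen, replacing a
	// uniformly chosen existing slot
	if j := rand.Intn(r.seen); j < sampleSize {
		r.sample[j] = txid
	}
}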
circleci is dumb
circleci is still dumb
Summary
Measure the total latency of a transaction: from the moment the txn send API returns to the moment we see the txn in a committed block.
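Conceptually the measurement reduces to the following sketch; recordLatencies is a hypothetical helper, not the PR's actual code:

// Hypothetical sketch of the measurement described above: sendTime holds
// time.Now() captured when the send API returned for each txid; when a
// committed block arrives, the elapsed time for any txid found in it is
// that transaction's total latency.
func recordLatencies(sendTime map[string]time.Time, blockTxids []string) {
	for _, txid := range blockTxids {
		if t0, ok := sendTime[txid]; ok {
			fmt.Printf("%s total latency: %v\n", txid, time.Since(t0))
			delete(sendTime, txid)
		}
	}
}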
Test Plan
This is a test. Run on a local private cluster and maybe an AWS test cluster.