
Catchpoint: Optimize catchpoint #4254

Merged: 33 commits, Aug 23, 2022

Conversation

@ghost ghost commented Jul 12, 2022

Summary

Test Plan

@ghost ghost self-assigned this Jul 12, 2022
@ghost ghost marked this pull request as draft July 12, 2022 16:01
@ghost ghost changed the title Optimize catchpoint Catchpoint: Optimize catchpoint Jul 12, 2022
@@ -330,12 +337,80 @@ func (c *CatchpointCatchupAccessorImpl) processStagingBalances(ctx context.Conte
}

normalizedAccountBalances, err = prepareNormalizedBalancesV6(balances.Balances, c.ledger.GenesisProto())
isNotFinalEntry = make([]bool, len(balances.Balances))
for i, balance := range balances.Balances {
isNotFinalEntry[i] = balance.IsNotFinalEntry
Contributor

Why do we need to store this? The loop below always accesses the current value, i.e. isNotFinalEntry[i], while processing the i'th entry.

Author

The old version, catchpointFileBalancesChunkV5, won't have that field. This way it won't break with old catchpoint versions.

@codecov
codecov bot commented Jul 27, 2022

Codecov Report

Merging #4254 (cfa250d) into master (e4d6d42) will increase coverage by 0.09%.
The diff coverage is 72.13%.

@@            Coverage Diff             @@
##           master    #4254      +/-   ##
==========================================
+ Coverage   55.19%   55.28%   +0.09%     
==========================================
  Files         398      398              
  Lines       50165    50263      +98     
==========================================
+ Hits        27689    27789     +100     
- Misses      20159    20162       +3     
+ Partials     2317     2312       -5     
Impacted Files                      Coverage Δ
ledger/catchupaccessor.go           62.15% <56.36%> (-0.87%) ⬇️
ledger/accountdb.go                 73.02% <75.45%> (+0.44%) ⬆️
ledger/catchpointtracker.go         62.89% <100.00%> (ø)
ledger/catchpointwriter.go          59.13% <100.00%> (+1.47%) ⬆️
ledger/tracker.go                   74.78% <0.00%> (ø)
ledger/acctonline.go                79.41% <0.00%> (+0.52%) ⬆️
catchup/service.go                  70.12% <0.00%> (+0.74%) ⬆️
data/transactions/verify/txn.go     44.64% <0.00%> (+0.89%) ⬆️
catchup/peerSelector.go             100.00% <0.00%> (+1.04%) ⬆️
... and 3 more


@ghost ghost marked this pull request as ready for review July 27, 2022 23:40
}

if moreRows {
// we're done with this iteration.
Contributor

This looks inconsistent: the original implementation supplies accountCount as a batch size, but for resources we limit it in resCb. I think it would be clearer and more consistent if processAllResources accepted a batch size like this function does. In that case we wouldn't need to thread moreRows / chunkFull from the callback down to the readers and back.

@@ -79,14 +85,18 @@ type encodedBalanceRecordV6 struct {
Address basics.Address `codec:"a,allocbound=crypto.DigestSize"`
AccountData msgp.Raw `codec:"b,allocbound=basics.MaxEncodedAccountDataSize"`
Resources map[uint64]msgp.Raw `codec:"c,allocbound=basics.MaxEncodedAccountDataSize"`

// flag indicating whether there are more records for the same account coming up
ExpectingMoreEntries bool `codec:"e"`
Contributor

So, if a client is running a version that doesn't have this change, they will get a message about "unrecognized msgp field e" or similar? Seems fine ... especially if there is a consensus upgrade coming up

@algorandskiy
Contributor

Catchpoint apply code appears broken.

On the feature branch:

~/go/bin-cp-oom/catchpointdump net -l -n betanet.algodev.network -r 19780000 -s
[ Done ] Downloaded http://r4.betanet.algodev.network:4160/v1/betanet-v1.0/ledger/brycg
[ Done ] Loaded
Unable to load catchpoint file into in-memory database : sql: no rows in result set

On master:

catchpointdump net -l -n betanet.algodev.network -r 19780000 -s
[ Done ] Downloaded http://r3.betanet.algodev.network:4160/v1/betanet-v1.0/ledger/brycg
[ Done ] Loaded

@algorandskiy
Contributor

vet: ledger/catchpointwriter_test.go:361:22: undeclared name: ioutil

Verified catchpoint creation and application on betanet; all good.

} else {
var sqliteErr sqlite3.Error
if errors.As(err, &sqliteErr) && sqliteErr.Code == sqlite3.ErrConstraint && sqliteErr.ExtendedCode == sqlite3.ErrConstraintUnique {
// address exists: overflowed account record: find addrid
Contributor

I didn't expect you would rely on the DB to tell you whether you used the same account twice; I assumed you would be able to tell as you went through the bals list and saw a duplicate. But I guess you didn't want to assume the bals list is ordered?

Contributor

Maybe you could add a check here to make sure that the balance.encodedAccountData is empty?

Author

Why should balance.encodedAccountData be empty?

Contributor

doesn't this mean you're on the second chunk for this account, and it's all resources now?

Contributor

The idea is that the base account info is always set, and overflowed entries are just chunks with some more data.
Nicholas had a previous version that tracked the account address from the top (see d895c5d), and it did not look elegant at all.
We do have resource-counting logic, so applying the catchpoint would break if the Total* counters mismatched the actual counts (@nicholasguoalgorand do we have a test for this?), so relying on the DB looks safe here.

Contributor

Oh I see. So the chunks really could come in any order, then, too. OK! But only one of the encodedAccountData blobs would actually make it into the accounts table; hopefully they are all the same.

}
err = callback(addr, aidx, &resData, buf)
count++
if resourceCount > 0 && count == resourceCount {
Contributor

maybe a comment here to explain what this callback(..., true) is for

Contributor

+1 need a comment


// DefaultMaxResourcesPerChunk defines the max number of resources that go in a singular chunk
// 3000000 resources * 300B/resource => roughly max 1GB per chunk
DefaultMaxResourcesPerChunk = 3000000
Contributor

1GB is a lot of RAM to handle a chunk, and it's probably more in practice with the various copying/decoding activity going on... plus there are really big resources (app data) that can be greater than 300B. How about e.g. 10x or 100x smaller?

Contributor
@cce cce Aug 22, 2022

You could also be tracking the size of the encoded resources, e.g. resourceBytesCount, but maybe just setting this to 1000, 5000, 10000 or something would be easier.

Author

changed to 10x smaller

@@ -35,26 +35,32 @@ const (
// BalancesPerCatchpointFileChunk defines the number of accounts that would be stored in each chunk in the catchpoint file.
// note that the last chunk would typically be less than this number.
BalancesPerCatchpointFileChunk = 512

// DefaultMaxResourcesPerChunk defines the max number of resources that go in a singular chunk
// 3000000 resources * 300B/resource => roughly max 1GB per chunk
Contributor

please update the comment as well

@@ -35,26 +35,32 @@ const (
// BalancesPerCatchpointFileChunk defines the number of accounts that would be stored in each chunk in the catchpoint file.
// note that the last chunk would typically be less than this number.
BalancesPerCatchpointFileChunk = 512

// DefaultMaxResourcesPerChunk defines the max number of resources that go in a singular chunk
// 3000000 resources * 300B/resource => roughly max 100MB per chunk
Contributor

this line still has 3,000,000 not 300k
