Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make chunk assignment immutable #6850

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

tim-barry
Copy link
Contributor

@tim-barry tim-barry commented Jan 7, 2025

The assignment of verifiers to chunks of an execution receipt should be immutable after it is created. Currently it is mutable, with the Assignment.Add() method being used to add chunks (and the verifiers for them) one-by-one during creation. This PR adds a new AssignmentBuilder type and moves the Add() method to it.

Assignment now has no methods that mutate it. However, the initial implementation doesn't do a deep copy of the data, so any changes to the Builder will continue to be reflected in already-built Assignments. No existing code does this or should do this (reuse the Builder after building the assignment) so I believe this isn't a problem.

In addition, the PR converts the type of the mapping from chunks to verifiers from a map[chunk.Index] to a slice, still indexed by the chunk.Index.

Issue: https://github.com/onflow/flow-go-internal/issues/5357

Add a new `AssignmentBuilder` type and move the `Add()` method to it.
`Assignment` now has no methods that mutate it. However, the initial
implementation doesn't do a deep copy of the data, so any changes to
the Builder will continue to be reflected in already-constructed `Assignment`s.

Some existing tests use other functions of the Assignment while still building it;
this behaviour hasn't been changed, and these tests continue to work as before.
@codecov-commenter
Copy link

codecov-commenter commented Jan 7, 2025

Codecov Report

Attention: Patch coverage is 27.27273% with 40 lines in your changes missing coverage. Please review.

Project coverage is 41.18%. Comparing base (c6a1f97) to head (edc05b0).
Report is 8 commits behind head on master.

Files with missing lines Patch % Lines
model/chunks/chunkassignment.go 0.00% 21 Missing ⚠️
engine/verification/utils/unittest/helper.go 41.17% 10 Missing ⚠️
utils/unittest/chain_suite.go 0.00% 6 Missing ⚠️
engine/verification/utils/mocked.go 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6850      +/-   ##
==========================================
+ Coverage   41.15%   41.18%   +0.02%     
==========================================
  Files        2065     2109      +44     
  Lines      182499   185673    +3174     
==========================================
+ Hits        75116    76477    +1361     
- Misses     101077   102788    +1711     
- Partials     6306     6408     +102     
Flag Coverage Δ
unittests 41.18% <27.27%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Cleaned up the code for the test, and in the process discovered that the test
did not do what it claimed to:
Each verifier would be assigned a pair of adjacent chunks, and those pairs
would be overlapping; so the first chunk had one verifier, the next four had
two, the sixth had one, and the last four chunks were not assigned any verifiers.

Updated the test to use (and double-check) round-robin order of chunk assignment.
The specific ordering is not important, just that it is consistent
(and preferably easy to understand).
Per onflow/flow-go-internal#5357 use a slice instead
of a map. This now enforces that chunks are assigned (added to the Builder)
in order of their Index (currently via a panic if the order is incorrect).

Necessitated changes to the helper for TestVerificationHappyPath, which was
not assigning some chunks (skipping odd-numbered chunks). Instead for now
we explicitly assign an empty verifier set.
@tim-barry tim-barry marked this pull request as ready for review January 8, 2025 18:23
@tim-barry
Copy link
Contributor Author

Don't know why the there are suddenly failing engine/access tests in f0e1a5b when the commit just changed code comments; the tests that fail in CI pass when I run tests locally.

model/chunks/chunkassignment.go Outdated Show resolved Hide resolved
model/chunks/chunkassignment.go Outdated Show resolved Hide resolved
model/chunks/chunkassignment.go Show resolved Hide resolved
model/chunks/chunkassignment.go Outdated Show resolved Hide resolved
model/chunks/chunkassignment.go Outdated Show resolved Hide resolved
Comment on lines 48 to 49
// TODO: method should probably error if chunk has unknown index
func (a *Assignment) HasVerifier(chunk *flow.Chunk, identifier flow.Identifier) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you address this TODO as part of the PR as well? There is only one caller.

model/chunks/chunkassignment.go Outdated Show resolved Hide resolved
Comment on lines 34 to 35
// Verifiers returns the list of verifier nodes assigned to a chunk
func (a *Assignment) Verifiers(chunk *flow.Chunk) flow.IdentifierList {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In several existing functions (Verifiers, HasVerifier, and Add) we are passing in a Chunk but only using the Index field. Here, the documentation states returns the list of verifier nodes assigned to a chunk, but we aren't actually checking anything about the chunk besides its index. Really it's up to the caller to make sure the chunk and assignment are associated, otherwise we might return an incorrect verifier set, without an error indicating the failure.

I think passing in just the index, rather than the entire chunk, would make this fact more obvious in the API. If you agree, do you mind making that change here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it makes more sense in the API; done in edc05b0

engine/verification/utils/unittest/helper.go Outdated Show resolved Hide resolved
@jordanschalm
Copy link
Member

Nice work. Most comments are related to documentation, and some additional suggestions for improving this area of code:

@jordanschalm
Copy link
Member

Don't know why the there are suddenly failing engine/access tests in f0e1a5b when the commit just changed code comments; the tests that fail in CI pass when I run tests locally.

The TestMessageIndexTransactionStatusesProviderResponse_HappyPath test was added recently, on Dec 19. It is probably a new flaky test.

Hey @AndriiDiachuk do you have any context on TestMessageIndexTransactionStatusesProviderResponse_HappyPath being flaky? It has failed twice in a row here, even with the 5x retry built into the CI job. It fails when I run multiple times locally as well (20-30% or so):

go test ./engine/access/rest/websockets/data_providers/... -testify.m ^TestMessageIndexTransactionStatusesProviderResponse_HappyPath$ -count 10

@tim-barry feel free to disable/quarantine the test using unittest.SkipUnless and create an issue to address it as a flaky test, to un-block this PR.

tim-barry and others added 3 commits January 9, 2025 12:36
Also ensure the Builder can't modify the assignment after building it,
by revoking its reference.

Co-authored-by: Jordan Schalm <[email protected]>
Don't call .Build() on the AssignmentBuilder more than once
The Assignment and AssignmentBuilder functions only used the chunk.Index of the
chunk they received, so refactoring this makes it clear that the Assignment
only knows about chunk Indexes and not other features of chunks.
Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the very nice work.

I am suggesting to add error returns in a couple places in the AssignmentBuilder. If you incorporate those suggestions, lets make sure we check the error return in all places (goland provides a nice "Find Usages" feature on right-click that makes it very easy to go over all calls for a particular method).
All of my suggestions are under the assumption that the respective changes won't take very long. If you find yourself in a situation where you think that a suggestion will take multiple hours to implement, please let me know and just skip it. 🤞

func NewAssignment() *Assignment {
return &Assignment{
verifiersForChunk: make(map[uint64]map[flow.Identifier]struct{}),
// AssignmentBuilder is a helper to create a new Assignment.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// AssignmentBuilder is a helper to create a new Assignment.
// AssignmentBuilder is a helper to create a new Assignment. It is for constructing
// a single Assignment only and should be discarded after calling `Build()`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me personally, it is helpful when reading code if the methods belonging to a struct are grouped together in the file - that way I can read through the functionality of one struct without having to jump forth and back or skip over methods from other structs. Maybe we could first specify Assignment including all of its methods and then proceed on to the AssignmentBuilder.

[optional aesthetic comment]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: we could remove lines 89 and 90 in

// AssignmentDataPack
//
// AssignmentDataPack provides a storable representation of chunk assignments on
// mempool
type AssignmentDataPack struct {

}
}

// Build constructs and returns the immutable assignment.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer "safe by default" implementations. I admit here it is more a perfectionist's approach with limited practical benefits 😅 🙈

Suggested change
// Build constructs and returns the immutable assignment.
// Build constructs and returns the immutable assignment. The AssignmentBuilder has served its
// purpose and should be discarded after this call. Subsequent calls of any methods will panic.

In order to make this statement true:

Subsequent calls of any methods will panic.

we could include the following sanity check right before line 28

if a.verifiersForChunk == nil {
  panic("method `AssignmentBuilder.Build` has previously been called - do not reuse AssignmentBuilder")	
}

Comment on lines +36 to +39
if chunkIdx >= uint64(len(a.verifiersForChunk)) {
// the chunk does not exist in the assignment, so it has no verifiers
return v
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly worried about this logic, because every chunk should have a verifier assignment, even if the assignment itself is empty. Requesting the assignment for a non-existing chunk is likely a symptom of a bug or state corruption. In either case, there is a severe risk that proceeding would lead to an incorrect result.

I think the most general and safe approach would be to return a sentinel error and let the caller decide what to do.

  1. we typically include sentinel errors at the top of the file:
    var ErrUnknownChunkIndex = errors.New("verifier assignment for invalid chunk requested")
    so users of this package can check for the sentinel with errors.Is(err, chunks.ErrUnknownChunkIndex)
  2. for verifiers, we could write:
    // Verifiers returns the list of verifier nodes assigned to a chunk. The protocol mandates
    // that for each chunk in a block, a verifier assignment exists (though it may be empty) and
    // that each block must at least have one chunk. 
    // If there is no chunk with the specified index, an ErrUnknownChunkIndex is returned. 
    func (a *Assignment) Verifiers(chunkIdx uint64) (flow.IdentifierList, error) {
      if chunkIdx >= uint64(len(a.verifiersForChunk)) {
        // the chunk does not exist in the assignment, so it has no verifiers
        return nil, ErrUnknownChunkIndex
      }
      ⋮

In the corresponding tests, it would be very great to check that the implementation returns the expected error type

Comment on lines +51 to +52
// assign each chunk to exactly one verifier, in round-robin order
// Since there are 2x as many chunks as verifiers, each verifier will have 2 chunks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice update of the comment 🙇 - much clearer now

}
a.verifiersForChunk[chunk.Index] = v
// sorts verifiers list based on their identifier
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we are sorting here ... maps in golang have no determined order.

Suggested change
// sorts verifiers list based on their identifier

}
a.verifiersForChunk[chunk.Index] = v
// sorts verifiers list based on their identifier
a.verifiersForChunk = append(a.verifiersForChunk, verifiers.Lookup())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test

// TestAssignDuplicate tests assign Add duplicate verifiers
func (a *PublicAssignmentTestSuite) TestAssignDuplicate() {
prompted me to come back here.

Formally, the flow protocol mandates that the same verifier is not assigned repeatedly to the same chunk (as this would weaken the protocol's security). With the addition of the AssignmentBuilder, we can now easily enforce this requirement. The code for generating the assignment should already obey this rule (👉 this check). But who knows - there might be a change in the future that accidentally breaks this requirement ... and now we can easily detect such a bug here 🤩.

Suggested change
a.verifiersForChunk = append(a.verifiersForChunk, verifiers.Lookup())
// Formally, the flow protocol mandates that the same verifier is not assigned
// repeatedly to the same chunk (as this would weaken the protocol's security).
vs := verifiers.Lookup()
if len(vs) != len(verifiers) {
return fmt.Errorf("repeated assignment of the same verifier to the same chunk is a violation of protocol rules")
}
a.verifiersForChunk = append(a.verifiersForChunk, vs)

Comment on lines 74 to 75
// TestAssignDuplicate tests assign Add duplicate verifiers
func (a *PublicAssignmentTestSuite) TestAssignDuplicate() {
Copy link
Member

@AlexHentschel AlexHentschel Jan 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should verify that the AssignmentBuilder rejects assignments with repeated verification nodes. I hope that the extra check won't blow up any tests🤞

@@ -36,7 +36,7 @@ func NewApprovalCollector(
) (*ApprovalCollector, error) {
chunkCollectors := make([]*ChunkApprovalCollector, 0, result.Result.Chunks.Len())
for _, chunk := range result.Result.Chunks {
chunkAssignment := assignment.Verifiers(chunk).Lookup()
chunkAssignment := assignment.Verifiers(chunk.Index).Lookup()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This runs in the consensus nodes and therefore I am trying to be cautions about resource consumption:

  • to the best of my understanding, this is the only production code that calls Assignment.Verifiers (all the other places are tests)
  • The assignment internally stores a map for each chunk (ie the assigned verifiers as a set). Upon calling Assignment.Verifiers, we convert the set into a slice and then the ApprovalCollector converts it back into a set here 😅.
  • Arguably, its not a significant resource drain ... but I also think it might be quick to avoid that resource drain. Especially for the consensus nodes, we really try to keep their load light, because the consensus system is very latency sensitive.

I would suggest that Assignment.Verifiers returns a map instead of a slice - we should copy the map to protect the receiver from modifying the Assignment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants