Reported by @Kweiss on discord:
```
rocketpool_watchtower | 2023/10/23 15:41:19 [Interval 14 Tree] Starting generation of Merkle rewards tree for interval 14.
rocketpool_watchtower | 2023/10/23 15:41:19 [Interval 14 Tree] Found snapshot event: Beacon block 7421279, execution block 18232197
rocketpool_watchtower | 2023/10/23 15:41:19 [Interval 14 Tree] Error getting state for block 18232197: missing trie node e134d1cdbed48ce32c7bd3466af8b787ecb4a90da03c1a908c5418f5d4d16fd6 (path ) state 0xe134d1cdbed48ce32c7bd3466af8b787ecb4a90da03c1a908c5418f5d4d16fd6 is not available
rocketpool_watchtower | 2023/10/23 15:41:19 [Interval 14 Tree] Primary EC cannot retrieve state for historical block 18232197, using archive EC [https://mainnet.infura.io/v3/ABC]
rocketpool_watchtower | 2023/10/23 15:41:20 Getting network state for EL block 18232197, Beacon slot 7421279
rocketpool_watchtower | 2023/10/23 15:41:20 [Interval 14 Tree] error getting state for beacon slot 7421279: error getting network contracts: error executing multicall for contract retrieval: missing trie node e134d1cdbed48ce32c7bd3466af8b787ecb4a90da03c1a908c5418f5d4d16fd6 (path ) state 0xe134d1cdbed48ce32c7bd3466af8b787ecb4a90da03c1a908c5418f5d4d16fd6 is not available
rocketpool_watchtower | 2023/10/23 15:41:20 *** Rewards tree generation failed. ***
```
The first error is uninteresting: we know the local EC is not an archive EC, and the state being queried is 25 days old, far older than the recent state a pruning node retains.
However, kbw is using an Infura endpoint as an archive EC and has configured it correctly in the TUI.
Inspecting `generateRewardsTree` shows us that, despite the archive EC passing its validity checks, we have some subsequent validation that fails to use the new client. Specifically (rocketpool/watchtower/generate-rewards-tree.go, lines 164 to 213 at ffe2085):
t.handleError(fmt.Errorf("%s Error verifying rETH address with Archive EC: %w", err))
return
}
} else {
// No archive node specified
t.handleError(fmt.Errorf("***ERROR*** Primary EC cannot retrieve state for historical block %d and the Archive EC is not specified.", elBlockHeader.Number.Uint64()))
return
}
}
}
// Sanity check the rETH address to make sure the client is working right
ifaddress!=t.cfg.Smartnode.GetRethAddress() {
t.handleError(fmt.Errorf("***ERROR*** Your Primary EC provided %s as the rETH address, but it should have been %s!", address.Hex(), t.cfg.Smartnode.GetRethAddress().Hex()))
t.handleError(fmt.Errorf("***ERROR*** Primary EC cannot retrieve state for historical block %d and the Archive EC is not specified.", elBlockHeader.Number.Uint64()))
However, the inner scope that 'replaces' the client instance with a new one (the archive client) only overwrites the outer-scoped variables `address` and `client`. Therefore, when we check the rETH address on L203, we succeed, but then this fails (lines 209 to 213):
t.handleError(fmt.Errorf("%s error getting state for beacon slot %d: %w", generationPrefix, rewardsEvent.ConsensusBlock.Uint64(), err))
return
}
`t.m.GetStateForSlot(...)` has no reference to either of the updated variables, so it is clear that this call will always fail when the original EC doesn't have the appropriate state.
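To make the scoping trap concrete, here is a minimal, runnable sketch of the same pattern. All names here (`Client`, `Manager`, `task`) are hypothetical stand-ins for illustration, not the actual smartnode types:

```go
package main

import "fmt"

// Client stands in for an execution client binding (hypothetical name).
type Client struct{ name string }

// Manager stands in for state.NetworkStateManager: it captures the client
// it was constructed with and uses it for every later call.
type Manager struct{ client *Client }

func (m *Manager) GetStateForSlot() string { return "state via " + m.client.name }

// task mirrors generateRewardsTree's situation: the manager was built from
// the primary client long before the method runs.
type task struct {
	client *Client
	m      *Manager
}

func (t *task) run() {
	client := t.client
	if client.name != "archive" {
		// "Replace" the client, as the archive-EC fallback does. This only
		// rebinds the local variable; t.m still holds the old client.
		client = &Client{name: "archive"}
	}
	_ = client
	fmt.Println(t.m.GetStateForSlot()) // prints "state via primary"
}

func main() {
	primary := &Client{name: "primary"}
	t := &task{client: primary, m: &Manager{client: primary}}
	t.run()
}
```

The archive client is reachable only through the local variable, so anything that goes through `t.m` silently keeps using the stale client.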
`t.m` is a `state.NetworkStateManager` passed in when `newGenerateRewardsTree(...)` is called. It is created in rocketpool/watchtower/watchtower.go (lines 113 to 116 at ffe2085), and the client it uses comes from the call at watchtower.go line 77, which ultimately lands in shared/services/services.go (lines 221 to 237):
```go
			// Check if the manager should ignore sync checks and/or default to using the fallback (used by the API container when driven by the CLI)
			if c.GlobalBool("ignore-sync-check") {
				ecManager.ignoreSyncCheck = true
			}
			if c.GlobalBool("force-fallbacks") {
				ecManager.primaryReady = false
			}
		}
	})
	return ecManager, err
}
```
which is the generic "I need an EC, please check the primary and the fallback" function; it does not account for watchtower's archive fallback.
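For orientation, the selection logic in that generic manager has roughly the following shape. This is a hedged sketch: `primaryReady` and `ignoreSyncCheck` are the fields the quoted snippet toggles, while `Client`, `fallbackReady`, and `pick` are hypothetical simplifications, not the real ExecutionClientManager API:

```go
package main

import (
	"errors"
	"fmt"
)

// Client is a hypothetical stand-in for an execution client.
type Client struct{ name string }

// ecManager sketches the primary/fallback manager: primaryReady and
// ignoreSyncCheck are the fields toggled in the snippet above; the rest
// is simplified for illustration.
type ecManager struct {
	primary, fallback *Client
	primaryReady      bool
	fallbackReady     bool
	ignoreSyncCheck   bool // would skip the sync-status probes in the real manager
}

// pick chooses whichever client is considered usable. Note there is no
// notion of an archive EC here, which is exactly the gap the watchtower hits.
func (m *ecManager) pick() (*Client, error) {
	if m.primaryReady {
		return m.primary, nil
	}
	if m.fallbackReady {
		return m.fallback, nil
	}
	return nil, errors.New("no execution client is ready")
}

func main() {
	m := &ecManager{primary: &Client{"primary"}, fallback: &Client{"fallback"}, fallbackReady: true}
	client, err := m.pick()
	if err != nil {
		panic(err)
	}
	fmt.Println("using", client.name) // "using fallback": the primary is not ready
}
```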
A very naive solution would be to simply replace `t.m` in `generateRewardsTree`, e.g.:
```diff
diff --git a/rocketpool/watchtower/generate-rewards-tree.go b/rocketpool/watchtower/generate-rewards-tree.go
index e0207089..0f87db33 100644
--- a/rocketpool/watchtower/generate-rewards-tree.go
+++ b/rocketpool/watchtower/generate-rewards-tree.go
@@ -200,6 +200,13 @@ func (t *generateRewardsTree) generateRewardsTree(index uint64) {
 			t.handleError(fmt.Errorf("%s Error verifying rETH address with Archive EC: %w", err))
 			return
 		}
+
+		// Make sure we use the new client
+		t.m, err = state.NewNetworkStateManager(client, t.cfg, ec, t.bc, &t.log)
+		if err != nil {
+			t.handleError(fmt.Errorf("%s Error creating new NetworkStateManager with Archive EC: %w", err))
+			return
+		}
 	} else {
 		// No archive node specified
 		t.handleError(fmt.Errorf("***ERROR*** Primary EC cannot retrieve state for historical block %d and the Archive EC is not specified.", elBlockHeader.Number.Uint64()))
```
BUT I really don't like the side-effects here. The function purity is damaged.
Instead, I'd suggest that `t.m` need not exist; it appears to be used only in this function. We should remove it from the struct and have `generateRewardsTree(...)` unconditionally create a NetworkStateManager according to whichever client is available and error-free:
```diff
diff --git a/rocketpool/watchtower/generate-rewards-tree.go b/rocketpool/watchtower/generate-rewards-tree.go
index e0207089..6485e0e9 100644
--- a/rocketpool/watchtower/generate-rewards-tree.go
+++ b/rocketpool/watchtower/generate-rewards-tree.go
@@ -38,11 +38,10 @@ type generateRewardsTree struct {
 	bc        beacon.Client
 	lock      *sync.Mutex
 	isRunning bool
-	m         *state.NetworkStateManager
 }
 
 // Create generate rewards Merkle Tree task
-func newGenerateRewardsTree(c *cli.Context, logger log.ColorLogger, errorLogger log.ColorLogger, m *state.NetworkStateManager) (*generateRewardsTree, error) {
+func newGenerateRewardsTree(c *cli.Context, logger log.ColorLogger, errorLogger log.ColorLogger) (*generateRewardsTree, error) {
 
 	// Get services
 	cfg, err := services.GetConfig(c)
@@ -82,7 +81,6 @@ func newGenerateRewardsTree(c *cli.Context, logger log.ColorLogger, errorLogger
 		rp:        rp,
 		lock:      lock,
 		isRunning: false,
-		m:         m,
 	}
 
 	return generator, nil
@@ -166,13 +164,24 @@ func (t *generateRewardsTree) generateRewardsTree(index uint64) {
 		return
 	}
 
+	var stateManager *state.NetworkStateManager
+
 	// Try getting the rETH address as a canary to see if the block is available
 	client := t.rp
 	opts := &bind.CallOpts{
 		BlockNumber: elBlockHeader.Number,
 	}
 	address, err := client.RocketStorage.GetAddress(opts, crypto.Keccak256Hash([]byte("contract.addressrocketTokenRETH")))
-	if err != nil {
+	if err == nil {
+
+		// Create the state manager using the primary or fallback (not necessarily archive) EC
+		stateManager, err = state.NewNetworkStateManager(client, t.cfg, t.rp.Client, t.bc, &t.log)
+		if err != nil {
+			t.handleError(fmt.Errorf("%s Error creating new NetworkStateManager: %w", err))
+			return
+		}
+	} else {
+		// Check if an Archive EC is provided, and if using it would potentially resolve the error
 		errMessage := err.Error()
 		t.log.Printlnf("%s Error getting state for block %d: %s", generationPrefix, elBlockHeader.Number.Uint64(), errMessage)
 		if strings.Contains(errMessage, "missing trie node") || // Geth
@@ -200,6 +209,13 @@ func (t *generateRewardsTree) generateRewardsTree(index uint64) {
 			t.handleError(fmt.Errorf("%s Error verifying rETH address with Archive EC: %w", err))
 			return
 		}
+
+		// Create the state manager with the archive EC
+		stateManager, err = state.NewNetworkStateManager(client, t.cfg, ec, t.bc, &t.log)
+		if err != nil {
+			t.handleError(fmt.Errorf("%s Error creating new NetworkStateManager with Archive EC: %w", err))
+			return
+		}
 	} else {
 		// No archive node specified
 		t.handleError(fmt.Errorf("***ERROR*** Primary EC cannot retrieve state for historical block %d and the Archive EC is not specified.", elBlockHeader.Number.Uint64()))
@@ -216,7 +232,7 @@ func (t *generateRewardsTree) generateRewardsTree(index uint64) {
 	}
 
 	// Get the state for the target slot
-	state, err := t.m.GetStateForSlot(rewardsEvent.ConsensusBlock.Uint64())
+	state, err := stateManager.GetStateForSlot(rewardsEvent.ConsensusBlock.Uint64())
 	if err != nil {
 		t.handleError(fmt.Errorf("%s error getting state for beacon slot %d: %w", generationPrefix, rewardsEvent.ConsensusBlock.Uint64(), err))
 		return
diff --git a/rocketpool/watchtower/watchtower.go b/rocketpool/watchtower/watchtower.go
index 9d05f1f4..acf94b80 100644
--- a/rocketpool/watchtower/watchtower.go
+++ b/rocketpool/watchtower/watchtower.go
@@ -171,7 +171,7 @@ func run(c *cli.Context) error {
 	if err != nil {
 		return fmt.Errorf("error during penalties check: %w", err)
 	}*/
-	generateRewardsTree, err := newGenerateRewardsTree(c, log.NewColorLogger(SubmitRewardsTreeColor), errorLog, m)
+	generateRewardsTree, err := newGenerateRewardsTree(c, log.NewColorLogger(SubmitRewardsTreeColor), errorLog)
 	if err != nil {
 		return fmt.Errorf("error during manual tree generation check: %w", err)
 	}
```
which cleanly eliminates the reason this bug appeared in the first place: we persisted too much configuration in pseudo-global variables instead of passing local variables through a functional chain.
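As a closing illustration of that design point, a hedged sketch (hypothetical names, not the actual smartnode API) of constructing the dependency locally from whichever client was actually chosen, rather than persisting it on the struct:

```go
package main

import "fmt"

type Client struct{ name string }

type StateManager struct{ client *Client }

func (s *StateManager) GetStateForSlot() string { return "state via " + s.client.name }

// task keeps only the raw clients; no StateManager is persisted on the struct.
type task struct {
	primary *Client
	archive *Client // nil when no archive EC is configured
}

func (t *task) generate() string {
	// Choose the client first (the real patch only falls back to the
	// archive EC after the primary fails the rETH canary check)...
	client := t.primary
	if t.archive != nil {
		client = t.archive
	}
	// ...then build the manager locally and thread it through the call
	// chain, so the state lookup always uses the client we just picked.
	m := &StateManager{client: client}
	return m.GetStateForSlot()
}

func main() {
	t := &task{primary: &Client{"primary"}, archive: &Client{"archive"}}
	fmt.Println(t.generate()) // prints "state via archive"
}
```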