Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Execute tortoise beacon round only if node is synced at the start of the epoch #2715

Closed

Conversation

dshulyak
Copy link
Contributor

@dshulyak dshulyak commented Aug 24, 2021

Motivation

Tortoise beacon round should be skipped if node is not synced at the start of it.

Tortoise beacon checks that the miner is eligible to send any message during the round (proposal, initial vote, following) using miner weights from the previous epoch. If node doesn't have ATXs at the start of the epoch it is almost guaranteed to produce invalid beacon.

probably related #2699 . will close after verifying

Changes

  • skip tortoise beacon round if we are not synced at the start of it

Test Plan

it addresses race condition that only appears when new node is added to the network. if this works latenodes test must be stabilized even futher.

TODO

  • Explain motivation or link existing issue(s)
  • Test changes and document test plan
  • Update documentation as needed

DevOps Notes

  • This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
  • This PR does not affect public APIs
  • This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
  • This PR does not make changes to log messages (which monitoring infrastructure may rely on)

@dshulyak
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 24, 2021
@bors
Copy link

bors bot commented Aug 24, 2021

try

Build failed:

@dshulyak
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 24, 2021
@bors
Copy link

bors bot commented Aug 24, 2021

try

Build failed:

cmd/node/node.go Show resolved Hide resolved
@dshulyak
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 24, 2021
@bors
Copy link

bors bot commented Aug 24, 2021

try

Build failed:

@@ -156,13 +162,24 @@ type TortoiseBeacon struct {
proposalChans map[types.EpochID]chan *proposalMessageWithReceiptData
}

// SetSyncState updates sync state provider. Must be executed only once.
func (tb *TortoiseBeacon) SetSyncState(sync SyncState) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename SetSyncStateProvider? or it sounds like you are changing the state

@@ -359,6 +376,10 @@ func (tb *TortoiseBeacon) handleEpoch(ctx context.Context, epoch types.EpochID)

return
}
if !tb.sync.IsSynced(ctx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or you can register a channel to be notified when the node turned synced. see RegisterChForSynced.
tho this does not notify you again if the node goes out of sync, i think it serves your purpose here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would be better. but if node goes out of sync again i will need to re-register. and for that, i still need a pointer

@@ -661,7 +661,8 @@ func (app *App) initServices(ctx context.Context,
AlwaysListen: app.Config.AlwaysListen,
}
syncer := syncer.NewSyncer(ctx, syncerConf, clock, msh, layerFetch, app.addLogger(SyncLogger, lg))

// TODO(dshulyak) this needs to be improved, but dependency graph is a bit complicated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try RegisterChForSynced in syncer.go?

@dshulyak
Copy link
Contributor Author

bors try

bors bot added a commit that referenced this pull request Aug 25, 2021
@bors
Copy link

bors bot commented Aug 25, 2021

try

Build failed:

@dshulyak
Copy link
Contributor Author

Anyway this change makes latenodes not less flaky but more, because it misses tortoise beacon in sync consistently now. It can't be merged until we solve that issue.

@antonlerner
Copy link
Contributor

Anyway this change makes latenodes not less flaky but more, because it misses tortoise beacon in sync consistently now. It can't be merged until we solve that issue.

I can propose that for the meantime TB validation will not fail block validation, @nkryuchkov has a new syncing method he's working on that should solve this issue so perhaps it's best to patch it now and wait a couple of days to get full functionality of both TB and TB sync

@dshulyak
Copy link
Contributor Author

I can propose that for the meantime TB validation will not fail block validation, @nkryuchkov has a new syncing method he's working on that should solve this issue so perhaps it's best to patch it now and wait a couple of days to get full functionality of both TB and TB sync

sounds good to me. @nkryuchkov do you see any issues with the proposal?

@nkryuchkov
Copy link
Contributor

@dshulyak no, I don't see any. I did that yesterday in https://github.com/nkryuchkov/go-spacemesh/tree/execute-beacon-only-synced-disable-validation, it increased the values in the test output closer to expected ones, however, there're some still issues in hare (namespace name: pw37j). I tried to disable and mock the tortoise beacon, the test passed, so the issue is in it. I'm fixing hare now

@nkryuchkov
Copy link
Contributor

bors merge

bors bot pushed a commit that referenced this pull request Aug 26, 2021
…the epoch (#2715)

## Motivation

Tortoise beacon round should be skipped if node is not synced at the start of it.

Tortoise beacon checks that the miner is eligible to send any message during the round (proposal, initial vote, following) using miner weights from the previous epoch. If node doesn't have ATXs at the start of the epoch it is almost guaranteed to produce invalid beacon.  

probably related #2699 . will close after verifying

## Changes
- skip tortoise beacon round if we are not synced at the start of it

## Test Plan
it addresses race condition that only appears when new node is added to the network. if this works latenodes test must be stabilized even futher.

## TODO
- [x] Explain motivation or link existing issue(s)
- [ ] Test changes and document test plan
- [ ] Update documentation as needed

## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)


Co-authored-by: Nikita Kryuchkov <[email protected]>
@bors
Copy link

bors bot commented Aug 26, 2021

Build failed:

@nkryuchkov
Copy link
Contributor

bors merge

bors bot pushed a commit that referenced this pull request Aug 26, 2021
…the epoch (#2715)

## Motivation

Tortoise beacon round should be skipped if node is not synced at the start of it.

Tortoise beacon checks that the miner is eligible to send any message during the round (proposal, initial vote, following) using miner weights from the previous epoch. If node doesn't have ATXs at the start of the epoch it is almost guaranteed to produce invalid beacon.  

probably related #2699 . will close after verifying

## Changes
- skip tortoise beacon round if we are not synced at the start of it

## Test Plan
it addresses race condition that only appears when new node is added to the network. if this works latenodes test must be stabilized even futher.

## TODO
- [x] Explain motivation or link existing issue(s)
- [ ] Test changes and document test plan
- [ ] Update documentation as needed

## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)


Co-authored-by: Nikita Kryuchkov <[email protected]>
@nkryuchkov
Copy link
Contributor

bors merge-

@bors
Copy link

bors bot commented Aug 26, 2021

Canceled.

@nkryuchkov
Copy link
Contributor

bors merge

bors bot pushed a commit that referenced this pull request Aug 26, 2021
…the epoch (#2715)

## Motivation

Tortoise beacon round should be skipped if node is not synced at the start of it.

Tortoise beacon checks that the miner is eligible to send any message during the round (proposal, initial vote, following) using miner weights from the previous epoch. If node doesn't have ATXs at the start of the epoch it is almost guaranteed to produce invalid beacon.  

probably related #2699 . will close after verifying

## Changes
- skip tortoise beacon round if we are not synced at the start of it

## Test Plan
it addresses race condition that only appears when new node is added to the network. if this works latenodes test must be stabilized even futher.

## TODO
- [x] Explain motivation or link existing issue(s)
- [ ] Test changes and document test plan
- [ ] Update documentation as needed

## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)


Co-authored-by: Nikita Kryuchkov <[email protected]>
@bors
Copy link

bors bot commented Aug 26, 2021

Build failed:

@nkryuchkov
Copy link
Contributor

bors merge

bors bot pushed a commit that referenced this pull request Aug 26, 2021
…the epoch (#2715)

## Motivation

Tortoise beacon round should be skipped if node is not synced at the start of it.

Tortoise beacon checks that the miner is eligible to send any message during the round (proposal, initial vote, following) using miner weights from the previous epoch. If node doesn't have ATXs at the start of the epoch it is almost guaranteed to produce invalid beacon.  

probably related #2699 . will close after verifying

## Changes
- skip tortoise beacon round if we are not synced at the start of it

## Test Plan
it addresses race condition that only appears when new node is added to the network. if this works latenodes test must be stabilized even futher.

## TODO
- [x] Explain motivation or link existing issue(s)
- [ ] Test changes and document test plan
- [ ] Update documentation as needed

## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)


Co-authored-by: Nikita Kryuchkov <[email protected]>
@bors
Copy link

bors bot commented Aug 26, 2021

Build failed:

@nkryuchkov
Copy link
Contributor

bors merge

bors bot pushed a commit that referenced this pull request Aug 26, 2021
…the epoch (#2715)

## Motivation

Tortoise beacon round should be skipped if node is not synced at the start of it.

Tortoise beacon checks that the miner is eligible to send any message during the round (proposal, initial vote, following) using miner weights from the previous epoch. If node doesn't have ATXs at the start of the epoch it is almost guaranteed to produce invalid beacon.  

probably related #2699 . will close after verifying

## Changes
- skip tortoise beacon round if we are not synced at the start of it

## Test Plan
it addresses race condition that only appears when new node is added to the network. if this works latenodes test must be stabilized even futher.

## TODO
- [x] Explain motivation or link existing issue(s)
- [ ] Test changes and document test plan
- [ ] Update documentation as needed

## DevOps Notes
<!-- Please uncheck these items as applicable to make DevOps aware of changes that may affect releases -->
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)


Co-authored-by: Nikita Kryuchkov <[email protected]>
@bors
Copy link

bors bot commented Aug 26, 2021

Pull request successfully merged into develop.

Build succeeded:

@bors bors bot changed the title Execute tortoise beacon round only if node is synced at the start of the epoch [Merged by Bors] - Execute tortoise beacon round only if node is synced at the start of the epoch Aug 26, 2021
@bors bors bot closed this Aug 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants