tortoise: support multiple smeshers #5087

dshulyak · 2023-09-26T07:52:39Z

consensus state is isolated in the tortoise module.
all changes related to multiple smeshers support should be in the miner module.

for a consistent registration interface please see #5085

computing active set and eligibilities

activeset should be prepared only once per epoch, regardless of the number of registered smeshers.
need to be careful not to create multiple parallel readers where each one will try to prepare its own activeset.

data structure should reflect that we store only one copy of activeset. one way to implement it would be to check on startup if any of the registered smeshers already created an activeset; if so, we cache that activeset. otherwise the code prepares the activeset before going into the parallel part.
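a minimal sketch of the "one copy per epoch" idea, not the actual miner code. `EpochID`, `ATXID`, `activeSetCache` and the `prepare` callback are hypothetical names used only for illustration. the mutex makes concurrent callers wait for the first preparation instead of each starting its own reader.

```go
package main

import "sync"

// EpochID and ATXID are hypothetical stand-ins for the real types.
type EpochID uint32
type ATXID [32]byte

// activeSetCache stores at most one active set, for a single epoch.
type activeSetCache struct {
	mu    sync.Mutex
	epoch EpochID
	set   []ATXID
	ready bool
}

// get returns the active set for the epoch. prepare runs only when nothing is
// cached for that epoch; callers that arrive meanwhile block on the mutex and
// then reuse the cached copy.
func (c *activeSetCache) get(epoch EpochID, prepare func(EpochID) ([]ATXID, error)) ([]ATXID, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.ready && c.epoch == epoch {
		return c.set, nil
	}
	set, err := prepare(epoch)
	if err != nil {
		// nothing is cached on failure, so the next caller retries
		return nil, err
	}
	c.epoch, c.set, c.ready = epoch, set, true
	return set, nil
}
```

on startup the same cache could be seeded from an activeset that one of the registered smeshers already created, so `prepare` is never called for that epoch.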

the other non-parallel part is computing tortoise.EncodeVotes only once per layer, regardless of the registered smeshers.

the parallel part consists of preparing eligibility cache per smesher, selecting txs for proposals and signing transactions.

oracle refactoring

oracle should not store any state. no signers, no eligibilities.
all shared state should be stored on miner instance, the structure should reflect what we compute in parallel and what not.
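a rough sketch of what such a layout could look like, under stated assumptions. all type and function names here (`sharedData`, `signerData`, `miner`, `eligibleSlots`) are hypothetical, and the fnv hash is only a toy placeholder for the real vrf-based eligibility computation. the point is the shape: the oracle call is a pure function of its inputs, and the miner struct separates data computed once from data kept per smesher.

```go
package main

import "hash/fnv"

// NodeID and LayerID are hypothetical stand-ins for the real types.
type NodeID string
type LayerID uint32

// sharedData is computed once and read by every signer.
type sharedData struct {
	beacon    [4]byte
	activeSet []NodeID
}

// signerData is per-smesher state; each entry can be filled independently.
type signerData struct {
	eligibilities map[LayerID]int // eligible slots per layer
}

// miner owns all state, the oracle owns none.
type miner struct {
	shared  sharedData
	signers map[NodeID]*signerData
}

// eligibleSlots is a stateless oracle call: everything it needs is passed in.
// The hash below is a toy placeholder, not the real eligibility rule.
func eligibleSlots(shared sharedData, id NodeID, layer LayerID) int {
	if len(shared.activeSet) == 0 {
		return 0
	}
	h := fnv.New32a()
	h.Write(shared.beacon[:])
	h.Write([]byte(id))
	h.Write([]byte{byte(layer), byte(layer >> 8), byte(layer >> 16), byte(layer >> 24)})
	if h.Sum32()%uint32(len(shared.activeSet)) == 0 {
		return 1
	}
	return 0
}

// eligibility caches the oracle result in the per-signer state.
// Not safe for concurrent use on the same miner without extra locking.
func (m *miner) eligibility(id NodeID, layer LayerID) int {
	sd, ok := m.signers[id]
	if !ok {
		sd = &signerData{eligibilities: map[LayerID]int{}}
		m.signers[id] = sd
	}
	if n, ok := sd.eligibilities[layer]; ok {
		return n
	}
	n := eligibleSlots(m.shared, id, layer)
	sd.eligibilities[layer] = n
	return n
}
```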

skipping vrf/sig validation on self-publish

vrf/sig validation on self-publish is unnecessary work. without skipping it, that validation will require parallelization as well.


…5130)

related: #5113
it drops RefBallot function, and will allow to drop that index in a followup

related: #5106
it eliminates repetitive disk reads and makes potentially expensive calls more transparent. added latency of execution to the logs

related: #5087
refactoring to draw a line between per-smesher data that needs to be loaded once per epoch, and calls to external components. tortoise/mesh hash are reusable for every smesher, get txs is not reusable. it makes adding support for multiple smeshers significantly simpler


closes: #5087

data is separated into shared data (beacon and active set) and signer specific data. both beacon and activeset are used from shared data, until smeshers generated a reference ballot. once any smesher generated a ballot it will be using data recorded in the reference ballot.

build method now loops over all signers (copied at the start of the layer). there are parts that have to be run once for all signers and parts that make sense to run in parallel.

serial parts:
- loading shared data (beacon and active set)
- deciding on mesh hash
- tally votes & encode votes tortoise calls

parallel parts:
- loading data (it can be also run serially, but it was convenient to run it in parallel)
- computing eligibilities (this is done once per node startup)
- selecting txs
- publishing proposal. this is the most important to avoid blocking serially in Publish while it runs validation

worker pool (errgroup) is limited by number of cores as there are no network requests during parallel work.
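the serial/parallel split described above can be sketched roughly as follows. this is an illustration and not the code from the PR: `signer`, `layerShared`, `buildLayer` and both callbacks are hypothetical names, and a buffered channel plus `sync.WaitGroup` from the standard library stands in for the errgroup worker pool so the example has no external dependencies.

```go
package main

import (
	"runtime"
	"sync"
)

// signer and layerShared are hypothetical stand-ins for the real types.
type signer struct{ name string }

type layerShared struct {
	meshHash string
	votes    string
}

// buildLayer runs the serial steps once, then fans out per signer with at
// most runtime.NumCPU() workers in flight. It returns the first error found.
func buildLayer(
	signers []signer,
	loadShared func() (layerShared, error),
	publish func(signer, layerShared) error,
) error {
	// serial part: computed once, reused by every signer
	shared, err := loadShared()
	if err != nil {
		return err
	}

	// parallel part: a buffered channel acts as a counting semaphore
	sem := make(chan struct{}, runtime.NumCPU())
	errs := make([]error, len(signers))
	var wg sync.WaitGroup
	for i, s := range signers {
		wg.Add(1)
		sem <- struct{}{} // blocks while all worker slots are taken
		go func(i int, s signer) {
			defer wg.Done()
			defer func() { <-sem }()
			errs[i] = publish(s, shared)
		}(i, s)
	}
	wg.Wait()

	for _, e := range errs {
		if e != nil {
			return e
		}
	}
	return nil
}
```

each goroutine writes only to its own slot in `errs`, so no extra locking is needed there. bounding by core count fits the note above that the parallel work makes no network requests.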

dshulyak added area/tortoise feat/multi smeshers labels Sep 26, 2023

dshulyak added this to Dev team kanban Sep 26, 2023

dshulyak moved this to 📋 Backlog in Dev team kanban Sep 26, 2023

dshulyak moved this from 📋 Backlog to 🔖 Next in Dev team kanban Sep 27, 2023

pigmej mentioned this issue Oct 3, 2023

Multiple post services per node - multiple ATX spacemeshos/pm#261

Closed

6 tasks

dshulyak self-assigned this Oct 5, 2023

dshulyak moved this from 🔖 Next to 🏗 Doing in Dev team kanban Oct 5, 2023

dshulyak mentioned this issue Oct 5, 2023

[Merged by Bors] - miner: refactor code for multiple smesher and better observability #5130

Closed

dshulyak mentioned this issue Oct 12, 2023

[Merged by Bors] - miner: multi signers #5135

Closed

bors bot closed this as completed in b545097 Oct 16, 2023

github-project-automation bot moved this from 🏗 Doing to ✅ Done in Dev team kanban Oct 16, 2023
