Move node key to config directory and enable loading of multiple identities (#5592)

## Motivation

This completes the implementation of multi-smeshing support.

Related: spacemeshos/post#270
fasmat committed Mar 4, 2024
1 parent c3f3793 commit 6a0ea84
Showing 46 changed files with 1,760 additions and 882 deletions.
77 changes: 72 additions & 5 deletions CHANGELOG.md
@@ -18,6 +18,12 @@ encrypted connection between the post service and the node over insecure connect

Smeshers using the default setup with a supervised post service do not need to make changes to their node configuration.

#### Fully migrated local state into `node_state.sql`

With this release the node has fully migrated its local state into `node_state.sql`. During the first start after the
upgrade the node will migrate the data from disk and store it in the database. This change also allows the PoST data
directory to be set to read only after the migration is complete, as the node will no longer write to it.

#### New poets configuration

Upgrading requires changes in config and in CLI flags (if not using the default).
@@ -92,17 +98,65 @@ configuration is as follows:
}
```

#### Extend go-spacemesh with option to manage multiple identities/PoST services

**NOTE:** This is a new feature, not yet supported by Smapp and possibly subject to change. Please use with caution.

A node can now manage multiple identities and their life cycle. This reduces the amount of data that needs to be
broadcast to and fetched from the network, and reduces the amount of data that needs to be stored locally, because only one
database is needed for all identities instead of one for each.

To ensure you are eligible for rewards of any given identity, the associated PoST service must be running and connected
to the node during the cyclegap set in the node's configuration. After successfully broadcasting the ATX and registering
at a PoET server, the PoST services can be stopped; only the node has to stay online.

This change moves the private key associated with an identity from the PoST data directory to the node's data directory,
into the folder `identities` (i.e. if `state.sql` is in folder `data` the keys will now be stored in `data/identities`).
The node will automatically migrate the `key.bin` file from the PoST data directory during the first startup and copy
it to the new location as `identity.key`. The content of the file stays unchanged (the hex-encoded private key of the identity).
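
The copy step described above can be pictured roughly as follows. This is a minimal sketch, not the node's actual migration code: the function names `migrateKey` and `identityKeyPath`, the file permissions, and the refusal to overwrite an existing key are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// identityKeyPath returns where a key called name lives below the node's
// data directory, i.e. <dataDir>/identities/<name>.key.
func identityKeyPath(dataDir, name string) string {
	return filepath.Join(dataDir, "identities", name+".key")
}

// migrateKey (hypothetical helper) copies <postDir>/key.bin to
// <dataDir>/identities/identity.key without changing its content.
// The source file is left in place and an existing key is never overwritten.
func migrateKey(postDir, dataDir string) error {
	src := filepath.Join(postDir, "key.bin")
	content, err := os.ReadFile(src)
	if errors.Is(err, fs.ErrNotExist) {
		return nil // nothing to migrate
	}
	if err != nil {
		return fmt.Errorf("read %s: %w", src, err)
	}
	dst := identityKeyPath(dataDir, "identity")
	if err := os.MkdirAll(filepath.Dir(dst), 0o700); err != nil {
		return fmt.Errorf("create identities dir: %w", err)
	}
	// O_EXCL makes the create fail if the destination already exists.
	f, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o600)
	if err != nil {
		return fmt.Errorf("create %s: %w", dst, err)
	}
	if _, err := f.Write(content); err != nil {
		f.Close()
		return fmt.Errorf("write %s: %w", dst, err)
	}
	return f.Close()
}

func main() {
	// Demo in throwaway directories; real paths are up to the operator.
	postDir, err := os.MkdirTemp("", "post")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(postDir)
	dataDir, err := os.MkdirTemp("", "data")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dataDir)

	// Placeholder content, not a real key.
	if err := os.WriteFile(filepath.Join(postDir, "key.bin"), []byte("00ff"), 0o600); err != nil {
		panic(err)
	}
	if err := migrateKey(postDir, dataDir); err != nil {
		panic(err)
	}
	fmt.Println("migrated to", identityKeyPath(dataDir, "identity"))
}
```

Copying rather than moving keeps the original `key.bin` as a fallback, which matches the "copy it to the new location" wording above.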

##### Adding new identities/PoST services to a node

To add a new identity to a node, initialize PoST data with `postcli` and let it generate a new private key for you:

```shell
./postcli -provider=2 -numUnits=4 -datadir=/path/to/data \
-commitmentAtxId=c230c51669d1fcd35860131e438e234726b2bd5f9adbbd91bd88a718e7e98ecb
```

Make sure to replace `provider` with your provider of choice and `numUnits` with the number of PoST units you want to
initialize. The `commitmentAtxId` is the commitment ATX ID for the identity you want to initialize. For details on the
usage of `postcli` please refer to [postcli README](https://github.com/spacemeshos/post/cmd/postcli/README.md).

During initialization `postcli` will generate a new private key and store it in the PoST data directory as `key.bin`.
Copy this file to your `data/identities` directory and rename it to `xxx.key` where `xxx` is a unique identifier for
the identity. The node will automatically pick up the new identity and manage its lifecycle after a restart.
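
Before restarting the node it can be useful to check that every file in `data/identities` holds a well-formed key and that no key appears twice. The sketch below is a standalone helper, not part of go-spacemesh; the names `decodeKey` and `loadIdentities` are hypothetical, and it assumes the file content is a hex-encoded ed25519 private key of 64 bytes (128 hex characters), possibly followed by whitespace.

```go
package main

import (
	"crypto/ed25519"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// decodeKey parses the hex-encoded content of a key file.
// Assumption: the key is an ed25519 private key (64 bytes, 128 hex characters).
func decodeKey(content string) (ed25519.PrivateKey, error) {
	raw, err := hex.DecodeString(strings.TrimSpace(content))
	if err != nil {
		return nil, fmt.Errorf("decode hex: %w", err)
	}
	if len(raw) != ed25519.PrivateKeySize {
		return nil, fmt.Errorf("unexpected key length %d", len(raw))
	}
	return ed25519.PrivateKey(raw), nil
}

// loadIdentities reads every *.key file in dir and returns a map from the
// hex-encoded public key to the file it came from. Two files holding the
// same key are reported as an error.
func loadIdentities(dir string) (map[string]string, error) {
	files, err := filepath.Glob(filepath.Join(dir, "*.key"))
	if err != nil {
		return nil, err
	}
	ids := make(map[string]string, len(files))
	for _, path := range files {
		content, err := os.ReadFile(path)
		if err != nil {
			return nil, fmt.Errorf("read %s: %w", path, err)
		}
		key, err := decodeKey(string(content))
		if err != nil {
			return nil, fmt.Errorf("%s: %w", path, err)
		}
		pub := hex.EncodeToString(key.Public().(ed25519.PublicKey))
		if other, ok := ids[pub]; ok {
			return nil, fmt.Errorf("%s and %s contain the same key", other, path)
		}
		ids[pub] = path
	}
	return ids, nil
}

func main() {
	dir := filepath.Join("data", "identities") // example location used in the text
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	ids, err := loadIdentities(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for pub, path := range ids {
		fmt.Printf("%s  %s\n", pub, path)
	}
}
```

Run it with the identities directory as the only argument; it prints one line per key file and exits with an error if a file is malformed or duplicated.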

Set up the `post-service` [binary](https://github.com/spacemeshos/post-rs/releases) or
[docker image](https://hub.docker.com/r/spacemeshos/post-service/tags) with the data and configure it to connect to your
node. For details refer to the [post-service README](https://github.com/spacemeshos/post-rs/blob/main/service/README.md).

##### Migrating existing identities/PoST services to a node

If you have multiple nodes running and want to migrate to use only one node for all identities:

1. Stop all nodes.
2. Copy the `key.bin` files from the PoST data directories of all nodes to the data directory of the node you want to
use for all identities and into the folder `data/identities`. Rename the files to `xxx.key` where `xxx` is a unique
identifier for each identity.
3. Start the node managing the identities.
4. For every identity, set up a post service to use the existing PoST data for that identity and connect to the node.
For details refer to the [post-service README](https://github.com/spacemeshos/post-rs/blob/main/service/README.md).

**WARNING:** DO NOT run multiple nodes with the same identity at the same time. This will result in an equivocation
and permanent ineligibility for rewards.

### Highlights

* [#5293](https://github.com/spacemeshos/go-spacemesh/pull/5293) change poet servers configuration
The config now takes the poet server address and its public key. See the [Upgrade Information](#new-poets-configuration)
for details.

* [#5219](https://github.com/spacemeshos/go-spacemesh/pull/5219) Migrate data from `nipost_builder_state.bin` to `node_state.sql`.

The node will automatically migrate the data from disk and store it in the database. The migration will take place at the
first startup after the upgrade.

* [#5390](https://github.com/spacemeshos/go-spacemesh/pull/5390)
Distributed PoST verification.

@@ -111,12 +165,25 @@ configuration is as follows:
If a node finds a proof invalid, it will report it to the network by
creating a malfeasance proof. The malicious node will then be blacklisted by the network.

* [#5592](https://github.com/spacemeshos/go-spacemesh/pull/5592)
Extend node with option to have multiple PoST services connect. This allows users to run multiple PoST services,
without the need to run multiple nodes. A node can now manage multiple identities and will manage the lifecycle of
those identities.
To collect rewards for every identity, the associated PoST service must be running and connected to the node during
the cyclegap set in the node's configuration.

### Features

### Improvements

* [#5219](https://github.com/spacemeshos/go-spacemesh/pull/5219) Migrate data from `nipost_builder_state.bin` to `node_state.sql`.

The node will automatically migrate the data from disk and store it in the database. The migration will take place at the
first startup after the upgrade.

* [#5418](https://github.com/spacemeshos/go-spacemesh/pull/5418) Add `grpc-post-listener` to separate post service from
`grpc-private-listener` and not require mTLS for the post service.

* [#5465](https://github.com/spacemeshos/go-spacemesh/pull/5465)
Add an option to cache SQL query results. This is useful for nodes with high peer counts.

72 changes: 50 additions & 22 deletions activation/activation.go
@@ -59,6 +59,7 @@ const (
// Config defines configuration for Builder.
type Config struct {
GoldenATXID types.ATXID
LabelsPerUnit uint64
RegossipInterval time.Duration
}

@@ -68,8 +69,7 @@ type Config struct {
type Builder struct {
accountLock sync.RWMutex
coinbaseAccount types.Address
goldenATXID types.ATXID
regossipInterval time.Duration
conf Config
cdb *datastore.CachedDB
localDB *localsql.Database
publisher pubsub.Publisher
@@ -143,8 +143,7 @@ func NewBuilder(
b := &Builder{
parentCtx: context.Background(),
signers: make(map[types.NodeID]*signing.EdSigner),
goldenATXID: conf.GoldenATXID,
regossipInterval: conf.RegossipInterval,
conf: conf,
cdb: cdb,
localDB: localDB,
publisher: publisher,
@@ -165,11 +164,11 @@ func (b *Builder) Register(sig *signing.EdSigner) {
b.smeshingMutex.Lock()
defer b.smeshingMutex.Unlock()
if _, exists := b.signers[sig.NodeID()]; exists {
b.log.Error("signing key already registered", zap.Stringer("id", sig.NodeID()))
b.log.Error("signing key already registered", log.ZShortStringer("id", sig.NodeID()))
return
}

b.log.Info("registered signing key", zap.Stringer("id", sig.NodeID()))
b.log.Info("registered signing key", log.ZShortStringer("id", sig.NodeID()))
b.signers[sig.NodeID()] = sig

if b.stop != nil {
@@ -213,11 +212,11 @@ func (b *Builder) startID(ctx context.Context, sig *signing.EdSigner) {
b.run(ctx, sig)
return nil
})
if b.regossipInterval == 0 {
if b.conf.RegossipInterval == 0 {
return
}
b.eg.Go(func() error {
ticker := time.NewTicker(b.regossipInterval)
ticker := time.NewTicker(b.conf.RegossipInterval)
defer ticker.Stop()
for {
select {
@@ -253,7 +252,7 @@ func (b *Builder) StopSmeshing(deleteFiles bool) error {
var resetErr error
for _, sig := range b.signers {
if err := b.nipostBuilder.ResetState(sig.NodeID()); err != nil {
b.log.Error("failed to reset builder state", log.ZShortStringer("nodeId", sig.NodeID()), zap.Error(err))
b.log.Error("failed to reset builder state", log.ZShortStringer("id", sig.NodeID()), zap.Error(err))
err = fmt.Errorf("reset builder state for id %s: %w", sig.NodeID().ShortString(), err)
resetErr = errors.Join(resetErr, err)
continue
@@ -277,13 +276,13 @@ func (b *Builder) SmesherIDs() []types.NodeID {
return maps.Keys(b.signers)
}

func (b *Builder) buildInitialPost(ctx context.Context, nodeId types.NodeID) error {
func (b *Builder) buildInitialPost(ctx context.Context, nodeID types.NodeID) error {
// Generate the initial POST if we don't have an ATX...
if _, err := b.cdb.GetLastAtx(nodeId); err == nil {
if _, err := b.cdb.GetLastAtx(nodeID); err == nil {
return nil
}
// ...and if we haven't stored an initial post yet.
_, err := nipost.InitialPost(b.localDB, nodeId)
_, err := nipost.InitialPost(b.localDB, nodeID)
switch {
case err == nil:
b.log.Info("load initial post from db")
@@ -296,14 +295,10 @@ func (b *Builder) buildInitialPost(ctx context.Context, nodeId types.NodeID) err

// Create the initial post and save it.
startTime := time.Now()
post, postInfo, err := b.nipostBuilder.Proof(ctx, nodeId, shared.ZeroChallenge)
post, postInfo, err := b.nipostBuilder.Proof(ctx, nodeID, shared.ZeroChallenge)
if err != nil {
return fmt.Errorf("post execution: %w", err)
}
metrics.PostDuration.Set(float64(time.Since(startTime).Nanoseconds()))
public.PostSeconds.Set(float64(time.Since(startTime)))
b.log.Info("created the initial post")

initialPost := nipost.Post{
Nonce: post.Nonce,
Indices: post.Indices,
@@ -313,7 +308,23 @@ func (b *Builder) buildInitialPost(ctx context.Context, nodeId types.NodeID) err
CommitmentATX: postInfo.CommitmentATX,
VRFNonce: *postInfo.Nonce,
}
return nipost.AddInitialPost(b.localDB, nodeId, initialPost)
err = b.validator.Post(ctx, nodeID, postInfo.CommitmentATX, post, &types.PostMetadata{
Challenge: shared.ZeroChallenge,
LabelsPerUnit: postInfo.LabelsPerUnit,
}, postInfo.NumUnits)
if err != nil {
b.log.Error("initial POST is invalid", log.ZShortStringer("smesherID", nodeID), zap.Error(err))
if err := nipost.RemoveInitialPost(b.localDB, nodeID); err != nil {
b.log.Fatal("failed to remove initial post", log.ZShortStringer("smesherID", nodeID), zap.Error(err))
}
return fmt.Errorf("initial POST is invalid: %w", err)
}

metrics.PostDuration.Set(float64(time.Since(startTime).Nanoseconds()))
public.PostSeconds.Set(float64(time.Since(startTime)))
b.log.Info("created the initial post")

return nipost.AddInitialPost(b.localDB, nodeID, initialPost)
}

func (b *Builder) run(ctx context.Context, sig *signing.EdSigner) {
@@ -379,7 +390,7 @@ func (b *Builder) run(ctx context.Context, sig *signing.EdSigner) {
}
}

func (b *Builder) buildNIPostChallenge(ctx context.Context, nodeID types.NodeID) (*types.NIPostChallenge, error) {
func (b *Builder) BuildNIPostChallenge(ctx context.Context, nodeID types.NodeID) (*types.NIPostChallenge, error) {
select {
case <-ctx.Done():
return nil, ctx.Err()
@@ -451,6 +462,23 @@ func (b *Builder) buildNIPostChallenge(ctx context.Context, nodeID types.NodeID)
if err != nil {
return nil, fmt.Errorf("get initial post: %w", err)
}
b.log.Info("verifying the initial post")
initialPost := &types.Post{
Nonce: post.Nonce,
Indices: post.Indices,
Pow: post.Pow,
}
err = b.validator.Post(ctx, nodeID, post.CommitmentATX, initialPost, &types.PostMetadata{
Challenge: shared.ZeroChallenge,
LabelsPerUnit: b.conf.LabelsPerUnit,
}, post.NumUnits)
if err != nil {
b.log.Error("initial POST is invalid", log.ZShortStringer("smesherID", nodeID), zap.Error(err))
if err := nipost.RemoveInitialPost(b.localDB, nodeID); err != nil {
b.log.Fatal("failed to remove initial post", log.ZShortStringer("smesherID", nodeID), zap.Error(err))
}
return nil, fmt.Errorf("initial POST is invalid: %w", err)
}
challenge = &types.NIPostChallenge{
PublishEpoch: current + 1,
Sequence: 0,
@@ -498,7 +526,7 @@ func (b *Builder) Coinbase() types.Address {

// PublishActivationTx attempts to publish an atx, it returns an error if an atx cannot be created.
func (b *Builder) PublishActivationTx(ctx context.Context, sig *signing.EdSigner) error {
challenge, err := b.buildNIPostChallenge(ctx, sig.NodeID())
challenge, err := b.BuildNIPostChallenge(ctx, sig.NodeID())
if err != nil {
return err
}
@@ -630,7 +658,7 @@ func (b *Builder) getPositioningAtx(ctx context.Context, nodeID types.NodeID) (t
ctx,
b.cdb,
nodeID,
b.goldenATXID,
b.conf.GoldenATXID,
b.validator,
b.log,
VerifyChainOpts.AssumeValidBefore(time.Now().Add(-b.postValidityDelay)),
@@ -639,7 +667,7 @@ func (b *Builder) getPositioningAtx(ctx context.Context, nodeID types.NodeID) (t
)
if errors.Is(err, sql.ErrNotFound) {
b.log.Info("using golden atx as positioning atx")
return b.goldenATXID, nil
return b.conf.GoldenATXID, nil
}
return id, err
}
50 changes: 46 additions & 4 deletions activation/activation_multi_test.go
@@ -222,15 +222,33 @@ func TestRegossip(t *testing.T) {

func Test_Builder_Multi_InitialPost(t *testing.T) {
tab := newTestBuilder(t, 5, WithPoetConfig(PoetConfig{PhaseShift: layerDuration * 4}))

var eg errgroup.Group
for _, sig := range tab.signers {
sig := sig
eg.Go(func() error {
numUnits := uint32(12)

post := &types.Post{
Indices: types.RandomBytes(10),
Nonce: rand.Uint32(),
Pow: rand.Uint64(),
}
meta := &types.PostMetadata{
Challenge: shared.ZeroChallenge,
LabelsPerUnit: tab.conf.LabelsPerUnit,
}

commitmentATX := types.RandomATXID()
tab.mValidator.EXPECT().Post(gomock.Any(), sig.NodeID(), commitmentATX, post, meta, numUnits).Return(nil)
tab.mnipost.EXPECT().Proof(gomock.Any(), sig.NodeID(), shared.ZeroChallenge).Return(
&types.Post{Indices: make([]byte, 10)},
post,
&types.PostInfo{
CommitmentATX: types.RandomATXID(),
CommitmentATX: commitmentATX,
Nonce: new(types.VRFPostIndex),
NumUnits: numUnits,
NodeID: sig.NodeID(),
LabelsPerUnit: tab.conf.LabelsPerUnit,
},
nil,
)
@@ -249,7 +267,6 @@ func Test_Builder_Multi_InitialPost(t *testing.T) {
func Test_Builder_Multi_HappyPath(t *testing.T) {
layerDuration := 2 * time.Second
tab := newTestBuilder(t, 3, WithPoetConfig(PoetConfig{PhaseShift: layerDuration * 4, CycleGap: layerDuration}))
tab.regossipInterval = 0 // disable regossip for testing

// step 1: build initial posts
initialPostChan := make(chan struct{})
@@ -264,12 +281,23 @@ func Test_Builder_Multi_HappyPath(t *testing.T) {
Nonce: rand.Uint32(),
Pow: rand.Uint64(),

NumUnits: 4,
NumUnits: uint32(12),
CommitmentATX: types.RandomATXID(),
VRFNonce: types.VRFPostIndex(rand.Uint64()),
}
initialPost[sig.NodeID()] = &nipost

post := &types.Post{
Indices: nipost.Indices,
Nonce: nipost.Nonce,
Pow: nipost.Pow,
}
meta := &types.PostMetadata{
Challenge: shared.ZeroChallenge,
LabelsPerUnit: tab.conf.LabelsPerUnit,
}
tab.mValidator.EXPECT().Post(gomock.Any(), sig.NodeID(), nipost.CommitmentATX, post, meta, nipost.NumUnits).
Return(nil)
tab.mnipost.EXPECT().Proof(gomock.Any(), sig.NodeID(), shared.ZeroChallenge).DoAndReturn(
func(ctx context.Context, _ types.NodeID, _ []byte) (*types.Post, *types.PostInfo, error) {
<-initialPostChan
@@ -283,6 +311,7 @@ func Test_Builder_Multi_HappyPath(t *testing.T) {
NumUnits: nipost.NumUnits,
CommitmentATX: nipost.CommitmentATX,
Nonce: &nipost.VRFNonce,
LabelsPerUnit: tab.conf.LabelsPerUnit,
}

return post, postInfo, nil
@@ -315,6 +344,19 @@ func Test_Builder_Multi_HappyPath(t *testing.T) {
return postGenesisEpoch.FirstLayer() + 1
},
)

nipost := initialPost[sig.NodeID()]
post := &types.Post{
Indices: nipost.Indices,
Nonce: nipost.Nonce,
Pow: nipost.Pow,
}
meta := &types.PostMetadata{
Challenge: shared.ZeroChallenge,
LabelsPerUnit: tab.conf.LabelsPerUnit,
}
tab.mValidator.EXPECT().Post(gomock.Any(), sig.NodeID(), nipost.CommitmentATX, post, meta, nipost.NumUnits).
Return(nil)
}

// step 3: create ATX