diff --git a/docs/docs/building-apps/02-app-mempool.md b/docs/docs/building-apps/02-app-mempool.md index 51c76a4c9d75..031d065641d1 100644 --- a/docs/docs/building-apps/02-app-mempool.md +++ b/docs/docs/building-apps/02-app-mempool.md @@ -2,7 +2,7 @@ sidebar_position: 1 --- -# Application mempool +# Application Mempool :::note Synopsis This sections describes how the app-side mempool can be used and replaced. diff --git a/docs/docs/building-apps/03-app-upgrade.md b/docs/docs/building-apps/03-app-upgrade.md index d60e781177bc..f90c7b385230 100644 --- a/docs/docs/building-apps/03-app-upgrade.md +++ b/docs/docs/building-apps/03-app-upgrade.md @@ -2,7 +2,7 @@ sidebar_position: 1 --- -# Application upgrade +# Application Upgrade :::note This document describes how to upgrade your application. If you are looking specifically for the changes to perform between SDK versions, see the [SDK migrations documentation](https://docs.cosmos.network/main/migrations/intro). @@ -12,6 +12,149 @@ This document describes how to upgrade your application. If you are looking spec This section is currently incomplete. Track the progress of this document [here](https://github.com/cosmos/cosmos-sdk/issues/11504). ::: +:::note + +### Pre-requisite Reading + +* [`x/upgrade` Documentation](https://docs.cosmos.network/main/modules/upgrade) + +::: + +## General Workflow + +Let's assume we are running v0.38.0 of our software in our testnet and want to upgrade to v0.40.0. +How would this look in practice? First of all, we want to finalize the v0.40.0 release candidate +and there install a specially named upgrade handler (eg. "testnet-v2" or even "v0.40.0"). An upgrade +handler should be defined in a new version of the software to define what migrations +to run to migrate from the older version of the software. Naturally, this is app-specific rather +than module specific, and must be defined in `app.go`, even if it imports logic from various +modules to perform the actions. 
You can register them with `upgradeKeeper.SetUpgradeHandler`
+during the app initialization (before starting the ABCI server), and they serve not only to
+perform a migration, but also to identify whether this is the old or new version (e.g. presence of
+a handler registered for the named upgrade).
+
+Once the release candidate along with an appropriate upgrade handler is frozen,
+we can have a governance vote to approve this upgrade at some future block height (e.g. 200000).
+This is known as an `upgrade.Plan`. The v0.38.0 code will not know of this handler, but will
+continue to run until block 200000, when the plan kicks in at `BeginBlock`. It will check
+for the existence of the handler and, finding it missing, know that it is running the obsolete software,
+and gracefully exit.
+
+Generally the application binary will restart on exit, but then will execute this `BeginBlocker`
+again and exit, causing a restart loop. Either the operator can manually install the new software,
+or you can make use of an external watcher daemon to possibly download and then switch binaries,
+also potentially doing a backup. The SDK tool for doing so is called [Cosmovisor](https://docs.cosmos.network/main/tooling/cosmovisor).
+
+When the binary restarts with the upgraded version (here v0.40.0), it will detect we have registered the
+"testnet-v2" upgrade handler in the code, and realize it is the new version. It then will run the upgrade handler
+and *migrate the database in-place*. Once finished, it marks the upgrade as done, and continues processing
+the rest of the block as normal. Once more than 2/3 of the voting power has upgraded, the blockchain will immediately
+resume the consensus mechanism. If the majority of operators add a custom `do-upgrade` script, this should
+be a matter of minutes and not even require them to be awake at that time.
+
+## Integrating With An App
+
+Set up an upgrade Keeper for the app and then define a `BeginBlocker` that calls the upgrade
+keeper's `BeginBlocker` method:
+
+```go
+func (app *myApp) BeginBlocker(ctx sdk.Context, req abci.RequestBeginBlock) (abci.ResponseBeginBlock, error) {
+	app.upgradeKeeper.BeginBlocker(ctx, req)
+	return abci.ResponseBeginBlock{}, nil
+}
+```
+
+The app must then integrate the upgrade keeper with its governance module as appropriate. The governance module
+should call `ScheduleUpgrade` to schedule an upgrade and `ClearUpgradePlan` to cancel a pending upgrade.
+
+## Performing Upgrades
+
+Upgrades can be scheduled at a predefined block height. Once this block height is reached, the
+existing software will cease to process ABCI messages and a new version with code that handles the upgrade must be deployed.
+All upgrades are coordinated by a unique upgrade name that cannot be reused on the same blockchain. In order for the upgrade
+module to know that the upgrade has been safely applied, a handler with the name of the upgrade must be installed.
+Here is an example handler for an upgrade named "my-fancy-upgrade":
+
+```go
+app.upgradeKeeper.SetUpgradeHandler("my-fancy-upgrade", func(ctx sdk.Context, plan upgrade.Plan) {
+	// Perform any migrations of the state store needed for this upgrade
+})
+```
+
+This upgrade handler performs the dual function of alerting the upgrade module that the named upgrade has been applied,
+as well as providing the opportunity for the upgraded software to perform any necessary state migrations. Both the halt
+(with the old binary) and applying the migration (with the new binary) are enforced in the state machine. Actually
+switching the binaries is an ops task and is not handled inside the SDK / ABCI app.
+
+Here is sample code to set store migrations with an upgrade:
+
+```go
+// this configures a no-op upgrade handler for the "my-fancy-upgrade" upgrade
+app.UpgradeKeeper.SetUpgradeHandler("my-fancy-upgrade", func(ctx sdk.Context, plan upgrade.Plan) {
+	// upgrade changes here
+})
+upgradeInfo, err := app.UpgradeKeeper.ReadUpgradeInfoFromDisk()
+if err != nil {
+	// handle error
+}
+if upgradeInfo.Name == "my-fancy-upgrade" && !app.UpgradeKeeper.IsSkipHeight(upgradeInfo.Height) {
+	storeUpgrades := store.StoreUpgrades{
+		Renamed: []store.StoreRename{{
+			OldKey: "foo",
+			NewKey: "bar",
+		}},
+		Deleted: []string{},
+	}
+	// configure store loader that checks if version == upgradeHeight and applies store upgrades
+	app.SetStoreLoader(upgrade.UpgradeStoreLoader(upgradeInfo.Height, &storeUpgrades))
+}
+```
+
+## Halt Behavior
+
+Before halting the ABCI state machine in the `BeginBlocker` method, the upgrade module will log an error
+that looks like:
+
+```text
+  UPGRADE "<Name>" NEEDED at height <Height>: <Info>
+```
+
+where `Name` and `Info` are the values of the respective fields on the upgrade Plan.
+
+To perform the actual halt of the blockchain, the upgrade keeper simply panics, which prevents the ABCI state machine
+from proceeding but doesn't actually exit the process. Exiting the process can cause issues for other nodes that start
+to lose connectivity with the exiting nodes, thus this module prefers to just halt but not exit.
+
+## Automation
+
+Read more about [Cosmovisor](https://docs.cosmos.network/main/tooling/cosmovisor), the tool for automating upgrades.
+
+## Canceling Upgrades
+
+There are two ways to cancel a planned upgrade: with on-chain governance or off-chain social consensus.
+For the first one, there is a `CancelSoftwareUpgrade` governance proposal, which can be voted on and will
+remove the scheduled upgrade plan. Of course this requires that the upgrade was known to be a bad idea
+well before the upgrade itself, to allow time for a vote.
If you want to allow such a possibility, you
+should set the upgrade height to be `2 * (votingperiod + depositperiod) + (safety delta)` from the beginning of
+the first upgrade proposal. Safety delta is the time available between the success of an upgrade proposal
+and the realization that it was a bad idea (due to external testing). You can also start a `CancelSoftwareUpgrade`
+proposal while the original `SoftwareUpgrade` proposal is still being voted upon, as long as the voting
+period ends after the `SoftwareUpgrade` proposal.
+
+However, let's assume that we don't realize the upgrade has a bug until shortly before it will occur
+(or while we try it out, hitting some panic in the migration). It would seem the blockchain is stuck,
+but we need to allow an escape for social consensus to overrule the planned upgrade. To do so, there's
+a `--unsafe-skip-upgrades` flag to the start command, which will cause the node to mark the upgrade
+as done upon hitting the planned upgrade height(s), without halting and without actually performing a migration.
+If validators holding over two-thirds of the voting power run their nodes with this flag on the old binary, it will allow the chain to continue through
+the upgrade with a manual override. (This must be well-documented for anyone syncing from genesis later on).
+
+Example:
+
+```shell
+ start --unsafe-skip-upgrades ...
+```
+
 ## Pre-Upgrade Handling
 
 Cosmovisor supports custom pre-upgrade handling. Use pre-upgrade handling when you need to implement application config changes that are required in the newer version before you perform the upgrade.
@@ -37,24 +180,24 @@ Here is a sample structure of the `pre-upgrade` command: ```go func preUpgradeCommand() *cobra.Command { - cmd := &cobra.Command{ - Use: "pre-upgrade", - Short: "Pre-upgrade command", + cmd := &cobra.Command{ + Use: "pre-upgrade", + Short: "Pre-upgrade command", Long: "Pre-upgrade command to implement custom pre-upgrade handling", - Run: func(cmd *cobra.Command, args []string) { + Run: func(cmd *cobra.Command, args []string) { - err := HandlePreUpgrade() + err := HandlePreUpgrade() - if err != nil { - os.Exit(30) - } + if err != nil { + os.Exit(30) + } - os.Exit(0) + os.Exit(0) - }, - } + }, + } - return cmd + return cmd } ``` @@ -62,8 +205,10 @@ Ensure that the pre-upgrade command has been registered in the application: ```go rootCmd.AddCommand( - // .. - preUpgradeCommand(), - // .. - ) + // .. + preUpgradeCommand(), + // .. + ) ``` + +When not using Cosmovisor, ensure to run ` pre-upgrade` before starting the application binary. diff --git a/docs/docs/migrations/01-intro.md b/docs/docs/migrations/01-intro.md index b27b294ea297..47c5c245a9ab 100644 --- a/docs/docs/migrations/01-intro.md +++ b/docs/docs/migrations/01-intro.md @@ -8,7 +8,7 @@ To smoothen the update to the latest stable release, the SDK includes a CLI comm Additionally, the SDK includes in-place migrations for its core modules. These in-place migrations are useful to migrate between major releases. * Hard-fork migrations are supported from the last major release to the current one. -* In-place module migrations are supported from the last two major releases to the current one. +* [In-place module migrations](https://docs.cosmos.network/main/core/upgrade#overwriting-genesis-functions) are supported from the last two major releases to the current one. Migration from a version older than the last two major releases is not supported. 
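Reviewer sketch (not part of the patch): the sample `pre-upgrade` command in `03-app-upgrade.md` exits with 0 on success and 30 on error, and the Cosmovisor README text touched by this patch says `pre-upgrade` is called again after exit status 31, up to `DAEMON_PREUPGRADE_MAX_RETRIES` times. The retry policy this implies can be modelled roughly as below. The helper `runPreUpgrade` and the constant names are hypothetical, and treating every other non-zero code as a failure is an assumption; this is a toy model, not Cosmovisor's implementation.

```go
package main

import "fmt"

// Exit codes taken from the patch text; the constant names are illustrative.
const (
	exitOK    = 0  // pre-upgrade succeeded
	exitFail  = 30 // code the sample command uses when HandlePreUpgrade errors
	exitRetry = 31 // code after which pre-upgrade is called again
)

// runPreUpgrade is a hypothetical helper. It calls run once, then again
// after each exitRetry result, allowing at most maxRetries extra calls.
// It reports how many calls were made and whether one of them succeeded.
func runPreUpgrade(run func() int, maxRetries int) (calls int, ok bool) {
	for {
		calls++
		switch code := run(); {
		case code == exitOK:
			return calls, true
		case code == exitRetry && calls <= maxRetries:
			// Retry budget left: loop and call run again.
		default:
			// Assumption: any other code (including exitFail) fails the upgrade.
			return calls, false
		}
	}
}

func main() {
	// A command that asks for a retry twice and then succeeds.
	codes := []int{exitRetry, exitRetry, exitOK}
	i := 0
	flaky := func() int { c := codes[i]; i++; return c }

	calls, ok := runPreUpgrade(flaky, 2)
	fmt.Println(calls, ok) // prints "3 true"
}
```

With the default of zero retries, a single exit status of 31 already fails the upgrade in this model, which matches the README wording that the retry count defaults to `0`.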
diff --git a/tools/cosmovisor/README.md b/tools/cosmovisor/README.md index e53166ccd311..564040b33dd7 100644 --- a/tools/cosmovisor/README.md +++ b/tools/cosmovisor/README.md @@ -88,7 +88,7 @@ All arguments passed to `cosmovisor run` will be passed to the application binar * `DAEMON_POLL_INTERVAL` (*optional*, default 300 milliseconds), is the interval length for polling the upgrade plan file. The value must be a duration (e.g. `1s`). * `DAEMON_DATA_BACKUP_DIR` option to set a custom backup directory. If not set, `DAEMON_HOME` is used. * `UNSAFE_SKIP_BACKUP` (defaults to `false`), if set to `true`, upgrades directly without performing a backup. Otherwise (`false`, default) backs up the data before trying the upgrade. The default value of false is useful and recommended in case of failures and when a backup needed to rollback. We recommend using the default backup option `UNSAFE_SKIP_BACKUP=false`. -* `DAEMON_PREUPGRADE_MAX_RETRIES` (defaults to `0`). The maximum number of times to call `pre-upgrade` in the application after exit status of `31`. After the maximum number of retries, Cosmovisor fails the upgrade. +* `DAEMON_PREUPGRADE_MAX_RETRIES` (defaults to `0`). The maximum number of times to call [`pre-upgrade`](https://docs.cosmos.network/main/building-apps/app-upgrade#pre-upgrade-handling) in the application after exit status of `31`. After the maximum number of retries, Cosmovisor fails the upgrade. * `COSMOVISOR_DISABLE_LOGS` (defaults to `false`). If set to true, this will disable Cosmovisor logs (but not the underlying process) completely. This may be useful, for example, when a Cosmovisor subcommand you are executing returns a valid JSON you are then parsing, as logs added by Cosmovisor make this output not a valid JSON. 
### Folder Layout diff --git a/types/module/module.go b/types/module/module.go index 1764cf457a46..769465db72b7 100644 --- a/types/module/module.go +++ b/types/module/module.go @@ -637,7 +637,7 @@ type VersionMap map[string]uint64 // return app.mm.RunMigrations(ctx, cfg, fromVM) // }) // -// Please also refer to docs/core/upgrade.md for more information. +// Please also refer to https://docs.cosmos.network/main/core/upgrade for more information. func (m Manager) RunMigrations(ctx sdk.Context, cfg Configurator, fromVM VersionMap) (VersionMap, error) { c, ok := cfg.(*configurator) if !ok { diff --git a/x/upgrade/README.md b/x/upgrade/README.md index d1be1b2c8c2a..698e822ba866 100644 --- a/x/upgrade/README.md +++ b/x/upgrade/README.md @@ -41,12 +41,6 @@ may contain various metadata about the upgrade, typically application specific upgrade info to be included on-chain such as a git commit that validators could automatically upgrade to. -#### Sidecar Process - -If an operator running the application binary also runs a sidecar process to assist -in the automatic download and upgrade of a binary, the `Info` allows this process to -be seamless. This tool is [Cosmovisor](https://github.com/cosmos/cosmos-sdk/tree/main/tools/cosmovisor#readme). - ```go type Plan struct { Name string @@ -55,6 +49,12 @@ type Plan struct { } ``` +#### Sidecar Process + +If an operator running the application binary also runs a sidecar process to assist +in the automatic download and upgrade of a binary, the `Info` allows this process to +be seamless. This tool is [Cosmovisor](https://github.com/cosmos/cosmos-sdk/tree/main/tools/cosmovisor#readme). + ### Handler The `x/upgrade` module facilitates upgrading from major version X to major version Y. 
To diff --git a/x/upgrade/doc.go b/x/upgrade/doc.go index 4ebab204f725..a66ab41ea52d 100644 --- a/x/upgrade/doc.go +++ b/x/upgrade/doc.go @@ -7,140 +7,6 @@ Without software support for upgrades, upgrading a live chain is risky because a their state machines at exactly the same point in the process. If this is not done correctly, there can be state inconsistencies which are hard to recover from. -# General Workflow - -Let's assume we are running v0.38.0 of our software in our testnet and want to upgrade to v0.40.0. -How would this look in practice? First of all, we want to finalize the v0.40.0 release candidate -and there install a specially named upgrade handler (eg. "testnet-v2" or even "v0.40.0"). An upgrade -handler should be defined in a new version of the software to define what migrations -to run to migrate from the older version of the software. Naturally, this is app-specific rather -than module specific, and must be defined in `app.go`, even if it imports logic from various -modules to perform the actions. You can register them with `upgradeKeeper.SetUpgradeHandler` -during the app initialization (before starting the abci server), and they serve not only to -perform a migration, but also to identify if this is the old or new version (eg. presence of -a handler registered for the named upgrade). - -Once the release candidate along with an appropriate upgrade handler is frozen, -we can have a governance vote to approve this upgrade at some future block height (e.g. 200000). -This is known as an upgrade.Plan. The v0.38.0 code will not know of this handler, but will -continue to run until block 200000, when the plan kicks in at BeginBlock. It will check -for existence of the handler, and finding it missing, know that it is running the obsolete software, -and gracefully exit. - -Generally the application binary will restart on exit, but then will execute this BeginBlocker -again and exit, causing a restart loop. 
Either the operator can manually install the new software, -or you can make use of an external watcher daemon to possibly download and then switch binaries, -also potentially doing a backup. An example of such a daemon is https://github.com/cosmos/cosmos-sdk/tree/main/cosmovisor -described below under "Automation". - -When the binary restarts with the upgraded version (here v0.40.0), it will detect we have registered the -"testnet-v2" upgrade handler in the code, and realize it is the new version. It then will run the upgrade handler -and *migrate the database in-place*. Once finished, it marks the upgrade as done, and continues processing -the rest of the block as normal. Once 2/3 of the voting power has upgraded, the blockchain will immediately -resume the consensus mechanism. If the majority of operators add a custom `do-upgrade` script, this should -be a matter of minutes and not even require them to be awake at that time. - -# Integrating With An App - -Setup an upgrade Keeper for the app and then define a BeginBlocker that calls the upgrade -keeper's BeginBlocker method: - - func (app *myApp) BeginBlocker(ctx sdk.Context, req abci.RequestBeginBlock) (abci.ResponseBeginBlock, error) { - app.upgradeKeeper.BeginBlocker(ctx, req) - return abci.ResponseBeginBlock{}, nil - } - -The app must then integrate the upgrade keeper with its governance module as appropriate. The governance module -should call ScheduleUpgrade to schedule an upgrade and ClearUpgradePlan to cancel a pending upgrade. - -# Performing Upgrades - -Upgrades can be scheduled at a predefined block height. Once this block height is reached, the -existing software will cease to process ABCI messages and a new version with code that handles the upgrade must be deployed. -All upgrades are coordinated by a unique upgrade name that cannot be reused on the same blockchain. 
In order for the upgrade -module to know that the upgrade has been safely applied, a handler with the name of the upgrade must be installed. -Here is an example handler for an upgrade named "my-fancy-upgrade": - - app.upgradeKeeper.SetUpgradeHandler("my-fancy-upgrade", func(ctx sdk.Context, plan upgrade.Plan) { - // Perform any migrations of the state store needed for this upgrade - }) - -This upgrade handler performs the dual function of alerting the upgrade module that the named upgrade has been applied, -as well as providing the opportunity for the upgraded software to perform any necessary state migrations. Both the halt -(with the old binary) and applying the migration (with the new binary) are enforced in the state machine. Actually -switching the binaries is an ops task and not handled inside the sdk / abci app. - -Here is a sample code to set store migrations with an upgrade: - - // this configures a no-op upgrade handler for the "my-fancy-upgrade" upgrade - app.UpgradeKeeper.SetUpgradeHandler("my-fancy-upgrade", func(ctx sdk.Context, plan upgrade.Plan) { - // upgrade changes here - }) - - upgradeInfo, err := app.UpgradeKeeper.ReadUpgradeInfoFromDisk() - if err != nil { - // handle error - } - - if upgradeInfo.Name == "my-fancy-upgrade" && !app.UpgradeKeeper.IsSkipHeight(upgradeInfo.Height) { - storeUpgrades := store.StoreUpgrades{ - Renamed: []store.StoreRename{{ - OldKey: "foo", - NewKey: "bar", - }}, - Deleted: []string{}, - } - - // configure store loader that checks if version == upgradeHeight and applies store upgrades - app.SetStoreLoader(upgrade.UpgradeStoreLoader(upgradeInfo.Height, &storeUpgrades)) - } - -# Halt Behavior - -Before halting the ABCI state machine in the BeginBlocker method, the upgrade module will log an error -that looks like: - - UPGRADE "" NEEDED at height : - -where Name are Info are the values of the respective fields on the upgrade Plan. 
- -To perform the actual halt of the blockchain, the upgrade keeper simply panics which prevents the ABCI state machine -from proceeding but doesn't actually exit the process. Exiting the process can cause issues for other nodes that start -to lose connectivity with the exiting nodes, thus this module prefers to just halt but not exit. - -# Automation and Plan.Info - -We have deprecated calling out to scripts, instead with propose https://github.com/cosmos/cosmos-sdk/tree/main/cosmovisor -as a model for a watcher daemon that can launch simd as a subprocess and then read the upgrade log message -to swap binaries as needed. You can pass in information into Plan.Info according to the format -specified here https://github.com/cosmos/cosmos-sdk/tree/main/cosmovisor/README.md#auto-download . -This will allow a properly configured cosmsod daemon to auto-download new binaries and auto-upgrade. -As noted there, this is intended more for full nodes than validators. - -# Canceling Upgrades - -There are two ways to cancel a planned upgrade - with on-chain governance or off-chain social consensus. -For the first one, there is a CancelSoftwareUpgrade proposal type, which can be voted on and will -remove the scheduled upgrade plan. Of course this requires that the upgrade was known to be a bad idea -well before the upgrade itself, to allow time for a vote. If you want to allow such a possibility, you -should set the upgrade height to be 2 * (votingperiod + depositperiod) + (safety delta) from the beginning of -the first upgrade proposal. Safety delta is the time available from the success of an upgrade proposal -and the realization it was a bad idea (due to external testing). You can also start a CancelSoftwareUpgrade -proposal while the original SoftwareUpgrade proposal is still being voted upon, as long as the voting -period ends after the SoftwareUpgrade proposal. 
- -However, let's assume that we don't realize the upgrade has a bug until shortly before it will occur -(or while we try it out - hitting some panic in the migration). It would seem the blockchain is stuck, -but we need to allow an escape for social consensus to overrule the planned upgrade. To do so, there's -a --unsafe-skip-upgrades flag to the start command, which will cause the node to mark the upgrade -as done upon hitting the planned upgrade height(s), without halting and without actually performing a migration. -If over two-thirds run their nodes with this flag on the old binary, it will allow the chain to continue through -the upgrade with a manual override. (This must be well-documented for anyone syncing from genesis later on). - -Example: - - simd start --unsafe-skip-upgrades ... - -NOTE: Here simd is used as an example binary, replace it with original binary +For more information, read the documentation on https://docs.cosmos.network/main/modules/upgrade. */ package upgrade
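Reviewer sketch (not part of the patch): the workflow text moved by this patch describes three outcomes at the planned height. A binary without the named handler halts, a binary with the handler runs it and marks the upgrade done, and a node started with `--unsafe-skip-upgrades` marks it done without halting or migrating. The self-contained Go model below illustrates that decision only. The `node` type, `newNode`, `beginBlock`, and the returned strings are hypothetical names, and none of this is the `x/upgrade` keeper code.

```go
package main

import "fmt"

// Plan carries only the two fields this toy model needs.
type Plan struct {
	Name   string
	Height int64
}

// node is a hypothetical stand-in for one running binary.
type node struct {
	handlers   map[string]func(Plan) // handlers registered at app start
	skipHeight map[int64]bool        // heights passed via the skip flag
	done       map[string]bool       // upgrades already marked as applied
}

func newNode(handlers map[string]func(Plan), skip map[int64]bool) *node {
	return &node{handlers: handlers, skipHeight: skip, done: map[string]bool{}}
}

// beginBlock reports what the node does at the given height.
func (n *node) beginBlock(height int64, plan Plan) string {
	if height != plan.Height || n.done[plan.Name] {
		return "continue"
	}
	if n.skipHeight[height] {
		n.done[plan.Name] = true // marked done, no halt, no migration
		return "skipped"
	}
	handler, ok := n.handlers[plan.Name]
	if !ok {
		return "halt" // old binary: no handler for the named upgrade
	}
	handler(plan) // new binary: migrate state in place
	n.done[plan.Name] = true
	return "upgraded"
}

func main() {
	plan := Plan{Name: "testnet-v2", Height: 200000}
	oldBinary := newNode(nil, nil)
	newBinary := newNode(map[string]func(Plan){"testnet-v2": func(Plan) {}}, nil)

	fmt.Println(oldBinary.beginBlock(200000, plan)) // prints "halt"
	fmt.Println(newBinary.beginBlock(200000, plan)) // prints "upgraded"
}
```

Keying the decision on the handler's presence, rather than on a version number, mirrors the point made in the workflow text that the handler both performs the migration and identifies the new binary.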