Add a delay after halt and before backup. #12101
Comments
I second that. During some upgrades via Cosmovisor I had problems, probably because it restarted too fast. On these occasions Cosmovisor, or rather the binary it runs, raises an error that the database (leveldb/cleveldb) is still locked while it tries to start a new process. After a manual restart of Cosmovisor the nodes run just fine. Logs of the described issue:

Oh wow, this is a great point that I hadn't thought about before. I support this being added.
…12188)

## Description

Closes: #12101

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [x] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [x] updated the relevant documentation or specification
- [x] reviewed "Files changed" and left comments if necessary
- [x] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
Summary
In Cosmovisor, allow a node operator to define a delay between the node halt (for upgrade) and backup.
Problem Definition
As a node operator, I want to define a delay after the halt (for upgrade) so that all node sub-processes have time to terminate and close file handles before moving on to steps that might need to access the files (e.g. backup or restart).
Proposal
Use a `DAEMON_RESTART_DELAY` environment variable to provide the desired delay in `time.Duration` format (e.g. `5s` or `300ms`), defaulting to 0.

As soon as the node halts, have Cosmovisor sleep for that amount of time before moving on to any subsequent steps.