Refactor tracking issue #4

Open
kaustubhkapatral opened this issue May 22, 2023 · 0 comments

kaustubhkapatral commented May 22, 2023

The current implementation of alertbot has become inefficient following a number of changes in the Cosmos SDK. The queries it makes against both the RPC and LCD endpoints are resource intensive and put enough load on the server to affect the node's performance. The following changes need to be implemented to improve alertbot:

Missed blocks:-

In the current implementation of the missed blocks calculation, every block is queried using the LCD endpoint. If the validator's hex address (defined in the config.toml file) is not present in the precommits array in the response, the block is marked as missed and inserted into the db. If the number of consecutive missed blocks exceeds the missed blocks threshold (also defined in the config.toml file), an alert is sent to the user. Querying every block puts a lot of load on the node, and in the latest release of the Cosmos SDK the /blocks LCD endpoint has been discontinued. As a result, every block is now marked as missed and a false alert is sent to the user.

Instead of querying each block, we should rely on the slashing signing info endpoint: LCD.endpoint/cosmos/slashing/v1beta1/signing_infos/<cons-address>.

{
  "val_signing_info": {
    "address": "cosmosvalcons1d65x8dzt5d5lww0x2k2hwrdtjtyf4ysjdyv245",
    "start_height": "5915501",
    "index_offset": "9487453",
    "jailed_until": "1970-01-01T00:00:00Z",
    "tombstoned": false,
    "missed_blocks_counter": "4"
  }
}

This is a sample response from the endpoint. missed_blocks_counter is the number of blocks the validator has missed within the slashing module's signed_blocks_window. The counter increases by 1 whenever the validator misses a block. We should rely on it to alert the user about missed blocks of their validator. The scrape interval for this job should be 15 minutes instead of the current 3 seconds (defined in the config.toml file). In addition, we should add a sanity check on the index_offset field: its value should increase between scrapes.
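The check described above could look roughly like the following sketch. The names `SigningInfo`, `signing_info_url`, `parse_signing_info` and `evaluate_scrape` are hypothetical, and the alert policy (comparing the growth of missed_blocks_counter between two scrapes against a threshold) is an assumption, not something decided in this issue. Fetching the URL over HTTP is left out so the logic stays self-contained.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SigningInfo:
    """The subset of val_signing_info fields used for alerting."""
    index_offset: int
    missed_blocks_counter: int
    tombstoned: bool


def signing_info_url(lcd_endpoint: str, cons_address: str) -> str:
    """Build the slashing signing info URL for a consensus address."""
    base = lcd_endpoint.rstrip("/")
    return f"{base}/cosmos/slashing/v1beta1/signing_infos/{cons_address}"


def parse_signing_info(body: str) -> SigningInfo:
    """Parse the JSON body; numeric fields arrive as strings."""
    info = json.loads(body)["val_signing_info"]
    return SigningInfo(
        index_offset=int(info["index_offset"]),
        missed_blocks_counter=int(info["missed_blocks_counter"]),
        tombstoned=bool(info["tombstoned"]),
    )


def evaluate_scrape(
    prev: Optional[SigningInfo], curr: SigningInfo, threshold: int
) -> List[str]:
    """Compare two consecutive scrapes and return alert messages.

    Assumed policy: alert when the counter grew by at least `threshold`
    since the previous scrape, and when index_offset did not increase.
    """
    alerts: List[str] = []
    if prev is None:
        # First scrape: nothing to compare against yet.
        return alerts
    newly_missed = curr.missed_blocks_counter - prev.missed_blocks_counter
    if newly_missed >= threshold:
        alerts.append(f"validator missed {newly_missed} blocks since last scrape")
    if curr.index_offset <= prev.index_offset:
        alerts.append("index_offset did not increase between scrapes")
    return alerts
```

With a 15 minute scrape interval, the bot would only need to keep the previous `SigningInfo` in memory or in the db and call `evaluate_scrape` once per scrape.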

Tx alerts:-

In the current implementation of delegation/redelegation/undelegation alerts, every block is queried and all the txs in each block are indexed to check whether the validator or account address appears in any of them. As with missed block alerting, querying every block and indexing its txs puts a lot of load on the node. This implementation should be replaced with an approach that relies on tx events. Subscribing to delegation/redelegation/undelegation events, and to balance change events for the account address, would reduce the load considerably.
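As a rough illustration of the event-based approach, the sketch below builds the JSON-RPC `subscribe` messages that could be sent over the node's RPC websocket. The function names are hypothetical, and the event types and attribute keys (`delegate.validator`, `unbond.validator`, `redelegate.source_validator`, `redelegate.destination_validator`, `transfer.recipient`, `transfer.sender`) are assumed from the staking and bank modules; they should be checked against the SDK version the chain runs.

```python
import json
from typing import Dict, List


def build_event_queries(validator_address: str, account_address: str) -> Dict[str, str]:
    """Return one event query per alert type (event names are assumptions)."""
    tx = "tm.event='Tx'"
    return {
        "delegation": f"{tx} AND delegate.validator='{validator_address}'",
        "undelegation": f"{tx} AND unbond.validator='{validator_address}'",
        "redelegation_out": f"{tx} AND redelegate.source_validator='{validator_address}'",
        "redelegation_in": f"{tx} AND redelegate.destination_validator='{validator_address}'",
        "balance_in": f"{tx} AND transfer.recipient='{account_address}'",
        "balance_out": f"{tx} AND transfer.sender='{account_address}'",
    }


def build_subscribe_requests(queries: Dict[str, str]) -> List[str]:
    """Wrap each query in a JSON-RPC 2.0 subscribe request."""
    return [
        json.dumps(
            {
                "jsonrpc": "2.0",
                "method": "subscribe",
                "id": request_id,
                "params": {"query": query},
            }
        )
        for request_id, query in enumerate(queries.values(), start=1)
    ]
```

Each returned string would be sent as one websocket message; the node then pushes matching txs to the bot, so no block has to be polled or indexed.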

Proposal alerts:-

The implementation of proposal alerts can be removed entirely, since another bot has been developed that sends out alerts for new proposals.

Query commands:-

Alertbot has integrated Telegram commands that give the user information on request: /status, /peers, /node, /balance, /rewards, /rpc_status, /endpoints. The values behind these responses are continuously queried on chain and stored in the db, even when none of the commands is being executed. This approach should be changed so that each endpoint is queried only when the user requests it.
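One possible shape for the on-demand approach is a small dispatcher that maps each command to a fetcher and calls it only when the command arrives. `CommandRouter` and its methods are hypothetical names for illustration; the fetchers stand in for the real RPC/LCD queries, and wiring this to the Telegram API is not shown.

```python
from typing import Callable, Dict


class CommandRouter:
    """Maps a command such as /status to a function that queries the node.

    Nothing is fetched at registration time; a fetcher runs only inside
    handle(), i.e. when a user actually sends the command.
    """

    def __init__(self) -> None:
        self._fetchers: Dict[str, Callable[[], str]] = {}

    def register(self, command: str, fetcher: Callable[[], str]) -> None:
        """Associate a command with the function that produces its reply."""
        self._fetchers[command] = fetcher

    def handle(self, command: str) -> str:
        """Run the fetcher for the command and return the reply text."""
        fetcher = self._fetchers.get(command)
        if fetcher is None:
            return f"Unknown command: {command}"
        return fetcher()
```

A bot built this way would register one fetcher per command at startup (for example `router.register("/balance", fetch_balance)`, where `fetch_balance` is a placeholder) and drop the background jobs that currently refresh these values in the db.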
