[pallet-staking] Auto payout validator reward #5894

Open
Ank4n opened this issue Oct 1, 2024 · 12 comments
Assignees
Labels
T2-pallets This PR/Issue is related to a particular pallet.

Comments

@Ank4n
Contributor

Ank4n commented Oct 1, 2024

Context

Validator payouts are lazy and paged: for each era and each page of nominators (see MaxExposurePageSize), the reward must be claimed by calling Staking::payout_stakers.

Validators generally run a bot to claim these. Claims stay available for HistoryDepth eras (84 eras in dotsama), after which they are dropped and can no longer be claimed.

To Do

Use tasks to pay out rewards.

Probably the best place to schedule this is when a new era is triggered and we set the exposure for the validators.

Other requirements

The tasks should also keep track of pages. This can be as simple as not removing a validator from the task list until all of its pages have been paid out.

We also need a way to make sure we don't keep backfilling tasks without having enough time to process them. I would suggest something like dropping tasks that are more than X eras old (where X is configurable, e.g. X = 3).
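
The two requirements above (page tracking and dropping stale work) can be pictured with a small in-memory model. This is only a sketch in plain Rust: `PendingPayouts`, its methods, and the `max_age` parameter are hypothetical names, and a real implementation would keep this state in pallet storage rather than a `BTreeMap`.

```rust
use std::collections::BTreeMap;

type Era = u32;
type Page = u32;

/// Hypothetical model of outstanding payout work, keyed by (era, validator).
/// The value is (next unpaid page, total page count).
#[derive(Default)]
struct PendingPayouts {
    entries: BTreeMap<(Era, String), (Page, Page)>,
}

impl PendingPayouts {
    /// Register a validator's pages when the exposure for a new era is set.
    fn schedule(&mut self, era: Era, validator: &str, page_count: Page) {
        if page_count > 0 {
            self.entries.insert((era, validator.to_string()), (0, page_count));
        }
    }

    /// Mark the next page of `validator` in `era` as paid and return its index.
    /// The entry is removed only once every page has been paid out.
    fn pay_next_page(&mut self, era: Era, validator: &str) -> Option<Page> {
        let key = (era, validator.to_string());
        let (next, total) = *self.entries.get(&key)?;
        if next + 1 >= total {
            self.entries.remove(&key);
        } else {
            self.entries.insert(key, (next + 1, total));
        }
        Some(next)
    }

    /// Drop entries that are more than `max_age` eras older than `current_era`.
    /// Returns how many entries were dropped.
    fn prune(&mut self, current_era: Era, max_age: Era) -> usize {
        let before = self.entries.len();
        let oldest_kept = current_era.saturating_sub(max_age);
        self.entries.retain(|(era, _), _| *era >= oldest_kept);
        before - self.entries.len()
    }

    /// Number of (era, validator) entries that still have unpaid pages.
    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With `max_age = 3` and a current era of 10, an entry from era 7 is kept and an entry from era 6 is dropped; anything dropped this way would still have to be claimed manually.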

Related

#4630 is very similar to this issue.

@Ank4n Ank4n added the T2-pallets This PR/Issue is related to a particular pallet. label Oct 1, 2024
@PieWol
Contributor

PieWol commented Oct 2, 2024

Hey @Ank4n ,
I'd like to give this a shot. Feel free to assign me :)

@Ank4n Ank4n assigned Ank4n and PieWol and unassigned Ank4n Oct 2, 2024
@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/ux-ui-automatic-payout-for-all-validator/11106/3

@vovacha

vovacha commented Dec 19, 2024

@PieWol, what's the status on this? If you're busy, I’m happy to take it over and aim to wrap it up in two weeks. Let me know!

@PieWol
Contributor

PieWol commented Dec 19, 2024

Please go for it @vovacha

@PieWol PieWol removed their assignment Dec 19, 2024
@vovacha

vovacha commented Dec 24, 2024

@Ank4n Could you answer some of the questions below? It's my first PR, so I may have follow-up questions later as well.

Some context:

  • Validators: Polkadot 499, Kusama 1000
  • Average nominators per validator: Polkadot ~45, Kusama ~12
  • MaxExposurePageSize: 512 nominators/page
  • Most validators have a single page, because the average nominator count is well below the page size
  • Weight per payout page = Base(192_836_012) + (47_646_642 * n_nominators) + DB_operations
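
As a worked example of the weight formula above, here is a minimal sketch in plain Rust. The two constants are the figures quoted in the list; the function names are illustrative only, and the DB-operation term is left out because no figure is given for it, so the totals are lower bounds.

```rust
/// Base weight of paying out one page, as quoted above.
const BASE_WEIGHT: u64 = 192_836_012;
/// Additional weight per nominator on the page, as quoted above.
const PER_NOMINATOR_WEIGHT: u64 = 47_646_642;

/// Weight of paying out a single page with `nominators_on_page` nominators,
/// excluding the DB-operation term (not quantified in this thread).
fn page_weight(nominators_on_page: u32) -> u64 {
    BASE_WEIGHT + PER_NOMINATOR_WEIGHT * u64::from(nominators_on_page)
}

/// Rough per-era total, assuming every validator has exactly one page
/// holding the average number of nominators (a simplifying assumption).
fn era_weight_estimate(validators: u32, avg_nominators: u32) -> u64 {
    u64::from(validators) * page_weight(avg_nominators)
}
```

With the numbers above, `era_weight_estimate(499, 45)` gives 1_166_130_516_098 for Polkadot and `era_weight_estimate(1000, 12)` gives 764_595_716_000 for Kusama, again excluding DB operations.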

1. Task design. I see several options and am not sure which one was suggested or assumed:

  • Simplest approach. One task per page, no lifetime control, no additional storage. 499 tasks per era for Polkadot.
  • One task per page but with lifetime control for cleanup or monitoring. Requires additional management layer and storage.
  • One larger task (per validator or per era) that processes PAGES_PER_BLOCK pages and reschedules itself. Probably sounds too complex.
  • Other option?

2. Cleanup. The requirement to drop tasks older than X=3 eras raises some questions:

My findings:

  • Processing looks relatively fast: ~30 seconds to process all payouts for an era, based on the number of validators, pages, and nominators, the payout weight, and the operational extrinsic limit.
  • The payout functions are idempotent, so overlapping execution looks safe.
  • I found no similar examples of task lifetime control in other pallets.
  • Scheduler already handles task management (delegation).

What's the motivation behind implementing additional task cleanup layer within staking pallet?

3. Execution (assumption). I assume both payout extrinsics will be deprecated and OCW is not considered as execution method. So pallet Scheduler is the way to go.

4. Migration approach (TBD). Feel free to put some comments here as well.

@Ank4n
Contributor Author

Ank4n commented Dec 25, 2024

Thanks for looking into this.

1. Task design. I see several options and am not sure which one was suggested or assumed:

  • Simplest approach. One task per page, no lifetime control, no additional storage. 499 tasks per era for Polkadot.

This ^. But there can be more than 499 tasks per era as some validators can have multiple pages. Also, there can be more than one task that can be executed in the same block (based on how much block weight is free).
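
To illustrate why the task count can exceed the validator count, here is a sketch in plain Rust that enumerates one task per (validator, page). `PagePayout` and `tasks_for_era` are hypothetical names, not pallet-staking items, and giving a validator with no nominators a single page (for its own reward) is an assumption of this sketch.

```rust
/// Hypothetical task identifier: one payout page of one validator in one era.
#[derive(Debug, Clone, PartialEq, Eq)]
struct PagePayout {
    era: u32,
    validator: String,
    page: u32,
}

/// Build one task per exposure page for every validator in `era`.
/// `validators` holds (name, nominator count) pairs.
fn tasks_for_era(era: u32, validators: &[(&str, u32)], page_size: u32) -> Vec<PagePayout> {
    assert!(page_size > 0, "page size must be non-zero");
    validators
        .iter()
        .flat_map(move |&(name, nominators)| {
            // Ceiling division; at least one page per validator (assumption).
            let full = nominators / page_size;
            let partial = u32::from(nominators % page_size != 0);
            let pages = (full + partial).max(1);
            (0..pages).map(move |page| PagePayout {
                era,
                validator: name.to_string(),
                page,
            })
        })
        .collect()
}
```

For example, three validators with 45, 600, and 0 nominators and a page size of 512 produce four tasks, because the second validator needs two pages.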

2. Cleanup. The requirement to drop tasks older than X=3 eras raises some questions:
What's the motivation behind implementing additional task cleanup layer within staking pallet?

Thanks for crunching the numbers. In practice, this should rarely be an issue. However, it’s likely still needed as a defensive measure, though you can consider it optional for now.

If we’re unable to execute all tasks within an era (due to missed blocks, full blocks, etc.), we’ll need to determine what to prioritize in the next era: the newest tasks or the oldest ones. If this pattern persists over multiple eras, we’ll eventually need to drop older tasks and focus solely on the new ones.

3. Execution (assumption). I assume both payout extrinsics will be deprecated and OCW is not considered as execution method. So pallet Scheduler is the way to go.

We can retain the payout extrinsics. If someone needs to pay out urgently and cannot wait for tasks to be executed automatically, or if old unprocessed tasks have been dropped, this extrinsic provides a way to process them manually.

Curious why you mentioned OCW is not considered as execution method. pallet::tasks leverages OCW for task processing, unless I’m missing something? Here’s an example that might clarify.

4. Migration approach (TBD). Feel free to put some comments here as well.

I don’t think any migration is necessary. Is there something specific you had in mind?

@vovacha

vovacha commented Jan 2, 2025

@Ank4n Thanks for the clarification. Let me explain my initial thoughts.

Why I assumed OCW is not considered:

  • "Schedule this when new era is triggered" suggests on-chain scheduling.
  • I assumed extrinsics should be deprecated, which is not true.
  • OCW poses some challenges with no obvious solution:
    • The example provided uses an unsigned extrinsic; in our case we need some account to pay the transaction fees.
    • Task distribution between validators (by hash, delay, etc.) adds complexity and needs more research if that's the way to go.

I suggest an approach based on pallet Scheduler:

  • Implement Task trait for processing single validator's page payout (will reuse do_payout_stakers_by_page logic).
  • Schedule tasks after store_stakers_info.
  • Create storage item in staking pallet to track tasks status (monitoring, cleanup).
  • Either create separate cleanup task or perform cleanup within payout task (remove tasks older than X eras).
  • Keep existing payout extrinsics.
  • No storage migrations needed.

WDYT?

@aurexav
Contributor

aurexav commented Jan 10, 2025

@Ank4n
Contributor Author

Ank4n commented Jan 13, 2025

I haven’t looked deeply into the scheduler, but the proposed approach seems quite similar to the one proposed with pallet::tasks. My understanding is that using the scheduler might require storing significantly more data compared to tasks, as it needs to store each scheduled call in its storage. Is that correct?

Could you perhaps look into this further and elaborate on why the scheduler would be a better choice than tasks in this context? Another question to consider is how we recover from failure scenarios. For example, what happens if a scheduled call or task cannot be executed at the specified block number?

@vovacha

vovacha commented Jan 15, 2025

@Ank4n You mentioned pallet::tasks, but I don't see such a pallet in substrate. I assume you're referring to an example of the pattern that uses the Task trait with OCW.

From initial investigation, it seems all approaches would need similar storage for tracking pending payouts. The main differences appear to be in implementation:

  • OCW-based solution (what you suggested with Task trait ) - requires funded accounts for transaction fees
  • Scheduler pallet (what I suggested earlier) - additional pallet dependency. Tasks are executed in on_initialize with MaximumWeight
  • on_idle hook (as mentioned by @aurexav ) - simpler approach, iterates through storage maps to process pending rewards using leftover block space

All approaches would be constrained by block space availability with similar behavior:

  • OCW-based solution - we control when to submit transactions based on block space
  • Scheduler pallet - tasks seem to remain in storage and are executed in later blocks when there's enough space
  • on_idle hook - processes a limited batch per block, continuing in next blocks if needed
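
The shared "process what fits, continue in the next block" behaviour can be sketched as a weight-budgeted batch loop. This is plain Rust, not FRAME code: `PayoutTask` and `process_batch` are hypothetical, and `remaining_weight` stands in for whatever leftover block space the chosen mechanism reports.

```rust
use std::collections::VecDeque;

/// Hypothetical unit of work: one page of one validator in one era.
#[derive(Debug, Clone, PartialEq)]
struct PayoutTask {
    era: u32,
    validator: String,
    page: u32,
    /// Estimated weight of paying out this page.
    weight: u64,
}

/// Take tasks from the front of `queue` until the next one no longer fits
/// into `remaining_weight`. Returns the tasks taken and the weight they use.
/// Tasks that do not fit stay queued for a later block.
fn process_batch(
    queue: &mut VecDeque<PayoutTask>,
    remaining_weight: u64,
) -> (Vec<PayoutTask>, u64) {
    let mut used = 0u64;
    let mut done = Vec::new();
    while let Some(next) = queue.front() {
        if used.saturating_add(next.weight) > remaining_weight {
            break;
        }
        let task = queue.pop_front().expect("front() just returned Some");
        used += task.weight;
        done.push(task);
    }
    (done, used)
}
```

Calling `process_batch` once per block with that block's leftover weight drains the queue over several blocks. Processing oldest-first is a choice of this sketch; the thread leaves open whether newest or oldest tasks should take priority.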

If we choose OCW approach, we'd need to decide on fee payment.

@bkchr
Member

bkchr commented Jan 15, 2025

  • OCW-based solution (what you suggested with Task trait ) - requires funded accounts for transaction fees

That is not correct. This was built for exactly the kind of use case being discussed here. You don't need to pay fees for it.

The scheduler at some point will/should also be rewritten to use Tasks. This solves problems around potential panics etc.

@vovacha

vovacha commented Jan 15, 2025

That is not correct.

I apologize for the confusion. Looking at frame_system::Call::do_task it's clear Tasks can be executed with any origin.

Will start working on implementation soon.

Projects
Status: 📕 Backlog
Development

No branches or pull requests

6 participants