Expose and use claim queue for asynchronous on-demand operability #1797

Open · 6 of 8 tasks · Tracked by #1829
eskimor opened this issue Oct 5, 2023 · 3 comments

eskimor commented Oct 5, 2023

In order for on-demand (and other forms of more exotic scheduling) to take advantage of asynchronous backing, we need to expose the claim queue to the node side and have validators accept collations for any ParaId that is in the claim queue, not just the next scheduled entry.

  • Expose ClaimQueue via a runtime API and use it in collation-generation #3580 (see the sketch after this list)
  • Make use of it in the collator protocol so collations will be accepted in advance.
  • Make use of it in collators: implement strategies for the right lookahead (see comment below)
  • Fix validator usage as well. E.g. in backing we will need to be more lenient and accept statements for any para in the current claim queue.
  • Orthogonal: is the runtime (inclusion) already lenient enough? This is not directly related to exposing the claim queue, but we need to double-check that the runtime itself takes it into account everywhere.
  • Ensure fairness between paras in the collator protocol.
  • Statement distribution and other subsystems need to take the full claim queue into account, not only the next scheduled entry.
  • Verify that asynchronous backing works as expected with a core shared by at least three paras, in round robin.
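
A minimal sketch of the idea behind the first two items, using simplified stand-in types (not the actual runtime API surface): per core, the exposed claim queue is an ordered list of upcoming claims, and the node side may accept a collation for any para appearing anywhere in that list.

```rust
use std::collections::{BTreeMap, VecDeque};

// Simplified stand-ins for the real primitives (e.g. `CoreIndex`, `ParaId`).
type CoreIndex = u32;
type ParaId = u32;

/// Per core, the ordered list of paras holding upcoming claims on it.
type ClaimQueue = BTreeMap<CoreIndex, VecDeque<ParaId>>;

/// A collation for `para` on `core` is acceptable if the para appears
/// *anywhere* in that core's claim queue, not only at the head.
fn is_acceptable(claim_queue: &ClaimQueue, core: CoreIndex, para: ParaId) -> bool {
    claim_queue
        .get(&core)
        .map_or(false, |queue| queue.contains(&para))
}

fn main() {
    let mut cq: ClaimQueue = BTreeMap::new();
    cq.insert(0, VecDeque::from([1000, 1001, 1002]));
    assert!(is_acceptable(&cq, 0, 1001)); // not the head entry, still acceptable
    assert!(!is_acceptable(&cq, 0, 2000)); // not in the queue at all
}
```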

Fairness in collator protocol

To ensure one para cannot starve another in the collator protocol, we should be smarter about which collations to fetch. E.g. if we have the following claim queue:

[a, b, c, d]

and we have the following advertisements:

a,a,a,a,a,a,b,c,d

Our fetches should follow a round-robin order:

a,b,c,d,a,a,a,a,a
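
A minimal sketch of this round-robin fetch ordering (illustrative only, not the actual collator-protocol code): every claim-queue entry gets served once before any para gets a second fetch.

```rust
use std::collections::VecDeque;

type ParaId = char;

/// Order pending advertisements so every claim-queue entry is served before
/// any para gets a second fetch (hypothetical helper, illustrative only).
fn fetch_order(claim_queue: &[ParaId], advertisements: &[ParaId]) -> Vec<ParaId> {
    let mut remaining: VecDeque<ParaId> = advertisements.iter().copied().collect();
    let mut order = Vec::with_capacity(remaining.len());

    // First serve each claim-queue entry that has an advertisement waiting...
    for &claim in claim_queue {
        if let Some(pos) = remaining.iter().position(|&p| p == claim) {
            order.push(remaining.remove(pos).unwrap());
        }
    }
    // ...then drain whatever is left in arrival order.
    order.extend(remaining);
    order
}

fn main() {
    let order = fetch_order(&['a', 'b', 'c', 'd'], &['a', 'a', 'a', 'a', 'a', 'a', 'b', 'c', 'd']);
    assert_eq!(order, vec!['a', 'b', 'c', 'd', 'a', 'a', 'a', 'a', 'a']);
}
```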
eskimor commented Mar 12, 2024

A little illustration of how collation providing will work with the claim queue. Most of the time (when running), the second item in the claim queue will be the most relevant one, as this is the one that can be provided "on time". The first item in the claim queue is meant to be backed in the very next block (synchronous backing). So when starting up, we will provide the first item, but will be too slow, so it will go into the block after the next one. In the meantime we will already provide the second item, which will then be ready for the block afterwards. Now we are "in sync" and will always prepare the second item in the claim queue, while the first one is already waiting to be picked up by the block producer.

[Image: illustration of the collation timing described above]

If para b sees the following claim queue for the current relay parent, it should start preparing a collation now to get it in on time:

[a, b, c]

Looking even further ahead (if the claim queue is long enough) is not required for operation, but is a legitimate block-production strategy. E.g. to avoid more relay chain forks (and thus unnecessary work), one could deliberately pick a relay parent that already has a child block and then pick the third item in the claim queue (which corresponds to the second item in the most recent relay parent's claim queue).
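
A minimal sketch of this collator-side decision, assuming a hypothetical `should_build` helper (not actual polkadot-sdk code): in steady state a collator looks at the second claim-queue entry (`offset == 1`); the further-lookahead strategy above corresponds to `offset == 2` on a relay parent that already has a child.

```rust
type ParaId = u32;

/// Given the claim queue of our core as seen at the relay parent we intend to
/// build on, decide whether our para should start producing a collation now.
/// Hypothetical helper for illustration only.
fn should_build(claim_queue: &[ParaId], our_para: ParaId, offset: usize) -> bool {
    claim_queue.get(offset) == Some(&our_para)
}

fn main() {
    let cq = [1000, 1001, 1002]; // [a, b, c]
    assert!(should_build(&cq, 1001, 1));  // para b: second entry, start building
    assert!(!should_build(&cq, 1002, 1)); // para c: not yet
    assert!(should_build(&cq, 1002, 2));  // para c, with the further-lookahead strategy
}
```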

github-merge-queue bot pushed a commit that referenced this issue Mar 20, 2024
…tion` (#3580)

The PR adds two things:
1. A runtime API exposing the whole claim queue.
2. Consumption of the API in `collation-generation` to fetch the next
scheduled `ParaEntry` for an occupied core.

Related to #1797
alindima commented:
This code in prospective-parachains will also need to be changed to use the claim queue API:

async fn fetch_upcoming_paras<Context>(

I think it should look ahead according to `allowed_ancestry_len` or `scheduling_lookahead`, in order to allow collations coming from upcoming paras.
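
A rough sketch of that suggestion, with illustrative names and shapes (not the actual prospective-parachains code): instead of only taking the next scheduled para per core, collect every para appearing within the lookahead window of each core's claim queue.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

type CoreIndex = u32;
type ParaId = u32;

/// Collect all paras that appear within the first `lookahead` entries of any
/// core's claim queue. Illustrative only; names and types are assumptions.
fn upcoming_paras(
    claim_queue: &BTreeMap<CoreIndex, VecDeque<ParaId>>,
    lookahead: usize,
) -> BTreeSet<ParaId> {
    claim_queue
        .values()
        .flat_map(|queue| queue.iter().take(lookahead))
        .copied()
        .collect()
}

fn main() {
    let mut cq = BTreeMap::new();
    cq.insert(0u32, VecDeque::from([1000u32, 1001, 1002]));
    // With a lookahead of 2 only the first two upcoming paras are returned.
    assert_eq!(upcoming_paras(&cq, 2), BTreeSet::from([1000, 1001]));
}
```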

@eskimor eskimor changed the title Expose claim queue for asynchronous on-demand operability Expose and use claim queue for asynchronous on-demand operability Apr 5, 2024
bkchr commented Apr 9, 2024

  • Make use of it in collators: implement strategies for the right lookahead (see comment below)

#3168 is included there. The collation-generation subsystem will not be used for this anymore; the lookahead collator already does not use the collation-generation subsystem for triggering block production.

github-merge-queue bot pushed a commit that referenced this issue Jun 19, 2024
Implements most of
#1797

Core sharing (two or more parachains scheduled on the same core with the
same `PartsOf57600` value) was not working correctly. The expected
behaviour is to have a Backed and an Included event in each block for the
paras sharing the core, with the paras taking turns. E.g. for two paras we
expect: Backed(a); Included(a)+Backed(b); Included(b)+Backed(a); etc.
Instead, each block contains just one event and there are a lot of gaps
(blocks w/o events) during the session.

Core sharing should also work when collators are building collations
ahead of time.

TODOs:

- [x] Add a zombienet test verifying that the behaviour mentioned above
works.
- [x] prdoc

---------

Co-authored-by: alindima <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue Dec 13, 2024
Related to #1797

# The problem
When fetching collations on the validator side of the collator protocol we
need to ensure that each parachain gets a fair core-time share depending on
its assignments in the claim queue. This means that the number of
collations fetched per parachain should ideally be equal to (but
definitely not bigger than) the number of claims for the particular
parachain in the claim queue.

# Why the current implementation is not good enough
The current implementation doesn't guarantee such fairness. For each
relay parent there is a `waiting_queue` (PerRelayParent -> Collations ->
waiting_queue) which holds any unfetched collations advertised to the
validator. The collations are fetched on a first-in, first-out principle,
which means that if two parachains share a core and one of the
parachains is more aggressive it might starve the second parachain. How?
At each relay parent up to `max_candidate_depth` candidates are accepted
(enforced in `fn is_seconded_limit_reached`), so if one of the parachains
is quick enough to fill the queue with its advertisements, the
validator will never fetch anything from the rest of the parachains
even though they are scheduled. This doesn't mean that the aggressive
parachain will occupy all the core time (this is guaranteed by the
runtime), but it will prevent the rest of the parachains sharing the same
core from having collations backed.

# How to fix it
The solution I am proposing is to limit fetches and advertisements based
on the state of the claim queue. At each relay parent the claim queue
for the core assigned to the validator is fetched. For each parachain a
fetch limit is calculated (equal to the number of entries in the claim
queue). Advertisements are not fetched for a parachain which has
exceeded its claims in the claim queue. This solves the problem of
aggressive parachains advertising too many collations.

The second part is in the collation fetching logic. The validator will keep
track of which collations it has fetched so far. When a new collation
needs to be fetched, instead of popping the first entry from the
`waiting_queue`, the validator examines the claim queue and looks for the
earliest claim which hasn't got a corresponding fetch. This way the
validator will always try to prioritise the most urgent entries.
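
A minimal sketch of that selection rule (illustrative only, not the code from the PR): walk the claim queue front to back and pick the para of the earliest claim that has not yet been served by a fetch, provided an advertisement for it is actually waiting.

```rust
use std::collections::{HashMap, HashSet};

type ParaId = u32;

/// Return the para whose earliest unserved claim comes first in the claim
/// queue and which has an advertisement waiting. Hypothetical helper.
fn next_para_to_fetch(
    claim_queue: &[ParaId],
    fetched_per_para: &HashMap<ParaId, usize>,
    waiting_advertisements: &HashSet<ParaId>,
) -> Option<ParaId> {
    let mut claims_seen: HashMap<ParaId, usize> = HashMap::new();
    for &para in claim_queue {
        // Index of this claim among the para's claims (0 for its first claim, ...).
        let claim_index = *claims_seen.entry(para).and_modify(|c| *c += 1).or_insert(0);
        let already_fetched = fetched_per_para.get(&para).copied().unwrap_or(0);
        // The claim is unserved if fewer collations than that have been fetched.
        if already_fetched <= claim_index && waiting_advertisements.contains(&para) {
            return Some(para);
        }
    }
    None
}

fn main() {
    let cq = [1000, 1001, 1000]; // two claims for para 1000, one for para 1001
    let fetched = HashMap::from([(1000, 1)]); // one collation already fetched for 1000
    let waiting = HashSet::from([1000, 1001]);
    // The earliest unserved claim belongs to 1001, even though 1000 also has ads waiting.
    assert_eq!(next_para_to_fetch(&cq, &fetched, &waiting), Some(1001));
}
```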

## How is the 'fair share of coretime' for each parachain determined?
Thanks to async backing we can accept more than one candidate per relay
parent (with some constraints). We also have the claim queue, which
gives us a hint about which parachain will be scheduled next on each core. So
thanks to the claim queue we can determine the maximum number of claims
per parachain.

For example, if the claim queue is [A A A] at relay parent X, we know that
at relay parent X we can accept three candidates for parachain A. There
are two things to consider though:
1. If we accept more than one candidate at relay parent X we are
claiming the slot of a future relay parent. So accepting two candidates
for relay parent X means that we are claiming the slot at rp X+1 or rp
X+2.
2. At the same time, the slot at relay parent X could have been claimed
by a previous relay parent (or parents). This means that we need to accept
fewer candidates at X, or even no candidates.

There are a few cases worth considering:
1. Slot claimed by a previous relay parent.
    CQ @ rp X: [A A A]
    Advertisements at X-1 for para A: 2
    Advertisements at X-2 for para A: 2
Outcome: at rp X we can accept only 1 advertisement since our slots
were already claimed.
2. Slot in our claim queue already claimed at a future relay parent.
    CQ @ rp X: [A A A]
    Advertisements at X+1 for para A: 1
    Advertisements at X+2 for para A: 1
Outcome: at rp X we can accept only 1 advertisement since the other slots
in our claim queue were already claimed at future relay parents.

The situation becomes more complicated with multiple leaves (forks).
Imagine we have got a fork at rp X:
```
CQ @ rp X: [A A A]
(rp X) -> (rp X+1) -> (rp X+2)
       \-> (rp X+1')
```
Now when we examine the claim queue at rp X we need to consider both
forks. Accepting a candidate at X means that there should be a slot for it
in *BOTH* leaves. If for example there are three candidates accepted at
rp X+1', we can't accept any candidates at rp X because there will be no
slot for it in one of the leaves.

## How the claims are counted
There are two options for counting the claims at relay parent X:
1. Keep a state for the claim queue (number of claims and which of them
are claimed) and look it up when accepting a collation. With this
approach we need to keep the state up to date with each new
advertisement and each new leaf update.
2. Calculate the state of the claim queue on the fly at each
advertisement. This way we rebuild the state of the claim queue at each
advertisement.

Option 1 is hard to implement with forks. There are too many variants
to keep track of (a different state for each leaf), and at the same time we
might never need to use them. So I decided to go with option 2 -
building the claim queue state on the fly.

To achieve this I've extended `View` from backing_implicit_view to keep
track of the outer leaves. I've also added a method which accepts a
relay parent and returns all paths from an outer leaf to it. Let's call
it `paths_to_relay_parent`.

So how does the counting work for relay parent X? First we examine the
number of seconded and pending advertisements (more on pending in a
second) from relay parent X to relay parent X-N (inclusive), where N is
the length of the claim queue. Then we use `paths_to_relay_parent` to
obtain all paths from outer leaves to relay parent X. We calculate the
claims at relay parents X+1 to X+N (inclusive) for each leaf and take the
maximum value. This way we guarantee that the candidate at rp X can be
included in each leaf. This is the state of the claim queue which we use
to decide if we can fetch one more advertisement at rp X or not.
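
A heavily simplified sketch of that decision for a single para on one core (illustrative only; the actual PR tracks this per path and per claim): `claims_behind` counts seconded/pending advertisements at relay parents X down to X-N, and `claims_ahead_per_path` counts advertisements at X+1 to X+N along each path from an outer leaf.

```rust
/// May we fetch one more advertisement at relay parent X? Only if, on every
/// path to an outer leaf, at least one claim in the queue is still unspoken for.
/// Hypothetical helper, not the code from the PR.
fn can_fetch_one_more(
    claim_queue_len: usize,
    claims_behind: usize,
    claims_ahead_per_path: &[usize],
) -> bool {
    // The worst-case path determines how many claims are already taken ahead of us.
    let claims_ahead = claims_ahead_per_path.iter().copied().max().unwrap_or(0);
    claims_behind + claims_ahead < claim_queue_len
}

fn main() {
    // CQ @ rp X: [A A A]; two ads at earlier relay parents; fork with 0 and 1 ads ahead.
    assert!(!can_fetch_one_more(3, 2, &[0, 1])); // 2 + max(0, 1) = 3, no slot left
    assert!(can_fetch_one_more(3, 1, &[0, 1]));  // 1 + 1 = 2 < 3, one slot still free
}
```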

## What is a pending advertisement
I mentioned that we count seconded and pending advertisements at relay
parent X. A pending advertisement is:
1. An advertisement which is being fetched right now.
2. An advertisement pending validation in the backing subsystem.
3. An advertisement blocked for seconding by backing because we don't
know one of its parent heads.

Any of these is considered a 'pending fetch' and a slot for it is kept.
All of them are already tracked in `State`.
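
For illustration, those three states could be summarised like this (a sketch; the actual code tracks them in separate collections inside `State` rather than as one enum):

```rust
/// The three "pending advertisement" states described above (sketch only).
#[allow(dead_code)]
enum PendingAdvertisement {
    /// Currently being fetched from the collator.
    BeingFetched,
    /// Fetched, awaiting validation in the backing subsystem.
    PendingValidation,
    /// Blocked for seconding because one of its parent heads is unknown.
    BlockedOnParentHead,
}

fn main() {
    // Whatever the variant, a pending advertisement keeps a claim-queue slot occupied.
    let pending = PendingAdvertisement::BeingFetched;
    let keeps_slot = matches!(
        pending,
        PendingAdvertisement::BeingFetched
            | PendingAdvertisement::PendingValidation
            | PendingAdvertisement::BlockedOnParentHead
    );
    assert!(keeps_slot);
}
```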

---------

Co-authored-by: Maciej <[email protected]>
Co-authored-by: command-bot <>
Co-authored-by: Alin Dima <[email protected]>