Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

implement bitfield signing subsystem #1364

Merged
merged 17 commits into from
Jul 23, 2020
Merged

implement bitfield signing subsystem #1364

merged 17 commits into from
Jul 23, 2020

Conversation

coriolinus
Copy link
Contributor

@coriolinus coriolinus commented Jul 6, 2020

Closes #1239. Includes #1403.

@coriolinus coriolinus added A3-in_progress Pull request is in progress. No review needed at this stage. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Jul 6, 2020
@coriolinus coriolinus self-assigned this Jul 6, 2020
@coriolinus coriolinus force-pushed the prgn-bitfield-signing branch from 58a5e6f to 6a062fa Compare July 20, 2020 08:26
There were large merge issues with the old bitfield signing PR, so
we're just copying all the work from that onto this and restarting.

Much of the existing work will be discarded because we now have better
tools available, but that's fine.
It's not an ideal implementation--we can make it much more concurrent--
but at least it compiles.
@coriolinus coriolinus force-pushed the prgn-bitfield-signing branch from 2a2c862 to 5807799 Compare July 22, 2020 12:14
Comment on lines +147 to +148
// REVIEW: is it safe to ignore parathreads here, or do they also figure in the availability mapping?
if let Some(CoreOccupied::Parachain) = core {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we only care about availability as it relates to parachains, but I'm not certain, and the guide doesn't say (AFAICT).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What in the guide made you think that parathreads don't have to worry about availability? They are occupying availability cores just like parachains.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, this whole branch is a byproduct of using the wrong type / runtime API. This should be using the CoreState type, not the CoreOccupied type, which shouldn't even be exposed.

// In principle, this work is all concurrent, not parallel. In practice, we can't guarantee it, which is why
// we need the mutexes and explicit references above.
stream::iter(scheduler_roster.availability_cores.into_iter().enumerate())
.for_each_concurrent(None, |(idx, core)| async move {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The move flag here is the reason for the explicit references we construct above, and this appears to work, but it's not a pattern I've seen elsewhere in the codebase. Is there an elegant way to avoid explicit references?

Comment on lines +242 to +247
/// Query whether a `PoV` exists within the AV Store.
///
/// This is useful in cases like bitfield signing, when existence
/// matters, but we don't want to necessarily pass around multiple
/// megabytes of data to get a single bit of information.
QueryPoVAvailable(Hash, oneshot::Sender<bool>),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could potentially avoid this variant, just using QueryPoV instead. Does it create any problems to create this variant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this message type has some utility, but it's also not what the bitifeld signing subsystem needs.

Bitfield signing is supposed to query for whether we have our specific chunk of the PoV, not the whole PoV itself.

Quoting from the guide's description of bitfield signing:

For each bit in the bitfield, if there is a candidate pending availability, query the Availability Store for whether we have the availability chunk for our validator index.

@@ -272,6 +293,8 @@ pub enum RuntimeApiRequest {
ValidationCode(ParaId, BlockNumber, Option<BlockNumber>, oneshot::Sender<ValidationCode>),
/// Get head data for a specific para.
HeadData(ParaId, oneshot::Sender<HeadData>),
/// Get a the candidate pending availability for a particular parachain by parachain / core index
CandidatePendingAvailability(ParaId, oneshot::Sender<Option<CommittedCandidateReceipt>>),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't see any way around creating this runtime request, but I could have missed something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this not existing is a byproduct of #1419 not being done.

parachain/src/primitives.rs Outdated Show resolved Hide resolved
Co-authored-by: Andronik Ordian <[email protected]>
@coriolinus
Copy link
Contributor Author

I am suspicious of the current CI failure messages:

     Updating git repository `https://github.com/paritytech/substrate`
error: failed to get `sc-client-api` as a dependency of package `polkadot-availability-store v0.8.18 (/builds/parity/polkadot/availability-store)`
Caused by:
  failed to load source for dependency `sc-client-api`
Caused by:
  Unable to update https://github.com/paritytech/substrate#5fd98a66
Caused by:
  revspec '5fd98a661d5384a00c40d17c08f28265dae729d4' not found; class=Reference (4); code=NotFound (-3)

I suspect that these are cache failures, so I'm going to try to re-discover the mechanism for clearing the CI cache.

@rphmeier
Copy link
Contributor

My intuition would be: merge master and continue. Usually fixes spurious dependency errors like that.

@coriolinus coriolinus marked this pull request as ready for review July 23, 2020 12:19
@coriolinus coriolinus requested review from rphmeier and drahnr July 23, 2020 12:19
@github-actions github-actions bot added A0-please_review Pull request needs code review. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Jul 23, 2020
@@ -4,6 +4,10 @@ Validators vote on the availability of a backed candidate by issuing signed bitf

## Protocol

Input:

There is no dedicated input mechanism for bitfield signing. Instead, Bitfield Signing produces a bitfield representing the current state of availability on `StartWork`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Describing things that are not there is really helpful when reading the guide 👍

Copy link
Contributor

@drahnr drahnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, a few very minor nitpicks :)

@coriolinus coriolinus merged commit 8217ca6 into master Jul 23, 2020
@coriolinus coriolinus deleted the prgn-bitfield-signing branch July 23, 2020 14:05
// we have to (cheaply) clone this sender so we can mutate it to actually send anything
let mut sender = sender.clone();

// REVIEW: is it safe to ignore parathreads here, or do they also figure in the availability mapping?
Copy link
Contributor

@rphmeier rphmeier Jul 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to account for parachains and parathreads.

// request the validator groups so we can get the scheduler roster
let (tx, rx) = oneshot::channel();
sender
.send(RuntimeApi(Request(relay_parent, ValidatorGroups(tx))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a hard time understanding this code. It makes a validator groups request and the rx receives a SchedulerRoster?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(anyway, this is getting the wrong info, as explained in other comments)

},
util::{self, JobManager, JobTrait, ToJobTrait, Validator},
};
use polkadot_primitives::v1::{AvailabilityBitfield, CoreOccupied, Hash};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the CoreOccupied type should be used. It shouldn't even be exposed from the runtime, but @montekki added it for candidate backing. Now that there are new runtime APIs, those should be used instead.

Except #1419 hasn't been done yet. That's why I recommended stubbing out an async fn availability_cores { unimplemented!() } in a separate chat. The implementation of that function would be replaced by the actual runtime API in the future.

@coriolinus coriolinus mentioned this pull request Jul 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement: Bitfield Signing
4 participants