Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Add RestartHeaviestFork to Gossip #34161

Merged
merged 13 commits into from
Jan 19, 2024

Conversation

wen-coding
Copy link
Contributor

Add RestartHeaviestFork to Gossip

Copy link

codecov bot commented Nov 18, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (acef7ba) 0.0% compared to head (7c338b0) 81.7%.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #34161       +/-   ##
===========================================
+ Coverage        0    81.7%    +81.7%     
===========================================
  Files           0      826      +826     
  Lines           0   223236   +223236     
===========================================
+ Hits            0   182485   +182485     
- Misses          0    40751    +40751     

@wen-coding wen-coding self-assigned this Nov 18, 2023
Comment on lines 38 to 39
// Sum of received heaviest fork validator stake / total stake * u16::MAX.
received_heaviest_fork_ratio: u16,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably call this stake_ratio or observed_stake_ratio.
heaviest_fork is already in the name of struct; it is redundant to repeat it here again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also why this u16 better than sending a f64 or the (stake, total_stake) as a tuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original idea is we can save some bytes, but now you mention it, it's not really worth losing the precision by switching from 16 bytes to 2 bytes, I'm changing to (stake, total_stake) now.

Comment on lines 46 to 48
ReceivedHeaviestForkStakeLargerThanTotalStake,
#[error("Total stake of received heaviest fork cannot be zero")]
ReceivedHeaviestForkStakeZero,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to repeat the enum name in the variants.
The variants are already qualified by the enum name, ie. RestartHeaviestForkError::....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

@@ -122,6 +141,64 @@ impl RestartLastVotedForkSlots {
}
}

impl Sanitize for RestartHeaviestFork {
fn sanitize(&self) -> Result<(), SanitizeError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we sanitize the wallclock like the other crds values?
also probably that should be done for the previous restart crds value as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 146 to 149
if self.received_heaviest_fork_ratio == 0 {
// this should at least include its own stake.
return Err(SanitizeError::ValueOutOfBounds);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if node_stake / total_stake * u16::MAX < 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

pub wallclock: u64,
pub last_slot: Slot,
pub last_slot_hash: Hash,
// Sum of received heaviest fork validator stake / total stake * u16::MAX.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "total stake"?
Is that the epoch stake or the stake observed in gossip?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

pub last_slot: Slot,
pub last_slot_hash: Hash,
pub observed_stake: u64,
pub total_epoch_stake: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this?
Isn't the epoch_stakes already known by all the nodes in the cluster?
https://github.com/solana-labs/solana/blob/1f00f8e29/runtime/src/bank.rs#L7744-L7746

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it might be nice to distribute total_epoch_stake just in case some calculation is off. But you are right, if this stake is wrong we have bigger problems. Removed.

Comment on lines 42 to 47
#[derive(Debug, Error, PartialEq)]
pub enum RestartHeaviestForkError {
#[error("Total stake of received heaviest fork cannot be zero")]
ZeroObservedStake,
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unnecessary and just adds verbosity.
Zero stake is practically not any different from stake of only 1 sol.
In addition a malicious node can just fake any number here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Comment on lines 144 to 147
if self.observed_stake == 0 {
// this should at least include its own stake.
return Err(SanitizeError::ValueOutOfBounds);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, regarding stake of 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Comment on lines 162 to 164
if observed_stake == 0 {
return Err(RestartHeaviestForkError::ZeroObservedStake);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, here regarding zero stake.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@github-actions github-actions bot added stale [bot only] Added to stale content; results in auto-close after a week. and removed stale [bot only] Added to stale content; results in auto-close after a week. labels Dec 15, 2023
@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Jan 2, 2024
@wen-coding wen-coding removed the stale [bot only] Added to stale content; results in auto-close after a week. label Jan 4, 2024
@wen-coding wen-coding requested a review from behzadnouri January 4, 2024 19:03
@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Jan 19, 2024
@behzadnouri behzadnouri removed the stale [bot only] Added to stale content; results in auto-close after a week. label Jan 19, 2024
Comment on lines 144 to 160
pub fn new(
from: Pubkey,
now: u64,
last_slot: Slot,
last_slot_hash: Hash,
observed_stake: u64,
shred_version: u16,
) -> Self {
Self {
from,
wallclock: now,
last_slot,
last_slot_hash,
observed_stake,
shred_version,
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? all fields are public.
Either remove fields from public interface or remove this redundant method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

}
}

pub fn new_rand<R: Rng>(rng: &mut R, pubkey: Option<Pubkey>) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be pub?
I feel like every review I have to ask not to pollute public interface.

Can you go over all pubs in this code and remove those not needed?
or at least change them to pub(crate).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used in tests, and I don't see any benchmarks needing it now. So I changed it to pub(crate).

@wen-coding wen-coding merged commit 4a2871f into solana-labs:master Jan 19, 2024
35 checks passed
@wen-coding wen-coding deleted the add_restart_heaviest_fork branch January 19, 2024 21:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants