-
Notifications
You must be signed in to change notification settings - Fork 4.6k
add merkle root column in blockstore #33807
add merkle root column in blockstore #33807
Conversation
Codecov Report
@@ Coverage Diff @@
## master #33807 +/- ##
=======================================
Coverage 81.8% 81.8%
=======================================
Files 807 807
Lines 217492 217522 +30
=======================================
+ Hits 178091 178118 +27
- Misses 39401 39404 +3 |
e3c8126
to
974b1a7
Compare
974b1a7
to
70b60ab
Compare
/// | ||
/// * index type: `crate::shred::ErasureSetId` `(Slot, fec_set_index: u64)` | ||
/// * value type: [`solana_sdk::hash::Hash`]` | ||
pub struct MerkleRoot; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@behzadnouri I decided to add a new column instead of modifying the existing ErasureMeta
because ErasureMeta
is only populated once we insert a coding shred.
This approach will allow us to submit a duplicate proof if we observe 2 data shreds with differing merkle roots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If you have 2 data shreds with the same index and they have different merkle roots, then they already are identified as duplicate shreds regardless of their merkle root because their payload is different.
- If they have different index and different merkle roots, then you can't determine if they belong to the same FEC set or not without receiving a coding shred from the same FEC set, because data shreds don't have the erasure set info.
- But if you have a coding shred from the same FEC set, then the coding shred + one of the data shreds is already enough to prove mismatching merkle roots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah you're right 🤦 i'll stick to the original approach of modifying the erasure meta instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they have different index and different merkle roots, then you can't determine if they belong to the same FEC set or not without receiving a coding shred from the same FEC set
Out of curiosity, why is it not enough to compare (shred.slot(), shred.fec_set_index())
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed this offline.
@AshwinSekar is right. (shred.slot(), shred.fec_set_index())
will identify the FEC set.
The 2 cases below are left out but they can be tracked separately.
- another scenario for duplicate blocks is that you receive a coding shred which indicates data shreds with indices [10..20] belong to the same FEC set with fec_set_index = 10, but then you have a data shred with index 17 (i.e. within that range) but different fec_set_index.
- Another case is that we have 2 coding shreds which indicate overlapping FEC sets.
@@ -1227,6 +1245,39 @@ impl TypedColumn for columns::OptimisticSlots { | |||
type Type = blockstore_meta::OptimisticSlotMetaVersioned; | |||
} | |||
|
|||
impl Column for columns::MerkleRoot { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a copypasta from ShredCode
, ShredData
, and ErasureMeta
. Happy to consolidate but am unsure about the effects of
fn as_index(slot: Slot) -> Self::Index {
(slot, 0)
}
Is it more beneficial to cache a fec_set_index
and use it instead of 0
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you have for as_index()
is fine; see this comment in the trail declaration:
solana/ledger/src/blockstore_db.rs
Lines 710 to 713 in 56ccffd
// This trait method is primarily used by `Database::delete_range_cf()`, and is therefore only | |
// relevant for columns keyed by Slot: ie. SlotColumns and columns that feature a Slot as the | |
// first item in the key. | |
fn as_index(slot: Slot) -> Self::Index; |
And yeah, looks like you might be able to reuse some impl's
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I think you have things plumbed correctly everywhere (metrics, cleanup, etc). What are you thinking in terms of rolling this broader change out? I see this PR adds a getter/setter, but it is not actually plugged in anywhere yet. Would the actual meat of the logic be added later ?
As mentioned on Discord, we ideally have a minimal change that we can backport for the sake of compatibility between versions; is that this PR?
Also, what happens to the old ErasureMeta
column; would this change essentially make that column obsolete ?
Was trying to be smart and avoid having to change the |
Problem
In order to detect duplicate blocks with conflicting merkle roots we need to be able to access the merkle root for each FEC set
This is the preliminary change to add a merkle root column for use in further changes to store the merkle root for shreds received.
Summary of Changes
Add merkle root column
Contributes to #33644