Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify the protVer arguments #324

Closed
Tracked by #319
nfrisby opened this issue Aug 27, 2023 · 9 comments · Fixed by #1224
Closed
Tracked by #319

Simplify the protVer arguments #324

nfrisby opened this issue Aug 27, 2023 · 9 comments · Fixed by #1224
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@nfrisby
Copy link
Contributor

nfrisby commented Aug 27, 2023

Without loss of generality, focus on Allegra. Start from the binding of protVerAllegra in protocolInfoCardano. EG at https://github.com/input-output-hk/ouroboros-consensus/blob/0079ee65f6c49d1c05a8d30ffa492b20c46b12b8/ouroboros-consensus-cardano/src/ouroboros-consensus-cardano/Ouroboros/Consensus/Cardano/Node.hs#L658

There, protVerAllegra:

  • (A) is passed to mkShelleyBlockConfig
  • (B) influences maxMajorProtVer

Also, maxMajorProtVer:

  • (C) is passed to tpraosParams and used to construct praosParams, to eventually be an upper bound in the ProtocolHeaderSupportsEnvelope envelopeChecks method (to throw the "obsolete node" errors).
  • (D) is passed to mkPartialLedgerConfigShelley which eventually uses it to constructor SL.Globals
mkShelleyBlockConfig ::
     ShelleyBasedEra era
  => SL.ProtVer
  -> SL.ShelleyGenesis (EraCrypto era)
  -> [SL.VKey 'SL.BlockIssuer (EraCrypto era)]
  -> BlockConfig (ShelleyBlock proto era)
mkShelleyBlockConfig protVer genesis blockIssuerVKeys = ShelleyConfig {
      shelleyProtocolVersion  = protVer
    , shelleySystemStart      = SystemStart  $ SL.sgSystemStart  genesis
    , shelleyNetworkMagic     = NetworkMagic $ SL.sgNetworkMagic genesis
    , shelleyBlockIssuerVKeys = Map.fromList
        [ (SL.hashKey k, k)
        | k <- blockIssuerVKeys
        ]
    }
data instance BlockConfig (ShelleyBlock proto era) = ShelleyConfig {
      -- | The highest protocol version this node supports. It will be stored
      -- the headers of produced blocks.
      shelleyProtocolVersion  :: !SL.ProtVer
-- | Praos parameters that are node independent
data PraosParams = PraosParams
    ...
    -- | All blocks invalid after this protocol version, see
    -- 'Globals.maxMajorPV'.
    praosMaxMajorPV        :: !MaxMajorProtVer,
data Globals = Globals
  ...
  , maxMajorPV :: !Version
  -- ^ All blocks invalid after this protocol version

https://github.com/input-output-hk/ouroboros-consensus/blob/0079ee65f6c49d1c05a8d30ffa492b20c46b12b8/ouroboros-consensus-cardano/src/ouroboros-consensus-cardano/Ouroboros/Consensus/Cardano/Node.hs#L754-L775

Note that this use is exactly contrary to the warning comments where allegraProtVer is actually set, such as:

https://github.com/input-output-hk/cardano-node/blob/70dde196367e86c977fbe72167f4eb303b3ffa88/cardano-node/src/Cardano/Node/Protocol/Cardano.hs#L185-L194

        Consensus.ProtocolParamsAllegra {
          -- This is /not/ the Allegra protocol version. It is the protocol
          -- version that this node will declare that it understands, when it
          -- is in the Allegra era. That is, it is the version of protocol
          -- /after/ Allegra, i.e. Mary.
          allegraProtVer =
            ProtVer (natVersion @4) 0,
          allegraMaxTxCapacityOverrides =
            TxLimits.mkOverrides TxLimits.noOverridesMeasure
        }

It is currently my suspicion that all of these per-era *ProtVer arguments should be removed (thereby eliminating (B)), and replaced by some logic that defines maxMajorProtVer by analyzing the last given transitionIntraShelleyTrigger. And the per-era values used in (A) should instead all simply use maxMajorProtVer.

@amesgen
Copy link
Member

amesgen commented Aug 28, 2023

Minor comments:

Note that this use is exactly contrary to the warning comments where allegraProtVer is actually set, such as:

https://github.com/input-output-hk/cardano-node/blob/70dde196367e86c977fbe72167f4eb303b3ffa88/cardano-node/src/Cardano/Node/Protocol/Cardano.hs#L185-L194

Yeah, I think this is a case where we actually loose generality by only considering Allegra: Ignoring experimental eras for now (this just changes what the "last era" is), protVerXXX has the following meaning ATM:

  • For all but the last era, it is the lowest major protocol version of the next era, ie if YYY is the successor of XXX, it is eraProtVerLow @YYY
  • For the last era, it is the highest major protocol of this last era, ie eraProtVerHigh @XXX.

Also see #294 (comment)

So in fact, we can safely use the protVer entry for the last era as the max major protocol version, despite the fact that protVerXXX is not a protocol version of the XXX era for all but the last era.

It is currently my suspicion that all of these per-era *ProtVer arguments should be removed (thereby eliminating (B)), and replaced by some logic that defines maxMajorProtVer by analyzing the last given transitionIntraShelleyTrigger.

That specific approach won't work directly:

The general approach can of course still work, we just have to pass in more stuff ("worst-case", just the max major protocol version directly).

And the per-era values used in (A) should instead all simply use maxMajorProtVer.

Intuitively, that sounds like a good plan; main question is: do we ever want forging nodes to put different versions in their header for different eras? Maybe a statistical use case similar to https://github.com/input-output-hk/cardano-node/blob/70dde196367e86c977fbe72167f4eb303b3ffa88/cardano-node/src/Cardano/Node/Protocol/Cardano.hs#L210-L213 could arise?

@nfrisby
Copy link
Contributor Author

nfrisby commented Aug 28, 2023

protVerXXX has the following meaning ATM

The changing semantics of the field---especially in a way dependent on the --experimental flag---is confusing for a "configuration data" data structure😬.

That specific approach won't work directly.

Yeah, I figured there isn't quite enough information/it wasn't accessible right now. But this still seems right to me as a high-level design target.

do we ever want forging nodes to put different versions in their header for different eras?

In general, sounds useful. And indeed you linked to a former example use (which correctly comments itself as a "HACK" :/).

But it also sounds not strictly necessary. So at the very least, I think it should be provided in a separate "overrides" argument, eg (perhaps limited to the minor version?). More directly: I think we should remove it now and add something like that back when it's next actually needed.

@nfrisby
Copy link
Contributor Author

nfrisby commented Aug 28, 2023

PR #294 is definitely related. For context: I hadn't looked over that PR before drafting this Issue.

@nfrisby
Copy link
Contributor Author

nfrisby commented Aug 28, 2023

PR IntersectMBO/ouroboros-network#3891 also seems related

Also, I transferred Issue #325 from ouroboros-network repo.

Wise final words from Jared in his comment:

but it will take some thought, and updating the specs

@nfrisby
Copy link
Contributor Author

nfrisby commented Aug 28, 2023

The MaxMajorPV was added in this commit IntersectMBO/cardano-ledger@441e23b

Edit: Just for context: that commit is from March 2020, which is before the Byron-Shelley hardfork.

@nfrisby
Copy link
Contributor Author

nfrisby commented Aug 28, 2023

Esgen and I just discussed this all during a call. There are several overlapping concerns, sadly.

Testing overrides

For testing, we let config override the version-based triggers in the HFC.

Experimental flags

There are two "experimental" flags in the node: ExperimentalHardForksEnabled (<-> npcExperimentalHardForksEnabled) (formerly TestEnableDevelopmentHardForkEras) and ExperimentalProtocolsEnabled (<-> pncExperimentalProtocolsEnabled) (formerly TestEnableDevelopmentNetworkProtocols).

The node ultimately passes ExperimentalProtocolsEnabled to Consensus as srnEnableInDevelopmentVersions.

  , srnEnableInDevelopmentVersions  :: Bool
    -- ^ If @False@, then the node will limit the negotiated NTN and NTC
    -- versions to the latest " official " release (as chosen by Network and
    -- Consensus Team, with input from Node Team)

The node uses ExperimentalHardForksEnabled to alter the upper bounds on the *ProtVer arguments it passes to Consensus. Re-read the Issue description to understand the consequences this flag therefore has. For example:

          Praos.babbageProtVer =
            if npcExperimentalHardForksEnabled
              then ProtVer (natVersion @9) 0
              else ProtVer (natVersion @8) 0,

It is not obvious to me how these two flags interact in general. In particular, suppose we have an experimental era, such as Conway right now. Is it necessarily the case that ExperimentalProtocolsEnabled enables (at least) the NTN and NTC versions that Conway requires? And so we should require that ExperimentalProtocolsEnabled implies ExperimentalHardForksEnabled?

When we instead don't have an experimental era, ExperimentalHardForksEnabled has no effect and so that implication wouldn't be necessary? For example, without an experimental era, ExperimentalProtocolsEnabled could be guarding merely new queries.

The Obsolete Node Checks

The MaxMajorPV mechanism introduced here IntersectMBO/cardano-ledger@441e23b is older than the HFC. At a high-level, the HFC renders MaxMajorPV unnecessary: if the node doesn't yet know about the Nth ledger era, then it won't be able to even parse a header in that era. However, because of "intra-era hard forks", it may be that MaxMajorPV is necessary even in the presence of the HFC.

It seems possible that we could ban intra-era hard forks in the future. Or maybe even implement them as duplicates in the HFC type-level list (but that would be inconvenient in terms of performance, and Issue #328, eg). Either would let us remove the MaxMajorPV mechanism, and instead rely only on the HFC for this.

Type-level ledger era major version intervals

The ledger statically assigns intervals of the major version to each era. For example:

https://github.com/input-output-hk/cardano-ledger/blob/a90833107aa98f359dbb839991582b397c1c4bc8/eras/allegra/impl/src/Cardano/Ledger/Allegra/Era.hs#L20-L26

-- | The Allegra era
data AllegraEra c


instance Crypto c => Era (AllegraEra c) where
  type PreviousEra (AllegraEra c) = ShelleyEra c
  type EraCrypto (AllegraEra c) = c
  type ProtVerLow (AllegraEra c) = 3

Alonzo is different because it contains a so-called "intra-era hard fork" (ie a hard fork that does not change the ledger era, and so does not involve the HFC).

https://github.com/input-output-hk/cardano-ledger/blob/a90833107aa98f359dbb839991582b397c1c4bc8/eras/alonzo/impl/src/Cardano/Ledger/Alonzo/Era.hs#L21-L28

-- | The Alonzo era
data AlonzoEra c


instance Crypto c => Era (AlonzoEra c) where
  type EraCrypto (AlonzoEra c) = c
  type PreviousEra (AlonzoEra c) = MaryEra c
  type ProtVerLow (AlonzoEra c) = 5
  type ProtVerHigh (AlonzoEra c) = 6

If it weren't for the testing overrides and the experimental flags, I think all of the node's use of protocol versions would be determined by those ProtVerLow and ProtVerHigh instances---even intra-era hardforks, since ProvVerHigh captures those. Sounds appealing.

@coot
Copy link
Contributor

coot commented Aug 28, 2023

My mental model, although I know very little about the ExperimentalHardForksEnabled (so I also feel the need to sync on that), is that consensus and network flags are independent in general. However there might be hard forks which require to set both flags, or just one of them. You're right that TestEnableDevelopmentNetworkProtocols can limit queries, which is relatively small misconfiguration (doesn't break node-to-node, but might prevent testing some its features), but in the future we might have new consensus algorithm which will require an entirely new network (mini-)protocol (e.g. ouroboros-peras or ouroboros-laios). There's yet another level of complication: it could happen that we will have an experimental network protocol change which we don't want to enable, while consensus also requires another experimental feature from the network. Do we need to worry about this before it happens?

@nfrisby
Copy link
Contributor Author

nfrisby commented Aug 28, 2023

Do we need to worry about this before it happens?

I think the only risk of not considering this would be surprise delays in releases (which are bad, but not as bad as bugs eg).

@coot
Copy link
Contributor

coot commented Aug 28, 2023

I think so too.

Right now Handshake assumes that all versions are linearly (/ totally) ordered. But we could relax it to be a partial order, which would allow things like:

graph TD;
  1'-->0;
  1-->0;
  2-->1;
  2-->1';  
Loading

where 1 and 1' are the non comparable two different experimental version. Consensus could peak it's own max network version, the same with network. Assume that 1 contains some consensus feature while 1' some network features, and we don't want to publish both (e.g. 2). Then the network team would pick 1 as it's max version, and consensus would peak 1' as its max version.

Our picking algorithm would need to pick the smallest sub-partial order which is closed under supremum. This way if one enables only consensus experimental features one would end with versions smaller or equal to 1; if one only enables network experimental features one would end with versions smaller or equal to 1' and if one picks both then all versions smaller or equal 2.

Handshake would do the same as now: pick the largest version which belongs to both own supported versions and the presented one.

@dnadales dnadales added this to the Q3 2024 milestone Jul 11, 2024
@dnadales dnadales assigned dnadales and unassigned jorisdral Aug 13, 2024
@dnadales dnadales moved this to 🏗 In progress in Consensus Team Backlog Aug 13, 2024
github-merge-queue bot pushed a commit that referenced this issue Aug 26, 2024
While giving us flexibility, using different protocol versions for all
eras increased the level of complexity and led to several errors in the
past. This PR removes this flexibility by using a single protocol
version, which determines the maximum protocol version Consensus
supports.

See #324 for more background.

Closes #324.
Closes #276.

TODOs:

- [x] Format the code.
- [x] Fix ThreadNet test failure.
- [x] Improve documentation.
- [x] Address all the `FIXME`s.
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Consensus Team Backlog Aug 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: ✅ Done
Development

Successfully merging a pull request may close this issue.

5 participants