Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New query to get snapshots #3898

Merged
merged 2 commits into from
Aug 5, 2022
Merged

Conversation

newhoggy
Copy link
Contributor

@newhoggy newhoggy commented Jul 14, 2022

Description

This introduce a new query GetStakeSnapshots which extracts the stake snapshots for all pools or for a specified subset of pools. This query is required to improve the CPU and memory efficiency of the query stake-snapshot command in the CLI.

Checklist

  • Branch
    • Commit sequence broadly makes sense
    • Commits have useful messages
    • New tests are added if needed and existing tests are updated
    • If this branch changes Consensus and has any consequences for downstream repositories or end users, said changes must be documented in interface-CHANGELOG.md
    • If this branch changes Network and has any consequences for downstream repositories or end users, said changes must be documented in interface-CHANGELOG.md
    • If serialization changes, user-facing consequences (e.g. replay from genesis) are confirmed to be intentional.
  • Pull Request
    • Self-reviewed the diff
    • Useful pull request description at least containing the following information:
      • What does this PR change?
      • Why these changes were needed?
      • How does this affect downstream repositories and/or end-users?
      • Which ticket does this PR close (if any)? If it does, is it linked?
    • Reviewer requested

cardano-api changes

The existing query, QueryDebugLedgerState, will continue to exist and be slow and use a lot of memory, however, there will be a new GetStakeSnapshot which uses a lot less CPU and memory:

GetStakeSnapshot
    :: SL.KeyHash 'SL.StakePool (EraCrypto era)
    -> BlockQuery (ShelleyBlock proto era)
                  (StakeSnapshot (EraCrypto era))

This query takes a set of pool ids so it is possible to query multiple pools at once.

cardano-cli changes

A existing command query stake-snapshot will be be functionally unchanged, but use much less CPU and memory.

@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch from e8a3fa0 to 24660db Compare July 14, 2022 04:14
Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some comments throughout. HTH. Thanks!

GetPoolState {} -> fromCBOR
GetStakeSnapshot {} -> fromCBOR

data StakeSnapshot crypto = StakeSnapshot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a Haddock comment explaining what this is.

Also, I think these fields shouldn't be Integers, but rather SL.Coin?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added documentation.

Copy link
Contributor

@Jimbo4350 Jimbo4350 Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return SnapShots crypto? It seems like we are losing potentially useful information i.e the delegations and pool params. The consumer can do these calculations (indeed we can expose functions from cardano-api to do this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is define like this:

data SnapShots crypto = SnapShots
  { _pstakeMark :: SnapShot crypto, -- Lazy on purpose
    _pstakeSet :: !(SnapShot crypto),
    _pstakeGo :: !(SnapShot crypto),
    _feeSS :: !Coin
  }
  deriving (Show, Eq, Generic)

data SnapShot crypto = SnapShot
  { _stake :: !(Stake crypto),
    _delegations :: !(VMap VB VB (Credential 'Staking crypto) (KeyHash 'StakePool crypto)),
    _poolParams :: !(VMap VB VB (KeyHash 'StakePool crypto) (PoolParams crypto))
  }
  deriving (Show, Eq, Generic)

The maps stored are very big, so in the interest of improving memory and cpu efficiency and not transferring more data than necessary over the socket, I do the map lookup on the node side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return SnapShots crypto?

The maps stored are very big [..., so] I do the map lookup on the node side

On Slack, Jordan also asked: "Should we allow the possibility to get the stake snapshot for multiple pools at once?"

One option occurs to me: we return a SnapShots crypto (so we don't have to define a new type 🙌) but we restrict all the maps inside that data type to just the keys given (so we support a multi-pool query 🙌).


On the slack thread, you agreed that one-pool at a time is fine for the first PR. So that's not urgent.

But avoiding a new data type might be nice.

So: does the resuse of SnapShots crypto but with all the maps restricted down to the pools identified by the query make sense?

One immediate downside: such a restricted SnapShot is probably counter-intuitive: eg it violates some generally-expected invariants of SnapShot (along the lines of completeness)

@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch 5 times, most recently from 917b819 to e836ce8 Compare July 19, 2022 13:06
@newhoggy newhoggy changed the title Newhoggy/new query to get snapshots New query to get snapshots Jul 19, 2022
@newhoggy newhoggy marked this pull request as ready for review July 19, 2022 13:13
@newhoggy newhoggy requested review from dnadales and coot as code owners July 19, 2022 13:13
@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch from e836ce8 to eb5e856 Compare July 19, 2022 22:43
@newhoggy newhoggy requested a review from nfrisby July 19, 2022 22:47
@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch 3 times, most recently from 01549ff to 551db30 Compare July 20, 2022 01:11
Copy link
Contributor

@Jimbo4350 Jimbo4350 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@newhoggy newhoggy dismissed Jimbo4350’s stale review July 21, 2022 02:17

replied to comments

@newhoggy newhoggy requested a review from Jimbo4350 July 21, 2022 02:17
@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch from 551db30 to b646f87 Compare July 25, 2022 09:57
@@ -77,6 +77,7 @@ https://keepachangelog.com/en/1.1.0/, adapted to our plan explained above.

- New supported node to client version `NodeToClientV_14` with new queries:
- `GetPoolState`: Get the pool state for the given stake pool ids
- `GetStakeSnapshot`: Get the snapshot of stake distribution
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we write this entry here? Why not in a section above 2022-07-20?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved it to a new section.

@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch from b646f87 to 4fcb5c7 Compare July 26, 2022 11:50
@newhoggy newhoggy requested a review from dnadales July 27, 2022 10:34
@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch from 4fcb5c7 to 172ae40 Compare August 1, 2022 13:24
Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some new comments. But everything is relatively minor (though not necessarily cheap), so I'm clicking Approve.

Thanks for responding to the previous comments 🙏

GetPoolState {} -> fromCBOR
GetStakeSnapshot {} -> fromCBOR

data StakeSnapshot crypto = StakeSnapshot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return SnapShots crypto?

The maps stored are very big [..., so] I do the map lookup on the node side

On Slack, Jordan also asked: "Should we allow the possibility to get the stake snapshot for multiple pools at once?"

One option occurs to me: we return a SnapShots crypto (so we don't have to define a new type 🙌) but we restrict all the maps inside that data type to just the keys given (so we support a multi-pool query 🙌).


On the slack thread, you agreed that one-pool at a time is fine for the first PR. So that's not urgent.

But avoiding a new data type might be nice.

So: does the resuse of SnapShots crypto but with all the maps restricted down to the pools identified by the query make sense?

One immediate downside: such a restricted SnapShot is probably counter-intuitive: eg it violates some generally-expected invariants of SnapShot (along the lines of completeness)

case mPoolIds of
Nothing ->
StakeSnapshots
{ ssStakeSnapshots = getPoolStakes (Set.fromList (VMap.elems (SL._delegations _pstakeMark)))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that if I get the elems of _delegations _pstakeMark that it would give me all of the pool ids.

I'm not confident that this is true. Please validate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct @newhoggy. _delegations _pstakeMark is a map from stake credentials to stake pool IDs. You can tell from they type, if you happen to know that the stake pool IDs are the hash of the stake pool operator's cold verification key:

https://github.com/input-output-hk/cardano-ledger/blob/93b5d5421a5b5c3a3b6ad89a74e8bdb2ccc9258c/eras/shelley/impl/src/Cardano/Ledger/Shelley/EpochBoundary.hs#L157

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch 2 times, most recently from 92fd09c to a106d1c Compare August 2, 2022 12:52
<*> fromCBOR

data StakeSnapshots crypto = StakeSnapshots
{ ssStakeSnapshots :: Map (SL.KeyHash 'SL.StakePool crypto) (StakeSnapshot crypto)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the strict Map, so I think this field should also be strict.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch 2 times, most recently from b70ba6a to a0d8d63 Compare August 3, 2022 10:45
@newhoggy newhoggy force-pushed the newhoggy/new-query-to-get-snapshots branch from a0d8d63 to ff55347 Compare August 3, 2022 12:02
@newhoggy
Copy link
Contributor Author

newhoggy commented Aug 5, 2022

bors r+

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Aug 5, 2022

@iohk-bors iohk-bors bot merged commit a06a247 into master Aug 5, 2022
@iohk-bors iohk-bors bot deleted the newhoggy/new-query-to-get-snapshots branch August 5, 2022 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants