Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create indexation cache for "coins to spend" queries #2455

Closed
wants to merge 165 commits into from

Conversation

rafal-ch
Copy link
Contributor

This is the 1/2 PR to fix the #2391
It is stacked on top of #2383 (balances cache)

Description

The scope of this PR is to build the proper index upon processing the coin related events.
The follow-up PR will contain the actual usage of the index, hence there are a couple of TODOs left that mention this follow-up PR.

Before requesting review

  • I have reviewed the code myself

@rafal-ch rafal-ch changed the title Rafal 2391 coins to spend cache Create indexation cache for "coins to spend" queries Nov 26, 2024
@rafal-ch rafal-ch added the no changelog Skip the CI check of the changelog modification label Nov 26, 2024
@rafal-ch rafal-ch marked this pull request as ready for review November 26, 2024 11:53
@rafal-ch rafal-ch requested a review from a team November 26, 2024 11:55
Base automatically changed from 1965_balances_cache to master November 29, 2024 16:26
crates/fuel-core/src/graphql_api/database.rs Show resolved Hide resolved
crates/fuel-core/src/service/sub_services.rs Show resolved Hide resolved
crates/fuel-core/src/service/sub_services.rs Show resolved Hide resolved
offset += Address::LEN;
arr[offset..offset + AssetId::LEN].copy_from_slice(asset_id_bytes);
offset += AssetId::LEN;
arr[offset..offset + u8::BITS as usize / 8].copy_from_slice(&NON_RETRYABLE_BYTE);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will make more sense if the type of the coin Message/Coin will be after amount, before Nonce/UtxoId. In this case we also will sort messages by amount.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes we change the structure of the index key, because with the current approach we cannot switch places since this byte is used as a part of prefix when querying the index. Let's huddle this out.

use crate::graphql_api::indexation;

use self::indexation::coins_to_spend::{
NON_RETRYABLE_BYTE,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In another comment I mentioned, that we don't need to include retryable messages into the coins to spend query. So we can remove it.

But you need to know the difference between coin and message, so I think we need to use this 1 byte for the Message or Coin enum representation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the complete PR I use the "value" in the column to distinguish between Coins and Messages (to be able to read data from the proper on-chain DB).

offset += u64::BITS as usize / 8;
arr[offset..offset + Nonce::LEN].copy_from_slice(nonce_bytes);
offset += Nonce::LEN;
arr[offset..].copy_from_slice(&indexation::coins_to_spend::MESSAGE_PADDING_BYTES);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, how well RocksDB works with the dynamic sized keys? Maybe based on the type of the message/coin we could use 32/34 bytes for Nonce/UtxoId types and during decoding decide what type to return?

Just a thought, if it is hard to support or implement, I'm okay with the current padding approach =)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, in the complete PR I decided to remove the padding approach and use variable length keys. I don't know about any performance implications on RocksDB side. Also, we never query for a large amounts of data (255 items at most with the current limits), so we should be good. Taking this into consideration I though it's not worth "wasting" additional two bytes for every indexed message.

type Key = Self::OwnedKey;
type OwnedKey = CoinsToSpendIndexKey;
type Value = Self::OwnedValue;
type OwnedValue = u8;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need 1 byte here? If you don't use it, we can just use () type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, it is IndexedCoinType. Why not to use this type here instead of u8? Also, in the comment above I said that maybe we could use retryable and non-retryable byte to track message/coin type.

Copy link
Collaborator

@xgreenx xgreenx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be simpler to review if the second part was actually the first, and vice versa=)

Because the second part only requires you to have sorted backward and forward iterators. You could update the current algorithm to work with this iterator(you could use iter and iter().rev() from the vector) and, in the next PR, replace the sorted vector with the new indexation. In this case, we don't need to have todo! in the code and it is easier to review the final variant of the feature.

@rafal-ch
Copy link
Contributor Author

rafal-ch commented Dec 3, 2024

I think it will be simpler to review if the second part was actually the first, and vice versa=)

Yes, that's true. Also because after I started implementing the actual usage of the index, I noticed that a couple of things implemented here need to be adjusted.

I'll prepare a single PR with the complete feature which will also include responses to the comments you placed here.

@rafal-ch
Copy link
Contributor Author

rafal-ch commented Dec 3, 2024

Closing now in favor of #2463
All review comments added to this PR will be addressed and eventually incorporated into the new PR.

@rafal-ch rafal-ch closed this Dec 3, 2024
rafal-ch added a commit that referenced this pull request Jan 13, 2025
Closes #2391

This PR includes all changes from the [Part 1
PR](#2455), making it
deprecated.

## Description
Changes in this PR:

#### The new `CoinsToSpend` index
* This is the database that stores all coins to spend sorted by the
amounts (i.e. largest-by-value coins first)
* The key consists of several parts
* _Retryable flag_ - to distinguish between retryable messages and other
coins
  * _Address_ (owner)
  * _AssetID_
* _Amount_ - as "big-endian" bytes to leverage the RocksDB key sorting
capabilities
* _Foreign Key_ - this are bytes of the key from either the `Messages`
or `Coins` on-chain databases
    * for messages this is a 32-bytes `Nonce`
    * for coins this is a 34-bytes `UtxoId`
* The value is an instance of `IndexedCoinType` enum, so we know which
on-chain database to query when returning the actual coins
* This index is updated when executor events are processed
* When querying for "coins to spend" the following algorithm is applied:
* First, we get as many "big" coins as required to satisfy _double the
amount_ from the query (respecting `max` and `excluded` params)
* If we have enough coins, but there are still some "slots" in the query
left (because we selected less coins than `max`) we fill the remaining
slots with a **random** number of "dust" coins
* If it happens that the value of selected "dust coins" is able to cover
the value of some of the already selected "big coins", we remove the
latter from the response
* If at any step we encounter a problem (reading from database, integer
conversions, etc.) we bail with an appropriate error

#### Changes to `CoinsQueryError` type
* The `MaxCoinsReached` variant has been removed because in the new
algorithm we never query for more coins than the specified `max`, hence,
without additional effort, we are not able to tell whether the query
could be satisfied if user provided bigger `max`
* The `InsufficientCoins` has been renamed to
`InsufficientCoinsForTheMax` and it now contains the additional `max`
field

#### Off-chain database metadata
* The metadata for off-chain database now contain the additional
`IndexationKind` - `CoinsToSpend`

#### Refactoring
* The `indexation.rs` module was split into separate files, each per
indexation type + errors + some utils.

#### Other
* Integration tests have to be updated to not expect the exact number of
coins to spend in the response (currently, due to randomness, we're not
deterministic in this regard)
* The number of excluded ids in the `coinsToSpend` GraphQL query is now
limited to the maximum number of inputs allowed in transaction.

### Before requesting review
- [X] I have reviewed the code myself
- [X] I have created follow-up issues caused by this PR and linked them
here

### Follow-up issues
* #2498
* #2448
* #2428
* #2499
* #2496

---------

Co-authored-by: Green Baneling <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no changelog Skip the CI check of the changelog modification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants