Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2.1: Extract client code in send_transaction_service into a new structure (backport of #3423) #3480

Closed
wants to merge 11 commits into from

Conversation

mergify[bot]
Copy link

@mergify mergify bot commented Nov 5, 2024

Problem

In order to be able to use new client code in the SendTransactionService (see #3444), I need to wrap network-related code with the new structure ConnectionCacheClient.

Summary of Changes

To extract network code into a new structure, I had to move some auxiliary code into separate files.
Beside of that, changes are pretty minimal and I deliberately haven't tried to optimize existing code to keep changes minimal. It just does almost exactly the same as the old one.


This is an automatic backport of pull request #3423 done by [Mergify](https://mergify.com).

…3423)

In order to be able to use new client code in the `SendTransactionService`, I need to wrap network-related code with the new structure ConnectionCacheClient.
To to that I also had to move some auxiliary code into separate files.

(cherry picked from commit 144925e)
@mergify mergify bot requested a review from a team as a code owner November 5, 2024 12:07
Copy link
Author

mergify bot commented Nov 5, 2024

If this PR represents a change to the public RPC API:

  1. Make sure it includes a complementary update to rpc-client/ (example)
  2. Open a follow-up PR to update the JavaScript client @solana/web3.js (example)

Thank you for keeping the RPC clients in sync with the server API @mergify[bot].

@alessandrod
Copy link

alessandrod commented Nov 5, 2024

This is RPC only code, doesn't change anything in the validator and it's mostly moving existing code around.

I want to backport it because I did some changes to the quic server that give better throughput and lower resource load if the client generates a transaction stream that doesn't have high fragmentation. The FD quic server does the same, and is even stricter in that it only allows very minimal fragmentation.

This is a preparatory PR to ultimately make send-transaction-service optionally use the new quic-client-next crate to send transactions. quic-client-next doesn't create stream fragmentation. Again, RPC only, so zero risk for the validator.

alessandrod
alessandrod previously approved these changes Nov 5, 2024
@KirillLykov KirillLykov added the automerge automerge Merge this Pull Request automatically once CI passes label Dec 5, 2024
@KirillLykov
Copy link

@bw-solana could you have a look?

Copy link

@bw-solana bw-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code itself looks fine to me.

The only hesitation I have is changing public items in published sts crate. The amount of changes for public items is surprisingly minimal, but non-zero.

Do we know who we might break with this?
image

@KirillLykov
Copy link

KirillLykov commented Dec 6, 2024

This service is quite internal thing (can be used only together with RPC), so I generally doubt anyone uses it.
@bw-solana the only public crate that depends on it is elipsis client (they depend on version < 1.19), but looking into their code I found that they don't really need to depend directly on it. So this is not a problem.

@anza-team anza-team removed the automerge automerge Merge this Pull Request automatically once CI passes label Dec 6, 2024
@anza-team
Copy link
Collaborator

😱 New commits were pushed while the automerge label was present.

@jacobcreech
Copy link

@bw-solana @alessandrod can we not just separate the crate and bump the major version? If this is making a breaking change with >2000 downloads a day, we cannot possibly alert everyone using it ahead of time.

@bw-solana
Copy link

@bw-solana @alessandrod can we not just separate the crate and bump the major version? If this is making a breaking change with >2000 downloads a day, we cannot possibly alert everyone using it ahead of time.

We've discussed this - haven't come to consensus yet. @willhickey captured things well here: #3724

@t-nelson
Copy link

t-nelson commented Dec 6, 2024

i don't understand why the interfaces were changed in the same PR as breaks the types out to their own module. way too noisy even for master

@KirillLykov
Copy link

i don't understand why the interfaces were changed in the same PR as breaks the types out to their own module. way too noisy even for master

Changes in public API that I found:

  1. SendTransactionService::new_with_config and new: removed reference from connection_cache: &Arc<ConnectionCache>. It is because it was cloned twice anyways.
  2. pub(crate) struct SendTransactionServiceStatsReport -- added pub(crate), used to be pub
  3. pub(crate) struct CurrentLeaderInfo<T> -- added pub(crate), used to be pub

What has been moved:

  1. SendTransactionServiceStats to a separate module. The structure is unchanged, just moved from the send_transaction_service.rs to reduce the lenght of this file. I can do it in a separate commit to simplify the review.
  2. Also moved CurrentLeaderInfo to a another module.

Would it help if I create a separate commit that would extract these structures from the original module? I can also revert for this backport these pub(crate) change.

@alessandrod
Copy link

i don't understand why the interfaces were changed in the same PR as breaks the types out to their own module. way too noisy even for master

Because nobody ever cared about APIs so far, and we don't have a CI hook that breaks CI if API is inadvertently broken. I welcome the interest in being adults with APIs going forward, but you can't just manifest it, things have to change. In the meantime, we're trying to unfuck RPC so that we stop doing ethereum level throughput. This PR is one of many, we're already trying to split, but we're also trying to get everything landed at a decent speed. Keep in mind that this is a huge chore for Kirill who's agreed to fix code that nobody wants to touch, and a source of aneurysms for me as I review all his changes.

@alessandrod
Copy link

i don't understand why the interfaces were changed in the same PR as breaks the types out to their own module. way too noisy even for master

Changes in public API that I found:

1. `SendTransactionService::new_with_config` and `new`: removed reference from `connection_cache: &Arc<ConnectionCache>`. It is because it was cloned twice anyways.

Can this be undone without breaking anything?

2. `pub(crate) struct SendTransactionServiceStatsReport` -- added pub(crate), used to be pub
3. `pub(crate) struct CurrentLeaderInfo<T>`  -- added pub(crate), used to be pub

Then I guess this can easily be reverted.

What has been moved:

1. `SendTransactionServiceStats` to a separate module. The structure is unchanged, just moved from the `send_transaction_service.rs` to reduce the lenght of this file. I can do it in a separate commit to simplify the review.

2. Also moved `CurrentLeaderInfo` to a another module.

Would it help if I create a separate commit that would extract these structures from the original module? I can also revert for this backport these pub(crate) change.

Moving seems fine, just re-export from where they used to be exported.

Let's fix the API breakage, but also let's minimize churn please. I don't want this STS fix to take another month to land. I want to move on and go back to pretending that RPC doesn't exist.

@alessandrod
Copy link

@bw-solana @alessandrod can we not just separate the crate and bump the major version? If this is making a breaking change with >2000 downloads a day, we cannot possibly alert everyone using it ahead of time.

https://crates.io/crates/solana-rpc/reverse_dependencies

there are probably 2 to 3 actual consumers of the API besides us

@t-nelson
Copy link

t-nelson commented Dec 7, 2024 via email

Copy link
Author

mergify bot commented Dec 9, 2024

The Firedancer team maintains a line-for-line reimplementation of the
native programs, and until native programs are moved to BPF, those
implementations must exactly match their Agave counterparts.
If this PR represents a change to a native program implementation (not
tests), please include a reviewer from the Firedancer team. And please
keep refactors to a minimum.

@KirillLykov KirillLykov force-pushed the mergify/bp/v2.1/pr-3423 branch from e91ca42 to 4dca724 Compare December 9, 2024 19:30
Copy link
Author

mergify bot commented Dec 9, 2024

If this PR represents a change to the public RPC API:

  1. Make sure it includes a complementary update to rpc-client/ (example)
  2. Open a follow-up PR to update the JavaScript client @solana/web3.js (example)

Thank you for keeping the RPC clients in sync with the server API @mergify[bot].

/// Metrics of the send-transaction-service.
#[derive(Default)]
struct SendTransactionServiceStats {
pub struct SendTransactionServiceStats {
Copy link

@KirillLykov KirillLykov Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This became pub here to be used in transaction_client

@KirillLykov
Copy link

KirillLykov commented Dec 9, 2024

@t-nelson I added a commit that moves back the Stats structure and reverts some other changes (pub(crate)). The intention is to revert this commit later in this PR

@KirillLykov
Copy link

@KirillLykov
Copy link

KirillLykov commented Dec 11, 2024

now CI breaks after update branch because error: 2 vulnerabilities found! for some crates. Again, unrelated to these changes.

@KirillLykov
Copy link

@t-nelson please ignore CI failures, looks like 2.1 branch constantly faces some CI failures (unrelated to this PR).
For the sake of review simplicity, I returned the moved structure to the original place. I will revert this commit as soon as you confirm that these changes make sense. To prevent API changes due to structures movement, I plan to add commit that does pub use so that use for client is not affected (but structures are moved).

Copy link

@bw-solana bw-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I've parsed everything correctly, looks like all of the public changes for the latest version are purely additive. Is that right? If I have that right, LGTM

Have we done some basic checkout sending txs to RPC running this code and ensuring latency/success rate look reasonable?

bw-solana
bw-solana previously approved these changes Dec 13, 2024
Copy link

@bw-solana bw-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed offline w/ @KirillLykov that we've done performance checkout after this change and no issues observed

@KirillLykov
Copy link

KirillLykov commented Dec 16, 2024

@bw-solana @alessandrod

If I've parsed everything correctly, looks like all of the public changes for the latest version are purely additive. Is that right? If I have that right, LGTM

I removed changes which made some structures to be pub(crate) instead of pub in the commit that moves structures to their original place.

I reverted this commit and added on top a new commit that basically minimizes API changes in this PR: aed704e

@KirillLykov
Copy link

Closed in favor of redoing this PR, see #3454 (comment)

@KirillLykov KirillLykov closed this Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants