[Performance] Reduce RelayMiner memory consumption under load #739

okdas · 2024-08-16T01:02:07Z

Summary

Issue

Type of change

Select one or more:

Testing

Documentation changes (only if making doc changes)

make docusaurus_start; only needed if you make doc changes

Local Testing (only if making code changes)

Unit Tests: make go_develop_and_test
LocalNet E2E Tests: make test_e2e
See quickstart guide for instructions

PR Testing (only if making code changes)

DevNet E2E Tests: Add the devnet-test-e2e label to the PR.
- THIS IS VERY EXPENSIVE, so only do it after all the reviews are complete.
- Optionally run make trigger_ci if you want to re-trigger tests without any code changes
- If tests fail, try re-running failed tests only using the GitHub UI as shown here

Sanity Checklist

I have tested my changes using the available tooling
I have commented my code
I have performed a self-review of my own code; both comments & source code
I create and reference any new tickets, if applicable
I have left TODOs throughout the codebase, if applicable

okdas · 2024-08-16T01:03:28Z

go.mod

-
-	// fix upstream GHSA-h395-qcrw-5vmq vulnerability.
-	github.com/gin-gonic/gin => github.com/gin-gonic/gin v1.7.0
+	github.com/pokt-network/smt => ../smt


Contingent on pokt-network/smt#52

okdas · 2024-08-20T01:29:27Z

Overall, I'm pretty happy with how the RelayMiner performs now. There might be a goroutine leak (I'm not 100% sure it is even happening), but we'll find out soon enough in #742! :) This particular goroutine seems suspicious.

github-actions · 2024-08-21T01:01:59Z

The image is going to be pushed after the next commit.

You can use make trigger_ci to push an empty commit.

If you also want to run E2E tests, please add devnet-test-e2e label.

github-actions · 2024-08-21T01:30:07Z

go.mod


-	// fix upstream GHSA-h395-qcrw-5vmq vulnerability.
-	github.com/gin-gonic/gin => github.com/gin-gonic/gin v1.7.0
+	// TODO_IN_THIS_PR: bump and remove


[linter-name (fail-on-found)] reported by reviewdog 🐶
// TODO_IN_THIS_PR: bump and remove

@okdas Let's push to finish the SMT PR.

@Olshansk requested another review there

bryanchriswhite

Nice one @okdas! 🚀

NOTE: I have not run the load test as I'm currently working on an underpowered machine. 😅

docusaurus/docs/develop/developer_guide/quickstart.md

go.mod

bryanchriswhite · 2024-08-21T17:22:00Z

load-testing/tests/relays_stress.feature

+      | application | 4                | 10             | 12         |
+      | gateway     | 1                | 10             | 3          |
+      | supplier    | 1                | 10             | 3          |


I was under the impression that the "blocks per inc" needed to be a multiple of the blocks per session param to maintain constant rates of change across various metrics as the test scales actors. I also thought that there was a check for this somewhere in the load test helpers, around the "plans".

@bryanchriswhite Are you sure about this?

If so -> can you help @okdas look into the code and find the reasoning.

If not -> my intuition is that it keeps things simpler (to reason about) but doesn't need to be enforced.

In real life, we'll be staking / unstaking irrespective of the blocks per session, so it doesn't make sense for the framework to have this limitation.

@okdas W/e the resolution ends up being, seems like the helper in the code needs a #PUC.

@bryanchriswhite interesting. This check is the reason I adjusted the number - as I was getting an error. Given our current blocks per session is 10 (bumped from 4 a couple of weeks ago), what do you think should be the best value?

@okdas that's the check I was referring to. 👍 I did not realize that blocks per session was 10; this all makes sense now.

load-testing/tests/relays_stress_single_suppier.feature

pkg/relayer/session/sessiontree.go

Olshansk · 2024-08-20T01:37:20Z

pkg/relayer/session/sessiontree.go

-	if err := st.treeStore.ClearAll(); err != nil {
-		return err
-	}
+	// We used to `st.treeStore.ClearAll()` here, but don't need to clean up the database, causing IO load,


Great find!

Makefile

Olshansk · 2024-08-21T19:41:55Z

go.mod


-	// fix upstream GHSA-h395-qcrw-5vmq vulnerability.
-	github.com/gin-gonic/gin => github.com/gin-gonic/gin v1.7.0
+	// TODO_IN_THIS_PR: bump and remove


@okdas Let's push to finish the SMT PR.

load-testing/loadtest_manifest_localnet_single_supplier.yaml

Olshansk · 2024-08-21T20:23:06Z

load-testing/tests/relays_stress.feature

+      | application | 4                | 10             | 12         |
+      | gateway     | 1                | 10             | 3          |
+      | supplier    | 1                | 10             | 3          |


@bryanchriswhite Are you sure about this?

If so -> can you help @okdas look into the code and find the reasoning.

If not -> my intuition is that it keeps things simpler (to reason about) but doesn't need to be enforced.

In real life, we'll be staking / unstaking irrespective of the blocks per session, so it doesn't make sense for the framework to have this limitation.

@okdas W/e the resolution ends up being, seems like the helper in the code needs a #PUC.

github-actions · 2024-08-21T22:25:57Z

go.mod

@@ -57,8 +61,8 @@ require (
 	// repo is the first obvious idea, but has to be carefully considered, automated, and is not
 	// a hard blocker.
 	github.com/pokt-network/shannon-sdk v0.0.0-20240814144717-dfa95b525d46
+	// TODO_IN_THIS_PR: bump after https://github.com/pokt-network/smt/pull/52 is in


[linter-name (fail-on-found)] reported by reviewdog 🐶
// TODO_IN_THIS_PR: bump after pokt-network/smt#52 is in

github-actions · 2024-08-21T22:25:57Z

go.mod

@@ -79,7 +83,11 @@ require (
 	gopkg.in/yaml.v2 v2.4.0
 )

-require golang.org/x/text v0.16.0
+require (
+	// TODO_IN_THIS_PR: bump to the main branch commit after https://github.com/pokt-network/smt/pull/52 is in


[linter-name (fail-on-found)] reported by reviewdog 🐶
// TODO_IN_THIS_PR: bump to the main branch commit after pokt-network/smt#52 is in

github-actions · 2024-08-22T00:36:29Z

The CI will now also run the e2e tests on devnet, which increases the time it takes to complete all CI checks.

You may need to run make trigger_ci to submit an empty commit that'll trigger the tests.

GCP workloads (requires changing the namespace to 739)
Grafana network dashboard for devnet-issue-739

bryanchriswhite

🚀 🚀 🚀

One final small suggestion but otherwise this LGTM! 🚢

pkg/relayer/session/sessiontree.go

…ke-transfer

* pokt/main:
  - [Application] Implement unbonding period (#735)
  - [Docs] Move over docs from poktroll-docker-compose-example (#757)
  - [Performance] Reduce RelayMiner memory consumption under load (#739)

## Summary

## Issue

- #551
- #621

## Type of change

Select one or more:

- [ ] New feature, functionality or library
- [ ] Bug fix
- [x] Code health or cleanup
- [ ] Documentation
- [ ] Other (specify)

## Testing

**Documentation changes** (only if making doc changes)

- [ ] `make docusaurus_start`; only needed if you make doc changes

**Local Testing** (only if making code changes)

- [ ] **Unit Tests**: `make go_develop_and_test`
- [ ] **LocalNet E2E Tests**: `make test_e2e`
  - See [quickstart guide](https://dev.poktroll.com/developer_guide/quickstart) for instructions

**PR Testing** (only if making code changes)

- [ ] **DevNet E2E Tests**: Add the `devnet-test-e2e` label to the PR.
  - **THIS IS VERY EXPENSIVE**, so only do it after all the reviews are complete.
  - Optionally run `make trigger_ci` if you want to re-trigger tests without any code changes
  - If tests fail, try re-running failed tests only using the GitHub UI as shown [here](https://github.com/pokt-network/poktroll/assets/1892194/607984e9-0615-4569-9452-4c730190c1d2)

## Sanity Checklist

- [ ] I have tested my changes using the available tooling
- [ ] I have commented my code
- [ ] I have performed a self-review of my own code; both comments & source code
- [ ] I create and reference any new tickets, if applicable
- [ ] I have left TODOs throughout the codebase, if applicable

---------

Co-authored-by: Daniel Olshansky <[email protected]>
Co-authored-by: Bryan White <[email protected]>

okdas added 4 commits August 14, 2024 15:12

--wip-- [skip ci]

d0b14fd

Merge remote-tracking branch 'origin/main' into dk-relayminer-perf

908c769

try under stress test

276c90b

single supplier stress test

1f343db

okdas added the relayminer (Changes related to the Relayminer), loadtest (Work related to load testing), and scalability labels Aug 16, 2024

okdas added this to the Shannon Beta TestNet Launch milestone Aug 16, 2024

okdas self-assigned this Aug 16, 2024

okdas commented Aug 16, 2024

okdas requested a review from Olshansk August 16, 2024 01:29

--wip-- [skip ci]

7ba398e

Olshansk mentioned this pull request Aug 16, 2024

[Demand Scalability] Permissionless demand load testing & validation #742

Open

21 tasks

remove pprof files

d222e7e

Olshansk requested a review from bryanchriswhite August 19, 2024 20:50

change the database path

b971b46

okdas added 2 commits August 20, 2024 17:38

Merge remote-tracking branch 'origin/main' into dk-relayminer-perf

ad2731d

use pebble via commit

44ecd64

okdas marked this pull request as ready for review August 21, 2024 01:00

okdas added the push-image label (CI related - pushes images to ghcr.io) Aug 21, 2024

okdas added 2 commits August 20, 2024 18:01

Empty commit

5a43c31

can build

6c88dd1

github-actions bot reviewed Aug 21, 2024

bryanchriswhite requested changes Aug 21, 2024

Olshansk requested changes Aug 21, 2024

Apply suggestions from code review

aadebfd

Co-authored-by: Daniel Olshansky <[email protected]> Co-authored-by: Bryan White <[email protected]>

github-actions bot reviewed Aug 21, 2024

okdas added the devnet-test-e2e label Aug 22, 2024

github-actions bot added the devnet label Aug 22, 2024

okdas and others added 3 commits August 21, 2024 17:40

Use faucet account

90800bc

use release version for smt

eafd562

Merge branch 'main' into dk-relayminer-perf

7034b79

okdas requested a review from bryanchriswhite August 22, 2024 18:14

bryanchriswhite approved these changes Aug 22, 2024

Update pkg/relayer/session/sessiontree.go

c9ae6f2

Co-authored-by: Bryan White <[email protected]>

okdas merged commit 5455068 into main Aug 22, 2024
10 checks passed
