[Relay Miner] Address high memory usage #551

Closed
7 tasks
okdas opened this issue May 24, 2024 · 3 comments
Labels: relayminer (Changes related to the Relayminer), smt (Sparse Merkle Tree Related)

Comments

@okdas
Member

okdas commented May 24, 2024

Objective

RelayMiner consumes a large amount of RAM while processing relays, and in resource-constrained environments the process is often OOM-killed. RelayMiner should be able to handle a larger volume of requests without exhausting memory.

Origin Document

As part of our load testing plan on TestNet, we discovered this behavior when sending more relays than we usually do on smaller networks.

A pprof heap snapshot suggests most of the memory is reserved by BadgerDB:
[Screenshot: pprof heap snapshot, 2024-05-24 9:52:34 AM]
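
For anyone who wants to reproduce a snapshot like the one above, here is a minimal sketch of capturing a heap profile from inside a Go process using only the standard library. The function name `captureHeapProfile` is made up for illustration; this is not how RelayMiner exposes its profiles, just one way to get data that `go tool pprof` can read.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// captureHeapProfile (hypothetical helper) returns the current heap profile
// in the format consumed by `go tool pprof`.
func captureHeapProfile() ([]byte, error) {
	// The heap profile reflects the most recently completed GC cycle,
	// so force one to get up-to-date numbers.
	runtime.GC()

	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	profile, err := captureHeapProfile()
	if err != nil {
		panic(err)
	}
	fmt.Printf("captured %d bytes of heap profile\n", len(profile))
}
```

Writing the returned bytes to a file and opening it with `go tool pprof` should show which packages hold the most memory.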

Goals

Deliverables

  • Analyze memory usage hotspots within BadgerDB and optimize configurations.
  • Investigate and implement code optimizations for better memory management.
  • Test optimizations under load conditions to ensure stability and performance.
  • Document changes and best practices for memory management.
  • If necessary, evaluate and replace BadgerDB with a more memory-efficient alternative.

Non-goals / Non-deliverables

  • Changes to the core RelayMiner functionality beyond memory management.
  • Major refactoring of unrelated RelayMiner components.

General deliverables

  • Testing: Add new tests (unit and/or E2E) to the test suite.
  • Documentation: Update architectural or development READMEs.

Creator: @okdas
Co-Owners: TBD

@okdas okdas added relayminer Changes related to the Relayminer smt Sparse Merkle Tree Related labels May 24, 2024
@okdas okdas added this to the Shannon MainNet milestone May 24, 2024
@okdas okdas self-assigned this May 24, 2024
@okdas okdas added this to Shannon May 24, 2024
@okdas okdas moved this to 🔖 Ready in Shannon May 24, 2024
@okdas
Member Author

okdas commented May 24, 2024

Synced with @Olshansk on this - I'll do a first pass to investigate whether there is low-hanging fruit in the Badger configuration before we dig deeper into the code/smt changes.
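
As an illustration of the kind of Badger knobs such a first pass might look at, here is a sketch assuming Badger v4 (`github.com/dgraph-io/badger/v4`). The directory path and every size below are placeholder assumptions, not the values used in the eventual fix, and would need to be validated under load.

```go
package main

import (
	"log"

	badger "github.com/dgraph-io/badger/v4"
)

func main() {
	// All values here are illustrative guesses, not recommendations.
	opts := badger.DefaultOptions("/tmp/relayminer-badger-example").
		WithMemTableSize(16 << 20).    // size of each in-memory table
		WithNumMemtables(2).           // max memtables kept in memory
		WithBlockCacheSize(32 << 20).  // cache for data blocks
		WithIndexCacheSize(16 << 20).  // cap on memory used by table indices
		WithValueLogFileSize(64 << 20) // size of each value log file

	db, err := badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```

Smaller memtables and caches generally trade throughput for a lower memory footprint, so each change is worth checking against a heap profile.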

@Olshansk
Member

Things to consider:

  • Is there a way to avoid loading the whole key-value store into memory? Are we even doing this at all?
  • What would the usage be if we were to handle "serious traffic"?
  • Why isn't this a problem in Morse?
  • Is using LevelDB, RocksDB, BoltDB or other better?

@okdas okdas moved this from 🔖 Ready to 🏗 In progress in Shannon Aug 12, 2024
okdas added a commit that referenced this issue Aug 22, 2024
## Summary

## Issue

- #551 
- #621 

## Type of change

Select one or more:

- [ ] New feature, functionality or library
- [ ] Bug fix
- [x] Code health or cleanup
- [ ] Documentation
- [ ] Other (specify)

## Testing

**Documentation changes** (only if making doc changes)
- [ ] `make docusaurus_start`; only needed if you make doc changes

**Local Testing** (only if making code changes)
- [ ] **Unit Tests**: `make go_develop_and_test`
- [ ] **LocalNet E2E Tests**: `make test_e2e`
- See [quickstart
guide](https://dev.poktroll.com/developer_guide/quickstart) for
instructions

**PR Testing** (only if making code changes)
- [ ] **DevNet E2E Tests**: Add the `devnet-test-e2e` label to the PR.
- **THIS IS VERY EXPENSIVE**, so only do it after all the reviews are
complete.
- Optionally run `make trigger_ci` if you want to re-trigger tests
without any code changes
- If tests fail, try re-running failed tests only using the GitHub UI as
shown
[here](https://github.com/pokt-network/poktroll/assets/1892194/607984e9-0615-4569-9452-4c730190c1d2)


## Sanity Checklist

- [ ] I have tested my changes using the available tooling
- [ ] I have commented my code
- [ ] I have performed a self-review of my own code; both comments &
source code
- [ ] I create and reference any new tickets, if applicable
- [ ] I have left TODOs throughout the codebase, if applicable

---------

Co-authored-by: Daniel Olshansky <[email protected]>
Co-authored-by: Bryan White <[email protected]>
@okdas
Member Author

okdas commented Aug 27, 2024

With the PRs merged, the current resource utilization seems adequate. There is room for further improvement, and we can work on that when needed.

@okdas okdas closed this as completed Aug 27, 2024
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in Shannon Aug 27, 2024
okdas added a commit that referenced this issue Nov 14, 2024