Generate daily benchmark results #2208
Comments
Current plan:
Maybe not for this PR, but we could retrieve the amount of gas used per tipset as well. The gas metric should be directly proportional to the computational workload in the FVM. Then we could deduce gas/s.
(from filecoin-project/lotus#3326)
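For illustration, the gas/s derivation could look like the sketch below. The `tipset_gas` mapping, gas values, and timing are all invented for illustration; this is not an existing Forest or Lotus API.

```python
# Hypothetical sketch: deducing gas/s from per-tipset gas totals.
# All values below are invented for illustration.
tipset_gas = {
    100: 5_000_000_000,  # epoch -> total gas used by the tipset
    101: 4_200_000_000,
    102: 6_100_000_000,
}
validation_seconds = 12.5  # wall-clock time spent validating these tipsets

total_gas = sum(tipset_gas.values())
gas_per_second = total_gas / validation_seconds
print(f"{gas_per_second:.3e} gas/s")
```

Since gas is meant to track computational cost, gas/s should be more hardware-comparable than raw tipsets/s.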
The next step is to measure the validation speed in tipsets per second for Forest.
What metrics script are you referring to? Are you able to run the nodes manually to get benchmark results for either Forest or Lotus? Running on calibnet would be as good as mainnet when testing. If not, this is a good place to start. Running a task manually is the first step to automating it.
Plan to either modify the benchmark script in #2231 or develop another script to cover these metrics. The screenshots above were manual results from Forest running on calibnet. Today my plan is to learn to run Lotus on calibnet.
That benchmark script works very differently from what we're trying to do in this issue. It won't help you run these benchmarks, and I doubt it's worth updating the script. Once you've figured out how to run the benchmarks manually, you can find inspiration in @elmattic's script regarding how to automate the process.
A question popped up yesterday with @jdjaustin about the second point:
Should this be measured just after snapshot loading, i.e. when Forest/Lotus are in msg sync?
The validation speed is constant both before a snapshot has been loaded (at 0 epochs per second) and after HEAD has been reached (at 1 epoch per 30 seconds). We need to measure the epochs per second after a snapshot has been loaded and before HEAD has been reached. We need to measure both Forest and Lotus, so we can't rely on Forest-only metrics.
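A minimal sketch of that measurement, assuming a `get_head_epoch` callable that queries the node's current head epoch (e.g. over its JSON-RPC API; the callable itself is hypothetical and works for either client):

```python
import time

def measure_epochs_per_second(get_head_epoch, duration_s=600.0):
    """Average sync speed over one measurement window.

    `get_head_epoch` is a hypothetical callable returning the node's
    current head epoch. The window must start after the snapshot has
    been imported and end before HEAD is reached, since outside that
    range the rate is constant and uninteresting.
    """
    start_epoch = get_head_epoch()
    start_time = time.monotonic()
    time.sleep(duration_s)  # simplified: a single window, no rolling samples
    elapsed = time.monotonic() - start_time
    return (get_head_epoch() - start_epoch) / elapsed
```

Because the same function can wrap either client's RPC, the Forest/Lotus ratio comes out of two calls with different `get_head_epoch` implementations.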
I believe so, yes.
Sure, tracking peak memory usage would be nice as well.
Adding a few subtasks that we need to address:
Removed some sub-tasks; we can do them in another PR.
A peak memory benchmark is not really needed since our memory leak was fixed by @hanabi1224.
It's still a useful metric. You never know when someone will introduce such a leak by mistake, and with a daily benchmark, we can quickly pinpoint where the regression happened. |
Yeah, I agree; I just wanted to finish this PR faster here so we can move the script to the iac repo. That said, from experience running Forest for a week, it will be hard to have such a metric: RSS can fluctuate between 7.5 and 8.5 GB.
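Since RSS fluctuates, polling it can miss the true peak. On Linux the kernel already tracks the high-water mark for each process, so a sketch could read it directly (this assumes a Linux host and is not part of any existing script):

```python
def peak_rss_kib(pid: int) -> int:
    """Peak resident set size in KiB, read from the kernel's own
    high-water mark so sampling frequency doesn't matter (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmHWM:"):
                return int(line.split()[1])  # e.g. "VmHWM:  8388608 kB"
    raise RuntimeError("VmHWM not found in /proc status")
```

Reading `VmHWM` sidesteps the fluctuation problem: the kernel records the maximum RSS ever reached, not the current value.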
Unless I'm misreading the code, this issue is not done. |
@lemmih What exactly is missing regarding the script itself? If it's the iac part, a new issue has been opened here: ChainSafe/forest-iac#92
@elmattic This issue is for comparing Forest against Lotus. We want to know not the absolute numbers for loading a snapshot, but the numbers relative to Lotus. This Ruby script doesn't do that at all. The script is nice and all, but it doesn't even try to solve the problem described in this issue.
Unless I'm misreading the code, the benchmark script can compare Forest in different configurations (say, mimalloc vs. jemalloc). While that may be useful in its own right, that's completely different from what this issue asks for. |
Hmm, I think I see where the confusion comes from. Josh solved this issue: #2714 |
My bad. Was looking at the wrong thing. |
Issue summary
There are two key performance metrics that we want to measure and track over time:
The absolute performance numbers will depend on hardware, but if we compare against Lotus, the ratio of our performance to theirs should be relatively stable across different hardware.
We should create a service (as part of forest-iac) that downloads a snapshot, imports it with both Forest and Lotus, runs each client for, say, 30 minutes, and notes down how many tipsets were validated.
The DO droplets have limited disk space so the database should be cleared between Forest and Lotus runs.
The final product might be a CSV file containing the versions of Forest and Lotus, the time it took to load the snapshots, and the validation speed in tipsets per minute. These files should be uploaded to our DO space such that we have one file per day.
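The CSV artifact might be produced along the lines below; the field names and values are illustrative placeholders, not a fixed schema.

```python
import csv

# Illustrative row; real values would come from the benchmark runs.
row = {
    "date": "2023-01-15",
    "forest_version": "forest-v0.6.0",       # placeholder
    "lotus_version": "lotus-v1.20.0",        # placeholder
    "forest_import_s": 3100,                 # snapshot import time, seconds
    "lotus_import_s": 2800,
    "forest_tipsets_per_min": 14.2,          # validation speed
    "lotus_tipsets_per_min": 11.8,
}
with open(f"benchmark-{row['date']}.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(row))
    writer.writeheader()
    writer.writerow(row)
```

Date-stamping the filename gives the one-file-per-day layout for free; uploading to the DO space would be a separate step in the service.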
Other information and links