fix: slow regression tests #4117
Conversation
Impact on a full run: https://github.com/dragonflydb/dragonfly/actions/runs/11796502461 (I will post the results once it completes)
Signed-off-by: kostas <[email protected]>
x86 debug: was 28 minutes, now takes 22. Savings on a full run: 24 minutes (because arm tests run sequentially). https://github.com/dragonflydb/dragonfly/actions/runs/11797723885/job/32862315798
At some point I will take care of: 180.20s call dragonfly/replication_test.py::test_replicaof_reject_on_load[df_seeder_factory0-df_factory0]. It's the next low-hanging fruit.
        await asyncio.sleep(0.01)

    checker = asyncio.create_task(check_memory_usage(instance))
    await client.execute_command(
I would add a comment here explaining what it exactly does, i.e. add 1 db entry of the given type with elements num, each one of size element size.
Now my question is: why do we end up with more than 2 GB RSS if we have one entry of 1000 elements, each 1 MB in size?
RSS will grow during DEBUG POPULATE. For example, run:
debug populate 1 prefix 1000000 TYPE hash RAND ELEMENTS 1000
This creates a hash table with 1 GB total size. Now do INFO MEMORY: RSS is 2062901248 (2 GB, roughly doubled).
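A quick back-of-the-envelope check of the numbers in this thread (purely illustrative arithmetic, not Dragonfly code; the figures are taken from the comment above):

```python
# One hash entry with 1000 elements of ~1 MB each, as created by the
# `debug populate` command quoted above.
elements = 1000
element_size = 1_000_000                 # 1 MB per element
logical_size = elements * element_size   # ~1 GB of user data

observed_rss = 2_062_901_248             # value reported by INFO MEMORY

# The ratio is roughly 2, i.e. RSS is about double the logical data
# size, which is exactly the overhead being questioned here.
print(observed_rss / logical_size)
```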
This does not answer the question why, but we can continue with this PR. Please create another GitHub ticket so we can follow up and investigate why we have this overhead.
Ah, you meant what is causing the RSS spike. I do not know; it was an observation. I created issue #4124.
tests/dragonfly/snapshot_test.py
Outdated
@@ -566,12 +566,12 @@ async def test_tiered_entries_throttle(async_client: aioredis.Redis):
     assert await StaticSeeder.capture(async_client) == start_capture


-@dfly_args({"proactor_threads": 1})
+@dfly_args({"serialization_max_chunk_size": 4096, "proactor_threads": 1})
 @pytest.mark.parametrize(
     "query",
Can you please rename query to container type?
tests/dragonfly/snapshot_test.py
Outdated
    ten_mb = 10_000_000
    one_gb = 1_000_000_000
    elements = 1000
    one_mb = 1_000_000
Rename one_mb to element_size.
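A minimal sketch of the suggested rename, assuming the constants keep the values shown in the snippet above (the names other than element_size are taken from that snippet):

```python
# Constants from the test, with the reviewer's rename applied:
# one_mb -> element_size, since the value describes the size of each
# element, not a generic megabyte.
ten_mb = 10_000_000
one_gb = 1_000_000_000
elements = 1000
element_size = 1_000_000  # renamed from one_mb

# Sanity check: 1000 elements of 1 MB each add up to 1 GB of data.
assert elements * element_size == one_gb
```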
    info = await client.info("ALL")
    # rss doubles because of DEBUG POPULATE
What do you mean? Why does it double because of DEBUG POPULATE?
see my other comment
We have a few very slow tests on the CI. Specifically:
This PR changes:
It refactors test_big_value_serialization. First, the test was incorrect; in fact, for some reason it did not even run with the big value serialization flag set. Second, the test used execute_command in a loop, which took a substantial amount of time. The changes on my machine reduce the total running time of the test from 4 * 170 seconds on average to a staggering 1 minute and 10 seconds for all 4.

Also notice that we don't really need to test big value serialization on a stress test. It's redundant because when we set the value to a small number (4096 etc.) it will have the same effect regardless of the stress load. I removed that one test case, saving roughly (235 * 2) seconds.