
fix: slow regression tests tests #4117

Merged
merged 3 commits into main from kpr5 on Nov 13, 2024

Conversation

@kostasrim (Contributor) commented Nov 12, 2024

We have a few very slow tests on the CI. Specifically:

241.30s call     dragonfly/replication_test.py::test_replication_all[df_factory0-mode0-8-t_replicas8-seeder_config8-50000-True]
240.80s call     dragonfly/replication_test.py::test_replication_all[df_factory0-mode0-8-t_replicas7-seeder_config7-50000-False]
239.92s call     dragonfly/replication_test.py::test_replication_all[df_factory0-mode1-8-t_replicas7-seeder_config7-50000-False]
236.87s call     dragonfly/replication_test.py::test_replication_all[df_factory0-mode1-8-t_replicas8-seeder_config8-50000-True]
178.18s call     dragonfly/snapshot_test.py::test_big_value_serialization_memory_limit[HSET-df_factory0]
175.97s call     dragonfly/snapshot_test.py::test_big_value_serialization_memory_limit[SADD-df_factory0]
175.79s call     dragonfly/snapshot_test.py::test_big_value_serialization_memory_limit[ZSET-df_factory0]
172.35s call     dragonfly/snapshot_test.py::test_big_value_serialization_memory_limit[LIST-df_factory0]
60.21s call     dragonfly/connection_test.py::test_pubsub_busy_connections[df_factory0]

This PR makes two changes:

First, it refactors test_big_value_serialization. The test was incorrect; in fact, for some reason it did not even run with the big value serialization flag set. It also used execute_command in a loop, which took a substantial amount of time. On my machine, the changes reduce the total running time from roughly 4 × 170 seconds to a staggering 1 minute and 10 seconds for all 4 variants.
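
For illustration, here is a minimal sketch of the pattern behind the refactor (hypothetical names, not the PR's exact code): the old test issued one execute_command round-trip per element, while a single server-side DEBUG POPULATE call (the Dragonfly command quoted later in this thread) builds the same entry in one round-trip.

import asyncio

import redis.asyncio as aioredis


async def populate_slow(client: aioredis.Redis, elements: int, element_size: int):
    # Old pattern: one network round-trip per element.
    for i in range(elements):
        await client.execute_command("HSET", "key:1", f"f{i}", "x" * element_size)


async def populate_fast(client: aioredis.Redis, elements: int, element_size: int):
    # New pattern: one DEBUG POPULATE call generates the whole entry
    # server-side (count, key prefix, value size, TYPE, RAND, ELEMENTS).
    await client.execute_command(
        "DEBUG", "POPULATE", 1, "key", element_size,
        "TYPE", "hash", "RAND", "ELEMENTS", elements,
    )


async def main():
    client = aioredis.Redis()  # assumes a local Dragonfly on the default port
    await populate_fast(client, elements=1000, element_size=1_000_000)
    await client.aclose()  # redis-py 5.x


asyncio.run(main())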

Second, notice that we don't really need to test big value serialization in a stress test. It's redundant: when we set serialization_max_chunk_size to a small number (4096, etc.), it has the same effect regardless of the stress load, as the toy sketch below illustrates. I removed that one test case, saving roughly (235 × 2) seconds.
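
To make the threshold argument concrete, here is a toy sketch (plain Python, not Dragonfly code) of why a small serialization_max_chunk_size exercises the chunked, big-value path with any value larger than the threshold, no stress-sized dataset required:

def chunked_serialize(value: bytes, max_chunk_size: int) -> list[bytes]:
    # Values at or below the threshold are emitted whole; anything larger
    # takes the chunked path, even with a modest dataset.
    if len(value) <= max_chunk_size:
        return [value]
    return [value[i : i + max_chunk_size] for i in range(0, len(value), max_chunk_size)]


# A 10 KB value with a 4096-byte threshold is already split into 3 chunks.
assert len(chunked_serialize(b"x" * 10_000, 4096)) == 3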

  • refactor test_big_value_serialization
  • remove unneeded replication tests for big value serialization

@kostasrim (Contributor, Author) commented:

Impact on a full run: https://github.com/dragonflydb/dragonfly/actions/runs/11796502461

(I will post the results once it completes)

@kostasrim kostasrim self-assigned this Nov 12, 2024
Signed-off-by: kostas <[email protected]>
@kostasrim (Contributor, Author) commented:

x86 debug: 28 minutes, now takes 22
arm debug: 42 minutes, now takes 34
x86 release: 28 minutes, now takes 19
arm release: 46 minutes, now takes 28

Savings on a full run: 24 minutes (because the arm tests run sequentially)

https://github.com/dragonflydb/dragonfly/actions/runs/11797723885/job/32862315798
vs
https://github.com/dragonflydb/dragonfly/actions/runs/11793906726/job/32850430451

@kostasrim kostasrim requested a review from adiholden November 12, 2024 15:51
@kostasrim (Contributor, Author) commented:

At some point I will take care of:

180.20s call dragonfly/replication_test.py::test_replicaof_reject_on_load[df_seeder_factory0-df_factory0]

It's the next low-hanging fruit.

await asyncio.sleep(0.01)

checker = asyncio.create_task(check_memory_usage(instance))
await client.execute_command(
@adiholden (Collaborator) commented:
I would add a comment here explaining what it exactly does, i.e. add 1 db entry of the given type, with elements_num elements, each of size element_size.

Now my question is: why do we end up with more than 2 GB RSS if we have one entry of 1000 elements, each 1 MB in size?

@kostasrim (Contributor, Author) commented Nov 13, 2024:

RSS grows during DEBUG POPULATE. For example, run:

debug populate 1 prefix 1000000 TYPE hash RAND ELEMENTS 1000

This creates a hash table with 1 GB total size. Now run INFO MEMORY: RSS is 2062901248 (~2 GB, roughly doubled).
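
A minimal reproduction sketch of that observation (assuming a local Dragonfly instance on the default port; exact numbers vary by machine):

import asyncio

import redis.asyncio as aioredis


async def main():
    client = aioredis.Redis()
    # One hash entry, 1000 elements, each ~1 MB of random data => ~1 GB payload.
    await client.execute_command(
        "DEBUG", "POPULATE", 1, "prefix", 1_000_000,
        "TYPE", "hash", "RAND", "ELEMENTS", 1000,
    )
    info = await client.info("memory")
    print(info["used_memory_rss"])  # observed above: 2062901248, ~2x the payload
    await client.aclose()  # redis-py 5.x


asyncio.run(main())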

@adiholden (Collaborator) commented:

This does not answer the question of why, but we can continue with this PR. Please create another GitHub ticket so we can follow up and investigate why we have this overhead.

@kostasrim (Contributor, Author) commented:

Ah, you mean what is causing the RSS spike. I don't know; it was just an observation. I created issue #4124.

@@ -566,12 +566,12 @@ async def test_tiered_entries_throttle(async_client: aioredis.Redis):
assert await StaticSeeder.capture(async_client) == start_capture


@dfly_args({"proactor_threads": 1})
@dfly_args({"serialization_max_chunk_size": 4096, "proactor_threads": 1})
@pytest.mark.parametrize(
"query",
@adiholden (Collaborator) commented:

Can you please rename query to container_type?
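
For illustration, the rename might look like this (a hypothetical sketch, not the PR's code; the values mirror the container types in the slow-test list above):

import pytest


@pytest.mark.parametrize("container_type", ["HSET", "SADD", "ZSET", "LIST"])
def test_big_value_serialization_memory_limit(container_type):
    # The parameter now names what it actually is: a container type.
    assert container_type in ("HSET", "SADD", "ZSET", "LIST")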

ten_mb = 10_000_000
one_gb = 1_000_000_000
elements = 1000
one_mb = 1_000_000
@adiholden (Collaborator) commented:

Rename one_mb to element_size.


info = await client.info("ALL")
# RSS doubles because of DEBUG POPULATE
@adiholden (Collaborator) commented:

What do you mean? Why does it double because of DEBUG POPULATE?

@kostasrim (Contributor, Author) commented:

see my other comment

@kostasrim kostasrim requested a review from adiholden November 13, 2024 07:59
@kostasrim kostasrim merged commit 91c236a into main Nov 13, 2024
12 checks passed
@kostasrim kostasrim deleted the kpr5 branch November 13, 2024 08:32
@kostasrim kostasrim changed the title from "fix: slow CI tests" to "fix: slow regression tests tests" Nov 13, 2024