CQ shared store write optimisations #8507

Merged: 9 commits merged into main from lh-msg-store-bis on Jun 20, 2023

Conversation

@lhoguin (Contributor) commented on Jun 9, 2023:

Another round of CQ shared message store optimisations and refactors, this time around writes.

The first commit is more of a refactor than anything else, although no longer calling fsync does help in some scenarios. The second commit should provide a noticeable boost in performance when there are no consumers.

@mkuratczyk I have renamed the branch.
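
The description above mentions that the first commit stops calling fsync on this path. Purely to illustrate what that kind of change looks like in Erlang (this is not the actual rabbit_msg_store code; the module and function names are invented for the sketch):

```erlang
%% Illustrative sketch only: an append path that forces data to disk on
%% every write versus one that relies on buffered writes and syncs later
%% (or not at all on this path). Names here are invented for the sketch.
-module(write_sketch).
-export([append_synced/2, append_buffered/2]).

%% Per-write fsync: safest, but every append pays the cost of a disk flush.
append_synced(Fd, Bin) ->
    ok = file:write(Fd, Bin),
    ok = file:sync(Fd).

%% Buffered append: no explicit sync here; with {delayed_write, Size, Delay}
%% in the file:open/2 options, small writes are also batched by the runtime.
append_buffered(Fd, Bin) ->
    ok = file:write(Fd, Bin).
```

For example, the file in the buffered case could be opened with file:open(Path, [append, raw, binary, {delayed_write, 256 * 1024, 2000}]).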

@essen force-pushed the lh-msg-store-bis branch from d9a9a5f to 914f61a on June 9, 2023 at 10:13
@essen force-pushed the lh-msg-store-bis branch from 914f61a to 5e1a83a on June 9, 2023 at 10:40
@michaelklishin's comment was marked as outdated.

@essen force-pushed the lh-msg-store-bis branch from 2b24f09 to 839f3bf on June 16, 2023 at 08:28
lhoguin added 7 commits on June 19, 2023 at 12:22:

- Before they would only be sent periodically or when rolling over to a new file.
- We know the messages are on disk or were acked, so there is no need to do set intersections/subtractions in this scenario.
- Instead of having the message store send a message to the queue with the confirms for messages ignored due to the flying optimisation, we have the queue handle the confirms directly when removing the messages. This avoids sending potentially one Erlang message per AMQP message to the queue. (A sketch of this pattern follows the commit list.)
- Also make use of the opened file for multi-reads instead of opening/reading/closing each time. (See the second sketch below.)
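
To make the third commit message above more concrete, here is a hedged sketch of the messaging pattern being avoided versus the one adopted; the module, function names, and data shapes (a map of pending confirms, a ConfirmFun callback) are assumptions for illustration, not the real RabbitMQ APIs.

```erlang
%% Hedged sketch: avoid one Erlang message per AMQP message by letting the
%% queue record confirms itself when it removes messages.
-module(confirm_sketch).
-export([store_side_confirm/2, queue_side_confirm/3]).

%% Before (pattern being avoided): the store notifies the queue once per
%% ignored write, i.e. potentially one Erlang message per published message.
store_side_confirm(QueuePid, MsgId) ->
    QueuePid ! {msg_store_confirm, [MsgId]},
    ok.

%% After (pattern adopted, roughly): the queue already knows which message
%% ids it is removing, so it records the confirms in the same pass and no
%% extra Erlang messages are needed.
queue_side_confirm(RemovedMsgIds, PendingConfirms, ConfirmFun) ->
    ToConfirm = [Id || Id <- RemovedMsgIds, maps:is_key(Id, PendingConfirms)],
    ok = ConfirmFun(ToConfirm),
    maps:without(ToConfirm, PendingConfirms).
```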
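
And for the last commit message, a minimal sketch of reusing an already-open file handle for batched reads; the {Offset, Size} location format is an assumption for illustration, not the store's actual index layout.

```erlang
%% Hedged sketch: batch reads from an already-open raw file handle instead
%% of opening/reading/closing the file once per message.
-module(multiread_sketch).
-export([read_many/2]).

%% Before (per message):
%%   {ok, Fd} = file:open(Path, [read, raw, binary]),
%%   {ok, Bin} = file:pread(Fd, Offset, Size),
%%   ok = file:close(Fd).

%% After: keep Fd open across calls and read several messages at once.
read_many(Fd, Locations) when is_list(Locations) ->
    %% Locations is a list of {Offset, Size}; binaries come back in order.
    {ok, Bins} = file:pread(Fd, Locations),
    Bins.
```
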
@essen force-pushed the lh-msg-store-bis branch from 3c7c402 to 8c013ad on June 19, 2023 at 10:22
lhoguin added 2 commits on June 19, 2023 at 17:36:

The way I initially did this, the maybe_gc would be triggered based on candidates from 15s ago, but run against candidates from just now. This is sub-optimal because when messages are consumed rapidly, just-now candidates are likely to be in a file that is about to be deleted, and we don't want to run compaction on those.

Instead, when sending the maybe_gc we also send the candidates we had at the time. Then 15s later we check whether the file still exists. If it's gone, great! No compaction to do. (A sketch of this pattern follows.)
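
A hedged sketch of the scheduling pattern described in this commit message, written as a small gen_server; the message shapes, the 15-second delay constant, and the state layout are illustrative assumptions rather than the real rabbit_msg_store implementation.

```erlang
-module(gc_sketch).
-behaviour(gen_server).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

init(Dir) ->
    {ok, #{dir => Dir}}.

%% When a file becomes a compaction candidate, schedule the check and carry
%% the candidate along with the timer message, rather than recomputing
%% candidates when the timer fires.
handle_cast({candidate, File}, State) ->
    erlang:send_after(15000, self(), {maybe_gc, File}),
    {noreply, State};
handle_cast(_Msg, State) ->
    {noreply, State}.

%% 15 seconds later: if the file has already been deleted (for example
%% because all of its messages were consumed), there is nothing to compact.
handle_info({maybe_gc, File}, #{dir := Dir} = State) ->
    Path = filename:join(Dir, File),
    case filelib:is_file(Path) of
        false -> ok;            %% file is gone; no compaction to do
        true  -> compact(Path)  %% placeholder for the real compaction work
    end,
    {noreply, State}.

handle_call(_Req, _From, State) ->
    {reply, ok, State}.

compact(_Path) ->
    ok.
```
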
@essen force-pushed the lh-msg-store-bis branch from af1fbff to df2f718 on June 20, 2023 at 12:11
@lhoguin marked this pull request as ready for review on June 20, 2023 at 12:54
@lhoguin (Contributor, Author) commented on Jun 20, 2023:

This is ready for review/merge.

@lhoguin requested a review from mkuratczyk on June 20, 2023 at 12:54
@michaelklishin added this to the 3.13.0 milestone on Jun 20, 2023
@mkuratczyk (Contributor) commented:

Full test results:
https://snapshots.raintank.io/dashboard/snapshot/oo6eAYwUI2exFVFxdt6mTQIsb7AT2hrt?orgId=2

The biggest difference is in confirm latency when using -c 1 (shown here for 5 kB messages, but results were similar for all sizes tested):
[screenshot: confirm latency comparison, 2023-06-20]

And when publishing to a long queue (no consumers):
[screenshot: publish throughput to a long queue, 2023-06-20]

(the sudden drop in throughput is when the queues reach their max-length - dropping messages is currently quite expensive)

@mkuratczyk merged commit 985f4e8 into main on Jun 20, 2023
@mkuratczyk deleted the lh-msg-store-bis branch on June 20, 2023 at 18:04
@michaelklishin (Member) commented on Jun 20, 2023:

So, a 40% drop in publisher confirm latency on one workload, and a 60% drop on another (both have publishers and consumers online). Not bad.
