Quorum queues: memory spike when applying a max-length policy retroactively to a long queue #12608

Closed
mkuratczyk opened this issue Oct 29, 2024 · 1 comment · Fixed by #12712

@mkuratczyk
Contributor

Describe the bug

Given a long quorum queue, if I apply a policy to limit the queue's length (in a real-world scenario, likely to prevent further queue growth and running out of memory), a significant memory spike occurs while the messages above the new threshold are dropped. This can easily have the opposite of the intended effect: I run out of memory because I was trying to prevent running out of memory...

In my particular case, it was even "funnier": I had a cluster on Kubernetes and applied a policy. The leader was OOMkilled, a new leader was elected, tried to apply the policy, and was OOMkilled as well. The remaining node survived because a leader could not be elected, but as soon as one of the nodes restarted, a leader was elected and OOMkilled. A policy that was meant to limit memory usage caused an OOMkill loop. :)

Reproduction steps

  1. make run-broker (tested on main)
  2. Publish a significant number of messages: perf-test -qq -u qq -x 4 -y 0 -c 100 -s 5000 -ms -C 1250000
  3. Apply a policy that sets the limit to a low value: rabbitmqctl set_policy max qq '{"max-length": 1234}'
  4. Observe memory usage
[Screenshot: memory usage after applying the policy with set_policy (main)]

Expected behavior

Ideally there should be no significant spike when dropping messages.

Additional context

No response

@mkuratczyk mkuratczyk added the bug label Oct 29, 2024
@michaelklishin michaelklishin changed the title from "Quorum queues: memory spike when applying a max-length policy to a long queue" to "Quorum queues: memory spike when applying a max-length policy retroactively to a long queue" Oct 29, 2024
@kjnilsson
Contributor

We create an effect for each dropped message here:

{state(), ra_machine:effects()}.
discard(Msgs, Reason, undefined, State) ->
    {State, [{mod_call, rabbit_global_counters, messages_dead_lettered,
              [Reason, rabbit_quorum_queue, disabled, length(Msgs)]}]};

That, combined with the use of ++ to concatenate the resulting effects, is probably what causes the memory growth.
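
To make the suspected pattern concrete, here is a minimal sketch. It is not the actual rabbit_fifo code and not necessarily what the fix in #12712 does. The module name `discard_effects_sketch` and both function names are hypothetical; only the shape of the `mod_call` effect tuple is taken from the snippet above. It contrasts building one effect per dropped message with `++` against emitting a single effect that carries the count.

```erlang
-module(discard_effects_sketch).
-export([per_message/2, batched/2]).

%% Hypothetical illustration: one effect per dropped message, appended
%% with ++. Each ++ copies the accumulated list, so dropping N messages
%% does O(N^2) copying and leaves N effect terms on the heap.
per_message(Msgs, Reason) ->
    lists:foldl(
      fun(_Msg, Effects) ->
              Effects ++ [{mod_call, rabbit_global_counters,
                           messages_dead_lettered,
                           [Reason, rabbit_quorum_queue, disabled, 1]}]
      end, [], Msgs).

%% Hypothetical alternative: a single effect whose last argument is the
%% number of dropped messages, so the effect list stays at length 1.
batched(Msgs, Reason) ->
    [{mod_call, rabbit_global_counters, messages_dead_lettered,
      [Reason, rabbit_quorum_queue, disabled, length(Msgs)]}].
```

With a queue that is over a million messages above the new limit, the first variant would produce over a million effect tuples in one go, while the second produces one. Whether the real code path can be batched this way depends on how dropped messages are collected, which this sketch does not model.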
