Quorum queues: memory spike when applying a `max-length` policy retroactively to a long queue #12608

mkuratczyk · 2024-10-29T11:39:34Z

Describe the bug

Given a long quorum queue, if I apply a policy to limit the queue's length (in a real-world scenario, likely with the intention of preventing further queue growth and running out of memory), a significant memory spike occurs while the messages above the new threshold are dropped. This can easily have the opposite of the intended effect: I run out of memory because I was trying to prevent running out of memory...

In my particular case, it was even "funnier": I had a cluster on Kubernetes, applied a policy, the leader was OOMkilled, a new leader was elected and tried to apply the policy, so it was OOMkilled too. The remaining node survived because a leader could not be elected, but as soon as one of the nodes restarted, a leader was elected and OOMkilled. A policy that was meant to limit memory usage caused an OOMkill loop. :)

Reproduction steps

1. `make run-broker` (tested on `main`)
2. Publish a significant number of messages: `perf-test -qq -u qq -x 4 -y 0 -c 100 -s 5000 -ms -C 1250000`
3. Apply a policy that sets the limit to a low value: `rabbitmqctl set_policy max qq '{"max-length": 1234}'`
4. Observe memory usage

Expected behavior

Ideally there should be no significant spike when dropping messages.

Additional context

No response

kjnilsson · 2024-11-13T08:42:59Z

We create an effect for each dropped message here:

rabbitmq-server/deps/rabbit/src/rabbit_fifo_dlx.erl

Lines 151 to 154 in 5f47159

    
        {state(), ra_machine:effects()}.
    discard(Msgs, Reason, undefined, State) ->
        {State, [{mod_call, rabbit_global_counters, messages_dead_lettered,
                  [Reason, rabbit_quorum_queue, disabled, length(Msgs)]}]};

That's probably what causes the memory growth combined with the use of ++ to concatenate the resulting effects.
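To illustrate the suspected pattern, here is a minimal sketch, not the actual `rabbit_fifo` code and not necessarily how it was fixed. The module name `drop_effects_sketch`, the functions `per_message/1` and `batched/1`, and the `counters_stub` / `dead_lettered` names are all hypothetical. It contrasts building one effect per dropped message with `++` against emitting a single effect for the whole batch.

```erlang
-module(drop_effects_sketch).
-export([per_message/1, batched/1]).

%% Hypothetical illustration only; names below are not from RabbitMQ.

%% One effect per dropped message, appended with ++.
%% Each `Acc ++ [Effect]` copies the whole accumulator, so the copying
%% work (and the garbage produced) grows quadratically with the number
%% of dropped messages.
per_message(Msgs) ->
    lists:foldl(
      fun(_Msg, Acc) ->
              Acc ++ [{mod_call, counters_stub, dead_lettered, [maxlen, 1]}]
      end, [], Msgs).

%% A single effect that covers the whole batch: the effect list stays
%% the same size no matter how many messages are dropped.
batched(Msgs) ->
    [{mod_call, counters_stub, dead_lettered, [maxlen, length(Msgs)]}].
```

With a million messages above the limit, the first variant would build a million-element effect list while repeatedly copying it, whereas the second returns a one-element list.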

mkuratczyk added the bug label Oct 29, 2024

michaelklishin changed the title ~~Quorum queues: memory spike when applying a max-length policy to a long queue~~ Quorum queues: memory spike when applying a max-length policy retroactively to a long queue Oct 29, 2024

kjnilsson mentioned this issue Nov 13, 2024

QQ: reduce memory use when dropping many messages at once. #12712

Merged

michaelklishin added this to the 4.0.4 milestone Nov 13, 2024

michaelklishin closed this as completed in #12712 Nov 13, 2024

mergify bot mentioned this issue Nov 13, 2024

QQ: reduce memory use when dropping many messages at once. (backport #12712) #12716

Merged
