rabbit_quorum_queue: Wait for member add in add_member/4
[Why]
The `ra:add_member/3` call returns before the change is committed. This
is fine for the addition itself, but any follow-up change to the cluster
might be rejected with the `cluster_change_not_permitted` error.

[How]
Instead of changing other places to wait or retry their cluster
membership change, this patch waits for the current add to be applied
before proceeding and returning.

This fixes some transient failures in CI where such follow-up changes
are rejected and not retried, leaving the cluster in an unexpected state
for the testcase.

One example is
`quorum_queue_SUITE:force_shrink_member_to_current_member/1`.
dumbbell committed Nov 27, 2024
1 parent c7f6a71 commit b51fcf2
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion deps/rabbit/src/rabbit_quorum_queue.erl
@@ -1346,14 +1346,18 @@ add_member(Q, Node, Membership, Timeout) when ?amqqueue_is_quorum(Q) ->
                                maps:get(id, Conf)
                        end,
             case ra:add_member(Members, ServerIdSpec, Timeout) of
-                {ok, _, Leader} ->
+                {ok, {RaIndex, RaTerm}, Leader} ->
                     Fun = fun(Q1) ->
                                   Q2 = update_type_state(
                                          Q1, fun(#{nodes := Nodes} = Ts) ->
                                                      Ts#{nodes => [Node | Nodes]}
                                              end),
                                   amqqueue:set_pid(Q2, Leader)
                           end,
+                    {ok, _, _} = ra:leader_query(
+                                   Leader,
+                                   {erlang, is_list, []},
+                                   #{condition => {applied, {RaIndex, RaTerm}}}),
                     _ = rabbit_amqqueue:update(QName, Fun),
                     rabbit_log:info("Added a replica of quorum ~ts on node ~ts", [rabbit_misc:rs(QName), Node]),
                     ok;
