Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OBS CI test aganst home:mwilck:multipath #5

Closed
wants to merge 35 commits into from
Closed

OBS CI test aganst home:mwilck:multipath #5

wants to merge 35 commits into from

Conversation

mwilck
Copy link
Collaborator

@mwilck mwilck commented Sep 10, 2021

No description provided.

Add a small utility that will be used in later patches.
Have struct mutex_lock take an optional wakeup function.
unlock() is renamed to __unlock() in order to prevent it from
being called by mistake.
exactly like snprint_config(), but takes a struct strbuf * as argument.
uevents listed on merge_node must be cleaned up, too. uevents
cancelled while being serviced and temporary queues, likewise.
The global uevq must be cleaned out in the uevent listener thread,
because it might have added events after the dispatcher thread
had already finished.
After sending "RELOADING=1" to systemd, a service must send
"READY=1" before "STOPPING=1". Otherwise systemd will be confused
and will not regard the service as stopped. Subsequent attempts
to start multipathd via socket activation fail until systemd times
out the reload operation.

The problem can be reproduced by running "multipathd shutdown"
quickly after "multipathd reconfigure".
When a reconfigure operation is requested, either by the admin
or by some condition multipathd encounters, the current code
attempts to set DAEMON_CONFIGURE state and gives up after a second
if it doesn't succeed. Apart from shutdown, this happens only
if multipathd is either already reconfiguring, or busy in the
path checker loop.

This patch modifies the logic as follows: rather than waiting,
we set a flag that requests a reconfigure operation asap, i.e.
when the current operation is finished and the status switched
to DAEMON_IDLE. In this case, multipathd will not switch to IDLE
but start another reconfigure cycle.

This assumes that if a reconfigure is requested while one is already
running, the admin has made some (additional) changes and wants
multipathd to pull them in. As we can't be sure that the currently
running reconfigure has seen the configuration changes, we need
to start over again.

A positive side effect is less waiting in clients and multipathd.

After this change, the only caller of set_config_state() is
checkerloop(). Waking up every second just to see that DAEMON_RUNNING
couldn't be set makes no sense. Therefore set_config_state() is
changed to wait "forever", or until shutdown is requested. Unless
multipathd completely hangs, the wait will terminate sooner or
later.
No functional changes. Just make the code a little easier to read.
Return code 2 from ev_remove_map means that a delayed remove has
been started, which is not the same as failure.
Use a typedef instead of spelling out the function type everywhere.
The cli_handler functions are only called from the handler table and
need not be exported.
Modify set_handler_callback() such that a missing slot is created
if no matching slot is found. This way, we can skip the initialization
with NULL handlers on startup.
EAGAIN is too generic, and doesn't fit semantically either.
ESRCH in't used anywhere else in our code.
Since e3270f7 ("multipathd: use weaker "force_reload" at startup"),
(multipath-tools 0.7.0), we only reload those maps that must be
reloaded during startup. "multipath reconfigure", OTOH, reloads
every map, which may take a long time on systems with lots of
storage devices, as every DM_DEVICE_RELOAD ioctl involves a
suspend/resume cycle.

The logic we use during startup is actually very robust and catches
all cases in which a reload is necessary. "reconfigure" operations
are often done because of configuration changes, and usually don't
require a full reload of every map.

This patch changes the default behavior of "multipath reconfigure"
to "weak" reload, like we do on startup since e3270f7. The behavior
can be changed by setting the configuration option
"force_reconfigure yes" before starting the reconfigure operation.
"multipath -r" is also affected, but "multipath -D -r" is not.

It would have been nice to have introduced a new cli command
"reconfigure force" instead, but the way "reconfigure" is
implemented, that would have required a major rewrite of the code.
Since 47cc1d3 ("multipathd: fix client response for socket
activation"), we hold back clients while reconfigure is running.
The idea of 47cc1d3 was to fix the behavior during initial
start up. When multipathd reconfigures itself during runtime,
and the reconfiguration takes a long time (a minute or more is
not unusual in big configurations), clients will time out with
no response ("timeout receiving packet"). Waiting for reconfigure
to finish breaks our timeout handling.

Therefore we should only apply the logic of 47cc1d3 during initial
configuration. In this case, the client that triggered socket
activation may still encounter a timeout, but there's not much we can
do about that.

Fixes: 47cc1d3 ("multipathd: fix client response for socket activation")
The unix socket listener thread doesn't even look at the revents
returned by poll() while the daemon is configuring. This may cause a
closed client socket to be kept open for a long time by the server,
while the listener basically performs a busy loop, as ppoll() always
returns immediately as long as the POLLHUP condition exists.

Worse, it can happen that multipathd reads data from such a closed
client socket after the client has disconnected. See the description
of POLLHUP in poll(2).

Close connections immediately if HUP is received.

Also, use the fd in log messages to identify the client rather
than the random index.
Avoid hardcoding the indices as 0, 1, 2...
Minor edit: if notifications are off, we set the poll fd to
-1 but still use the POLLIN mask. It looks nicer if to poll
the correct fd, but reset the event mask to 0 if we're not
actually interested in it.
Currently the uxlsnr handles each client request (receive requset -
handle request - respond) in a single loop iteration. This has
severe disadvantages. In particular, the code may wait in poll()
called from read_all(), or wait for the vecs lock, while other
clients are ready to be serviced or signals to be handled.

This patch adds some fields to "struct client" which will be used
by later patches to change this into a state machine that basically
waits only in place, the ppoll() call in uxsock_listen().

For now, we just introduce and initialize the fields.
uxsock_trigger() really belongs into cli.c. I suppose that way back in
the past there were strong reasons to call this function via a
pointer. I don't think these reasons are valid any more. Moving
the function to cli.c allows restructuring the code.

No functional changes.
parse_cmd() does more than the name says - it parses, executes
handlers, and even provides reply strings for some cases. This doesn't
work well with the state machine idea. Thus move it to uxlsnr.c,
where later patches will move some functionality elsewhere.

No functional changes.
This function just prints a warning, anyway. If this warning
is printed, the client will see a timeout and print a warning, too.
A later patch will re-introduce this function with real functionality.
No functional changes at this point. handle_client() will become
the state machine for handling client requests.
As a first step towards our state machine, avoid the call to
read_all() via recv_packet_from_client(). handle_client() is now
invoked twice for the same connection. The first time it reads
the command length, and later on it reads the command itself
piece-wise, as sent by the client. This will be just a single
read in most cases, but not always.
This allows us to simplify callers by not having to track the
reply length separately.
The SO_PEERCRED socket option returns "the credentials that were
in effect at the time of the call to connect(2)" (see unix(7)).
So we might as well fetch these credentials at that time.
…cmd()

As a next step towards the state machine, give the handler functions
access to the state of the client connection.
Move the actual execution of the handler out of parse_cmd(). For now,
we do it in uxsock_trigger().
Rather than using a separate poor-man's parser for checking root
commands, use the real parser. It will return "LIST" as first verb
for the read-only commands that non-root users may execute.
This patch sets up the bulk of the state machine. The idea is to
fall through the case labels as long as possible (when steps succeed)
and return to the caller if either an error occurs, or it becomes
necessary to wait for some pollable condition.

While doing this, switch to negative error codes for the functions
in uxlsnr.c (e.g. parse_cmd()). Positive return codes are reserved
for the cli_handler functions themselves. This way we can clearly
distinguish the error source, and avoid confusion and misleading
error messages. No cli_handler returns negative values.

Note: with this patch applied, clients may hang and time out if
the handler fails to acquire the vecs lock. This will be fixed in the
follow-up patch "multipathd: uxlsnr: add idle notification".
The previous patches added the state machine and the timeout handling,
but there was no wakeup mechanism for the uxlsnr for cases where
client connections were waiting for the vecs lock.

This patch uses the previously introduced wakeup mechanism of
struct mutex_lock for this purpose. Processes which unlock the
"global" vecs lock send an event in an eventfd which the uxlsnr
loop is polling for.

As we are now woken up for servicing client handlers that don't
wait for input but for the lock, we need to set up the pollfds
differently, and iterate over all clients when handling events,
not only over the ones that are receiving. The hangup handling
is changed, too. We have to look at every client, even if one has
hung up. Note that I don't take client_lock for the loop in
uxsock_listen(), it's not necessary and will be removed elsewhere
in a follow-up patch.

With this in place, the lock need not be taken in execute_handler()
any more. The uxlsnr only ever calls trylock() on the vecs lock,
avoiding any waiting for other threads to finish.
Our ppoll() call needs to wake up when a client request times out.
This logic can be added by determining the first client that's about
to time out. The logic in handle_client() will then cause a timeout
reply to be sent to the client. This is more client-friendly
as the client timing out without receiving a reply.
send_packet() may busy-loop. By polling for POLLOUT, we can
avoid that, even if it's very unlikely in practice.
The list of clients is never changed anywhere except in
uxsock_listen(). No need to lock.
The server checks for root permissions anyway. "multipathd -k"
should work for ordinary users as long as no priviledged commands
are executed.
@mwilck
Copy link
Collaborator Author

mwilck commented Sep 10, 2021

Still not perfect ...

@mwilck mwilck closed this Sep 10, 2021
mwilck added a commit that referenced this pull request Dec 1, 2021
... by the paths and pg vectors of the map to be removed.

Original bug report from Lixiaokeng ("libmultipath: clear removed path from mpp"):

multipathd[3525635]: ==3525635==ERROR: AddressSanitizer: heap-use-after-free on address 0xffffa4902fc0 at pc 0xffffac7d5b88 bp 0xffffa948dac0 sp 0xffffa948dae0
multipathd[3525635]: READ of size 8 at 0xffffa4902fc0 thread T7
multipathd[3525635]:    #0 0xffffac7d5b87 in free_multipath (/usr/lib64/libmultipath.so.0+0x4bb87)
multipathd[3525635]:    #1 0xaaaad6cf7057  (/usr/sbin/multipathd+0x17057)
multipathd[3525635]:    #2 0xaaaad6cf78eb  (/usr/sbin/multipathd+0x178eb)
multipathd[3525635]:    #3 0xaaaad6cff4df  (/usr/sbin/multipathd+0x1f4df)
multipathd[3525635]:    #4 0xaaaad6cfffe7  (/usr/sbin/multipathd+0x1ffe7)
multipathd[3525635]:    #5 0xffffac807be3 in uevent_dispatch (/usr/lib64/libmultipath.so.0+0x7dbe3)
multipathd[3525635]:    #6 0xaaaad6cf563f  (/usr/sbin/multipathd+0x1563f)
multipathd[3525635]:    #7 0xffffac6877af  (/usr/lib64/libpthread.so.0+0x87af)
multipathd[3525635]:    #8 0xffffac44118b  (/usr/lib64/libc.so.6+0xd518b)
multipathd[3525635]: 0xffffa4902fc0 is located 1344 bytes inside of 1440-byte region [0xffffa4902a80,0xffffa4903020)
multipathd[3525635]: freed by thread T7 here:
multipathd[3525635]:    #0 0xffffac97d703 in free (/usr/lib64/libasan.so.4+0xd0703)
multipathd[3525635]:    #1 0xffffac824827 in orphan_paths (/usr/lib64/libmultipath.so.0+0x9a827)
multipathd[3525635]:    #2 0xffffac824a43 in remove_map (/usr/lib64/libmultipath.so.0+0x9aa43)
multipathd[3525635]:    #3 0xaaaad6cf7057  (/usr/sbin/multipathd+0x17057)
multipathd[3525635]:    #4 0xaaaad6cf78eb  (/usr/sbin/multipathd+0x178eb)
multipathd[3525635]:    #5 0xaaaad6cff4df  (/usr/sbin/multipathd+0x1f4df)
multipathd[3525635]:    #6 0xaaaad6cfffe7  (/usr/sbin/multipathd+0x1ffe7)
multipathd[3525635]:    #7 0xffffac807be3 in uevent_dispatch (/usr/lib64/libmultipath.so.0+0x7dbe3)
multipathd[3525635]:    #8 0xaaaad6cf563f  (/usr/sbin/multipathd+0x1563f)
multipathd[3525635]:    #9 0xffffac6877af  (/usr/lib64/libpthread.so.0+0x87af)
multipathd[3525635]:    #10 0xffffac44118b  (/usr/lib64/libc.so.6+0xd518b)

When mpp only has one path and log out the path, there is an asan error.
In remove_mpp, the pp is freed firstly in orphan_path but is accessed,
changed in free_multipath later. Before free_path(pp), the pp should be
cleared from pp->mpp.

Reported-by: Lixiaokeng <[email protected]>
Tested-by: Lixiaokeng <[email protected]>
Reviewed-by: Benjamin Marzinski <[email protected]>
mwilck added a commit that referenced this pull request Dec 2, 2021
... by the paths and pg vectors of the map to be removed.

Original bug report from Lixiaokeng ("libmultipath: clear removed path from mpp"):

multipathd[3525635]: ==3525635==ERROR: AddressSanitizer: heap-use-after-free on address 0xffffa4902fc0 at pc 0xffffac7d5b88 bp 0xffffa948dac0 sp 0xffffa948dae0
multipathd[3525635]: READ of size 8 at 0xffffa4902fc0 thread T7
multipathd[3525635]:    #0 0xffffac7d5b87 in free_multipath (/usr/lib64/libmultipath.so.0+0x4bb87)
multipathd[3525635]:    #1 0xaaaad6cf7057  (/usr/sbin/multipathd+0x17057)
multipathd[3525635]:    #2 0xaaaad6cf78eb  (/usr/sbin/multipathd+0x178eb)
multipathd[3525635]:    #3 0xaaaad6cff4df  (/usr/sbin/multipathd+0x1f4df)
multipathd[3525635]:    #4 0xaaaad6cfffe7  (/usr/sbin/multipathd+0x1ffe7)
multipathd[3525635]:    #5 0xffffac807be3 in uevent_dispatch (/usr/lib64/libmultipath.so.0+0x7dbe3)
multipathd[3525635]:    #6 0xaaaad6cf563f  (/usr/sbin/multipathd+0x1563f)
multipathd[3525635]:    #7 0xffffac6877af  (/usr/lib64/libpthread.so.0+0x87af)
multipathd[3525635]:    #8 0xffffac44118b  (/usr/lib64/libc.so.6+0xd518b)
multipathd[3525635]: 0xffffa4902fc0 is located 1344 bytes inside of 1440-byte region [0xffffa4902a80,0xffffa4903020)
multipathd[3525635]: freed by thread T7 here:
multipathd[3525635]:    #0 0xffffac97d703 in free (/usr/lib64/libasan.so.4+0xd0703)
multipathd[3525635]:    #1 0xffffac824827 in orphan_paths (/usr/lib64/libmultipath.so.0+0x9a827)
multipathd[3525635]:    #2 0xffffac824a43 in remove_map (/usr/lib64/libmultipath.so.0+0x9aa43)
multipathd[3525635]:    #3 0xaaaad6cf7057  (/usr/sbin/multipathd+0x17057)
multipathd[3525635]:    #4 0xaaaad6cf78eb  (/usr/sbin/multipathd+0x178eb)
multipathd[3525635]:    #5 0xaaaad6cff4df  (/usr/sbin/multipathd+0x1f4df)
multipathd[3525635]:    #6 0xaaaad6cfffe7  (/usr/sbin/multipathd+0x1ffe7)
multipathd[3525635]:    #7 0xffffac807be3 in uevent_dispatch (/usr/lib64/libmultipath.so.0+0x7dbe3)
multipathd[3525635]:    #8 0xaaaad6cf563f  (/usr/sbin/multipathd+0x1563f)
multipathd[3525635]:    #9 0xffffac6877af  (/usr/lib64/libpthread.so.0+0x87af)
multipathd[3525635]:    #10 0xffffac44118b  (/usr/lib64/libc.so.6+0xd518b)

When mpp only has one path and log out the path, there is an asan error.
In remove_mpp, the pp is freed firstly in orphan_path but is accessed,
changed in free_multipath later. Before free_path(pp), the pp should be
cleared from pp->mpp.

Reported-by: Lixiaokeng <[email protected]>
Tested-by: Lixiaokeng <[email protected]>
Reviewed-by: Benjamin Marzinski <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant