-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multipath-tools 0.9.6 #68
Conversation
Fix warnings reported by udevadm verify: multipath/11-dm-mpath.rules:18 Whitespace after comma is expected. ... multipath/11-dm-mpath.rules: udev rules check failed Note (mwilck): technically, the syntax of the udev rules was correct; they are parsed and executed by udev correctly, and this is unlikely to change. But systemd has enabled stricter checks in "udevadm verify" to ensure better readability of udev rules files (systemd/systemd#26980). This commit makes sure the multipath-tools rules comply with these checks. Signed-off-by: Dmitry V. Levin <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
Fix the following warning reported by udevadm verify: kpartx/dm-parts.rules:35 A comma between tokens is expected. kpartx/dm-parts.rules: udev rules check failed Note (mwilck): technically, this udev rule was parsed and executed by udev correctly, and this is unlikely to change. But the missing comma didn't comply with the udev(7) man page. This commit fixes that. Signed-off-by: Dmitry V. Levin <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
…tibus ALUA is needed by Hitachi Global-Active Device (GAD): https://knowledge.hitachivantara.com/Documents/Management_Software/SVOS/8.1/Global-Active_Device/Overview_of_global-active_device (Information about ALUA support from personal communication from Hitachi). "no_path_retry" recommendation from: https://knowledge.hitachivantara.com/Documents/Management_Software/SVOS/9.8.6/Volume_Management_-_VSP_5000_Series/Host_Attachment/05_Red_Hat_Linux_configuration_and_attachment#Device_Mapper_(DM)_Multipath_configuration.0D.0A____for_Red_Hat_Linux Cc: Matthias Rudolph <[email protected]> Cc: Martin Wilck <[email protected]> Cc: Benjamin Marzinski <[email protected]> Cc: Christophe Varoqui <[email protected]> Cc: DM-DEVEL ML <[email protected]> Signed-off-by: Xose Vazquez Perez <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
libmultipath/structs_vec.c:270: mulitpath ==> multipath libmultipath/libmultipath.version:36: overriden ==> overridden libmpathutil/libmpathutil.version:36: overriden ==> overridden SUMMARY: mulitpath 1 overriden 2 Cc: Martin Wilck <[email protected]> Cc: Benjamin Marzinski <[email protected]> Cc: Christophe Varoqui <[email protected]> Cc: DM-DEVEL ML <[email protected]> Signed-off-by: Xose Vazquez Perez <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
When we group paths by prio and the priority changes, paths can end up temporarily in the wrong path groups. This usually happens when some paths are down, so their priority can't be updated. To avoid this for ALUA paths, group them by their target port group instead. The path groups chosen by this policy won't always match with those chosen by group_by_prio, since it is possible for multiple ALUA target port groups to have the same priority. Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
copying the value into a passed in buffer doesn't help any of the callers of this function. It's just wasted work. Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
Make it possible to easily check a path's target port group. Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
This allows configuations to use "group_by_prio" if alua is autodetected and another policy if it isn't, so they can work with detect_prio. Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
If this and "detect_pgpolicy" are both selected and ALUA is autodetected, the multipath device will use the "group_by_tpg" policy instead of the "group_by_prio" policy. Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
When multipath is not set to group_by_prio, different paths in a pathgroup can have different priorities. If there is a problem getting the priority of an active path, its priority will be set to PRIO_UNDEF. This will change the priority of the whole pathgroup, even though it's likely that this is simply a temporary error. Instead, do not count PRIO_UNDEF paths towards to priority of the path group, unless there are no paths that have an actual priority. This will not effect the priority of multipath devices with group_by_prio, since all paths in a pathgroup will have the same priority. Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
Make sure that pathgroups that include paths with a PRIO_UNDEF priority are properly sorted. Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
Multipathd previously had various rules to update priorities in update_prio(), need_switch_pathgroup(), and reload_map(). Instead, only update the priority in update_prio(). To cover the cases where the priorities were getting updated in other functions, update_prio() now always updates all paths' priorities, if the priority on the path that it was called with changes. Also, do not try to update a path's priority if it is down. Previously, when refreshing all the paths' priorities, a path could have its priority updated when it was in the PATH_DOWN state, as long as its priority was PRIO_UNDEF. But if a path is down, and the last time multipath tried to get its priority, it failed, it seems likely that the prioritizer will just fail again. Finally, skip updating all paths' priorities in deferred_failback_tick(). Now that all the paths' priorities will get updated when one changes before starting the deferred failback count, it's no longer necessary to update them all again when the failback timeout expires. The checker loop will continue to update them correctly while the timeout is going on. Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
need_switch_pathgroup() only checks if the currently used pathgroup is not the highest priority pathgroup. If it isn't, all multipathd does is instruct the kernel to switch to the correct pathgroup. However, the kernel treats the pathgroups as if they were ordered by priority. When the kernel runs out of paths to use in the currently selected pathgroup, it will start checking the pathgroups in order until it finds one with usable paths. need_switch_pathgroup() should also check if the pathgroups are out of order, and if so, multipathd should reload the map to reorder them correctly. Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
When need_switch_pathgroup() is called by deferred_failback_tick(), there is a chance that mpp->paths will be NULL, even if there are paths in the multipath device's pathgroups. Instead check if there are multiple pathgroups, since multipath can't be using the wrong pathgroup if there is one or none. Reviewed-by: Martin Wilck <[email protected]> Signed-off-by: Benjamin Marzinski <[email protected]>
Cc: Martin Wilck <[email protected]> Cc: Benjamin Marzinski <[email protected]> Cc: Christophe Varoqui <[email protected]> Cc: DM-DEVEL ML <[email protected]> Signed-off-by: Xose Vazquez Perez <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
Debian's build system helpers automatically set "PKG_CONFIG" correctly, so it would be convenient for Debian to use this name instead of "PKGCONFIG". The majority of projects also seem to call this makefile variable "PKG_CONFIG". Signed-off-by: Chris Hofstaedtler <[email protected]> Acked-by: Sam James <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
It would appear KBUILD_BUILD_TIMESTAMP is meant to be used the same way as in the linux kernel build. For linux, builders are supposed to set KBUILD_BUILD_TIMESTAMP in the environment if they want a stable timestamp. The libddmmp makefile however tries to directly call git to get a timestamp, which fails in a typical Debian build environment, which is not a full git source tree. Have libdmmp/Makefile use KBUILD_BUILD_TIMESTAMP from the environment if available, otherwise use git as before. mwilck: changed slightly, preserving the original logic Signed-off-by: Chris Hofstaedtler <[email protected]> Signed-off-by: Martin Wilck <[email protected]>
Issue found by coverity: CID 393674: Code maintainability issues (UNUSED_VALUE) Assigning value "NULL" to "mpp" here, but that stored value is overwritten before it can be used. Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
If pp->dev_loss_tmo == DEV_LOSS_TMO_UNSET, sysfs_set_scsi_tmo() would not set it to min_dev_loss_tmo, causing the system dev_loss_tmo value (by default, 30s) to remain unchanged. Fix it. Fixes: 6ad77db ("libmultipath: Set the scsi timeout parameters by path") Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
If the nvme native multipath driver is enabled, blacklist nvme devices for dm-multipath by default. This is particularly useful with "find_multipaths greedy". Signed-off-by: Martin Wilck <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
Signed-off-by: Martin Wilck <[email protected]>
Signed-off-by: Martin Wilck <[email protected]>
Only for "abi" not yet, we need make sure that it's output remains unchanged if we change the runner. Signed-off-by: Martin Wilck <[email protected]>
Mainly; add multipathc info, and info to check man pages Cc: Martin Wilck <[email protected]> Cc: Benjamin Marzinski <[email protected]> Cc: Christophe Varoqui <[email protected]> Cc: DM-DEVEL ML <[email protected]> Signed-off-by: Xose Vazquez Perez <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
…ywords Cc: Martin Wilck <[email protected]> Cc: Benjamin Marzinski <[email protected]> Cc: Christophe Varoqui <[email protected]> Cc: DM-DEVEL ML <[email protected]> Signed-off-by: Xose Vazquez Perez <[email protected]> Reviewed-by: Benjamin Marzinski <[email protected]>
A user can change the value of max_sectors_kb on the multipath device and its path devices. So when a path is deleted and then re-added, its value will not be the same as the multipath device. In that case the IOs could be mis-sized. On reload, this patch re-apply max_sectors_kb value of the multipath device on its path devices. Signed-off-by: Etienne AUJAMES <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
Re-pushed and marked as ready for review. Missing |
This is almost always pointless work. c->timeout has already been set. The only reason why this recheck existed was to deal with the possibility that the sysfs value had changed. It is unlikely that users will update the sysfs value to change the multipath timeout while multipathd is running. They can alway reload multipathd if they want it changed, anyways. Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
Factor the code to get the checker_timeout out of select_checker() into its own select_checker_timeout function, and use that to set a path variable. This variable will be used by future patches. Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
Multipath's prioritizers previously didn't use the sysfs timeout for scsi devices, and used a prioritizer specific default timeout (although in practice, all the prioritizers except hds used 60 seconds). Now prioritizers deal with timeouts the same way as the checkers. When selecting a prioritizer, select_checker_timeout() is called if the path doesn't already have a checker_timeout set, an the prioritizers that use timeouts now all use the path's checker_timeout. This change also incidentally fixes some bugs where the detect_alua() function and the path_latency prioritizer were assuming that the timeout was in a different unit than it was (seconds vs milliseconds). Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
If users don't add a timeout to the datacore priorizier arguments, it should use the checker_timeout. Also, don't limit possible timeout values, to match the other prioritizers. Signed-off-by: Benjamin Marzinski <[email protected]> Reviewed-by: Martin Wilck <[email protected]>
The previous patch "libmultipath: don't bother to recheck timeout" causes sysfs_get_timeout() to be called less frequently. Adapt the tests. Signed-off-by: Martin Wilck <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Thanks, @cvaroqui! I realize that I had pushed my "queue" branch by mistake, including the late prio timeout series from @bmarzins, which I'd originally considered to be post-0.9.6 material. I don't think it's a problem though, the patches are sound and have been reviewed. It's probably a good thing to have them in this release. But this means that the shortlog above is incomplete. I'll modify it to reflect the actual changes. |
Hi Christophe,
here's a PR for multipath-tools 0.9.6. The main new feature is the
group_by_tpg
path grouping policy, a feature that multipath-tools has been lacking for a long time.Other than that, there are just a few bug fixes, build fixes, and minor corrections.
There's one user-visible change: NVMe devices will now be blacklisted by default if NVMe native multipathing is enabled in the kernel.
A few reviewed-by: tags are still missing, thus I'm submitting this as draft for now.Note about ABI: libmultipath ABI is bumped to 20.0 because of the
get_prio_timeout()
changes.asstruct config
andstruct multipath
have changed, abigail finds ABI changes in libmpathpersist and libmpathutil. However these structs are never allocated or directly used by either library, so only the ABI version of libmultipath is bumped.Martin
@bmarzins(15)
libmultipath: add group_by_tpg path_grouping_policy
libmultipath: don't copy pgpolicy string in get_pgpolicy_name
libmultipath: add ALUA tpg path wildcard
multipath-tools tests: add tests for group_by_tpg policy
libmultipath: add "detect_pgpolicy" config option
libmultipath: add "detect_pgpolicy_use_tpg" config option
libmultipath: don't count PRIO_UNDEF paths for pathgroup priority
multipath-tools tests: add tests to verify PRIO_UNDEF changes
multipathd: only refresh priorities in update_prio()
multipathd: reload map if the path groups are out of order
multipathd: don't assume mpp->paths will exist in need_switch_pathgroup
libmultipath: don't bother to recheck timeout
libmultipath: make checker_timeout a path variable
libmultipath: make prioritizer timeouts work like checker timeouts
libmultipath: standardize datacore prioritizer timeouts
@zeha (1):
multipath-tools build: rename PKGCONFIG to PKG_CONFIG
@lamby (1):
multipath-tools build: accept KBUILD_BUILD_TIMESTAMP from env
@ldv-alt (2):
11-dm-mpath.rules: fix warnings reported by udevadm verify (fixes #66)
dm-parts.rules: fix warning reported by udevadm verify
@ajmes (1):
libmultipath: fix max_sectors_kb on adding path (fixes #69)
@mwilck (8):
libmultipath: dm_get_maps(): remove spurious assignment
libmultipath: fix dev_loss_tmo even if not set in configuration
libmultipath: ignore nvme devices if nvme native multipath is enabled
GitHub workflows: enable Debian "bookworm"
GitHub workflows: use Fedora 37 in native.yaml
GitHub workflows: switch to Ubuntu 22.04 runner
libmultipath: bump version to 0.9.6
Fix hwtable test after "libmultipath: don't bother to recheck timeout"
@xosevp (5):
multipath-tools: adapt HITACHI/OPEN- config to work with alua and multibus
multipath-tools: fix spelling
multipath-tools: fix syntax and spelling errors
multipath-tools: fix docs
multipath-tools: treat disable_changed_wwids like other deprecated keywords