Reduce loaded range tree memory usage #9181

pcd1193182 · 2019-08-19T16:00:49Z

Signed-off-by: Paul Dagnelie [email protected]
Reviewed by: Matt Ahrens [email protected]
Reviewed by: Sebastien Roy [email protected]
Reviewed by: George Wilson [email protected]

Motivation and Context

On large systems with high fragmentation, very large amounts of memory can be used to store metaslab space maps in memory in range trees. In addition, if the system is very fragmented, it is often necessary to store many of these in memory in order to be able to efficiently perform allocations. Failure to keep enough metaslabs loaded can cause serious performance problems, especially on sync write workloads.

Part of the reason that so much memory is used is that the structure used to store the range segs is a pair of AVL trees (one is sorted by offset, the other by size). This results in range_seg_ts of 72 bytes to store a mere 24 bytes of payload (8 of which is unused for metaslab range trees). While these trees are easy to use and provide good performance, we can significantly reduce the memory usage of the range trees by using a more efficient tree structure.

Description

This patch implements a new tree structure for ZFS, and uses it to store range trees more efficiently.

The new structure is approximately a B-tree, though there are some small differences from the usual characterizations. The tree has core nodes and leaf nodes; each contain data elements, which the elements in the core nodes acting as separators between its children. The difference between core and leaf nodes is that the core nodes have an array of children, while leaf nodes don't. Every node in the tree may be only partially full; in most cases, they are all at least 50% full (in terms of element count) except for the root node, which can be less full. Underfull nodes will steal from their neighbors or merge to remain full enough, while overfull nodes will split in two. The data elements are contained in tree-controlled buffers; they are copied into these on insertion, and overwritten on deletion. This means that the elements are not independently allocated, which reduces overhead, but also means they can't be shared between trees (and also that pointers to them are only valid until a side-effectful tree operation occurs). The overhead varies based on how dense the tree is, but is usually on the order of about 50% of the element size; the per-node overheads are very small, and so don't make a significant difference. The trees can accept arbitrary records; they accept a size and a comparator to allow them to be used for a variety of purposes.

The new trees replace the AVL trees used in the range trees today. Currently, the range_seg_t structure contains three 8 byte integers of payload and two 24 byte avl_tree_node_ts to handle its storage in both an offset-sorted tree and a size-sorted tree (total size: 64 bytes). In the new model, the range seg structures are usually two 4 byte integers, but a separate one needs to exist for the size-sorted and offset-sorted tree. Between the raw size, the 50% overhead, and the double storage, the new btrees are expected to use 8*1.5*2 = 24 bytes per record, or 33.3% as much memory as the AVL trees (this is for the purposes of storing metaslab range trees; for other purposes, like scrubs, they use ~50% as much memory).

We reduced the size of the payload in the range segments by teaching range trees about starting offsets and shifts; since metaslabs have a fixed starting offset, and they all operate in terms of disk sectors, we can store the ranges using 4-byte integers as long as the size of the metaslab divided by the sector size is less than 2^32. For 512-byte sectors, this is a 2^41 (or 2TB) metaslab, which with the default settings corresponds to a 256PB disk. 4k sector disks can handle metaslabs up to 2^46 bytes, or 2^63 byte disks. Since we do not anticipate disks of this size in the near future, there should be almost no cases where metaslabs need 64-byte integers to store their ranges. We do still have the capability to store 64-byte integer ranges to account for cases where we are storing per-vdev (or per-dnode) trees, which could reasonably go above the limits discussed. We also do not store fill information in the compact version of the node, since it is only used for sorted scrub.

We also optimized the metaslab loading process in various other ways to offset some inefficiencies in the btree model. While individual operations (find, insert, remove_from) are faster for the btree than they are for the avl tree, remove usually requires a find operation, while in the AVL tree model the element itself suffices. Some clever changes actually caused an overall speedup in metaslab loading; we use approximately 40% less cpu to load metaslabs in our tests on Illumos.

Another memory and performance optimization was achieved by changing what is stored in the size-sorted trees. When a disk is heavily fragmented, the df algorithm used by default in ZFS will almost always find a number of small regions in its initial cursor-based search; it will usually only fall back to the size-sorted tree to find larger regions. If we increase the size of the cursor-based search slightly, and don't store segments that are smaller than a tunable size floor in the size-sorted tree, we can further cut memory usage down to below 20% of what the AVL trees store. This also results in further reductions in CPU time spent loading metaslabs.

The 16KiB size floor was chosen because it results in substantial memory usage reduction while not usually resulting in situations where we can't find an appropriate chunk with the cursor and are forced to use an oversized chunk from the size-sorted tree. In addition, even if we do have to use an oversized chunk from the size-sorted tree, the chunk would be too small to use for ZIL allocations, so it isn't as big of a loss as it might otherwise be. And often, more small allocations will follow the initial one, and the cursor search will now find the remainder of the chunk we didn't use all of and use it for subsequent allocations. Practical testing has shown little or no change in fragmentation as a result of this change.

If the size-sorted tree becomes empty while the offset sorted one still has entries, it will load all the entries from the offset sorted tree and disregard the size floor until it is unloaded again. This operation occurs rarely with the default setting, only on incredibly thoroughly fragmented pools.

There are some other small changes to zdb to teach it to handle btrees, but nothing major.

How Has This Been Tested?

Correctness testing

This change has undergone extensive testing on Illumos: first, a feature was added to zhack that would allocate a btree and manipulate it at random by inserting and removing random integers. An AVL tree was used alongside it to verify its integrity, and internal integrity checks were run frequently to identify any broken invariants. This acted as a fuzzer to catch basic problems. Next, the code was integrated into the kernel for use in range trees, and then run through zfs-test and zloop to identify issues with any workloads that might occur in normal operation that didn't occur regularly with random inputs and outputs (many inserts/deletions in a row, lots of merges and splits, etc). Finally, the code was used in performance tests on a performance VM with a large, heavily fragmented pool. This provided both further correctness testing as well as performance testing.

On Linux, this change has been run through zfs-test and zloop several times, with a focus on the areas that differ between Illumos and Linux (notably the sorted scrub code, which uses the fill range_seg entry that doesn't exist on Illumos).

Performance testing

Performance test results from Illumos:
~80% decrease in memory utilization to store loaded metaslabs
~40% decrease in cpu utilization to load metaslabs
specific range tree operations vary, but most operations are slightly faster in btrees or comparable, except for range_tree_vacate (which takes somewhat longer because it needs to free two trees).

Some basic performance tests have been run on Linux to verify that there was no regression due to these changes. The specifics of memory and CPU usage are expected to remain approximately the same between the two platforms.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Performance enhancement (non-breaking change which improves efficiency)
Code cleanup (non-breaking change which makes code smaller or more readable)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (a change to man pages or other documentation)

Checklist:

My code follows the ZFS on Linux code style requirements.
I have updated the documentation accordingly.
I have read the contributing document.
I have added tests to cover my changes.
All new and existing tests passed.
All commit messages are properly formatted and contain Signed-off-by.

behlendorf

This looks great, thank you for the detailed PR comment. It was tremendously helpful. Mainly a few comments to head off some future issues.

include/sys/range_tree.h

module/zfs/arc.c

module/zfs/btree.c

module/zfs/metaslab.c

module/zfs/range_tree.c

module/zfs/btree.c

module/zfs/metaslab.c

behlendorf · 2019-08-26T22:07:45Z

Several of the bots hit the recent unrelated metaslab ASSERT, I've resubmitted those tests. The style builder also flagged a few minor cstyle warnings.

include/sys/range_tree.h

module/zfs/btree.c

ahrens

I like the refactor! Makes it easier to read the (quite complicated) insertion/removal code.

module/zfs/btree.c

module/zfs/metaslab.c

sempervictus · 2019-08-30T21:49:10Z

This looks like it will be a godsend for old pools. Unfortunately i can't seem to get this built for testing - failing to build in-tree with

build.log:fs/zfs/zfs/metaslab.c:1452:2: error: invalid initializer
build.log:fs/zfs/zfs/metaslab.c:1455:2: error: initialization of ‘void (*)(range_tree_t *, void *)’ {aka ‘void (*)(struct range_tree *, void *)’} from incompatible pointer type ‘void (*)(range_tree_t *, range_seg_t *, void *)’ {aka ‘void (*)(struct range_tree *, void *, void *)’} [-Werror=incompatible-pointer-types]

EDIT: the comment here seems to indicate that the struct definition was made for human readability, not so much informing the compiler as to what types are involved in a clear manner:

/*
 * This is just for clarity of code purposes, so we can make it clear that a
 * pointer is to a range seg of some type; when we need to do the actual math,
 * we'll figure out the real type.
 */
typedef void range_seg_t;

struct range_tree_ops {
        void    (*rtop_create)(range_tree_t *rt, void *arg);
        void    (*rtop_destroy)(range_tree_t *rt, void *arg);
        void    (*rtop_add)(range_tree_t *rt, void *rs, void *arg);
        void    (*rtop_remove)(range_tree_t *rt, void *rs, void *arg);
        void    (*rtop_vacate)(range_tree_t *rt, void *arg);
};

I'm guessing that this is fixed with a designated initializer detailing the struct members for the compiler up front.

pcd1193182 · 2019-08-30T22:07:37Z

@sempervictus It looks like you've got an outdated definition of range_tree_ops_t, or possibly a merge error. In the latest version, line 1452 of metaslab.c is in the middle of a comment. If you can point me at a copy of your code I can take a look at let you know if I see any issues.

sempervictus · 2019-08-30T22:23:01Z

@pcd1193182: this also includes the kvmem alloc PR (8384) and the grsec patch, neither of which touch that file. Its also from today's master branch.
Usually this sort of thing for grsec users is due to the stricter compiler options and the plugins' need to understand the layout as early as possible, and the function pointer casting bit in range_tree_ops looks very much like that type of issue.
I'll try to do another tree/build after client ops this evening...

sempervictus · 2019-09-01T03:47:01Z

Ok, so this gets me past the first build errors:

diff --git i/fs/zfs/zfs/metaslab.c w/fs/zfs/zfs/metaslab.c
index de9e1d909..fbfcae713 100644
--- i/fs/zfs/zfs/metaslab.c
+++ w/fs/zfs/zfs/metaslab.c
@@ -1441,11 +1441,11 @@ metaslab_rt_vacate(range_tree_t *rt, void *arg)
 }
 
 static range_tree_ops_t metaslab_rt_ops = {
-	metaslab_rt_create,
-	metaslab_rt_destroy,
-	metaslab_rt_add,
-	metaslab_rt_remove,
-	metaslab_rt_vacate
+	.rtop_create = metaslab_rt_create,
+	.rtop_destroy = metaslab_rt_destroy,
+	.rtop_add = metaslab_rt_add,
+	.rtop_remove = metaslab_rt_remove,
+	.rtop_vacate = metaslab_rt_vacate
 };
 
 /*

which brings me to:

build.log:fs/zfs/zfs/range_tree.c:790:18: error: conflicting type qualifiers for ‘rt_btree_ops’
build.log:fs/zfs/zfs/range_tree.c:791:2: error: invalid initializer
build.log:fs/zfs/zfs/range_tree.c:793:2: error: initialization of ‘void (*)(range_tree_t *, void *)’ {aka ‘void (*)(struct range_tree *, void *)’} from incompatible pointer type ‘void (*)(range_tree_t *, range_seg_t *, void *)’ {aka ‘void (*)(struct range_tree *, void *, void *)’} [-Werror=incompatible-pointer-types]
build.log:fs/zfs/zfs/range_tree.c:795:2: error: initialization of ‘void (*)(range_tree_t *, void *, void *)’ {aka ‘void (*)(struct range_tree *, void *, void *)’} from incompatible pointer type ‘void (*)(range_tree_t *, void *)’ {aka ‘void (*)(struct range_tree *, void *)’} [-Werror=incompatible-pointer-types]

The first issue is likely being caught by the randstruct plugin, which exists upstream as well, so designated initializers will need to be used (anywhere that has more than one struct member which are not VLAs as those go to the end of a randstructed struct, IIUC).
The second issue seems to be more of the same, adding this to the mix and trying another build:

diff --git i/module/zfs/range_tree.c w/module/zfs/range_tree.c
index b705a526f..9e243af96 100644
--- i/module/zfs/range_tree.c
+++ w/module/zfs/range_tree.c
@@ -787,12 +787,12 @@ rt_btree_vacate(range_tree_t *rt, void *arg)
        rt_btree_create(rt, arg);
 }
 
-range_tree_ops_t rt_btree_ops = {
-       rt_btree_create,
-       rt_btree_destroy,
-       rt_btree_add,
-       rt_btree_remove,
-       rt_btree_vacate
+static range_tree_ops_t rt_btree_ops = {
+       .rtop_create = rt_btree_create,
+       .rtop_destroy = rt_btree_destroy,
+       .rtop_add = rt_btree_add,
+       .rtop_remove = rt_btree_remove,
+       .rtop_vacate = rt_btree_vacate
 };

sempervictus · 2019-09-01T04:05:55Z

@pcd1193182: designated initializers PR created in your repo to deal with the randstruct bit, and ensure that rt_btree_ops is always a range_tree_ops_t.
Builds fine in-tree as builtin now, zloop running for a number of cycles so far with no errors coming up.
Adding these changes to the full testing branch with the vmem/cache changes and a few other PRs and building a full kernel to test on something a bit less virtual presently.

sempervictus · 2019-09-01T15:42:56Z

@pcd1193182: so the plot thickens a bit, think i found an edge case related to how i'm building:
The test kernel i produced and verified with zloop yesterday was a minimal virt-device-driver-only image with ZFS built directly into the binary (not as a module). This works fine, passed tests, life is good.
The 2nd phase is similar to allyesconfig with the ZFS module still baked into the kernel, but now i'm being blocked at link time by redefinitions of btree functions - namespace collision @ link time which is brought on by the difference in Kconfig, not codebase. Could all of the btree_* functions become zfs_btree_* to avoid the following?

  LD      vmlinux.o
ld: lib/btree.o: in function `btree_last':
(.text+0x20): multiple definition of `btree_last'; fs/zfs/zfs/btree.o:btree.c:(.text+0x1350): first defined here
ld: lib/btree.o: in function `btree_insert':
(.text+0x1470): multiple definition of `btree_insert'; fs/zfs/zfs/btree.o:btree.c:(.text+0x1bd0): first defined here
ld: lib/btree.o: in function `btree_remove':
(.text+0xe50): multiple definition of `btree_remove'; fs/zfs/zfs/btree.o:btree.c:(.text+0x2930): first defined here
ld: lib/btree.o: in function `btree_init':
(.text+0x4a0): multiple definition of `btree_init'; fs/zfs/zfs/btree.o:btree.c:(.text+0xf90): first defined here
ld: lib/btree.o: in function `btree_destroy':
(.text+0x6c0): multiple definition of `btree_destroy'; fs/zfs/zfs/btree.o:btree.c:(.text+0x1a60): first defined here

sempervictus · 2019-09-01T23:06:12Z

@pcd1193182: the renaming of btree_* to zfs_btree_* seems to work, everything built, zloop is running happily. I've added that commit to the PR as well, which should address all of the issues i've found so far. Hoping to verify stability and check against some rather old and fragmented pools to see how it behaves.

sempervictus · 2019-09-02T00:12:58Z

@pcd1193182: now it looks like i've found (stumbled over thanks to uderef) a real bug. This happens when importing a zpool which has dedup enabled and created in a prior version of ZFS, then upgraded to the current master's feature set, sha512 IIRC (9th gen HK chip, power to spare).

pcd1193182 · 2019-09-02T01:55:59Z

@sempervictus do you happen to have a crash dump from that kernel panic?

sempervictus · 2019-09-02T13:33:31Z

@pcd1193182: negative, boot-time crash fully locks up the system, and it cant write anything to any media since all volumes are ZFS (except /boot and /boot/efi) - hence the shoddy phone photo

pcd1193182 · 2019-09-03T18:13:59Z

@sempervictus Ok so I've had time now to catch up with the various points you addressed.

I'm happy to pull in the designated initializers part of the PR, that seems like a good change.

I'm not as convinced about the rename from btree_* to zfs_btree_*; we don't use that naming scheme for most other data structures (avl trees, range trees) to avoid external name collisions. From what I can see, we mostly use that prefix for things where it's important for clarify and zfs_refcount_t. I'm not sure it's worth making that change and adding bulk to the code to support an edge-case build setup.

As for the kernel panic, I'll see if I can reproduce it. Does this happen consistently when you populate a pool as you described, or is that just the scenario that your pool happened to be in when you encountered this issue? Neither dedup nor pool features should have an effect on the behavior of the btree code, so if that's the consistent factor it would be useful information.

sempervictus · 2019-09-05T05:35:56Z

@pcd1193182: if those functions aren't renamed, they will conflict with the in-kernel (in Linux) btree implementation at least when building ZFS into the kernel as a static binary. Edge-case seems like an odd way to minimize the build failure - isn't in-kernel build what Illumos does too? As you mention, refcount is a good example of where we already do this - the kernel has its own refcount API, so ZFS had to prefix its own to avoid the collision IIRC.
I'm perfectly happy to maintain my own patchsets and trees to deal with this, but as is, this will break an existing build option (feature) of ZoL for people who don't. I'm not 100% sure this wont come up when built as a module in some other "edge case" either.

module/zfs/btree.c

pcd1193182 · 2019-09-06T19:46:52Z

@sempervictus I hadn't realized to what extent building as part of the kernel was a supported operation; I've prefixed the data types and functions with zfs_ to help avoid name collisions in the future. Still haven't managed to reproduce the panic you saw, but I'll keep trying.

sempervictus · 2019-09-07T21:03:48Z

@pcd1193182: i think 65a91b1 might be causing some problems here. Latest naive merge attempt resulted in build failures with incorrect argument numbers for functions and redefinitions of things.

Signed-off-by: Paul Dagnelie <[email protected]>

ikozhukhov

thanks for update!
LGTM with wrapper at this moment and can update it later to another one if will agreed with proposals.

ikozhukhov · 2019-10-08T23:47:17Z

maybe use define something like:

#ifndef	zfs_bcopy
#define	zfs_bcopy(src, dst, size)	memmove((dst), (src), (size))
#endif

just for example
i'll try to test this idea by separate PR

Signed-off-by: Paul Dagnelie <[email protected]>

ikozhukhov · 2019-10-09T11:36:48Z

(1) https://bitbucket.org/dilos/dilos-illumos/commits/7495cf4e17d78eb677e2a3d60c38f61c4e66268b
i have tested (1) with 2 tries in loop ZTS (about 12h) - no issues
(2) https://bitbucket.org/dilos/dilos-illumos/commits/4e1638b87137d4de6b0f2bcf5793cafaef1c4fb0
it is update after (1) - build has been done and ZTS in progress.

with my updates I can see several options:
1 - integrate Paul changes after zfs_bcopy - if we have agreements for it for separate PR
2 - integrate Paul version, but define zfs_bcopy and use it, will update others ZFS code later
3 - include zfs_bcopy update in all code to Paul PR
i'll be happy with any options

behlendorf · 2019-10-09T17:18:52Z

@ikozhukhov I'm going to go ahead and merge this PR as is. It includes the bmov wrapper function for memmove. If we decide it would be useful to rename or refactor that code let's discuss it in a new PR after the merge.

ikozhukhov · 2019-10-09T17:27:41Z

ok, i have no issues with bmov

Additional test cases for the btree implementation, see #9181. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: John Kennedy <[email protected]> Closes #9717

This patch implements a new tree structure for ZFS, and uses it to store range trees more efficiently. The new structure is approximately a B-tree, though there are some small differences from the usual characterizations. The tree has core nodes and leaf nodes; each contain data elements, which the elements in the core nodes acting as separators between its children. The difference between core and leaf nodes is that the core nodes have an array of children, while leaf nodes don't. Every node in the tree may be only partially full; in most cases, they are all at least 50% full (in terms of element count) except for the root node, which can be less full. Underfull nodes will steal from their neighbors or merge to remain full enough, while overfull nodes will split in two. The data elements are contained in tree-controlled buffers; they are copied into these on insertion, and overwritten on deletion. This means that the elements are not independently allocated, which reduces overhead, but also means they can't be shared between trees (and also that pointers to them are only valid until a side-effectful tree operation occurs). The overhead varies based on how dense the tree is, but is usually on the order of about 50% of the element size; the per-node overheads are very small, and so don't make a significant difference. The trees can accept arbitrary records; they accept a size and a comparator to allow them to be used for a variety of purposes. The new trees replace the AVL trees used in the range trees today. Currently, the range_seg_t structure contains three 8 byte integers of payload and two 24 byte avl_tree_node_ts to handle its storage in both an offset-sorted tree and a size-sorted tree (total size: 64 bytes). In the new model, the range seg structures are usually two 4 byte integers, but a separate one needs to exist for the size-sorted and offset-sorted tree. Between the raw size, the 50% overhead, and the double storage, the new btrees are expected to use 8*1.5*2 = 24 bytes per record, or 33.3% as much memory as the AVL trees (this is for the purposes of storing metaslab range trees; for other purposes, like scrubs, they use ~50% as much memory). We reduced the size of the payload in the range segments by teaching range trees about starting offsets and shifts; since metaslabs have a fixed starting offset, and they all operate in terms of disk sectors, we can store the ranges using 4-byte integers as long as the size of the metaslab divided by the sector size is less than 2^32. For 512-byte sectors, this is a 2^41 (or 2TB) metaslab, which with the default settings corresponds to a 256PB disk. 4k sector disks can handle metaslabs up to 2^46 bytes, or 2^63 byte disks. Since we do not anticipate disks of this size in the near future, there should be almost no cases where metaslabs need 64-byte integers to store their ranges. We do still have the capability to store 64-byte integer ranges to account for cases where we are storing per-vdev (or per-dnode) trees, which could reasonably go above the limits discussed. We also do not store fill information in the compact version of the node, since it is only used for sorted scrub. We also optimized the metaslab loading process in various other ways to offset some inefficiencies in the btree model. While individual operations (find, insert, remove_from) are faster for the btree than they are for the avl tree, remove usually requires a find operation, while in the AVL tree model the element itself suffices. Some clever changes actually caused an overall speedup in metaslab loading; we use approximately 40% less cpu to load metaslabs in our tests on Illumos. Another memory and performance optimization was achieved by changing what is stored in the size-sorted trees. When a disk is heavily fragmented, the df algorithm used by default in ZFS will almost always find a number of small regions in its initial cursor-based search; it will usually only fall back to the size-sorted tree to find larger regions. If we increase the size of the cursor-based search slightly, and don't store segments that are smaller than a tunable size floor in the size-sorted tree, we can further cut memory usage down to below 20% of what the AVL trees store. This also results in further reductions in CPU time spent loading metaslabs. The 16KiB size floor was chosen because it results in substantial memory usage reduction while not usually resulting in situations where we can't find an appropriate chunk with the cursor and are forced to use an oversized chunk from the size-sorted tree. In addition, even if we do have to use an oversized chunk from the size-sorted tree, the chunk would be too small to use for ZIL allocations, so it isn't as big of a loss as it might otherwise be. And often, more small allocations will follow the initial one, and the cursor search will now find the remainder of the chunk we didn't use all of and use it for subsequent allocations. Practical testing has shown little or no change in fragmentation as a result of this change. If the size-sorted tree becomes empty while the offset sorted one still has entries, it will load all the entries from the offset sorted tree and disregard the size floor until it is unloaded again. This operation occurs rarely with the default setting, only on incredibly thoroughly fragmented pools. There are some other small changes to zdb to teach it to handle btrees, but nothing major. Reviewed-by: George Wilson <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Reviewed by: Sebastien Roy [email protected] Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes openzfs#9181 Signed-off-by: Bryant G. Ly <[email protected]> Conflicts: cmd/zdb/zdb.c include/sys/spa.h module/zfs/Makefile.in module/zfs/dmu_send.c module/zfs/dsl_bookmark.c module/zfs/dsl_deadlist.c module/zfs/metaslab.c

pcd1193182 requested review from behlendorf, grwilson and ahrens August 19, 2019 16:00

pcd1193182 added Status: Code Review Needed Ready for review and testing Type: Performance Performance improvement or performance problem labels Aug 19, 2019

ahrens mentioned this pull request Aug 21, 2019

keep more metaslabs loaded #9197

Merged

12 tasks

behlendorf reviewed Aug 22, 2019

View reviewed changes

behlendorf reviewed Aug 27, 2019

View reviewed changes

include/sys/range_tree.h Show resolved Hide resolved

module/zfs/btree.c Outdated Show resolved Hide resolved

pcd1193182 force-pushed the btree branch from e26960d to ec4faf4 Compare August 27, 2019 19:49

ahrens reviewed Aug 28, 2019

View reviewed changes

grwilson reviewed Aug 29, 2019

View reviewed changes

module/zfs/metaslab.c Outdated Show resolved Hide resolved

pcd1193182 force-pushed the btree branch from ec4faf4 to 935d669 Compare August 30, 2019 16:03

grwilson approved these changes Sep 1, 2019

View reviewed changes

ahrens reviewed Sep 5, 2019

View reviewed changes

pcd1193182 force-pushed the btree branch from fc91003 to 49b4bbb Compare September 6, 2019 22:46

pcd1193182 added 15 commits October 8, 2019 16:15

feedback1

2c07366

Signed-off-by: Paul Dagnelie <[email protected]>

switch to one kstat

99b4df4

REFACTOR

758693c

style

e76980d

matt feedback

12717e8

bugfix

723aef2

change AVL_ macros to TREE_ macros

ea333de

style

9765f88

Signed-off-by: Paul Dagnelie <[email protected]>

add static initializers

81fc8a8

Signed-off-by: Paul Dagnelie <[email protected]>

Matt feedback 1

5d6cd17

Signed-off-by: Paul Dagnelie <[email protected]>

Prefix btree to avoid build issues when building with linux kernel

157c510

Signed-off-by: Paul Dagnelie <[email protected]>

Fix sorted scan bug

7224043

Signed-off-by: Paul Dagnelie <[email protected]>

Reduce min size by 1, simplify merge and split logic

f5fdfda

Signed-off-by: Paul Dagnelie <[email protected]>

Fix dsl_scan, bulk_finish issues

b2f3759

Signed-off-by: Paul Dagnelie <[email protected]>

Reduce verify intensity on debug builds

ade2936

Signed-off-by: Paul Dagnelie <[email protected]>

pcd1193182 force-pushed the btree branch 2 times, most recently from d435e6e to 42dfc54 Compare October 8, 2019 23:25

ikozhukhov approved these changes Oct 8, 2019

View reviewed changes

Use memmove wrapper instead of bcopy for platform compatibillity

3ea9b78

Signed-off-by: Paul Dagnelie <[email protected]>

pcd1193182 force-pushed the btree branch from 42dfc54 to 3ea9b78 Compare October 9, 2019 05:50

behlendorf merged commit ca57777 into openzfs:master Oct 9, 2019

bra-fsn mentioned this pull request Apr 17, 2020

compression=gzip on FreeBSD leaks memory #10225

Closed

pfmooney mentioned this pull request Aug 13, 2020

dnode_sync is careless with range tree #10708

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce loaded range tree memory usage #9181

Reduce loaded range tree memory usage #9181

pcd1193182 commented Aug 19, 2019 •

edited

Loading

behlendorf left a comment

behlendorf commented Aug 26, 2019

ahrens left a comment •

edited

Loading

sempervictus commented Aug 30, 2019 •

edited

Loading

pcd1193182 commented Aug 30, 2019

sempervictus commented Aug 30, 2019

sempervictus commented Sep 1, 2019

sempervictus commented Sep 1, 2019 •

edited

Loading

sempervictus commented Sep 1, 2019 •

edited

Loading

sempervictus commented Sep 1, 2019

sempervictus commented Sep 2, 2019 •

edited

Loading

pcd1193182 commented Sep 2, 2019 •

edited

Loading

sempervictus commented Sep 2, 2019

pcd1193182 commented Sep 3, 2019

sempervictus commented Sep 5, 2019

pcd1193182 commented Sep 6, 2019

sempervictus commented Sep 7, 2019

ikozhukhov left a comment

ikozhukhov commented Oct 8, 2019 •

edited

Loading

ikozhukhov commented Oct 9, 2019

behlendorf commented Oct 9, 2019

ikozhukhov commented Oct 9, 2019

Reduce loaded range tree memory usage #9181

Reduce loaded range tree memory usage #9181

Conversation

pcd1193182 commented Aug 19, 2019 • edited Loading

Motivation and Context

Description

How Has This Been Tested?

Correctness testing

Performance testing

Types of changes

Checklist:

behlendorf left a comment

Choose a reason for hiding this comment

behlendorf commented Aug 26, 2019

ahrens left a comment • edited Loading

Choose a reason for hiding this comment

sempervictus commented Aug 30, 2019 • edited Loading

pcd1193182 commented Aug 30, 2019

sempervictus commented Aug 30, 2019

sempervictus commented Sep 1, 2019

sempervictus commented Sep 1, 2019 • edited Loading

sempervictus commented Sep 1, 2019 • edited Loading

sempervictus commented Sep 1, 2019

sempervictus commented Sep 2, 2019 • edited Loading

pcd1193182 commented Sep 2, 2019 • edited Loading

sempervictus commented Sep 2, 2019

pcd1193182 commented Sep 3, 2019

sempervictus commented Sep 5, 2019

pcd1193182 commented Sep 6, 2019

sempervictus commented Sep 7, 2019

ikozhukhov left a comment

Choose a reason for hiding this comment

ikozhukhov commented Oct 8, 2019 • edited Loading

ikozhukhov commented Oct 9, 2019

behlendorf commented Oct 9, 2019

ikozhukhov commented Oct 9, 2019

pcd1193182 commented Aug 19, 2019 •

edited

Loading

ahrens left a comment •

edited

Loading

sempervictus commented Aug 30, 2019 •

edited

Loading

sempervictus commented Sep 1, 2019 •

edited

Loading

sempervictus commented Sep 1, 2019 •

edited

Loading

sempervictus commented Sep 2, 2019 •

edited

Loading

pcd1193182 commented Sep 2, 2019 •

edited

Loading

ikozhukhov commented Oct 8, 2019 •

edited

Loading