Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-37029: keep usable_arenas in sorted order without searching #13612

Merged
merged 5 commits into from
Jun 1, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Freeing a great many small objects could take time quadratic in the number of arenas, due to using linear search to keep ``obmalloc.c``'s list of usable arenas sorted by order of number of free memory pools. This is accomplished without search now, leaving the worst-case time linear in the number of arenas. For programs where this quite visibly matters (typically with more than 100 thousand small objects alive simultaneously), this can greatly reduce the time needed to release their memory.
109 changes: 77 additions & 32 deletions Objects/obmalloc.c
Original file line number Diff line number Diff line change
Expand Up @@ -918,6 +918,11 @@ static int running_on_valgrind = -1;
#define POOL_SIZE SYSTEM_PAGE_SIZE /* must be 2^N */
#define POOL_SIZE_MASK SYSTEM_PAGE_SIZE_MASK

#define MAX_POOLS_IN_ARENA (ARENA_SIZE / POOL_SIZE)
#if MAX_POOLS_IN_ARENA * POOL_SIZE != ARENA_SIZE
# error "arena size not an exact multiple of pool size"
#endif

/*
* -- End of tunable settings section --
*/
Expand Down Expand Up @@ -1155,6 +1160,18 @@ usable_arenas

Note that an arena_object associated with an arena all of whose pools are
currently in use isn't on either list.

Changed in Python 3.8: keeping usable_arenas sorted by number of free pools
used to be done by one-at-a-time linear search when an arena's number of
free pools changed. That could, overall, consume time quadratic in the
number of arenas. That didn't really matter when there were only a few
hundred arenas (typical!), but could be a timing disaster when there were
hundreds of thousands. See bpo-37029.

Now we have a vector of "search fingers" to eliminate the need to search:
nfp2lasta[nfp] returns the last ("rightmost") arena in usable_arenas
with nfp free pools. This is NULL if and only if there is no arena with
nfp free pools in usable_arenas.
*/

/* Array of objects used to track chunks of memory (arenas). */
Expand All @@ -1172,6 +1189,9 @@ static struct arena_object* unused_arena_objects = NULL;
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment documenting that they are sorted in number of free pools?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not, unless you feel strongly about it. This has been documented all along, but up where the main explanation of arenas lives (look at the comment block about usable_arenas at line 1150).

There are a bunch of moving parts here, and the style of the file is to give long-ish English explanations in uninterrupted comment blocks, with only the briefest of memory-jogging comments on declaration lines.

static struct arena_object* usable_arenas = NULL;

/* nfp2lasta[nfp] is the last arena in usable_arenas with nfp free pools */
static struct arena_object* nfp2lasta[MAX_POOLS_IN_ARENA + 1] = { NULL };

/* How many arena_objects do we initially allocate?
* 16 = can allocate 16 arenas = 16 * ARENA_SIZE = 4MB before growing the
* `arenas` vector.
Expand Down Expand Up @@ -1281,8 +1301,7 @@ new_arena(void)
/* pool_address <- first pool-aligned address in the arena
nfreepools <- number of whole pools that fit after alignment */
arenaobj->pool_address = (block*)arenaobj->address;
arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE;
assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
arenaobj->nfreepools = MAX_POOLS_IN_ARENA;
excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
if (excess != 0) {
--arenaobj->nfreepools;
Expand Down Expand Up @@ -1478,30 +1497,39 @@ pymalloc_alloc(void *ctx, void **ptr_p, size_t nbytes)
}
usable_arenas->nextarena =
usable_arenas->prevarena = NULL;
assert(nfp2lasta[usable_arenas->nfreepools] == NULL);
nfp2lasta[usable_arenas->nfreepools] = usable_arenas;
}
assert(usable_arenas->address != 0);

/* This arena already had the smallest nfreepools value, so decreasing
* nfreepools doesn't change that, and we don't need to rearrange the
* usable_arenas list. However, if the arena becomes wholly allocated,
* we need to remove its arena_object from usable_arenas.
*/
assert(usable_arenas->nfreepools > 0);
if (nfp2lasta[usable_arenas->nfreepools] == usable_arenas) {
/* It's the last of this size, so there won't be any. */
nfp2lasta[usable_arenas->nfreepools] = NULL;
}
/* If any free pools will remain, it will be the new smallest. */
if (usable_arenas->nfreepools > 1) {
assert(nfp2lasta[usable_arenas->nfreepools - 1] == NULL);
nfp2lasta[usable_arenas->nfreepools - 1] = usable_arenas;
}

/* Try to get a cached free pool. */
pool = usable_arenas->freepools;
if (pool != NULL) {
/* Unlink from cached pools. */
usable_arenas->freepools = pool->nextpool;

/* This arena already had the smallest nfreepools
* value, so decreasing nfreepools doesn't change
* that, and we don't need to rearrange the
* usable_arenas list. However, if the arena has
* become wholly allocated, we need to remove its
* arena_object from usable_arenas.
*/
--usable_arenas->nfreepools;
if (usable_arenas->nfreepools == 0) {
/* Wholly allocated: remove. */
assert(usable_arenas->freepools == NULL);
assert(usable_arenas->nextarena == NULL ||
usable_arenas->nextarena->prevarena ==
usable_arenas);

usable_arenas = usable_arenas->nextarena;
if (usable_arenas != NULL) {
usable_arenas->prevarena = NULL;
Expand Down Expand Up @@ -1709,7 +1737,23 @@ pymalloc_free(void *ctx, void *p)
ao = &arenas[pool->arenaindex];
pool->nextpool = ao->freepools;
ao->freepools = pool;
nf = ++ao->nfreepools;
nf = ao->nfreepools;
/* If this is the rightmost arena with this number of free pools,
* nfp2lasta[nf] needs to change. Caution: if nf is 0, there
* are no arenas in usable_arenas with that value.
*/
struct arena_object* lastnf = nfp2lasta[nf];
assert((nf == 0 && lastnf == NULL) ||
(nf > 0 &&
lastnf != NULL &&
lastnf->nfreepools == nf &&
(lastnf->nextarena == NULL ||
nf < lastnf->nextarena->nfreepools)));
if (lastnf == ao) { /* it is the rightmost */
struct arena_object* p = ao->prevarena;
nfp2lasta[nf] = (p != NULL && p->nfreepools == nf) ? p : NULL;
}
ao->nfreepools = ++nf;

/* All the rest is arena management. We just freed
* a pool, and there are 4 cases for arena mgmt:
Expand Down Expand Up @@ -1777,6 +1821,9 @@ pymalloc_free(void *ctx, void *p)
usable_arenas->prevarena = ao;
usable_arenas = ao;
assert(usable_arenas->address != 0);
if (nfp2lasta[1] == NULL) {
nfp2lasta[1] = ao;
}
tim-one marked this conversation as resolved.
Show resolved Hide resolved

goto success;
}
Expand All @@ -1788,14 +1835,23 @@ pymalloc_free(void *ctx, void *p)
* a few un-scientific tests, it seems like this
* approach allowed a lot more memory to be freed.
*/
if (ao->nextarena == NULL ||
nf <= ao->nextarena->nfreepools) {
/* If this is the only arena with nf, record that. */
if (nfp2lasta[nf] == NULL) {
nfp2lasta[nf] = ao;
} /* else the rightmost with nf doesn't change */
/* If this was the rightmost of the old size, it remains in place. */
if (ao == lastnf) {
/* Case 4. Nothing to do. */
goto success;
}
/* Case 3: We have to move the arena towards the end
* of the list, because it has more free pools than
* the arena to its right.
/* If ao were the only arena in the list, the last block would have
* gotten us out.
*/
assert(ao->nextarena != NULL);

/* Case 3: We have to move the arena towards the end of the list,
* because it has more free pools than the arena to its right. It needs
* to move to follow lastnf.
* First unlink ao from usable_arenas.
*/
if (ao->prevarena != NULL) {
Expand All @@ -1809,24 +1865,13 @@ pymalloc_free(void *ctx, void *p)
usable_arenas = ao->nextarena;
}
ao->nextarena->prevarena = ao->prevarena;

/* Locate the new insertion point by iterating over
* the list, using our nextarena pointer.
*/
while (ao->nextarena != NULL && nf > ao->nextarena->nfreepools) {
ao->prevarena = ao->nextarena;
ao->nextarena = ao->nextarena->nextarena;
}

/* Insert ao at this point. */
assert(ao->nextarena == NULL || ao->prevarena == ao->nextarena->prevarena);
assert(ao->prevarena->nextarena == ao->nextarena);

ao->prevarena->nextarena = ao;
/* And insert after lastnf. */
ao->prevarena = lastnf;
ao->nextarena = lastnf->nextarena;
if (ao->nextarena != NULL) {
ao->nextarena->prevarena = ao;
}

lastnf->nextarena = ao;
/* Verify that the swaps worked. */
assert(ao->nextarena == NULL || nf <= ao->nextarena->nfreepools);
assert(ao->prevarena == NULL || nf > ao->prevarena->nfreepools);
Expand Down