-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decommit unused pages in the txn metadata table #1073
Conversation
@@ -473,6 +473,8 @@ class server_t | |||
|
|||
static void deallocate_object(gaia_offset_t offset); | |||
|
|||
static char* get_txn_metadata_page_address_from_ts(gaia_txn_id_t ts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't you use a different type for memory addresses? I remember seeing uintptr_t
or something like that in other changes. It would be nice to be consistent about the data type that we use with page addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I normally use void*
for memory addresses, but here I need char*
because I need to do pointer arithmetic on the address (and that is UB on void*
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will cause an issue (SIGBUS? SIGSEGV?) with UNIQUE indexes. We need to figure out a strategy for this before committing this change.
But isn't the decommitting occurring after the index entries would have been garbage collected or is there no coordination of the sort? |
No, the issue is committed entries with an old begin_ts will still be around as those entries cannot be garbage collected since they are still pointing to valid data. |
@yiwen-wong and I had a long discussion, the conclusion of which was that we need to make some minor coordinated changes to ensure the safety of index entries before this PR can be merged. I will be publishing a PR to revise the watermark logic with this in mind, and @yiwen-wong will follow with her "frozen timestamp" proposal for index entries whose committing txn has been GC'ed. |
I just merged #1076, which will unblock this PR when combined with @yiwen-wong 's upcoming change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unblocked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I wanted to test these combined changes with the palletbox workload, but I can't manage to build it, so I'm merging now and crossing my fingers there are no latent issues. |
This is the final PR in the "txn metadata GC" series. It implements the actual reclamation of unused memory in the txn table.
For anyone interested: in the comments on
truncate_txn_table()
I discuss the choice of flags tomadvise(2)
. I originally usedMADV_FREE
because of its purported performance advantages (it allows the OS to lazily reclaim decommitted pages instead of immediately unmapping them), but because this has no effect on the RSS charge of the process, it's impossible to observe any effect in any memory management tools (including cgroups, which is extremely problematic). Apparently MacOS has a better solution, but on Linux we're stuck with this unpleasant tradeoff. I decided that for now it's more important to be able to reliably observe committed physical memory than to chase hypothetical performance gains, so I reverted toMADV_DONTNEED
. The Golang runtime had a similar experience: golang/go#42330.