Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decommit unused pages in the txn metadata table #1073

Merged
merged 2 commits into from
Dec 3, 2021

Conversation

senderista
Copy link
Contributor

This is the final PR in the "txn metadata GC" series. It implements the actual reclamation of unused memory in the txn table.

For anyone interested: in the comments on truncate_txn_table() I discuss the choice of flags to madvise(2). I originally used MADV_FREE because of its purported performance advantages (it allows the OS to lazily reclaim decommitted pages instead of immediately unmapping them), but because this has no effect on the RSS charge of the process, it's impossible to observe any effect in any memory management tools (including cgroups, which is extremely problematic). Apparently MacOS has a better solution, but on Linux we're stuck with this unpleasant tradeoff. I decided that for now it's more important to be able to reliably observe committed physical memory than to chase hypothetical performance gains, so I reverted to MADV_DONTNEED. The Golang runtime had a similar experience: golang/go#42330.

@@ -473,6 +473,8 @@ class server_t

static void deallocate_object(gaia_offset_t offset);

static char* get_txn_metadata_page_address_from_ts(gaia_txn_id_t ts);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't you use a different type for memory addresses? I remember seeing uintptr_t or something like that in other changes. It would be nice to be consistent about the data type that we use with page addresses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I normally use void* for memory addresses, but here I need char* because I need to do pointer arithmetic on the address (and that is UB on void*).

Copy link

@yiwen-wong yiwen-wong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will cause an issue (SIGBUS? SIGSEGV?) with UNIQUE indexes. We need to figure out a strategy for this before committing this change.

https://github.com/gaia-platform/GaiaPlatform/blob/master/production/db/core/src/index_builder.cpp#L167

@LaurentiuCristofor
Copy link
Contributor

I think this will cause an issue (SIGBUS? SIGSEGV?) with UNIQUE indexes. We need to figure out a strategy for this before committing this change.

https://github.com/gaia-platform/GaiaPlatform/blob/master/production/db/core/src/index_builder.cpp#L167

But isn't the decommitting occurring after the index entries would have been garbage collected or is there no coordination of the sort?

@yiwen-wong
Copy link

I think this will cause an issue (SIGBUS? SIGSEGV?) with UNIQUE indexes. We need to figure out a strategy for this before committing this change.
https://github.com/gaia-platform/GaiaPlatform/blob/master/production/db/core/src/index_builder.cpp#L167

But isn't the decommitting occurring after the index entries would have been garbage collected or is there no coordination of the sort?

No, the issue is committed entries with an old begin_ts will still be around as those entries cannot be garbage collected since they are still pointing to valid data.
The solution that comes to mind is we have to mark the begin_ts as 'frozen' (Postgres reserves a special value, but c_invalid_txn_id might work but is confusing) before actually collecting the metadata.

@senderista
Copy link
Contributor Author

@yiwen-wong and I had a long discussion, the conclusion of which was that we need to make some minor coordinated changes to ensure the safety of index entries before this PR can be merged. I will be publishing a PR to revise the watermark logic with this in mind, and @yiwen-wong will follow with her "frozen timestamp" proposal for index entries whose committing txn has been GC'ed.

@senderista
Copy link
Contributor Author

I just merged #1076, which will unblock this PR when combined with @yiwen-wong 's upcoming change.

Copy link

@yiwen-wong yiwen-wong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unblocked

Copy link

@yiwen-wong yiwen-wong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@senderista
Copy link
Contributor Author

I wanted to test these combined changes with the palletbox workload, but I can't manage to build it, so I'm merging now and crossing my fingers there are no latent issues.

@senderista senderista merged commit dcdd995 into master Dec 3, 2021
@senderista senderista deleted the tobin/txn_metadata_gc_impl branch December 3, 2021 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants