-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DocDB] Cleanup transactions from memory before CDC streams changes #21290
Labels
2024.1 Backport Required
2024.1.0_blocker
area/docdb
YugabyteDB core features
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Comments
es1024
added
area/docdb
YugabyteDB core features
status/awaiting-triage
Issue awaiting triage
labels
Mar 4, 2024
yugabyte-ci
added
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
and removed
status/awaiting-triage
Issue awaiting triage
labels
Mar 4, 2024
es1024
added a commit
that referenced
this issue
Mar 14, 2024
…tely Summary: TransactionParticipant currently keeps a transaction in memory until `apply_op_id <= cdc_checkpoint_op_id`, in order to avoid cleaning up intents for changes not yet streamed by CDC. This includes a fair amount of metadata (RunningTransaction, 720 bytes per transaction, on RF number of tservers) and makes memory requirements for large CDC retention windows infeasible. For even a modest 100 transaction/second workload and the current 4 hour window, this means we may at worst case dedicate ~1 GB * RF memory solely to keeping this metadata. This diff makes the following changes: 1. When attempting to clean up a transaction where `apply_op_id > cdc_checkpoint_op_id`, a post-apply transaction metadata entry is written to intentsdb, containing the apply op id/commit ht/apply ht. The key for this entry is `[transaction id] 00`. 2. The transaction is immediately cleared from memory. 3. TransactionParticipant no longer triggers per-transaction intent cleanup in most cases if CDC is active, as there is no longer information in memory about whether it is safe to do so. We instead now rely on cleanup of entire SST files to clean up these intents in this case. - Exception: per-transaction intent cleanup occurs during compaction, which has been modified to locate the post-apply transaction metadata and thus can load the apply op id to compare against `cdc_checkpoint_op_id`. These changes are behind two newly added gflags: - `cdc_write_post_apply_metadata` controls whether the new post-apply transaction metadata entry should be written if a transaction is cleaned up before CDC has streamed it - `cdc_immediate_transaction_cleanup` controls whether to apply the new cleanup logic. `cdc_write_post_apply_metadata` must be on if this is on These gflags are on in debug and off in release by default. **Regarding CDC unit tests:** After these changes, CDC unit tests that check for intent count must be handled with more caution. If the unit tests do not wait for the post-apply transaction metadata to be written, intent counts may be non-deterministic (post-apply metadata may be written before or after counts are taken). Furthermore, as per-transaction cleanup is disabled except for compaction, intents are now generally cleaned up when CDC changes are fetched and the checkpoint op id is updated (which then triggers a full SST file cleanup). If the writing of the post-apply metadata is not waited for, then it may happen after CDC changes are fetched and the transactions intents are cleaned. In a production system, this would just get cleaned up in the next compaction or when the next SST file is deleted, but in a test counting number of intents, this may lead to off by one errors. **Upgrade/Rollback safety:** This change adds the PostApplyTransactionMetadata protobuf, which is written to intentsdb when cleaning a transaction from memory while CDC still needs its intents. These changes are guarded behind the `cdc_write_post_apply_metadata` gflag, which is default off. This flag should only be enabled after upgrade has finished, to ensure that we do not send post-apply transaction metadata to nodes that do not understand it during RBS. Post-upgrade, we handle the case where this metadata is not present safely (assume that it is unsafe to cleanup). This flag will eventually need to be converted into an auto flag, as the change is not rollback safe: once post-apply transaction metadata is present in intents, rolling back will cause this metadata to be incorrectly treated as a malformed reverse index entry. Jira: DB-10206 Test Plan: Jenkins. Reviewers: sergei, skumar, mbautin, rthallam Reviewed By: sergei, rthallam Subscribers: yql, timur, ycdcxcluster, ybase, bogdan, rthallam Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D31900
asrinivasanyb
pushed a commit
to asrinivasanyb/yugabyte-db
that referenced
this issue
Mar 18, 2024
… immediately Summary: TransactionParticipant currently keeps a transaction in memory until `apply_op_id <= cdc_checkpoint_op_id`, in order to avoid cleaning up intents for changes not yet streamed by CDC. This includes a fair amount of metadata (RunningTransaction, 720 bytes per transaction, on RF number of tservers) and makes memory requirements for large CDC retention windows infeasible. For even a modest 100 transaction/second workload and the current 4 hour window, this means we may at worst case dedicate ~1 GB * RF memory solely to keeping this metadata. This diff makes the following changes: 1. When attempting to clean up a transaction where `apply_op_id > cdc_checkpoint_op_id`, a post-apply transaction metadata entry is written to intentsdb, containing the apply op id/commit ht/apply ht. The key for this entry is `[transaction id] 00`. 2. The transaction is immediately cleared from memory. 3. TransactionParticipant no longer triggers per-transaction intent cleanup in most cases if CDC is active, as there is no longer information in memory about whether it is safe to do so. We instead now rely on cleanup of entire SST files to clean up these intents in this case. - Exception: per-transaction intent cleanup occurs during compaction, which has been modified to locate the post-apply transaction metadata and thus can load the apply op id to compare against `cdc_checkpoint_op_id`. These changes are behind two newly added gflags: - `cdc_write_post_apply_metadata` controls whether the new post-apply transaction metadata entry should be written if a transaction is cleaned up before CDC has streamed it - `cdc_immediate_transaction_cleanup` controls whether to apply the new cleanup logic. `cdc_write_post_apply_metadata` must be on if this is on These gflags are on in debug and off in release by default. **Regarding CDC unit tests:** After these changes, CDC unit tests that check for intent count must be handled with more caution. If the unit tests do not wait for the post-apply transaction metadata to be written, intent counts may be non-deterministic (post-apply metadata may be written before or after counts are taken). Furthermore, as per-transaction cleanup is disabled except for compaction, intents are now generally cleaned up when CDC changes are fetched and the checkpoint op id is updated (which then triggers a full SST file cleanup). If the writing of the post-apply metadata is not waited for, then it may happen after CDC changes are fetched and the transactions intents are cleaned. In a production system, this would just get cleaned up in the next compaction or when the next SST file is deleted, but in a test counting number of intents, this may lead to off by one errors. **Upgrade/Rollback safety:** This change adds the PostApplyTransactionMetadata protobuf, which is written to intentsdb when cleaning a transaction from memory while CDC still needs its intents. These changes are guarded behind the `cdc_write_post_apply_metadata` gflag, which is default off. This flag should only be enabled after upgrade has finished, to ensure that we do not send post-apply transaction metadata to nodes that do not understand it during RBS. Post-upgrade, we handle the case where this metadata is not present safely (assume that it is unsafe to cleanup). This flag will eventually need to be converted into an auto flag, as the change is not rollback safe: once post-apply transaction metadata is present in intents, rolling back will cause this metadata to be incorrectly treated as a malformed reverse index entry. Jira: DB-10206 Test Plan: Jenkins. Reviewers: sergei, skumar, mbautin, rthallam Reviewed By: sergei, rthallam Subscribers: yql, timur, ycdcxcluster, ybase, bogdan, rthallam Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D31900
This was referenced Mar 19, 2024
1 task
es1024
added a commit
that referenced
this issue
May 2, 2024
…om memory immediately Summary: Original commit: 559b2b0 / D31900 TransactionParticipant currently keeps a transaction in memory until `apply_op_id <= cdc_checkpoint_op_id`, in order to avoid cleaning up intents for changes not yet streamed by CDC. This includes a fair amount of metadata (RunningTransaction, 720 bytes per transaction, on RF number of tservers) and makes memory requirements for large CDC retention windows infeasible. For even a modest 100 transaction/second workload and the current 4 hour window, this means we may at worst case dedicate ~1 GB * RF memory solely to keeping this metadata. This diff makes the following changes: 1. When attempting to clean up a transaction where `apply_op_id > cdc_checkpoint_op_id`, a post-apply transaction metadata entry is written to intentsdb, containing the apply op id/commit ht/apply ht. The key for this entry is `[transaction id] 00`. 2. The transaction is immediately cleared from memory. 3. TransactionParticipant no longer triggers per-transaction intent cleanup in most cases if CDC is active, as there is no longer information in memory about whether it is safe to do so. We instead now rely on cleanup of entire SST files to clean up these intents in this case. - Exception: per-transaction intent cleanup occurs during compaction, which has been modified to locate the post-apply transaction metadata and thus can load the apply op id to compare against `cdc_checkpoint_op_id`. These changes are behind two newly added gflags: - `cdc_write_post_apply_metadata` controls whether the new post-apply transaction metadata entry should be written if a transaction is cleaned up before CDC has streamed it - `cdc_immediate_transaction_cleanup` controls whether to apply the new cleanup logic. `cdc_write_post_apply_metadata` must be on if this is on These gflags are on in debug and off in release by default. **Regarding CDC unit tests:** After these changes, CDC unit tests that check for intent count must be handled with more caution. If the unit tests do not wait for the post-apply transaction metadata to be written, intent counts may be non-deterministic (post-apply metadata may be written before or after counts are taken). Furthermore, as per-transaction cleanup is disabled except for compaction, intents are now generally cleaned up when CDC changes are fetched and the checkpoint op id is updated (which then triggers a full SST file cleanup). If the writing of the post-apply metadata is not waited for, then it may happen after CDC changes are fetched and the transactions intents are cleaned. In a production system, this would just get cleaned up in the next compaction or when the next SST file is deleted, but in a test counting number of intents, this may lead to off by one errors. **Upgrade/Rollback safety:** This change adds the PostApplyTransactionMetadata protobuf, which is written to intentsdb when cleaning a transaction from memory while CDC still needs its intents. These changes are guarded behind the `cdc_write_post_apply_metadata` gflag, which is default off. This flag should only be enabled after upgrade has finished, to ensure that we do not send post-apply transaction metadata to nodes that do not understand it during RBS. Post-upgrade, we handle the case where this metadata is not present safely (assume that it is unsafe to cleanup). This flag will eventually need to be converted into an auto flag, as the change is not rollback safe: once post-apply transaction metadata is present in intents, rolling back will cause this metadata to be incorrectly treated as a malformed reverse index entry. Jira: DB-10206 Test Plan: Jenkins. Reviewers: sergei, skumar, mbautin, rthallam Reviewed By: rthallam Subscribers: rthallam, bogdan, ybase, ycdcxcluster, timur, yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D33174
ZhenYongFan
pushed a commit
to ZhenYongFan/yugabyte-db
that referenced
this issue
Jun 15, 2024
…r CDC from memory immediately Summary: Original commit: 559b2b0 / D31900 TransactionParticipant currently keeps a transaction in memory until `apply_op_id <= cdc_checkpoint_op_id`, in order to avoid cleaning up intents for changes not yet streamed by CDC. This includes a fair amount of metadata (RunningTransaction, 720 bytes per transaction, on RF number of tservers) and makes memory requirements for large CDC retention windows infeasible. For even a modest 100 transaction/second workload and the current 4 hour window, this means we may at worst case dedicate ~1 GB * RF memory solely to keeping this metadata. This diff makes the following changes: 1. When attempting to clean up a transaction where `apply_op_id > cdc_checkpoint_op_id`, a post-apply transaction metadata entry is written to intentsdb, containing the apply op id/commit ht/apply ht. The key for this entry is `[transaction id] 00`. 2. The transaction is immediately cleared from memory. 3. TransactionParticipant no longer triggers per-transaction intent cleanup in most cases if CDC is active, as there is no longer information in memory about whether it is safe to do so. We instead now rely on cleanup of entire SST files to clean up these intents in this case. - Exception: per-transaction intent cleanup occurs during compaction, which has been modified to locate the post-apply transaction metadata and thus can load the apply op id to compare against `cdc_checkpoint_op_id`. These changes are behind two newly added gflags: - `cdc_write_post_apply_metadata` controls whether the new post-apply transaction metadata entry should be written if a transaction is cleaned up before CDC has streamed it - `cdc_immediate_transaction_cleanup` controls whether to apply the new cleanup logic. `cdc_write_post_apply_metadata` must be on if this is on These gflags are on in debug and off in release by default. **Regarding CDC unit tests:** After these changes, CDC unit tests that check for intent count must be handled with more caution. If the unit tests do not wait for the post-apply transaction metadata to be written, intent counts may be non-deterministic (post-apply metadata may be written before or after counts are taken). Furthermore, as per-transaction cleanup is disabled except for compaction, intents are now generally cleaned up when CDC changes are fetched and the checkpoint op id is updated (which then triggers a full SST file cleanup). If the writing of the post-apply metadata is not waited for, then it may happen after CDC changes are fetched and the transactions intents are cleaned. In a production system, this would just get cleaned up in the next compaction or when the next SST file is deleted, but in a test counting number of intents, this may lead to off by one errors. **Upgrade/Rollback safety:** This change adds the PostApplyTransactionMetadata protobuf, which is written to intentsdb when cleaning a transaction from memory while CDC still needs its intents. These changes are guarded behind the `cdc_write_post_apply_metadata` gflag, which is default off. This flag should only be enabled after upgrade has finished, to ensure that we do not send post-apply transaction metadata to nodes that do not understand it during RBS. Post-upgrade, we handle the case where this metadata is not present safely (assume that it is unsafe to cleanup). This flag will eventually need to be converted into an auto flag, as the change is not rollback safe: once post-apply transaction metadata is present in intents, rolling back will cause this metadata to be incorrectly treated as a malformed reverse index entry. Jira: DB-10206 Test Plan: Jenkins. Reviewers: sergei, skumar, mbautin, rthallam Reviewed By: rthallam Subscribers: rthallam, bogdan, ybase, ycdcxcluster, timur, yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D33174
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
2024.1 Backport Required
2024.1.0_blocker
area/docdb
YugabyteDB core features
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Jira Link: DB-10206
Description
We currently keep transaction metadata (RunningTransaction) in memory until apply op id <= cdc_checkpoint_op_id, to ensure that intents needed for CDC are not deleted before they get streamed by CDC. This means we have a fair amount of memory per transaction, and cannot set the CDC retention interval too high before memory usage becomes prohibitive.
We should instead clear this metadata from memory once the transaction has been applied, and rely on on-disk metadata for CDC.
Context: #15533
Issue Type
kind/enhancement
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: