Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Executes conditional mutations in a thread pool #5184

Merged
merged 4 commits into from
Dec 14, 2024

Conversation

keith-turner
Copy link
Contributor

Conditional mutations currently execute in the thrift thread pool which can grow unbounded in size. Having an unbounded number of conditional mutations executing concurrently could degrade a tablet servers health in some cases. This change adds threads pools for executing conditional updates. This will help limit the CPU and memory used by executing conditional mutations. Conditional mutations can also contain data as part of the mutation that needs to be checked, if this is large that could still cause memory issues. This change more limits the memory used by reading current tablet data to check the condition. Limiting the memory used by conditional mutations waiting to execute is something that would need to somehow be controlled by thrift. So this change protects memory and CPU resources for conditional mutations that are executing but does not protect memory for ones that are waiting to execute. To protect memory for those waiting to execute would need to end the current practice of letting the thrift thread pool grow unbounded, but that is a long existing problem.

Conditional mutations currently execute in the thrift thread pool which
can grow unbounded in size.  Having an unbounded number of conditional
mutations executing concurrently could degrade a tablet servers health
in some cases.  This change adds threads pools for executing conditional
updates.  This will help limit the CPU and memory used by executing
conditional mutations.  Conditional mutations can also contain data as
part of the mutation that needs to be checked, if this is large that
could still cause memory issues.  This change more limits the memory
used by reading current tablet data to check the condition.  Limiting
the memory used by conditional mutations waiting to execute is something
that would need to somehow be controlled by thrift.  So this change
protects memory and CPU resources for conditional mutations that are
executing but does not protect memory for ones that are waiting to
execute.  To protect memory for those waiting to execute would need to
end the current practice of letting the thrift thread pool grow
unbounded, but that is a long existing problem.
Copy link
Contributor

@cshannon cshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, in terms of a default size, picking a static number is fine but I'm also wondering if it would be possible to make the default dynamic. There's going to be a wide range of servers in terms of how much cpu/memory they have so having the default scale up or down by detecting the resources could be nice. But it's definitely simpler to just make users configure what they want.

@keith-turner
Copy link
Contributor Author

LGTM, in terms of a default size, picking a static number is fine but I'm also wondering if it would be possible to make the default dynamic. There's going to be a wide range of servers in terms of how much cpu/memory they have so having the default scale up or down by detecting the resources could be nice. But it's definitely simpler to just make users configure what they want.

Adjusted the defaults from 16 threads to 64 threads since it used to be unlimited thought a higher default would be good. Can use the metrics to monitor the utilization of the thread pool and adjust it based on those.

@keith-turner keith-turner merged commit 47b75d3 into apache:main Dec 14, 2024
8 checks passed
@keith-turner keith-turner deleted the conditional-pool branch December 14, 2024 21:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants