-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow setting per task memory limit #10308
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks reasonable to me. Though I'm not an expert here. I wonder if there's anybody else who might want to take a look?
@Config(QUERY_MAX_MEMORY_PER_TASK_CONFIG) | ||
public NodeMemoryConfig setMaxQueryMemoryPerTask(DataSize maxQueryMemoryPerTask) | ||
{ | ||
if (maxQueryMemoryPerTask.toBytes() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be @Min("1B")
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually 0
has special meaning: "no-limit". Added description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be allowed to be empty instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be. I somewhat prefer it to be possible to explicitly set the value to "unlimited" in the config. Via session variable we can make unlimited if value is set to null
- but I am not very convinced it is easier to comprehend that special 0
.
@findepi any opinion here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember generally we prefer not to have values with a special meaning. To avoid enforcing this limit not setting the limit seems to be more intuitive than setting it to 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would anyone set the property to 0 if default is no-limit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently no. I was more thinking of case when (if) we decide that default is not no-limit. I can change it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was more thinking of case when (if) we decide that default is not no-limit.
0 is odd, but then if we decide there should be some limit by default, should there be an option to run query without any per-task memory limit? User probably could set it to query mam memory per node
.
lib/trino-memory-context/src/main/java/io/trino/memory/context/MemoryAllocationValidator.java
Outdated
Show resolved
Hide resolved
02918ac
to
b4d0365
Compare
Rebase |
b4d0365
to
6621ee4
Compare
@findepi PTAL |
*/ | ||
package io.trino.memory.context; | ||
|
||
public interface MemoryAllocationValidator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really similar to MemoryReservationHandler
. Could that interface do the job? It seems that memory allocations via reserveMemory
should return Future
as it's caller intention is to wait for memory. The other method tryReserveMemory
is for trying to allocate memory.
With interface like this, I don't see a reason for two methods to exist (other than exception customization), but then probably single callback is fine: #10308 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Future<>
as a return type does not suite me well. No use for that. Sole purpose of validator is to "reject" the memory allocation if limit is exceeded. It is not handling the case when the limit is not exceeded but node run out of memory (pool is depleted).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Future<> as a return type does not suite me well. No use for that. Sole purpose of validator is to "reject" the memory allocation if limit is exceeded. It is not handling the case when the limit is not exceeded but node run out of memory (pool is depleted).
Ok, so why do we need an interface with two methods?
Two methods there so we do not need to do exception driven logic in ChildAggregatedMemoryContext.tryUpdateBytes where MemoryAllocationValidator.tryReserveMemory is used.
If I just had reserveMemory I would need to try/catch there. It is an option, but I think it is not ellegant.
It seems that we leak implementation details to interface. I don't think two methods are needed now that you have ValidatingLocalMemoryContext
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can make it one method which would return Optional<String>
- empty if allocation is valid
- containing allocation error description if allocation violates constraints
Then I can do the throwing inValidatingLocalMemoryContext
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I take that back. Then the ValidatingLocalMemoryContext
need to decide on exception type to throw. I am not a big fan of hardcoding the ExceededMemoryLimitException(EXCEEDED_LOCAL_MEMORY_LIMIT,
there. It makes class less reusable. And by design nothing is enforcing us to use the class only for TaskContext.
I suggest we keep interface as proposed.
lib/trino-memory-context/src/main/java/io/trino/memory/context/MemoryAllocationValidator.java
Outdated
Show resolved
Hide resolved
@@ -27,6 +27,8 @@ static AggregatedMemoryContext newRootAggregatedMemoryContext(MemoryReservationH | |||
|
|||
AggregatedMemoryContext newAggregatedMemoryContext(); | |||
|
|||
AggregatedMemoryContext newAggregatedMemoryContext(MemoryAllocationValidator memoryAllocationValidator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this shouldn't be solved locally at task level via combination of newRootAggregatedMemoryContext
and MemoryReservationHandler
. Are there more use cases for newAggregatedMemoryContext(MemoryAllocationValidator memoryAllocationValidator)
other than task allocation?
For example both OperatorContext
and WorkProcessorPipelineSourceOperator
both have their own InternalAggregatedMemoryContext
. I'm not sure we should mix MemoryAllocationValidator memoryAllocationValidator
into that picture (this adds complexity to the allocators structure)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sopel39 PTAL now
lib/trino-memory-context/src/main/java/io/trino/memory/context/MemoryAllocationValidator.java
Show resolved
Hide resolved
...no-memory-context/src/main/java/io/trino/memory/context/AbstractAggregatedMemoryContext.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skimmed, lgtm
6621ee4
to
3828952
Compare
} | ||
|
||
@Override | ||
public ListenableFuture<Void> setBytes(long bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do I need synchronization here and in trySetBytes
?
@Config(QUERY_MAX_MEMORY_PER_TASK_CONFIG) | ||
public NodeMemoryConfig setMaxQueryMemoryPerTask(DataSize maxQueryMemoryPerTask) | ||
{ | ||
if (maxQueryMemoryPerTask.toBytes() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would anyone set the property to 0 if default is no-limit?
*/ | ||
package io.trino.memory.context; | ||
|
||
public interface MemoryAllocationValidator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Future<> as a return type does not suite me well. No use for that. Sole purpose of validator is to "reject" the memory allocation if limit is exceeded. It is not handling the case when the limit is not exceeded but node run out of memory (pool is depleted).
Ok, so why do we need an interface with two methods?
Two methods there so we do not need to do exception driven logic in ChildAggregatedMemoryContext.tryUpdateBytes where MemoryAllocationValidator.tryReserveMemory is used.
If I just had reserveMemory I would need to try/catch there. It is an option, but I think it is not ellegant.
It seems that we leak implementation details to interface. I don't think two methods are needed now that you have ValidatingLocalMemoryContext
long delta = bytes - delegate.getBytes(); | ||
|
||
// first consult validator if allocation is possible | ||
memoryValidator.reserveMemory(allocationTag, delta); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have absolute value at hand, just pass it to validator (single method) as in:
void enforceUserMemoryLimit(long allocated, long delta)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have absolute value. Here I just know how many bytes are allocated by this LocalMemoryContext
. And memory validator is accounting on AggregatedMemoryContext
level.
{ | ||
long delta = bytes - delegate.getBytes(); | ||
|
||
if (!memoryValidator.tryReserveMemory(allocationTag, delta)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should always throw exception if limit is exceeded (and not return false). This is how io.trino.memory.QueryContext#enforceUserMemoryLimit
works and it makes sense.
Query needs X
memory and X
exceeds limit, so try
should also fail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually enforceUserMemoryLimit
is not called from trySetBytes
. I think it should on successful allocation, but it is not. Or am I reading something wrong.
core/trino-main/src/main/java/io/trino/operator/TaskAllocationValidator.java
Show resolved
Hide resolved
3828952
to
ee9eb86
Compare
core/trino-main/src/main/java/io/trino/operator/TaskAllocationValidator.java
Outdated
Show resolved
Hide resolved
@Config(QUERY_MAX_MEMORY_PER_TASK_CONFIG) | ||
public NodeMemoryConfig setMaxQueryMemoryPerTask(DataSize maxQueryMemoryPerTask) | ||
{ | ||
if (maxQueryMemoryPerTask.toBytes() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was more thinking of case when (if) we decide that default is not no-limit.
0 is odd, but then if we decide there should be some limit by default, should there be an option to run query without any per-task memory limit? User probably could set it to query mam memory per node
.
core/trino-main/src/main/java/io/trino/operator/TaskAllocationValidator.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm % comments
core/trino-main/src/main/java/io/trino/operator/TaskAllocationValidator.java
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/TaskAllocationValidator.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/TaskAllocationValidator.java
Show resolved
Hide resolved
...trino-memory-context/src/main/java/io/trino/memory/context/ValidatingLocalMemoryContext.java
Outdated
Show resolved
Hide resolved
...trino-memory-context/src/main/java/io/trino/memory/context/ValidatingLocalMemoryContext.java
Outdated
Show resolved
Hide resolved
|
||
import static java.util.Objects.requireNonNull; | ||
|
||
public class ValidatingAggregateContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe some meaningful name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea.
ee9eb86
to
34a6352
Compare
This is just parameter rename; no semantic changes
34a6352
to
5abc758
Compare
A building block for better memory management to be used mostly together with tasks level retries (#9818)