Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow setting per task memory limit #10308

Merged
merged 4 commits into from
Dec 27, 2021

Conversation

losipiuk
Copy link
Member

@losipiuk losipiuk commented Dec 15, 2021

A building block for better memory management to be used mostly together with tasks level retries (#9818)

Copy link
Contributor

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks reasonable to me. Though I'm not an expert here. I wonder if there's anybody else who might want to take a look?

@Config(QUERY_MAX_MEMORY_PER_TASK_CONFIG)
public NodeMemoryConfig setMaxQueryMemoryPerTask(DataSize maxQueryMemoryPerTask)
{
if (maxQueryMemoryPerTask.toBytes() == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be @Min("1B")?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually 0 has special meaning: "no-limit". Added description.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be allowed to be empty instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be. I somewhat prefer it to be possible to explicitly set the value to "unlimited" in the config. Via session variable we can make unlimited if value is set to null - but I am not very convinced it is easier to comprehend that special 0.

@findepi any opinion here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember generally we prefer not to have values with a special meaning. To avoid enforcing this limit not setting the limit seems to be more intuitive than setting it to 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would anyone set the property to 0 if default is no-limit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently no. I was more thinking of case when (if) we decide that default is not no-limit. I can change it though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more thinking of case when (if) we decide that default is not no-limit.

0 is odd, but then if we decide there should be some limit by default, should there be an option to run query without any per-task memory limit? User probably could set it to query mam memory per node.

@losipiuk losipiuk self-assigned this Dec 20, 2021
@losipiuk losipiuk force-pushed the lo/per-task-memory-limits branch from 02918ac to b4d0365 Compare December 20, 2021 19:52
@losipiuk
Copy link
Member Author

Rebase

@losipiuk losipiuk force-pushed the lo/per-task-memory-limits branch from b4d0365 to 6621ee4 Compare December 20, 2021 19:58
@losipiuk
Copy link
Member Author

@findepi PTAL

@losipiuk losipiuk requested a review from sopel39 December 20, 2021 19:58
*/
package io.trino.memory.context;

public interface MemoryAllocationValidator
Copy link
Member

@sopel39 sopel39 Dec 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really similar to MemoryReservationHandler. Could that interface do the job? It seems that memory allocations via reserveMemory should return Future as it's caller intention is to wait for memory. The other method tryReserveMemory is for trying to allocate memory.

With interface like this, I don't see a reason for two methods to exist (other than exception customization), but then probably single callback is fine: #10308 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future<> as a return type does not suite me well. No use for that. Sole purpose of validator is to "reject" the memory allocation if limit is exceeded. It is not handling the case when the limit is not exceeded but node run out of memory (pool is depleted).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future<> as a return type does not suite me well. No use for that. Sole purpose of validator is to "reject" the memory allocation if limit is exceeded. It is not handling the case when the limit is not exceeded but node run out of memory (pool is depleted).

Ok, so why do we need an interface with two methods?

Two methods there so we do not need to do exception driven logic in ChildAggregatedMemoryContext.tryUpdateBytes where MemoryAllocationValidator.tryReserveMemory is used.
If I just had reserveMemory I would need to try/catch there. It is an option, but I think it is not ellegant.

It seems that we leak implementation details to interface. I don't think two methods are needed now that you have ValidatingLocalMemoryContext

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make it one method which would return Optional<String>

  • empty if allocation is valid
  • containing allocation error description if allocation violates constraints
    Then I can do the throwing in ValidatingLocalMemoryContext

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I take that back. Then the ValidatingLocalMemoryContext need to decide on exception type to throw. I am not a big fan of hardcoding the ExceededMemoryLimitException(EXCEEDED_LOCAL_MEMORY_LIMIT, there. It makes class less reusable. And by design nothing is enforcing us to use the class only for TaskContext.

I suggest we keep interface as proposed.

@@ -27,6 +27,8 @@ static AggregatedMemoryContext newRootAggregatedMemoryContext(MemoryReservationH

AggregatedMemoryContext newAggregatedMemoryContext();

AggregatedMemoryContext newAggregatedMemoryContext(MemoryAllocationValidator memoryAllocationValidator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this shouldn't be solved locally at task level via combination of newRootAggregatedMemoryContext and MemoryReservationHandler. Are there more use cases for newAggregatedMemoryContext(MemoryAllocationValidator memoryAllocationValidator) other than task allocation?

For example both OperatorContext and WorkProcessorPipelineSourceOperator both have their own InternalAggregatedMemoryContext. I'm not sure we should mix MemoryAllocationValidator memoryAllocationValidator into that picture (this adds complexity to the allocators structure)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sopel39 PTAL now

Copy link
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skimmed, lgtm

@losipiuk losipiuk force-pushed the lo/per-task-memory-limits branch from 6621ee4 to 3828952 Compare December 22, 2021 13:55
}

@Override
public ListenableFuture<Void> setBytes(long bytes)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do I need synchronization here and in trySetBytes?

@Config(QUERY_MAX_MEMORY_PER_TASK_CONFIG)
public NodeMemoryConfig setMaxQueryMemoryPerTask(DataSize maxQueryMemoryPerTask)
{
if (maxQueryMemoryPerTask.toBytes() == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would anyone set the property to 0 if default is no-limit?

*/
package io.trino.memory.context;

public interface MemoryAllocationValidator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future<> as a return type does not suite me well. No use for that. Sole purpose of validator is to "reject" the memory allocation if limit is exceeded. It is not handling the case when the limit is not exceeded but node run out of memory (pool is depleted).

Ok, so why do we need an interface with two methods?

Two methods there so we do not need to do exception driven logic in ChildAggregatedMemoryContext.tryUpdateBytes where MemoryAllocationValidator.tryReserveMemory is used.
If I just had reserveMemory I would need to try/catch there. It is an option, but I think it is not ellegant.

It seems that we leak implementation details to interface. I don't think two methods are needed now that you have ValidatingLocalMemoryContext

long delta = bytes - delegate.getBytes();

// first consult validator if allocation is possible
memoryValidator.reserveMemory(allocationTag, delta);
Copy link
Member

@sopel39 sopel39 Dec 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have absolute value at hand, just pass it to validator (single method) as in:

void enforceUserMemoryLimit(long allocated, long delta)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not have absolute value. Here I just know how many bytes are allocated by this LocalMemoryContext. And memory validator is accounting on AggregatedMemoryContext level.

{
long delta = bytes - delegate.getBytes();

if (!memoryValidator.tryReserveMemory(allocationTag, delta)) {
Copy link
Member

@sopel39 sopel39 Dec 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should always throw exception if limit is exceeded (and not return false). This is how io.trino.memory.QueryContext#enforceUserMemoryLimit works and it makes sense.
Query needs X memory and X exceeds limit, so try should also fail

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually enforceUserMemoryLimit is not called from trySetBytes. I think it should on successful allocation, but it is not. Or am I reading something wrong.

@losipiuk losipiuk force-pushed the lo/per-task-memory-limits branch from 3828952 to ee9eb86 Compare December 23, 2021 15:45
@Config(QUERY_MAX_MEMORY_PER_TASK_CONFIG)
public NodeMemoryConfig setMaxQueryMemoryPerTask(DataSize maxQueryMemoryPerTask)
{
if (maxQueryMemoryPerTask.toBytes() == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more thinking of case when (if) we decide that default is not no-limit.

0 is odd, but then if we decide there should be some limit by default, should there be an option to run query without any per-task memory limit? User probably could set it to query mam memory per node.

Copy link
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments


import static java.util.Objects.requireNonNull;

public class ValidatingAggregateContext
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe some meaningful name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea.

@losipiuk losipiuk force-pushed the lo/per-task-memory-limits branch from ee9eb86 to 34a6352 Compare December 27, 2021 09:10
@losipiuk losipiuk force-pushed the lo/per-task-memory-limits branch from 34a6352 to 5abc758 Compare December 27, 2021 09:10
@losipiuk losipiuk merged commit c88d968 into trinodb:master Dec 27, 2021
@losipiuk losipiuk deleted the lo/per-task-memory-limits branch December 27, 2021 15:22
@github-actions github-actions bot added this to the 368 milestone Dec 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants