Memory oversubscription #10247
Changes from 8 commits
@@ -976,6 +976,11 @@ func (tr *TaskRunner) buildTaskConfig() *drivers.TaskConfig {
 		}
 	}

+	memoryLimit := taskResources.Memory.MemoryMB
+	if max := taskResources.Memory.MemoryMaxMB; max > memoryLimit {
+		memoryLimit = max
+	}
+
 	return &drivers.TaskConfig{
 		ID:   fmt.Sprintf("%s/%s/%s", alloc.ID, task.Name, invocationid),
 		Name: task.Name,

@@ -988,7 +993,7 @@ func (tr *TaskRunner) buildTaskConfig() *drivers.TaskConfig {
 		Resources: &drivers.Resources{
 			NomadResources: taskResources,
 			LinuxResources: &drivers.LinuxResources{
-				MemoryLimitBytes: taskResources.Memory.MemoryMB * 1024 * 1024,
+				MemoryLimitBytes: memoryLimit * 1024 * 1024,
 				CPUShares:        taskResources.Cpu.CpuShares,
 				PercentTicks:     float64(taskResources.Cpu.CpuShares) / float64(tr.clientConfig.Node.NodeResources.Cpu.CpuShares),
 			},
Review thread on the `MemoryLimitBytes` change:

Review comment: Updating the Linux resources is intended to ease driver implementation and adoption of the feature; drivers that use NomadResources will need to be updated to track the new field value. Given that tasks aren't guaranteed to use up the excess memory limit, this is a reasonable compromise. I don't know the breakdown of how external third-party drivers check the memory limit, but I'm happy to change the default.

Reply: So if they don't get updated, they'll just end up setting their limit equal to the […]. From a customer/community impact standpoint, the two I'd worry the most about are […].

Reply: Yes, the failure mode is ignoring `memory_max` and behaving like today. I'm researching soft vs. hard limits a bit now, and will ensure […]
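To make the driver-adoption point concrete, here is a minimal sketch (not from this PR) of the limit computation that a driver reading NomadResources would need to replicate. The struct mirrors the two fields used in the patch above; the helper name `effectiveLimitBytes` and everything around it are hypothetical.

```go
package main

import "fmt"

// AllocatedMemoryResources mirrors the fields referenced in the diff above.
type AllocatedMemoryResources struct {
	MemoryMB    int64
	MemoryMaxMB int64 // zero when memory_max is not set in the job spec
}

// effectiveLimitBytes picks the oversubscribed maximum when one is set,
// falling back to the base reservation otherwise — the same rule the
// patch applies when building LinuxResources.MemoryLimitBytes.
func effectiveLimitBytes(mem AllocatedMemoryResources) int64 {
	limitMB := mem.MemoryMB
	if mem.MemoryMaxMB > limitMB {
		limitMB = mem.MemoryMaxMB
	}
	return limitMB * 1024 * 1024
}

func main() {
	// With memory = 256 and memory_max = 512, the enforced limit is 512 MB.
	fmt.Println(effectiveLimitBytes(AllocatedMemoryResources{MemoryMB: 256, MemoryMaxMB: 512})) // 536870912
	// Without memory_max, behavior is unchanged from today: 256 MB.
	fmt.Println(effectiveLimitBytes(AllocatedMemoryResources{MemoryMB: 256})) // 268435456
}
```

A driver that never adopts the new field simply keeps enforcing the base `memory` value, which is the failure mode described above.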
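For reference on the soft- vs. hard-limit question raised above: under cgroup v1 these are two separate control files. Below is a rough sketch of one possible mapping, assuming the base reservation becomes the soft limit and `memory_max` the hard, OOM-enforced limit; this is an assumption for illustration, not necessarily what the PR ends up doing, and the cgroup path is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// applyMemoryLimits writes the oversubscribed maximum as the hard limit
// (exceeding it triggers the kernel OOM killer) and the base reservation
// as the soft limit (reclaimed against first under memory pressure).
func applyMemoryLimits(cgroupDir string, reserveMB, maxMB int64) error {
	hard := []byte(fmt.Sprintf("%d", maxMB*1024*1024))
	if err := os.WriteFile(filepath.Join(cgroupDir, "memory.limit_in_bytes"), hard, 0o644); err != nil {
		return err
	}
	soft := []byte(fmt.Sprintf("%d", reserveMB*1024*1024))
	return os.WriteFile(filepath.Join(cgroupDir, "memory.soft_limit_in_bytes"), soft, 0o644)
}

func main() {
	// Hypothetical cgroup path; requires root and a cgroup v1 memory controller.
	if err := applyMemoryLimits("/sys/fs/cgroup/memory/nomad/example-task", 256, 512); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```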
Review comment: In this iteration, I've opted to simply add a `memory_max` field in the job spec, with `memory` remaining as the "reserve"/base memory requirement. Happy to reconsider this and use an alternative name for the "base", e.g. `memory_reserve` or `memory_required`?

I considered `memory_min`, but I find it ambiguous: `min` suggests the minimum memory a task uses rather than how much memory we should reserve/allocate for the task.
Reply: Looks like that never really got resolved on the RFC, but I'm totally 👍 for this. It avoids any migration issues later, too.
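For illustration, here is how the agreed naming could look from the Go API client side, assuming `api.Resources` exposes a `MemoryMaxMB` field alongside `MemoryMB` (that field may land in a separate change from this one):

```go
package main

import (
	"fmt"

	"github.com/hashicorp/nomad/api"
)

func intPtr(i int) *int { return &i }

func main() {
	task := api.NewTask("web", "docker")
	task.Resources = &api.Resources{
		MemoryMB:    intPtr(256), // base/"reserve" requirement used for scheduling
		MemoryMaxMB: intPtr(512), // oversubscription ceiling; assumed field name
	}
	fmt.Printf("reserve=%d MB, max=%d MB\n", *task.Resources.MemoryMB, *task.Resources.MemoryMaxMB)
}
```

In the job spec itself this corresponds to `memory` staying the scheduled reservation while `memory_max` caps how far the task may burst above it.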