This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Use cgroup for jvm memory limit #337

Merged Apr 24, 2018 (3 commits)


kragniz
Contributor

@kragniz kragniz commented Apr 17, 2018

This should cap the JVM's memory limit at the amount allocated in the container's cgroup.
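For context, here is a minimal sketch of what "the amount allocated in the container's cgroup" can look like from inside a container. It assumes a cgroup v1 host, where the memory limit is exposed at `/sys/fs/cgroup/memory/memory.limit_in_bytes`. The function name and the "unlimited" threshold are illustrative assumptions and are not taken from this PR.

```python
from pathlib import Path
from typing import Optional

# cgroup v1 location of the memory limit; cgroup v2 hosts expose a different file.
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")

# Assumption: values at or above this threshold are treated as "no limit set".
UNLIMITED_THRESHOLD = 1 << 62


def read_cgroup_memory_limit(path: Path = CGROUP_V1_LIMIT) -> Optional[int]:
    """Return the cgroup memory limit in bytes, or None if absent or unlimited."""
    try:
        raw = path.read_text().strip()
    except OSError:
        # File missing or unreadable: not running under a cgroup v1 memory controller.
        return None
    if not raw.isdigit():
        return None
    limit = int(raw)
    return None if limit >= UNLIMITED_THRESHOLD else limit
```

Passing the path as a parameter keeps the function usable outside a container, for example against a temporary file.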

Use cgroup for jvm memory limit

@kragniz
Contributor Author

kragniz commented Apr 17, 2018

Note: this has not yet been tested with Elasticsearch.

@munnerz
Contributor

munnerz commented Apr 18, 2018

Ran this https://gist.github.com/munnerz/bb19540ab547fc5d2ec1e1af0f7a2963

docker.elastic.co/elasticsearch/elasticsearch:6.2.3 = openjdk version "1.8.0_161"
docker.elastic.co/elasticsearch/elasticsearch:6.2.2 = openjdk version "1.8.0_161"
docker.elastic.co/elasticsearch/elasticsearch:6.2.1 = openjdk version "1.8.0_161"
docker.elastic.co/elasticsearch/elasticsearch:6.2.0 = openjdk version "1.8.0_161"
docker.elastic.co/elasticsearch/elasticsearch:6.1.4 = openjdk version "1.8.0_161"
docker.elastic.co/elasticsearch/elasticsearch:6.1.3 = openjdk version "1.8.0_161"
docker.elastic.co/elasticsearch/elasticsearch:6.1.2 = openjdk version "1.8.0_151"
docker.elastic.co/elasticsearch/elasticsearch:6.1.1 = openjdk version "1.8.0_151"
docker.elastic.co/elasticsearch/elasticsearch:6.1.0 = openjdk version "1.8.0_151"
docker.elastic.co/elasticsearch/elasticsearch:6.0.1 = openjdk version "1.8.0_151"
docker.elastic.co/elasticsearch/elasticsearch:6.0.0 = openjdk version "1.8.0_151"
docker.elastic.co/elasticsearch/elasticsearch:5.6.8 = openjdk version "1.8.0_161"
docker.elastic.co/elasticsearch/elasticsearch:5.6.7 = openjdk version "1.8.0_161"
docker.elastic.co/elasticsearch/elasticsearch:5.6.6 = openjdk version "1.8.0_151"
docker.elastic.co/elasticsearch/elasticsearch:5.6.5 = openjdk version "1.8.0_151"
docker.elastic.co/elasticsearch/elasticsearch:5.6.4 = openjdk version "1.8.0_141"
docker.elastic.co/elasticsearch/elasticsearch:5.6.3 = openjdk version "1.8.0_141"
docker.elastic.co/elasticsearch/elasticsearch:5.6.2 = openjdk version "1.8.0_141"
docker.elastic.co/elasticsearch/elasticsearch:5.6.1 = openjdk version "1.8.0_141"
docker.elastic.co/elasticsearch/elasticsearch:5.6.0 = openjdk version "1.8.0_141"
docker.elastic.co/elasticsearch/elasticsearch:5.5.3 = openjdk version "1.8.0_141"
docker.elastic.co/elasticsearch/elasticsearch:5.5.2 = openjdk version "1.8.0_141"
docker.elastic.co/elasticsearch/elasticsearch:5.5.1 = openjdk version "1.8.0_141"
docker.elastic.co/elasticsearch/elasticsearch:5.5.0 = openjdk version "1.8.0_131"
docker.elastic.co/elasticsearch/elasticsearch:5.4.3 = openjdk version "1.8.0_131"
docker.elastic.co/elasticsearch/elasticsearch:5.4.2 = openjdk version "1.8.0_131"
docker.elastic.co/elasticsearch/elasticsearch:5.4.1 = openjdk version "1.8.0_131"
docker.elastic.co/elasticsearch/elasticsearch:5.4.0 = openjdk version "1.8.0_131"
docker.elastic.co/elasticsearch/elasticsearch:5.3.3 = openjdk version "1.8.0_131"
docker.elastic.co/elasticsearch/elasticsearch:5.3.2 = openjdk version "1.8.0_121"
docker.elastic.co/elasticsearch/elasticsearch:5.3.1 = openjdk version "1.8.0_121"
docker.elastic.co/elasticsearch/elasticsearch:5.3.0 = openjdk version "1.8.0_92-internal"
docker.elastic.co/elasticsearch/elasticsearch:5.2.1 = openjdk version "1.8.0_92-internal"
docker.elastic.co/elasticsearch/elasticsearch:5.2.0 = openjdk version "1.8.0_92-internal"

EDIT: updated with more versions

@kragniz
Contributor Author

kragniz commented Apr 18, 2018

cool, so 5.2.1 is the only one that won't support the cgroup flags

@munnerz
Contributor

munnerz commented Apr 18, 2018

I've updated my comment with some more versions

@munnerz
Contributor

munnerz commented Apr 18, 2018

@cehoffman do you have a requirement for any specific elasticsearch versions? It looks like this fix would not work for:

docker.elastic.co/elasticsearch/elasticsearch:5.3.2 = openjdk version "1.8.0_121"
docker.elastic.co/elasticsearch/elasticsearch:5.3.1 = openjdk version "1.8.0_121"
docker.elastic.co/elasticsearch/elasticsearch:5.3.0 = openjdk version "1.8.0_92-internal"
docker.elastic.co/elasticsearch/elasticsearch:5.2.1 = openjdk version "1.8.0_92-internal"
docker.elastic.co/elasticsearch/elasticsearch:5.2.0 = openjdk version "1.8.0_92-internal"
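The cut-off implied by the list above can be sketched as a small version check. This assumes the cgroup heap flags first shipped in Java 8 update 131, which matches the split between the 1.8.0_131 images and the older ones listed here. The function names are hypothetical.

```python
import re

# Assumption: the cgroup heap flags first appeared in Java 8 update 131.
MIN_UPDATE_WITH_CGROUP_FLAGS = 131


def java8_update(version: str) -> int:
    """Extract the update number from '1.8.0_161' or '1.8.0_92-internal'."""
    match = re.fullmatch(r"1\.8\.0_(\d+)(?:-\w+)?", version)
    if match is None:
        raise ValueError(f"not a Java 8 version string: {version!r}")
    return int(match.group(1))


def supports_cgroup_flags(version: str) -> bool:
    """True if this Java 8 build is new enough to accept the cgroup heap flags."""
    return java8_update(version) >= MIN_UPDATE_WITH_CGROUP_FLAGS
```

Applied to the versions in this thread, `1.8.0_121` and `1.8.0_92-internal` fall below the cut-off, while `1.8.0_131` and later pass.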

@cehoffman

cehoffman commented Apr 18, 2018 via email

@munnerz
Contributor

munnerz commented Apr 24, 2018

/lgtm
/approve

@jetstack-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: munnerz


@retest-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to jetstack).
Review the full test history for this PR.
Silence the bot with an /lgtm cancel comment for consistent failures.

@jetstack-bot jetstack-bot merged commit 24b3717 into jetstack:master Apr 24, 2018
@cehoffman

cehoffman commented May 6, 2018

I haven't been able to update navigator to a version that includes this change, but I updated a Docker image for Elasticsearch to include these flags and found a couple of problems. For Elasticsearch, the flags are ignored because the default JVM options set the min/max heap. From the official blog post introducing these flags: "When these two JVM command line options are used, and -Xmx is not specified, the JVM will look at the Linux cgroup configuration."
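The rule quoted from the blog post can be sketched as a check over a list of JVM options. The thread does not name the two flags; this sketch assumes they are `-XX:+UnlockExperimentalVMOptions` and `-XX:+UseCGroupMemoryLimitForHeap`, the Java 8 options usually meant in this context. The function names and the `-Xms1g`/`-Xmx1g` values in the usage are illustrative only.

```python
from typing import Iterable

# Assumption: these are the "two JVM command line options" the quote refers to.
CGROUP_FLAGS = {
    "-XX:+UnlockExperimentalVMOptions",
    "-XX:+UseCGroupMemoryLimitForHeap",
}


def has_explicit_max_heap(jvm_options: Iterable[str]) -> bool:
    """True if the options pin the max heap, via -Xmx or -XX:MaxHeapSize=."""
    return any(
        opt.startswith("-Xmx") or opt.startswith("-XX:MaxHeapSize=")
        for opt in jvm_options
    )


def cgroup_limit_takes_effect(jvm_options: Iterable[str]) -> bool:
    """True only if both cgroup flags are present and no explicit max heap is set."""
    opts = list(jvm_options)
    return CGROUP_FLAGS.issubset(opts) and not has_explicit_max_heap(opts)
```

With default heap flags such as `-Xms1g -Xmx1g` present alongside the cgroup flags, `cgroup_limit_takes_effect` returns `False`, which mirrors the behaviour described above.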

I then created a new image with the default heap flags removed. This resulted in constant OOMs, which I guess is because of the memory required by pilot. I set InitialRAMFraction to 2 and have had success so far. This gives the JVM a good initial heap, which helps with fragmentation, but it will still eventually OOM because it shares memory with pilot. I think this is an argument for running pilot as a separate container in the deployment instead of copying it into the database container.

InitialRAMFraction did not work the way I expected: in a container with a 6Gi limit, the JVM allocated a max heap of only 1.5Gi and never increased it. I have reverted to using the Xms and Xmx flags until the OOM caused by pilot's memory use can be addressed.
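A likely explanation for the 1.5Gi figure is that the max heap is governed by MaxRAMFraction rather than InitialRAMFraction. Java 8 defaults to MaxRAMFraction=4, and 6Gi / 4 = 1.5Gi. The sketch below is a simplified model of that arithmetic. Clamping the initial heap to the max is an assumption about HotSpot's behaviour, and the model ignores alignment and minimum heap sizes.

```python
from typing import Tuple

GIB = 1024 ** 3


def default_heap_bounds(
    limit_bytes: int,
    initial_ram_fraction: int = 64,  # Java 8 default for -XX:InitialRAMFraction
    max_ram_fraction: int = 4,       # Java 8 default for -XX:MaxRAMFraction
) -> Tuple[int, int]:
    """Return (initial_heap, max_heap) in bytes for a given memory limit."""
    max_heap = limit_bytes // max_ram_fraction
    # Assumption: the initial heap can never exceed the max heap.
    initial_heap = min(limit_bytes // initial_ram_fraction, max_heap)
    return initial_heap, max_heap
```

For a 6Gi limit with InitialRAMFraction=2, this model yields an initial and max heap of 1.5Gi each, which matches the observation above. Under the same model, raising the ceiling would mean lowering MaxRAMFraction, not InitialRAMFraction.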

Edit:
Also on the subject of flags: Elasticsearch needs the -Des.cgroups.hierarchy.override=/ flag to report correct monitoring results.
