Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config-linux: add more info about hugetlb page size #1011

Merged
merged 1 commit into from
Jun 17, 2019

Conversation

odinuge
Copy link
Contributor

@odinuge odinuge commented Jun 3, 2019

Currently the docs don't say anything about what the "pageSize" is other
than the fact that it is a string. This makes it easier for developers
to understand how it works, and may help avoiding mistakes which are
hard to spot.

This can help other doing the same mistake as runc and k8s when using hugepages, used <size>kB instead of the correct <size>KB. (opencontainers/runc#2065)

Signed-off-by: Odin Ugedal [email protected]

@odinuge odinuge force-pushed the hugetlb-pattern branch from fa389cb to 74d30d5 Compare June 3, 2019 10:18
@crosbymichael
Copy link
Member

crosbymichael commented Jun 3, 2019

LGTM

Approved with PullApprove

config-linux.md Outdated
@@ -394,6 +394,8 @@ For more information, see the kernel cgroups documentation about [HugeTLB][cgrou
Each entry has the following structure:

* **`pageSize`** *(string, REQUIRED)* - hugepage size
The value has the format "<size><unit-prefix>B" (64KB, 2MB, 1GB), and must match the <hugepagesize> of the
corresponding control file found in "/sys/fs/cgroup/hugetlb/hugetlb.<hugepagesize>.limit_in_bytes"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also specify explicitly that these are intended to be 1024-based units?

I think https://github.com/opencontainers/runc/blob/e4fa8a457544ca646e02e60d124aebb0bb7f52ad/libcontainer/cgroups/utils.go#L411-L429 is also related -- the kernel appears to use a lowercase k but uppercase for the rest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tianon! 😄

Should we also specify explicitly that these are intended to be 1024-based units?

Smart! Any suggestion about how to write it?

I think https://github.com/opencontainers/runc/blob/e4fa8a457544ca646e02e60d124aebb0bb7f52ad/libcontainer/cgroups/utils.go#L411-L429 is also related -- the kernel appears to use a lowercase k but uppercase for the rest?

The kernel does indeed use "KB" (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349), and that has been the case from day 1. To avoid the mistake, I posted a patch adding a note about it in the kernel docs too. https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git/tree/Documentation/cgroup-v1/hugetlb.txt?h=for-5.2-fixes It was applied to the cgroup-tree, and will most likely be a part of the 5.2 release 😄

I have a PR in runc to fix it here: opencontainers/runc#2065, together with this PR kubernetes/kubernetes#78495 in k8s, blocked by runc. 😄 Both PRs have some background info about the issue. Even containerd use kB, patched here: containerd/cgroups#85

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think the lowercase k is simply a result of copy-paste from docker/go-units into runc, and further into k8s etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is -- when I do an ls on the path that bit of code is looking at, I get lowercase k values:

$ ls /sys/kernel/mm/hugepages/
hugepages-1048576kB  hugepages-2048kB

Smart! Any suggestion about how to write it?

Technically, what we're talking about is officially "MiB", etc, but still commonly used both ways (which is why I figure it's probably worth calling out, since MB is still considered ambiguous).
Here's the best idea I've got, maybe someone else can do / adapt this better:

Values of <unit-prefix> are intended to be parsed using base 1024 ("1KB" = 1024, "1MB" = 1048576, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is -- when I do an ls on the path that bit of code is looking at, I get lowercase k values:

You are looking at the normal kernel memory section (idk. what they are calling it). There, all the sizes are rendered with kB (yes, lowercase). The cgroup control files (v1) we are talking about, are located in /sys/fs/cgroup/hugetlb/. When setting the huge page limits we use the cgroup control files for the given container (further down in the /sys/fs/cgroup/hugetlb dir), and not the files inside /sys/kernel/mm/hugepages/.

I totally agree that this is strange, and that it would be easier to just use the same format both places. Maybe they does something else when they implement huge pages in cgroup v2 (I dont think they have yet)..

Screenshot from 2019-06-03 21-45-09

Technically, what we're talking about is officially "MiB", etc, but still commonly used both ways (which is why I figure it's probably worth calling out, since MB is still considered ambiguous).
Here's the best idea I've got, maybe someone else can do / adapt this better:

Yee. MiB is the correct. When dealing with memory, the kernel people (and more or less all of us) always use 1 << 10, 1 << 20 etc. when talking about sizes, and in my opinion they (the kernel people) do a lot of strange and inconsistent stuff when naming these things. They never (almost never) break userspace, so I guess that is the reason stuff like this is kept the way it is.

Values of are intended to be parsed using base 1024 ("1KB" = 1024, "1MB" = 1048576, etc).

Nice! 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, thanks for the additional background!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the description here according to the discussion now @tianon! 😄

Currently the docs don't say anything about what the "pageSize" is other
than the fact that it is a string. This makes it easier for developers
to understand how it works, and may help avoiding mistakes which are
hard to spot.

Signed-off-by: Odin Ugedal <[email protected]>
@tianon
Copy link
Member

tianon commented Jun 14, 2019

LGTM

👍

Approved with PullApprove

@odinuge
Copy link
Contributor Author

odinuge commented Jun 16, 2019

CC: @crosbymichael

@crosbymichael
Copy link
Member

crosbymichael commented Jun 17, 2019

LGTM

Approved with PullApprove

@crosbymichael crosbymichael merged commit 7a49e34 into opencontainers:master Jun 17, 2019
@vbatts vbatts mentioned this pull request Jan 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants