Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: add namespace adjustment #124

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tych0
Copy link

@tych0 tych0 commented Nov 22, 2024

We are interested in running some parts of a pod in host or totally separate pid and network namespaces, so add an adjustment that allows for that.

pkg/adaptation/result.go Outdated Show resolved Hide resolved
withNamespaces([]rspec.LinuxNamespace{
{
Type: rspec.PIDNamespace,
Path: "/meshuggah.rocks",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😆

Copy link
Member

@dcantah dcantah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after the error msg fix

We are interested in running some parts of a pod in host or totally
separate pid and network namespaces, so add an adjustment that allows for
that.

Signed-off-by: Tycho Andersen <[email protected]>
@tych0 tych0 force-pushed the adjust-namespaces branch from 4403b52 to 698e6ae Compare November 25, 2024 14:29
@klihub klihub requested review from fuweid and samuelkarp November 26, 2024 09:13
Copy link
Member

@klihub klihub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @mikebrow WDYT ?

I suspect this PR (and the #123 seccomp one) might revive the discussion around the possibility to administratively lock down some NRI adjustments via configuration.

@mikebrow
Copy link
Member

mikebrow commented Dec 2, 2024

Need more detail on the use case. What is a "part" of a pod. E.g. the network namespace, well there is currently only one network namespace for the shared networks of the pod and all of it's containers.. and that network namespace is host, type pod generated by the container runtime, or type user namespace pod as directed by kubelet based on the pod spec and in this case the runc runtime engine creates the netns under the user namespace. Is this some sort of non-k8s use case for linux distros to support pods such as podman pods? Need to understand these use cases to understand where and how to manage these security / isolation changes, possibly on a client basis and possibly under a new non k8s.io namespace.

@tych0
Copy link
Author

tych0 commented Dec 2, 2024

Sure,

What is a "part" of a pod.

Specifically, one container in a pod. The rest of the pod we will leave as is. In fact,

E.g. the network namespace, well there is currently only one network namespace for the shared networks of the pod and all of it's containers..

it's exactly the network namespace that we want to change here. The rest of the pod will live in the same set of namespaces as it usually does.

Is this some sort of non-k8s use case for linux distros

It is non-k8s in the sense that the network namespace we care about is created entirely outside of k8s, and there is no k8s infrastructure for managing it. It is unrelated to any linux-distro specific thing, and has to do with Netflix' network architecture. There is an old Plumber's talk about the specifics here: https://lpc.events/event/11/contributions/932/attachments/908/1764/LPC%202021_%20Talking%20IPv6%20to%20IPv4%20Without%20NAT_2.pdf

@tych0
Copy link
Author

tych0 commented Dec 2, 2024

it's exactly the network namespace that we want to change here.

Actually it's the pid ns as well. We want to run in the parent pidns of the containers, so that we can see them to do seccomp() operations on them correctly.

@mikebrow
Copy link
Member

mikebrow commented Dec 3, 2024

thx for the detail

@champtar
Copy link
Contributor

I didn't pay attention to the open PRs and ended up doing another PR to adjust namespaces: #135
The key differences:

  • you pass only the added / modified / deleted namespaces, not the full list
  • ownership is per namespace (1 plugin add cgroup, another plugin change network)
  • there are some helper functions

I'm fine if we pick this PR in the end, I just want to be able to adjust namespaces :)

As for the security discussion, I though NRI was considered part of the runtime, ie you get NRI you get root.
Today adjusting mounts or devices you can likely already escape to the host, so I don't think adjusting namespaces or seccomp changes anything security wise, it's already wide open.

Comment on lines +338 to +343
func (g *Generator) AdjustNamespaces(namespaces []*nri.LinuxNamespace) error {
for _, ns := range namespaces {
if err := g.AddOrReplaceLinuxNamespace(ns.Type, ns.Path); err != nil {
return err
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit you say We are interested in running some parts of a pod in host but if I understand this correctly you can't remove a namespace, correct ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants