GlusterFS volume error : glusterfs: failed to get endpoints #6331

Closed
arrawatia opened this issue Dec 15, 2015 · 47 comments
Labels
component/storage lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/P2

Comments

@arrawatia

I followed the instructions from [1] and [2] to set up a GlusterFS PV, but when I try to use the PVC in a pod, it fails.

The error in node logs is :

Dec 15 21:48:52 node-compute-8c258955.dev.aws.qtz.io origin-node[405]: E1215 21:48:52.824595     405 glusterfs.go:89] glusterfs: failed to get endpoints glusterfs-dev[endpoints "glusterfs-dev" not found]
Dec 15 21:48:52 node-compute-8c258955.dev.aws.qtz.io origin-node[405]: E1215 21:48:52.832670     405 glusterfs.go:89] glusterfs: failed 

But the "glusterfs-dev" service and endpoints both exist in the "default" namespace:

$ oc get service glusterfs-dev -n default -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2015-12-15T21:32:23Z
  name: glusterfs-dev
  namespace: default
  resourceVersion: "2894020"
  selfLink: /api/v1/namespaces/default/services/glusterfs-dev
  uid: 5424624c-a373-11e5-b562-06302f7b1fe5
spec:
  clusterIP: 172.30.224.96
  portalIP: 172.30.224.96
  ports:
  - port: 1
    protocol: TCP
    targetPort: 1
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

$ oc get endpoints glusterfs-dev -n default -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: 2015-12-15T11:03:46Z
  name: glusterfs-dev
  namespace: default
  resourceVersion: "2839215"
  selfLink: /api/v1/namespaces/default/endpoints/glusterfs-dev
  uid: 82f60af3-a31b-11e5-b562-06302f7b1fe5
subsets:
- addresses:
  - ip: 172.18.1.89
  - ip: 172.18.1.90
  ports:
  - port: 1
    protocol: TCP

It looks like the glusterfs plugin looks for the endpoint in the pod's namespace [3] rather than in the "default" namespace as documented.

I tried creating the endpoint in the project and it worked.

Is this a documentation bug or a regression? Creating endpoints per project is cumbersome: the endpoints need the glusterfs cluster IPs and do not work with hostnames (as per [1]), so any change in the glusterfs cluster means that every project on the cluster needs to be updated.

[1] https://docs.openshift.org/latest/install_config/persistent_storage/persistent_storage_glusterfs.html
[2] kubernetes/kubernetes#12964
[3] https://github.com/openshift/origin/blob/master/Godeps/_workspace/src/k8s.io/kubernetes/pkg/volume/glusterfs/glusterfs.go#L86
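
The per-project workaround mentioned above (creating the endpoint in each project) could be scripted. The following is a minimal Python sketch, not an official tool: it builds a `glusterfs-dev` Endpoints object and a same-named Service for each project and prints them as one JSON `List`. The IPs and port are copied from the output above; the project names `project-a` and `project-b` and the helper names `gluster_objects` and `manifest_list` are hypothetical.

```python
import json

# Values copied from the Endpoints object shown in this issue.
GLUSTER_IPS = ["172.18.1.89", "172.18.1.90"]
PORT = 1


def gluster_objects(namespace, name="glusterfs-dev"):
    """Return the Endpoints and Service manifests for one project (hypothetical helper)."""
    endpoints = {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {"name": name, "namespace": namespace},
        "subsets": [
            {
                "addresses": [{"ip": ip} for ip in GLUSTER_IPS],
                "ports": [{"port": PORT, "protocol": "TCP"}],
            }
        ],
    }
    # A Service with the same name and no selector, mirroring the one shown above.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"ports": [{"port": PORT, "protocol": "TCP", "targetPort": PORT}]},
    }
    return [endpoints, service]


def manifest_list(namespaces):
    """Bundle the objects for every project into a single v1 List."""
    items = [obj for ns in namespaces for obj in gluster_objects(ns)]
    return {"apiVersion": "v1", "kind": "List", "items": items}


if __name__ == "__main__":
    # "project-a" and "project-b" are placeholder project names.
    print(json.dumps(manifest_list(["project-a", "project-b"]), indent=2))
```

Assuming the script is saved under some name of your choosing, its output could be piped to `oc create -f -`. It does not remove the underlying problem: every change to the gluster IPs still has to be pushed to every project.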

@smarterclayton
Contributor

The instructions are wrong - you need to create a service. See #6167

@smarterclayton
Contributor

Docs will be created with openshift/openshift-docs#1356

@smarterclayton
Contributor

Oops, missed that you had the service :) Regarding the mismatch, agree that is ugly.

@smarterclayton
Contributor

@rootfs how do we fix this so end users can easily have gluster that the whole cluster can use?

@rootfs
Member

rootfs commented Dec 16, 2015

@smarterclayton
Ideally glusterfs volume spec should embed the endpoint instead of referring to external endpoints. This is how ceph rbd volume spec does too.
To make it backward compatible, the embedded endpoint will be optional, but if supplied, it overrides the external endpoint. If this doesn't sound too bad, I'll make an upstream PR.

@smarterclayton
Contributor

Sounds fine to me.


@karelstriegel

@rootfs, @smarterclayton to be clear, the only workaround is to define the endpoints in each namespace?

@rootfs
Member

rootfs commented Apr 7, 2016

@karelstriegel you need to have a service to associate the endpoint.

@karelstriegel

@rootfs I have a service to associate the endpoints; both are in the default namespace. So I guess I need to have both in each namespace (as a workaround)?

update: That solves my issue, but you should be able to define it cluster-wide.

@rootfs
Member

rootfs commented Apr 7, 2016

yes, service and endpoints live in the same namespace.

@ibotty
Contributor

ibotty commented Apr 20, 2016

@rootfs I personally like to specify the gluster endpoints separately; it is an abstraction I do like and use. What about adding a namespace key, so that the default glusterfs endpoint can also be chosen from pods in a different namespace?

@rootfs
Member

rootfs commented Apr 21, 2016

sure, i'll look into that.

@erictune

Putting the server list into every pod/podTemplate does not seem to solve the OP's requirement:

any change in glusterfs cluster will mean that every project on the cluster needs to be updated.

Changing an Endpoints object in every namespace is easier than changing the volume spec in every pod.

@erictune

Also, you cannot really make a required field optional, because clients other than kubelet might be relying on the definition of the Glusterfs volume, and will break if a required field is not set.

@erictune

Have you considered either (1) adding an optional "endpointsNamespace" field (which is super easy to implement but has authz implications) or (2) making a thing that helps you broadcast a service's endpoints from a source namespace into headless endpoints in all other namespaces?

@smarterclayton
Contributor

For PVC we would already have the security protection (end users can't
change endpointsNamespace) so we could be reasonably sure that any kubelet
can look up that endpoint. We already block regular users from specifying
PVC, and technically DNS already returns endpoints for everyone, so I think
it's ok for Gluster endpoints to be a reference across namespaces.


@erictune

Good point about DNS. Option 1 is looking best.

@rootfs
Member

rootfs commented Sep 1, 2016

@erictune @smarterclayton I like the namespace as a parameter pattern. I already used it in one of the PRs.

Here, server list doesn't replace endpoint. Server list is used only when endpoint is not working. It solves the following issues:

  • We have to ensure that a headless service uses the endpoint to keep it alive. This is really a poor user experience. If the endpoint is not associated with a service and gets lost, the server list is our insurance.
  • The server list approach makes more sense from a volume provisioning perspective. When the gluster plugin uses Heketi (the external gluster provisioner) to provision a gluster volume, the plugin gets the gluster hosts and the volume path.

If the hosts are in the endpoint, all is good. But if the hosts are not in the endpoint, then we have a split-brain problem. If we decide to go with option (1), then we have to either create a new endpoint (and the headless service) or update an existing endpoint, which is quite an unprecedented pattern. But if we use the hosts in the proposed server list, then we can at least get updated working hosts.

@humblec
Contributor

humblec commented Sep 1, 2016

My thoughts are the same as what @rootfs shared here. Also, to ensure backward compatibility, the endpoint is still kept and is given preference over the servers list. The servers list is only used when the endpoint is not working.
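
The preference order described here (endpoint first, servers list only as a fallback) can be illustrated with a small Python sketch. This is not the plugin's code; the function name `select_gluster_hosts` and the idea of passing `None` for a failed endpoint lookup are assumptions made for illustration.

```python
def select_gluster_hosts(endpoint_ips, server_list):
    """Prefer addresses from the endpoint; fall back to the servers list.

    endpoint_ips: list of IPs read from the endpoint, or None if the lookup failed.
    server_list:  list of hosts from the proposed servers list.
    """
    if endpoint_ips:  # the endpoint exists and has at least one address
        return list(endpoint_ips)
    if server_list:  # the endpoint is missing or empty, so use the fallback
        return list(server_list)
    raise LookupError("no gluster hosts: endpoint unavailable and servers list empty")
```

With this ordering, existing volumes that rely only on the endpoint keep behaving as before, which is the backward-compatibility point made above.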

@erictune

@humblec When I looked at kubernetes/kubernetes#33020 it sounded like it allows you to reference an existing Endpoints object in another namespace. For example, if my namespace is called "ericsns" and there is a gluster cluster in the "gluster" namespace, then I can refer to that and kubelet will use those other endpoints.

But, your description in your last comment uses the words dynamically create the endpoint/service when we provision the volume. I didn't see anything about dynamic creation in that PR.

@humblec
Contributor

humblec commented Sep 20, 2016

@erictune yes, in the new PR, the provided 'endpointnamespace' is used. Regarding dynamic creation of the endpoint/service, the provisioner dynamically collects the cluster IPs and creates an endpoint/service if it doesn't exist: https://github.com/kubernetes/kubernetes/pull/33020/files#diff-e97253dd603331ffca81131a4b67264fR555

@liggitt
Contributor

liggitt commented Sep 20, 2016

So this would allow me to make a kubelet create a service and endpoint for me in someone else's namespace by creating a pod with a specially crafted glustervolumesource? That's not something we should enable

humblec added a commit to humblec/kubernetes that referenced this issue Sep 21, 2016
@eparis
Member

eparis commented Sep 21, 2016

@liggitt I'm not sure I understand why a kubelet would ever create a service and endpoint. Although I haven't read the patch to see if it is doing something different than it should.

The idea was supposed to be that the provisioner, the trusted component which creates PVs, will create the service and endpoints (endpointName) in a namespace controlled by the provisioner configuration (let's call it provisionNS). It will then create the PV listing both provisionNS and endpointName. The end user should have NO control over this.

While this may be triggered by an end user creating a PVC, which causes the provisioner to do its work, neither the end user nor the kubelet creates services, endpoints, etc. The provisioner does that. The kubelet will USE the provisionNS/endpointName tuple in order to look up the server location and mount the storage.

If this is not what the code is doing we have a problem.

@liggitt
Contributor

liggitt commented Sep 21, 2016

The GlusterVolumeSource object can be part of a PersistentVolume, or part of a Volume definition inside a pod spec. I'm primarily concerned about adding a namespace field to the latter (though the more params we add to PV provisioners that end users shouldn't be able to see/modify, the harder ACL will be). I would not want the possibility of part of the system acting in another namespace thinking the GlusterVolumeSource came from a PersistentVolume under cluster admin control, when in reality it came from a pod spec Volume under user control.

@eparis
Member

eparis commented Sep 21, 2016

So that I thought we discussed as well. Although now I have a new 'concern' which you just brought up. I'd been thinking that it was irrelevant if we let a user set the endpointName and endpointNamespace in the pod spec volume definition.

But I'm afraid that is not true. While discovering the IP addresses which are part of the endpoint in another namespace does not seem security sensitive, we are subtly changing how those can be used, since this will cause the kubelet, instead of the pod, to connect to those endpoints. Connections from the kubelet may have different access permissions than connections from the pod. So @liggitt is right, we do have the system acting on behalf of something the user can control. Uggh.

Is the only answer to abandon endpoints altogether?

@erictune

I agree with Jordan that cross-namespace references are to be avoided for
security reasons.

Another reason to avoid them is that they expose an implementation detail.
For example, if the cluster admin decides to run gluster in the "gluster"
namespace, then all the consumers have to refer to
"endpointsns: gluster, endpoint: default".

But, what happens if, say, the cluster admin decides to try out a new
version of gluster running in namespace "gluster-new".
And the cluster admin wants to move 10% of namespaces to the new storage
system. How would the admin even do that?

I think that the service producer and consumer need to be loosely coupled,
and endpointsns does not accomplish that.

Some other options to explore:

Continue to use a local-to-consumer-namespace endpoints, and try harder to
make sure that endpoints was always correct (such as by having the cluster
admin run a controller to reconcile it.) (I still don't understand why
this is not an option).

Refer to the gluster cluster via a DNS name, which could be, but is not
necessarily a KubeDNS name. If it is not a KubeDNS name, but it is a DNS
suffix that is controlled by the cluster admin, then the cluster admin can
retarget it. Would it be possible to do split-horizon DNS for this based
on the PodIP? Does DNS see PodIP as the source IP? Or is it the node IP
of kubelet doing the request? hostNet? Gets confusing.
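
The first option above (keeping namespace-local endpoints and reconciling them) hinges on detecting which namespaces have drifted from the source of truth. A minimal Python sketch of just that comparison step follows. It assumes the addresses have already been fetched by some other means; a real controller would watch the API and write the updates. The function name `namespaces_to_update` is hypothetical.

```python
def namespaces_to_update(source_ips, current_by_namespace):
    """Return, sorted, the namespaces whose local endpoint addresses differ from the source.

    source_ips:            the authoritative list of gluster IPs.
    current_by_namespace:  mapping of namespace name to the IPs currently in its endpoint.
    """
    wanted = set(source_ips)
    return sorted(
        ns for ns, ips in current_by_namespace.items() if set(ips) != wanted
    )
```

Order of addresses is ignored on purpose, so a namespace is only flagged when its set of IPs actually differs.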


@erictune

Okay, prepare for backpedaling.

  • the security concerns might be mitigated if the cluster admin maintained
    a namespace whose only purpose was to hold endpoints for "global
    namespaces", and didn't hold anything else interesting. You could make
    sure that kubelets didn't have write permission to that ns, if you wanted
    to. Or you could make it so that all namespaces can read endpoints in this
    namespace, and make the kubelet act as the Pod's service account when
    resolving the endpoints (using Impersonate-User)
  • the fact that users have the option of maintaining their own local
    endpoints means it is harder for a cluster admin to "move" the location of
    the gluster cluster. The pattern that Eric Paris describes does allow for
    this, because the provisioner can change the contents of the endpoints
    object.
  • If the endpoints are resolved once and then never re-resolved, this is
    going to cause a problem at some point. Periodic re-resolution will be
    more flexible. This could be via DNS, but I guess it could be by a Kube
    client periodically re-resolving endpoints.


@eparis
Member

eparis commented Sep 21, 2016

@erictune although a kube-DNS solution might be even better. Instead of listing ANY endpoints in the volume, list a DNS address. That address could obviously be of the form endpointName.endpointNamespace, but we would not have any cross API object reference. Only the indirect reference via the DNS results...

Not sure that really helps the fact that the user can cause the kubelet to act on their behalf connecting to this address. Then again no solution involving the in pod volume definitions can overcome these issues...

@erictune

Impersonate-User feature would address the latter issue.

@humblec
Contributor

humblec commented Sep 22, 2016

Just to clarify what the code does now: if the endpoint namespace is mentioned in the storage class parameters, the provisioner creates the endpoint/svc in that namespace if they don't exist. As a second case, if the 'endpoint namespace' is filled in directly via the PV volume source spec or a specially crafted glustervolumesource, then at mount time the kubelet gives preference to the ep/svc in the mentioned namespace and uses it. If no namespace is mentioned in the volume source (as it's optional), the pod's namespace is used by the kubelet, i.e. only access happens and no creation of ep/svc happens in the latter case.
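
The mount-time lookup rule described here can be reduced to a one-line fallback. The Python sketch below only illustrates that rule and is not the kubelet's code; the function name `endpoint_lookup_namespace` and its parameter names are hypothetical.

```python
def endpoint_lookup_namespace(volume_endpoint_namespace, pod_namespace):
    """Namespace in which the endpoint is looked up at mount time.

    volume_endpoint_namespace: the optional namespace from the volume source
                               (None or "" when it is not set).
    pod_namespace:             the namespace of the pod mounting the volume.
    """
    # An explicitly set namespace wins; otherwise fall back to the pod's namespace.
    return volume_endpoint_namespace or pod_namespace
```

Only a lookup happens in this path. Creating the endpoint and service is left to the provisioner, as stated above.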

@rootfs
Member

rootfs commented Sep 29, 2016

@erictune @eparis @liggitt DNS looks like a good solution. Can I assume this is the way to go?

@lpabon

lpabon commented Oct 25, 2016

Is this closed by kubernetes/kubernetes#31854 ?

@ReToCode

ReToCode commented Sep 8, 2017

Hello,

A customer using glusterfs in a pretty large OpenShift cluster (3k containers, > 400 PVs) here :)

No, I think this is not closed by kubernetes/kubernetes#31854. With heketi you just create/delete the service & endpoints dynamically, but the problem still remains.

I still don't like that the gluster PV definition is different from all the other storage types. Why is not all the information inside the gluster PV object, as with other storage types? The current solution has several downsides:

  • Mixes things that cluster admins (PV) manage with things project admins manage (Services, Endpoints). Also exposes information to the projects that are not relevant to them.
  • Noise in UI/exports/templates/cli and so on
  • Redundancy (we have >500 namespaces) so we would have 500x the same services/endpoints
  • Updating gluster server IPs is a pain! (even if heketi takes care of parts of it)
  • Project admins will (and already did) delete the configs by mistake (read-only seems like a hack...)

I know about your limitations with changing the existing object. Here are a few ideas:

kind: PersistentVolume
spec:
  ...
  glusterfs:
    path: MyVolume
    endpoints: glusterfs-cluster
    namespace: global-gluster-config  # new optional field

If this field is not specified, the data is taken from the current namespace. If it is specified but the access right is missing, just write an event with an error message. We would have no problem adding this config to one of our global namespaces where everyone has read permissions.

Or even better, add the information directly to the pv object:

kind: PersistentVolume
spec:
  ...
  glusterfs:
    path: MyVolume
    glusterIPs:  # new optional list
    - 10.1.1.1
    - 10.1.1.2
    endpoints: # make optional if possible or we just put a dummy string in there

Then if glusterIPs is present, take these IPs; otherwise fall back to the current solution.

DNS seems also like a good way to go. I would like to "re-activate" the discussion around this topic.

@humblec
Contributor

humblec commented Sep 8, 2017

Hi @ReToCode, you got most of it. One blocker was backward compatibility, or the difficulty of just discarding the endpoint from the spec. What we are working on at the moment is an endpoint namespace in the PV spec; it should take care of this issue.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 19, 2018
@eduardobaitello

eduardobaitello commented Feb 23, 2018

Is that problem solved? I'm in the same situation: my GlusterFS endpoints are in the default namespace and pods in other namespaces can't get them.

Too bad that this issue is stalled :(

@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 25, 2018
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@hostingnuggets

hostingnuggets commented Nov 7, 2018

Same problem here... @eduardobaitello and @arrawatia did you manage to find a workaround for this issue?

@smarterclayton and other OpenShift members: please re-open this issue and investigate a solution or at least a workaround. As it is right now, the GlusterFS persistent storage in OpenShift is useless...

/reopen
/remove-lifecycle rotten

@eduardobaitello

@hostingnuggets I used the ugly workaround: Create an endpoint and a headless service for every namespace in the cluster that has pods using GlusterFS...

@hostingnuggets

@eduardobaitello thanks for your answer. Uh oh yes that's really ugly but I tried it out and it works... This means that an admin needs to create an endpoint and service for each project within the namespace of that project :-(
