
Permission Denied and Directory "/var/lib/postgresql/data/pg" exists but is not empty with NFS PVC #792

Closed
danroot opened this issue Nov 30, 2020 · 7 comments
Labels: question (Usability question, not directly related to an error with the image)

Comments

@danroot commented Nov 30, 2020

I am attempting to set up a postgres container with data files stored in an NFS share. When I attempt this, the pod fails with a status of CrashLoopBackOff. Using kubectl logs, I can see that it first fails with:

fixing permissions on existing directory /var/lib/postgresql/data ... ok
initdb: error: could not create directory "/var/lib/postgresql/data/pg_wal/archive_status": Permission denied
initdb: removing contents of data directory "/var/lib/postgresql/data"
initdb: warning: could not open directory "/var/lib/postgresql/data/global": Permission denied
initdb: warning: could not open directory "/var/lib/postgresql/data/pg_wal": Permission denied
initdb: error: failed to remove contents of data directory
creating subdirectories ...

Then subsequent initialization attempts fail with:

initdb: error: directory "/var/lib/postgresql/data/pg" exists but is not empty
If you want to create a new database system, either remove or empty
the directory "/var/lib/postgresql/data/pg" or run initdb
with an argument other than "/var/lib/postgresql/data/pg".

As best I can tell, what is happening is:

  1. the container starts and sees that the data folder is empty
  2. it creates the folders pg_wal and global
  3. it tries to access them and gets permission denied
  4. it attempts chown -R 999:999 $PGDATA
  5. it continues failing init
  6. subsequent init attempts fail because pg_wal and global are already present in the folder.
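
To confirm steps 3 and 4 (which uid the entrypoint ends up writing as, versus what the NFS export actually allows), a throwaway inspection pod along the lines of the sketch below can show the numeric ownership the server presents on the mount. This is only a sketch: the pod name is made up, and it reuses the PVC from the manifest further down.

apiVersion: v1
kind: Pod
metadata:
  name: nfs-perms-check        # hypothetical name, purely for inspection
spec:
  restartPolicy: Never
  containers:
  - name: check
    image: postgres:13
    # print the uid/gid the container runs as and the numeric ownership of the mounted directory
    command: ["bash", "-c", "id && ls -lna /var/lib/postgresql/data"]
    volumeMounts:
    - mountPath: /var/lib/postgresql/data
      name: postgredb
  volumes:
  - name: postgredb
    persistentVolumeClaim:
      claimName: harbor-database-pvc

kubectl logs nfs-perms-check then shows the uid the container runs as and the ownership/permissions the NFS server reports for the data directory, which can be compared against uid 999 (the postgres user in the official image).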

I see from other issue reports that there have been some changes in this area, since the image previously overwrote data in $PGDATA. I am hoping someone can point me in the right direction. In the end I don't need HA or great performance; I just want a small single-replica postgres pod where the data is persisted to our storage array (which can expose NFS and other flavors) rather than to local storage on the kubernetes host.

A couple of specific questions:

  1. How can I best modify the yaml below so that this scenario works with no errors?
  2. Why must $PGDATA be empty or be overwritten? My expectation is that if database files already exist in the folder, they are mounted and used. Is there a flag or something to cause this behavior?

To repro:
Ensure you have an NFS share exposed and update the yaml below with the correct paths, then run kubectl apply -f nameoffile.yaml
Run kubectl get pods and observe Error, then CrashLoopBackOff
Run kubectl logs <pod> to observe the behavior described above

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: harbor-database-pv  
spec:
  capacity:
    storage: 5Gi 
  accessModes:
  - ReadWriteMany
  nfs: 
    path: /nfsserver/somepath/database
    server: platform.mycompany.com
---              
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
   name: harbor-database-pvc
spec:
   accessModes:
     - ReadWriteMany
   storageClassName: ""
   resources:
     requests:
       storage: "1Gi"
   volumeName: "harbor-database-pv"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: harbor-postgres
spec:
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  replicas: 1
  selector:
    matchLabels:
      app: harbor-postgres
  template:
    metadata:
      labels:
        app: harbor-postgres
    spec:    
      containers:
      - name: postgres
        image: postgres:13
        imagePullPolicy: "Always"
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_USER
          value: pgbench
        - name: PGUSER
          value: pgbench
        - name: POSTGRES_PASSWORD
          value: postgres@123
        - name: PGBENCH_PASSWORD
          value: superpostgres
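        # PGDATA is pointed at a subdirectory of the volume mount; the image docs suggest
        # this when the mount point itself is not guaranteed to be empty (e.g. lost+found)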
        - name: PGDATA
          value: /var/lib/postgresql/data/pg
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: postgredb
      volumes:
      - name: postgredb
        persistentVolumeClaim:
          claimName: harbor-database-pvc
@wglambert added the question label (Usability question, not directly related to an error with the image) Nov 30, 2020
@wglambert

This looks relevant #116 (comment)

But otherwise I would try asking over at the Docker Community Forums, Docker Community Slack, or Stack Overflow, since the issue is related to the host environment and not something we could alleviate in the image.

@yosifkit (Member) commented Dec 1, 2020

Likely you need to set the container to run as the uid of the owner of the NFS mount.

Similar related issues:
#589, #563, #361, #213
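
A minimal sketch of that, assuming the exported directory is owned by uid/gid 999 (the uid the official image uses for its postgres user). In the Deployment from the issue it would sit at spec.template.spec, next to containers:

      securityContext:
        runAsUser: 999     # run the container as the uid that owns the NFS export
        runAsGroup: 999
        fsGroup: 999       # note: NFS volumes often do not honor fsGroup, so the export's ownership still matters

When the container does not start as root, the image's entrypoint skips its chown/gosu step and runs initdb directly as that uid, so that uid simply has to be able to write to the exported directory.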

@danroot (Author) commented Dec 3, 2020

@wglambert yes, that issue does seem relevant. I'll go through each of the suggested remedies again; I tried several prior to opening this issue with no luck. For example, the annotation jsvp mentions (volume.beta.kubernetes.io/mount-options: "dir_mode=0777,file_mode=0777,uid=1000,gid=1000") is not relevant to NFS PVs.

@yosifkit I think you're right that it is related to the uid:gid of the container and the NFS mount, but I'm unable to find any combination of chmod, chown, etc. that works. I tinkered with setting securityContext for the container to no avail. So I think you're right about the cause, but I'm not versed enough in k8s or even basic linux/nfs permissions to resolve it. I've tried several of the recommendations in those issues, but will go back through them to be sure I didn't miss anything.

The reason I feel this belongs in docker-library/postgres and not the main postgres repo is that a lot of the permissions changes happen in the image itself, for example lines 15-22 and 182 of the Dockerfile. In the end, I think something about those lines is breaking for NFS PVs, or at least for our particular NFS setup. I do have a Stack Overflow post active on this as well and will update here if I get a resolution there.

@rnantes commented Dec 20, 2020

@yosifkit I am doing what you suggest by setting the uid and gid to the owner of the directory, as well as replacing the container's /etc/passwd with the host's /etc/passwd, however I am still getting permission errors such as:

[28] LOG: could not link file "pg_wal/xlogtemp.28" to "pg_wal/000000010000000000000001": Operation not permitted
[28] FATAL: could not open file "pg_wal/000000010000000000000001": No such file or directory

@arkonsolutions

Hello. Working for me:

HOST:
sudo useradd -u 999 postgres
cd /path/to/postgres/data/on/host
sudo chown 999 .
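# 999 is the uid the official postgres image uses for its postgres user, so giving the
# exported directory that owner on the NFS server side lets the in-container process write to it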

@zero1zero

After a very frustrating couple of evenings working through this, in my case it was due to the minikube Vagrant implementation (vs. the Docker default). Moving away from Vagrant resolved all of these errors.

@tianon (Member) commented Jun 10, 2022

For further assistance debugging this (and/or issues like it), I'd suggest trying a dedicated support forum, such as the Docker Community Forums, the Docker Community Slack, or Stack Overflow.

@tianon closed this as completed Jun 10, 2022