k0s reset deleted all data on persistent volumes #4318

Closed
4 tasks done
devsjc opened this issue Apr 20, 2024 · 11 comments · Fixed by #5186 or #5193 · May be fixed by #5187


devsjc commented Apr 20, 2024

Before creating an issue, make sure you've checked the following:

  • You are running the latest released version of k0s
  • Make sure you've searched for existing issues, both open and closed
  • Make sure you've searched for PRs too, a fix might've been merged already
  • You're looking at docs for the released version, "main" branch docs are usually ahead of released versions.

Platform

Linux 6.1.0-18-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux

Version

v1.29.3+k0s.0

Sysinfo

`k0s sysinfo`
Machine ID: "bf8222b2f95c64426d710b2c00cc6e5eb49618a69a861782356308a69f99328c" (from machine) (pass)
Total memory: 15.3 GiB (pass)
Disk space available for /var/lib/k0s: 13.5 GiB (pass)
Name resolution: localhost: [::1 127.0.0.1] (pass)
Operating system: Linux (pass)
  Linux kernel release: 6.1.0-18-amd64 (pass)
  Max. file descriptors per process: current: 1048576 / max: 1048576 (pass)
  AppArmor: active (pass)
  Executable in PATH: modprobe: exec: "modprobe": executable file not found in $PATH (warning)
  Executable in PATH: mount: /usr/bin/mount (pass)
  Executable in PATH: umount: /usr/bin/umount (pass)
  /proc file system: mounted (0x9fa0) (pass)
  Control Groups: version 2 (pass)
    cgroup controller "cpu": available (is a listed root controller) (pass)
    cgroup controller "cpuacct": available (via cpu in version 2) (pass)
    cgroup controller "cpuset": available (is a listed root controller) (pass)
    cgroup controller "memory": available (is a listed root controller) (pass)
    cgroup controller "devices": unknown (warning: insufficient permissions, try with elevated permissions)
    cgroup controller "freezer": available (cgroup.freeze exists) (pass)
    cgroup controller "pids": available (is a listed root controller) (pass)
    cgroup controller "hugetlb": available (is a listed root controller) (pass)
    cgroup controller "blkio": available (via io in version 2) (pass)
  CONFIG_CGROUPS: Control Group support: built-in (pass)
    CONFIG_CGROUP_FREEZER: Freezer cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_PIDS: PIDs cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_DEVICE: Device controller for cgroups: built-in (pass)
    CONFIG_CPUSETS: Cpuset support: built-in (pass)
    CONFIG_CGROUP_CPUACCT: Simple CPU accounting cgroup subsystem: built-in (pass)
    CONFIG_MEMCG: Memory Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_HUGETLB: HugeTLB Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_SCHED: Group CPU scheduler: built-in (pass)
      CONFIG_FAIR_GROUP_SCHED: Group scheduling for SCHED_OTHER: built-in (pass)
        CONFIG_CFS_BANDWIDTH: CPU bandwidth provisioning for FAIR_GROUP_SCHED: built-in (pass)
    CONFIG_BLK_CGROUP: Block IO controller: built-in (pass)
  CONFIG_NAMESPACES: Namespaces support: built-in (pass)
    CONFIG_UTS_NS: UTS namespace: built-in (pass)
    CONFIG_IPC_NS: IPC namespace: built-in (pass)
    CONFIG_PID_NS: PID namespace: built-in (pass)
    CONFIG_NET_NS: Network namespace: built-in (pass)
  CONFIG_NET: Networking support: built-in (pass)
    CONFIG_INET: TCP/IP networking: built-in (pass)
      CONFIG_IPV6: The IPv6 protocol: built-in (pass)
    CONFIG_NETFILTER: Network packet filtering framework (Netfilter): built-in (pass)
      CONFIG_NETFILTER_ADVANCED: Advanced netfilter configuration: built-in (pass)
      CONFIG_NF_CONNTRACK: Netfilter connection tracking support: module (pass)
      CONFIG_NETFILTER_XTABLES: Netfilter Xtables support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_REDIRECT: REDIRECT target support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_COMMENT: "comment" match support: module (pass)
        CONFIG_NETFILTER_XT_MARK: nfmark target and match support: module (pass)
        CONFIG_NETFILTER_XT_SET: set target and match support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_MASQUERADE: MASQUERADE target support: module (pass)
        CONFIG_NETFILTER_XT_NAT: "SNAT and DNAT" targets support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: "addrtype" address type match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_CONNTRACK: "conntrack" connection tracking match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_MULTIPORT: "multiport" Multiple port match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_RECENT: "recent" match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_STATISTIC: "statistic" match support: module (pass)
      CONFIG_NETFILTER_NETLINK: module (pass)
      CONFIG_NF_NAT: module (pass)
      CONFIG_IP_SET: IP set support: module (pass)
        CONFIG_IP_SET_HASH_IP: hash:ip set support: module (pass)
        CONFIG_IP_SET_HASH_NET: hash:net set support: module (pass)
      CONFIG_IP_VS: IP virtual server support: module (pass)
        CONFIG_IP_VS_NFCT: Netfilter connection tracking: built-in (pass)
        CONFIG_IP_VS_SH: Source hashing scheduling: module (pass)
        CONFIG_IP_VS_RR: Round-robin scheduling: module (pass)
        CONFIG_IP_VS_WRR: Weighted round-robin scheduling: module (pass)
      CONFIG_NF_CONNTRACK_IPV4: IPv4 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_REJECT_IPV4: IPv4 packet rejection: module (pass)
      CONFIG_NF_NAT_IPV4: IPv4 NAT: unknown (warning)
      CONFIG_IP_NF_IPTABLES: IP tables support: module (pass)
        CONFIG_IP_NF_FILTER: Packet filtering: module (pass)
          CONFIG_IP_NF_TARGET_REJECT: REJECT target support: module (pass)
        CONFIG_IP_NF_NAT: iptables NAT support: module (pass)
        CONFIG_IP_NF_MANGLE: Packet mangling: module (pass)
      CONFIG_NF_DEFRAG_IPV4: module (pass)
      CONFIG_NF_CONNTRACK_IPV6: IPv6 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_NAT_IPV6: IPv6 NAT: unknown (warning)
      CONFIG_IP6_NF_IPTABLES: IP6 tables support: module (pass)
        CONFIG_IP6_NF_FILTER: Packet filtering: module (pass)
        CONFIG_IP6_NF_MANGLE: Packet mangling: module (pass)
        CONFIG_IP6_NF_NAT: ip6tables NAT support: module (pass)
      CONFIG_NF_DEFRAG_IPV6: module (pass)
    CONFIG_BRIDGE: 802.1d Ethernet Bridging: module (pass)
      CONFIG_LLC: module (pass)
      CONFIG_STP: module (pass)
  CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: module (pass)
  CONFIG_PROC_FS: /proc file system support: built-in (pass)

What happened?

k0s reset, followed by a node reboot, deleted all files from all persistent volumes, irrespective of their Retain policies. The directories remained but were completely empty.

Steps to reproduce

I have not managed to reproduce the error (thankfully!)

Expected behavior

Persistent volumes with the Retain policy are left untouched on a reset; only k0s' /var/lib/k0s directory gets cleaned.

Actual behavior

See "What Happened?"

Screenshots and logs

I have the full set of logs from sudo journalctl -u k0scontroller -r -U 2024-04-13 -S 2024-04-11 but I imagine a more focussed subset is more useful!

Additional context

Firstly, thanks for the great tool!

I have previously run k0s reset a fair few times without issue, with no changes to the way volumes were mounted or to the services running in the cluster. All I can think of that separates this one from the others is the reason the reset was needed:

This specific reset was prompted by an issue with the helm extensions: removing a chart from the k0s config YAML did not uninstall the chart, but instead put the cluster in an unstartable state. The config was installed into the controller with

$ sudo k0s install controller --single -c ~/hs-infra/k0s.yaml                                                  

And k0s stop was run before making changes to the config. The only changes to k0s.yaml were from

extensions:
  helm:
    repositories:
      - name: tailscale
        url: https://pkgs.tailscale.com/helmcharts
    charts:
      - name: tailscale-operator
        namespace: tailscale
        chartname: tailscale/tailscale-operator
      - name: nginx-gateway
        namespace: nginx-gateway
        chartname: oci://ghcr.io/nginxinc/charts/nginx-gateway-fabric

to

extensions:
  helm:
    repositories: null
    charts: null

k0s start would then error with logs such as

time="2024-04-12 11:58:15" level=info msg="Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference" Chart="{k0s-addon-chart-nginx-gateway kube-system}" component=extensions_controller controller=chart controllerGroup=helm.k0sproject.io controllerKind=Chart name=k0s-addon-chart-nginx-gateway namespace=kube-system

or

time="2024-04-12 10:58:36" level=info msg="Warning: Reconciler returned both a non-zero result and a non-nil error. The result will always be ignored if the error is non-nil and the non-nil error causes reqeueuing with exponential backoff. For more details, see: https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/reconcile#Reconciler" Chart="{k0s-addon-chart-tailscale-operator kube-system}" component=extensions_controller controller=chart controllerGroup=helm.k0sproject.io controllerKind=Chart name=k0s-addon-chart-tailscale-operator namespace=kube-system

Before running k0s reset, I tried to resolve the error and start the cluster by modifying /var/lib/k0s/helmhome/repositories.yaml to remove references to the charts, but this didn't work either. I therefore ran k0s reset, as I had a few times before, and performed the node reboot as requested. However, on restarting the server, all files in any mounted volumes had been deleted: definitely deleted, and not just hidden or moved, as an inspection of available space revealed. Strangely enough, the folder structure within the volumes remained, but only as empty directories.

If it helps, here are some manifest snippets!

Persistent Volume manifest example
apiVersion: v1
kind: PersistentVolume
metadata:
  name: media-pv
  labels:
    type: local
    disk: hdd
spec:
  storageClassName: manual
  capacity:
    storage: 4T
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/mnt/sda1/media"

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: appdata-pv
  labels:
    type: local
    disc: ssd
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/home/<user>/appdata"
Persistent Volume Claims manifest example
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: media-pvc
  namespace: default
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4T
  volumeName: media-pv

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: appdata-pvc
  namespace: default
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: appdata-pv
Volume mount in deployment example
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 1
  revisionHistoryLimit: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: plex
  template:
    metadata:
      labels:
        app: plex
    spec:
      hostNetwork: True
      volumes:
        - name: plex-vol
          persistentVolumeClaim:
            claimName: appdata-pvc
        - name: media-vol
          persistentVolumeClaim:
            claimName: media-pvc
...
          volumeMounts:
            - name: plex-vol
              mountPath: /config
              subPath: plex/config
            - name: media-vol
              mountPath: /media
@devsjc devsjc added the bug Something isn't working label Apr 20, 2024
Contributor

kke commented Apr 22, 2024

The reset command is a bit dangerous in any case, as it doesn't require any kind of confirmation or extra flag: just trying k0s reset will immediately destroy everything.


banjoh commented Apr 26, 2024

Would having a prompt make the UX better, and perhaps reduce the chance of such occurrences? I'd add a --force flag or something similar to allow automation tools to work.

It goes without saying that this would be a breaking change.

Contributor

kke commented Apr 26, 2024

Well, there is of course the safeguard that k0s must not be running.

The main problem in this case is that reset first tries to unmount such volumes, but even if it fails to do so, it does not abort. It proceeds to recursively delete data under datadir/rundir, which includes data on any volumes still mounted under them.

Author

devsjc commented Apr 29, 2024

Ah, that's useful to know, thanks! I didn't realise that data on mounted volumes could be deleted during a reset. The docs (reset.md) say that the main data stored on the node that will be purged is the k0s directory, which I didn't think would include retained mounts. Before mentioning this, though, on the topic of the deleted data locations, they do say that

This includes, but is not limited to, the following

So perhaps I should have proceeded with a little more caution! I do wonder whether a bullet point could be added to the docs there that includes something similar to this line

proceeds to recursively delete data under datadir/rundir, which includes data on any volumes mounted under them

Happy to make a PR for it, if it helps! Otherwise good to close this issue.

Contributor

The issue is marked as stale since no activity has been recorded in 30 days

@github-actions github-actions bot added the Stale label May 29, 2024
@kke kke removed the Stale label May 30, 2024
@jnummelin
Member

@devsjc PRs are always welcome :D

Contributor

The issue is marked as stale since no activity has been recorded in 30 days

@github-actions github-actions bot added the Stale label Jun 30, 2024
@github-actions github-actions bot closed this as not planned Jul 8, 2024

MichaelDausmann commented Oct 15, 2024

@jnummelin Please reopen this issue. This is a serious problem and, IMO, a critical design flaw (yes, even 'just' clearing /var/lib/k0s). This behaviour recently caused a serious data loss incident at my organisation, where a k0s reset command was executed prior to unmounting critical data as part of a large and complex release.

If we did not have backup policies in place, this would have been an existential problem for our organisation. As it was, it caused a week of headaches and a large cloud egress bill for restoring deleted data.

There is just no way that silent deletion of data is an acceptable behaviour for any software that is intended for enterprise use.

We have DevOps capability in my team, happy to contribute with some guidance.

//from pkg/cleanup/directories.go
	var dataDirMounted bool

	// search and unmount kubelet volume mounts
	for _, v := range procMounts {
		if v.Path == filepath.Join(d.Config.dataDir, "kubelet") {
			logrus.Debugf("%v is mounted! attempting to unmount...", v.Path)
			if err = mounter.Unmount(v.Path); err != nil {
				logrus.Warningf("failed to unmount %v", v.Path)
			}
		} else if v.Path == d.Config.dataDir {
			dataDirMounted = true
		}
	}
...etc
if err := os.RemoveAll(d.Config.dataDir); err != nil {

As discussed above, failing to unmount does not prevent the RemoveAll from executing.

Member

twz123 commented Oct 29, 2024

I think the safest option is to implement recursive directory removal using openat2 along with RESOLVE_NO_XDEV, if available, and skip over mount points.

Collaborator

ncopa commented Nov 6, 2024

We definitely need to refactor this. The problem is bigger than crossing mount points. It can delete completely unrelated stuff.

mkdir -p /tmp/camerun/netnsomething/foo
echo data > /tmp/camerun/netnsomething/foo/data
sudo mount --bind /tmp/camerun/netnsomething/foo /tmp/camerun/netnsomething/

Start k0s controller --enable-worker and wait until there are pods running. Then stop it and run k0s reset.

and boom! the data is gone as collateral damage.

EDIT: this is a separate issue.

@ncopa ncopa self-assigned this Nov 6, 2024
Member

twz123 commented Nov 6, 2024

A bind mount is a mount point, isn't it? Btw, I've been investigating this for quite some days already.

ncopa added a commit to ncopa/k0s that referenced this issue Nov 8, 2024
Make sure that we don't have anything mounted under those directories so
we don't delete persistent data.

We do this by parsing /proc/mounts in reverse order as it is listed in
mount order, and then we unmount anything that is under our directories
before we delete them.

Don't umount datadir itself if it is on a separate partition/mount

Fixes k0sproject#4318

Signed-off-by: Natanael Copa <[email protected]>
@ncopa ncopa closed this as completed in 7ab81aa Nov 29, 2024
k0s-bot pushed a commit that referenced this issue Nov 29, 2024
Make sure that we don't have anything mounted under those directories so
we don't delete persistent data.

We do this by parsing /proc/mounts in reverse order as it is listed in
mount order, and then we unmount anything that is under our directories
before we delete them.

Don't umount datadir itself if it is on a separate partition/mount

Fixes #4318

Signed-off-by: Natanael Copa <[email protected]>
(cherry picked from commit 7ab81aa)
twz123 pushed a commit to twz123/k0s that referenced this issue Dec 11, 2024
Make sure that we don't have anything mounted under those directories so
we don't delete persistent data.

We do this by parsing /proc/mounts in reverse order as it is listed in
mount order, and then we unmount anything that is under our directories
before we delete them.

Don't umount datadir itself if it is on a separate partition/mount

Fixes k0sproject#4318

Signed-off-by: Natanael Copa <[email protected]>
(cherry picked from commit 7ab81aa)
(cherry picked from commit 54fb78c)