DietPi-Software | Uninstalling netdata breaks system #2336

Closed
Node815 opened this issue Dec 9, 2018 · 8 comments
@Node815

Node815 commented Dec 9, 2018

Details:

  • Date | Sat 8 Dec 21:21:09 PST 2018
  • Bug report | N/A
  • DietPi version | v6.19.6 (Fourdee/master)
  • Img creator | DietPi Core Team
  • Pre-image | Raspbian Lite
  • SBC device | RPi 2 Model B (armv7l) (index=2)
  • Kernel version | #1159 SMP Sun Nov 4 17:50:20 GMT 2018
  • Distro | stretch (index=4)
  • Command | G_AGP: netdata
  • Exit code | 100
  • Software title | DietPi-Software

Steps to reproduce:

Use dietpi-software to uninstall Netdata

Expected behaviour:

Netdata should uninstall

Actual behaviour:

Error as mentioned below

Extra details:

This is a fresh install of DietPi. The only modifications made were installing netdata, OpenSSH server and the LCD driver for the Waveshare2 LCD.

Additional logs:

Log file contents:
Reading package lists...
Building dependency tree...
Reading state information...
The following packages will be REMOVED:
  netdata*
0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
After this operation, 8,057 kB disk space will be freed.
dpkg: warning: 'ldconfig' not found in PATH or not executable
dpkg: warning: 'start-stop-daemon' not found in PATH or not executable
dpkg: error: 2 expected programs not found in PATH or not executable
Note: root's PATH should usually contain /usr/local/sbin, /usr/sbin and /sbin
E: Sub-process /usr/bin/dpkg returned an error code (2)
@MichaIng
Owner

MichaIng commented Dec 9, 2018

@PDXUser
Thanks for your report.

Strange, ldconfig and start-stop-daemon should be fixed parts of the core system.


Just tried it here. And yep, somewhere during the uninstall, the binaries from $PATH are suddenly not available anymore 🤔:

 DietPi-Software
─────────────────────────────────────────────────────
 Mode: Uninstalling NetData: real-time performance monitoring

userdel: netdata mail spool (/var/mail/netdata) not found
userdel: error removing directory /
/DietPi/dietpi/dietpi-software: line 13735: groupdel: command not found
/DietPi/dietpi/func/dietpi-globals: line 1650: dpkg: command not found
[ INFO ] DietPi-Software | Not installed, ignoring: netdata
[ INFO ] DietPi-Software | None of the requested packages are currently installed. Aborting...
[  OK  ] DietPi-Software | G_AGP: netdata
rm: cannot remove '/usr/sbin/netdata': No such file or directory
rm: cannot remove '/usr/share/netdata': No such file or directory
rm: cannot remove '/usr/libexec/netdata': No such file or directory

 DietPi-Software
─────────────────────────────────────────────────────
 Mode: Finalize uninstall

[  OK  ] DietPi-Software | APT autoremove + purge, please wait...
/DietPi/dietpi/func/dietpi-globals: line 1709: tee: command not found

/DietPi/dietpi/func/dietpi-globals: line 1137: tail: command not found
[FAILED] DietPi-Software | G_AGA
ps: error while loading shared libraries: liblz4.so.1: cannot open shared object file: No such file or directory

The $PATH var is good and binaries are still in place, but somehow they are not found in every dir:

root@VM-Stretch:~# echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
root@VM-Stretch:~# l /bin/which
-rwxr-xr-x 1 root root 946 Apr  2  2017 /bin/which
root@VM-Stretch:~# which which
-bash: /usr/bin/which: No such file or directory
root@VM-Stretch:~# l /usr/bin/which
ls: cannot access '/usr/bin/which': No such file or directory
  • /bin/which is available ...
    BOOM, the /usr directory has been completely removed O_O:
root@VM-Stretch:~# l /
total 72
drwxrwxrwt  3 root root   100 Dec  9 14:13 DietPi
-rwxr-xr-x  1 root root   673 Oct 24 20:29 bench
-rw-r--r--  1 root root    31 Oct 24 20:29 bench_result
drwxr-xr-x  2 root root  4096 Nov 13 17:34 bin
drwxr-xr-x  4 root root  4096 Nov 13 17:34 boot
drwxr-xr-x 15 root root  2860 Dec  9 14:13 dev
drwxr-xr-x 59 root root  4096 Dec  9 14:23 etc
drwxr-xr-x  3 root root  4096 Oct  6 16:11 home
lrwxrwxrwx  1 root root    29 Aug 23 21:57 initrd.img -> boot/initrd.img-4.9.0-8-amd64
lrwxrwxrwx  1 root root    29 Aug 23 22:02 initrd.img.old -> boot/initrd.img-4.9.0-8-amd64
drwxr-xr-x 14 root root  4096 Oct  6 14:57 lib
drwxr-xr-x  2 root root  4096 Jul 20 12:57 lib64
drwx------  2 root root 16384 Jul 20 12:55 lost+found
drwxr-xr-x  3 root root  4096 Nov  8 20:11 mnt
drwxr-xr-x  2 root root  4096 Jul 20 12:57 opt
dr-xr-xr-x 83 root root     0 Dec  9 14:13 proc
drwx------  5 root root  4096 Dec  4 15:41 root
drwxr-xr-x 14 root root   420 Dec  9 14:13 run
drwxr-xr-x  2 root root  4096 Nov 13 17:34 sbin
drwxr-xr-x  2 root root  4096 Jul 20 12:57 srv
dr-xr-xr-x 13 root root     0 Dec  9 14:25 sys
drwxrwxrwt  7 root root   140 Dec  9 14:30 tmp
drwxr-xr-x 12 root root  4096 Dec  9 14:15 var
lrwxrwxrwx  1 root root    26 Aug 23 21:57 vmlinuz -> boot/vmlinuz-4.9.0-8-amd64
lrwxrwxrwx  1 root root    26 Aug 23 22:02 vmlinuz.old -> boot/vmlinuz-4.9.0-8-amd64
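
A quick way to spot this situation is to check whether each directory listed in $PATH still exists. The following is a minimal sketch, not part of DietPi; the function name `missing_path_dirs` is made up for illustration. It only uses shell builtins, so it should keep working when /usr is gone. On the system shown above, the four /usr entries would be expected to show up as missing.

```shell
# Print every entry of a PATH-style string that is not an existing directory.
# Hypothetical helper for illustration, not part of DietPi.
missing_path_dirs() {
    old_ifs=$IFS
    IFS=:                      # split the argument on colons
    for dir in $1; do
        [ -d "$dir" ] || printf '%s\n' "$dir"
    done
    IFS=$old_ifs               # restore the previous field separator
}

# Check the PATH of the current shell
missing_path_dirs "$PATH"
```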

@MichaIng MichaIng added this to the v6.20 milestone Dec 9, 2018
@MichaIng
Owner

MichaIng commented Dec 9, 2018

Due to: useradd -r netdata -c netdata -s /usr/sbin/nologin -d /

  • The user's home dir is the root directory (/). That is a no-go, for several reasons!

On uninstall: userdel -rf netdata

  • The script attempts to remove the user's home dir, so the root directory...
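
One way to guard against this class of problem is to look at the account's home directory before passing `-r` to userdel. The sketch below is an assumption about how such a guard could look, not the fix that went into DietPi; `is_safe_home_to_remove` and `remove_user` are hypothetical names, and the list of protected directories is only an example.

```shell
# Hypothetical guard, not part of DietPi: decide whether a home directory
# may be deleted together with its user.
is_safe_home_to_remove() {
    home=$1
    # Strip trailing slashes so that "//" or "/usr/" are treated like "/" and "/usr"
    while [ "$home" != / ] && [ "${home%/}" != "$home" ]; do
        home=${home%/}
    done
    case "$home" in
        ''|/|/usr|/usr/*|/bin|/sbin|/lib|/lib64|/etc|/boot|/var|/root|/home|/dev|/proc|/sys|/run)
            return 1 ;;   # empty, root dir or a system directory: never remove
        /*) return 0 ;;   # any other absolute path
        *)  return 1 ;;   # relative paths are rejected
    esac
}

# Remove a user, deleting the home directory only when the guard allows it.
remove_user() {
    user=$1
    home=$(getent passwd "$user" | cut -d: -f6)
    if is_safe_home_to_remove "$home"; then
        userdel -r "$user"
    else
        userdel "$user"    # keep the directory, remove only the account
    fi
}
```

With a home dir of `/`, `remove_user netdata` would fall through to a plain `userdel netdata`.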

Okay, this is serious. As far as I can see, only /usr got removed and everything else is untouched. But this already breaks system functionality completely.

@PDXUser
Do you have a backup of SDcard or from dietpi-backup?

Most basic navigation and file manipulation commands are still in place: mv, cp, cd, mkdir, rm and so on. So it is possible to copy a /usr dir from a backup into place, which should recover functionality.
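
A rough sketch of such a restore, assuming the backup is already mounted somewhere and contains a `usr` directory from the same distro and architecture. The function name `restore_usr` and the mount point in the usage line are hypothetical; `cp -a` is used to preserve permissions, ownership and symlinks.

```shell
# Hypothetical helper: copy the usr directory from a mounted backup into place.
# $1 = root of the mounted backup (assumption, e.g. /mnt/backup)
# $2 = root of the broken system (normally /)
restore_usr() {
    backup_root=${1%/}
    target_root=${2%/}         # "/" becomes "", so the target is /usr
    if [ ! -d "$backup_root/usr" ]; then
        echo "no usr directory found in $1" >&2
        return 1
    fi
    mkdir -p "$target_root/usr" &&
    cp -a "$backup_root/usr/." "$target_root/usr/"
}

# Usage (mount point is an assumption):
#   restore_usr /mnt/backup /
```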

@MichaIng
Owner

MichaIng commented Dec 9, 2018

ToDo:

root@VM-Stretch:~# systemctl status netdata
● netdata.service - Real time performance monitoring
   Loaded: loaded (/lib/systemd/system/netdata.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2018-12-09 16:14:38 CET; 49s ago
  Process: 11212 ExecStartPre=/bin/chown -R netdata:netdata /var/run/netdata (code=exited, status=0/SUCCESS)
  Process: 11209 ExecStartPre=/bin/mkdir -p /var/run/netdata (code=exited, status=0/SUCCESS)
  Process: 11207 ExecStartPre=/bin/chown -R netdata:netdata /var/cache/netdata (code=exited, status=0/SUCCESS)
  Process: 11204 ExecStartPre=/bin/mkdir -p /var/cache/netdata (code=exited, status=0/SUCCESS)
 Main PID: 11215 (netdata)
    Tasks: 17 (limit: 4915)
   CGroup: /system.slice/netdata.service
           ├─11215 /usr/sbin/netdata -P /var/run/netdata/netdata.pid -D -W set global process scheduling policy keep -W set global OOM score keep
           ├─11240 bash /usr/libexec/netdata/plugins.d/tc-qos-helper.sh 1
           └─11252 /usr/libexec/netdata/plugins.d/apps.plugin 1

Dec 09 16:14:38 VM-Stretch systemd[1]: Starting Real time performance monitoring...
Dec 09 16:14:38 VM-Stretch systemd[1]: Started Real time performance monitoring.
root@VM-Stretch:~# cat /lib/systemd/system/netdata.service
# SPDX-License-Identifier: GPL-3.0-or-later
[Unit]
Description=Real time performance monitoring

# append here other services you want netdata to wait for them to start
After=network.target httpd.service squid.service nfs-server.service mysqld.service mysql.service named.service postfix.service chronyd.service

[Service]
Type=simple
User=netdata
Group=netdata
RuntimeDirectory=netdata
RuntimeDirectoryMode=0775
PIDFile=/var/run/netdata/netdata.pid
ExecStart=/usr/sbin/netdata -P /var/run/netdata/netdata.pid -D -W set global 'process scheduling policy' 'keep' -W set global 'OOM score' 'keep'
ExecStartPre=/bin/mkdir -p /var/cache/netdata
ExecStartPre=/bin/chown -R netdata:netdata /var/cache/netdata
ExecStartPre=/bin/mkdir -p /var/run/netdata
ExecStartPre=/bin/chown -R netdata:netdata /var/run/netdata
#ExecStopPost=/bin/rm /var/run/netdata/netdata.pid
PermissionsStartOnly=true

# saving a big db on slow disks may need some time
TimeoutStopSec=60

# restart netdata if it crashes
Restart=on-failure
RestartSec=30

# The minimum netdata Out-Of-Memory (OOM) score.
# netdata (via [global].OOM score in netdata.conf) can only increase the value set here.
# To decrease it, set the minimum here and set the same or a higher value in netdata.conf.
# Valid values: -1000 (never kill netdata) to 1000 (always kill netdata).
OOMScoreAdjust=1000

# Valid policies: other (the system default) | batch | idle | fifo | rr
# To give netdata the max priority, set CPUSchedulingPolicy=rr and CPUSchedulingPriority=99
CPUSchedulingPolicy=idle

# This sets the scheduling priority (for policies: rr and fifo).
# Priority gets values 1 (lowest) to 99 (highest).
#CPUSchedulingPriority=1

# For scheduling policy 'other' and 'batch', this sets the lowest niceness of netdata (-20 highest to 19 lowest).
#Nice=0

[Install]
WantedBy=multi-user.target
  • netdata:x:999:999:netdata:/var/lib/netdata:/usr/sbin/nologin: the user's home dir is /var/lib/netdata, which makes total sense.
  • Compare with the official install method, which includes git cloning and compiling from source: https://docs.netdata.cloud/installer/#one-line-installation
  • It is good to have our own deb to avoid the bunch of pre-reqs, clone and build steps. But our version could be updated to the current v1.11.X. No chance to use the Debian repo, which is frozen at 1.6 (stretch-backports).
  • Use the netdata user to run the service and switch its home dir to /var/lib/netdata.
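
To check which home dir an account actually has, the sixth colon-separated field of its passwd entry can be read. A minimal sketch using only shell builtins follows; `passwd_home` is a made-up name, and the commands in the comments are suggestions rather than what DietPi does.

```shell
# Print the home directory (6th field) of each passwd-format line on stdin.
# Hypothetical helper for illustration.
passwd_home() {
    while IFS=: read -r _name _pw _uid _gid _gecos home _shell; do
        printf '%s\n' "$home"
    done
}

# Usage on a live system (assumes getent is available):
#   getent passwd netdata | passwd_home
# Changing the home dir of an existing account could then be done with:
#   usermod -d /var/lib/netdata netdata
```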

@MichaIng
Owner

MichaIng commented Dec 9, 2018

@Node815
Author

Node815 commented Dec 9, 2018

@PDXUser
Do you have a backup of SDcard or from dietpi-backup?

Not really worried in my case about having a backup - because it was a completely fresh install, so I really did not have a full installation going.

@MichaIng
Owner

MichaIng commented Dec 9, 2018

@PDXUser
Okay, so at least it is no big deal to do a fresh install then. Big sorry for currently having a system-destroying uninstaller in the code. Sadly Fourdee is not available for a few days. I am thinking about how to handle this best.

To fix the issue for now (although, since you uninstalled netdata, you most likely do not want to install it again):
sed -i 's/userdel -rf netdata/userdel netdata/g' /DietPi/dietpi/dietpi-software
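
A slightly more cautious variant of the same substitution keeps a backup copy and verifies that no `userdel -rf netdata` is left afterwards. This is only a sketch; `patch_userdel` is a hypothetical wrapper and it relies on the `-i.bak` in-place suffix of GNU sed.

```shell
# Apply the substitution to the given file, keeping the original as FILE.bak,
# and succeed only if the dangerous call is no longer present afterwards.
patch_userdel() {
    file=$1
    sed -i.bak 's/userdel -rf netdata/userdel netdata/g' "$file" &&
    ! grep -q 'userdel -rf netdata' "$file"
}

# Usage:
#   patch_userdel /DietPi/dietpi/dietpi-software
```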

There is more to do about the netdata installer: Do not use the root dir as the user's home dir. Decide whether to run netdata as root (as currently), and then not create a netdata user at all; or run it as the netdata user (preferred), which requires more complicated dir permissions to preserve full features.
However, I need Fourdee's help for this.

Check current netdata install count: https://dietpi.com/survey/

  • ~1500 installs
  • The last uninstaller rework is already some versions old, so the majority of users will be affected by this.
  • I see no other solution than doing another quick hotfix update, 6.19.7, which stops after applying the above as a quick pre-patch. We cannot risk further systems being destroyed by this. It is nearly impossible to repair without a backup, since even the APT binaries and several core libraries are lost.

@MichaIng MichaIng changed the title Unable to uninstall netdata DietPi-Software | Uninstalling netdata breaks system Dec 9, 2018
MichaIng referenced this issue Dec 10, 2018
- DietPi-Software | Netdata: Resolved an issue, where uninstalling netdata led to a broken system. Thanks to @PDXUser for reporting this issue: https://github.com/Fourdee/DietPi/issues/2336
- DietPi-Software | Docker: Resolved an issue on RPi, where a faulty "docker-ce" version from repository prevents service start on Raspberry Pi. Thanks to @iAreSee and @garret for reporting this issue, finding and testing workarounds: https://github.com/Fourdee/DietPi/issues/2282
@MichaIng
Owner

Fixed with the just released v6.19.7.

Will close this issue. Further enhancements on netdata install will be done for v6.20

@Fourdee
Collaborator

Fourdee commented Dec 27, 2018

All systems currently receive the following error when accessing the web interface from a remote system (notice the 2x //):

Access to file is not permitted: /usr/share/netdata/web//index.html

Fixed with:

chown -R netdata:netdata /usr/share/netdata/web
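
To confirm that the ownership change took effect, the web directory can be searched for anything not owned by the expected user. A small sketch, with `not_owned_by` as a made-up helper name; it prints nothing when every path has the expected owner.

```shell
# List all paths below a directory that are NOT owned by the given user.
# Hypothetical helper for illustration.
not_owned_by() {
    find "$1" ! -user "$2"
}

# Usage:
#   not_owned_by /usr/share/netdata/web netdata
```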
