Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rancher Desktop 1.14.2 wiped my .zshrc file. #7154

Closed
conor-mccullough opened this issue Jul 8, 2024 · 15 comments · Fixed by #7206
Closed

Rancher Desktop 1.14.2 wiped my .zshrc file. #7154

conor-mccullough opened this issue Jul 8, 2024 · 15 comments · Fixed by #7206
Assignees
Labels
kind/bug Something isn't working platform/macos priority/1 Work should be fixed for next release triage/confirmed Issue has been reproduced by dev team
Milestone

Comments

@conor-mccullough
Copy link

Actual Behavior

Unfortunately it seems this recurring bug is still around. I just lost my .zhrc and .bashrc at a minimum.

Didn't have a recent backup handy, my bad, but I had been installing Rancher Desktop to test as a replacement for Docker Desktop and the fact that this appears to be something the devs aren't able to fix on a permanent basis has been enough to put me off using it in the future.

Steps to Reproduce

N/A

Result

N/A

Expected Behavior

I shouldn't need to worry about losing my entire shell config when toying around with resets/etc using Rancher Desktop.

Additional Information

No response

Rancher Desktop Version

1.14.2

Rancher Desktop K8s Version

1.30.2

Which container engine are you using?

moby (docker cli)

What operating system are you using?

macOS

Operating System / Build Version

MacOS Sonoma, M3 Pro

What CPU architecture are you using?

arm64 (Apple Silicon)

Linux only: what package format did you use to install Rancher Desktop?

None

Windows User Only

No response

@conor-mccullough conor-mccullough added the kind/bug Something isn't working label Jul 8, 2024
@jandubois
Copy link
Member

Unfortunately it seems this recurring bug is still around. I just lost my .zhrc and .bashrc at a minimum.

Didn't have a recent backup handy, my bad, but I had been installing Rancher Desktop to test as a replacement for Docker Desktop and the fact that this appears to be something the devs aren't able to fix on a permanent basis has been enough to put me off using it in the future.

I'm really sorry to hear that. This issue was supposed to be fixed in 1.12.2, and this is the first report that this has happened with any later release.

Do you have any information on how to reproduce this? Is this reproducible on your machine?

@jandubois jandubois added platform/macos priority/1 Work should be fixed for next release triage/need-to-repro Needs to be reproduced by dev team labels Jul 8, 2024
@jandubois jandubois added this to the 1.15 milestone Jul 8, 2024
@conor-mccullough
Copy link
Author

I only noticed it after around 3 hours of messing with Rancher Desktop - a series of factory resets, reinstalls, playing around with attempting to embed a CA, and some other stuff. My active terminal window lost its zshrc settings but I thought maybe it was just unloaded from that terminal session somehow so kept working and can't recall exactly what triggered it.

Currently I don't have the time to reliably reproduce this, I'm raising this here in hopes that it can serve as a point of reference in case others run into the same thing.

It replaced the contents with the following (Still available in my .bash_profile), if this helps narrow it down:

▶ cat ~/.bash_profile
### MANAGED BY RANCHER DESKTOP START (DO NOT EDIT)
export PATH="/Users/<user>/.rd/bin:$PATH"
### MANAGED BY RANCHER DESKTOP END (DO NOT EDIT)

@amandeep-sharma
Copy link

I also faced the same issue on Mac M2. It can be reproduced with below steps

  • Perform factory reset with version 1.14.2
  • Launch the application, setup options prompt will not appear as k3s versions download fails; due to unavailability of a valid certificate(Don't know the reason).
  • As "Automatic" value gets seeded for "Configure PATH" options, it wiped all shell files .cshrc, .tcshrc, .zshrc, .bashrc and .profile as mentioned in the issue.

@mook-as
Copy link
Contributor

mook-as commented Jul 16, 2024

Thank you @amandeep-sharma — with those steps I can (sometimes) reproduce.

I think it was easier for me to reproduce because that error case made shutting down faster, so it triggered the actual problem:

  • Rancher Desktop attempts to shut down
  • As part of shut down, we triggered a settings update
  • That triggered the path management code (where we tried to sync the state according to the preferences)
  • As the path management code runs, we end up exiting the process (because we're still shutting down) partway through writing the updated RC files.
    • With extra logging, I see that the code expected to write the correct number of lines, but it seems like the write actually failed because the output ended up truncated.

I think the most important change here is to avoid doing the write when we're starting to shut down. Also, we should look into doing the write atomically; I think that involves (at least on Linux) using file renames, and therefore bailing if the target file is a symlink or other non-normal files.

@jandubois
Copy link
Member

I think the most important change here is to avoid doing the write when we're starting to shut down.

Yes, and there needs to be some kind of mutex that prevents starting a shutdown while the write is in progress.

Also, we should look into doing the write atomically; I think that involves (at least on Linux) using file renames, and therefore bailing if the target file is a symlink or other non-normal files.

You could still fall back to the file copy when the rename is not possible. The important part is to keep a backup in place that will not be removed/overwritten even by repeated failed attempts to overwrite the original.

@jandubois jandubois added triage/confirmed Issue has been reproduced by dev team and removed triage/need-to-repro Needs to be reproduced by dev team labels Jul 16, 2024
@mook-as
Copy link
Contributor

mook-as commented Jul 16, 2024

Posted #7195 to at least not do the write on shutdown.

I think we should do a bit extra to help with the data loss, though; I'm thinking of changing the existing logic to something like:

(Edited to reflect the next four comments)

  1. If the target does not exist:
    1. If the desired state is true, write the contents into file (creating it).
    2. Return.
  2. If the target is a normal file:
    1. If the file has xattrs (doing a platform-specific check), bail.
    2. Copy target file (with COPYFILE_FICLONE) to <file>.rd-temp
    3. Read the contents of the file we just copied.
    4. Split the contents into before, current, after (i.e. the existing code).
    5. Set desired to the desired lines ([] if the desired state is false).
    6. If current is desired, return (since no change is needed).
    7. If before + desired + after is empty:
      1. Delete the target file.
      2. Delete the temporary file.
      3. Return.
    8. Write the before + desired + after content into the temporary file.
    9. Rename the temporary file into the target file.
    10. Return.
    11. File permissions should already be correct.
  3. If the target is a symlink:
    1. Copy target file (with COPYFILE_FICLONE + COPYFILE_EXCL) to <file>.rd-backup~
      1. This would throw an error if the symlink is dangling.
      2. This would also throw if the backup file already exists.
    2. Read the file (through the symlink).
    3. Split the contents into before, current, after (i.e. the existing code).
    4. Set desired to the desired lines ([] if the desired state is false).
    5. If current is desired, return (since no change is needed).
    6. Write the before + desired + after content into the file (through the symlink, even if it's empty).
    7. Read the file back out; if it does not match, throw an error.
      1. This would not remove <file>.rd-backup~
    8. Remove <file>.rd-backup~
  4. If the target is neither a normal file nor a symlink:
    1. Return with an error.

The main changes are:

  • Try to create a backup file before doing the write (and abort if it already exists)
  • When making the changes, merge the desired and not-desired paths.
  • Write to a swap file and rename it into place (to avoid partially written files)

Thoughts?**

@jandubois
Copy link
Member

  1. If the existing file is not a plain file (i.e. it's a symlink or something), throw an error.

That is not really acceptable; it is a common use case1 to have config files in a Git repo, or shared cloud folder, and use symlinks to point to them.

I think you can still follow the rest of your plan, except steps 8 and 9 need to be different when <file> is not a regular file:

  1. Write the new content to <file>.
  2. Verify that <file> contains the new content and throw an error if not.

You could also add an extra step

3.5 Verify that .<file>~rd-backup is identical to <file> and throw an error if not.

Another thing to add could be

8.5 Call sync to make sure everything is written back to disk.

I think step 3.5 and the alternate step 9 are paranoia, as long as the return codes of the write and close syscalls indicate no error, but it doesn't hurt to do a belt-and-suspenders thing here.

Footnotes

  1. Of course "common use case" is also a euphemism for "that's what I'm doing". 😛

@mook-as
Copy link
Contributor

mook-as commented Jul 16, 2024

I think 3.5 and sync are overkill; we're not attempting to be safe from the system crashing, just Rancher Desktop crashing. To rephrase: if the system crashes, the user is aware that things have went wrong.

I guess I can accept doing the alternate code path (not using a swap file) if the target is a symlink, though.

@jandubois
Copy link
Member

jandubois commented Jul 16, 2024

It is a bit early to start bike-shedding about the filenames, but whatever, I'm already here:

.<file>~rd-backup

Given that <file> already starts with a ., I think the extra dot here is not needed. I would also prefer to use a proper file extension by using . instead of a ~, so:

<file>.rd-backup

If you really want to include the ~ to indicate "backup" (which is already in the name), then append it at the very end: <file>.rd-backup~.

This also is more in line with your .<file>.rd-swp filename, which I would call

<file>.rd-temp

The names should make sense to the user, who is already distressed about the original file missing/destroyed. They are unaware of the implementation details, so swap doesn't really make sense to them at this time.

@mook-as
Copy link
Contributor

mook-as commented Jul 16, 2024

Oh, yeah, I don't actually plan to add a . in front, I dunno why I called it that. It's just going to be .bashrc~rd-backup or something. I can live with .bashrc.rd-backup~ though. Same with .bashrc.rd-temp.

@mook-as
Copy link
Contributor

mook-as commented Jul 17, 2024

(Note that #7154 (comment) has been updated)

Hmm, while writing a bunch of tests for this (haven't started on the implementation yet), I think there are some corner cases to think about:

  • If the target file is a symlink, the target only contains managed lines, and we want to remove those lines: do we remove the symlink (leaving the symlink target behind), remove the symlink target (leaving a dangling symlink), or remove both? Or leave an empty file at the target?
  • When we're dealing with plain files (not symlinks), do we still need the <file>.rd-backup~, given that we use <file>.rd-temp and rename into place?
  • When doing plain files, do we need to do anything to preserve permissions and xattrs? How would we go about doing that? (Ideally we'd use clonefile(2) to make the temporary file first then overwriting its contents, but it's unclear if that's possible in NodeJS.)

@jandubois
Copy link
Member

  • If the target file is a symlink, the target only contains managed lines, and we want to remove those lines: do we remove the symlink (leaving the symlink target behind), remove the symlink target (leaving a dangling symlink), or remove both? Or leave an empty file at the target?

I think we should leave the empty file.

Another question though: if the symlink is dangling from the start, should we create the file? I would think it is better to throw an error, the same was as when the backup file already exists.

  • When we're dealing with plain files (not symlinks), do we still need the <file>.rd-backup~, given that we use <file>.rd-temp and rename into place?

Probably not. Theoretically it guards against the editing producing a broken new version of the file by keeping a backup. But since we are going to delete it once the rename succeeds, there isn't much of a point.

  • When doing plain files, do we need to do anything to preserve permissions and xattrs? How would we go about doing that? (Ideally we'd use clonefile(2) to make the temporary file first then overwriting its contents, but it's unclear if that's possible in NodeJS.)

On Linux you can use getfattr/setfattr and getfacl/setfacl.

On macOS it will require some research; you can probably use xattr, and some combination of ls -le and chmod.

I think it will be rather unusual to have xattrs and/or acls on the config files, and we could just disallow using "Automatic" path management in that case.


Another thing I just thought about: we need to make sure the code works correctly if e.g. ~/.bash_profile and ~/.profile are symlinks to the same target. I think it works correctly right now; just make sure there is no race.

@mook-as
Copy link
Contributor

mook-as commented Jul 17, 2024

I think it will be rather unusual to have xattrs and/or acls on the config files, and we could just disallow using "Automatic" path management in that case.

That means we will have to detect that there are things on the config files. That's going to be… annoying. On the bright side, though, I missed that fs.promises.copyFile has a COPYFILE_FICLONE option, so we do have clonefile(). So I think we can create the temp file by using that, then clobbering its contents, then rename on top of the original.


This is incorrect: per the source, that just uses ioctl(…, FICLONE), so it has no effect on the (extended) attributes…
https://github.com/libuv/libuv/blob/v1.48.0/src/unix/fs.c#L1227

@jandubois
Copy link
Member

That means we will have to detect that there are things on the config files.

You can still do that with the commands I gave above.

Or find a Node module, e.g. https://github.com/LinusU/fs-xattr for attributes; didn't find anything quickly for ACLs.

we do have clonefile()

That is fine for macOS, but I wonder what it does on Linux. On btrfs and xfs it could to a reflink copy (don't know if it does).

I think this is too much specialized logic for an edge case. Just more code that can have bugs.

So I still think disallowing changes to path management when something unusual happens is the safest way:

  • backup file already exists
  • file is a dangling symlink
  • file has xattrs or acls

In that case disable the path management checkboxes in the dialog and add a hover tip explaining the reason behind it. And force the setting in the app to "Manual" regardless what the actual file content is.

@mook-as
Copy link
Contributor

mook-as commented Jul 17, 2024

Updated #7154 (comment) again.

So fs.promises.copyFile only preserves file permissions, and not xattrs; so we'll need to do some platform-specific code to handle xattrs (shelling out…). That, or we can, as suggested, just fail if the target file has any xattrs (including the Apple quarantine bit).

(I'm avoiding fs-xattr because that requires node-gyp again, and we made so much effort to try to get rid of it…)

mook-as added a commit to mook-as/rd that referenced this issue Jul 18, 2024
This tests that the manageLinesInFile implementation corresponds to the
algorithm specified in rancher-sandbox#7154 (in an attempt to reduce the chances of users
losing data).  Note that the implementation has not yet been updated, so
this test is expected to fail.

Signed-off-by: Mark Yen <[email protected]>
mook-as added a commit to mook-as/rd that referenced this issue Jul 18, 2024
This tests that the manageLinesInFile implementation corresponds to the
algorithm specified in rancher-sandbox#7154 (in an attempt to reduce the chances of users
losing data).  Note that the implementation has not yet been updated, so
this test is expected to fail.

Signed-off-by: Mark Yen <[email protected]>
mook-as added a commit to mook-as/rd that referenced this issue Jul 23, 2024
This tests that the manageLinesInFile implementation corresponds to the
algorithm specified in rancher-sandbox#7154 (in an attempt to reduce the chances of users
losing data).  Note that the implementation has not yet been updated, so
this test is expected to fail.

Signed-off-by: Mark Yen <[email protected]>
mook-as added a commit to mook-as/rd that referenced this issue Jul 24, 2024
This tests that the manageLinesInFile implementation corresponds to the
algorithm specified in rancher-sandbox#7154 (in an attempt to reduce the chances of users
losing data).  Note that the implementation has not yet been updated, so
this test is expected to fail.

Signed-off-by: Mark Yen <[email protected]>
mook-as added a commit to mook-as/rd that referenced this issue Jul 25, 2024
This tests that the manageLinesInFile implementation corresponds to the
algorithm specified in rancher-sandbox#7154 (in an attempt to reduce the chances of users
losing data).  Note that the implementation has not yet been updated, so
this test is expected to fail.

Signed-off-by: Mark Yen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working platform/macos priority/1 Work should be fixed for next release triage/confirmed Issue has been reproduced by dev team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants