RuntimeError on cache in multi-users case #488
Comments
Thanks for the detailed bug report! |
Hi @fcollonval , makes me think about this conda issue. About
|
We experience the same issue - Is there a solution yet? |
@dandaman what version of mamba? We recently pushed some improvements with version 0.17.0. |
@wolfv Thank you for your prompt response!
I'll have a go at the new version then and report back |
I have just upgraded to 0.17.0 and the issue remains. |
Would you be able to write a test that would show us your issue? |
Sure, it is easy to reproduce. On a multi-user HPC system, users run shared pipelines for specific processes. Each pipeline begins by downloading the required packages into the shared pkgs directory. The error appears when a second user runs the exact same pipeline and the json files are already present in the cache directory but owned by another user. No permission changes on the cache files have any effect. The only workarounds are to remove the cache files or to not use the shared pkgs directory. If I can provide more detail or test data, just let me know what you would like to see. |
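To check whether a shared cache is in the state described here, a small shell sketch along these lines can list the repodata json files that the current user does not own. The function name `foreign_cache_files` is hypothetical, and the sketch assumes the json files sit directly inside the `pkgs/cache` directory of the install.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of mamba or conda): print the *.json files in
# a cache directory that are NOT owned by the user running the script.
foreign_cache_files() {
    local cache_dir="$1"
    # "! -user UID" keeps only files whose owner differs from the current uid.
    find "$cache_dir" -maxdepth 1 -type f -name '*.json' ! -user "$(id -u)"
}
```

Calling `foreign_cache_files /path/to/pkgs/cache` (path is a placeholder) prints nothing when every json file belongs to the caller; any path it prints is a candidate for the ownership problem discussed in this thread.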
@davestacionis are you using There is a proposed fix in #1137 however, the authors haven't had time to get back to it. |
I am not using ACLs but I think #1123 still applies. We are using standard group permissions and have setgid at the top level of the install to ensure everyone in our users group has access. |
any chance you could convert that into a bash script for us to try out?! |
@wolfv, for us it's the same: after upgrading to 0.17.0 the issue remains. Every time a different user uses mamba to install things, the above error occurs the next time another user tries to install things. Permissions on pkgs/cache (g+rwxs) work at the file system level, i.e. the different users can delete cache json files created by others. The problem can always be resolved by either using conda or deleting the contents of pkgs/cache. So I think we have the identical situation as described by @davestacionis. I'll check if I can reproduce it on my local linux client... |
I'm not clear on what you want me to test via script. The proposed fix looks like it requires changes to the source/C code. |
I want a bash script that replicates your setup. I believe that should only take a couple of lines of bash, no? Ideally you could integrate that as a test in our GitHub action (that would fail for now). |
While working on a script to duplicate the issue, I found a fix for my environment. Our install was shared by many users and they had write access to the pkgs & cache folders. Removing this access and setting the gid back to root resolved the issue and now shared pkg updates must be done via sudo. This forces user pkgs & envs into their home/personal environment, but that is ok for now. |
Great, thanks for letting us know. |
I am hitting a similar issue in our setup. We build Linux VMs for users using ansible. This installs mambaforge in /opt and creates an environment with our software installed; this runs as root. The VM is then given to a single user, who will use it for a few days or weeks. We want the user to be able to update the environment. We don't know who the user will be at install time, so we This worked with conda and
I can reproduce with the following docker file:
FROM ubuntu:20.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y wget
RUN wget -nv -O Mambaforge.sh https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh
RUN bash Mambaforge.sh -b -p /mambaforge
RUN /mambaforge/bin/mamba init
SHELL ["/bin/bash", "-c"]
RUN source /mambaforge/etc/profile.d/conda.sh && mamba create --yes -n test numpy==1.21.3
RUN chmod -R o+w /mambaforge/envs
RUN chmod -R o+w /mambaforge/pkgs/cache
# Make the cache files look old so that mamba needs to update them
RUN find /mambaforge/pkgs/cache -exec touch -d "7 days ago" {} +
RUN useradd -ms /bin/bash user
USER user
Build with:
then in:
run:
A workaround is for the user to remove the .json files.
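A minimal sketch of that workaround, assuming the repodata json files sit directly in the cache directory and that the user has write permission on that directory (deleting a file needs write access to the directory, not ownership of the file, unless the sticky bit is set). The function name `clear_repodata_cache` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete the cached *.json files so that the next run
# has to download and write them again as the current user.
clear_repodata_cache() {
    local cache_dir="$1"
    # Only top-level *.json files are removed; everything else is left alone.
    find "$cache_dir" -maxdepth 1 -type f -name '*.json' -delete
}
```

For the image built from the docker file above, this would be called as `clear_repodata_cache /mambaforge/pkgs/cache` before running mamba as `user`.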
|
I just started seeing this issue out of nowhere as part of my Azure pipeline. During
I presume something must have changed about permissions in this directory in the Azure docker image for Ubuntu. Similar CI with |
Just saw this here as well: pandas-dev/pandas#45902 |
A few fixes were needed:
* Increase timeout for `install-with-pip` and `install-with-conda` jobs. These were taking a longer time. The problem for conda on Python 3.8 is a known problem.
* Switch to mamba: The build using conda was failing because it was running out of memory. Mamba should be faster and use less memory.
* The switch to mamba required a workaround to delete any existing JSON files in `/usr/share/miniconda/pkgs/cache`. This seems a known problem: mamba-org/mamba#488
This is still an issue when using the mamba solver within Here's a workaround that worked for us in GitHub actions:
Thanks to pandas-dev/pandas#45902 for figuring this out. |
I guess at least for some people that might be identical to #1123 and thus potentially addressed by #2141 . ie mtime update to a given time denied (unless owned by effective uid), while special calling case of |
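The ownership rule for timestamp updates can be probed from the shell. This is a sketch under the assumption of Linux `utimensat(2)` semantics: setting both timestamps to the current time needs only write access to the file, while setting them to an explicit time requires the caller to own the file or be privileged. The function name `try_mtime_updates` is hypothetical, and note that it really does modify the timestamps of the file it is given.

```shell
#!/usr/bin/env bash
# Hypothetical probe: try both kinds of timestamp update on one file and
# report which of them the kernel allows for the current user.
try_mtime_updates() {
    local f="$1"
    # Plain touch asks for "now"; write permission on the file is enough.
    if touch "$f" 2>/dev/null; then echo "now: ok"; else echo "now: denied"; fi
    # touch -t asks for an explicit time; this needs ownership or privileges.
    if touch -t 202001010000 "$f" 2>/dev/null; then echo "explicit: ok"; else echo "explicit: denied"; fi
}
```

Run by the owner of the file, both updates should succeed. Run by a different user who only has group write access, the second update is expected to be refused, which matches the situation reported in this issue.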
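To anyone who wants a quick and dirty fix, it's possible to fix the problem by hooking
/* _GNU_SOURCE must be defined before any system header so that dlfcn.h exposes RTLD_NEXT. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <stddef.h>
#include <dlfcn.h>
/* Wrapper that ignores the requested timestamps and asks for the current time instead. */
int utimensat(int dirfd, const char *pathname, const struct timespec times[2], int flags) {
    int (*outimensat)(int, const char *, const struct timespec[2], int) = dlsym(RTLD_NEXT, "utimensat");
    /* times == NULL means "set both timestamps to now", which needs write access but not ownership. */
    return outimensat(dirfd, pathname, (struct timespec*)NULL, flags);
}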
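One way such a hook is typically used is to compile it into a shared object and preload it into the process. The sketch below assumes gcc on a glibc-based Linux system; the function names and the file names `hook.c` and `hook.so` are hypothetical and do not come from the original comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compile the C source ($1) into a shared object ($2).
# -ldl is needed on older glibc versions and is harmless on newer ones.
build_hook() {
    gcc -shared -fPIC -o "$2" "$1" -ldl
}

# Hypothetical helper: run a command with the shared object ($1) preloaded,
# so the dynamic linker resolves utimensat to the wrapper first.
run_with_hook() {
    local lib="$1"
    shift
    LD_PRELOAD="$lib" "$@"
}
```

For example, `build_hook hook.c ./hook.so` followed by `run_with_hook "$PWD/hook.so" mamba install numpy`. Because the wrapper discards the requested timestamps for every call made by that process, it is best limited to the single command that needs it.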
|
According to
|
Upgrading to mamba 1.2.0 seems to have fixed the issue for me (unless it was by chance, because caching of repodata temporarily bypasses the issue in my case). So I suppose #2141 is indeed responsible for the fix. Thanks @coroa ! My setup: multi-user install on Linux using standard group permissions and setgid on Would be good if others confirm the fix |
Awesome! Thanks @coroa :) |
I have the same problem when I use conda on Windows. |
I am working inside a Docker container on Ubuntu (container also based on Ubuntu, 22.04) and just noticed this exact issue happening for me too. My |
Hello, we are encountering the same issue:
critical libmamba filesystem error: cannot set permissions: Operation not permitted [/app/list/ia/micromamba_cache/pkgs/cache]
The cache dir was generated when one user installed environments. An install by another user then produces this error, even though that user has rwx permissions on all the directories and files inside micromamba_cache. That user is not the owner, however. |
This might be related to group ownership and permissions of the caches and mamba. Please see: #3740 (comment) Feel free to reopen an issue with a reproducer if you still have this issue with 2.0. Note that 1.x is no longer being developed, except for security fixes. |
Fail to create a new environment using mamba.
Context:
The cache file is owned by another user than the one creating the environment.
Reference:
mamba 0.5.1
conda 4.8.4