Validate that there is enough space to actually perform the mirror #19

Closed
ericdill opened this issue Dec 21, 2016 · 6 comments

@ericdill
Contributor

ericdill commented Dec 21, 2016

There has been a user report of the following stack trace:

INFO: download_url=https://anaconda.org/conda-forge/gdal/2.1.1/download/win-64/gdal-2.1.1-np111py34_4.tar.bz2
Traceback (most recent call last):
  File "/opt/miniconda3/lib/python3.5/site-packages/conda_mirror/conda_mirror.py", line 253, in _download
    tf.write(data)
OSError: [Errno 28] No space left on device
 
During handling of the above exception, another exception occurred:
 
Traceback (most recent call last):
  File "/opt/miniconda3/bin/conda-mirror", line 11, in <module>
    sys.exit(cli())
  File "/opt/miniconda3/lib/python3.5/site-packages/conda_mirror/conda_mirror.py", line 183, in cli
    blacklist, whitelist)
  File "/opt/miniconda3/lib/python3.5/site-packages/conda_mirror/conda_mirror.py", line 427, in main
    _download(url, download_dir, repodata)
  File "/opt/miniconda3/lib/python3.5/site-packages/conda_mirror/conda_mirror.py", line 253, in _download
    tf.write(data)
OSError: [Errno 28] No space left on device

I'm pretty sure that this is because there is not enough space in the /tmp directory on the host machine where this command was being run.

One way to fix this would be to compute the space required to mirror all packages in the to_mirror set, using the size in bytes stored in repodata[pkg_name]['size']. We would then need to check that there is enough space both in the temp directory located at download_dir and in the final location for these packages at 'local_directory'.
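A minimal sketch of that pre-flight check, not the actual conda-mirror code. The function names `required_bytes`, `has_enough_space` and `check_space` are hypothetical; the sketch assumes the repodata package mapping has the `repodata[pkg_name]['size']` shape described above and that both directories already exist.

```python
import shutil


def required_bytes(repodata_packages, to_mirror):
    """Sum the 'size' field (in bytes) of every package selected for mirroring."""
    return sum(repodata_packages[name]["size"] for name in to_mirror)


def has_enough_space(directory, needed_bytes):
    """Return True if the filesystem holding `directory` has `needed_bytes` free."""
    return shutil.disk_usage(directory).free >= needed_bytes


def check_space(repodata_packages, to_mirror, download_dir, local_directory):
    """Return the bytes needed and the list of directories that are too small."""
    needed = required_bytes(repodata_packages, to_mirror)
    too_small = [
        d for d in (download_dir, local_directory)
        if not has_enough_space(d, needed)
    ]
    return needed, too_small
```

This check is conservative: it requires the full size in both locations, even if the two directories share a filesystem, and the free space it reads can change while the mirror is running.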

@MWigger

MWigger commented Dec 22, 2016

Hello,
To verify that low space on /tmp was the cause, I replaced line 418 with:
with tempfile.TemporaryDirectory( dir="/media/myfilesysten......" ) as download_dir:

With this change it worked, so to address my issue it would be sufficient to make the temp folder configurable.

But what is the point of the tmp folder? Why is the download not written directly to the target folder?

@ericdill
Contributor Author

> With this change it worked, so to address my issue it would be sufficient to make the temp folder configurable.

Great, I'll do that in the next week or so

> But what is the point of the tmp folder? Why is the download not written directly to the target folder?

I run conda-mirror as an hourly cron job. It felt easier to download everything to a staging directory, validate all the packages there, and remove any that do not pass validation (size/md5/sha256). Only once packages have been downloaded and validated are they promoted from the staging directory to the directory they are served from. I honestly hadn't considered that /tmp might not have enough space to do a full channel mirror. This issue will only be hit the first time you mirror the channel, since that's going to require >10GB of space.

Additionally, if I download directly to the directory the packages are served from, I occasionally hit the issue where conda install <package> grabs a partially downloaded package and fails. That issue could also be avoided by downloading the file to a temp file in that directory and then moving it to its actual package name <package_version_buildnumber.tar.bz2>.
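The two ideas above (validate before promoting, and write to a temp file before moving it to its real name) could look roughly like this. It is a sketch under assumptions, not what conda-mirror does: `is_valid` and `write_atomically` are hypothetical helpers, and only size and md5 are checked here.

```python
import hashlib
import os
import tempfile


def is_valid(path, expected_size, expected_md5):
    """Check a downloaded file against the size and md5 recorded in repodata."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large packages are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5


def write_atomically(data, final_path):
    """Write `data` to a temp file next to `final_path`, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # The temp file lives in the same directory, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Readers see either no file or the complete file, never a partial one.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Note that `tempfile.mkstemp` creates the file readable only by its owner, so a real mirror would probably need to adjust permissions before serving it.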

In any event, the problem should be solved by adding a command line argument that will allow you to specify where the transient download directory should be.

@MWigger

MWigger commented Dec 23, 2016

Thanks for the explanation, that makes sense

@ericdill
Contributor Author

ericdill commented Jan 3, 2017

@MWigger I'd also welcome a PR from you implementing this feature 😄 I'll get to it eventually, but it's not a priority for me at the moment.

@jneines
Contributor

jneines commented Jan 10, 2017

As a co-worker of @MWigger, I have made some changes to fix this issue. You can find the implementation in my current pull request. It adds a new parameter for specifying the temporary directory, with a suitable default, and passes that value as the dir parameter of tempfile.TemporaryDirectory. Tests pass and the implementation solves our current problem.
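For readers who want to see the shape of such a change, here is a rough sketch. It is not the code from the pull request: the flag name `--temp-directory`, the default, and the functions `build_parser` and `run` are assumptions made for illustration.

```python
import argparse
import os
import tempfile


def build_parser():
    """Build a parser with a (hypothetical) option for the staging location."""
    parser = argparse.ArgumentParser(description="temp directory option sketch")
    parser.add_argument(
        "--temp-directory",
        default=tempfile.gettempdir(),  # assumed default: the system temp dir
        help="directory in which the transient download directory is created",
    )
    return parser


def run(argv):
    """Create the transient download directory and return its parent directory."""
    args = build_parser().parse_args(argv)
    # `dir` controls where the temporary directory is created.
    with tempfile.TemporaryDirectory(dir=args.temp_directory) as download_dir:
        # Packages would be downloaded and validated inside download_dir here.
        return os.path.dirname(download_dir)
```

Calling `run(["--temp-directory", "/some/large/disk"])` would then stage downloads under that path instead of /tmp; the path shown is only a placeholder.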

@ericdill
Contributor Author

Closed by #21. Thanks for the report, @MWigger, and thanks for the implementation, @jneines! A new release with these changes (0.6.0) is available on PyPI.
