Building from a local sdist file url broken in 0.21.0 #1045
Thanks! There are a few workarounds, of course (e.g. making pip unarchive the file). I would also be interested if Lastly, I do think you are right and this file should be un-archived to have the same behavior as fetching from a URL. |
It's working with ╭─ Running build for recipe: rich-13.4.2-pyh4616a5c_0
│
│ ╭─ Fetching source code
│ │ Fetching source from path: "/tmp/rich/rich-13.4.2.tar.gz"
│ │ Extracted to "/tmp/rich/output/bld/rattler-build_rich_1725441621/work"
│ │
│ ╰─────────────────── (took 0 seconds) We can see in the logs that it is extracted. Using Would still be nice to fix the file url behaviour as you said. |
Same here, but unfortunately
The issue $ ls -lh /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
-rwxrwx---+ 1 username nogroup 2.6G Aug 12 2021 /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
$ |
@DimitriPapadopoulos thank you for the detailed write-up!
Ah, I see that in the URL case it takes only 30 seconds so something is wrong. I'll have to take a look. |
While $ rsync --progress /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip /tmp/
MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
2,786,688,287 100% 448.76MB/s 0:00:05 (xfr#1, to-chk=0/1)
$ |
I'm not used to building/running Rust programs, but chances are function extract_zip stalls in our context:
/// `.zip` files archived with compression other than deflate would fail.
pub(crate) fn extract_zip(
    archive: impl AsRef<Path>,
    target_directory: impl AsRef<Path>,
    log_handler: &LoggingOutputHandler,
) -> Result<(), SourceError> {
    let archive = archive.as_ref();
    let target_directory = target_directory.as_ref();
    fs::create_dir_all(target_directory)?;
    let len = archive.metadata().map(|m| m.len()).unwrap_or(1);
    let progress_bar = log_handler.add_progress_bar(
        indicatif::ProgressBar::new(len)
            .with_finish(indicatif::ProgressFinish::AndLeave)
            .with_prefix("Extracting zip")
            .with_style(log_handler.default_bytes_style()),
    );
    let mut archive = zip::ZipArchive::new(progress_bar.wrap_read(
        File::open(archive).map_err(|_| SourceError::FileNotFound(archive.to_path_buf()))?,
    ))
    .map_err(|e| SourceError::InvalidZip(e.to_string()))?;
    let tmp_extraction_dir = tempfile::Builder::new().tempdir_in(target_directory)?;
    archive
        .extract(&tmp_extraction_dir)
        .map_err(|e| SourceError::ZipExtractionError(e.to_string()))?;
    move_extracted_dir(tmp_extraction_dir.path(), target_directory)?;
    progress_bar.finish_with_message("Extracted...");
    Ok(())
}
Could it be that |
Would you be able to try with the file on the same filesystem? It could be related to NFS, after all. |
Will try next week. By the way, the compression method is either $ zipinfo -l /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip | grep -v -e ' defX ' -e ' stor '
Archive: /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
Zip file size: 2786688287 bytes, number of entries: 5487
5487 files, 2989357849 bytes uncompressed, 2785399227 bytes compressed: 6.8%
$ |
My workstation was updated from Ubuntu 22.04 to Ubuntu 24.04 a few days ago. I wonder whether a filesystem issue could plague it. After "heavy use" (typically running rattler-build to build from simple but large sources) Google Chrome starts complaining (without reason) about invalid site certificates or identifies other sites as non-existent. I couldn't find anything suspicious in the system logs. I will try on a machine still running Ubuntu 22.04, this might be totally unrelated to rattler-build — could be a Linux kernel bug. |
That sounds strange. rattler-build itself should not modify anything system-wide. Of course, I don't know what the build scripts are doing. |
Oh, I mean it wouldn't be a rattler-build issue, rather a Linux kernel bug triggered by something specific to rattler-build operation, perhaps manipulating lots of hardlinks. |
The scripts are very simple, they just unzip and don't even test. For example: |
My issue was probably a Linux kernel issue, or more generally a system issue. Today, ZIP extraction works just fine, either from the local file system:
or the NFS share:
|
Allow multiple versions of the Matlab Runtime. For now, keep the package non-relocatable. The `patchelf` tool fails with obscure error messages on MATLAB Runtime binaries. While developing, we retrieve the binary from local disk because retrieving it from MathWorks using HTTPS is damn slow and NFS breaks rattler-build: prefix-dev/rattler-build#1045 (comment)
Unfortunately, I am again having freezing issues with
I don't see anything relevant in the system logs. |
Hmm, maybe we need to use a BufReader or something like that somewhere ... |
@DimitriPapadopoulos it was indeed missing a BufReader: #1144 - I believe this will help nicely in your case. |
@wolfv Thank you very much for looking into this issue. I don't know much about Rust, I understand it provides unbuffered I/O by default and that unbuffered I/O can be slow due to repeated system calls. Yet |
@DimitriPapadopoulos - the progress bar is just for showing the progress. The main problem was the unbuffered read, which results in many more system calls and is generally slow. I am very sure that this can be exacerbated by slow disks / NFS filesystems. We already had this optimization for the tar file reader but missed it for zip. I already made the release so you can try out 0.28.2 whenever you have time. I am quite sure that it should give you a decent improvement :) |
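For readers following along, the effect of buffering described above can be sketched with the standard library alone. This is not rattler-build's code and not the change made in #1144: `CountingReader`, `unbuffered_calls` and `buffered_calls` are hypothetical names, an in-memory slice stands in for the zip file, and the 16-byte chunk size is an arbitrary stand-in for a consumer that reads in small pieces. The sketch only counts how many `read` calls reach the underlying reader with and without `std::io::BufReader`.

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical wrapper: counts how often `read` reaches the inner reader.
/// With a real `File`, each such call would be a system call.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Reads `data` in 16-byte chunks straight from the counting reader.
fn unbuffered_calls(data: &[u8]) -> io::Result<usize> {
    let mut reader = CountingReader { inner: data, calls: 0 };
    let mut chunk = [0u8; 16];
    while reader.read(&mut chunk)? != 0 {}
    Ok(reader.calls)
}

/// Same 16-byte reads, but routed through a BufReader with an 8 KiB buffer,
/// so the inner reader is only hit when that buffer runs empty.
fn buffered_calls(data: &[u8]) -> io::Result<usize> {
    let inner = CountingReader { inner: data, calls: 0 };
    let mut reader = BufReader::with_capacity(8 * 1024, inner);
    let mut chunk = [0u8; 16];
    while reader.read(&mut chunk)? != 0 {}
    Ok(reader.get_ref().calls)
}

fn main() -> io::Result<()> {
    let data = vec![0u8; 1 << 20]; // 1 MiB of dummy input
    println!("unbuffered inner reads: {}", unbuffered_calls(&data)?);
    println!("buffered inner reads:   {}", buffered_calls(&data)?);
    Ok(())
}
```

On a slow or high-latency filesystem, each read that reaches the file is comparatively expensive, so cutting down their number matters more there than on a local disk.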
Just upgraded to 0.28., it's still slow. The throughput shown by the progress bar keeps dropping forever:
|
argh. Just to be sure - |
Yes, it's 0.28.2 (I forgot to copy/paste the output of
|
When I start
In short, at the system level, network resources are not used as they should. When running Nothing in the system logs. Note that file |
Where is your output folder located and the corresponding |
The output dir is |
Now about the cache. We used to have home dirs on NFS servers, but that's not the case any more. Besides, even with home dirs on NFS servers, we used to point the environment variable Initial command:
Skimmed down command:
Unfortunately it remains as slow as before. I'm not sure how to further investigate. Do you have a Rust code snippet that unzips a file I could try to build and test locally? I wouldn't be surprised if it were a Rust bug. |
What does |
I kicked off a build that you could try for debugging: #1146 ... And when you run |
You can find the binaries here: https://github.com/prefix-dev/rattler-build/actions/runs/11594061424?pr=1146 |
I
|
I do see a EDIT: Ah, just found the binaries. |
Here is the output, $ ~/Downloads/rattler-build-x86_64-unknown-linux-musl/rattler-build build -r /local/disk/recipes/matlab-runtime-9.7 --output-dir /tmp/channel -c conda-forge
╭─ Finding outputs from recipe
│ Found 1 variants
│ Build variant: matlab-runtime-9.7-9-hb0f4dca_0
│
│ ╭─────────────────┬──────────╮
│ │ Variant ┆ Version │
│ ╞═════════════════╪══════════╡
│ │ target_platform ┆ linux-64 │
│ ╰─────────────────┴──────────╯
│
╰─────────────────── (took 0 seconds)
╭─ Running build for recipe: matlab-runtime-9.7-9-hb0f4dca_0
│
│ ╭─ Fetching source code
│ │ Fetching source from path: /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
│ │ Starting zip extraction
│ │ Zip file size: 2786688287
│ │ Extracting zip file: "/path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip" to "/tmp/channel/bld/rat
│ │ tler-build_matlab-runtime-9.7_1730300983/work" I don't see anything in the extraction directory: $ ls -a /tmp/channel/bld/rattler-build_matlab-runtime-9.7_1730300983/work/
.
..
$ |
Hmm, this might be completely unrelated, but your version number is also broken. It should not contain a
|
The package name is EDIT: At least for the sake of debugging, I guess I need to change the package name to Our "requirement" is to be able to install multiple versions of the MATLAB Runtime package side by side. We need to share the MATLAB Runtime between packages that use the same version of the MATLAB Runtime, because it is really huge: the latest versions weigh 4.5 GB. At the same time, not all packages depend on the same version of the MATLAB Runtime. I understand that the proper way to do that is to have each package bundle its own MATLAB Runtime instead of depending on a specific version of an external MATLAB Runtime package, but then we end up pulling successive dependencies weighing 5 GB each, which my colleagues feel becomes intractable. |
Mmmh... Your debug version doesn't work much better using a local copy of $ ~/Downloads/rattler-build-x86_64-unknown-linux-musl/rattler-build build -r /local/disk/recipes/matlab-runtime-9.7 --output-dir /tmp/channel -c conda-forge
╭─ Finding outputs from recipe
│ Found 1 variants
│ Build variant: matlab-runtime-9.7-9-hb0f4dca_0
│
│ ╭─────────────────┬──────────╮
│ │ Variant ┆ Version │
│ ╞═════════════════╪══════════╡
│ │ target_platform ┆ linux-64 │
│ ╰─────────────────┴──────────╯
│
╰─────────────────── (took 0 seconds)
╭─ Running build for recipe: matlab-runtime-9.7-9-hb0f4dca_0
│
│ ╭─ Fetching source code
│ │ Fetching source from path: /tmp/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
│ │ Starting zip extraction
│ │ Zip file size: 2786688287
│ │ Extracting zip file: "/tmp/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip" to "/tmp/channel/bld/rattler-build_matlab-runtime-9.7_17
│ │ 30301459/work" |
Changed the naming/versioning scheme. Same thing, stuck again with a local $ ~/Downloads/rattler-build-x86_64-unknown-linux-musl/rattler-build build -r /local/disk/recipes/matlab-runtime-9.7 --output-dir /tmp/channel -c conda-forge
╭─ Finding outputs from recipe
│ Found 1 variants
│ Build variant: matlab-runtime-9.7.9-hb0f4dca_0
│
│ ╭─────────────────┬──────────╮
│ │ Variant ┆ Version │
│ ╞═════════════════╪══════════╡
│ │ target_platform ┆ linux-64 │
│ ╰─────────────────┴──────────╯
│
╰─────────────────── (took 0 seconds)
╭─ Running build for recipe: matlab-runtime-9.7.9-hb0f4dca_0
│
│ ╭─ Fetching source code
│ │ Fetching source from path: /tmp/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
│ │ Starting zip extraction
│ │ Zip file size: 2786688287
│ │ Extracting zip file: "/tmp/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip" to "/tmp/channel/bld/rattler-build_matlab-runtime_173030
│ │ 2794/work" Let me reboot though, just in case. |
I started a more isolated repository that we can test things on... https://github.com/wolfv/zippy You can run this code locally with |
@DimitriPapadopoulos - I think we're not alone: zip-rs/zip2#231 ... Hopefully it gets fixed upstream and we might also be able to dedicate a few cycles to this. We do have a lot of other things on our hands as well though, so I can't make big promises right now. There is a second issue with the zip crate in rattler-build and large archives though (#1147), so we might wanna prioritize this at some point. Is it possible for you to work around this issue at the moment by e.g. copying the file to your disk before starting the build? Or is it a complete blocker? |
Hi @DimitriPapadopoulos - there is a chance that the new builds are much faster: https://github.com/prefix-dev/rattler-build/actions/runs/11607783736 - I pulled in the changes from the PR that I linked above. Would love to hear if that changes things for you and thank you again for all the help debugging this! |
For now I do work around this issue by copying the file locally. Hopefully I understand enough of Rust to run the new branch. I'll spend some limited time on that task today. |
You don't need to learn any Rust. Just download the artifact for your platform, extract and try :) E.g. for |
Good news, I get decent unzipping times using the latest experimental version. Unzipping $ ~/Downloads/rattler-build-x86_64-unknown-linux-musl/rattler-build build -r /local/disk/recipes/matlab-runtime-9.7 --output-dir /tmp/channel -c conda-forge
╭─ Finding outputs from recipe
│ Found 1 variants
│ Build variant: matlab-runtime-9.7-9-hb0f4dca_0
│
│ ╭─────────────────┬──────────╮
│ │ Variant ┆ Version │
│ ╞═════════════════╪══════════╡
│ │ target_platform ┆ linux-64 │
│ ╰─────────────────┴──────────╯
│
╰─────────────────── (took 0 seconds)
╭─ Running build for recipe: matlab-runtime-9.7-9-hb0f4dca_0
│
│ ╭─ Fetching source code
│ │ Fetching source from path: /tmp/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
│ │ Extracted zip to /tmp/channel/bld/rattler-build_matlab-runtime-9.7_1730380055/work
│ │
│ ╰─────────────────── (took 15 seconds)
[...]
$ Unzipping $ ~/Downloads/rattler-build-x86_64-unknown-linux-musl/rattler-build build -r /local/disk/recipes/matlab-runtime-9.7 --output-dir /tmp/channel -c conda-forge
╭─ Finding outputs from recipe
│ Found 1 variants
│ Build variant: matlab-runtime-9.7-9-hb0f4dca_0
│
│ ╭─────────────────┬──────────╮
│ │ Variant ┆ Version │
│ ╞═════════════════╪══════════╡
│ │ target_platform ┆ linux-64 │
│ ╰─────────────────┴──────────╯
│
╰─────────────────── (took 0 seconds)
╭─ Running build for recipe: matlab-runtime-9.7-9-hb0f4dca_0
│
│ ╭─ Fetching source code
│ │ Fetching source from path: /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
│ │ Extracted zip to /tmp/channel/bld/rattler-build_matlab-runtime-9.7_1730380270/work
│ │
│ ╰─────────────────── (took 80 seconds)
[...]
$ Note that $ time unzip -q /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
real 0m30,287s
user 0m26,266s
sys 0m4,004s
$
Anyway, we're back to a tractable situation at least. Thank you very much for fixing this issue. |
great, thanks for testing. Seems a lot better than before. There is another zip library we could try, |
I finally got around to fixing the original issue in #1164 @beenje @DimitriPapadopoulos the latest release of rattler-build (0.29.0) ships with a patched |
I tested the original issue with rattler-build 0.30.0 and confirm it's fixed. Thanks @wolfv ! |
Thanks! |
Support for local source file url scheme was added in #177 and working in version 0.5.0.
I hadn't tested that in a while. When trying to build a recipe using a local file as source with rattler-build 0.21.0, it fails.
Issue can be reproduced with:
The local file was copied to the work directory but wasn't unarchived.