Add support for .zst for component tarballs in channels #2488
Comments
For reference, on my laptop:
|
@joshtriplett Could you also run compression timing? If zstd is significantly slower we probably can't afford it. |
@Mark-Simulacrum Depends heavily on compression level and what tradeoff we want to make. How much time does the current xz compression use? |
I don't think we have timings, so it's hard to say. Does decompression time not correlate with compression time much then? (Beyond "disk reading is slower with more data")? I'd guess a good way to try and estimate things would be for someone to read https://github.com/rust-lang/rust-installer/blob/d66f476b4d5e7fdf1ec215c9ac16c923dc292324/src/tarballer.rs#L49-L56, lower that into either |
No, it doesn't. Compression can take anywhere from "faster than gzip" to "several minutes", with corresponding improvements in data size; decompression of all of those is proportional to data size, never compression time. Taking several minutes would be worth it for stable release tarballs, while nightly/CI versions could scale that back a little and aim for taking the same amount of time in CI that we currently do. |
@joshtriplett @Mark-Simulacrum Was there consensus on this in the end? Is this still on the cards? |
I think we should gather some more data timing and size wise, but I expect that we should indeed support zstd compression (and perhaps even make that our canonical format, instead of xz, in the next several cycles after it rolls out). That said, I would like for us to try to figure out a plan for limiting ourselves to maybe 2-3 long-term supported compression formats for cost / time reasons; probably gzip needs to stick around for compatibility (though I don't think I've encountered xz lacking but gzip supporting servers myself) but I'm not sure about xz vs zstd. cc @rust-lang/release |
It's not that much pain for rustup to support all the formats, though we don't have to generate them all for each channel release. With that said, I'll consider the work to add zstd support to rustup in the near future. |
Yeah, I'm mostly worried about storing 2-3x as much data (plus the time to recompress things) if we're going to accumulate new algorithms over time, not implementation complexity of each piece. |
I think that if zstd looks promising, the right thing will be to swap xz for it. gzip is a good idea to retain long-term for compatibility I guess. |
On Sat, Oct 17, 2020 at 02:35:25PM -0700, Daniel Silverstone wrote:
> I think that if zstd looks promising, the right thing will be to swap xz for it. gzip is a good idea to retain long-term for compatibility I guess.
Honestly, zstd isn't hard to come by. I don't think it'd be unreasonable
to drop gzip as long as rustup supports zst.
|
From a rustup perspective we have to support everything ever published to a
release channel, so swapping isn't really what we'd do 😄
|
I'm not suggesting that rustup can drop the support, since they need to
be able to install old releases. I'm suggesting that the release
channels could, eventually, drop other formats.
|
The nice thing is that once the client has zst support, the server-side can decide how aggressively to compress for each tarball. For instance, if stable artifacts get more downloads, we could spend more resources compressing them. |
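As a rough illustration of that server-side choice, here is a minimal sketch that maps a release channel to a zstd compression level. The `Channel` enum, the `zstd_level_for` function, and the specific level numbers are assumptions made up for this example; they are not taken from the Rust release tooling, and no timings back them.

```rust
/// Release channels, named after the ones discussed in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Channel {
    Stable,
    Beta,
    Nightly,
}

/// Hypothetical policy: spend more compression effort on channels that
/// are downloaded more often. The level numbers are illustrative only.
fn zstd_level_for(channel: Channel) -> i32 {
    match channel {
        // Slow to compress, smallest output; worth it for widely downloaded releases.
        Channel::Stable => 19,
        // A middle ground for pre-releases.
        Channel::Beta => 12,
        // Keep CI time low for artifacts produced every day.
        Channel::Nightly => 6,
    }
}

fn main() {
    for channel in [Channel::Stable, Channel::Beta, Channel::Nightly] {
        println!("{:?}: zstd level {}", channel, zstd_level_for(channel));
    }
}
```

Because the client only ever decompresses, a table like this could be retuned on the server at any time without a rustup release.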
Currently the channel manifest toml only has gz and bz2 support, and not in an entirely extensible way. I think that we need to define how the channel toml will represent the available compression formats, and thence what is considered 'acceptable' in terms of available formats. E.g. would a channel manifest with only I've given this some thought already and will continue to do so, and may bring it up for wider discussion at the next dev-tools sync. |
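One way to picture an extensible representation is a list of artifacts per component, each tagged with its format, plus a rule for when an entry is usable. The sketch below is only that: the `Format`, `Artifact`, and `is_acceptable` names, the example URLs, and the acceptability rule itself are assumptions for illustration, not the actual manifest schema or rustup's types.

```rust
/// Compression formats mentioned in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Gz,
    Xz,
    Zst,
}

/// One downloadable artifact for a component (hypothetical shape).
#[derive(Debug, Clone)]
struct Artifact {
    format: Format,
    url: String,
    hash: String,
}

/// Hypothetical rule: a manifest entry is acceptable to a client if the
/// client can decompress at least one of the formats it lists.
fn is_acceptable(artifacts: &[Artifact], supported: &[Format]) -> bool {
    artifacts.iter().any(|a| supported.contains(&a.format))
}

fn main() {
    // A made-up entry that only offers a zst tarball.
    let zst_only = vec![Artifact {
        format: Format::Zst,
        url: "https://example.invalid/rustc.tar.zst".to_string(),
        hash: "placeholder-hash".to_string(),
    }];

    let old_client = [Format::Gz, Format::Xz];
    let new_client = [Format::Gz, Format::Xz, Format::Zst];

    for artifact in &zst_only {
        println!("{:?} {} {}", artifact.format, artifact.url, artifact.hash);
    }
    println!("old client can install: {}", is_acceptable(&zst_only, &old_client));
    println!("new client can install: {}", is_acceptable(&zst_only, &new_client));
}
```

Under this rule an older client simply sees a zst-only entry as unusable, which is the kind of behaviour the manifest definition would need to pin down.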
True, not bz2, braino on my part :D |
@joshtriplett If you use a master build of I won't be advertising this beyond a changelog entry along the lines of |
@89z I didn't say that any of the channels had been updated, merely that the in-development branch of |
@kinnison Thank you! |
I'm working on patches to try this in the Rust distribution process. Right now, those patches are waiting on gyscos/zstd-rs#117 to go into a released version of the zstd crate, because Rust tarballs benefit greatly from zstd's long-distance matching mode. |
Great to hear - if you need any extra work on our side to enable your experiments, please let me know. I fully expect that we'll not have done everything right first time :D |
I worry about the reference to memory footprint: we have literally just
resolved issues extracting large files from tarballs on platforms like
raspberry pi; requiring hundreds of MB of RAM in working set to decompress
will not work for a significant number of use cases.
|
@rbtcollins I'd expect zst to typically take less memory on decompression, and much less decompression time (by a factor of 10). And gzip isn't going away. |
Just a note, I'm not sure whether it will be feasible to provide three compression formats. Our releases are already huge and we store them forever, so adding more duplication could be an issue. The infra team would need to discuss what we're going to serve to users. |
Rustup would need to figure out logic for when to choose which one as
well. Probably solvable, but not zero cost.
|
@pietroalbini I'm not expecting that we should supply three. I'm hoping that we move from gz/xz to gz/zst. I'm working on a proposal for that. |
Describe the problem you are trying to solve

As per #1858 and other places, it would be good to support zstd compression for component packages in channels. This would allow for smaller, faster-to-decompress components which is a benefit all round.

Describe the solution you'd like

Support for .zst as a file extension and zstd decompression as part of Rustup.

Notes

This ought mostly to be constrained to the src/dist/ tree, particularly the manifestation.rs and component/package.rs files. Some test support will also be needed.

Since zstd generally compresses better and decompresses more quickly than xz, we should prefer .zst over .xz where it is present in the manifest.

Once implemented, the compiler team could start to produce channels with zstd compressed content.
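The preference described in the notes could look roughly like the following sketch, which detects the format from a file name and picks the most preferred one on offer. The function names, the ranking numbers, and the example file names are hypothetical; this is not the code in manifestation.rs or component/package.rs.

```rust
/// Compression formats a client might be offered.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    Gz,
    Xz,
    Zst,
}

/// Detect the format from the tarball's file name, if it is one we know.
fn format_from_name(name: &str) -> Option<Format> {
    if name.ends_with(".tar.zst") {
        Some(Format::Zst)
    } else if name.ends_with(".tar.xz") {
        Some(Format::Xz)
    } else if name.ends_with(".tar.gz") {
        Some(Format::Gz)
    } else {
        None
    }
}

/// Higher rank means more preferred: .zst over .xz over .gz.
fn preference(format: Format) -> u8 {
    match format {
        Format::Zst => 2,
        Format::Xz => 1,
        Format::Gz => 0,
    }
}

/// Pick the most preferred tarball among the names on offer,
/// ignoring anything with an unrecognised extension.
fn pick_preferred<'a>(names: &[&'a str]) -> Option<&'a str> {
    names
        .iter()
        .copied()
        .filter_map(|name| format_from_name(name).map(|f| (preference(f), name)))
        .max_by_key(|(rank, _)| *rank)
        .map(|(_, name)| name)
}

fn main() {
    let offered = ["rustc.tar.gz", "rustc.tar.xz", "rustc.tar.zst"];
    println!("{:?}", pick_preferred(&offered));
}
```

Keeping the ranking in one small function means that older channels, which offer no .zst file, fall back to .xz or .gz without any special casing.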