[Contribution] qubes-updates-cache #1957
Comments
It's indeed a common problem when deploying Fedora VMs/containers, or with server farms. Debian has apt-cacher(-ng), but Fedora doesn't have anything similar. Solutions that came up:
Anyway, instead of having specific tools for each distro, it would be wiser to have a generic solution. |
Actually apt-cacher-ng works for Fedora too :) |
apt-cacher-ng works on Fedora for mirroring Debian stuff, but does it really work for mirroring (d)rpms/metadata downloaded with yum/dnf? From the doc [1]: "6.3 Fedora Core - Attempts to add apt-cacher-ng support ended up in pain and the author lost any motivation in further research on this subject." [1] https://www.unix-ag.uni-kl.de/~bloch/acng/html/distinstructions.html#hints-fccore |
Yes, I've seen this. But in practice it works. The only problem is […]

Best Regards, |
Marek Marczykowski-Górecki:
Can it also let through non-apt traffic? Specifically, I am wondering […] |
That's an interesting question - if you have an apt-cacher-ng instance handy, […]

Best Regards, |
I don't think there is a generic solution that works at the same time […]
What do you think?
I've read all the config, and tried; it does not seem possible, but never […] |
It will require more resources (memory), somewhat wasted when one uses, for example, only Debian templates. But maybe it is possible to activate those services on demand (socket activation comes to mind). It will be even easier for the qrexec-based updates proxy. |
I'm all for a 100% caching success rate with a specific mechanism for each distro, but do Qubes developers/contributors have time to develop/support that feature? |
I'm using a polipo proxy => tor to cache updates. I also modified the repo configuration to use one specific update server instead of dynamically selecting it. I'm planning to document my setup and will post a link here. |
Just wanted to throw in https://github.com/yevmel/squid-rpm-cache - I planned to set up a dedicated squid VM and use the above-mentioned config/plugin to cache rpms, but never found the time for it.
Currently I just use my NAS, which has a "normal" squid running as a caching proxy. I have an ansible script which generates my templates. In the templates I replaced the […] |
My experience with squid is horrible in terms of resources (RAM, I/O usage) for small setups. It looks like overkill for just downloading updates for a few templates from time to time. |
I don't like saying this, but we should also consider making this an additional, non-default option, or marking it wontfix. I like apt-cacher-ng very much and use it myself. However, introducing it by default into Qubes would lead to new issues, with more users having problems upgrading due to the added technical complexity. There are corner cases where apt-cacher-ng introduces new issues, such as showing […] |
FWIW I have squid installed on an embedded router (RB450g) for a 25+ people office and it's been running for literally ages without any problem. There's strict bandwidth control (delay pools), which is usually the biggest offender in terms of resources, but squid's memory usage has constantly been < 20 MB and highest CPU usage < 6%. Granted, the office's uplink speed is low - in the megabits/s range - but the resources available for the UpdateVM are in another league compared to the embedded stuff, and the setup - caching only - is not fancy. tl;dr, squid is not as bad as it used to be years ago.

The issues you mention reinforce my concern that it will be too time-consuming for Qubes devs to support distro-specific solutions. A simple generic one, even if not optimal, is still better than nothing at all, rather than "wontfix".

just my 2c - not pushing for anything, you guys are doing great work! |
At the very least, we should provide some documentation (or suggestions or pointers in the documentation) regarding something like @taradiddles's router solution. Qubes users are more likely than the average Linux user to have multiple machines (in this case, virtual) downloading exactly the same updates. |
Looks like what you want is Squid with an adaptive disk cache size (for storing packages in the volatile […]

OTOH, it's always a security footprint issue to run a larger codebase for a cache. Also, Squid caching can be ineffective if multiple VMs download files from different mirrors (remember that the decision of which mirror to use is left practically at random to the VM calling on the Squid proxy to do its job).

For those reasons, it may be wise to investigate solutions that do a better job of proxy caching by using a content-addressable store, or by matching file names. |
Perhaps a custom Go-based (to prevent security vulns) cache that can listen for requests using the net/http module, and proxy them to the VMs? This has potential to be a very efficient solution too, as a Go program would have a minuscule memory footprint. |
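For what it's worth, here is a minimal, hypothetical sketch of what such a net/http-based cache could look like. It is not taken from any existing project; the cache directory, listen address, and file-extension filter are all assumptions:

```go
// Hypothetical sketch of a Go package cache acting as a plain HTTP forward
// proxy: GET requests for *.rpm / *.deb files are served from an on-disk
// cache keyed by a hash of the full URL; everything else is passed through.
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "io"
    "log"
    "net/http"
    "os"
    "path/filepath"
    "strings"
)

const cacheDir = "/var/cache/pkg-proxy" // assumed location

func cachePath(url string) string {
    sum := sha256.Sum256([]byte(url))
    return filepath.Join(cacheDir, hex.EncodeToString(sum[:]))
}

func cacheable(r *http.Request) bool {
    return r.Method == http.MethodGet &&
        (strings.HasSuffix(r.URL.Path, ".rpm") || strings.HasSuffix(r.URL.Path, ".deb"))
}

func handler(w http.ResponseWriter, r *http.Request) {
    // As a forward proxy, r.URL holds the absolute upstream URL.
    url := r.URL.String()

    if cacheable(r) {
        if f, err := os.Open(cachePath(url)); err == nil {
            defer f.Close()
            io.Copy(w, f) // cache hit: serve the stored file
            return
        }
    }

    resp, err := http.Get(url) // cache miss or non-cacheable: fetch upstream
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadGateway)
        return
    }
    defer resp.Body.Close()
    w.WriteHeader(resp.StatusCode)

    if cacheable(r) && resp.StatusCode == http.StatusOK {
        if tmp, err := os.CreateTemp(cacheDir, "partial-"); err == nil {
            defer os.Rename(tmp.Name(), cachePath(url)) // runs after tmp.Close
            defer tmp.Close()
            io.Copy(io.MultiWriter(w, tmp), resp.Body) // stream to client and cache
            return
        }
    }
    io.Copy(w, resp.Body)
}

func main() {
    if err := os.MkdirAll(cacheDir, 0o755); err != nil {
        log.Fatal(err)
    }
    log.Fatal(http.ListenAndServe("127.0.0.1:8082", http.HandlerFunc(handler)))
}
```

A real implementation would also have to handle concurrent requests for the same file (see the concurrency concern raised further down) and HTTPS repositories, which arrive as CONNECT tunnels and bypass this kind of cache entirely.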
@Rudd-O Have a look at this https://github.com/mojaves/yumreproxyd |
Looking. Note we need something like that for Debian as well. |
The code is not idiomatic Go and there are some warts there that I would fix before including it anywhere. Just as a small example, on https://github.com/mojaves/yumreproxyd/blob/master/yumreproxy/yumreproxy.go#L33 you can see he is using a nil value as a sort of bool. That is not correct -- the return type should be (bool, struct). |
https://github.com/mojaves/yumreproxyd/blob/master/yumreproxy/yumreproxy.go#L73 <- also problematic. But the BIGGEST problem is that the program appears not to give a shit about concurrency. Save into cache and serve from cache can race, and no locking is performed, nor are channels being used there. Big fat red flag. The right way to do that is by communicating with the Cache aspect of the application through channels -- send a request to the Cache, await the response; if not available, then download the file, send it to the Cache for storage, and await the response.

Also, all content types returned are application/rpm. That's wrong in many cases.

BUT, that only means the project can be extended or rewritten, and it should not be very difficult to do so. |
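For illustration only, a minimal sketch of the channel-based pattern described above (all names and types here are invented, not taken from yumreproxyd): a single goroutine owns the cache map, and callers reach it exclusively through request/response channels, so no explicit locking is needed.

```go
// Sketch of a channel-owned cache: only loop() touches the map,
// so "save into cache" and "serve from cache" cannot race.
package main

import "fmt"

type getReq struct {
    key   string
    reply chan []byte // nil if the key is not cached
}

type putReq struct {
    key  string
    data []byte
    done chan struct{}
}

type Cache struct {
    gets chan getReq
    puts chan putReq
}

func NewCache() *Cache {
    c := &Cache{gets: make(chan getReq), puts: make(chan putReq)}
    go c.loop()
    return c
}

// loop is the only code that accesses the map, serializing all operations.
func (c *Cache) loop() {
    store := make(map[string][]byte)
    for {
        select {
        case g := <-c.gets:
            g.reply <- store[g.key]
        case p := <-c.puts:
            store[p.key] = p.data
            close(p.done)
        }
    }
}

func (c *Cache) Get(key string) ([]byte, bool) {
    reply := make(chan []byte)
    c.gets <- getReq{key: key, reply: reply}
    data := <-reply
    return data, data != nil
}

func (c *Cache) Put(key string, data []byte) {
    done := make(chan struct{})
    c.puts <- putReq{key: key, data: data, done: done}
    <-done
}

func main() {
    c := NewCache()
    if _, ok := c.Get("pkg.rpm"); !ok {
        // Cache miss: a real proxy would download the file here, then store it.
        c.Put("pkg.rpm", []byte("...payload..."))
    }
    data, _ := c.Get("pkg.rpm")
    fmt.Printf("cached %d bytes\n", len(data))
}
```

The trade-off is that every lookup pays a channel round-trip, but it cleanly serializes the two code paths that would otherwise race.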
I just uploaded the Squid-based https://github.com/rustybird/qubes-updates-cache (posted to qubes-devel too) |
The latest commit (-57 lines, woo) reworks qubes-updates-cache to act as a drop-in replacement for qubes-updates-proxy. No changes to the client templates are needed at all now. |
I can try to propose resurrecting it on the Fedora side. I'm not sure I currently have the bandwidth for it, but I could give it a try if it is worth doing.
Just to let you know that I've updated `apt-cacher-ng` and, as it was orphaned for 8+ weeks, I've requested a re-review: https://bugzilla.redhat.com/show_bug.cgi?id=1916884. On the Fedora devel list there is already one user who is pretty enthusiastic about this resurrection. You can already test it by using my COPR repository `fepitre/fedora`. I'm currently using it with success in a Fedora 32 AppVM. |
On Sat, Jan 16, 2021 at 12:51:32AM -0800, Frédéric Pierret wrote:
> Just to let you know that I've updated `apt-cacher-ng` and, as it was orphaned for 8+ weeks, I've requested a re-review: https://bugzilla.redhat.com/show_bug.cgi?id=1916884. On the Fedora devel list there is already one user who is pretty enthusiastic about this resurrection. You can already test it by using my COPR repository `fepitre/fedora`. I'm currently using it with success in a Fedora 32 AppVM.
When you say "with success" do you mean "with success for Fedora"?
Can you post your acng.conf file, because Fedora remains a PITA.
Are you rewriting the sources files in the templates?
|
Yes, success with Fedora 32. I'm currently running the build provided on my COPR repository. I'm using an almost-default conf: https://gist.github.com/fepitre/fd490e04fe92bd023f77f0e03984b05c and I'm having success caching Debian repositories. FYI, I'm not using the `qubes-updates-cache`. I've simply set up `apt-cacher-ng` in an AppVM, then `qvm-connect-tcp :aptcachervm:3142` from another Debian AppVM, and use the proxy setting `localhost:3142`. |
On Sat, Jan 16, 2021 at 07:50:08AM -0800, Frédéric Pierret wrote:
>> When you say "with success" do you mean "with success for Fedora"?
>> Can you post your acng.conf file, because Fedora remains a PITA.
>> Are you rewriting the sources files in the templates?
> Yes, success with Fedora 32. I'm currently running the build provided on my COPR repository. I'm using an almost-default conf: https://gist.github.com/fepitre/fd490e04fe92bd023f77f0e03984b05c and I'm having success caching Debian repositories. FYI, I'm not using the `qubes-updates-cache`. I've simply set up `apt-cacher-ng` in an AppVM, then `qvm-connect-tcp :aptcachervm:3142` from another Debian AppVM, and use the proxy setting `localhost:3142`.
I was asking if you had success in caching Fedora packages.
But you are skipping caching for most repositories. The config is the
difficult part.
|
@unman how would I check if Fedora packages are cached? I am using apt-cacher-ng in a Debian-based qube, based on your notes, and it appears to work. The download speed of the packages in the Fedora templates and the presence of the mirror sites as subdirectories in /var/cache make me think it works, but how can I be sure? ... and would that be unexpected? The discussion in this issue makes me doubt it. |
On Sun, Jan 17, 2021 at 05:18:14PM -0800, Sven Semmler wrote:
> @unman how would I check if Fedora packages are cached? I am using apt-cacher-ng in a Debian-based qube, based on your notes, and it appears to work. The download speed of the packages in the Fedora templates and the presence of the mirror sites as subdirectories in /var/cache make me think it works, but how can I be sure? ... and would that be unexpected? The discussion in this issue makes me doubt it.
That will depend on how the repositories are set in the yum.repos
definition, and the configuration.
If the config includes a general PassThroughPattern, then that content
will not be cached. Since the majority of fedora links are https by
default, without special treatment they will not be cached.
You can always confirm caching by examining the directories in
/var/cache/apt-cacher-ng with `du -sh`.
Also, if you use Remap, you've probably already discovered that the
default fedora definitions are wholly inadequate, and need
supplementing. That said, fedora caching seems to work reasonably well.
|
Thank you!
|
I think apt-cacher-ng as bundled by @unman deserves way more attention, even more so since the shaker project (which salts otherwise complex use cases) includes this caching proxy.
https://forum.qubes-os.org/t/simple-set-up-of-new-qubes-and-software/13064
https://qubes.3isec.org/tasks.html

@marmarek: @unman now releases the package under his own repo, alongside other spec files that deploy salt scripts and actually apply them as post-install scripts when the packages are installed from his repo. cacher is one of those packaged salt scripts. Spec file under the main project: https://github.com/unman/shaker/blob/main/cacher.spec

As referred to in closed issue unman/shaker#5 (comment), what is missing, without Qubes integration, is for the Qubes updater to reapply the wildcard sls, so that repositories entered as https are transformed to be cached when applying updates on cloned templates on the next run of the Qubes updater. There is no problem on a vanilla install if a user installs a repo and software on a single template with https links; it will pass. But a hook is missing from the Qubes updater so that, on the next update iteration, the links are transformed on the templates by applying qubesctl prior to running updates: https://github.com/unman/shaker/blob/main/cacher/change_templates.sls

@marmarek: is there any chance of some collaboration happening between the Qubes and shaker projects? This idea (packaged salt recipes, applied at install) is a life changer. And this issue (among the top 10 most-commented open issues) shows that this cacher is needed by a lot of people. And it now exists in "just works" (tm) mode. |
How do you imagine this notification? Is this a package manager hook that runs last and notifies in the DomU, or does the qube send a feature request to Dom0? |
Community Dev: @rustybird
PoC: https://github.com/rustybird/qubes-updates-cache
It's common for users to have multiple TemplateVMs that download many of the same packages when being individually updated. Caching these packages (e.g., in the UpdateVM) would allow us to download a package only once, then make it available to all the TemplateVMs which need it (and perhaps even to dom0), thereby saving bandwidth.
This has come up on the mailing lists several times over the years:
Here's a blog post about setting up a squid caching proxy for DNF updates on baremetal Fedora: