Proposal: Mirroring Support for Private Registries #9161

Closed
wants to merge 2 commits into from


jlhawn
Contributor

@jlhawn jlhawn commented Nov 14, 2014

Docker-DCO-1.1-Signed-off-by: Josh Hawn [email protected] (github: jlhawn)

@stevenschlansker

I can mirror my private registries with this patch! This is amazing :)

@bordicon

So amazing :D

@tarnfeld

Nice one. 👍

@jlhawn
Contributor Author

jlhawn commented Nov 14, 2014

There's probably a bunch of refactoring that would need to be done before this could get merged though.

@SvenDowideit
Contributor

and documentation! user facing wise, this is huge!! ;)

@stevenschlansker

Documentation is in jlhawn#1

@jlhawn jlhawn force-pushed the private_mirror branch 3 times, most recently from fa2814f to cef6135 Compare December 4, 2014 19:26
@stevenschlansker

@SvenDowideit we have some documentation now, anything else you would like to see on this PR? Would love to get it in for 1.4.0 if possible, happy to work on any further improvements needed.

@bjaglin
Contributor

bjaglin commented Dec 10, 2014

This is highly anticipated indeed, especially since https is now more or less enforced (which is a good thing) so using a local squid3 is not an easy option anymore. I wonder why the support for mirrors did not include private registries from the start? Is there a gotcha I am missing here?

@bjaglin
Contributor

bjaglin commented Dec 10, 2014

Never mind my last sentence above - I just had a look at the diff and realized it was still a proposal. @jlhawn is help wanted on this one or is this already ongoing?

@jessfraz jessfraz added this to the next milestone Dec 10, 2014
@jessfraz
Contributor

Added to next so we can review after 1.4.0 release tomorrow.

@dmp42
Contributor

dmp42 commented Dec 10, 2014

Making mirroring a first class citizen is definitely a target for 1.5
cc @stevvooe

Specifically addressing the engine performance issues (docker-archive/docker-registry#785), usability, namespace issues, and caching as part of the registry rewrite. In its current state, adding this feature to v1 is problematic.

When an image is pulled from its upstream registry, if one or more
`--registry-mirror=http://<my-docker-mirror-host>` options are specified, the given mirrors are checked
in order to see if they have a cached version before using the normal upstream registry.

Contributor


These lines look longer than 80 chars, but I'm not counting yet :)

@SvenDowideit
Contributor

LGTM - though now I'm curious if we need some way to configure images that should not be mirrored :/

@stevenschlansker

@SvenDowideit shouldn't that be done by the registry? If it refuses to mirror an image it can simply not return it, and then the docker client will continue trying mirrors in order.

It's much easier to manage on the registry side as well - no need to roll out updates to the docker configuration on every client machine just to refuse to mirror an image.
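The fallback behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the code from this PR or from the Docker client: `pull_with_mirrors`, `toy_fetch`, and the in-memory `REGISTRIES` table are hypothetical names standing in for real HTTP requests.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical fetcher type: takes (endpoint, image) and returns the image
# bytes, or None when that endpoint does not have (or refuses to serve) it.
Fetcher = Callable[[str, str], Optional[bytes]]


def pull_with_mirrors(image: str, mirrors: Iterable[str], upstream: str,
                      fetch: Fetcher) -> Tuple[str, bytes]:
    """Try each mirror in the order given, then fall back to the upstream."""
    for endpoint in mirrors:
        data = fetch(endpoint, image)
        if data is not None:
            return endpoint, data
    # No mirror served the image, so do the same pull as without mirrors.
    data = fetch(upstream, image)
    if data is None:
        raise LookupError(f"{image} not found on any mirror or on {upstream}")
    return upstream, data


# Toy in-memory "registries" (assumption: stand-ins for real endpoints).
REGISTRIES = {
    "mirror-a": {},
    "mirror-b": {"foo/bar:1.0.0": b"cached"},
    "upstream": {"foo/bar:1.0.0": b"origin", "foo/baz:2.0.0": b"origin"},
}


def toy_fetch(endpoint: str, image: str) -> Optional[bytes]:
    return REGISTRIES[endpoint].get(image)
```

In this sketch a mirror that declines to serve an image simply returns nothing, and the client moves on to the next endpoint, so no client-side configuration is needed to exclude an image.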

I think having a simple and straightforward path to mirroring available now is much more important than making it very configurable - presumably the "v2 registry" feature will have all the configuration knobs. I just need some way to distribute images and can't wait indefinitely! The poor guys over in the EU already hate docker because of this, between the network latency and the fact that 'docker pull' is really really slow...

@SvenDowideit
Contributor

correct - it would be expected to be registry functionality :) and certainly not for this PR

the guys in the EU have a much better situation than those of us in AU - our internet noodle is almost back to modem standard.

@jessfraz jessfraz removed this from the next milestone Dec 19, 2014
@bobrik
Contributor

bobrik commented Dec 29, 2014

If we are going to merge that PR without support from the registry side, local registries will be unnecessarily mirrored. I have a local registry holding X TB in s3-like storage, and I mirror docker hub to the same s3-like storage, but under a different path.

After an upgrade without registry support I'll need 2X TB in s3-like storage, plus more traffic, more io, and who knows what else. That doesn't make a lot of sense.

@stevvooe
Contributor

@bobrik We are working on registry support to make this more efficient. The plan is to have pull-through mirroring, such that a local registry, configured as the mirror in the client, can check local storage and fall back to a remote registry, such as docker hub, if the content is unavailable locally. This should massively cut incoming bandwidth at the cost of some extra local storage.

If this doesn't work for you, please let me know why (detail about your setup may help to clarify your position).
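To make the pull-through idea above concrete, here is a minimal sketch in Python. It is not the V2 registry design, only an illustration under stated assumptions: `PullThroughRegistry`, `fetch_remote`, and the in-memory dictionary used as "local storage" are all hypothetical.

```python
class PullThroughRegistry:
    """Serve from local storage; on a miss, fetch remotely and cache."""

    def __init__(self, fetch_remote):
        # fetch_remote(name) -> bytes is an assumed callable that talks to
        # the remote registry and raises KeyError if the content is missing.
        self._local = {}
        self._fetch_remote = fetch_remote
        self.remote_fetches = 0  # counts successful remote fetches

    def push(self, name, blob):
        # Local pushes land directly in local storage.
        self._local[name] = blob

    def pull(self, name):
        if name in self._local:
            return self._local[name]
        blob = self._fetch_remote(name)  # fall back to the remote registry
        self.remote_fetches += 1
        self._local[name] = blob         # cache for later pulls
        return blob
```

Under these assumptions only the first pull of a given name crosses the remote link; every later pull is served from local storage, which is the bandwidth-for-storage trade described in the comment.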

@stevvooe
Contributor

@bobrik Wanted to clarify that my comments above are related to the V2 registry.

@bobrik
Contributor

bobrik commented Dec 30, 2014

@stevvooe my setup is:

  1. Local registry
  2. Local mirror

My idea is that the local mirror should not do any mirroring for the local registry, since it would only increase bandwidth usage. Mirroring remote registries and docker hub is perfectly fine.

Is there an issue where mirroring is discussed for the v2 registry?

@SvenDowideit
Contributor

Docs LGTM, once the lines are made 80 char or less

@fredlf @jamtur01

@stevvooe
Contributor

@bobrik The discussion is in docker-archive/docker-registry#658, albeit it's a little thin currently. I'll put together a plan there in the next few weeks. For your setup, the plan for V2 is to only require a single registry instance for this use case. Push/pull would be allowed locally, with fallback to a remote registry along with caching of the remote resources.

I don't want to promise you too much now, but adding your use case to the comments will be helpful when the design gets fully fleshed out.

@stevenschlansker

@SvenDowideit the documentation in that file already has multiple lines that stretch to over 100 chars, and some even reaching 200 chars. But just for you I'll fix it anyway :P

@jlhawn mind merging jlhawn#2 ?

@tarnfeld

tarnfeld commented Jan 6, 2015

> @tarnfeld that person was hitting their mirror as well, and everything was configured like it should - still pulling from their mirror was not any faster than from the hub itself, for the reason I mentioned (lies in docker itself, see below).

Sorry, perhaps I didn't explain my point well enough. In the testing I did, the mirror was never hit. I verified the code path with verbose logging: the mirroring code is never reached in the client, which is why this fix is being proposed. The client always went to the primary registry directly regardless of the mirror configuration, because the code does not allow private registries to go via mirrors.

It's also worth bearing in mind that not all registries are being powered by the docker-registry project.

@shin-
Contributor

shin- commented Jan 6, 2015

There is a lot of confusion here regarding what we call mirrors in the docker daemon, and what we call mirrors in the docker-registry code, which makes the conversation all the more difficult to follow.

@SvenDowideit
Contributor

as an anecdote :)

before the --registry-mirror code was released working, hub downloads of the golang image used up about 80% of my bandwidth quota for the preceding month or 2 (whenever we started using it for boot2docker). now, it is not quite irrelevant.

each fresh test of a boot2docker vm would re-download the docker project build images, which would take on the order of an hour, whereas now it's close enough to instant that i don't have time to make another coffee.

so I imagine for a distributed organisation using private registries for their internal workflows, the problem is just as big a deal.

@bobrik
Contributor

bobrik commented Jan 7, 2015

Imagine that you deploy an 800mb app (2 layers) to 150 machines 20-40 times a day. You deploy it from a private registry and you also have a mirror set up because docker hub is giving you images at 14kbps.

Right now the workflow is:

  1. Push image to local registry
  2. Pull image from local registry on every node directly

With this change:

  1. Push image to local registry
  2. Pull image from local mirror, figure out that it's not there (images deployed once, why would it be there?), start mirroring in 150 parallel forks.
  3. If that worked, pull image from mirror that is not a bit faster than local registry because it shares the same distributed storage backend.

Why do that? I don't see a reason to break existing behavior, since we cannot disable mirroring for local registries.

If you really need to mirror private registries, go ahead and set up a caching nginx in front of your private registry; it is just a highly cacheable http service after all. You have a solution. I don't see what solution I have except "avoid docker upgrades before v2 registry is out".
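For a sense of scale, the deployment figures in the comment above (800 MB image, 150 machines, 20-40 deploys a day) can be turned into a rough upper bound on daily transfer. The sketch assumes every pull moves the full image, which ignores layer reuse between deploys, and uses decimal units; both are assumptions, not measurements from the thread.

```python
IMAGE_MB = 800            # image size from the comment
MACHINES = 150            # target machines from the comment
DEPLOYS_PER_DAY = (20, 40)  # low and high end of the stated range


def daily_transfer_tb(image_mb, machines, deploys):
    """Upper bound in TB/day, assuming each pull moves the whole image."""
    return image_mb * machines * deploys / 1_000_000  # MB -> TB (decimal)


per_deploy_gb = IMAGE_MB * MACHINES / 1000
low_tb, high_tb = (daily_transfer_tb(IMAGE_MB, MACHINES, d)
                   for d in DEPLOYS_PER_DAY)

print(f"per deploy: {per_deploy_gb:.0f} GB")
print(f"per day:    {low_tb:.1f} to {high_tb:.1f} TB")
```

Under those assumptions a single deploy moves about 120 GB and a day moves between 2.4 and 4.8 TB, so routing the same traffic through an extra hop roughly doubles the bytes handled by the storage backend.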

@tarnfeld

tarnfeld commented Jan 7, 2015

@bobrik That is a fair point. However, when you're deploying a 3GB image once a month and you have lots of new machines coming in and out of the cluster, it's a pain. Mirroring helps solve that problem because you can aggregate the downloads. When you're frequently changing images you have a slightly different problem: you simply have to ship the bytes at some point, that's always going to be as slow as the first pull, and the more contention you have, the slower it gets.

> Pull image from local mirror, figure out that it's not there (images deployed once, why would it be there?), start mirroring in 150 parallel forks.

I've put a diagram together to hopefully clear up any confusion of what we're trying to achieve.

Note: Please ignore the practicalities of the fact we have two locations (or regions) but only one single link :-)

Without docker mirroring and a single registry

[Diagram: without mirror]

With docker mirroring and a single registry

[Diagram: with mirror]

It's important to note that we're not trying to save on bandwidth as such (in terms of total bytes transferred) though that is a side bonus of doing this, but trying to reduce network contention. As you can see 76 seconds vs 17 seconds is a pretty significant speed up.
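The contention argument can be illustrated with a deliberately simple model. The diagrams are not available here, so this sketch does not reproduce the 76 second and 17 second figures; the host count, image size, and link speeds below are hypothetical, and the model assumes all hosts share one inter-site link and one mirror uplink.

```python
def time_without_mirror(hosts, image_mb, wan_mb_per_s):
    """Every host pulls the full image over the shared inter-site link."""
    return hosts * image_mb / wan_mb_per_s


def time_with_mirror(hosts, image_mb, wan_mb_per_s, lan_mb_per_s):
    """One pull crosses the inter-site link; hosts then share the mirror's
    local uplink. The two phases are treated as sequential (pessimistic)."""
    fill = image_mb / wan_mb_per_s             # mirror fetches the image once
    fan_out = hosts * image_mb / lan_mb_per_s  # hosts pull from the mirror
    return fill + fan_out


# Hypothetical numbers: 10 hosts, 1000 MB image, 100 MB/s between sites,
# 1000 MB/s inside the site.
without = time_without_mirror(10, 1000, 100)
with_mirror = time_with_mirror(10, 1000, 100, 1000)
print(without, with_mirror)
```

With these made-up inputs the shared link carries ten copies of the image in one case and a single copy in the other, which is where the speedup in such a model comes from.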

> If you really need to mirror private registries, go ahead and set up a caching nginx in front of your private registry; it is just a highly cacheable http service after all. You have a solution. I don't see what solution I have except "avoid docker upgrades before v2 registry is out".

This change does not preclude simply using a caching proxy. If you use a caching nginx proxy, you still need to point the docker client (daemon) at the proxy instead of the primary registry, which is currently not possible in docker for images NOT hosted on index.docker.io. As far as I am aware, this change is not at all tied to the docker/docker-registry project.

The main point of allowing mirrors to be configured at the daemon level is to abstract and remove the concept from the user. Mirrors may be configured in a cluster to improve performance based on network layout or security constraints, which is not something a user that's launching a docker container needs to be concerned with. It's also very nice to be able to use the same string (e.g. registry.my-net.internal/foo/bar:1.0.0) to describe your image regardless of which site it's launching in, and therefore which mirror to use.

@bobrik
Contributor

bobrik commented Jan 7, 2015

Your "private mirror" could be replaced with nginx that only does caching (with locking) for layer files. Just point the private registry domain at that nginx in every location. I fully understand your goals and desire to do that with the registry itself.

My concern is that now we are breaking my existing setup without giving me a way to fix it. I have a 1gb link to my mirror and a 1gb link to my registry.

[host] - 1gb -> [local registry] changes to [host] - 1gb -> [mirror] - 1gb -> [local registry].

Just let me have the old way without an extra hop. I'd love to see mirroring for private registries, but with a way to disable it for local ones.
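The "caching (with locking)" idea mentioned above, where many simultaneous requests for the same layer trigger only one upstream fetch, can be sketched like this. It is an illustration in Python, not nginx's implementation; `CoalescingCache` and `slow_fetch` are hypothetical names, and the 150 threads stand in for the 150 machines from the earlier comment.

```python
import threading
import time


class CoalescingCache:
    """Cache where concurrent misses for one key share a single fetch."""

    def __init__(self, fetch_upstream):
        self._fetch_upstream = fetch_upstream  # assumed: key -> bytes
        self._cache = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the shared dictionaries
        self.upstream_fetches = 0

    def get(self, key):
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:  # only one caller per key fetches; the others wait here
            if key not in self._cache:
                blob = self._fetch_upstream(key)
                with self._guard:
                    self._cache[key] = blob
                    self.upstream_fetches += 1
            return self._cache[key]


def slow_fetch(key):
    time.sleep(0.05)  # stand-in for a slow upstream transfer
    return b"layer-bytes"


cache = CoalescingCache(slow_fetch)
threads = [threading.Thread(target=cache.get, args=("layer-1",))
           for _ in range(150)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the per-key lock, each of the 150 callers could miss the cache at the same moment and start its own upstream transfer, which is the "150 parallel forks" concern raised earlier in the thread.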

@tarnfeld

tarnfeld commented Jan 7, 2015

@bobrik This is an optional configuration. A mirror is not required to be able to use a private registry, nor will it be with this change.

> I fully understand your goals and desire to do that with the registry itself.

This change has nothing to do with the actual registry, this is a configuration option in the docker client which allows you to tell it to point to the [mirror]. Without this, your example [host] - 1gb -> [mirror] - 1gb -> [local registry] is impossible.

Is your concern that you would like to use your registry as a mirror for index.docker.io BUT not for your own private images? If so, that's OK, this won't cause a problem for you.

@bobrik
Contributor

bobrik commented Jan 7, 2015

> Is your concern that you would like to use your registry as a mirror for index.docker.io BUT not for your own private images?

Exactly. I want to use mirroring registry for index.docker.io and all private registries, except my own registry that cannot become faster with mirroring.
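One way to picture the request above is a per-registry bypass list consulted before the mirrors. Nothing in this thread says such an option exists; `endpoints_for`, `registry_host`, `no_mirror_hosts`, and the host `mirror.local` are hypothetical, and the image-name parsing is a simplified assumption about how a registry host is recognised in an image reference.

```python
DEFAULT_REGISTRY = "index.docker.io"


def registry_host(image):
    """Simplified guess at the registry host of an image reference.

    Assumption: the first path component names a registry only if it looks
    like a host (contains '.' or ':' or is 'localhost').
    """
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return DEFAULT_REGISTRY


def endpoints_for(image, mirrors, no_mirror_hosts):
    """Return the endpoints to try, in order, for pulling `image`."""
    host = registry_host(image)
    if host in no_mirror_hosts:
        return [host]              # bypass: go straight to the registry
    return list(mirrors) + [host]  # mirrors first, then the real registry
```

For example, with `no_mirror_hosts={"registry.my-net.internal"}`, an image on that registry would be pulled directly with no extra hop, while an image such as `ubuntu:14.04` would still try the mirror before index.docker.io.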

@tarnfeld

tarnfeld commented Jan 7, 2015

I see, that makes sense. Intriguing that you're using multiple private registries but that's a legit issue. @stevenschlansker any thoughts?

@stevenschlansker

@bobrik @tarnfeld If your mirror has a mirror of public registry content but not your private registry, then when it fails to serve the image Docker will fall back to the 'real' private registry that you want it to read from. So as long as your 'mirror of index.docker.io' registry does not mirror your actual content, the only regression here is that it must check once that the image is not there before doing the same pull as it always did.
Do you agree?

@bobrik
Contributor

bobrik commented Jan 8, 2015

I agree. How do I do that?

To save on network bandwidth, it is useful to cache images close to the Docker
instances using them. When an image is pulled from its upstream registry, if
one or more `--registry-mirror=http://<my-docker-mirror-host>` options are
specified, the given mirrors are checked in order to see if they have a cached
Contributor


Where "in order" is left to right? Alphabetical?

@fredlf
Contributor

fredlf commented Jan 9, 2015

A copy-edit and a minor question. Once those nits are picked, docs LGTM.

@stevenschlansker

It looks like the merge of #8456 causes this entire patch to end up as a merge conflict.

It looks like that PR changes how mirrors are handled, without mentioning that fact at all in the description. I did a brief read-through of #8456 but can't see how to resolve this patch against that one. And I'm not terribly thrilled about rebasing this patch again. Maybe someone who knows Docker better than I do will care enough to rebase.

Steven Schlansker added 2 commits January 11, 2015 10:35
Docker-DCO-1.1-Signed-off-by: Josh Hawn <[email protected]> (github: jlhawn)
@crosbymichael
Contributor

@stevenschlansker

I'm a little confused, I don't see any code in this PR right now. Did you remove those commits or what's up?

I'm sure someone else or one of the maintainers can rebase the PR for you.

@jlhawn
Contributor Author

jlhawn commented Jan 12, 2015

@crosbymichael Yes, I removed the code commit for now because #8456 made it very different and now explicitly sets mirroring for the official "indexInfo" config. We'll have to rethink how to specify mirrors for private registries - probably a new CLI option on the daemon.

@jamtur01
Contributor

Docs LGTM

@icecrime
Contributor

Closing this as it is now tracked on docker/distribution (I guess distribution/distribution#19 is a good entrypoint).

@icecrime icecrime closed this Mar 24, 2015
@jlhawn jlhawn deleted the private_mirror branch July 31, 2015 18:03