spec: support for passing client image name for mirroring use case #12
Comments
On Mon, Apr 16, 2018 at 10:35:39PM +0000, Derek McGowan wrote:
> So if a request is going to `localhost:5000/library/ubuntu`, it
> could mirror both `docker.io/library/ubuntu` and
> `quay.io/library/ubuntu` and switch based on request parameters.
This could also be a “caching proxy” use case, depending on when the
upstream requests happen. Just dropping in some additional keywords
in case that helps folks discover this issue again later on.
> Complicated registry configurations have been proposed to remedy
> this as well as a backwards incompatible approach of requesting as
> `localhost:5000/docker.io/library/ubuntu`.
Can you shed more light on why this is backwards incompatible? I
don't see wording in the current spec that would care about what goes
into the ‘<name>’ portion of the URLs besides [1]:
> Classically, repository names have always been two path components
> where each path component is less than 30 characters. The V2
> registry API does not enforce this…
If a registry was trying to mirror/proxy multiple upstream registries,
I don't see why the registry couldn't define a default (for any of
these approaches). For example, “when I get a two-component name, the
implicit first component is ‘docker.io’” (or whatever) as a local
policy. And without such a default, I don't see how it would support
clients who are only capable of creating two-component names.
Longer term, the default-component approach may run into issues
(e.g. if you wanted to mirror/proxy a namespace that didn't expect two
child components, e.g. example.com/ubuntu or
example.com/some/group/app). The default-name-component approach is
not forward-compatible with those cases, but that's a distinct issue
from backwards compatibility. And you could kludge around the
limitation with blacklists for defaults (e.g. “don't inject default
components if the given name's first component is example.com”). If
we go with the default component approach, folks maintaining default
components would ideally get their user-base upgraded to clients which
used fully-qualified names before the forward-compat issues became too
troublesome. If that timescale is expected to be very long (because
some clients will never upgrade?), then one of your “this channel
always contains the fully-qualified image name” approaches would be a
better choice.
[1]: https://github.com/opencontainers/distribution-spec/blob/f4f0ac08649d8eedbe9fc76aef248a84c3bcd827/spec.md#overview
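
The default-component policy discussed above can be sketched as follows. This is only an illustration of a possible local registry policy, not anything the spec defines: the names `DEFAULT_AUTHORITY`, `NO_DEFAULT_PREFIXES`, and `qualify` are hypothetical, and the choice of `docker.io` as the default is an assumption.

```python
# Hypothetical local policy: which authority to assume for two-component names.
DEFAULT_AUTHORITY = "docker.io"

# Hypothetical exception list: first components that are already authorities,
# so no default is injected even when the name has only two components.
NO_DEFAULT_PREFIXES = {"example.com"}


def qualify(name: str) -> str:
    """Return the name this registry would treat as fully qualified."""
    components = name.split("/")
    if components[0] in NO_DEFAULT_PREFIXES:
        # e.g. example.com/ubuntu: two components, but already qualified.
        return name
    if len(components) == 2:
        # Classic two-component name from an older client.
        return f"{DEFAULT_AUTHORITY}/{name}"
    return name


print(qualify("library/ubuntu"))
print(qualify("example.com/ubuntu"))
```

Without the exception list, `example.com/ubuntu` would be rewritten to `docker.io/example.com/ubuntu`, which is the forward-compatibility problem described above.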
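@wking The number of components has no relevance here. The specification does not define anything about the path components. The backwards incompatibility comes from existing clients and servers. If a client is upgraded and now starts requesting `localhost:5000/docker.io/library/ubuntu`, the registry would have to be configured to treat `docker.io` as the same as previous requests it had seen. If it was an older registry, then it would just not understand the request, forcing the client to resend the request without `docker.io`.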
On Mon, Apr 16, 2018 at 11:32:43PM +0000, Derek McGowan wrote:
> If a client is upgraded and now starts requesting
> `localhost:5000/docker.io/library/ubuntu`, the registry would have
> to be configured to treat `docker.io` as the same as previous
> requests it had seen. If it was an older registry, then it would
> just not understand the request, forcing the client to resend the
> request without `docker.io`.
Ah, I'd only considered old-client/new-registry above. I agree that
new-client/old-registry would need some sort of client fallback for
registries that didn't recognize the fully qualified name in the URL
path.
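
A minimal sketch of that client fallback, assuming the path-based approach: the client first requests the fully qualified name and, only on a 404, retries without the leading component. The `/v2/<name>/manifests/<reference>` path shape comes from the spec; the function names and the `get` callable (which returns an HTTP status code for a path) are hypothetical stand-ins for a real HTTP client.

```python
from typing import Callable, List, Optional


def candidate_paths(name: str, reference: str) -> List[str]:
    """Manifest paths to try: fully qualified first, then without the authority."""
    paths = [f"/v2/{name}/manifests/{reference}"]
    _first, _, rest = name.partition("/")
    if rest:
        paths.append(f"/v2/{rest}/manifests/{reference}")
    return paths


def fetch_with_fallback(
    name: str, reference: str, get: Callable[[str], int]
) -> Optional[str]:
    """Return the first path the registry accepts, or None if none works."""
    for path in candidate_paths(name, reference):
        status = get(path)
        if status == 200:
            return path
        if status != 404:
            # Only "not found" triggers the fallback; other errors stop here.
            return None
    return None
```

The cost of this approach is an extra round trip against every older registry, which is part of why the thread goes on to consider headers and query parameters.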
> Using headers or query parameters can be safely ignored by older
> registries or omitted by older clients.
So what would the logic for new clients be? Always set the
fully-qualified name in the query parameter (or wherever) and always
drop the leading component when constructing the URL path? That would
probably work, although it doesn't end up in a world where we could
eventually drop the query parameter. The spec already supports
version checks [1], perhaps we can do whatever for the remainder of v2
and then require fully-qualified names in the path once we cut a v3
API? That would at least restrict “feature probing” to the initial
version check that clients should be performing anyway (or should be
performing when their non-version request 404s ;).
[1]: https://github.com/opencontainers/distribution-spec/blob/f4f0ac08649d8eedbe9fc76aef248a84c3bcd827/spec.md#api-version-check
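
The new-client logic asked about above (always send the fully qualified name as a query parameter, always drop the leading component from the URL path) might look like the sketch below. The parameter name `oci-ref-name` is the example from the issue description, not a settled name; `build_manifest_url` and the `http` scheme are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_manifest_url(mirror: str, resolved_name: str, reference: str) -> str:
    """Build a manifest URL for a mirror, given the client-resolved name."""
    # Drop the leading component (the authority) from the URL path...
    _authority, _, repository = resolved_name.partition("/")
    # ...and carry the fully qualified name in the query string instead.
    query = urlencode({"oci-ref-name": resolved_name})
    return f"http://{mirror}/v2/{repository}/manifests/{reference}?{query}"


print(build_manifest_url("localhost:5000", "docker.io/library/ubuntu", "latest"))
```

An older registry that ignores unknown query parameters would serve this request exactly as it serves `library/ubuntu` today.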
Like the mirror/proxy function itself, this is not in the spec's scope, in my mind.
There hasn't been much talk about this issue. Is this something we want to put on the agenda for Wednesday's call or can we push this to a later release?
I am going to open up a PR for it this week. We can discuss the design further there. I think this is important to properly implement the mirroring use case in a less opinionated manner (currently a mirror can only mirror a single upstream registry).
How about implementing a
Mirrors should be mostly transparent to the client, kind of like setting an HTTP proxy. Also, the issue with the current situation is that the repository name used by registries does not contain the host name, which could lead to namespace collisions in the registry implementation in the mirroring case. Using the
I am working on a PR for this now. I will add a section under

Mirroring and Proxy Caching
Company X sets up an internal registry which is capable of storing local copies of images from any upstream registry.
I think
Also, is it possible for clients to use separate creds for the local and authority? |
Using the term "authority" here because a proxy is really required to delegate authority over content and access to that content to somewhere else. Whether it does that delegation by proxying is an implementation detail of the registry, same as how it constructs any proxy requests. One thing to consider though is the use of an HTTP header vs a query parameter. A query parameter gives better cacheability in cases where there could be an even less sophisticated HTTP cache in between. A query parameter would prevent identical requests returning different content based solely on a non-standard HTTP header. In that case we would have something like
Yes, I think a query parameter is better; otherwise, for caching you need to set `Vary` on a nonstandard header.
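
To illustrate the cacheability point, here is a toy model of an unsophisticated intermediate cache that keys only on the URL and ignores request headers. Everything in it is hypothetical (`NaiveCache`, `origin`, the response strings), and the names `OCI-REF-NAME` and `oci-ref-name` are the examples from the issue description.

```python
from urllib.parse import parse_qs, urlsplit


class NaiveCache:
    """URL-keyed cache that does not look at request headers (no Vary handling)."""

    def __init__(self, origin):
        self._origin = origin
        self._store = {}

    def get(self, url, headers=None):
        if url not in self._store:
            self._store[url] = self._origin(url, headers or {})
        return self._store[url]


def origin(url, headers):
    """Stand-in mirror: picks the upstream from the query string, else the header."""
    query = parse_qs(urlsplit(url).query)
    fallback = headers.get("OCI-REF-NAME", "unknown")
    name = query.get("oci-ref-name", [fallback])[0]
    return f"manifest for {name}"


cache = NaiveCache(origin)
path = "/v2/library/ubuntu/manifests/latest"

# Header variant: same URL, so the second request gets the first one's content.
a = cache.get(path, {"OCI-REF-NAME": "docker.io/library/ubuntu"})
b = cache.get(path, {"OCI-REF-NAME": "quay.io/library/ubuntu"})

# Query variant: the URLs differ, so the cache keeps the responses apart.
c = cache.get(path + "?oci-ref-name=docker.io/library/ubuntu")
d = cache.get(path + "?oci-ref-name=quay.io/library/ubuntu")
```

In the header variant `b` equals `a` even though a different upstream was requested, while `c` and `d` stay distinct.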
@justincormack my plan here is to PoC it in containerd, then open a PR for the spec here. I am not sure we have used a consistent naming scheme for what we call this; in containerd we usually call this part the
I think this is a good approach and the first step in separating registry location from the "authority". The eventual goal should be to encode the authority in the image name, but this will allow for cases where it is not. Do registries currently ignore this parameter?
There is an assumption today that a server implementation of the distribution specification will either not care about the name used by the client or that all requests will have a known common namespace. An example of this is the Docker Hub assuming that all requests are prefixed with `docker.io` even though the registry hostname is `registry-1.docker.io`. However, this has always caused difficulty when a client then wants to mirror content: `localhost:5000/library/ubuntu` could proxy to `registry-1.docker.io/library/ubuntu`, but `localhost:5000` could never proxy to anything else. Complicated registry configurations have been proposed to remedy this, as well as a backwards incompatible approach of requesting as `localhost:5000/docker.io/library/ubuntu`. However, a goal of this specification should be simplicity and backwards compatibility. I believe that a solution does belong in the specification to unlock the mirroring use cases without complicated configuration or DNS setup.

My proposal is to add a way to pass up the name resolved by the client to the registry (e.g. `docker.io/library/ubuntu`). So if a request is going to `localhost:5000/library/ubuntu`, it could mirror both `docker.io/library/ubuntu` and `quay.io/library/ubuntu` and switch based on request parameters. There are 2 possible ways to achieve this: one is by adding an HTTP request header (e.g. `OCI-REF-NAME: docker.io/library/ubuntu`), the other is by adding a query parameter (e.g. `?oci-ref-name=docker.io/library/ubuntu`). The first is clean but the second may be more useful for static mirroring. I am not suggesting one over the other yet, just stating the problem and solutions to discuss.
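
On the registry side, the proposal could be handled roughly as in the sketch below: the mirror reads the client-resolved name from the query parameter or the header and picks an upstream accordingly. This is a sketch under stated assumptions, not part of the spec: `UPSTREAM_HOSTS`, `DEFAULT_NAMESPACE`, and `resolve_upstream` are hypothetical, and giving the query parameter precedence over the header is an arbitrary choice.

```python
from typing import Dict, Tuple

# Hypothetical mirror configuration: namespace -> upstream registry host.
# docker.io maps to registry-1.docker.io, as noted in the description above.
UPSTREAM_HOSTS = {
    "docker.io": "registry-1.docker.io",
    "quay.io": "quay.io",
}

# Assumed default for older clients that send neither parameter nor header.
DEFAULT_NAMESPACE = "docker.io"


def resolve_upstream(
    repository: str, query: Dict[str, str], headers: Dict[str, str]
) -> Tuple[str, str]:
    """Return (upstream host, repository) for a request to the mirror."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    resolved = (
        query.get("oci-ref-name")
        or lowered.get("oci-ref-name")
        or f"{DEFAULT_NAMESPACE}/{repository}"
    )
    namespace, _, upstream_repository = resolved.partition("/")
    if namespace not in UPSTREAM_HOSTS:
        raise LookupError(f"no upstream configured for {namespace!r}")
    return UPSTREAM_HOSTS[namespace], upstream_repository


print(resolve_upstream("library/ubuntu", {"oci-ref-name": "quay.io/library/ubuntu"}, {}))
```

A request that carries neither the parameter nor the header falls back to the single default upstream, which matches how a mirror behaves today.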