media-types: Define '+gzip' structured syntax suffix #332

Closed
wants to merge 2 commits into from

Conversation

@wking wking commented Sep 20, 2016

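Leaning heavily on the existing entries in RFC 6839. The suffix makes it easy to clarify DiffIDs without requiring a particular layer media type. It also allows you to create image-layout instances where the layers are stored uncompressed, which may be useful for cases such as:

* Binary diffing between layer blobs for cheaper updates of large layers.
* Compressing an image-layout tarball for a smaller overall tarball (by avoiding the unnecessary fragmentation of compressing the individual blob entries).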

The suffix has been discussed briefly in #316 and it would make #328 and #330 easier to address.

@@ -32,7 +32,8 @@ This specification uses the following terms:
<dd>
A layer DiffID is a SHA256 digest over the layer's uncompressed tar archive and serialized in the descriptor digest format, e.g., <code>sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9</code>.
Layers must be packed and unpacked reproducibly to avoid changing the layer ID, for example by using tar-split to save the tar headers.
-NOTE: the DiffID is different than the digest in the manifest list because the manifest digest is taken over the gzipped layer for <code>application/vnd.oci.image.layer.tar+gzip</code> types.
+The DiffID is different than the layer digest in the <a href="manifest.md#image-manifest-property-descriptions">manifest's <code>layers</code></a> because the layer digest is taken over the blob regardless of compression, while the DiffID is taken after removing any compression.

I wonder if this could be phrased as "The DiffID can be different than the layer digest...". Can it be the case that the manifest refers to uncompressed blobs? Or is the spec recommending against this practice elsewhere?

@glestaris
Contributor

@wking I wonder how binary diffing can work with the tar format. Are there any tools you know of that can make sense of the binary diff between two tar files?

@vbatts
Member

vbatts commented Sep 20, 2016

@wking I wonder how binary diffing can work with the tar format. Are there any tools you know of that can make sense of the binary diff between two tar files?

It's not quite a binary diff of two tar files, though creating a
changeset from two tar archives ought to be feasible. I don't think
there is a tool for this yet.

@vbatts
Member

vbatts commented Sep 20, 2016 via email

@wking
Contributor Author

wking commented Sep 20, 2016

On Tue, Sep 20, 2016 at 08:44:19AM -0700, Vincent Batts wrote:

honestly, the wording for the suffix isn't so bad, but you've removed
all the places that use the suffix? That seems confusing.

In most places where we talk about these types, we're talking about
the layer semantics; the fact that that layer might be gzipped isn't
relevant. In all the places where the spec talks about a base media
type the +gzip suffix definition lets you use a gzipped form as well.
You could even use application/vnd.oci.image.manifest.v1+json+gzip if
you like in places where we talk about
application/vnd.oci.image.manifest.v1+json.

So I've left the +gzip on in the examples, but I don't think we need
to remind people about it much beyond that.
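As a rough illustration of the suffix rule described above, the sketch below strips a trailing `+gzip` from any media type to recover the base type. The function name `split_gzip_suffix` is hypothetical and is not taken from the spec or from this PR; it only assumes that the suffix is always the literal trailing string `+gzip`.

```python
from typing import Tuple

GZIP_SUFFIX = "+gzip"


def split_gzip_suffix(media_type: str) -> Tuple[str, bool]:
    """Return (base media type, True if the blob is gzipped).

    Hypothetical helper: plain string matching on the trailing "+gzip".
    """
    if media_type.endswith(GZIP_SUFFIX):
        return media_type[: -len(GZIP_SUFFIX)], True
    return media_type, False


# The same rule applies to a layer and to a manifest.
for media_type in (
    "application/vnd.oci.image.layer.tar+gzip",
    "application/vnd.oci.image.layer.tar",
    "application/vnd.oci.image.manifest.v1+json+gzip",
):
    print(media_type, "->", split_gzip_suffix(media_type))
```

Only the outermost suffix is removed, so `+json` stays part of the base manifest type.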

@wking
Contributor Author

wking commented Sep 20, 2016

On Tue, Sep 20, 2016 at 08:37:33AM -0700, Vincent Batts wrote:

@wking I wonder how binary diffing can work with the tar
format. Are there any tools you know of that can make sense of
the binary diff between two tar files?

It's not quite a binary diff of two tar files, though creating a
changeset from two tar archives ought to be feasible. I don't think
there is a tool for this yet.

Assuming you have the same file order, regular old diff is sufficient
for handling text files in tarballs:

$ wget https://archive.org/download/alicesadventures19033gut/19033.txt
$ tar -cf a.tar 19033.txt
$ sed -i s/Carroll/CARROLL/g 19033.txt
$ tar -cf b.tar 19033.txt
$ diff -au a.tar b.tar
--- a.tar 2016-09-20 09:47:43.837793647 -0700
+++ b.tar 2016-09-20 09:48:23.385795616 -0700
@@ -1,4 +1,4 @@
-19033.txt0000644000175000017500000022174610467303146011336 0ustar wkingwkingThe Project Gutenberg EBook of Alice in Wonderland, by Lewis Carroll
+19033.txt0000644000175000017500000022174612770264113011335 0ustar wkingwkingThe Project Gutenberg EBook of Alice in Wonderland, by Lewis CARROLL

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever. You may copy it, give it away or
@@ -8,7 +8,7 @@

Title: Alice in Wonderland

-Author: Lewis Carroll
+Author: Lewis CARROLL

Illustrator: Gordon Robinson

@@ -1338,7 +1338,7 @@

-End of the Project Gutenberg EBook of Alice in Wonderland, by Lewis Carroll
+End of the Project Gutenberg EBook of Alice in Wonderland, by Lewis CARROLL

*** END OF THIS PROJECT GUTENBERG EBOOK ALICE IN WONDERLAND ***

@wking
Contributor Author

wking commented Sep 20, 2016

On Tue, Sep 20, 2016 at 04:43:49AM -0700, George Lestaris wrote:

- NOTE: the DiffID is different than the digest in the manifest list because the manifest digest is taken over the gzipped layer for application/vnd.oci.image.layer.tar+gzip types.
+ The DiffID is different than the layer digest in the <a href="manifest.md#image-manifest-property-descriptions">manifest's <code>layers</code></a> because the layer digest is taken over the blob regardless of compression, while the DiffID is taken after removing any compression.

I wonder if this could be phrased as "The DiffID can be different
than the layer digest...". Can it be the case that the manifest
refers to uncompressed blobs?

The resulting digest might be the same, but the logic for getting
there is different (it's just that the layer decompress step might be
a no-op). How does the rephrasing in 16af37e → a7d5713 look to you?

Or is the spec recommending against this practice elsewhere?

I think using uncompressed layers in the manifest should be fine, if
you don't mind the blobs being a bit bigger.
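To make the distinction concrete, here is a minimal sketch of the two hashing paths. The function names `layer_digest` and `diff_id` and the sample tar contents are assumptions for illustration; the only behaviour taken from the thread is that the layer digest covers the blob as stored, while the DiffID is taken after removing any compression (a no-op for an uncompressed layer).

```python
import gzip
import hashlib
import io
import tarfile

TAR = "application/vnd.oci.image.layer.tar"
TAR_GZIP = "application/vnd.oci.image.layer.tar+gzip"


def sha256_digest(data: bytes) -> str:
    # Descriptor digest format: "<algorithm>:<hex>"
    return "sha256:" + hashlib.sha256(data).hexdigest()


def layer_digest(blob: bytes) -> str:
    # Taken over the blob exactly as stored, compressed or not.
    return sha256_digest(blob)


def diff_id(blob: bytes, media_type: str) -> str:
    # Remove compression first; for an uncompressed layer this step is a no-op.
    if media_type.endswith("+gzip"):
        blob = gzip.decompress(blob)
    return sha256_digest(blob)


def make_tar(name: str, content: bytes) -> bytes:
    # Build a one-entry tar archive in memory (sample data only).
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(content)
        tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


plain = make_tar("etc/hostname", b"example\n")
zipped = gzip.compress(plain)

print("same DiffID:", diff_id(plain, TAR) == diff_id(zipped, TAR_GZIP))
print("same layer digest:", layer_digest(plain) == layer_digest(zipped))
```

For the uncompressed blob the two values coincide, which is the case where the digests "might be the same" even though the logic differs.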

@wking
Contributor Author

wking commented Sep 20, 2016

On Tue, Sep 20, 2016 at 09:44:55AM -0700, W. Trevor King wrote:

Tue, Sep 20, 2016 at 08:44:19AM -0700, Vincent Batts:

honestly, the wording for the suffix isn't so bad, but you've
removed all the places that use the suffix? That seems confusing.

In most places where we talk about these types, we're talking about
the layer semantics; the fact that that layer might be gzipped isn't
relevant.

However, it is relevant when we equate our layer type with
application/vnd.docker.image.rootfs.diff.tar.gzip. I've added a note
to that effect (and included the rebuilt media-types.png) with a7d5713 →
96f3a67.

@vbatts
Member

vbatts commented Sep 20, 2016

On 20/09/16 09:52 -0700, W. Trevor King wrote:

Assuming you have the same file order, regular old diff is sufficient
for handling text files in tarballs:

You can choose this for yourself, but it's not an all-inclusive
solution, and will lead to unuseful churn in producing diffs that are not
actually different.

@wking
Contributor Author

wking commented Sep 21, 2016

On Tue, Sep 20, 2016 at 04:50:29PM -0700, Vincent Batts wrote:

On 20/09/16 09:52 -0700, W. Trevor King wrote:

Assuming you have the same file order, regular old diff is
sufficient for handling text files in tarballs:

You can choose this for yourself, but it's not an all-inclusive
solution, and will lead to unuseful churn in producing diffs that
are not actually different.

Agreed. I'm sure folks who want to put in more effort can do better.
It's possible that Rabin fingerprinting [1] (which IPFS optionally
uses) would be an easy win. And you could always create your own
diff/patch tool that understood that these were packed in tarballs,
although in that case you could teach it about gzip as well. I'm just
saying that even when you put in no effort, existing tools can already
help some cases (when you have lots of text-heavy files in your
tarball).
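The idea of content-defined chunking can be sketched as follows. This is not a true Rabin fingerprint (which works with polynomials over GF(2)) and it is not how IPFS implements it; it substitutes a simple polynomial rolling hash to show the principle. The function name `chunk`, the window size, and the divisor are all assumptions, and the sample bytes merely stand in for an uncompressed layer tarball.

```python
import hashlib


def chunk(data: bytes, window: int = 16, divisor: int = 64) -> list:
    """Split data wherever a rolling hash of the last `window` bytes hits 0 mod divisor."""
    base, mod = 257, 1_000_000_007
    top = pow(base, window, mod)  # weight of the byte that leaves the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = (h * base + byte) % mod
        if i >= window:
            # Drop the contribution of the byte that just slid out of the window.
            h = (h - data[i - window] * top) % mod
        if h % divisor == 0:
            chunks.append(data[start : i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


# Deterministic pseudo-random bytes standing in for an uncompressed tarball.
original = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(256))
# A one-byte insertion near the start shifts every later byte.
edited = original[:100] + b"X" + original[100:]

a, b = chunk(original), chunk(edited)
shared = set(a) & set(b)
print(f"{len(a)} chunks before, {len(b)} after, {len(shared)} distinct chunks shared")
```

Because boundaries depend only on nearby content, chunks after the insertion line up again, so an updater would only need to transfer the chunks around the edit. Gzip output gives no such guarantee, since a small change in the input can alter the rest of the compressed stream.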

@stevvooe
Contributor

There are a few problems with this PR. The first is that we don't want to introduce CAS concepts at the config level. The diff_ids are mainly just for consistent hashing. Referencing media types for this is just confusing.

The main problem is that we introduce structure to what should just be string constants. The media types should just string matching.

@wking
Contributor Author

wking commented Sep 22, 2016

On Wed, Sep 21, 2016 at 04:52:37PM -0700, Stephen Day wrote:

The diff_ids are mainly just for consistent hashing. Referencing
media types for this is just confusing.

The media-type reference for diff_ids is an informative example, since
with this PR it is easy to give explicit media types for the gzipped
and uncompressed layer tarballs. I think that helps clarify the idea,
but if you think it adds confusion, I'm happy to drop the line.

The main problem is that we introduce structure to what should just
be string constants. The media types should just string matching.

They're called “structured syntax suffixes” for a reason ;). I agree
with the RFC 6839 authors that having a general rule for naming media
types that use a particular format (e.g. gzip) underneath is useful. It
certainly scales better than explicitly naming gzipped and
uncompressed versions of media types as the number of OCI-related
media types grows. And client libraries can still use string
matching.

But “scales better” only matters if we actually do grow the number of
OCI-related media types, and that is still up in the air. If it would
make you more comfortable, I'm happy to define MediaTypeImageLayerGzip
and MediaTypeImageLayerNonDistributableGzip [1] as an addition to this
PR.

And if the idea of a structured syntax suffix makes you jumpy, I'm
happy to file an alternative PR that just defines
application/vnd.oci.image.layer.tar and
application/vnd.oci.image.layer.nondistributable.tar in addition to
their current +gzip forms without the RFC 6839 trappings. If we go
this route and then do see a rise in OCI-related media-types, we can
always come back to this PR.
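For comparison, the string-constant alternative might look like the sketch below: four explicit media types and an exact-match lookup, with no general suffix rule. The Python constant names are adaptations of the Go-style names mentioned in this thread, and `layer_is_gzipped` is a hypothetical helper.

```python
# Explicit constants; no structured-suffix rule is implied.
MEDIA_TYPE_IMAGE_LAYER = "application/vnd.oci.image.layer.tar"
MEDIA_TYPE_IMAGE_LAYER_GZIP = "application/vnd.oci.image.layer.tar+gzip"
MEDIA_TYPE_IMAGE_LAYER_NON_DISTRIBUTABLE = (
    "application/vnd.oci.image.layer.nondistributable.tar"
)
MEDIA_TYPE_IMAGE_LAYER_NON_DISTRIBUTABLE_GZIP = (
    "application/vnd.oci.image.layer.nondistributable.tar+gzip"
)

# Exact-match table: anything not listed here is simply unknown.
LAYER_IS_GZIPPED = {
    MEDIA_TYPE_IMAGE_LAYER: False,
    MEDIA_TYPE_IMAGE_LAYER_GZIP: True,
    MEDIA_TYPE_IMAGE_LAYER_NON_DISTRIBUTABLE: False,
    MEDIA_TYPE_IMAGE_LAYER_NON_DISTRIBUTABLE_GZIP: True,
}


def layer_is_gzipped(media_type: str) -> bool:
    """Hypothetical helper: look up compression by exact string match."""
    try:
        return LAYER_IS_GZIPPED[media_type]
    except KeyError:
        raise ValueError(f"unsupported layer media type: {media_type}") from None


print(layer_is_gzipped(MEDIA_TYPE_IMAGE_LAYER_GZIP))
```

The trade-off is visible in the table: each new compressed variant needs its own entry, whereas a suffix rule would cover it automatically.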

@stevvooe
Contributor

@wking 👎

I'm not wasting any more time explaining why these changes are a bad idea.

@wking
Contributor Author

wking commented Sep 22, 2016

On Thu, Sep 22, 2016 at 02:57:56PM -0700, Stephen Day wrote:

I'm not wasting any more time explaining why these changes are a bad
idea.

I'm proposing [1] ways to address both of your concerns [2]. If you
want me to drop the media types reference from the diff_ids comment, I
can do that [1]. If you want me to drop the structured syntax suffix
in favor of just defining the application/vnd.oci.image.layer.tar and
application/vnd.oci.image.layer.nondistributable.tar strings without
declaring a general pattern, I can do that too [2]. Is that how you'd
like to see your concerns addressed, or did you have a different
alternative in mind?

Or are you against allowing uncompressed layers at all? In that case
I expect you like #316 as it stands and can just close this PR.
Although the only reason I've heard for not allowing uncompressed layers is
the increased duplication risk [4], and I'd rather leave it to blob
authors to avoid duplication than to forbid uncompressed layers [5].
If we were really worried about the impacts of compression on layer
duplication (which I'm not), we'd require uncompressed blobs in the
CAS APIs, and use ephemeral compression while moving them over the
wire or saving them to disk.

@stevvooe
Contributor

@wking Dismissing is not the same as addressing. In practice, with CAS, allowing this variation at all causes a host of problems around hash stability and re-use of the existing body of content. For now, let's err on having just compressed layers. Structure can always be added but once you let this out of the bag, it cannot be returned.

@wking
Contributor Author

wking commented Sep 23, 2016

On Thu, Sep 22, 2016 at 05:56:42PM -0700, Stephen Day wrote:

@wking Dismissing is not the same as addressing.

That's certainly true, but I don't think I'm dismissing your concerns.
Which are (as well as I can tell):

  1. “we don't want to introduce CAS concepts at the config level. The
    diff_ids are mainly just for consistent hashing. Referencing media
    types for this is just confusing.” [1]

    I have offered to address this by removing the line which
    references media types [2,3], although personally I find that line
    useful. Is that the solution you would like to see for that
    concern? If not, what alternative solution are you looking for?

  2. “we introduce structure to what should just be string
    constants. The media types should just string matching.” [1]

    I have offered to file an alternative PR that just defines
    MediaTypeImageLayer, MediaTypeImageLayerGzip,
    MediaTypeImageLayerNonDistributable, and
    MediaTypeImageLayerNonDistributableGzip as strings without
    involving structured syntax suffixes [2,3], although personally I
    find the structured syntax suffix approach more scalable and
    understandable. Is that the solution you would like to see for
    that concern? If not, what alternative solution are you looking
    for?

  3. “allowing this variation at all causes a host of problems around
    hash stability and re-use of the existing body of content” [4]

    I don't think “folks can push layers with different hashes but the
    same semantic meaning” is a major problem [3]. See also the
    discussion around canonical JSON in #259 (“manifest json fields
    order”), where the consensus seems to have landed, rightly I think,
    around SHOULD canonicalize, not MUST canonicalize.

    And if it is a major problem, I think the right solution would be
    to define and require canonical tarballs (e.g. “entries MUST be
    collated in the C locale by directory, {further restrictions on
    the tar format}, and the tarball MUST be uncompressed”). Forcing
    folks to push their tar into an unstable compression format does
    not seem like a good way to encourage reproducible hashes.

    But if you still think it's a major problem and that requiring
    gzip compression is a good fix for it, then you should close this
    PR and require compression [4] (although I'd be very interested in
    knowing how requiring compressed tarballs helps stabilize hashes
    more than requiring uncompressed tarballs).
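The canonical-tarball idea in point 3 can be sketched as follows. This is only an approximation of the quoted wording: it sorts full entry names by byte value (which is how the C locale collates) rather than strictly "by directory", zeroes the timestamps, and leaves the archive uncompressed. The name `canonical_tar` and the sample files are hypothetical. The second half shows one reason gzip output is unstable: the gzip header embeds a modification time.

```python
import gzip
import io
import tarfile


def canonical_tar(files: dict) -> bytes:
    """Pack {name: content} into an uncompressed tar with a fixed entry order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        # Byte-wise ordering of names matches C-locale collation.
        for name in sorted(files, key=lambda n: n.encode("utf-8")):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = 0  # fixed timestamp so repacking is reproducible
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


# Same content, different insertion order: the packed bytes are identical.
first = canonical_tar({"b.txt": b"2", "a.txt": b"1"})
second = canonical_tar({"a.txt": b"1", "b.txt": b"2"})
print("canonical tar is stable:", first == second)

# Same tar, gzipped at two different times: the compressed bytes differ.
gz_early = gzip.compress(first, mtime=1)
gz_late = gzip.compress(first, mtime=2)
print("gzip output is stable:", gz_early == gz_late)
```

A compressor can be pinned to a fixed timestamp and level, but that is an extra constraint on every producer, which is the point made above about compression not helping hash stability by itself.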

Do you have further concerns which I have dismissed without addressing?

For now, let's err on having just compressed layers. Structure can
always be added but once you let this out of the bag, it cannot be
returned.

“Things work fairly well as they stand; let's not rock the boat on
this yet” is a perfectly understandable position. In that case, I
recommend you drop a “long-term” tag or some such on this PR, and come
back and consider it when you feel you have the time to decide
whether or not optional compression (narrowly) or a +gzip structured
syntax suffix (broadly) are useful additions. But I'd appreciate
something with more technical grounding than “I don't want to get into
this now” before we close this door forever.

@vbatts
Member

vbatts commented Oct 6, 2016

@wking I'm not strictly opposed to the possibility of non-gzipped tar archives, but I feel the way you've introduced the extent of it being optional here goes too far. Particularly, seeing the truncated form (without +gzip) seems to imply that this is the default. For Docker, this would break some notions of the cache that the registries hold.
Let's not get into the semantics of having the optional addition of +gzip in places, but stick to string constants for the media-types, with at most a paragraph regarding the structured syntax use of +gzip.

@wking
Contributor Author

wking commented Oct 6, 2016

On Thu, Oct 06, 2016 at 10:42:40AM -0700, Vincent Batts wrote:

particularly, seeing the truncated form (without +gzip) seems to
imply that this is the default.

There is no “default”; both are valid. But I've pushed 96f3a67 →
3cc036e, rebasing onto master and adding some “Entries in this field
will frequently…” weaseling (although I'd be happier without the new
line).

I've also added some language requiring OCI implementations to support
the +gzip suffix for OCI image types, just to make sure everyone's
clear that gzipped layers are legal.

For Docker, this would break some notions of the cache that the
registries hold.

I'm missing something here. How does registry cache come in? The
builder makes up a layer, gzips it (or not) as they see fit, pushes it
into CAS, and sticks the media type and digest in the ‘layers[]’
descriptor. Where does caching come in? And where does the breakage
come in?
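The flow described here can be sketched with an in-memory dictionary standing in for the content-addressable store. The function name `push_layer` and the dictionary-based CAS are assumptions for illustration; `mediaType`, `digest`, and `size` are the descriptor fields.

```python
import gzip
import hashlib

TAR = "application/vnd.oci.image.layer.tar"
TAR_GZIP = "application/vnd.oci.image.layer.tar+gzip"


def push_layer(cas: dict, tar_bytes: bytes, compress: bool) -> dict:
    """Store a layer blob in the CAS and return its layers[] descriptor."""
    if compress:
        blob, media_type = gzip.compress(tar_bytes), TAR_GZIP
    else:
        blob, media_type = tar_bytes, TAR
    # The blob is addressed by the digest of the bytes actually stored.
    digest = "sha256:" + hashlib.sha256(blob).hexdigest()
    cas[digest] = blob
    return {"mediaType": media_type, "digest": digest, "size": len(blob)}


cas = {}
layer = b"stand-in for tar bytes"  # sample data, not a real tar archive
gzipped = push_layer(cas, layer, compress=True)
uncompressed = push_layer(cas, layer, compress=False)
print(gzipped)
print(uncompressed)
print("blobs stored:", len(cas))
```

Pushing the same layer both ways stores two blobs under two digests, which is the duplication concern raised earlier in the thread; the descriptor's media type tells a consumer how to read each one.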

Let's not get into the semantics of having the optional addition of
+gzip in places, but stick to string constants for the
media-types, with at most a paragraph regarding the structured
syntax use of +gzip

That sounds like the bit under (2) in [1]. I'll file that alternative
PR in the next few days unless someone beats me to it (which is fine
with me). I still think it's more clear to just lay out the generic
pattern following the example set by RFC 6839. But the end result (as
far as OCI-defined types are concerned) will be the same either way.

@wking
Contributor Author

wking commented Oct 15, 2016

Rebased onto master with 3cc036e → 2829f04, resolving some minor conflicts and adding support for unpacking both gzipped and uncompressed layers. I expect unpackLayer will end up in image-tools, so I haven't invested a lot of time polishing this implementation. But without some sort of change the manifest tests fail.

wking added a commit to wking/image-spec that referenced this pull request Oct 15, 2016
I'd prefer defining this as a structured syntax suffix following RFC
6839, and have filed a pull request to that effect [1].  However, the
current maintainer consensus seems to be to define the compressed and
uncompressed types directly without declaring a structured syntax
suffix pattern [2].  I'm not clear on the reason for avoiding the
structured syntax suffix, but that's the route I've taken in this
commit.

Now that you can choose between compressed and uncompressed media types,
it is easy to clarify DiffIDs by comparing media types with and without
+gzip compression.  It also allows you to create
image-layout instances where the layers are stored uncompressed, which
may be useful for cases such as:

* Binary diffing between layer blobs for cheaper updates of large
  layers [3].

* Compressing an image-layout tarball for a smaller overall
  tarball (by avoiding the unnecessary fragmentation of compressing
  the individual blob entries).

Also update unpackLayer to handle both compressed and uncompressed
layers.  I expect unpackLayer will end up in image-tools, so I haven't
invested a lot of time polishing this implementation.  But without
*some* sort of change the manifest tests fail.

[1]: opencontainers#332
[2]: opencontainers#332 (comment)
[3]: http://ircbot.wl.linuxfoundation.org/eavesdrop/%23opencontainers/%23opencontainers.2016-08-16.log.html#t2016-08-16T23:35:43

Signed-off-by: W. Trevor King <[email protected]>
@wking
Contributor Author

wking commented Oct 15, 2016

On Thu, Oct 06, 2016 at 03:57:45PM -0700, W. Trevor King wrote:

Thu, Oct 06, 2016 at 10:42:40AM -0700, Vincent Batts:

Let's not get into the semantics of having the optional addition
of +gzip in places, but stick to string constants for the
media-types, with at most a paragraph regarding the structured
syntax use of +gzip

That sounds like the bit under (2) in [1]. I'll file that alternative
PR in the next few days…

Filed as #388.

@wking
Contributor Author

wking commented Oct 21, 2016

Rebased around #317 and #337 with b9173d5 → a288637.

wking added 2 commits October 21, 2016 12:46
Leaning heavily on the existing entries in RFC 6839.  The suffix makes
it easy to clarify DiffIDs without requiring a particular layer media
type.  It also allows you to create image-layout instances where the
layers are stored uncompressed, which may be useful for cases such as:

* Binary diffing between layer blobs for cheaper updates of large
  layers [1].

* Compressing an image-layout tarball for a smaller overall
  tarball (by avoiding the unnecessary fragmentation of compressing
  the individual blob entries).

[1]: http://ircbot.wl.linuxfoundation.org/eavesdrop/%23opencontainers/%23opencontainers.2016-08-16.log.html#t2016-08-16T23:35:43

Signed-off-by: W. Trevor King <[email protected]>
Generated with:

  $ make img/media-types.png

and Graphviz version 2.38.0.

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/image-spec that referenced this pull request Oct 21, 2016
@vbatts
Member

vbatts commented Nov 30, 2016

I like the notion of defining the structured syntax, but the mechanic of leaving it up to implementations for detecting and trying different suffixes seems to sprawl that clarity rather than refine it.
I'm closing this until the idea leads to refined media-types.

@vbatts vbatts closed this Nov 30, 2016
wking added a commit to wking/image-spec that referenced this pull request Jan 19, 2017