Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dockerfile (labs): implement ADD --checksum=<checksum> <http src> <dest> #3093

Merged
merged 1 commit into from
Oct 17, 2022

Conversation

AkihiroSuda
Copy link
Member

@AkihiroSuda AkihiroSuda commented Sep 8, 2022

e.g.,

# syntax=docker/dockerfile-upstream:master-labs
# ↑ Will be included in `syntax=docker/dockerfile:1.5-labs`

FROM scratch
ADD --checksum=sha256:24454f830cdb571e2c4ad15481119c43b3cafd48dd869a9b2945d1036d1dc68d https://mirrors.edge.kernel.org/pub/linux/kernel/Historic/linux-0.01.tar.gz /

Fix #975

@thaJeztah
Copy link
Member

Oh, this is great! Thanks for working on this! I was just discussing this feature with @crazy-max last week.

Following the related ticket, I was thinking about this, and I can see potential use-cases that would also want to use this for local files (so also for COPY?), thinking of cases like;

A local Dockerfile used to build from a build-context elsewhere; in that case the Dockerfile wants to check that the file in the "elsewhere" context has the expected checksum;

docker build -f ./Dockerfile /path/to/build-context/

docker build -f ./Dockerfile https://github.com/foo/bar.git


docker build -f- https://github.com/foo/bar.git <EOF
FROM foo
ADD --checksum=$CHECKSUM file.tar.gz .
EOF

Second thought (but perhaps feature creep / YAGNI); do we expect other verification methods to be added? If so, do we want this to be more generic (--verify)?

Thinking of;

FROM scratch

# verify checksum
ADD --verify=checksum=sha256:<checksum> some-file .

# verify GPG signed
ADD --verify=gpg=<GPG key> some-file .

@AkihiroSuda
Copy link
Member Author

potential use-cases that would also want to use this for local files (so also for COPY?)

Yes, but checksum verification is currently not implemented in the FileOp.

do we expect other verification methods to be added? If so, do we want this to be more generic (--verify)?

I think that will be in the source policy engine (OPA/Rego?) #2943 (comment)

@thaJeztah
Copy link
Member

Yes, but checksum verification is currently not implemented in the FileOp.

I'm not too familiar with all the internals, so my comment is purely "user perspective";

  • Is it (technically) possible? (if not now, then in future?)
  • What happens with the current PR if you try to use --checksum for a local file (with ADD)?

@AkihiroSuda
Copy link
Member Author

  • Is it (technically) possible? (if not now, then in future?)

Yes, but probably in a separate PR.

  • What happens with the current PR if you try to use --checksum for a local file (with ADD)?

Returns errors.New("checksum can't be specified for non-HTTP sources")

@thaJeztah
Copy link
Member

Yes, but probably in a separate PR.
Returns errors.New("checksum can't be specified for non-HTTP sources")

Thanks! Expanding in future is fine (as long as we indeed have a good error message before that).

I'll let others give their input as well on --checksum vs --verify 👍

@crazy-max
Copy link
Member

crazy-max commented Sep 20, 2022

This could be handy but I wonder if we could not put instead the chesksum as suffix after @ char like we do with the image digest:

FROM scratch
COPY --from=moby/buildkit:latest@sha256:67c9251f9f2e103e1ee489b6cead518b6d82607ef485d3f1505fc4095a55ebeb /usr/bin/buildkitd /usr/bin/

So it would not require an extra flag?:

# syntax=docker/dockerfile-upstream:master
# ↑ Will be included in `syntax=docker/dockerfile:1.5`

FROM scratch
ADD https://mirrors.edge.kernel.org/pub/linux/kernel/Historic/linux-0.01.tar.gz@sha256:24454f830cdb571e2c4ad15481119c43b3cafd48dd869a9b2945d1036d1dc68d /

Looking at the RFC

Reserved:

Many URL schemes reserve certain characters for a special meaning:
their appearance in the scheme-specific part of the URL has a
designated semantics. If the character corresponding to an octet is
reserved in a scheme, the octet must be encoded. The characters ";",
"/", "?", ":", "@", "=" and "&" are the characters which may be
reserved for special meaning within a scheme. No other characters may
be reserved within a scheme.

@ is a reserved char and should be encoded so it might be ok to use it as separator.

@tonistiigi
Copy link
Member

What about the cases like:

ADD https://..../releases/release-$TARGETOS-$TARGETARCH.tar.gz /

Is there some pattern, how checksums could be validated for this case?

@AkihiroSuda
Copy link
Member Author

https://mirrors.edge.kernel.org/pub/linux/kernel/Historic/linux-0.01.tar.gz@sha256:24454f830cdb571e2c4ad15481119c43b3cafd48dd869a9b2945d1036d1dc68d

This can be misinterpreted like

proto=https
username=mirrors.edge.kernel.org/pub/linux/kernel/Historic/linux-0.01.tar.gz
host=sha256
port=24454f830cdb571e2c4ad15481119c43b3cafd48dd869a9b2945d1036d1dc68d

@AkihiroSuda
Copy link
Member Author

What about the cases like:

ADD https://..../releases/release-$TARGETOS-$TARGETARCH.tar.gz /

Is there some pattern, how checksums could be validated for this case?

This should work?

FROM scratch AS foo-linux-amd64
ADD --checksum=sha256:deadbeef https://..../releases/release-linux-amd64.tar.gz /

FROM scratch AS foo-freebsd-riscv64
ADD --checksum=sha256:cafebabe https://..../releases/release-freebsd-riscv64.tar.gz /

ARG TARGETOS
ARG TARGETARCH
FROM foo-$TARGETOS-$TARGETARCH AS foo

Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but maybe we should keep it in labs for a release in case there is a need for something like --verify as well.

@AkihiroSuda AkihiroSuda requested a review from thaJeztah October 4, 2022 04:11
@thaJeztah
Copy link
Member

LGTM but maybe we should keep it in labs for a release in case there is a need for something like --verify as well.

Do you think it should be --verify=checksum=XXX (#3093 (comment)) to be prepared for that, or "unlikely" that other options will be added?

@AkihiroSuda
Copy link
Member Author

LGTM but maybe we should keep it in labs for a release in case there is a need for something like --verify as well.

Do you think it should be --verify=checksum=XXX (#3093 (comment)) to be prepared for that, or "unlikely" that other options will be added?

If we adopt this syntax, can we skip the labs?

@tonistiigi
Copy link
Member

If we adopt this syntax, can we skip the labs?

I would prefer not. I don't actually see how the --verify could be extended atm. For something like the GPG signature check, you would need to provide a trust chain with the full public key so I don't know how that flag could be used in its current form. If the key pulled from keyserver based on fingerprint then I guess the --verify syntax could work (or it could as well be --pgp then) but then we have a dependency on network access.

@AkihiroSuda AkihiroSuda marked this pull request as draft October 12, 2022 05:03
@thaJeztah
Copy link
Member

If we don't expect other options to be added, then --checksum SGTM

@AkihiroSuda any reason you put this one back in draft?

@AkihiroSuda
Copy link
Member Author

@AkihiroSuda any reason you put this one back in draft?

For moving this to the labs channel.

@AkihiroSuda AkihiroSuda marked this pull request as ready for review October 13, 2022 01:05
@AkihiroSuda
Copy link
Member Author

Moved to the labs.
This should be ready to merge now.

@tonistiigi
Copy link
Member

@AkihiroSuda check the linter

e.g.,

  ADD --checksum=sha256:24454f830cdb571e2c4ad15481119c43b3cafd48dd869a9b2945d1036d1dc68d https://mirrors.edge.kernel.org/pub/linux/kernel/Historic/linux-0.01.tar.gz /

Fix issue 975

Signed-off-by: Akihiro Suda <[email protected]>
@AkihiroSuda
Copy link
Member Author

Fixed

@AkihiroSuda AkihiroSuda force-pushed the add-checksum branch 2 times, most recently from 87186e4 to a83fbb9 Compare October 13, 2022 03:06
@tonistiigi tonistiigi merged commit c717d6a into moby:master Oct 17, 2022
@AkihiroSuda AkihiroSuda changed the title dockerfile: implement ADD --checksum=<checksum> <http src> <dest> dockerfile (labs): implement ADD --checksum=<checksum> <http src> <dest> Oct 18, 2022
@richlander
Copy link

I tried to get this to work with sha512. It doesn't seem to. Any plans to enable the longer digest length?

I was wanting to rewrite this Dockerfile with this new syntax. I was able to with sha256 but not sha512.

https://github.com/dotnet/dotnet-docker/blob/c02e853354cba294b3ac934ef0c5b57a2109811e/src/runtime/8.0/bookworm-slim/amd64/Dockerfile

@tonistiigi
Copy link
Member

I tried to get this to work with sha512.

This isn't as easy as this validation isn't just checking the downloaded bytes but deeply integrated with the caching system.

Eg. if you have a build that does: ADD https://someurl foo.tar and save cache for that build we will save the digest of that artifact in the build cache so next time we just need to recompute the digest (there is etag caching as well) and can reuse the cache when it has not changed. But if next time your command is ADD --checksum=sha256:abc https://someurl foo.tar and the provided checksum matches the previous pull we don't need to check anything and can just match the cache directly. But if multiple algorithms are used that would mean that when we save cache for such operations we would need to compute the checksums with many algorithms that would be wasteful.

@richlander
Copy link

Got it. Thanks. Makes sense. I appreciate the extra insight.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: Abbility to verify checksum with the ADD command
5 participants