proposal: provenance generation and verifying attestation using cosign with new predicate type #105

Closed
Dentrax opened this issue Aug 5, 2022 · 3 comments · Fixed by #191

Comments

@Dentrax
Member

Dentrax commented Aug 5, 2022

TL;DR: Create a provenance by traversing each commit. Attest it into an OCI registry. Verify all commits in the attestation using cosign.


Abstract

This issue proposes introducing a new cosign.sigstore.dev/attestation/gitsign/v1 predicate type that follows the in-toto/attestation model. It also proposes a new subcommand in gitsign itself, attestation generate, which generates an attestation file by traversing each commit.

Motivation

While working on the Cosign VULN_SPEC proposal a few months ago, we learned a lot from the valuable comments of both sigstore and in-toto community members, and we expect this proposal to be another opportunity to learn. Gitsign is a brand-new and very promising tool, so together with @developer-guy we thought this proposal was worth writing up and discussing further here. We have tried to make it as complete as we can.

Proposal

The main idea is to generate a provenance for the entire commit history of the repository at container image build time (to avoid a time window in which the history could be tampered with) and to store it as an attestation in the OCI registry. As a consumer of the container image, verifying the gitsign attestation means that I verify the attestation itself and each commit made by the committers, using their public key, which is either fetched from Rekor (PKCS7 cert) or verified with GPG (PGP cert).

By doing so, rather than saying "I trust this repository and its binaries because all artifacts are signed", the statement becomes "all commits made by committers are signed and verified using either gitsign or GPG, so I trust each commit, and the container image, binaries and artifacts are signed as well".

An example use case is to use cosign to verify the attestation and also verify all the commits by using gitsign under the hood.

This proposal requires gitsign and cosign to work together. To implement all of this, we would add some commands:

For gitsign: introduce a new attestation generate command:

$ gitsign attestation generate

For cosign: introduce a new --gitsign true|false flag:

$ COSIGN_EXPERIMENTAL=1 cosign verify-attestation \
    --type cosign.sigstore.dev/attestation/gitsign/v1 \
    --gitsign \
    foo/bar:baz

The new --gitsign flag traverses each commit and verifies it by running, under the hood:

$ cosign verify-blob \
  --cert commits[*].rekor.signature.cert \
  --signature commits[*].rekor.signature.content \
  commits[*].author.digest.sha1

The 30,000-foot view

Please note that this spec should be generic and open to extension. It is not intended to be built specifically for gitsign. Anyone who wants to create their own git commit signing tool should be able to consume it. Here is what the new cosign.sigstore.dev/attestation/gitsign/v1 predicate looks like (draft, open for feedback):

{
    "_type": "https://in-toto.io/Statement/v0.1",
    "subject": [],
    "predicateType": "cosign.sigstore.dev/attestation/gitsign/v1",
    "predicate": {
        "invocation": {
            "fulcio_url": "https://fulcio.sigstore.dev",
            "rekor_url": "https://rekor.sigstore.dev",
            "oidc": {
                "client_id": "sigstore",
                "issuer": "https://oauth2.sigstore.dev/auth",
                "redirect_url": ""
            },
            "timestamp": 1627564731
        },
        "commits": [
            {
                "issuer": {
                    "o": "sigstore.dev",
                    "cn": "sigstore-intermediate"
                },
                "signature": {
                    "status": "G",
                    "format": "pkcs7",
                    "publicKey":"base64-encoded-pkcs7-cert"
                },
                "rekor": {
                    "signature": {
                        "algorithm": "ecdsa-with-SHA256",
                        "content": "MEYCIQC1gffaJyVGfmJNoX94n9vOj+1EKEAlolT9UH7Bb2MuwwIhAJ1FFee9bDQdLbjt4yjbYz5Ojd3uITilNU4KLGJqiSQr",
                        "cert": "base64-encoded-cert"
                    },
                    "digest": {
                        "sha256": "d00bf6f1a50835a372b137ea18306d6a1d554500caf821e375817173db760868"
                    }
                },
                "author": {
                    "name": "[email protected]",                 
                    "digest": {
                        "sha1": "6a2b1f12938059f2fbe68754657fad33a4fca372"
                    }
                }
            },
            {
                "signature": {
                    "status": "G",
                    "format": "pgp",
                    "publicKey": "base64-encoded-pgp-cert"
                },
                "author": {
                    "name": "[email protected]",
                    "digest": {
                        "sha1": "5ee8eca4f05ef4b79eb6dc21c09198d4d61e3495"
                    }
                }
            },
            {
                "signature": {
                    "status": "N"
                },
                "author": {
                    "name": "[email protected]",
                    "digest": {
                        "sha1": "adba2b44bd04f8f2c92fbf739af797b8dcd046b5"
                    }
                }
            }
        ]
    }
}
  • author.name: (required) name of the author

  • author.digest.sha1: (required) commit

  • signature.status: (required) %G? of a commit

  • signature.format: (optional) signed with? (gpg, gitsign, etc.)

  • signature.publicKey: (optional) certificate from ($ git cat-file commit HEAD)

  • rekor: (optional) (if keyless signed with gitsign)

  • rekor.signature.algorithm: (required) signature algorithm

  • rekor.signature.content: (required) Rekor.Body.HashedRekordObj.signature.content

  • rekor.signature.cert: (required) Rekor.Body.HashedRekordObj.signature.publicKey.content

  • rekor.digest.sha256: (required) Rekor.Body.HashedRekordObj.data.hash.value

  • issuer: (optional) (if keyless signed)

  • issuer.o: (required) organization

  • issuer.cn: (required) commonName
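
The required/optional rules in the field list above can be checked mechanically. The following is a minimal sketch, not part of the proposal: the helper names `_missing` and `validate_commit_entry` are hypothetical, and it assumes that fields marked "required" under rekor and issuer only apply when those optional objects are present.

```python
from typing import Any


def _missing(obj: Any, dotted: str) -> bool:
    """True if the dotted path (e.g. 'author.digest.sha1') is absent or empty."""
    for key in dotted.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return True
        obj = obj[key]
    return obj in ("", None)


def validate_commit_entry(entry: dict[str, Any]) -> list[str]:
    """Return the required fields missing from one element of predicate.commits."""
    # Always required, per the field list.
    required = ["author.name", "author.digest.sha1", "signature.status"]
    # Required only when the optional parent object is present (assumption).
    if "rekor" in entry:
        required += [
            "rekor.signature.algorithm",
            "rekor.signature.content",
            "rekor.signature.cert",
            "rekor.digest.sha256",
        ]
    if "issuer" in entry:
        required += ["issuer.o", "issuer.cn"]
    return [path for path in required if _missing(entry, path)]
```

An empty result means the entry satisfies the draft field rules; it says nothing about whether the signature itself is valid.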

Implementation

  1. Define a new SLSA provenance predicate type either here in gitsign or in the in-toto repository.

  2. We should add a new attestation generate command in gitsign:

$ gitsign attestation generate -o gitsign.json

See git-log(1) page for more details about the %G? signature status.

Pseudocode:

loop through each commit
  check %G? for the commit
    if commit is not signed: // %G? = N
      set signature.status = N
    if commit is signed with GPG: // %G? = G
      set signature.status = G
      set signature.format = pgp
      set signature.publicKey = base64-encoded-pgp-cert
    if commit is signed with gitsign:
      search on Rekor: rekor-cli search --artifact <COMMIT>
        get certificate: $(rekor-cli get --uuid=$uuid --format=json | jq -r .Body.HashedRekordObj.signature.publicKey.content)
        get signature: $(rekor-cli get --uuid=$uuid --format=json | jq -r .Body.HashedRekordObj.signature.content)
        get issuer from the cert
        get necessary rekor obj values from rekor API
        append issuer and rekor objects
    append author object
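
As a rough illustration of the first half of this loop, the sketch below collects the commit SHA, the %G? status and the author email with a single git log call and shapes them like the draft predicate entries. The function names are hypothetical, and the Rekor lookup and the signature format detection from the pseudocode are deliberately left out.

```python
import subprocess

# %H = commit hash, %G? = signature status letter, %ae = author email,
# %x09 = a literal tab used as the field separator.
LOG_FORMAT = "%H%x09%G?%x09%ae"


def parse_git_log(output: str) -> list[dict]:
    """Turn tab-separated `git log` lines into draft predicate commit entries."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue
        sha1, status, email = line.split("\t")
        commits.append({
            "signature": {"status": status},
            "author": {"name": email, "digest": {"sha1": sha1}},
        })
    return commits


def collect_commits(repo: str = ".", rev: str = "HEAD") -> list[dict]:
    """Run git in `repo` and return entries for every commit reachable from `rev`."""
    result = subprocess.run(
        ["git", "-C", repo, "log", f"--format={LOG_FORMAT}", rev],
        check=True, capture_output=True, text=True,
    )
    return parse_git_log(result.stdout)
```

Note that the status git reports depends on the local signing configuration and keyring of the machine running the command, so a builder and a consumer may see different letters for the same commit.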
  3. Introduce a new --gitsign boolean flag in cosign verify-attestation:
$ COSIGN_EXPERIMENTAL=1 cosign verify-attestation \
    --type cosign.sigstore.dev/attestation/gitsign/v1 \
    --gitsign \
    foo/bar:baz

Cosign should first verify the attestation as is. If verification succeeds, it should parse the .att and load every cosign.sigstore.dev/attestation/gitsign/v1 predicate into struct definitions.

The new --gitsign flag traverses each commit and verifies it by running, under the hood:

$ cosign verify-blob \
  --cert commits[*].rekor.signature.cert \
  --signature commits[*].rekor.signature.content \
  commits[*].author.digest.sha1

Pseudocode:

verify the attestation
  if verification fails
    exit
  if verification succeeds
    parse predicate to structs
      traverse commits[*]
        if commit signed with gitsign
          run $ cosign verify-blob --cert .rekor.signature.cert --signature .rekor.signature.content .author.digest.sha1
        if commit signed with gpg
          run $ git verify-commit .author.digest.sha1 (in cosign)
        if commit unsigned
          exit
        if any verification fails
          exit
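
The per-commit branching can be sketched as a small planning function that runs after the attestation itself has been verified. This is only an illustration under assumptions: `plan_verification` is a hypothetical name, a commit is treated as gitsign-signed when it carries a rekor object (as the field list describes), and the function returns which check to run rather than executing cosign or git.

```python
def plan_verification(predicate: dict) -> list[tuple[str, str]]:
    """Map each commit SHA to the check it needs; raise on any unsigned commit."""
    plan = []
    for commit in predicate["commits"]:
        sha1 = commit["author"]["digest"]["sha1"]
        signature = commit["signature"]
        if signature["status"] == "N":
            # The proposal exits as soon as one unsigned commit is found.
            raise ValueError(f"commit {sha1} is unsigned")
        if "rekor" in commit:
            # Keyless (gitsign) commit: verify against the Rekor cert and signature.
            plan.append((sha1, "cosign verify-blob"))
        elif signature.get("format") == "pgp":
            plan.append((sha1, "git verify-commit"))
        else:
            raise ValueError(f"commit {sha1} has an unsupported signature format")
    return plan
```

Separating the decision from the execution keeps the fail-fast rule easy to test without network access to Rekor.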

End User Usage

  1. Generate a provenance:
$ gitsign attestation generate -o gitsign.json
  2. Attest and push to OCI:
$ COSIGN_EXPERIMENTAL=1 cosign attest \
    --type cosign.sigstore.dev/attestation/gitsign/v1 \
    --predicate gitsign.json \
    foo/bar:baz
  3. Verify the attestation:
$ COSIGN_EXPERIMENTAL=1 cosign verify-attestation \
    --type cosign.sigstore.dev/attestation/gitsign/v1 \
    --gitsign \
    foo/bar:baz

Intended Users

  • container image builders
  • pipelines
  • end users

User Story

This container image appears to be signed by a trusted author, but I still can't trust the source code because I don't know who contributed to it. There is always the possibility that a committer impersonated someone else by using their name and email. What I want is to verify all the commits included in the source code that makes up the final container image, so I can be sure that there is no unsigned commit: every commit has a signature and is verified.

Concerns

  1. We should be careful, since our main goal is not just to say "all commits are verified" but "all commits are verified and trusted". We should not lose sight of that.

  2. Backward compatibility is another big issue for us. What if a repository started enforcing GPG after the 50th commit and switched to the brand-new gitsign tool after the 200th commit? How would the whole concept work then?

Would passing a -n | --number <INTEGER> flag to gitsign attestation generate tackle this problem? That way we would only consider commits after the n-th.

Alternative Methods

  1. Instead of generating and verifying an attestation, we can simply enforce a signed-commits-only rule in the Git providers (GitHub, GitLab, etc.), which makes it easier to reject unsigned commits, e.g. with server-side pre-receive hooks.

  2. Similar to 1, instead of trusting the commits, we can trust the commit authors. By keeping an allowlist of the public keys that committers use, we can create restrictive policies that reject incoming commits from anonymous (i.e. open-source) contributors.

Related Proposals

  1. XREF: Idea: gitsign attest #94 by @wlynch

What distinguishes our proposal from the one above is that we do not create an attestation for each commit, but keep a single one instead. Also, this proposal does NOT store anything in the refs folder. We also want to verify as much as possible before use.

Open Questions

  1. Does this all make sense overall?
  2. Are all fields in the rekor object really necessary? What if we just passed the UUID instead?
  3. Should we verify each commit in history that made up the container image? (before running the container image)
  4. Instead of generating an attestation that has only one signature, shouldn't each committer sign the same attestation with their PK or keyless somehow?
  5. Should we sign all commits individually?
  6. Should we store the final attestation under version control in the repository itself? (i.e. in ref/, so anyone can also verify all commits using gitsign itself)
  7. Does this proposal use Zero Trust ("trust nothing, verify everything") principles?

Waiting for your feedback!

cc @dlorenc @lukehinds @TomHennen @adityasaky @SantiagoTorres

@wlynch
Member

wlynch commented Aug 5, 2022

Thanks for putting this together! I definitely want to explore what kind of attestations we can start putting together for git commits.

So I'm a bit hesitant about trying to create an attestation that captures all commits in a single document for a few reasons:

  1. This could get large

    e.g. the canonical worst case example is https://github.com/torvalds/linux which is 1M+ commits. Git does not guarantee immutable / append only branch history (i.e. you can rewrite history, or point the ref to a completely disjoint commit DAG), so you'd have to recompute a new attestation for the entire DAG for each new commit.

  2. Whether a commit is part of a repo can actually be a bit ambiguous. 😅

    GitHub (and IIRC other Git providers work in a similar manner) stores forks / PRs in the same repo - this post has a good diagram showing how this layering is represented in storage. For example, I've been hacking on attestation storage for Idea: gitsign attest #94 in https://github.com/wlynch/gitsign/tree/attest, but this commit can also be "found" in the main gitsign repo if you know the SHA - 08572d0, even though it isn't actually reachable by any branch in the main gitsign repo. Pull requests that are opened in the repo can also be found in the main repo under refs/pull (you can see these if you run git ls-remote on a repo), so these are technically part of the repo, but not necessarily commits that need to be considered trusted.

You would probably want to scope this to a particular ref - i.e. what commits are reachable by refs/heads/main. But because Git itself is a merkle tree, by signing a commit at the head of the branch, you're effectively signing content hashes of all of its parent history as well - you wouldn't be able to modify any history without causing modifications to the commit, so there's no need to traverse the entire DAG to ensure integrity.
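
To make the Merkle argument concrete, here is a small sketch of how Git derives a commit ID from the commit's content, which includes the IDs of its parents. The helper `commit_id` and the author line are illustrative; the object layout (a "commit <size>" header, a NUL byte, then the tree, parent, author, committer and message lines) is Git's standard SHA-1 commit encoding.

```python
import hashlib


def commit_id(tree: str, parents: list[str], author: str, message: str) -> str:
    """Compute the SHA-1 object ID Git assigns to a commit with these fields."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {author}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    header = b"commit %d\x00" % len(body)
    return hashlib.sha1(header + body).hexdigest()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # Git's well-known empty tree
WHO = "A U Thor <author@example.com> 1659700000 +0000"   # made-up identity

root = commit_id(EMPTY_TREE, [], WHO, "root")
tampered_root = commit_id(EMPTY_TREE, [], WHO, "root (tampered)")

# Identical head commit content, but pointing at different parents.
head = commit_id(EMPTY_TREE, [root], WHO, "head")
head_on_tampered = commit_id(EMPTY_TREE, [tampered_root], WHO, "head")

print(head != head_on_tampered)
```

Because the parent ID is part of the hashed (and signed) content, rewriting any ancestor yields a different head ID, so a signature over the original head no longer matches.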


I think you also raise a great point with the concern around backwards compatibility w.r.t. how to handle repos that previously weren't signing commits. I don't think there's a significant advantage for repo consumers in guaranteeing that every commit in the history was signed, as long as the commit you're building from was signed by a trusted identity. That said, I 100% agree that we should be encouraging people to sign all commits, because ideally you should be able to check out an arbitrary revision and have trust in the integrity of that commit.

@TomHennen
Contributor

(ninja'd by @wlynch :))

This is an interesting proposal! A couple thoughts in no particular order:

  1. I think this same approach is applicable to all types of software binaries that are created from source. Perhaps the proposal doesn't need to be specific to container images?
  2. "Should we verify each commit in history that made up the container image? (before running the container image)"
    Do you mean having the consumer fetch the history of the image and verify everything themselves? If so, does this attestation need to do anything besides point them at the repo & commit that the build came from? A downside is that it assumes the consumer has read access to the repo and its history, which may not always be true.
  3. If we don't do (2), how does the caller know that the attestor included all the relevant commits in the attestation? I see two options a) include the commit parents from the gitsign signature in this attestation (presumably as a subfield of author?) b) greatly simplify this attestation by having gitsign attestation generate do the entire history verification but then emit an attestation that just says "I verified all the commits that make up this code were signed up to N levels deep" (or similar). Rationale: gitsign attestation generate is run by the builder and the builder can always lie about what it did anyways (so it has to be trusted unless someone is doing a reproducible build) so leverage that trust by having it do the check. The consumers can then verify the attestation much more easily. If people do want to reproduce the build then they can gitsign attestation generate themselves and see if generated attestation matches the one the original builder provided.
  4. This approach suggests that we trust a commit simply because it's signed. Is that the property we want? "instead of trusting the commits, we can trust the commit authors" I like this approach. You could have the list of accepted keys/authors in the repo and gitsign attestation generate could fetch that list and compare the signed commits against it. (it would be fun to figure out how to handle the list of trusted authors changing over time), then in the attestation it generates it can say "and all commits came from trusted authors". This might work well with (3).
  5. How would you like to handle repos that didn't require signed commits and only recently adopted them? That might work well with having some 'gitsign config' in the repo itself (probably the same place you store the list of trusted authors?). The config could say "after commit X" or "after date Y" all commits must be signed. gitsign attestation generate would again fetch this config and verify at the time it generates the config.
  6. To @wlynch's point about size, perhaps there's some checkpointing that could be done in the repo itself (as you suggest storing something in ref/)? Would it make any sense for the repo to generate this attestation instead of a process running on the builder?
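
Points 4 and 5 above could be combined into a single policy check, sketched below under several assumptions: the commits are the draft predicate entries in newest-first order over a linear history, the allowlist and the "after commit X" cutoff come from a hypothetical in-repo config, and the recorded author name stands in for the signer identity (a real check would compare the identity in the signing certificate or key instead).

```python
from typing import Optional


def check_policy(
    commits: list[dict],
    trusted_authors: set[str],
    enforce_after: Optional[str] = None,
) -> list[str]:
    """Return policy violations for commits newer than the `enforce_after` SHA."""
    violations = []
    for commit in commits:  # newest first, as `git log` prints them
        sha1 = commit["author"]["digest"]["sha1"]
        if sha1 == enforce_after:
            break  # this commit and everything older predate the policy
        if commit["signature"]["status"] != "G":
            violations.append(f"{sha1}: not signed with a good signature")
        elif commit["author"]["name"] not in trusted_authors:
            violations.append(f"{sha1}: author not in allowlist")
    return violations
```

With a check like this run by the builder, the emitted attestation could shrink to a statement that the policy held for a given head commit, along the lines of option (b) in point 3.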

@trishankatdatadog

Good idea to sign attestations about commits!

  1. This could get large

I agree. This does not seem like it generally scales. I'm also not sure what it really buys you. Maybe signing the release/tag should be sufficient?

Also, what about the possibility of signing some or all source files, not just commits?
