Add s3 remote cache
Signed-off-by: Bertrand Paquet <[email protected]>
bpaquet committed May 13, 2022
1 parent 394ccf8 commit e0b54a4
Showing 535 changed files with 133,418 additions and 1 deletion.
23 changes: 23 additions & 0 deletions .github/workflows/build.yml
@@ -126,6 +126,29 @@ jobs:
          name: coverage
          path: ./coverage

  test-s3:
    runs-on: ubuntu-latest
    needs:
      - base
    steps:
      -
        name: Checkout
        uses: actions/checkout@v2
      -
        name: Expose GitHub Runtime
        uses: crazy-max/ghaction-github-runtime@v1
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
        with:
          version: ${{ env.BUILDX_VERSION }}
          driver-opts: image=${{ env.REPO_SLUG_ORIGIN }}
          buildkitd-flags: --debug
      -
        name: Test
        run: |
          hack/s3_test/run_test.sh
  test-os:
    runs-on: ${{ matrix.os }}
    strategy:
47 changes: 47 additions & 0 deletions README.md
@@ -58,6 +58,7 @@ You don't need to read this document unless you want to use the full-featured st
- [Registry (push image and cache separately)](#registry-push-image-and-cache-separately)
- [Local directory](#local-directory-1)
- [GitHub Actions cache (experimental)](#github-actions-cache-experimental)
- [S3 cache (experimental)](#s3-cache-experimental)
- [Consistent hashing](#consistent-hashing)
- [Metadata](#metadata)
- [Systemd socket activation](#systemd-socket-activation)
@@ -426,6 +427,52 @@ in your workflow to expose the runtime.
* `type=gha`
* `scope=buildkit`: which scope cache object belongs to (default `buildkit`)

#### S3 cache (experimental)

```bash
buildctl build ... \
--output type=image,name=docker.io/username/image,push=true \
--export-cache type=s3,region=eu-west-1,bucket=my_bucket,name=my_image \
--import-cache type=s3,region=eu-west-1,bucket=my_bucket,name=my_image
```

The following attributes are required (each falls back to an environment variable when not set inline):
* `bucket`: AWS S3 bucket (default: `$AWS_BUCKET`)
* `region`: AWS region (default: `$AWS_REGION`)
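
A minimal sketch of that fallback; as with the authentication settings described below, the variables must be visible to the buildkit daemon rather than to the client:

```bash
# AWS_BUCKET and AWS_REGION must be in the daemon's environment, e.g.:
#   AWS_REGION=eu-west-1 AWS_BUCKET=my_bucket buildkitd
# The client can then omit the bucket/region attributes:
buildctl build ... \
  --export-cache type=s3,name=my_image \
  --import-cache type=s3,name=my_image
```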

Storage locations:
* blobs: `s3://<bucket>/<prefix><blobs_prefix><sha256>`, default: `s3://<bucket>/blobs/<sha256>`
* manifests: `s3://<bucket>/<prefix><manifests_prefix><name>`, default: `s3://<bucket>/manifests/<name>`

S3 configuration:
* `blobs_prefix`: global prefix to store / read blobs on S3 (default: `blobs/`)
* `manifests_prefix`: global prefix to store / read manifests on S3 (default: `manifests/`)
* `endpoint_url`: specify a custom S3 endpoint (default: empty)
* `use_path_style`: if set to `true`, put the bucket name in the URL instead of in the hostname (default: `false`)
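
`endpoint_url` and `use_path_style` are mainly useful with S3-compatible storage such as a self-hosted MinIO. A hypothetical sketch; the endpoint URL, bucket, and prefix below are placeholders:

```bash
# Export to an S3-compatible endpoint using path-style addressing; with
# prefix=ci/, blobs land under s3://my_bucket/ci/blobs/<sha256> per the
# storage-location rules above.
buildctl build ... \
  --export-cache "type=s3,region=us-east-1,bucket=my_bucket,name=my_image,prefix=ci/,endpoint_url=http://127.0.0.1:9000,use_path_style=true"
```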

AWS Authentication:

The simplest way is to use an IAM instance profile.
Other options are:

* Any system using environment variables or config files supported by the [AWS Go SDK](https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html). The configuration must be available for the buildkit daemon, not for the client.
* An Access Key ID and Secret Access Key, using the `access_key_id` and `secret_access_key` attributes.


`--export-cache` options:
* `type=s3`
* `mode=min` (default): only export layers for the resulting image
* `mode=max`: export all the layers of all intermediate steps
* `prefix`: global prefix to store / read files on S3 (default: empty)
* `name=buildkit`: name of the manifest to use (default `buildkit`). Multiple manifest names can be specified at the same time, separated by `;`. The standard use case is to use the git SHA-1 as the name and the branch name as a duplicate, then load both with two `--import-cache` options (see the sketch after the import options below).

`--import-cache` options:
* `type=s3`
* `prefix`: global prefix to store / read files on S3 (default: empty)
* `blobs_prefix`: global prefix to store / read blobs on S3 (default: `blobs/`)
* `manifests_prefix`: global prefix to store / read manifests on S3 (default: `manifests/`)
* `name=buildkit`: name of the manifest to use (default `buildkit`)
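
A sketch of the sha + branch pattern mentioned under `--export-cache`, assuming hypothetical `GIT_SHA` and `GIT_BRANCH` variables supplied by CI (note the quoting, since `;` is a shell separator):

```bash
# Export under both the commit sha and the branch name, then import both:
# the sha matches exact rebuilds, the branch name catches recent siblings.
buildctl build ... \
  --export-cache "type=s3,region=eu-west-1,bucket=my_bucket,name=${GIT_SHA};${GIT_BRANCH}" \
  --import-cache "type=s3,region=eu-west-1,bucket=my_bucket,name=${GIT_SHA}" \
  --import-cache "type=s3,region=eu-west-1,bucket=my_bucket,name=${GIT_BRANCH}"
```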

### Consistent hashing

If you have multiple BuildKit daemon instances but you don't want to use registry for sharing cache across the cluster,
75 changes: 75 additions & 0 deletions cache/remotecache/s3/readerat.go
@@ -0,0 +1,75 @@
package s3

import (
	"io"
)

// ReaderAtCloser is an io.ReaderAt that must be closed when no longer needed.
type ReaderAtCloser interface {
	io.ReaderAt
	io.Closer
}

type readerAtCloser struct {
	offset int64
	rc     io.ReadCloser
	ra     io.ReaderAt
	open   func(offset int64) (io.ReadCloser, error)
	closed bool
}

// toReaderAtCloser adapts an open function returning a stream positioned at a
// given offset (for example, a ranged S3 GetObject) into a ReaderAtCloser.
func toReaderAtCloser(open func(offset int64) (io.ReadCloser, error)) ReaderAtCloser {
	return &readerAtCloser{
		open: open,
	}
}

func (hrs *readerAtCloser) ReadAt(p []byte, off int64) (n int, err error) {
	if hrs.closed {
		return 0, io.EOF
	}

	// If the underlying stream turned out to support ReadAt, delegate to it.
	if hrs.ra != nil {
		return hrs.ra.ReadAt(p, off)
	}

	// (Re)open the stream if there is none yet, or if the requested offset
	// does not match the current position of the open stream.
	if hrs.rc == nil || off != hrs.offset {
		if hrs.rc != nil {
			hrs.rc.Close()
			hrs.rc = nil
		}
		rc, err := hrs.open(off)
		if err != nil {
			return 0, err
		}
		hrs.rc = rc
	}
	if ra, ok := hrs.rc.(io.ReaderAt); ok {
		hrs.ra = ra
		n, err = ra.ReadAt(p, off)
	} else {
		// Sequential fallback: keep reading until the buffer is filled or an
		// error (including io.EOF) occurs.
		for {
			var nn int
			nn, err = hrs.rc.Read(p)
			n += nn
			p = p[nn:]
			if len(p) == 0 || err != nil {
				break
			}
		}
	}

	// Track the absolute position of the open stream so a subsequent
	// sequential ReadAt can reuse it instead of reopening.
	hrs.offset = off + int64(n)
	return
}

func (hrs *readerAtCloser) Close() error {
	if hrs.closed {
		return nil
	}
	hrs.closed = true
	if hrs.rc != nil {
		return hrs.rc.Close()
	}

	return nil
}
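
A minimal usage sketch for the adapter above, not part of this commit: any `open` callback that can start reading at an arbitrary offset will do. Here a local file (hypothetical path) stands in for the ranged S3 `GetObject` the real importer would issue:

```go
package s3

import (
	"io"
	"os"
)

// exampleRead is a hypothetical illustration of toReaderAtCloser: the open
// callback returns a stream already positioned at the requested offset.
func exampleRead() ([]byte, error) {
	ra := toReaderAtCloser(func(offset int64) (io.ReadCloser, error) {
		f, err := os.Open("layer.blob") // placeholder path
		if err != nil {
			return nil, err
		}
		if _, err := f.Seek(offset, io.SeekStart); err != nil {
			f.Close()
			return nil, err
		}
		return f, nil
	})
	defer ra.Close()

	buf := make([]byte, 4096)
	n, err := ra.ReadAt(buf, 1024) // read up to 4 KiB starting at byte 1024
	if err != nil && err != io.EOF {
		return nil, err
	}
	return buf[:n], nil
}
```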
