docs: add design, roadmap, bof notes
Signed-off-by: Akihiro Suda <[email protected]>
AkihiroSuda committed Nov 13, 2017
1 parent 1513899 commit 9f8696c
Showing 5 changed files with 178 additions and 2 deletions.
7 changes: 5 additions & 2 deletions README.md
@@ -25,8 +25,6 @@ Key features:
- Pluggable architecture


Read the proposal from https://github.com/moby/moby/issues/32925

#### Quick start

The BuildKit daemon can be built in two different versions: one that uses [containerd](https://github.com/containerd/containerd) for execution and distribution, and a standalone version that has no dependencies other than [runc](https://github.com/opencontainers/runc). We are open to adding more backends. `buildd` is a CLI utility for serving the gRPC API.
@@ -162,3 +160,8 @@ Validating your updates before submission:
```bash
make validate-all
```

#### Documents
- https://github.com/moby/moby/issues/32925: Original proposal
- [docs/roadmap.md](docs/roadmap.md): roadmap (tentative)
- [docs/misc](docs/misc): miscellaneous unformatted documents
51 changes: 51 additions & 0 deletions docs/misc/bof-2017-copenhagen.md
@@ -0,0 +1,51 @@
# BuildKit BoF at [Moby Summit Copenhagen (October 2017)](https://blog.mobyproject.org/dockercon-eu-moby-summit-recap-moby-project-security-and-networking-16da4f8172f0)

## Unformatted and INACCURATE note

```
10-20 attendees: me(akihiro suda), tonis, vincent demeester, mhbauer, simon, tibor, sebastiaan, sven…
Q&A:
- Distributed mode? [vincent, et al]
○ Single graph solver, single state for instruction cache [tonis]
○ Stateless masters and workers [akihiro]
○ Cpu stat… [tonis]
- Infinit for distributed cache? [simon]
○ Docker registry? IPFS? [akihiro]
○ Needs investigation for plugin infrastructure
- Multi-output for multi-arch?
○ Execute requests in parallel and caches are used, so it produces multi outputs [tonis]
- How frontend is implemented?
○ Dockerfile2llb and frontend cmd
- Gobuild in dockerfile? [simon]
○ Not at the moment, we need to add Dockerfile instruction [tonis]
○ Like: FROM … AS gobuild FROM alpine RUN --nested=gobuild github.com/cmd/foo > foo
- Buildkit for CI [simon]
- Serverless buildkit
- Direct LLB build for Dockerfile?
- Why client and server?
○ For moby
- Ccache
○ Persistent source [tonis]
- Compare with bazel [tibor]
- Image validation, notary? [vincent]
- Libentitlements for buildkit? [tonis]
○ Ctrd-level entitlements? [tibor]
- Rootless containers? Although unlikely to work with apt/yum [akihiro]
○ Need to find solution that works [tonis]
- Integration to moby asm [tibor]
○ For building vmlinuz? [akihiro]
- Cache format [tibor]
○ Ctrd snapshot but more coupled with content [tonis]
- Delta image as in balena [tibor]
○ As ctrd snapshotter and differ [tibor, akihiro]
- Two types of cache (llb, content) [tonis]
- Windows support [simon]
○ Wait for ctrd 1.1 [tonis]
○ Needs worker constraint metadata for vertex [akihiro]
○ Windows is not so fast.
- Multi-arch
○ Worker constraint vertex md
○ Copier
○ Manifest list generator
```
47 changes: 47 additions & 0 deletions docs/misc/bof-2017-losangeles.md
@@ -0,0 +1,47 @@
# BuildKit BoF at [Moby Summit Los Angeles (September 2017)](https://blog.mobyproject.org/moby-summit-los-angeles-recap-a41e6acf81f8)

## Unformatted and INACCURATE note

```
• BuildKit BOF: [me(Akihiro Suda), Tonis, Tibor, (Patrick), (Mhbauer) (Kiril)]
○ Gobuilder looks good workload for distributed mode [me]
§ It depends on data sharing bottleneck, maybe filegrain [tonis]
□ Filegrain is a server? [tibor]
® No, client [me]
□ Use worker node as if registry [tonis]
□ Differs from IPFS? [tibor]
® OCI-compatible [tonis]
® Filegrain over IPFS is also possible [me]
○ Usecase of nested build? [tibor]
§ Ex. Use my local runc for docker development [tonis]
○ Scheduling: select a host which already has the cache[tonis]
○ Need for etcd? [me]
§ No [tonis]
□ HA is another problem, lower priority
○ Cache/metadata will contain worker info
○ Solver/state.go volatile [tonis]
○ Single buildd manager that receives build requests, workers are in worker mode [tonis]
○ Worker could be started by passing the master node IP address [tonis]
○ op.Run() returns its worker ID
○ Map[CacheKey]WorkerID
○ For root nodes of the graph (sources, i.e. nodes without inputs), probably ask workers whether they have a cachekey for the operation definition (pb.Op)
§ Query: "are you able to reproduce a cachekey for this operation definition?"
□ If none have it, choose randomly
□ if query is heavy, query to git maybe
□ gossip?
□ can support many nodes
○ Because of freshness: always ping original repo (at least for git)
§ Funny corner case: cannot have 160char hex-named branch
○ Cache for http source: uses etags to know if it should repull, otherwise calculate hash of pulled content
○ Local files: difficult
§ Find the same worker that already received the content from the client the first time? Or always sync to master?
□ Use local worker on master node for simple tasks? (with caveat of constraints)
○ Private images
§ Worker needs to ask master for the credential
○ LLB
§ not Definition bytes; send struct{Definition bytes, metadata} as a single structure object
□ Solver/solver.go: Solve(context.Context, struct{Definition [][]bytes, Metadata map[digest]MetadataEntry})
llb.Marshal should also return this struct object
```
28 changes: 28 additions & 0 deletions docs/misc/design-distributed-mode.md
@@ -0,0 +1,28 @@
# Design: distributed mode (work in progress)

## master
- Stateless.
-- No etcd.
-- The orchestrator (Kubernetes) is expected to restart the master container on failure.

## worker
- Workers could be started by passing the master host information.
-- A worker connects to the master and tells the master its workerID.
-- TODO: how to connect to the master with multiple container replicas?
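The registration flow above can be sketched in Go. This is a minimal illustration under stated assumptions: the `RegisterRequest`/`Master` names and the platform constraint field are hypothetical, not BuildKit's actual types; since the master is stateless, workers would simply re-register after a master restart.

```go
package main

import "fmt"

// RegisterRequest is a hypothetical message a worker sends when it
// connects, announcing its workerID and scheduling constraints.
type RegisterRequest struct {
	WorkerID string
	Platform string // e.g. "linux/amd64", usable as a worker constraint
}

// Master keeps an in-memory registry of known workers. It holds no
// persistent state (no etcd); the registry is rebuilt as workers
// re-register after a restart.
type Master struct {
	workers map[string]RegisterRequest
}

func (m *Master) Register(req RegisterRequest) {
	if m.workers == nil {
		m.workers = map[string]RegisterRequest{}
	}
	m.workers[req.WorkerID] = req
}

func main() {
	m := &Master{}
	m.Register(RegisterRequest{WorkerID: "worker-1", Platform: "linux/amd64"})
	fmt.Println(len(m.workers)) // 1
}
```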

## scheduling
- The master asks the workers: "are you able to reproduce the cachekey for this operation?", and so becomes aware of `map[op][]workerID`.
-- This map does not need to be 100% accurate, so there are many opportunities for optimization.
-- The master could also collect CPU/memory/IO stats and use that information for better scheduling.
- The master schedules a vertex job to the worker that is likely to have the largest number of cached dependency vertices.
-- If none have it, choose randomly
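The scheduling rule above can be sketched as follows. This is an illustrative toy, not the real BuildKit scheduler: `Scheduler`, `CacheKey`, and `Pick` are invented names, and the cache map is treated purely as an approximate hint, per the notes.

```go
package main

import (
	"fmt"
	"math/rand"
)

type CacheKey string

// Scheduler holds the master's approximate view of which workers claim
// to have which cache keys; accuracy only affects scheduling quality.
type Scheduler struct {
	cacheMap map[CacheKey][]string // cache key -> worker IDs claiming it
	workers  []string
}

// Pick returns the worker holding the most dependency caches of a vertex,
// falling back to a random worker when none holds any of them.
func (s *Scheduler) Pick(deps []CacheKey) string {
	hits := map[string]int{}
	for _, d := range deps {
		for _, w := range s.cacheMap[d] {
			hits[w]++
		}
	}
	best, bestHits := "", 0
	for w, n := range hits {
		if n > bestHits {
			best, bestHits = w, n
		}
	}
	if best == "" {
		return s.workers[rand.Intn(len(s.workers))]
	}
	return best
}

func main() {
	s := &Scheduler{
		cacheMap: map[CacheKey][]string{"k1": {"w2"}, "k2": {"w2", "w3"}},
		workers:  []string{"w1", "w2", "w3"},
	}
	fmt.Println(s.Pick([]CacheKey{"k1", "k2"})) // prints "w2": it holds both deps
}
```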

## scalability
- We may be able to use gossip (or something like that) for improving scalability of the cache map

## credential
- Because the worker can need credentials at any time and does not have an open session with the client, it needs to ask the master, which does have an open session with the client.

## local file
- We need to find the same worker that already received the content from the client the first time.
- Or always sync to the master: the master can do the work of a worker (with the caveat of constraints).
47 changes: 47 additions & 0 deletions docs/roadmap.md
@@ -0,0 +1,47 @@
# BuildKit Roadmap (tentative)

This document roughly describes the roadmap of the BuildKit project.
We will be using GitHub Projects and GitHub Milestones for a more detailed roadmap.

## Task 1 (2018Q1)
- Implement all features needed for replacing the legacy builder backend of moby-engine

- Test, test, and test (help needed!)

- Integrate BuildKit into moby-engine as an experimental builder backend (`moby-engine --experimental --build-driver buildkit`)
-- Probably for Linux only

## Task 2 (2018Q2-Q3)

- Promote BuildKit to the default moby-engine build backend
-- At this point, the BuildKit API and the LLB spec do not need to be stabilized

## Task 3 (2018Q1-Q2)

- Implement basic distributed mode

## Task 4 (2018-2019)

- Stabilize the BuildKit API and the LLB spec

## Task 5 (2018-2019)
- Optimize distributed mode, especially on scalability of the distributed cache map (gossip protocol or something like that?)

## Task 6+
- Stabilize distributed mode
- Add more features

- - -

TODO: DAG-ify these roadmap tasks in a prettier format

<!-- e.g. a PERT chart, but we should NOT be too bureaucratic :P -->


```
1 ----> 2
|
v
3 ----> 4
\___> 5-->6+
```
