Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEP-0060: Update proposal and mark Remote Resolution implementable #547

Merged
merged 1 commit into from Dec 14, 2021
Merged

TEP-0060: Update proposal and mark Remote Resolution implementable #547

merged 1 commit into from Dec 14, 2021

Conversation

ghost
Copy link

@ghost ghost commented Oct 21, 2021

Remote resolution is currently in proposed state while work's been
going on to try and validate some of the alternative approaches being
put forward.

This commit moves TEP-0060 to implementable, updates the proposed
approach and adds notes about the experimental controller built to
validate ideas from the TEP.

There is an accompanying PR
in experimental with a proof-of-concept controller implementing a
couple of the alternatives presented here.

@tekton-robot tekton-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 21, 2021
@ghost
Copy link
Author

ghost commented Oct 21, 2021

/kind tep

@tekton-robot tekton-robot added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Oct 21, 2021
@bobcatfish
Copy link
Contributor

/assign @afrittoli
/assign @vdemeester
/assign @bobcatfish
/assign @wlynch

Copy link
Contributor

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the new syntax and the proposal overall! The best part of it imo is that if we are in agreement about the syntax, we can get this into the hands of users to get feedback - and still change the implementation (e.g. maybe we decide CRD based isn't the way to go) without being disruptive

i'm sure others will have thoughts but for my part 💯 :

/approve


Add new fields to `taskRef` and `pipelineRef` to set the resolver and
its parameters. Access to these new fields will be locked behind the
`alpha` feature gate.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, i like the new syntax!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Storage of task and pipeline definitions could be considered as a runtime concern - a TaskRun or a PipelineRun would ask to execute the definition specified by a certain Ref.

When it comes to a Pipeline though, the definition of the Task used by the Pipeline can be considered an author time concern, and that is not well defined unless we specify a taskRef. Today the taskRef is only resolvable (unless bundles are used) - when a Pipeline is installed on a cluster - which caused a lot of debate when defining how to store Pipelines in the catalog.

Having tasks stored in git or container registries makes the definition available independently of the running cluster, and thus Pipeline definitions more self-consistent.
It may reduce Pipeline reusability, however I would argue that if someone wants to run a Pipeline from the catalog using same Tasks from a different source, that would actually be a different different pipeline, as we have no way today to identify a Task as itself globally, independent of the storage location.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could still over time add the ability to override the taskRef resolver at runtime?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@afrittoli yeah I agree - I don't think anything we're proposing here gets in the way of us implementing a feature like taskRef rewriting down the road. Designing and implementing this in a Resolver first may help us figure out the edge cases before migrating that logic into Pipelines itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking some more about this, and I think there is an issue with this syntax unless we introduce a concept of bundle and bundle resolution.

Let's consider a repo that contains contains a Pipeline that users some Tasks, defined alongside the Pipeline.
If we want to use SHAs (as opposed to tags) to references to resources, it's impossible to release / publish the Pipeline alongside the Tasks in a consistent version, since the Pipeline definition must include the version of the Tasks that are being published, and that version (in SHA format) is only known after the publishing.

This is true when the target is a container registry but it could also apply to git. If we wanted to create a git tag, we would have to make a PR to update the pipeline definition.

Possible solutions:

  • always release tasks and pipelines independently - however that would make the releasing workflow much more complex
  • always define the task spec embedded into the Pipeline, however we might not want to do that since we cannot define the same metadata with embedded specs and we might want to share the Task with other pipelines

The other option is to have a concept of bundle: if a Pipeline from a remote resource includes Tasks from the same resource, it may refer to them just by name, and the controller would look them up in the same "bundle" i.e. from the same source / version.

We might even consider decoupling the concept of "bundle" from the "container registry" storage, so that we could have bundles with different backends, such as git, object storage or else

Copy link
Author

@ghost ghost Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that makes total sense! What's the process from here? Do you want to propose a new TEP to start the ball rolling on standardizing that approach?

It would also be possible for a resolver to "figure out" how to rewrite a Pipeline's YAML based on similar rules to the ones you described. E.g. consider the following pipelineRef submitted in a PipelineRun:

pipelineRef:
  resolver: git
  resource:
    repo: github.com/sbwsg/stuff.git
    path: /foo/bar/pipeline.yaml
    commit: abc123

The git Resolver clones the repo, checks out the commit, and reads /foo/bar/pipeline.yaml. Then it "transforms" the PipelineTask taskRefs, rewriting any that match the same repo. So this is the pipeline yaml before the transform:

kind: Pipeline
# ... metadata etc ...
spec:
  tasks:
  - name: get-code
    taskRef:
      resolver: git
      resource:
        repo: github.com/sbwsg/stuff.git
        path: /foo/baz/git-clone.yaml

And this is the pipeline yaml after the transform:

kind: Pipeline
# ... metadata etc ...
spec:
  tasks:
  - name: get-code
    taskRef:
      resolver: git
      resource:
        repo: github.com/sbwsg/stuff.git
        path: /foo/baz/git-clone.yaml
        commit: abc123 # <-- This has been copied from the pipelineRef to the taskRef by the git resolver

And finally the resolver returns this transformed YAML.

These kinds of features and rules can be implemented per-Resolver until Tekton Pipelines stabilizes on a common syntax for non-OCI bundles per your suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

We don't have to make it a mandatory feature of a resolver, but we should at least define how a resolver that supports "bundles" should behave.

`TaskRun` that issued this request. If the runs are deleted, their
requests are cleaned up. In future this may also be useful for
de-duplicating requests in the same namespace for the same remote
Tekton resource.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

repositories. This is new functionality that is not supported by Tekton
Pipelines currently. In addition to basic support for fetching files
from git this resolver will also need to support quite rich
configuration (which can be developed over time):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 makes sense! i guess we can choose whether it makes sense to provide this in the request, or configure this on the resolver itself (or maybe both, with one overriding the other)

- How will it be tested in isolation vs with other components?
Eventually the resolution project may reach a point of maturity that
Tekton Pipelines opts to use it for all `taskRef` and `pipelineRef`
resolution. At this point Pipelines will need additional test coverage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this point i'm thinking we would also start packaging some or all of the resolvers by default with Tekton Pipelines? pointing this out b/c i couldnt find a section of the TEP that mentioned this, might be worth adding somewhere (np if it's already here and i missed it haha)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I've updated with an explicit mention of bundling resolvers with pipelines if the project reaches an agreed level of maturity. This is in the Design Details section.

- Correctly translating all `taskRefs` and `pipelineRefs` to
`ResourceRequests`.
- End-to-end behaviour of all the resolvers supported by Pipelines "out
of the box".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it doesnt need to be mentioned but using this in our dogfooding would be a great way of testing it as well - and i think we could really use it to reduce some of our current duplication

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheers! I've added a call out to dogfooding in the Test Plan.

@@ -887,6 +1127,7 @@ Pros:

Cons:
- We'd still need to decide precisely what the protocol for resolution is.
- `PipelineRunPending` is documented for users / third-party tools, not intended for Pipelines' internal use-cases.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another potential con is that someone who is already using this status for some other purpose wouldn't be able to also use the feature

but i think a mitigation of that (maybe something we might want to do anyway?) would be to make PipelineRunPending more like a semaphore - i.e. make it a count so you could set it to "2" to use it for both remote resolution and some other purpose; both controllers would decrement it after they've done whatever they need to do. maybe that's crazy tho haha

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I've added this as an additional con with the assumption that any decisions we make vis-a-vis synchronization primitives like semaphores would happen in a dedicated TEP.

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 26, 2021
Copy link
Member

@wlynch wlynch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super excited about this! :D

the Pipelines codebase.
- New approaches to resolution and resource handling can be built,
tested and productionized without burdening the Tekton Pipelines
project directly.

## Drawbacks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A drawback I noticed with custom tasks that also applies here is that RBAC doesn't have good support for resource subtypes. e.g. there's no way to restrict a Git Resource resolver to only have access to git types - we're relying on implementors to do the right thing and only respond to ResourceRequests of their corresponding types.

I don't think this is necessarily blocking - this is a drawback we see all over the k8s ecosystem (e.g. the Tekton Pipeline controller has full Pod management capabilities - users trust that we only ever operate on Tekton managed pods), but I wish we could do better. :(

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I've added this as an additional drawback. I think there might be some interesting ways to help mitigate this problem but nothing springs to mind at the moment which would allow finer-grained RBAC to manage precisely which resolvers have access to specific types of ResourceRequest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if some sort of duck typing could be used here 🤔

(e.g. the controller creates un/semi-structured requests based on a common schema, but these get resolved to individual apiVersion+kinds)

type: Succeeded
```

### Resolver specifics
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we expecting auth to be handled for ResourceRequests? My assumption is that the Resolver will have its own secrets and use those to interact with external sources, but how do we know whether a given namespace / service account / user / etc is allowed to have access to those Resources?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will depend a bit on the requirements of each resolver. Taking Git as an example, during initial development my plan is to support configuration via a config file mounted e.g. from a ConfigMap.

The config for git resolution will allow an operator to specify global and per-repository settings that can limit (a) which repos are accessible at all 1 and (b) which repos are allowed from specific namespaces. Here's a bit of hand-waved config to center the discussion:

allowed-repositories:

- name: tekton-catalog
  repository_url: https://github.com/tektoncd/catalog.git

- name: app-team-1-private-repo
  repository_url: [email protected]:app-team-1
  allowed-namespaces:
  - app-team-1

In this example config the operator has cut off access to all repos except for the public tekton catalog and an app team's private repo. Access to the app team's private repo is limited to ResourceRequest objects created in the app-team-1 namespace. A git resolver can enforce this by virtue of the fact that ResourceRequests are namespace-scoped.

Looking a bit further forward, as part of the Workflows WG we've been discussing a Repository CRD object that could expose some or all of this configuration for use across multiple applications, not just by a git resolver but also, for example, by a controller polling a repo for changes).

Footnotes

  1. While the scope of this discussion is around repo access ideally the available config for git resolution is going to be more granular than that. E.g. let operators disallow resources pulled from any but a limited set of subdirectories in a repo, exact file paths within a repo, pulled from any but specific commit digests, or only from signed commits (with a specific set of authors' public keys).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM - I'd like to dig into the config syntax a bit more (I think we would benefit by leveraging a similar syntax as https://git-scm.com/docs/gitcredentials#_credential_contexts), but happy to punt this to another discussion to keep the TEP moving.

`apply`ing or `delete`ing them.

A resolver observes `ResourceRequest` objects, filtering on the
`resolution.tekton.dev/resolver` label to find only those it is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will users know what resolvers are available to them?
Would it be worth having a simple ResourceResolver CRD that exposes metadata of the supported resolver types on the cluster?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea a lot. It reminds me of the use of ClusterInterceptor in Triggers.

I am wondering if we can sneak up on this requirement during alpha rather than introduce an additional CRD in this TEP though. For example: what if we added a specific label or annotation on resolver deployments in the tekton-remote-resolution namespace? This way an interested party (like the Pipelines controller) who has read access to tekton-remote-resolution can query for the available resolvers but we don't need to pay the cost of specifying an additional custom resource type until we know it's useful and necessary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Comment on lines +697 to +740
The parameters for a "git" `ResourceRequest` will initially look as
follows:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd highly recommend adding provider data into the params. It might even be easier to recommend separate github, gitlab, bitbucket, etc resolvers instead of a centralized git resolver (at least to start).

Defaulting to provider specific resolvers has some benefits:

  1. Allows you to target provider specific file fetching methods (e.g. GitHub, GitLab) - this avoids needing to clone the repo entirely.
  2. Makes self-hosted instances easier to identify and work with - i.e. how do you know what provider / auth method to use if the URL is git.example.com? Even if you're planning to use standard git protocols, small details like the username used differ per-provider.
  3. Generally seems easier to handle provider edge cases (e.g. if users need to provide GitHub App specific identifiers)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are several distinct ideas here that are worth separating out for discussion purposes:

  1. Support provider-specific optimizations when requesting resources (e.g. use github's api to read a file if possible rather than cloning an entire repo).
  2. Create provider-specific resolvers.
  3. Allow users to pick which provider is used in their taskRef or pipelineRef.

Let's look at each of these individually:

Support provider-specific optimizations

I agree with this one but am not sure it necessarily implies per-provider resolvers or exposing the provider as user-facing config (vs operator- or admin-facing).

Users should specify the provider in their taskRef or pipelineRef

My assumption here is that the git repo provider is likely to be extremely static information - I don't think a repository is going to hop from being a github repo to being a gitlab repo to being a gerrit repo within a short amount of time. Given the static nature of the information I think allowing users to specify the provider of a given git repo with every taskrun or pipelinerun would end up being redundant.

In addition to that I'm not totally convinced that this is configuration you want to put in the hands of end-users. Operators or administrators, definitely, but not necessarily end-users. That's not to discount the idea entirely - just that I think we should figure out some use cases for making this configurable by "everybody".

So instead what I'd suggest is that we make this operator-level configuration supplied to the git resolver:

allowed-repositories:

- name: tekton-catalog
  repository_url: https://github.com/tektoncd/catalog.git
  provider: github

- name: app-team-1-private-repo
  repository_url: [email protected]:app-team-1
  provider: github-enterprise

By default the git resolver can treat every git repo the same. When a repository is attributed to a specific provider though there can be code paths in the git resolver that can use provider-specific optimizations to fetch a resource.

But! We might not have code paths for every git repo provider under the sun which leads on to...

Provider-Specific Resolvers

I think one of the nice things about the syntax / design as proposed is that we can provide a general purpose git resolver out of the box with optimizations for the most popular providers. Someone else can in turn implement resolvers for their own niche-git-provider which use specific optimizations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support provider-specific optimizations when requesting resources (e.g. use github's api to read a file if possible rather than cloning an entire repo).

Note that using APIs instead of git directly (GitHub API) has its set of limitation too (rate limits, …). Also are we making the assumption that you need to do a git clone with all path and all history to get a specific commit and path from a git repository ? Because I don't think this assumption is completely valid, there is way to "clone" only the thing you are interested into (without grabing the full history of the repo nor the full content of it at a given point)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I think this is also a fair stance to take! A lot of this is going to have to shake out at implementation time I think.

In terms of the TEP I think we can look at this from the perspectives of different user profiles:

  • Users who submit TaskRuns and PipelineRuns shouldn't have to worry about any of the nuanced detail of interacting with the repo.
  • Operators maybe care about some of the specifics so they can configure those things once.
  • Developers of git Resolvers have to really care about it and figure it all out in code, test cases and abstractions.

### Non-Goals
- Integrate mechanisms to verify remote tasks and pipelines
before they are executed by Pipelines' reconcilers. See
[TEP-0091](https://github.com/tektoncd/community/pull/537) on this.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

teps/0060-remote-resource-resolution.md Show resolved Hide resolved

Add new fields to `taskRef` and `pipelineRef` to set the resolver and
its parameters. Access to these new fields will be locked behind the
`alpha` feature gate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Storage of task and pipeline definitions could be considered as a runtime concern - a TaskRun or a PipelineRun would ask to execute the definition specified by a certain Ref.

When it comes to a Pipeline though, the definition of the Task used by the Pipeline can be considered an author time concern, and that is not well defined unless we specify a taskRef. Today the taskRef is only resolvable (unless bundles are used) - when a Pipeline is installed on a cluster - which caused a lot of debate when defining how to store Pipelines in the catalog.

Having tasks stored in git or container registries makes the definition available independently of the running cluster, and thus Pipeline definitions more self-consistent.
It may reduce Pipeline reusability, however I would argue that if someone wants to run a Pipeline from the catalog using same Tasks from a different source, that would actually be a different different pipeline, as we have no way today to identify a Task as itself globally, independent of the storage location.


Add new fields to `taskRef` and `pipelineRef` to set the resolver and
its parameters. Access to these new fields will be locked behind the
`alpha` feature gate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could still over time add the ability to override the taskRef resolver at runtime?

@pritidesai
Copy link
Member

in progress, responding to feedback

type: Succeeded
```

### Resolver specifics
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM - I'd like to dig into the config syntax a bit more (I think we would benefit by leveraging a similar syntax as https://git-scm.com/docs/gitcredentials#_credential_contexts), but happy to punt this to another discussion to keep the TEP moving.

the Pipelines codebase.
- New approaches to resolution and resource handling can be built,
tested and productionized without burdening the Tekton Pipelines
project directly.

## Drawbacks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if some sort of duck typing could be used here 🤔

(e.g. the controller creates un/semi-structured requests based on a common schema, but these get resolved to individual apiVersion+kinds)

`apply`ing or `delete`ing them.

A resolver observes `ResourceRequest` objects, filtering on the
`resolution.tekton.dev/resolver` label to find only those it is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

machine-readable `reason` and human-readable `message` explaining the
failure.

### 3. Create a new Tekton Resolution project
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this now going to be a non-experimental project? If so, is the intention to keep this out of tektoncd/pipeline longterm?
I think it's fine to experiment with this a bit, but eventually I would see this living in tektoncd/pipeline directly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - good catch, this is really intended to be a production-ready project, definitely not experimental anymore, with its own project in the tektoncd org. I'll update the TEP with more detail, cheers!

I'm a bit unsure how to proceed wrt this living in tektoncd/pipeline. Would it be possible to clarify in more precise terms what you want to live in the Pipelines project? The reason I request this is that previous review cycles on this TEP (and in discussions before it was created) have really pushed back on that idea a lot, so I've intentionally directed the design away from it over the past six months. Here are a number of instances of feedback I am thinking about:

I guess my question becomes: how do we resolve this design question once and for all? (...pun intended). Should this be a separate project or should we bake this functionality directly into Tekton Pipelines?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the TEP with the production-readiness note about this new project.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my question becomes: how do we resolve this design question once and for all? (...pun intended). Should this be a separate project or should we bake this functionality directly into Tekton Pipelines?

My usual answer here is : if we can do it in a separate project let's do it, and see in the future if it make sense to bake it into pipeline. The more tektoncd/pipeline mature / gets adopted, the more causious we have to be into getting things in 😅

Copy link
Author

@ghost ghost Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tend to like the idea of keeping the responsibilities separated along project boundaries here, for a couple of reasons:

  • Faster iteration times bootstrapping a new project vs dumping giant PRs on the Pipelines team.
  • Reduce the chance of affecting other parts of Pipeline's codebase.
  • The whole "yes is forever" thing. We can always revisit this decision.
  • (edit): One of the goals is also to let this project be used by other projects. Baking into Pipelines doesn't make that goal easier I don't think.

I'll raise this in the API WG when I'm back from holiday (Nov 29). I am hopeful we can figure out how to proceed on this one with just a bit more discussion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback from the API WG just now: will this be embedded in Tekton Pipelines directly eventually? Maybe - might make sense or might not. Tekton Operator can be configured to deploy this, Pipelines itself could embed it, or it could be separate forever. In the short term keeping it separate would work, or placing it in Pipelines and extracting it out later if we decide it's a bad fit might work as well.

@afrittoli - do you have any thoughts what you'd like to see here to close this discussion item? I think this is the last open point that needs to be resolved and I'd like to keep the TEP moving if possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my 2 cents, either option seems fine for now, doesnt seem like it would be too hard to change later

  • separate project: cons: more overhead now (creating, releasing and maintaining a separate project, not clear how to bundle this easily for users
  • inside tekton pipelines: cons: moving into a separate project later = impacts folks who rely on it (but if ppl are relying on it, maybe we wouldnt want to move it out?

At that point Pipelines' maintainers would need to decide how best to
provide default resolvers "out of the box". One possible approach would be
to deploy the `ResourceRequest` lifecycle reconciler and a set of resolvers
as part of Pipelines' `release.yaml`.

## Test Plan
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until now we have not provided users with a recommended way to manage their Tasks.
I think it would be great to provide some "typical task management" examples, one per task provider, and focus some of our tests around them. We could setup different part of dogfooding with different approaches.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, I've added an additional paragraph describing this to the Dogfooding portion of the Test Plan section:

+ Once Remote Resolution is supported in Dogfooding we can also implement
+ testing for recommended "Task Management" approaches. Documenting these
+ will also provide guidelines to help the community structure their own
+ repos and registries of `Pipelines` and `Tasks`.

Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this!
It looks mostly good, I have once concern still with how pipeline can reference to tasks from the same "bundle".

machine-readable `reason` and human-readable `message` explaining the
failure.

### 3. Create a new Tekton Resolution project
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my question becomes: how do we resolve this design question once and for all? (...pun intended). Should this be a separate project or should we bake this functionality directly into Tekton Pipelines?

My usual answer here is : if we can do it in a separate project let's do it, and see in the future if it make sense to bake it into pipeline. The more tektoncd/pipeline mature / gets adopted, the more causious we have to be into getting things in 😅

if any are written. See the [Design Details](#design-details) section
for more on specific resolvers.

The `ResourceRequest` Lifecycle Controller will be responsible for the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would it be used by tektoncd/pipeline controller/machinery ? (might be documented already elsewhere right ?)

Copy link
Author

@ghost ghost Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this answers your question exactly but the plan is for Pipelines to interact with the resolution machinery through helper libraries that the tektoncd/resolver project publishes. The helpers are going to hide as much of the specifics of requesting as possible, but Pipelines may need to thread through something like a ResourceRequest client/lister. But the client / lister would also be imported from the tektoncd/resolver project - not new codegen in Pipelines.

This is described a little bit above with "Function and struct helpers for other projects to use" but I'll expand on that in the Design Details section.

To summarize my thinking very quickly, the Pipelines controllers will interact with the libraries through intermediary structs rather than "raw" CRD or HTTP request objects. So, hand-waving, the reconciler code might end up looking something like:

var errTaskStillResolving = errors.New("taskRef still resolving")

// resolveTask wraps helpers from the tektoncd/resolver project to implement async
// taskRef resolution in the TaskRun reconciler.
func (r *reconciler) resolveTask(ctx context.Context, ref *v1beta1.TaskRef) (*v1beta1.Task, error) {
  if alphaFlagEnabled(ctx) {
    task := &v1beta1.Task{}
    // resp here is a struct like resolutionhelpers.Resolved{ Data: []byte, Annotations: map[string]string } or similar.
    if resp, err := resolutionhelpers.Resolve(ref.Resolver, ref.Resource); err != nil {
      if errors.Is(err, &resolutionhelpers.ErrResolutionInProgress{}) {
        return nil, errTaskStillResolving // non-fatal: a resource request has just been created or still in progress.
      } else {
        return nil, controller.NewPermanentError(err)
      }
    } else {
      if err := resolutionhelpers.Unmarshal(resp, task); err != nil {
        return nil, controller.NewPermanentError(err)
      }
      task.SetDefaults(ctx) // + do any other task initialization stuff
      return task, nil
    }
  }
}

Copy link
Author

@ghost ghost Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the design details section with this info (although not the hyper-specific hand-waved code example 😄 )

Comment on lines 387 to 397
_Risk_: Relying on a CRD may introduce scaling problems that couldn't be
discovered during proof-of-concept testing. Task and Pipeline
definitions may only get larger until a CRD no longer provides enough
space. In busy CI/CD clusters many rapidly created large
`ResourceRequests` may cause API server or database performance to
degrade.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also, possibly, adding one more CRD for each "resource". Depending on the usage of tektoncd/pipeline, it may pressurize etcd at twice the speed 😅

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the text here from "may cause API server or database performance to degrade" to call out etcd specifically: "may cause API server or etcd performance to degrade". I've also updated the potential mitigation here to indicate that any solution which is strictly better than a ResourceRequest CRD is a good alternative so long as it meets the same goals that a CRD does:

  • It doesn't block Pipelines' reconcilers while potentially slow resource fetches are happening.
  • It supports the same approach to parameters.
  • It has a way to re-trigger reconciles of TaskRun and PipelineRun objects when their taskRef or pipelineRefs have been successfully retrieved.

taskRef:
resolver: git
resource:
repository_url: https://github.com/tektoncd/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: wrong repository url 😝, it ain't a repo 👼🏼

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted, fixed cheers!

Comment on lines +697 to +740
The parameters for a "git" `ResourceRequest` will initially look as
follows:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support provider-specific optimizations when requesting resources (e.g. use github's api to read a file if possible rather than cloning an entire repo).

Note that using APIs instead of git directly (GitHub API) has its set of limitation too (rate limits, …). Also are we making the assumption that you need to do a git clone with all path and all history to get a specific commit and path from a git repository ? Because I don't think this assumption is completely valid, there is way to "clone" only the thing you are interested into (without grabing the full history of the repo nor the full content of it at a given point)

Remote resolution is currently in proposed state while work's been
going on to try and validate some of the alternative approaches being
put forward.

This commit moves TEP-0060 to implementable, updates the proposed
approach and adds notes about the experimental controller built to
validate ideas from the TEP.
@ghost
Copy link
Author

ghost commented Dec 7, 2021

Updated with an additional Risk related to data integrity and another alternative implementation. Here's the diff:

diff --git a/teps/0060-remote-resource-resolution.md b/teps/0060-remote-resource-resolution.md
index 0b10fd5..ddbad54 100644
--- a/teps/0060-remote-resource-resolution.md
+++ b/teps/0060-remote-resource-resolution.md
@@ -25,6 +25,7 @@ authors:
   - [Risks and Mitigations](#risks-and-mitigations)
     - [Relying on a CRD as storage for in-lined resolved data](#relying-on-a-crd-as-storage-for-in-lined-resolved-data)
     - [Changing the way it works means potentially rewriting multiple resolvers](#changing-the-way-it-works-means-potentially-rewriting-multiple-resolvers)
+  - [Data Integrity](#data-integrity)
   - [User Experience \(optional\)](#user-experience-optional)
     - [Simple flow: user submits `TaskRun` using public catalog `Task`](#simple-flow-user-submits-taskrun-using-public-catalog-task)
   - [Performance](#performance)
@@ -66,6 +67,8 @@ authors:
     - [Applicability For Other Tekton Projects](#applicability-for-other-tekton-projects-6)
   - [Use an Admission Controller to Perform Resolution](#use-an-admission-controller-to-perform-resolution)
     - [Applicability For Other Tekton Projects](#applicability-for-other-tekton-projects-7)
+  - [Sync Task and Pipeline objects directly into the cluster](#sync-task-and-pipeline-objects-directly-into-the-cluster)
+    - [Applicability For Other Tekton Projects](#applicability-for-other-tekton-projects-8)
 - [Open Questions](#open-questions)
 - [Future Extensions](#future-extensions)
 - [References](#references)
@@ -421,6 +424,28 @@ written to be agnostic about the "protocol". All interaction with
 clients / CRDs will be kept in shared helpers so that a rewrite only
 impacts that shared code.
 
+#### Data Integrity
+
+_Risk_: `ResourceRequest` objects have no data integrity mechanism yet, so
+a motivated actor with access to the cluster and write permissions on
+`ResourceRequest` objects can modify them without detection. This
+becomes a more notable concern when thinking about task verification
+occurring in Resolvers, as is planned in
+[TEP-0091](https://github.com/tektoncd/community/pull/537). A user with
+the necessary permissions could change a `ResourceRequest` object
+containing a Task _after_ Task verification occurred.
+
+_Possible Mitigation_: Tekton already has solutions undergoing design to address
+this problem on two fronts, and so it would make sense to integrate directly
+with one of them:
+1. [TEP-0089 SPIRE support](https://github.com/tektoncd/community/pull/529)
+where Tekton's objects (i.e. a `ResourceRequest`) can be signed by authorized
+workloads (i.e. a `ResourceRequest` Reconciler).
+2. The solution under design in TEP-0086 ([available to read
+here](https://hackmd.io/a6Kl4oS0SaOyBqBPTirzaQ)) which includes content
+addressability as a desirable property of the storage subsystem (OCI
+Registry being a good candidate).
+
 ### User Experience (optional)
 
 #### Simple flow: user submits `TaskRun` using public catalog `Task`
@@ -1221,6 +1246,34 @@ Admission controllers are a Kubernetes-wide concept. It seems reasonable to assu
 other Tekton projects could also leverage this mechanism for their own resource resolution
 needs. Possible opportunity to share controller or libraries to achieve this?
 
+### Sync Task and Pipeline objects directly into the cluster
+
+Rather than using an intermediary data format like `ResourceRequest`
+objects Resolvers could instead pull Tasks and Pipelines out of
+storage and `kubectl apply` them to the cluster to keep them in sync.
+
+Pros:
+- Syncing the contents of a repo into the cluster can happen
+  totally independently of Tekton Pipelines' knowledge.
+- Quite a bit easier to reason about than other options.
+
+Cons:
+- Unclear how this would work for pipelines-as-code use-cases where a
+  user's Pull Request could include Pipeline or Task changes that should
+  be used in testing. On-demand sync as a fallback?
+- Unclear how tasks and pipelines with the same name from multiple
+  branches or commits could co-exist in the same namespace. Potential
+  risk for confusing outcomes here.
+
+#### Applicability For Other Tekton Projects
+
+This solution would be quite specific to Tekton Pipelines if coded to
+only submit Tasks and Pipelines to the cluster. Alternatively resolvers
+could operate with zero understanding of the resources and simply
+`kubectl apply` whatever appears in the repository but this too has
+drawbacks with regard to deciding which resources should be applied and
+which shouldn't.
+
 ## Open Questions

@ghost
Copy link
Author

ghost commented Dec 13, 2021

Currently looking for a non-Googler to further review and/or approve 🙏

Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the replies.
I'm looking forward to these feature.
There are still things that needs to be defined, like:

  • support for bundles
  • support for validation / defaulting
  • support for signature validation (which can be done via admission controller for instance if those are supported)

But we can define all these as part of the new project.

/approve

@ghost
Copy link
Author

ghost commented Dec 13, 2021

Amazing, just as I was requesting the review @afrittoli nailed it, thanks a lot Andrea!

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 14, 2021
@tekton-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: afrittoli, bobcatfish, vdemeester, wlynch

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ghost
Copy link
Author

ghost commented Dec 14, 2021

Hm, is there a way to bump the EasyCLA bot?

@lbernick
Copy link
Member

Hm, is there a way to bump the EasyCLA bot?

this happened to me, you can try closing and reopening the pr

@ghost
Copy link
Author

ghost commented Dec 14, 2021

/close

@tekton-robot
Copy link
Contributor

@sbwsg: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ghost
Copy link
Author

ghost commented Dec 14, 2021

/reopen

@tekton-robot tekton-robot reopened this Dec 14, 2021
@tekton-robot
Copy link
Contributor

@sbwsg: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot merged commit 701070e into tektoncd:main Dec 14, 2021
@ghost
Copy link
Author

ghost commented Dec 14, 2021

Thanks a lot for reviewing everybody - really appreciate the amount of time you have all spent looking at this and giving feedback!

@ghost ghost mentioned this pull request Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
Status: Implementable
Development

Successfully merging this pull request may close these issues.

7 participants