
feat: Live resource view #267

Open · wants to merge 17 commits into main from feat/live-resource-view
Conversation

@jannfis (Collaborator) commented Jan 6, 2025

What does this PR do / why we need it:

This PR introduces functionality to provide a live resource view from the agents to the control plane. There are still a few caveats, and some things to sort out.

The general architecture is documented here.

It's also the prerequisite for other upcoming features, such as log view, resource actions, and resource manipulation, which is why I'd like to get the code in early.

The CLI components to manipulate cluster secrets will be a follow-up PR.

Which issue(s) this PR fixes:

Fixes #?

How to test changes / Special notes to the reviewer:

The CLI code for managing cluster secrets will be submitted in a separate PR, so the code in here is potentially difficult to test.

Checklist

  • Documentation update is required by this PR (and has been updated) OR no documentation update is required.

@jannfis jannfis force-pushed the feat/live-resource-view branch from d83a0f9 to a7c58b7 Compare January 6, 2025 15:33
@codecov-commenter commented Jan 6, 2025

Codecov Report

Attention: Patch coverage is 47.33010% with 651 lines in your changes missing coverage. Please review.

Project coverage is 48.41%. Comparing base (d16afb3) to head (dd5d108).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
cmd/principal/main.go 0.00% 93 Missing ⚠️
principal/server.go 14.94% 68 Missing and 6 partials ⚠️
agent/inbound.go 10.25% 70 Missing ⚠️
cmd/cmdutil/log.go 0.00% 51 Missing ⚠️
internal/event/event.go 7.54% 49 Missing ⚠️
internal/argocd/cluster/informer.go 58.82% 26 Missing and 9 partials ⚠️
principal/resource.go 70.17% 27 Missing and 7 partials ⚠️
internal/argocd/cluster/conversion.go 42.30% 18 Missing and 12 partials ⚠️
cmd/cmdutil/term.go 0.00% 24 Missing ⚠️
principal/resourceproxy/resourceproxy.go 81.35% 14 Missing and 8 partials ⚠️
... and 17 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #267      +/-   ##
==========================================
+ Coverage   48.20%   48.41%   +0.21%     
==========================================
  Files          55       72      +17     
  Lines        5066     6219    +1153     
==========================================
+ Hits         2442     3011     +569     
- Misses       2424     2949     +525     
- Partials      200      259      +59     

☔ View full report in Codecov by Sentry.

@jannfis jannfis force-pushed the feat/live-resource-view branch 2 times, most recently from 262d0e2 to ad9b02d Compare January 8, 2025 17:08
@jannfis jannfis force-pushed the feat/live-resource-view branch from ad9b02d to 1bd19ed Compare January 9, 2025 02:00
Signed-off-by: jannfis <[email protected]>
Signed-off-by: jannfis <[email protected]>
Signed-off-by: jannfis <[email protected]>
Signed-off-by: jannfis <[email protected]>
@jannfis jannfis marked this pull request as ready for review January 14, 2025 01:07
@chetan-rns (Collaborator) left a comment

Added a couple of small questions. Everything else looks good to me.

principal/resource.go Show resolved Hide resolved
principal/server.go Outdated Show resolved Hide resolved
principal/server.go Show resolved Hide resolved
@jgwest (Member) left a comment

Looks great @jannfis, posted my thoughts below on anything that jumped out at me while working through the PR.

And one final question: I didn't see any E2E-style tests for the feature, what are your plans here?

agent/inbound.go Show resolved Hide resolved
agent/inbound.go Show resolved Hide resolved
internal/argocd/cluster/conversion.go Show resolved Hide resolved
internal/argocd/cluster/manager.go Outdated Show resolved Hide resolved
internal/argocd/cluster/manager.go Show resolved Hide resolved
test/testutil/context.go Outdated Show resolved Hide resolved
test/testutil/waitfor.go Outdated Show resolved Hide resolved
internal/resourceproxy/resourceproxy.go Outdated Show resolved Hide resolved
principal/event.go Show resolved Hide resolved
internal/resourceproxy/resourceproxy.go Outdated Show resolved Hide resolved
Comment on lines +118 to +120
if err != nil {
logCtx.Errorf("could not request resource: %v", err)
status = err
Collaborator:
If the err here is not nil, based on my understanding, we were not able to fetch any resources. Should we be returning the error from here itself?

Author (jannfis):
Hm. In my thought process, it is not an error condition for processIncomingResourceRequest itself when we could not fetch the resource from the local cluster. We instead need to let the other side know that resource fetching failed, and the reason (e.g. resource not found or access forbidden).

Collaborator:
That makes sense. Continuing with the same scenario, would the JSON marshaling (line 129 below) be considered more of an internal error, or should we be returning that to the caller as well? My thought is that we might lose the response in scenarios where JSON marshaling fails.

Author (jannfis):
That's a good question, and one that could be asked about any error condition in that particular method.

For now, I think the answer is: we want to transport any status codes from the communication between the agent and the Kubernetes API back to the principal, but nothing else. This is because the principal can then pass that status code on to the consumer of the resource proxy.

We could go ahead and send HTTP 500 status codes back to the principal whenever something goes wrong in processIncomingResourceRequest, then the question is, should we? I have not yet found a good answer to that question. The principal will notice that no reply has been sent from the agent anyway, and will let its consumer know.

cmd/cmdutil/term.go Outdated Show resolved Hide resolved
@jannfis (Collaborator, Author) commented Jan 23, 2025

@jgwest I have not yet thought about e2e tests, tbh. Agreed that we need to have some. I will work on them together with the corresponding CLI PR that's in the queue.
