Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metrics/GenevaActions for Clustersync #3785

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

rhamitarora
Copy link
Collaborator

@rhamitarora rhamitarora commented Aug 21, 2024

Which issue this PR addresses:

ARO-9545 and ARO-8659 both JIRA's have common code

What this PR does / why we need it:

  1. Create new clustersync metrics under monitor package. Both syncSets and selectorSyncSets should be merged into the same Geneva metric.
  2. Create a Geneva Action to show the clustersync resource of a cluster.

Test plan for issue:

Unit test cases added.
Need to create respective metrics dashboard in Geneva.

Is there any documentation that needs to be updated for this PR?

Will create TSGs for respective metrics.

How do you know this will function as expected in production?

Monitor from Geneva Dashboard.

@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 5 times, most recently from 3cdbf8c to 8d1a6e9 Compare August 27, 2024 11:30
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 6 times, most recently from b5ac73b to 99fa8df Compare September 11, 2024 07:40
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 18 times, most recently from 08835f8 to 41a6e7c Compare September 17, 2024 12:43
@rhamitarora rhamitarora marked this pull request as ready for review September 17, 2024 14:16
bitoku
bitoku previously approved these changes Nov 7, 2024
Copy link
Collaborator

@tsatam tsatam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, just a few questions about the metric we emit to make sure the metric works for us downstream (dashboarding/alerting).

pkg/monitor/cluster/clustersync.go Show resolved Hide resolved
pkg/monitor/cluster/clustersync.go Show resolved Hide resolved
test/e2e/monitor.go Show resolved Hide resolved
pkg/monitor/cluster/cluster.go Show resolved Hide resolved
@rhamitarora rhamitarora requested a review from tsatam November 25, 2024 05:29
@rhamitarora rhamitarora added enhancement New feature or request size-medium Size medium ready-for-review go Pull requests that update Go code labels Dec 3, 2024
tsatam
tsatam previously approved these changes Dec 4, 2024
@hlipsig
Copy link
Contributor

hlipsig commented Dec 5, 2024

/azp run ci

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@hlipsig
Copy link
Contributor

hlipsig commented Dec 11, 2024

/azp run ci

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch from 4522c6e to 9c145d8 Compare January 2, 2025 14:53
merging 8659 and 9545

Metrics for SyncSet and SelectorSyncSets
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch from 9c145d8 to 3ec319c Compare January 14, 2025 14:35
without Hive, make sure that it doesn't panic when the monitor's
hive.ClusterManager is nil

Compare to preexisting code in emitHiveRegistrationStatus
@kimorris27
Copy link
Contributor

Documenting E2E failure here prior to my commits to attempt to fix it:

run-e2e  | Monitor [It] must run and must not return any errors
run-e2e  | /app/test/e2e/monitor.go:20
run-e2e  | 
run-e2e  |   [PANICKED] Test Panicked
run-e2e  |   In [It] at: /usr/lib/golang/src/runtime/panic.go:261 @ 01/14/25 17:54:52.526
run-e2e  | 
run-e2e  |   runtime error: invalid memory address or nil pointer dereference
run-e2e  | 
run-e2e  |   Full Stack Trace
run-e2e  |     github.com/Azure/ARO-RP/pkg/monitor/cluster.(*Monitor).emitClusterSync(0xc000768000, {0x7f8ec8032368?, 0xc000ff8120?})
run-e2e  |     	/app/pkg/monitor/cluster/clustersync.go:14 +0x4a
run-e2e  |     github.com/Azure/ARO-RP/pkg/monitor/cluster.(*Monitor).Monitor(0xc000768000, {0x7f8ec8032368, 0xc000ff8120})
run-e2e  |     	/app/pkg/monitor/cluster/cluster.go:223 +0xc74
run-e2e  |     github.com/Azure/ARO-RP/test/e2e.glob..func27.1({0x7f8ec8032368, 0xc000ff8120})
run-e2e  |     	/app/test/e2e/monitor.go:32 +0x1b7

@kimorris27 kimorris27 dismissed stale reviews from tsatam and bitoku via 164adbc January 14, 2025 20:47
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch 2 times, most recently from 1f19f29 to d6a1882 Compare January 16, 2025 01:59
@rhamitarora rhamitarora force-pushed the rhamitarora/ARO-9545-syncset-metrics branch from d6a1882 to 0740eea Compare January 16, 2025 02:25
@rhamitarora
Copy link
Collaborator Author

Documenting E2E failure here prior to my commits to attempt to fix it:

run-e2e  | Monitor [It] must run and must not return any errors
run-e2e  | /app/test/e2e/monitor.go:20
run-e2e  | 
run-e2e  |   [PANICKED] Test Panicked
run-e2e  |   In [It] at: /usr/lib/golang/src/runtime/panic.go:261 @ 01/14/25 17:54:52.526
run-e2e  | 
run-e2e  |   runtime error: invalid memory address or nil pointer dereference
run-e2e  | 
run-e2e  |   Full Stack Trace
run-e2e  |     github.com/Azure/ARO-RP/pkg/monitor/cluster.(*Monitor).emitClusterSync(0xc000768000, {0x7f8ec8032368?, 0xc000ff8120?})
run-e2e  |     	/app/pkg/monitor/cluster/clustersync.go:14 +0x4a
run-e2e  |     github.com/Azure/ARO-RP/pkg/monitor/cluster.(*Monitor).Monitor(0xc000768000, {0x7f8ec8032368, 0xc000ff8120})
run-e2e  |     	/app/pkg/monitor/cluster/cluster.go:223 +0xc74
run-e2e  |     github.com/Azure/ARO-RP/test/e2e.glob..func27.1({0x7f8ec8032368, 0xc000ff8120})
run-e2e  |     	/app/test/e2e/monitor.go:32 +0x1b7

I reverted the changes of app/test/e2e/monitor.go and passed NIL for hiveClusterManager

Copy link
Collaborator

@sankur-codes sankur-codes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM. Just a few nitpicks and clean code suggestions.

pkg/monitor/cluster/clustersync.go Show resolved Hide resolved
pkg/monitor/cluster/clustersync.go Show resolved Hide resolved
func (f *frontend) getAdminHiveClusterSync(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
log := ctx.Value(middleware.ContextKeyLog).(*logrus.Entry)
resourceId := strings.TrimPrefix(filepath.Dir(r.URL.Path), "/admin")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Does it make sense to have an empty/malformed check on the resourceId ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this practice is consistent with the approach followed by other APIs as well, ensuring alignment and standardization across implementations. pkg/frontend/admin_openshiftcluster_drainnode.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request firefly Issues or Pull requests owned by Team Firefly go Pull requests that update Go code ready-for-review size-medium Size medium
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants