[v2] Use health check extension in e2e test instead of /metrics #5859

Closed
1 of 2 tasks
yurishkuro opened this issue Aug 18, 2024 · 8 comments · Fixed by #6113
Labels
good first issue (Good for beginners), help wanted (Features that maintainers are willing to accept but do not have cycles to implement), v2

Comments

@yurishkuro
Member

yurishkuro commented Aug 18, 2024

In the current e2e tests we are using the metrics endpoint to check that the v2 binary is up and ready for tests:

   e2e_integration.go:98: Checking if Jaeger-v2 is available on http://localhost:8888/metrics

Since we already introduced a health check extension (#5831), we should be using that instead of /metrics.

Changes required:

  • configure the health check extension in the other config files (#5831, "[v2] Configure healthcheck extension", only did it for all-in-one)
  • change the e2e test to ping the /status endpoint of the health check extension instead of /metrics, and wait for statusOk in the JSON response, not just HTTP 200.
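
A minimal sketch of the second item, assuming the /status response is a JSON object with a top-level `status` string. The field name, the exact casing of the OK value, and the helper names `isStatusOK`, `probe` and `waitForHealthy` are assumptions for illustration, not the actual test code. The `main` function uses a local stub server so that the sketch runs without a Jaeger instance.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// statusResponse models only the field this sketch needs.
// The field name "status" is an assumption about the JSON shape.
type statusResponse struct {
	Status string `json:"status"`
}

// isStatusOK reports whether the JSON body carries an OK status.
// The comparison is case-insensitive because the exact casing is an assumption.
func isStatusOK(body []byte) (bool, error) {
	var resp statusResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	return strings.EqualFold(resp.Status, "StatusOK"), nil
}

// probe performs one GET and requires both HTTP 200 and an OK status in the body.
func probe(ctx context.Context, url string) bool {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return false
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false
	}
	ok, err := isStatusOK(body)
	return err == nil && ok
}

// waitForHealthy polls url at the given interval until probe succeeds or ctx expires.
func waitForHealthy(ctx context.Context, url string, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if probe(ctx, url) {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("service at %s is not healthy: %w", url, ctx.Err())
		case <-ticker.C:
		}
	}
}

func main() {
	// Stub standing in for the health check extension, so the sketch is self-contained.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"status":"StatusOK"}`)
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := waitForHealthy(ctx, srv.URL+"/status", 100*time.Millisecond); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("service is healthy")
}
```

Checking the body in addition to the status code matters if the endpoint can answer HTTP 200 before the service reports itself as ready.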
@yurishkuro yurishkuro added help wanted Features that maintainers are willing to accept but do not have cycles to implement good first issue Good for beginners v2 labels Aug 18, 2024
@yurishkuro yurishkuro removed help wanted Features that maintainers are willing to accept but do not have cycles to implement good first issue Good for beginners labels Aug 21, 2024
yurishkuro added a commit that referenced this issue Aug 23, 2024
**Which problem is this PR solving?**

Part of #5633, part of #5859

**Description of the changes**
* Integrate health check extension to monitor and report Jaeger V2
component's health
* Enhance all-in-one CI test to ping the new health port

**How was this change tested?**

The changes were tested by running the following command:

```bash
make test
```

They were also covered by CI actions and new unit tests.
**Checklist**

- [x] I have read
[CONTRIBUTING_GUIDELINES.md](https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md)
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

---------

Signed-off-by: Wise-Wizard <[email protected]>
Signed-off-by: Yuri Shkuro <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
Co-authored-by: Yuri Shkuro <[email protected]>
@yurishkuro
Member Author

Current state: #5861 added the health check extension to all configs, but the grpc-storage test failed because additional traces from the WriteSpan endpoint are being generated (a big no-no). I am going to merge #5861 with a temporary override in grpc_test that keeps using the original query service port for the health check, but we need to identify the problem and remove that override. Surprisingly, just switching to the query port fixes the issue, while querying 13133 results in extra spans.

My hypothesis is that the healthcheck endpoint is registered with tracing enabled, so when the test hits it, a trace is generated from within the collector, and writing that trace generates the other WriteSpan traces that we're seeing. We need to make sure that we do not let the OTEL framework instantiate a tracer that indiscriminately traces everything. cc @Wise-Wizard

@yurishkuro
Member Author

Reproducer in #5861 (comment)

JaredTan95 pushed a commit to JaredTan95/jaeger that referenced this issue Aug 28, 2024
…5861)

mahadzaryab1 pushed a commit to mahadzaryab1/jaeger that referenced this issue Aug 31, 2024
…5861)

@yurishkuro yurishkuro added help wanted Features that maintainers are willing to accept but do not have cycles to implement good first issue Good for beginners labels Oct 17, 2024
@yurishkuro
Member Author

It looks like most of the work here is done, but there is still one location that queries a different endpoint:

$ rg HealthCheckEndpoint cmd/jaeger
cmd/jaeger/internal/integration/grpc_test.go
36:			HealthCheckEndpoint: fmt.Sprintf("http://localhost:%d/", ports.QueryHTTP),
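
For illustration only, a sketch of what the endpoint string might look like once the override is removed. It assumes the health check extension listens on port 13133 (the port mentioned earlier in this thread) and serves /status (the path named in the issue description). `healthCheckPort` and `healthCheckURL` are hypothetical names, not identifiers from the repository.

```go
package main

import "fmt"

// healthCheckPort is an assumption based on the port mentioned earlier in this thread.
const healthCheckPort = 13133

// healthCheckURL builds the /status URL for a health check extension on localhost.
func healthCheckURL(port int) string {
	return fmt.Sprintf("http://localhost:%d/status", port)
}

func main() {
	fmt.Println(healthCheckURL(healthCheckPort))
}
```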

@mahadzaryab1
Collaborator

@yurishkuro how should we go about not generating the extra traces from the healthcheckextension?

@yurishkuro
Member Author

are we generating them now?

@mahadzaryab1
Collaborator

@yurishkuro yeah, I ran your reproducer code from the original PR and it appears to hit the same failure.

@yurishkuro
Member Author

Reproducer and fix #6113

@yurishkuro
Member Author

The extra trace is not generated by the health check extension; it comes from the gRPC storage (issue #5971).

yurishkuro pushed a commit that referenced this issue Oct 27, 2024

## Which problem is this PR solving?
- Fixes #5971 
- Towards #6113 and #5859

## Description of the changes
- This PR fixes an issue where the gRPC remote storage client was
provided with a tracer, which resulted in an infinite loop of trace
generation: writing a trace to storage generated a new trace that in
turn needed to be written, and so on. The fix uses a noop tracer for
the writer clients, so that no traces are generated on the write path
while the read path is still traced.
- This is likely just a temporary fix, and we'll want to monitor
open-telemetry/opentelemetry-collector#10663
for a better long-term fix.
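
The feedback loop described above can be illustrated with a toy model. This is a sketch, not the Jaeger or OpenTelemetry code: the `tracer` interface, `queueTracer`, `noopTracer` and `countWrites` are hypothetical names, and the `limit` parameter exists only to stop the otherwise endless loop.

```go
package main

import "fmt"

// tracer is a hypothetical minimal interface for this toy model;
// it is not the OpenTelemetry API.
type tracer interface {
	startSpan(name string)
}

// queueTracer models a traced write path: every span it creates
// is queued to be written to storage again.
type queueTracer struct{ pending *[]string }

func (t queueTracer) startSpan(name string) {
	*t.pending = append(*t.pending, name)
}

// noopTracer drops every span, so writes produce no new work.
type noopTracer struct{}

func (noopTracer) startSpan(string) {}

// countWrites queues one initial span and keeps writing until the queue
// is empty or limit writes have happened. Each write is itself traced
// with the tracer returned by newTracer.
func countWrites(newTracer func(pending *[]string) tracer, limit int) int {
	pending := []string{"user-span"}
	tr := newTracer(&pending)
	writes := 0
	for len(pending) > 0 && writes < limit {
		pending = pending[1:] // write the oldest queued span
		writes++
		tr.startSpan("WriteSpan") // tracing the write may queue another span
	}
	return writes
}

func main() {
	traced := countWrites(func(p *[]string) tracer { return queueTracer{pending: p} }, 1000)
	noop := countWrites(func(*[]string) tracer { return noopTracer{} }, 1000)
	fmt.Println("writes with a tracing writer:", traced)
	fmt.Println("writes with a noop tracer:", noop)
}
```

In this model the tracing writer only stops because of the artificial limit, while the noop variant performs exactly one write for the one initial span.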

## How was this change tested?
- Added the healthcheck endpoint which was previously failing in #6113.

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

## Co-Authors 
This PR is a continuation of
#5979
Co-authored-by: cx <[email protected]>

---------

Signed-off-by: Mahad Zaryab <[email protected]>
Projects
Status: Done
2 participants