Proposal for tcp_long_connection_metrics #1224

Draft
wants to merge 17 commits into
base: main
from

Conversation

yp969803
Contributor

@yp969803 yp969803 commented Feb 6, 2025

What type of PR is this?
Proposal for "Metrics for TCP Long Connection"
LFX 2025 term-1

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #1211

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


@kmesh-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign supercharge-xsy for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kmesh-bot kmesh-bot added size/L and removed size/S labels Feb 6, 2025
@yp969803
Contributor Author

yp969803 commented Feb 6, 2025

Some questions:

  • Do we also need access logs for TCP long connections, or are metrics alone enough?
  • Do the current logs support the OpenTelemetry format?
  • If not, do we need OpenTelemetry-compatible logs?

@LiZhenCheng9527 @nlgwcy @hzxuzhonghu

@hzxuzhonghu
Member

Do we also need access logs for TCP long connections, or are metrics alone enough?

The access log is printed after the connection is closed. We can keep it as it is now.

Integrating with OTEL sounds reasonable to me.

@LiZhenCheng9527
Collaborator

Some questions:

  • Do we also need access logs for TCP long connections, or are metrics alone enough?
  • Do the current logs support the OpenTelemetry format?
  • If not, do we need OpenTelemetry-compatible logs?

@LiZhenCheng9527 @nlgwcy @hzxuzhonghu

  • In the proposal, you need to include the design of both the access log and the metrics. In the code, you can focus on metrics only.
  • Currently, the Kmesh access log does not support the OTEL format. We will support it later.
  • You can, if you want to take on this work. : )


- Reporting of metrics and access logs at a periodic interval or based on throughput (e.g. after every 1 MB of data transferred).

- Users can fine-tune the reporting period and the throughput threshold via YAML at Kmesh deployment time, or with the CLI tool kmeshctl at any time.
Collaborator

What does this mean? How would a user fine-tune the time and throughput?

Contributor Author

I was thinking of giving users the option to set their preferred reporting period and threshold when the Kmesh daemon starts, by setting the values in the values.yaml file.

Collaborator

Understood.
You can update the description accordingly.

}
```

The period and the threshold are provided by the user; if they are not set, defaults of 5 seconds and 1 MB are used.
Collaborator

Can you explain in the proposal why 1 MB is the threshold?

Contributor Author

If the threshold were set too low, the system might generate too many reports, leading to noise and increased processing overhead. A 1 MB threshold sounds appropriate to me. We are also giving users the option to set their own threshold if they are not satisfied with 1 MB.
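
For readers following this thread, here is a minimal sketch of the reporting rule under discussion. It is not code from this PR. The names `reportPolicy`, `newReportPolicy`, and `shouldReport` are hypothetical. The sketch assumes that both triggers are active at the same time and that "1 MB" means 1 MiB; the proposal excerpt does not settle either point.

```go
package main

import (
	"fmt"
	"time"
)

// Defaults quoted in the proposal excerpt: 5 seconds and 1 MB.
const (
	defaultPeriod    = 5 * time.Second
	defaultThreshold = 1 << 20 // assumption: 1 MB is treated as 1 MiB
)

// reportPolicy is a hypothetical type for this sketch, not a Kmesh type.
type reportPolicy struct {
	period    time.Duration // maximum time between two reports
	threshold uint64        // bytes transferred that force a report
}

// newReportPolicy falls back to the defaults for any value left unset (zero).
func newReportPolicy(period time.Duration, threshold uint64) reportPolicy {
	if period <= 0 {
		period = defaultPeriod
	}
	if threshold == 0 {
		threshold = defaultThreshold
	}
	return reportPolicy{period: period, threshold: threshold}
}

// shouldReport is true when the period has elapsed since the last report,
// or when at least threshold bytes were transferred since the last report.
func (p reportPolicy) shouldReport(sinceLast time.Duration, bytesSinceLast uint64) bool {
	return sinceLast >= p.period || bytesSinceLast >= p.threshold
}

func main() {
	p := newReportPolicy(0, 0) // user supplied nothing, so defaults apply
	fmt.Println(p.shouldReport(2*time.Second, 512*1024))
	fmt.Println(p.shouldReport(2*time.Second, 2*1024*1024))
	fmt.Println(p.shouldReport(6*time.Second, 0))
}
```

With a lower threshold, `shouldReport` fires more often for the same traffic, which is the noise and overhead trade-off described above.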

@LiZhenCheng9527
Collaborator

@nlgwcy PTAL

@yp969803
Contributor Author

Sorry for the inactivity; I have been sick for the last 6 days. I will reply to all your queries and complete my proposal ASAP.

authors:
- "yp969803"
reviewers:
- "nglwcy"
Member

Suggested change
- "nglwcy"
- "nlgwcy"

know that this has succeeded?
-->

- Collect detailed traffic metrics (e.g. bytes sent/received, direction, throughput, round-trip time, latency, state changes) continuously during the lifetime of long TCP connections using eBPF.
Contributor

It is recommended that TCP retransmission and packet loss measurement indicators be added.
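
To make the metric list concrete, here is a rough user-space sketch of how per-interval values could be derived from two snapshots of cumulative per-connection counters. It is not taken from the proposal. The `connSample` and `intervalStats` types and their field names are hypothetical, and the sketch assumes the counters only ever increase.

```go
package main

import "fmt"

// connSample is a hypothetical snapshot of one connection's cumulative
// counters. The field names are illustrative, not the proposal's schema.
type connSample struct {
	timestampNs uint64 // monotonic time of the snapshot, in nanoseconds
	sentBytes   uint64 // total bytes sent so far
	recvBytes   uint64 // total bytes received so far
	retransmits uint64 // total retransmitted segments so far
}

// intervalStats holds values derived from two consecutive snapshots.
type intervalStats struct {
	sentBytesPerSec float64
	recvBytesPerSec float64
	retransmits     uint64
}

// diff computes per-interval stats. It returns false when no time has
// passed between the snapshots, so no rate can be computed.
// Assumption: counters never decrease between prev and cur.
func diff(prev, cur connSample) (intervalStats, bool) {
	if cur.timestampNs <= prev.timestampNs {
		return intervalStats{}, false
	}
	secs := float64(cur.timestampNs-prev.timestampNs) / 1e9
	return intervalStats{
		sentBytesPerSec: float64(cur.sentBytes-prev.sentBytes) / secs,
		recvBytesPerSec: float64(cur.recvBytes-prev.recvBytes) / secs,
		retransmits:     cur.retransmits - prev.retransmits,
	}, true
}

func main() {
	prev := connSample{timestampNs: 0}
	cur := connSample{timestampNs: 2_000_000_000, sentBytes: 4000, recvBytes: 1000, retransmits: 3}
	if s, ok := diff(prev, cur); ok {
		fmt.Printf("sent=%.0f B/s recv=%.0f B/s retransmits=%d\n",
			s.sentBytesPerSec, s.recvBytesPerSec, s.retransmits)
	}
}
```

Keeping cumulative counters in the snapshot and deriving rates afterwards means a missed report only widens the interval rather than losing data.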

@yp969803
Contributor Author

@LiZhenCheng9527 could you review the eBPF code in the proposal?

Development

Successfully merging this pull request may close these issues.

[lfx-mentorship-2025-Mar-May] Metrics for TCP Long Connection
5 participants