Support new v2 protocol and make concept consistently #1937

Merged: wu-sheng merged 16 commits into master from v6-protocol on Nov 21, 2018

Conversation

wu-sheng
Member

All new protocols, v2, are available in https://github.com/apache/incubator-skywalking-data-collect-protocol/tree/005879debb510206a9de4f3b94d4132ec8ff2179

All new services are in language-agent-v2 and register. gRPC is provided; there is no RESTful API yet.

@ascrutae I need you to get the autotests ready for all of these as soon as possible. Please treat this task as high priority.

@liuhaoyang @ascrutae This should be supported in the new .NET and Node.js releases.

@wu-sheng added the core feature, backend, and feature labels Nov 19, 2018
@wu-sheng added this to the 6.0.0-beta milestone Nov 19, 2018
@coveralls

coveralls commented Nov 19, 2018

Coverage Status

Coverage decreased (-1.2%) to 12.036% when pulling e2cd856 on v6-protocol into 0ae26f8 on master.

@wu-sheng
Member Author

My local tests pass. @ascrutae, I am waiting for your tests to recheck.

@wu-sheng
Member Author

@JaredTan95 I know you are building and publishing a series of videos for SkyWalking, including v6. I really appreciate your contributions.

I want to give you a heads-up: this PR is important before 6.0.0-GA. It makes the concepts consistent across the core code and the UI. New protocols are provided, and the old one (version 1) is still supported.

@wu-sheng
Member Author

I just finished the v2 protocol document in this PR. The text is below. The links do not work in this comment, but they are correct in the doc's Markdown file.


Trace Data Protocol v2

Trace Data Protocol describes the data format between SkyWalking agent/sniffer and backend.

Overview

Trace data protocol is defined and provided in gRPC format.

Each agent/SDK needs to register a service id and a service instance id before reporting any kind of trace
or metric data.

Step 1. Do register

The Register service takes charge of
all register methods. In step 1, we call doServiceRegister first, then doServiceInstanceRegister.

  1. First, call doServiceRegister. The input is serviceName, which can be any UTF-8 string. The return
    value is a KeyValue pair, with serviceName as the key and the service id as the value. Batch is also supported.
  2. Once you have the service id, call doServiceInstanceRegister to register the instance. The input is the service id, a UUID,
    and the register time. The UUID should be unique across the whole distributed environment. The return value is again a KeyValue pair,
    with the UUID as the key and the service instance id as the value. Batch is also supported.

The most important thing to note about registration is that the backend is expected to process it asynchronously, so the return value could be NULL.
In most cases, you need to set a timer to call these services repeatedly until you get a response. The suggested loop cycle is 10s.
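
The retry behavior described above could be sketched as follows. This is a minimal illustration, not SkyWalking agent code: `register_until_ready`, `register_call`, `max_attempts`, and the injectable `sleep` are hypothetical names. In a real agent, `register_call` would wrap the generated gRPC stub for doServiceRegister or doServiceInstanceRegister and return None while no id has been assigned. The 10-second default follows the suggested loop cycle.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def register_until_ready(
    register_call: Callable[[], Optional[T]],
    interval_seconds: float = 10.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call register_call repeatedly until it returns a non-None result.

    register_call is a hypothetical wrapper around a gRPC register method;
    it returns None while the backend has not answered with an id yet.
    """
    attempts = 0
    while True:
        attempts += 1
        result = register_call()
        if result is not None:
            return result
        if max_attempts is not None and attempts >= max_attempts:
            raise TimeoutError(f"no register response after {attempts} attempts")
        # Wait one loop cycle before asking the backend again.
        sleep(interval_seconds)
```

Passing `sleep` as a parameter is only a convenience for testing the loop without real delays; an agent would more likely run this on its own scheduler thread.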

Batch is supported, although most language agents/SDKs have no scenario that requires batch registration. Because a response may carry several entries, we suggest checking the serviceName
and UUID in the response and matching them against your expected values.
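
One way to do that check is sketched below, assuming the response has already been unpacked into a list of (key, value) tuples. `find_registered_id` is a hypothetical helper, not part of the protocol, and the service names in the usage are made up.

```python
from typing import Iterable, Optional, Tuple


def find_registered_id(
    pairs: Iterable[Tuple[str, int]], expected_key: str
) -> Optional[int]:
    """Return the id registered for expected_key, or None if it is absent.

    pairs stands in for the KeyValue pairs of a register response:
    serviceName -> service id, or UUID -> service instance id.
    """
    for key, value in pairs:
        if key == expected_key:
            return value
    return None


# Usage with made-up values: only accept the id reported for our own name.
response_pairs = [("order-service", 3), ("user-service", 5)]
service_id = find_registered_id(response_pairs, "user-service")
```

A None result here can be treated the same way as a NULL response: keep retrying on the next loop cycle.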

Step 2. Send trace and metric

After you have the service id and the service instance id, you can send traces and metrics. Currently we
have

  1. TraceSegmentReportService#collect for the SkyWalking native trace format
  2. JVMMetricReportService#collect for the SkyWalking native JVM format

For the trace format, note the following:

  1. Segment is a SkyWalking concept. It should include all spans of a request within a single OS process, which usually means a single thread, depending on the language.
  2. Spans fall into 3 different groups.
  • EntrySpan
    EntrySpan represents a service provider, and is also the endpoint on the server side. As an APM system, we target
    application servers, so almost all services and MQ consumers are EntrySpan(s).

  • LocalSpan
    LocalSpan represents a normal Java method that is not related to any remote service. It is neither an MQ producer/consumer
    nor a service (e.g. HTTP service) provider/consumer.

  • ExitSpan
    ExitSpan represents a client of a service or an MQ producer. It was named LeafSpan in the early days of SkyWalking.
    For example, accessing a DB through JDBC and reading Redis/Memcached are categorized as ExitSpan.

  3. A span's parent info is called Reference, and it is included in the span. Besides
    trace id, parent segment id, and span id, Reference carries more fields: entry service instance id, parent service instance id,
    entry endpoint, parent endpoint, and network address. Follow SkyWalking Trace Data Protocol v2
    to learn how to get all these fields.
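
The concepts in this list can be pictured with a small data model. This is only a sketch: the class and field names below are illustrative, chosen to mirror the terms used above, and do not reproduce the actual gRPC message definitions in the protocol repository.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class SpanType(Enum):
    ENTRY = "Entry"  # service provider or MQ consumer side
    LOCAL = "Local"  # in-process method, no remote service involved
    EXIT = "Exit"    # client of a service, MQ producer, DB or cache access


@dataclass
class Reference:
    """Parent info carried by a span (illustrative field names)."""
    trace_id: str
    parent_segment_id: str
    span_id: int
    entry_service_instance_id: int
    parent_service_instance_id: int
    entry_endpoint: str
    parent_endpoint: str
    network_address: str


@dataclass
class Span:
    span_id: int
    span_type: SpanType
    operation_name: str
    refs: List[Reference] = field(default_factory=list)


@dataclass
class Segment:
    """All spans of one request inside a single OS process."""
    segment_id: str
    spans: List[Span] = field(default_factory=list)


# Made-up example: an HTTP request that runs local logic and then queries a DB.
segment = Segment(
    segment_id="segment-1",
    spans=[
        Span(0, SpanType.ENTRY, "/orders"),
        Span(1, SpanType.LOCAL, "OrderService.validate"),
        Span(2, SpanType.EXIT, "JDBC/select"),
    ],
)
```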

Step 3. Keep alive.

ServiceInstancePing#doPing should be called every few seconds so that the backend knows this instance is still
alive. The existing service instance id and the UUID used in doServiceInstanceRegister are required.
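
A keep-alive loop along these lines could look like the sketch below. `start_keep_alive` and `do_ping` are hypothetical names: `do_ping` stands in for a wrapper around the ServiceInstancePing#doPing gRPC call. The 5-second default interval is an assumption, since the text only says "every few seconds".

```python
import threading
from typing import Callable


def start_keep_alive(
    do_ping: Callable[[int, str], None],
    service_instance_id: int,
    instance_uuid: str,
    interval_seconds: float = 5.0,
) -> threading.Event:
    """Ping from a daemon thread until the returned event is set."""
    stop = threading.Event()

    def loop() -> None:
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop pings once per interval and exits promptly on stop.
        while not stop.wait(interval_seconds):
            do_ping(service_instance_id, instance_uuid)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Calling `set()` on the returned event stops the pings, for example when the agent shuts down.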

@SkyWalkingRobot

Here are the test report and validation logs

@wu-sheng
Member Author

@ascrutae Could you recheck the new test repo and cases to find out why 145 cases fail? I don't think my changes could have caused this. Correct me if my assumption is wrong.

Member

@ascrutae ascrutae left a comment


LGTM

@wu-sheng merged commit c6ada8c into master Nov 21, 2018
@wu-sheng deleted the v6-protocol branch November 21, 2018 14:33
4 participants