Trace Identifiers

Current State: Draft
Author: tedsuo

The OpenTracing SpanContext interface is extended to include ToSpanID and ToTraceID accessors.

The OpenTracing model of computation specifies two primary object types, Spans and Traces, but does not specify identifiers for these objects. Identifiers for the two primary object types make it easier to correlate tracing data with data in other systems, simplify important tasks, and allow the creation of reusable trace observers. Some use cases are detailed below.

Background: Existing Protocols

Before discussing changes to the OpenTracing specification, it’s worth reviewing several popular wire protocols which contain these trace identifiers.

Trace-Context HTTP Headers

Trace-Context HTTP headers are in the process of being standardized via the W3C. The tracing community has voiced strong support for implementing these headers for use in tracing interop.

The traceparent header contains the following fields: version, trace-id, span-id, and trace-options.

field	format	description
`trace-id`	128-bit; 32HEXDIG	The ID of the whole trace forest. If all bytes are 0, the `Trace-Parent` may be ignored.
`span-id`	64-bit; 16HEXDIG	The ID of the caller span (parent). If all bytes are 0, the `Trace-Parent` may be ignored.
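As a rough illustration of the fields above, the following Go sketch splits a traceparent value and checks the ID lengths from the table. It assumes the dash-separated layout `version-trace-id-span-id-trace-options` used by the Trace-Context drafts; the `TraceParent` type and the function names are hypothetical and not part of any specification.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// TraceParent holds the four fields of a traceparent header value.
// The type and its field names are hypothetical, chosen for this sketch.
type TraceParent struct {
	Version string
	TraceID string // 32 hex digits (128 bits)
	SpanID  string // 16 hex digits (64 bits)
	Options string
}

// isHex reports whether s consists only of hexadecimal digits.
func isHex(s string) bool {
	for _, c := range s {
		switch {
		case c >= '0' && c <= '9', c >= 'a' && c <= 'f', c >= 'A' && c <= 'F':
		default:
			return false
		}
	}
	return true
}

// ParseTraceParent splits a dash-separated header value (an assumed layout)
// into its fields and checks the ID lengths listed in the table.
func ParseTraceParent(value string) (TraceParent, error) {
	parts := strings.Split(value, "-")
	if len(parts) != 4 {
		return TraceParent{}, errors.New("traceparent: expected 4 fields")
	}
	tp := TraceParent{Version: parts[0], TraceID: parts[1], SpanID: parts[2], Options: parts[3]}
	if len(tp.TraceID) != 32 || !isHex(tp.TraceID) {
		return TraceParent{}, errors.New("traceparent: trace-id must be 32 hex digits")
	}
	if len(tp.SpanID) != 16 || !isHex(tp.SpanID) {
		return TraceParent{}, errors.New("traceparent: span-id must be 16 hex digits")
	}
	return tp, nil
}

// Ignorable reports whether either ID is all zeros, in which case the
// header may be ignored.
func (tp TraceParent) Ignorable() bool {
	return strings.Trim(tp.TraceID, "0") == "" || strings.Trim(tp.SpanID, "0") == ""
}

func main() {
	tp, err := ParseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(tp.TraceID, tp.SpanID, tp.Ignorable())
}
```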

B3 HTTP Headers

The B3 HTTP headers are widely adopted, mostly by Zipkin-like tracing systems. The B3 protocol includes X-B3-TraceId and X-B3-SpanId as required headers, which contain the TraceId and SpanId values, respectively.

field	format	description
`TraceId`	64 or 128-bit; opaque	The ID of the trace. Every span in a trace shares this ID.
`SpanId`	64-bit; opaque	Indicates the position of the current operation in the trace tree. The value may or may not be derived from the value of the `traceId`.

Specification Changes

The SpanContext section of the specification is extended to include the following properties:

method	format	description
`ToTraceID`	string	Globally unique. Every span in a trace shares this ID.
`ToSpanID`	string	Unique within a trace. Each span within a trace contains a different ID.

String values are used for identifiers. In this context, a string is defined as an immutable, variable length sequence of characters. The empty string is a valid return value.

A string is preferred over other formats for the following reasons:

Forwards compatibility with future versions of Trace-Context and other standards.
Backwards compatibility with pre-existing ID formats.
Strongly supported across many languages, and commonly used for transferring data between independent subsystems.
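To make the two accessors concrete, here is a minimal Go sketch of what the extended interface might look like. Only the method names `ToTraceID` and `ToSpanID` come from this proposal; the `stringContext` type and the `SameTrace` helper are hypothetical, and a real SpanContext carries other methods that are omitted here.

```go
package main

import "fmt"

// SpanContext shows only the two accessors proposed above.
type SpanContext interface {
	ToTraceID() string
	ToSpanID() string
}

// stringContext is a hypothetical implementation that already stores its
// identifiers as strings.
type stringContext struct {
	traceID string
	spanID  string
}

func (c stringContext) ToTraceID() string { return c.traceID }
func (c stringContext) ToSpanID() string  { return c.spanID }

// SameTrace reports whether two contexts belong to the same trace. Every span
// in a trace shares the trace ID, so comparing the strings is sufficient.
// Empty IDs mean "not set" and never match.
func SameTrace(a, b SpanContext) bool {
	id := a.ToTraceID()
	return id != "" && id == b.ToTraceID()
}

func main() {
	parent := stringContext{traceID: "t1", spanID: "s1"}
	child := stringContext{traceID: "t1", spanID: "s2"}
	fmt.Println(SameTrace(parent, child))
}
```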

Why `ToTraceID` instead of `TraceID`?

The method names ToTraceID and ToSpanID were chosen over TraceID and SpanID for two reasons:

To avoid conflicts with tracer implementations, which often use TraceID and SpanID to expose their identifiers in their native format.
To communicate to the user that a format conversion may be occurring, and thus there may be some cost to calling these methods.
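The naming choice can be pictured with a hypothetical tracer whose native identifiers are 64-bit integers. In this sketch, `TraceID` and `SpanID` keep returning the native format, while `ToTraceID` and `ToSpanID` perform the conversion to string. The `nativeContext` type and the hex encoding are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"strconv"
)

// nativeContext is a hypothetical tracer-specific span context that stores
// its identifiers as 64-bit integers.
type nativeContext struct {
	traceID uint64
	spanID  uint64
}

// TraceID and SpanID expose the identifiers in the tracer's native format.
func (c nativeContext) TraceID() uint64 { return c.traceID }
func (c nativeContext) SpanID() uint64  { return c.spanID }

// ToTraceID and ToSpanID convert the native values to strings. The "To"
// prefix hints that a conversion (and possibly an allocation) takes place.
// Hex is an arbitrary encoding picked for this sketch.
func (c nativeContext) ToTraceID() string { return strconv.FormatUint(c.traceID, 16) }
func (c nativeContext) ToSpanID() string  { return strconv.FormatUint(c.spanID, 16) }

func main() {
	c := nativeContext{traceID: 0xabc123, spanID: 0x2a}
	fmt.Println(c.TraceID(), c.ToTraceID())
	fmt.Println(c.SpanID(), c.ToSpanID())
}
```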

Future Formats

In the future, identifier formats other than string may be added. Two reasons for additional formats were identified:

Avoiding double conversions. Rather than converting from a native format to a string value, and then directly into another format, an additional method could be added to allow the tracer to do the conversion directly. For example, using identifiers as byte arrays may turn out to be popular, and may incur an extra conversion cost from string.
If tracing systems converge on a common trace propagation format, such as Trace-Context, accessors may be added for that format as well.

Backwards Compatibility and Optional Support

The OpenTracing specification does not currently require trace and span identifiers. To continue support for existing tracers, the empty string value can be returned when no ID has been set.
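A tracer without identifiers, and a caller that tolerates it, might look roughly like the Go sketch below. The `legacyContext` type and the `HasIdentifiers` helper are hypothetical names used only to show the empty-string convention.

```go
package main

import "fmt"

// SpanContext shows only the two accessors from this proposal.
type SpanContext interface {
	ToTraceID() string
	ToSpanID() string
}

// legacyContext stands in for an existing tracer that has no identifiers to
// expose. It satisfies the interface by returning empty strings.
type legacyContext struct{}

func (legacyContext) ToTraceID() string { return "" }
func (legacyContext) ToSpanID() string  { return "" }

// HasIdentifiers reports whether both identifiers have been set. Callers can
// use a check like this before relying on the values.
func HasIdentifiers(sc SpanContext) bool {
	return sc.ToTraceID() != "" && sc.ToSpanID() != ""
}

func main() {
	fmt.Println(HasIdentifiers(legacyContext{}))
}
```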

Use Cases

Log Correlation

The primary expected consumers of Trace-Context identifiers are logging systems which run independently from the tracing system.

Log indexing has become a common practice, often by including a request identifier in the log. In the past, this has involved manually propagating these identifiers as headers. However, systems which use OpenTracing automatically propagate these identifiers via the Inject/Extract interface. Some of these identifiers are user-generated, and contained in Baggage. However, the most relevant identifiers for log indexing are the Trace and Span IDs. Therefore, exposing these values would be immensely valuable.
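One way a logging helper could consume the accessors is sketched below in Go. The `Annotate` function, the `trace_id` and `span_id` key names, and the `staticContext` type are all hypothetical; a real logging system would choose its own field names and output format.

```go
package main

import (
	"fmt"
	"strings"
)

// SpanContext shows only the two accessors from this proposal.
type SpanContext interface {
	ToTraceID() string
	ToSpanID() string
}

// staticContext is a hypothetical context with fixed string identifiers.
type staticContext struct {
	traceID string
	spanID  string
}

func (c staticContext) ToTraceID() string { return c.traceID }
func (c staticContext) ToSpanID() string  { return c.spanID }

// Annotate prefixes a log message with whichever identifiers are set, so the
// line can later be indexed by trace or span. Empty identifiers are skipped.
func Annotate(sc SpanContext, msg string) string {
	var b strings.Builder
	if id := sc.ToTraceID(); id != "" {
		b.WriteString("trace_id=" + id + " ")
	}
	if id := sc.ToSpanID(); id != "" {
		b.WriteString("span_id=" + id + " ")
	}
	b.WriteString(msg)
	return b.String()
}

func main() {
	sc := staticContext{traceID: "abc", spanID: "def"}
	fmt.Println(Annotate(sc, "payment accepted"))
}
```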

Trace Observers

The OpenTracing community would like to develop secondary observation systems which utilize the tracing runtime, but are tracer independent. Trace and span identifiers would allow these observers to correlate tracing data without having knowledge of the wire protocol or tracing implementation. Examples include:

Generating metrics from tracing data
Integrating with runtime and system diagnostics
Integrating with 3rd-party context-propagation systems
Correlating logs, as mentioned above
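As an illustration of the first example, a tracer-independent observer that derives a simple metric could be sketched in Go as follows. OpenTracing does not define an observer hook here; `OnSpanFinished`, `SpanCounter`, and `staticContext` are hypothetical names, and the only tracer-facing dependency is the `ToTraceID` accessor.

```go
package main

import (
	"fmt"
	"sync"
)

// SpanContext shows only the two accessors from this proposal.
type SpanContext interface {
	ToTraceID() string
	ToSpanID() string
}

// staticContext is a hypothetical context with fixed string identifiers.
type staticContext struct {
	traceID string
	spanID  string
}

func (c staticContext) ToTraceID() string { return c.traceID }
func (c staticContext) ToSpanID() string  { return c.spanID }

// SpanCounter counts finished spans per trace. It knows nothing about the
// tracer's wire protocol or native ID format.
type SpanCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func NewSpanCounter() *SpanCounter {
	return &SpanCounter{counts: make(map[string]int)}
}

// OnSpanFinished is a hypothetical hook that some instrumentation would call
// when a span ends. Spans without a trace ID are ignored.
func (c *SpanCounter) OnSpanFinished(sc SpanContext) {
	id := sc.ToTraceID()
	if id == "" {
		return
	}
	c.mu.Lock()
	c.counts[id]++
	c.mu.Unlock()
}

// Count returns the number of finished spans seen for a trace.
func (c *SpanCounter) Count(traceID string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[traceID]
}

func main() {
	counter := NewSpanCounter()
	counter.OnSpanFinished(staticContext{traceID: "t1", spanID: "s1"})
	counter.OnSpanFinished(staticContext{traceID: "t1", spanID: "s2"})
	fmt.Println(counter.Count("t1"))
}
```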

Risk Assessment

Because this proposal includes the exposure of new information, and adds entirely new concepts to the interface, some risks exist.

Tracer support

Some existing tracers may not be able to support this feature, as their internal model does not include any client-side trace identifiers. These tracers may choose to not support this feature by returning empty string values.

Protocol support

It's possible that new tracing protocols may emerge which use an entirely different header scheme. Examples could include a tracing system which handles trace joins explicitly as part of the protocol, and thus no longer has an equivalent concept of a trace id, or a system which uses backpropagation to contain additional data.

This is mitigated by the fact that the concepts of a span and a trace are directly part of the OpenTracing model of computation. Some form of identifier for these objects will be available to a tracer that conforms to this model of computation. While it's likely that additional identifiers or interfaces may be necessary to handle future changes, it is unlikely that the trace and span concepts will be removed from this version of OpenTracing.

Because the accessors produce a variable-width string value, new formats and wire protocols for these identifiers will not result in a breaking change for the OpenTracing interface. Likewise, systems which consume this data are by definition separate from the tracing system, and are not dependent on the format of the identifier.

Extra Allocations and Overhead

Internally, tracers do not always use strings to represent their identifiers, so there is a conversion cost when using these accessors.

While a single allocation may be inevitable, exposing accessors in additional formats could be done to prevent double allocations while formatting the identifiers. For example, converting from a tracer’s native format to a string may trigger an allocation. If there are many systems which want to consume the identifier in a format which requires an allocation when converting from a string, a second allocation could occur.
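The double conversion described above can be sketched in Go. The example assumes a hypothetical tracer with a 64-bit native trace ID and a consumer that wants raw bytes. `ToTraceIDBytes` is not part of this proposal; it only shows what an additional accessor could save compared with going through the string.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// nativeContext is a hypothetical context with a 64-bit native trace ID.
type nativeContext struct {
	traceID uint64
}

// ToTraceID converts the native ID to a 16-digit hex string. This is the
// first conversion, and building the string typically allocates.
func (c nativeContext) ToTraceID() string {
	return fmt.Sprintf("%016x", c.traceID)
}

// ToTraceIDBytes is a hypothetical additional accessor that converts the
// native ID straight to bytes, skipping the intermediate string.
func (c nativeContext) ToTraceIDBytes() []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, c.traceID)
	return b
}

// bytesViaString is what a consumer has to do when only the string accessor
// exists: native -> string, then string -> bytes (a second conversion).
func bytesViaString(c nativeContext) ([]byte, error) {
	return hex.DecodeString(c.ToTraceID())
}

func main() {
	c := nativeContext{traceID: 0x00f067aa0ba902b7}
	viaString, err := bytesViaString(c)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%x %x\n", viaString, c.ToTraceIDBytes())
}
```

Both paths yield the same bytes; the difference is only in how many intermediate values are created along the way.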

Restrictions on Length and Formatting

There may be some advantage in specifying a maximum length for an identifier, or in restricting the available character set. However, it is currently not clear what the correct values for these limits should be, or which use cases would benefit from applying them.
