feat(injector): External resource detection for OpenTelemetry-instrumented applications #215

Merged: 6 commits, Dec 16, 2024
6 changes: 6 additions & 0 deletions .github/workflows/ci.yaml
@@ -59,6 +59,12 @@ jobs:
run: |
make lint

- name: lint C sources with clang-format
uses: jidicula/[email protected]
with:
clang-format-version: '19'
check-path: 'images/instrumentation/injector/src'

- name: install Helm unittest plugin
shell: bash
run: |
30 changes: 27 additions & 3 deletions Makefile
@@ -172,6 +172,10 @@ golangci-lint: golangci-lint-install
@echo "-------------------------------- (linting Go code)"
$(GOLANGCI_LINT) run

.PHONY: golang-lint-fix
golang-lint-fix: golangci-lint-install
$(GOLANGCI_LINT) run --fix

.PHONY: helm-chart-lint
helm-chart-lint:
@echo "-------------------------------- (linting Helm charts)"
@@ -200,12 +204,32 @@ prometheus-crd-version-check:
@echo "-------------------------------- (verifying the Prometheus CRD version is in sync)"
./test-resources/bin/prometheus-crd-version-check.sh

.PHONY: c-lint-installed
c-lint-installed:
@set +x
@if ! clang-format --version > /dev/null; then \
echo "error: clang-format is not installed. Run 'brew install clang-format' or similar."; \
exit 1; \
fi

.PHONY: c-lint
c-lint: c-lint-installed
ifeq ("${CI}","true")
@echo "CI: skip linting C source files via make lint, will run as separate Github action job step"
else
@echo "-------------------------------- (linting C source files)"
clang-format --dry-run --Werror images/instrumentation/injector/src/*.c
endif

.PHONY: c-lint-fix
c-lint-fix: c-lint-installed
clang-format -i images/instrumentation/injector/src/*.c

.PHONY: lint
lint: golangci-lint helm-chart-lint shellcheck-lint prometheus-crd-version-check
lint: golangci-lint helm-chart-lint shellcheck-lint prometheus-crd-version-check c-lint

.PHONY: lint-fix
lint-fix: golangci-lint ## Run golangci-lint linter and perform fixes
$(GOLANGCI_LINT) run --fix
lint-fix: golang-lint-fix c-lint-fix

##@ Build

17 changes: 14 additions & 3 deletions helm-chart/dash0-operator/README.md
@@ -672,9 +672,10 @@ tracing works, intended for the technically curious reader.
You can safely skip this section if you are not interested in the technical details.

Workloads in [monitored namespaces](#enable-dash0-monitoring-for-a-namespace) are instrumented by the Dash0 operator
to enable tracing for [supported runtimes](#supported-runtimes) out of the box.
This allows Dash0 users to avoid the hassle of manually adding the OpenTelemetry SDK to their applications—Dash0 takes
care of it automatically.
to enable tracing for [supported runtimes](#supported-runtimes) out of the box, and to improve Kubernetes-related
resource attribute auto-detection.
This allows Dash0 users to avoid the hassle of manually adding the OpenTelemetry SDK to their applications.
Dash0 takes care of it automatically.

To achieve this, the Dash0 operator adds an
[init container](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) with the [Dash0 instrumentation
@@ -698,6 +699,16 @@ For example, it sets (or appends to) `NODE_OPTIONS` to activate the
[Dash0 OpenTelemetry distribution for Node.js](https://github.com/dash0hq/opentelemetry-js-distribution) to collect
tracing data from all Node.js workloads.

Additionally, the Dash0 injector automatically improves Kubernetes-related resource attributes as follows: The operator
sets the environment variables `DASH0_NAMESPACE_NAME`, `DASH0_POD_NAME`, `DASH0_POD_UID` and `DASH0_CONTAINER_NAME` on
workloads.
The Dash0 injector binary picks these values up and uses them to populate the resource attributes `k8s.namespace.name`,
`k8s.pod.name`, `k8s.pod.uid` and `k8s.container.name` via the `OTEL_RESOURCE_ATTRIBUTES` environment variable.
If `OTEL_RESOURCE_ATTRIBUTES` is already set on the process, the key-value pairs for these attributes are prepended to
the existing value of `OTEL_RESOURCE_ATTRIBUTES`, separated by a comma.
If `OTEL_RESOURCE_ATTRIBUTES` was not set on the process, the Dash0 injector will add `OTEL_RESOURCE_ATTRIBUTES` as
a new environment variable.
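
The merging behaviour described in this paragraph can be sketched in plain C. This is a minimal illustration, not the injector's implementation: the helper names `append_pair` and `build_resource_attributes` are hypothetical, and the sketch uses libc's `snprintf` for bounded writes, whereas the injector source in this change uses its own string helpers. The ordering (Kubernetes attributes first, any existing value last) follows `dash0_injector.c` as shown in this diff.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the injector): appends "key=value" to the
 * NUL-terminated string in out, preceded by a comma if out is non-empty.
 * Empty or missing values are skipped. Returns 0 on success, -1 if the pair
 * would not fit into the buffer of size cap. */
static int append_pair(char *out, size_t cap, const char *key,
                       const char *value) {
  if (value == NULL || value[0] == '\0')
    return 0; /* nothing to add */
  size_t used = strlen(out);
  int n = snprintf(out + used, cap - used, "%s%s=%s", used > 0 ? "," : "",
                   key, value);
  if (n < 0 || (size_t)n >= cap - used) {
    out[used] = '\0'; /* undo the partial write */
    return -1;
  }
  return 0;
}

/* Hypothetical helper: builds the merged OTEL_RESOURCE_ATTRIBUTES value.
 * The Kubernetes attributes come first, the pre-existing value (if any) is
 * placed last, and all parts are separated by commas. */
int build_resource_attributes(char *out, size_t cap, const char *ns,
                              const char *pod, const char *uid,
                              const char *container, const char *existing) {
  if (cap == 0)
    return -1;
  out[0] = '\0';
  if (append_pair(out, cap, "k8s.namespace.name", ns) != 0 ||
      append_pair(out, cap, "k8s.pod.name", pod) != 0 ||
      append_pair(out, cap, "k8s.pod.uid", uid) != 0 ||
      append_pair(out, cap, "k8s.container.name", container) != 0)
    return -1;
  if (existing != NULL && existing[0] != '\0') {
    size_t used = strlen(out);
    int n = snprintf(out + used, cap - used, "%s%s", used > 0 ? "," : "",
                     existing);
    if (n < 0 || (size_t)n >= cap - used) {
      out[used] = '\0';
      return -1;
    }
  }
  return 0;
}
```

Called with the namespace `ns`, the pod name `pod` and an existing value of `service.name=x`, the sketch yields `k8s.namespace.name=ns,k8s.pod.name=pod,service.name=x`. Attributes whose source variable is unset or empty are left out.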

If you are curious, the source code for the injector is open source and can be found
[here](https://github.com/dash0hq/dash0-operator/blob/main/images/instrumentation/injector/src/dash0_injector.c).

243 changes: 163 additions & 80 deletions images/instrumentation/injector/src/dash0_injector.c
@@ -1,112 +1,195 @@
#include <unistd.h>
#include <stdint.h>
#include <unistd.h>

#define ALIGN (sizeof(size_t))
#define UCHAR_MAX 255
#define ONES ((size_t)-1/UCHAR_MAX)
#define HIGHS (ONES * (UCHAR_MAX/2+1))
#define HASZERO(x) ((x)-ONES & ~(x) & HIGHS)
#define ONES ((size_t)-1 / UCHAR_MAX)
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))
#define HASZERO(x) ((x) - ONES & ~(x) & HIGHS)

#define NODE_OPTIONS_ENV_VAR_NAME "NODE_OPTIONS"
#define NODE_OPTIONS_DASH0_REQUIRE "--require /__dash0__/instrumentation/node.js/node_modules/@dash0hq/opentelemetry"
#define NODE_OPTIONS_DASH0_REQUIRE \
"--require " \
"/__dash0__/instrumentation/node.js/node_modules/@dash0hq/opentelemetry"
#define OTEL_RESOURCE_ATTRIBUTES_ENV_VAR_NAME "OTEL_RESOURCE_ATTRIBUTES"
#define DASH0_NAMESPACE_NAME_ENV_VAR_NAME "DASH0_NAMESPACE_NAME"
#define DASH0_POD_UID_ENV_VAR_NAME "DASH0_POD_UID"
#define DASH0_POD_NAME_ENV_VAR_NAME "DASH0_POD_NAME"
#define DASH0_POD_CONTAINER_NAME_VAR_NAME "DASH0_CONTAINER_NAME"

extern char **__environ;

size_t __strlen(const char *s)
{
const char *a = s;
const size_t *w;
for (; (uintptr_t)s % ALIGN; s++) if (!*s) return s-a;
for (w = (const void *)s; !HASZERO(*w); w++);
for (s = (const void *)w; *s; s++);
return s-a;
size_t __strlen(const char *s) {
const char *a = s;
const size_t *w;
for (; (uintptr_t)s % ALIGN; s++)
if (!*s)
return s - a;
for (w = (const void *)s; !HASZERO(*w); w++)
;
for (s = (const void *)w; *s; s++)
;
return s - a;
}

char *__strchrnul(const char *s, int c)
{
size_t *w, k;

c = (unsigned char)c;
if (!c) return (char *)s + __strlen(s);

for (; (uintptr_t)s % ALIGN; s++)
if (!*s || *(unsigned char *)s == c) return (char *)s;
k = ONES * c;
for (w = (void *)s; !HASZERO(*w) && !HASZERO(*w^k); w++);
for (s = (void *)w; *s && *(unsigned char *)s != c; s++);
return (char *)s;
char *__strchrnul(const char *s, int c) {
size_t *w, k;

c = (unsigned char)c;
if (!c)
return (char *)s + __strlen(s);

for (; (uintptr_t)s % ALIGN; s++)
if (!*s || *(unsigned char *)s == c)
return (char *)s;
k = ONES * c;
for (w = (void *)s; !HASZERO(*w) && !HASZERO(*w ^ k); w++)
;
for (s = (void *)w; *s && *(unsigned char *)s != c; s++)
;
return (char *)s;
}

char *__strcpy(char *restrict dest, const char *restrict src)
{
const unsigned char *s = src;
unsigned char *d = dest;
while ((*d++ = *s++));
return dest;
char *__strcpy(char *restrict dest, const char *restrict src) {
const unsigned char *s = src;
unsigned char *d = dest;
while ((*d++ = *s++))
;
return dest;
}

char *__strcat(char *restrict dest, const char *restrict src)
{
__strcpy(dest + __strlen(dest), src);
return dest;
char *__strcat(char *restrict dest, const char *restrict src) {
__strcpy(dest + __strlen(dest), src);
return dest;
}

int __strcmp(const char *l, const char *r)
{
for (; *l==*r && *l; l++, r++);
return *(unsigned char *)l - *(unsigned char *)r;
int __strcmp(const char *l, const char *r) {
for (; *l == *r && *l; l++, r++)
;
return *(unsigned char *)l - *(unsigned char *)r;
}

int __strncmp(const char *_l, const char *_r, size_t n)
{
const unsigned char *l=(void *)_l, *r=(void *)_r;
if (!n--) return 0;
for (; *l && *r && n && *l == *r ; l++, r++, n--);
return *l - *r;
int __strncmp(const char *_l, const char *_r, size_t n) {
const unsigned char *l = (void *)_l, *r = (void *)_r;
if (!n--)
return 0;
for (; *l && *r && n && *l == *r; l++, r++, n--)
;
return *l - *r;
}

char *__getenv(const char *name)
{
size_t l = __strchrnul(name, '=') - name;
if (l && !name[l] && __environ)
for (char **e = __environ; *e; e++)
if (!__strncmp(name, *e, l) && l[*e] == '=')
return *e + l+1;
return 0;
char *__getenv(const char *name) {
size_t l = __strchrnul(name, '=') - name;
if (l && !name[l] && __environ)
for (char **e = __environ; *e; e++)
if (!__strncmp(name, *e, l) && l[*e] == '=')
return *e + l + 1;
return 0;
}

/*
* Buffers of statically-allocated memory that we can use to safely return to
* the program manipulated values of env vars without dynamic allocations.
*/
char val1[1012];
char val2[1012];

char *getenv(const char *name)
{
char *origValue = __getenv(name);
int l = __strlen(name);

char *nodeOptionsVarName = NODE_OPTIONS_ENV_VAR_NAME;
if (__strcmp(name, nodeOptionsVarName) == 0)
{
if (__strlen(val1) == 0)
{
// Prepend our --require as the first item to the NODE_OPTIONS string.
char *nodeOptionsDash0Require = NODE_OPTIONS_DASH0_REQUIRE;
__strcat(val1, nodeOptionsDash0Require);

if (origValue != NULL)
{
// If NODE_OPTIONS were present, append the existing NODE_OPTIONS after our --require.
__strcat(val1, " ");
__strcat(val1, origValue);
}
char cachedModifiedOtelResourceAttributesValue[1012];
char cachedModifiedNodeOptionsValue[1012];

char *getenv(const char *name) {
char *origValue = __getenv(name);
int l = __strlen(name);

char *otelResourceAttributesVarName = OTEL_RESOURCE_ATTRIBUTES_ENV_VAR_NAME;
char *nodeOptionsVarName = NODE_OPTIONS_ENV_VAR_NAME;
if (__strcmp(name, otelResourceAttributesVarName) == 0) {
if (__strlen(cachedModifiedOtelResourceAttributesValue) == 0) {
// This environment variable (OTEL_RESOURCE_ATTRIBUTES) has not been
// requested before, calculate the modified value and cache it.
char *namespaceName = __getenv(DASH0_NAMESPACE_NAME_ENV_VAR_NAME);
char *podUid = __getenv(DASH0_POD_UID_ENV_VAR_NAME);
char *podName = __getenv(DASH0_POD_NAME_ENV_VAR_NAME);
char *containerName = __getenv(DASH0_POD_CONTAINER_NAME_VAR_NAME);

int attributeCount = 0;

/*
* We do not perform octet escaping in the resource attributes as
* specified in
* https://opentelemetry.io/docs/specs/otel/resource/sdk/#specifying-resource-information-via-an-environment-variable
* because the values that are passed down to the injector come from
* fields that Kubernetes already enforces to either conform to RFC 1035
* or RFC 1123
* (https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-label-names),
* and in either case, none of the characters allowed require escaping
* based on https://www.w3.org/TR/baggage/#header-content
*/

if (namespaceName != NULL && __strlen(namespaceName) > 0) {
__strcat(cachedModifiedOtelResourceAttributesValue,
"k8s.namespace.name=");
__strcat(cachedModifiedOtelResourceAttributesValue, namespaceName);
attributeCount += 1;
}

if (podName != NULL && __strlen(podName) > 0) {
if (attributeCount > 0) {
__strcat(cachedModifiedOtelResourceAttributesValue, ",");
}

return val1;
__strcat(cachedModifiedOtelResourceAttributesValue, "k8s.pod.name=");
__strcat(cachedModifiedOtelResourceAttributesValue, podName);
attributeCount += 1;
}

if (podUid != NULL && __strlen(podUid) > 0) {
if (attributeCount > 0) {
__strcat(cachedModifiedOtelResourceAttributesValue, ",");
}

__strcat(cachedModifiedOtelResourceAttributesValue, "k8s.pod.uid=");
__strcat(cachedModifiedOtelResourceAttributesValue, podUid);
attributeCount += 1;
}

if (containerName != NULL && __strlen(containerName) > 0) {
if (attributeCount > 0) {
__strcat(cachedModifiedOtelResourceAttributesValue, ",");
}

__strcat(cachedModifiedOtelResourceAttributesValue,
"k8s.container.name=");
__strcat(cachedModifiedOtelResourceAttributesValue, containerName);
attributeCount += 1;
}

if (origValue != NULL && __strlen(origValue) > 0) {
if (attributeCount > 0) {
__strcat(cachedModifiedOtelResourceAttributesValue, ",");
}

__strcat(cachedModifiedOtelResourceAttributesValue, origValue);
}
}

return origValue;
}
return cachedModifiedOtelResourceAttributesValue;
} else if (__strcmp(name, nodeOptionsVarName) == 0) {
if (__strlen(cachedModifiedNodeOptionsValue) == 0) {
// This environment variable (NODE_OPTIONS) has not been requested before,
// calculate the modified value and cache it.

// Prepend our --require as the first item to the NODE_OPTIONS string.
char *nodeOptionsDash0Require = NODE_OPTIONS_DASH0_REQUIRE;
__strcat(cachedModifiedNodeOptionsValue, nodeOptionsDash0Require);

if (origValue != NULL && __strlen(origValue) > 0) {
// If NODE_OPTIONS were present, append the existing NODE_OPTIONS after
// our --require.
__strcat(cachedModifiedNodeOptionsValue, " ");
__strcat(cachedModifiedNodeOptionsValue, origValue);
}
}

return cachedModifiedNodeOptionsValue;
}

return origValue;
}
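
The new `getenv` above decides whether a value has already been computed by checking whether the static buffer is still empty, and it concatenates into a fixed 1012-byte buffer. As a point of comparison, the following sketch shows the same compute-once-and-cache idea with an explicit "ready" flag and a bounded write. It is an illustration under stated assumptions, not a proposed patch: `lookup_cached`, `cached_ready` and `compute_calls` are hypothetical names, the `--require /opt/hook` value in the usage note is made up, and the sketch relies on libc's `snprintf`, which the injector itself does not use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define CACHE_SIZE 1012 /* same size as the buffers in this change */

/* Statically allocated cache, so no dynamic allocation is needed. */
static char cached_value[CACHE_SIZE];
static bool cached_ready = false; /* explicit flag instead of strlen() == 0 */
static int compute_calls = 0;     /* only here to make the caching visible */

/* Hypothetical stand-in for the NODE_OPTIONS branch: returns the prefix
 * followed by a space and the original value, computed once and cached.
 * If the combined value does not fit, the original value is returned
 * unmodified instead of writing past the end of the buffer. */
const char *lookup_cached(const char *prefix, const char *original) {
  if (!cached_ready) {
    compute_calls++;
    int n;
    if (original != NULL && original[0] != '\0')
      n = snprintf(cached_value, sizeof cached_value, "%s %s", prefix,
                   original);
    else
      n = snprintf(cached_value, sizeof cached_value, "%s", prefix);
    if (n < 0 || (size_t)n >= sizeof cached_value)
      cached_value[0] = '\0'; /* too long: leave the cache empty */
    cached_ready = true;
  }
  return cached_value[0] != '\0' ? cached_value : original;
}
```

With this shape, a first call such as `lookup_cached("--require /opt/hook", "--max-old-space-size=512")` computes and stores the combined string, and later calls return the same pointer without recomputing. Like the code in this change, the sketch is not thread-safe.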
3 changes: 3 additions & 0 deletions images/instrumentation/injector/test/app/index.js
@@ -34,6 +34,9 @@ function main () {
process.stdout.write("; ")
echoEnvVar("NODE_OPTIONS");
break;
case "otel_resource_attributes":
echoEnvVar("OTEL_RESOURCE_ATTRIBUTES");
break;
default:
console.error(`unknown test case: ${testCase}`);
process.exit(1)