Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protect against giant sample keys #1363

Closed
kentquirk opened this issue Oct 4, 2024 · 0 comments · Fixed by #1364
Closed

Protect against giant sample keys #1363

kentquirk opened this issue Oct 4, 2024 · 0 comments · Fixed by #1364
Labels
type: enhancement New feature or request

Comments

@kentquirk
Copy link
Contributor

Imagine a large trace with thousands of spans, one for each user, and someone accidentally sets up the key fields for the sampler to include a high-cardinality field like user ID. The value of the sampler key would be an enormous string containing every user ID.

We think we saw this happen today. We should protect against it by limiting the size of the key, most likely by limiting the number of items that can be placed into the key.

@kentquirk kentquirk added the type: enhancement New feature or request label Oct 4, 2024
MikeGoldsmith added a commit that referenced this issue Oct 7, 2024
## Which problem is this PR solving?

- If someone sends a big trace where one of the sampler keys is a
high-cardinality field, Refinery could generate a huge sampler key
value. This causes problems both for refinery but also the downstream
telemetry. Let's not do that.

## Short description of the changes

- Put a limit of 100 unique values to make up any one sampler key. Even
that is big and probably useless but it should avoid any real use cases.
- Add a test to show it works.

Fixes #1363

---------

Co-authored-by: Mike Goldsmth <[email protected]>
MikeGoldsmith added a commit that referenced this issue Oct 7, 2024
## Which problem is this PR solving?

- If someone sends a big trace where one of the sampler keys is a
high-cardinality field, Refinery could generate a huge sampler key
value. This causes problems both for refinery but also the downstream
telemetry. Let's not do that.

## Short description of the changes

- Put a limit of 100 unique values to make up any one sampler key. Even
that is big and probably useless but it should avoid any real use cases.
- Add a test to show it works.

Fixes #1363

---------

Co-authored-by: Mike Goldsmth <[email protected]>
TylerHelmuth pushed a commit that referenced this issue Oct 16, 2024
## Which problem is this PR solving?

- If someone sends a big trace where one of the sampler keys is a
high-cardinality field, Refinery could generate a huge sampler key
value. This causes problems both for refinery but also the downstream
telemetry. Let's not do that.

## Short description of the changes

- Put a limit of 100 unique values to make up any one sampler key. Even
that is big and probably useless but it should avoid any real use cases.
- Add a test to show it works.

Fixes #1363

---------

Co-authored-by: Mike Goldsmth <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant