Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rhythm: fix block time range adjustment #4746

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@
* [BUGFIX] TraceQL results caching bug for floats ending in .0 [#4539](https://github.com/grafana/tempo/pull/4539) (@carles-grafana)
* [BUGFIX] Fix metrics streaming for all non-trivial metrics [#4624](https://github.com/grafana/tempo/pull/4624) (@joe-elliott)
* [BUGFIX] Fix starting consuming log [#4539](https://github.com/grafana/tempo/pull/4539) (@javiermolinar)
* [BUGFIX] Rhythm - fix adjustment of the start and end range for livetraces blocks [#4746](https://github.com/grafana/tempo/pull/4746) (@javiermolinar)
* [BUGFIX] Return the operand as the only value if the tag is already filtered in the query [#4673](https://github.com/grafana/tempo/pull/4673) (@mapno)
* [BUGFIX] Fix memcached settings for docker compose example [#4346](https://github.com/grafana/tempo/pull/4695) (@ruslan-mikhailov)

Expand Down
20 changes: 17 additions & 3 deletions modules/blockbuilder/tenant_store.go
Original file line number Diff line number Diff line change
Expand Up @@ -157,18 +157,32 @@ func (s *tenantStore) Flush(ctx context.Context, store tempodb.Writer) error {
return nil
}

// Adjust the time range based on when the record was added to the partition, factoring in slack and cycle duration.
// Any span with a start or end time outside this range will be constrained to the valid limits.
func (s *tenantStore) adjustTimeRangeForSlack(start, end time.Time) (time.Time, time.Time) {
startOfRange := s.startTime.Add(-s.slackDuration)
endOfRange := s.startTime.Add(s.slackDuration + s.cycleDuration)

warn := false
if start.Before(startOfRange) {
warn = true
start = s.startTime
start = startOfRange
}
if end.After(endOfRange) || end.Before(start) {
// clock skew, missconfiguration or simply data tampering
if start.After(endOfRange) {
warn = true
end = s.startTime
start = endOfRange
}
// this can happen with old spans added to Tempo
// setting it to start to not jump forward unexpectedly
if end.Before(start) {
warn = true
end = start
}
Comment on lines +178 to +181
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just wrong data, no? I don't think we have to fix it, or at least it's not related to ingestion slack.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean end.Before(startOfRange) instead?

Suggested change
if end.Before(start) {
warn = true
end = start
}
if end.Before(startOfRange) {
warn = true
end = start
}

Copy link
Contributor Author

@javiermolinar javiermolinar Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not wrong data, old data. Imagine someone wants to add old spans into Tempo
I did a very similar fix a time ago:
#3954
We need to correct the end date or the block will be wrong

Copy link
Contributor Author

@javiermolinar javiermolinar Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using start instead of startOfRange makes sense to me because it addresses both scenarios. If the start is before the startOfRange it has been already adjusted. In the edge case where end < startOfRange < start it fixes it as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the edge case where end < startOfRange < start it fixes it as well

This is what I meant by wrong data, end < start is logically incorrect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internally during a test block builders flushed several blocks in a row with meta.start==meta.end. I'm not sure that this change is the fix, because the only related change is the case end after endOfRange. Previously it was set to start and now it's endOfRange. For this to be the issue it would have to mean that every trace in the block was a long-running one that exceeds the buffer s.startTime.Add(s.slackDuration + s.cycleDuration). I'd like to dig more and see if there is another explanation.


if end.After(endOfRange) {
warn = true
end = endOfRange
}

if warn {
Expand Down
19 changes: 13 additions & 6 deletions modules/blockbuilder/tenant_store_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -58,22 +58,29 @@ func TestAdjustTimeRangeForSlack(t *testing.T) {
name: "start before slack range",
start: startCycleTime.Add(-10 * time.Minute),
end: startCycleTime.Add(2 * time.Minute),
expectedStart: startCycleTime,
expectedStart: startCycleTime.Add(-slackDuration),
expectedEnd: startCycleTime.Add(2 * time.Minute),
},
{
name: "end after slack range",
start: startCycleTime.Add(-2 * time.Minute),
end: startCycleTime.Add(20 * time.Minute),
expectedStart: startCycleTime.Add(-2 * time.Minute),
expectedEnd: startCycleTime,
expectedEnd: startCycleTime.Add(slackDuration + cycleDuration),
},
{
name: "end before start",
start: startCycleTime.Add(-2 * time.Minute),
end: startCycleTime.Add(-3 * time.Minute),
expectedStart: startCycleTime.Add(-2 * time.Minute),
expectedEnd: startCycleTime,
start: startCycleTime.Add(-10 * time.Minute),
end: startCycleTime.Add(-9 * time.Minute),
expectedStart: startCycleTime.Add(-slackDuration),
expectedEnd: startCycleTime.Add(-slackDuration),
},
{
name: "both start and end after slack range",
start: startCycleTime.Add(5 * time.Minute),
end: startCycleTime.Add(10 * time.Minute),
expectedStart: startCycleTime.Add(cycleDuration + slackDuration),
expectedEnd: startCycleTime.Add(cycleDuration + slackDuration),
},
}

Expand Down