Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[storage] Remove dependency on archive flag in ES reader #6490

Merged
merged 33 commits into from
Jan 14, 2025

Conversation

mahadzaryab1
Copy link
Collaborator

@mahadzaryab1 mahadzaryab1 commented Jan 6, 2025

Which problem is this PR solving?

Description of the changes

  • This PR removes the concept of archive from the ElasticSearch span reader and span writer by relying on the following configurations:
    • The index names in the archive storage are a special case of the use_aliases configuration so the archive bool in the archive reader was mostly replaced by the use_aliases configuration (more details on this are in the description of Phase out the distinction between primary and archive storage #6065).
    • To maintain backwards compatibility, we had to move some of the construction of the aliases to CreateArchiveSpanReader and CreateArchiveSpanWriter

How was this change tested?

  • CI

Checklist

@mahadzaryab1
Copy link
Collaborator Author

@yurishkuro How should we handle the archive namespacing in the reader and factory as they are currently hardcoded.

@mahadzaryab1 mahadzaryab1 changed the title Remove Dependency on Archive Flag In ES Reader [WIP][storage] Remove dependency on archive flag in ES reader Jan 6, 2025
@mahadzaryab1 mahadzaryab1 force-pushed the es-archive-dependency branch from 20e7ccc to 944f4e0 Compare January 6, 2025 02:43
Copy link

codecov bot commented Jan 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.23%. Comparing base (fcc8936) to head (6688faa).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6490      +/-   ##
==========================================
+ Coverage   96.21%   96.23%   +0.02%     
==========================================
  Files         372      372              
  Lines       21360    21363       +3     
==========================================
+ Hits        20551    20559       +8     
+ Misses        616      612       -4     
+ Partials      193      192       -1     
Flag Coverage Δ
badger_v1 10.65% <0.00%> (-0.01%) ⬇️
badger_v2 2.78% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v1-manual 16.55% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v2-auto 2.71% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v2-manual 2.71% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v1-manual 16.55% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v2-auto 2.71% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v2-manual 2.71% <0.00%> (-0.01%) ⬇️
elasticsearch-6.x-v1 20.33% <75.51%> (+0.09%) ⬆️
elasticsearch-7.x-v1 20.40% <75.51%> (+0.10%) ⬆️
elasticsearch-8.x-v1 20.56% <75.51%> (+0.10%) ⬆️
elasticsearch-8.x-v2 2.77% <0.00%> (-0.01%) ⬇️
grpc_v1 12.17% <0.00%> (+<0.01%) ⬆️
grpc_v2 9.04% <0.00%> (-0.01%) ⬇️
kafka-3.x-v1 10.33% <0.00%> (-0.01%) ⬇️
kafka-3.x-v2 2.78% <0.00%> (-0.01%) ⬇️
memory_v2 2.78% <0.00%> (-0.01%) ⬇️
opensearch-1.x-v1 20.46% <75.51%> (+0.10%) ⬆️
opensearch-2.x-v1 20.46% <75.51%> (+0.10%) ⬆️
opensearch-2.x-v2 2.77% <0.00%> (-0.01%) ⬇️
tailsampling-processor 0.51% <0.00%> (-0.01%) ⬇️
unittests 95.11% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mahadzaryab1 mahadzaryab1 force-pushed the es-archive-dependency branch 2 times, most recently from 712c974 to 9adf03e Compare January 8, 2025 00:39
@mahadzaryab1 mahadzaryab1 force-pushed the es-archive-dependency branch from 67145ac to 2f7a9be Compare January 8, 2025 03:53
@mahadzaryab1 mahadzaryab1 force-pushed the es-archive-dependency branch from 2f7a9be to 0ae679d Compare January 8, 2025 03:56
Comment on lines 221 to 225
suffix := "archive"
if f.archiveConfig.UseReadWriteAliases {
suffix += "-read"
}
sr, err := createSpanReader(f.getArchiveClient, f.archiveConfig, f.logger, f.tracer, suffix, true)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro Not ideal to have to do this but this is roughly what we need to maintain backwards compatibility.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we want to expose a configuration option to override the default read and write alias but that could be one way to streamline this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • who adds read/write suffix to the index name?
  • how does that work for non-archive, does it not use -read suffix?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who adds read/write suffix to the index name?

Its added by the reader and the writer but we need to handle the special case where the archive storage also has use_aliases set to true

how does that work for non-archive, does it not use -read suffix?

Yeah it does - its done by the reader and the writer. I made some changes to make it an else-if change. The current behavior is that if a suffix is passed in, we use that. If a suffix isn't passed in but use_aliases is set, then we use the default read/write aliases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's why I don't follow - if reader/writer automatically add -read/-write suffix, why do you need to add it manually above?

Copy link
Collaborator Author

@mahadzaryab1 mahadzaryab1 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add it above just to address the case where use_aliases wasn't initially set for the archive storage. Since we hardcode it to true for the archive storage now, we'll always set the index to be archive-read and archive-write in the reader/writer. But that would break backwards compatibility because previously it would just be archive for both the read and write alias for the archive storage.

Signed-off-by: Mahad Zaryab <[email protected]>
@@ -245,7 +245,7 @@ func TestArchiveDisabled(t *testing.T) {
func TestArchiveEnabled(t *testing.T) {
f := NewFactory()
f.primaryConfig = &escfg.Configuration{}
f.archiveConfig = &escfg.Configuration{Enabled: true}
f.archiveConfig = &escfg.Configuration{Enabled: true, UseReadWriteAliases: true}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change necessary? It goes contrary to default config.

Copy link
Collaborator Author

@mahadzaryab1 mahadzaryab1 Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro This was mostly for code coverage purposes. I can remove it but then we may need to merge with less than 100% code coverage or add a new test that goes through that code path

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my comment on the archive tests would address cove coverage

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro It didn't because that test doesn't go through CreateArchiveSpanReader. I just added a test for both cases here.

Signed-off-by: Mahad Zaryab <[email protected]>
@@ -1267,7 +1252,7 @@ func TestSpanReader_GetEmptyIndex(t *testing.T) {
}

func TestSpanReader_ArchiveTraces(t *testing.T) {
withArchiveSpanReader(t, false, func(r *spanReaderTest) {
withArchiveSpanReader(t, true, "archive", func(r *spanReaderTest) {
Copy link
Member

@yurishkuro yurishkuro Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by passing the suffix explicitly (which no production code does today) you're overriding the default behavior. It's ok to do this as one of the tests, but we should keep the "default" tests where only use_aliases flag is varied and check that it affects the index name. And then for passing a suffix explicitly like this I would use a completely different string like foobar, to ensure that the validation is not mistaking the index name from default behavior with the index name with overwritten suffix.

Perhaps it makes sense to combine into a table driven test

use_aliases input suffix expected index
false "" ...
true "" ...
false "foobar" ...
true "foobar" ...

Copy link
Collaborator Author

@mahadzaryab1 mahadzaryab1 Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro Done. Can we remove the test below this now?

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Archive: archive,
UseReadWriteAliases: cfg.UseReadWriteAliases,
UseReadWriteAliases: useReadWriteAliases,
WriteAlias: writeAliasSuffix,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
WriteAlias: writeAliasSuffix,
WriteAliasSuffix: writeAliasSuffix,

}{
{false, "", "jaeger-span-"},
{true, "", "jaeger-span-read"},
{false, "archive", "jaeger-span-"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{false, "archive", "jaeger-span-"},
{false, "foobar", "jaeger-span-"},

Is "jaeger-span-" correct?

Copy link
Member

@yurishkuro yurishkuro Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't seem like it would match previous behavior

Copy link
Collaborator Author

@mahadzaryab1 mahadzaryab1 Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro I ran an equivalent test on main

func TestSpanReader_ArchiveTraces(t *testing.T) {
	testCases := []struct {
		useAliases bool
		archive    bool
		expected   string
	}{
		{false, false, "jaeger-span-"},
		{true, false, "jaeger-span-read"},
		{false, true, "jaeger-span-archive"},
		{true, true, "jaeger-span-archive-read"},
	}

	for _, tc := range testCases {
		t.Run(fmt.Sprintf("useAliases=%v archive=%v", tc.useAliases, tc.archive), func(t *testing.T) {
			withArchiveSpanReader(t, tc.useAliases, tc.archive, func(r *spanReaderTest) {
				mockSearchService(r).
					Return(&elastic.SearchResult{}, nil)
				mockArchiveMultiSearchService(r, tc.expected).
					Return(&elastic.MultiSearchResult{
						Responses: []*elastic.SearchResult{},
					}, nil)
				query := spanstore.GetTraceParameters{}
				trace, err := r.reader.GetTrace(context.Background(), query)
				require.NotEmpty(t, r.traceBuffer.GetSpans(), "Spans recorded")
				require.Nil(t, trace)
				require.EqualError(t, err, "trace not found")
			})
		})
	}
}

Case 1 and 2 are the same.

There is a slight difference for case 3 and 4 because we had to change the reader to accommodate some edge cases - with the most important one being the case where archive storage also has read_aliases set to true. However, we moved that check to the factory. So while the behaviour of the reader is slightly different, the overall behaviour should be the same which we see from the integration tests passing.

// CreateArchiveSpanReader implements storage.ArchiveFactory
func (f *Factory) CreateArchiveSpanReader() (spanstore.Reader, error) {
	if !f.archiveConfig.Enabled {
		return nil, nil
	}
	readAliasSuffix := "archive"
	if f.archiveConfig.UseReadWriteAliases {
		readAliasSuffix += "-read"
	}
	sr, err := createSpanReader(f.getArchiveClient, f.archiveConfig, f.logger, f.tracer, readAliasSuffix, true)
	if err != nil {
		return nil, err
	}
	archiveMetricsFactory := f.metricsFactory.Namespace(
		metrics.NSOptions{
			Tags: map[string]string{
				"role": "archive",
			},
		},
	)
	return spanstoremetrics.NewReaderDecorator(sr, archiveMetricsFactory), nil
}

Case 3 isn't possible anymore because we can never have a case where the readAliasSuffix is set but use_aliases isn't (we hardcode use_aliases to true in the above). And for case 4, the differentiation between archive and archive-read is now just handled by CreateArchiveSpanReader instead of inside the reader.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For v2, the user can get the archive behaviour by just setting use_aliases to true. It is a breaking change because the storage won't be reading/writing to and from the *-archive index anymore. But I believe that's something that we said we're okay with according to #6519

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v2 callers are still using create-archive functions. Does it mean they will be getting the same index names as before this change without any changes to the config?

Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
}{
{false, "", "jaeger-span-"},
{true, "", "jaeger-span-read"},
{false, "foobar", "jaeger-span-"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the only one that I don't understand, why is it getting a weird index name?

but it's not a blocker since passing the suffix is new functionality, so not a braking change (but possibly a bug).

Copy link
Collaborator Author

@mahadzaryab1 mahadzaryab1 Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro This is from this block. I made it so that we only look at the readAliasSuffix if use_aliases is true. We could change it to always be considered but I thought that might be confusing because then there would be two paths to set an alias rather than one. Let me know what you think. It shouldn't make a difference either way because use_aliases is always set to true for the archive storage while also passing in a custom readAliasSuffix.

	readAliasSuffix := ""
	// Setting the maxSpanAge to a large duration will ensure all spans in the "read" alias are accessible by queries (query window = [now - maxSpanAge, now]).
	// When read/write aliases are enabled, which are required for index rollovers, only the "read" alias is queried and therefore should not affect performance.
	if p.UseReadWriteAliases {
		maxSpanAge = dawnOfTimeSpanAge
		if p.ReadAliasSuffix != "" {
			readAliasSuffix = p.ReadAliasSuffix
		} else {
			readAliasSuffix = "read"
		}
	}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code would never result in "jaeger-span-" though, right? It would either produce "jaeger-span-read" (the else clause) or "jaeger-span-${ReadAliasSuffix}"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah exactly - that's tested in TestSpanReaderIndices

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only one remaining question, but otherwise I feel like we spend enough energy trying to ensure backwards compatibility.

@mahadzaryab1 mahadzaryab1 enabled auto-merge (squash) January 14, 2025 00:16
@mahadzaryab1 mahadzaryab1 merged commit c678a64 into jaegertracing:main Jan 14, 2025
55 checks passed
@mahadzaryab1 mahadzaryab1 deleted the es-archive-dependency branch January 14, 2025 02:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants