Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup swapSourceProvider(...) workaround #118480

Merged
merged 6 commits into from
Jan 14, 2025

Conversation

martijnvg
Copy link
Member

This reverts the workaround that was introduced in #117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the ReinitializingSourceProvider workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.

This reverts the workaround that was introduced in elastic#117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic.
This is now covered by the `ReinitializingSourceProvider` workaround that covers that and the concurrency problem.
With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.
@martijnvg martijnvg added >non-issue auto-backport Automatically create backport pull requests when merged :StorageEngine/Mapping The storage related side of mappings v9.0.0 v8.18.0 labels Dec 11, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@martijnvg
Copy link
Member Author

The last commit addresses a test failure that surfaces after running many iterations. For example: repeat 10 ./gradlew ":x-pack:plugin:esql:internalClusterTest" --tests "org.elasticsearch.xpack.esql.action.Esql*IT.testScriptField" -Dtests.iters=256

The issue was the lastSeenDoc field was reused overwritten by different threads, the latest commit moves the lastSeenDoc field to PerThreadSourceProvider so that each thread gets its own place to store the last seen docid.

This problem became visible after reverting #117792 in favor of just relying on ReinitializingSourceProvider. But with the latest fix doesn't can't happen any more. Now the entire compute engine source provider workaround only resides in ReinitializingSourceProvider.

@martijnvg martijnvg requested a review from dnhatn January 13, 2025 19:14
Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks Martijn!

@martijnvg martijnvg enabled auto-merge (squash) January 14, 2025 08:23
@martijnvg martijnvg merged commit d953079 into elastic:main Jan 14, 2025
16 checks passed
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jan 14, 2025
This reverts the workaround that was introduced in elastic#117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the ReinitializingSourceProvider workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.

Additional another in `ReinitializingSourceProvider` was fixed, the issue was the lastSeenDoc field was reused overwritten by different threads, the latest commit moves the lastSeenDoc field to PerThreadSourceProvider so that each thread gets its own place to store the last seen docid.
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jan 14, 2025
This reverts the workaround that was introduced in elastic#117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the ReinitializingSourceProvider workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.

Additional another in `ReinitializingSourceProvider` was fixed, the issue was the lastSeenDoc field was reused overwritten by different threads, the latest commit moves the lastSeenDoc field to PerThreadSourceProvider so that each thread gets its own place to store the last seen docid.
martijnvg added a commit that referenced this pull request Jan 15, 2025
Backporting #118480 to 8.x branch.

This reverts the workaround that was introduced in #117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the ReinitializingSourceProvider workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.

Additional another in `ReinitializingSourceProvider` was fixed, the issue was the lastSeenDoc field was reused overwritten by different threads, the latest commit moves the lastSeenDoc field to PerThreadSourceProvider so that each thread gets its own place to store the last seen docid.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >non-issue :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants