Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#29772][Go SDK] Add EventTime Timer tests. #29829

Merged
merged 7 commits into from
Dec 28, 2023

Conversation

lostluck
Copy link
Contributor

Add event time timer validation tests.

Execute both with bounded, and unbounded modes incase runners execute them differently (namely dataflow).

Fix bug in handling MapWindows, which couldn't map from Global Windows to GlobalWindows properly due to a quirk in the SDK's Window Coder + KV handling.

Pre-Work for #29772


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

@lostluck
Copy link
Contributor Author

Currently triggering all the post commits, so I can properly add filtering for failures.

Copy link

codecov bot commented Dec 19, 2023

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (1986517) 38.23% compared to head (b8b5350) 38.24%.
Report is 31 commits behind head on master.

❗ Current head b8b5350 differs from pull request most recent head 1fe6f00. Consider uploading reports for the commit 1fe6f00 to get more accurate results

Files Patch % Lines
sdks/go/pkg/beam/core/runtime/exec/window.go 63.63% 3 Missing and 1 partial ⚠️
sdks/go/pkg/beam/core/runtime/exec/translate.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #29829      +/-   ##
==========================================
+ Coverage   38.23%   38.24%   +0.01%     
==========================================
  Files         696      696              
  Lines      101878   101883       +5     
==========================================
+ Hits        38952    38969      +17     
+ Misses      61309    61299      -10     
+ Partials     1617     1615       -2     
Flag Coverage Δ
go 53.97% <58.33%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@lostluck
Copy link
Contributor Author

Yeah, just Dataflow appears to support timers. The others fail in various ways.

Copy link
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @jrmccluskey for label go.
R: @damccorm for label build.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@lostluck
Copy link
Contributor Author

R: @damondouglas

Copy link
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@lostluck
Copy link
Contributor Author

Friendly ping for a review here.

@lostluck lostluck changed the title [#29772][Go SDK]Add EventTime Timer tests. [#29772][Go SDK] Add EventTime Timer tests. Dec 28, 2023
Copy link
Contributor

@damondouglas damondouglas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI sdks/go/test/integration/integration.go says at the top of the file:

// Running integration tests can be done with a go test call with any flags that
// are required by the test pipelines, such as --runner or --endpoint.
// Example:
//
//	go test -v ./sdks/go/test/integration/... --runner=portable --endpoint=localhost:8099

which results in the error pattern ./sdks/go/test/integration/...: directory prefix sdks/go/test/integration does not contain main module or its selected dependencies. This makes sense since there's no go.mod at the root of the repos. I was able to execute this PR's branch in an ephemeral Google Cloud console using:

cd sdks
go test -v ./go/test/integration/... --runner=portable --endpoint=localhost:8099

However, I received the error:

2023/12/28 17:40:50 Cross-compiling /home/damondouglas/timerIntegrationTests/sdks/go/test/integration/wordcount/wordcount_test.go with GOOS=linux GOARCH=amd64 CGO_ENABLED=0 as /tmp/worker-1-1703785250974444350
    wordcount_test.go:113: WordCount("foo") failed:     connecting to job service
        failed to dial server at localhost:8099
                caused by:
        context deadline exceeded
--- FAIL: TestWordCount (188.39s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1847fbc]

goroutine 23 [running]:
testing.tRunner.func1.2({0x1b951a0, 0x4abc560})
        /usr/local/go/src/testing/testing.go:1545 +0x238
testing.tRunner.func1()
        /usr/local/go/src/testing/testing.go:1548 +0x397
panic({0x1b951a0?, 0x4abc560?})
        /usr/local/go/src/runtime/panic.go:914 +0x21f
github.com/apache/beam/sdks/v2/go/pkg/beam/runners/universal/runnerlib.universalPipelineResult.Metrics(...)
        /timerIntegrationTests/sdks/go/pkg/beam/runners/universal/runnerlib/execute.go:164
github.com/apache/beam/sdks/v2/go/test/integration/wordcount.TestWordCount(0xc000500ea0)
        /timerIntegrationTests/sdks/go/test/integration/wordcount/wordcount_test.go:116 +0x4b7
testing.tRunner(0xc000500ea0, 0x2582458)
        /usr/local/go/src/testing/testing.go:1595 +0xff
created by testing.(*T).Run in goroutine 1
        /usr/local/go/src/testing/testing.go:1648 +0x3ad
FAIL    github.com/apache/beam/sdks/v2/go/test/integration/wordcount    188.507s

sdks/go/test/integration/primitives/timers.go Outdated Show resolved Hide resolved
sdks/go/test/integration/primitives/timers.go Outdated Show resolved Hide resolved
Comment on lines 92 to 99
// TimersEventTime takes in an impulse transform and then validates
// event time timer execution.
//
// The impulse is provided outside to swap between a bounded impulse, and
// an unbounded one, because the Go SDK uses that to determine if a pipeline
// is "streaming" or not. This matters at least for executions on Dataflow.
//
// Regardless,the pipelines should pass.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May we consider the following? (Blame the comment lines alignment on the GitHub comment editor).

// Validator is a func that validates a Pipeline.
type Validator func(s beam.Scope)

// TimersEventTimeValidator produces a Validator that validates whether an OnTimer callback is invoked. 
// It passert.EqualsList expected timestamps (as millisecond int) plus an expected offset.
//
// makeImp is a func that produces an impulse (either beam.Impulse or beam.PeriodicImpulse). The 
// purpose of makeImp is to provide TimersEventTimeValidator either a bounded impulse or an 
// unbounded one. Beam, in general, uses this impulse to determine if a pipeline is "streaming" or 
// "batch". Test coverage for streaming or batch matters for executions on certain runners such as 
// Dataflow.
func TimersEventTimeValidator(makeImp func(s beam.Scope) beam.PCollection) Validator {

}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Validator is actually constructing the pipeline. It's not validating a pipeline. The validation has to be built in. It's not clear what benefit the additional indirection through a type (which is still exposed as a func(s beam.Scope) anyway serves here, and it's not going to be obvious a "Validator" is actually the same as expected by ptest.BuildAndRun, even though the compiler is happy with it.

  2. I agree that the current TimersEventTime builder function isn't ideal. I'm going to unexport it, and simply move the full pipeline builds "internally", so it's not exposed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how you re-worded the comment. It's more clear now. Feel free to resolve this conversation if you'd like.

sdks/go/test/integration/primitives/timers.go Show resolved Hide resolved
sdks/go/test/integration/primitives/timers.go Show resolved Hide resolved
sdks/go/test/integration/primitives/timers.go Show resolved Hide resolved
sdks/go/test/integration/primitives/timers_test.go Outdated Show resolved Hide resolved
@lostluck
Copy link
Contributor Author

Great question! Fixing that is out of scope for this PR.

integration.go does not start runners. The Go SDK has never auto-started runners, largely due to that being an undocumented expectation that has never been fulfilled.

The lack of runner lead to deadline failure, and then retuning a nil Pipeline results. The test in particular doesn't fail out when there are nil PipelineResults though, so the metrics check then fails. This could be fixed by not returning a nil pipeline results in such cases in the default runner lib, or by making the test check if a nil was returned.

@lostluck
Copy link
Contributor Author

Updated PTAL!

@damondouglas damondouglas self-requested a review December 28, 2023 20:28
Copy link
Contributor

@damondouglas damondouglas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lostluck LGTM

@lostluck
Copy link
Contributor Author

Dataflow continues to pass, removing trigger files and merging.

@lostluck lostluck merged commit 783c72a into apache:master Dec 28, 2023
7 checks passed
@github-actions github-actions bot removed the build label Dec 28, 2023
@lostluck lostluck deleted the timerIntegrationTests branch December 28, 2023 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants