Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove a race condition in sdk testutil server stop #10342

Merged
merged 2 commits into from
Jun 9, 2021

Conversation

dgsb
Copy link
Contributor

@dgsb dgsb commented Jun 3, 2021

We have 2 threads calling exec.Cmd.Wait which is not reentrant. It causes some test case failure at our side when running the test suite with the -race flag.

I've just removed the cmd.Wait which looks useless.

Related to #8329

Unverified

This user has not yet uploaded their public signing key.
@hashicorp-cla
Copy link

hashicorp-cla commented Jun 3, 2021

CLA assistant check
All committers have signed the CLA.

@vercel vercel bot temporarily deployed to Preview – consul-ui-staging June 3, 2021 08:57 Inactive
@vercel vercel bot temporarily deployed to Preview – consul June 3, 2021 08:57 Inactive
@dnephin dnephin added theme/testing Testing, and related enhancements pr/no-changelog PR does not need a corresponding .changelog entry labels Jun 8, 2021
Copy link
Contributor

@dnephin dnephin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR and for fixing this data race! We've been tracking the work to enable the race detector in #8329, but we had not seen this race yet.

I've made one suggestion below, but otherwise I think this looks correct.

Comment on lines 344 to -345
s.cmd.Process.Signal(syscall.SIGABRT)
s.cmd.Wait()
Copy link
Contributor

@dnephin dnephin Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do still want to wait for the process to exit in this case, because there is a defer at the top of this function which is responsible for removing the temporary directory. We want to make sure the process has exited before we try and remove it to make sure the cleanup succeeds.

I think we can do that by blocking on a read from the waitDone channel instead of calling s.cmd.Wait()

<-waitDone
return ...

Sending SIGABRT should ensure the process exits, so it should be safe to block here in the main goroutine. We might want to change this signal to SIGKILL at some point, if the process is not shutting down with SIGABRT, but I think we can wait to make that change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @dnephin that totally makes sense. Can you check the last commit ?

Copy link
Contributor

@dnephin dnephin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@dnephin dnephin merged commit 797bfb7 into hashicorp:master Jun 9, 2021
@hc-github-team-consul-core
Copy link
Collaborator

🍒 If backport labels were added before merging, cherry-picking will start automatically.

To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/383609.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/no-changelog PR does not need a corresponding .changelog entry theme/testing Testing, and related enhancements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants