-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os/exec: data race in TestStdinClose #17647
Comments
It's not OK to call Wait while working with stdin/stdout. |
@dvyukov, I don't understand why this is a race. I will explain why I think it's not a race. Please show me why I am wrong. The subprocess is running case "stdinClose" at line 660:
That is, the subprocess consumes the entire stdin and then exits. TestStdinClose does (simplifying):
The new goroutine is kicked off and cmd.Wait blocks in the wait system call waiting for the subprocess to exit. The new goroutine writes stdinCloseTestString to stdin (using the underlying *os.File) and then closes stdin (using the underlying *os.File). Once stdin closes, the subprocess should see the EOF and exit. Then cmd.Wait unblocks - the wait system call returns - and now sees the subprocess exit and calls close on the underlying *os.File for stdin, just in case the calling code has not already closed the file. That second close should not do anything because the file is already closed. Also, the whole thing is sequenced: the goroutine's close of stdin is what makes the subprocess exit, and the subprocess exit is what makes the second half of cmd.Wait begin, and the second half of cmd.Wait is what is closing the file in the race report. Having explained why this code is essentially single-threaded, I am confused about the race report. The race report seems to be saying that the second half of cmd.Wait has already executed and the copy loop in the goroutine is now running and racing with Wait. But assuming something is not completely broken in the test, the copy loop should always run before cmd.Wait. Is the race report accurate about the time ordering here? Or does the race detector just pick one event to be 'previous' when it doesn't actually know which one came first in time order? If the race report is accurate about the time ordering, then the test itself must be broken somehow, in which case there should be other failure output. If the race report has the time ordering backward, and in fact it is objecting to the write in goroutine 27 racing with a previous read in goroutine 28, that's just not a race: they are sequenced by the subprocess exiting. |
@rsc Your reasoning regarding event ordering is correct, I missed that there is transitive relation. Race detector does not see transitive synchronization that involves other process. To fix that we could synchronize all fd Close's with all unblocks of cmd.Wait (Close will need to a release on a fake address, and Wait do acquire on the same address after return). However, the signal to exit can have other forms, e.g. one process could write some magic value to a pipe, or maybe create a special file or disk. The ordering of events in the report is precise. Tsan observed that in this particular execution Close happened before Write. How it happened is a good question. Fortunately I still have the log:
So in this case the child process was killed (seems to be happening on race gomote machines regularly: Which suggests that: you can never be sure that the process won't exit before you signal it. A child process can be killed by user, by kernel, or just exit with an error. |
That was a wrong part of the log, but it was still killed:
|
cmd.Process.Kill() can be a workaround? is there temporary solution? |
The solution is to not call cmd.Wait() while working with stdin/stdout pipes. |
Thanks the problem is still valid |
1.8 is scheduled to be released February 1, 2017. |
CL https://golang.org/cl/33298 mentions this issue. |
Hello,
Regarding the issue at dour and 1.8.
Does 1.8 will include the fix and will be out? Is there any beta or should
I modify my src with fix above?
Thank you,
Yoni Shperling
On Wed, 16 Nov 2016 at 4:25 GopherBot ***@***.***> wrote:
Closed #17647 <#17647> via b906df6
<b906df6>
.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#17647 (comment)>, or mute the
thread
<https://github.com/notifications/unsubscribe-auth/ALrs2nEpFvjj_zUsf1p28WfkYVF7lN9Sks5q-mmSgaJpZM4KjeKw>
.
--
Best Regards,
Yoni Shperling
|
1.8 includes the fix. Release candidate 3 for Go 1.8 is currently available, and the final release will occur within a couple of weeks. See the golang-dev mailing list for details. |
Thank you!
…On Wed, Feb 1, 2017 at 4:48 PM, Ian Lance Taylor ***@***.***> wrote:
1.8 includes the fix. Release candidate 3 for Go 1.8 is currently
available, and the final release will occur within a couple of weeks. See
the golang-dev mailing list for details.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#17647 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ALrs2i1bU3W9YAqHvHiEN7uiuqSLpYO1ks5rYJtNgaJpZM4KjeKw>
.
--
Best Regards,
Yoni Shperling
|
Change https://golang.org/cl/65490 mentions this issue: |
CL 31148 added code to protect again simultaneous calls to Close and Wait when using the standard input pipe, to fix the race condition described in issue #9307. That issue is a special case of the race between Close and Write described by issue #7970. Since issue #7970 was not fixed, CL 31148 fixed the problem specific to os/exec. Since then, issue #7970 has been fixed, so the specific fix in os/exec is no longer necessary. Remove it, effectively reverting CL 31148 and followup CL 33298. Updates #7970 Updates #9307 Updates #17647 Change-Id: Ic0b62569cb0aba44b32153cf5f9632bd1f1b411a Reviewed-on: https://go-review.googlesource.com/65490 Run-TryBot: Ian Lance Taylor <[email protected]> Reviewed-by: Daniel Martí <[email protected]> Reviewed-by: Miguel Bernabeu <[email protected]> Reviewed-by: Russ Cox <[email protected]> Reviewed-by: Joe Tsai <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>
While testing new race runtime I've hit:
I can't reproduce it with current race runtime, but the race looks very real. Introduced while fixing another race in os/exec #6270. One does not simply use os/exec without races.
The text was updated successfully, but these errors were encountered: