-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: windows callback on non-Go thread fails #6751
Comments
The issue is raised here https://groups.google.com/d/topic/golang-nuts/h91eDx27vrQ/discussion Alex |
Comment 2 by vanillabluesky007: Looks great. Probably the same issue that i triggered. The only difference in my case is that my dll uses C-Runtime therefore creates threads with _beginthreadex rather than CreateThread. (different thread management in both) See: http://msdn.microsoft.com/en-us/library/windows/desktop/ms682453(v=vs.85).aspx (P.s nice trick calling kernel32.dll directly and apologies for being late on opening the issue!) /Sven |
sven, Attaching your test case here for posterity. I could not use your dll, it uses msvcr110d.dll that I don't have on my amd64 pc. And I couldn't build it on my 386 pc, because _beginthread expects __cdecl, not __stdcall function (it, probably, didn't matter for you, if you're on amd64). Anyway, I think my test is good enough for our purpose. Don't you agree? I investigated where my test hangs. It is in lockextra, like you said. And it is because no free m. cgo does not have that problem, because it sets runtime·needextram to 1 at the start. I don't know why it is limited to cgo only. But if I move it into runtime, then my test PASSes. I suspect your program should work too. Here is my diff: diff --git a/src/pkg/runtime/cgo/iscgo.c b/src/pkg/runtime/cgo/iscgo.c --- a/src/pkg/runtime/cgo/iscgo.c +++ b/src/pkg/runtime/cgo/iscgo.c @@ -12,4 +12,3 @@ #include "../runtime.h" bool runtime·iscgo = 1; -uint32 runtime·needextram = 1; // create an extra M on first cgo call diff --git a/src/pkg/runtime/crash_test.go b/src/pkg/runtime/crash_test.go --- a/src/pkg/runtime/crash_test.go +++ b/src/pkg/runtime/crash_test.go @@ -105,7 +105,7 @@ testDeadlock(t, lockedDeadlockSource) } -func TestLockedDeadlock2(t *testing.T) { +func testLockedDeadlock2(t *testing.T) { testDeadlock(t, lockedDeadlockSource2) } diff --git a/src/pkg/runtime/proc.c b/src/pkg/runtime/proc.c --- a/src/pkg/runtime/proc.c +++ b/src/pkg/runtime/proc.c @@ -64,7 +64,7 @@ Sched runtime·sched; int32 runtime·gomaxprocs; -uint32 runtime·needextram; +uint32 runtime·needextram = 1; bool runtime·iscgo; M runtime·m0; G runtime·g0; // idle goroutine for m0 diff --git a/src/pkg/runtime/syscall_windows_test.go b/src/pkg/runtime/syscall_windows_test.go --- a/src/pkg/runtime/syscall_windows_test.go +++ b/src/pkg/runtime/syscall_windows_test.go @@ -240,5 +240,35 @@ } func TestCallbackInAnotherThread(t *testing.T) { - // TODO: test a function which calls back in another thread: QueueUserAPC() or CreateThread() + d := GetDLL(t, "kernel32.dll") + + f := func(p uintptr) uintptr { + return p + } + r, _, err := d.Proc("CreateThread").Call(0, 0, syscall.NewCallback(f), 123, 0, 0) + if r == 0 { + t.Fatalf("CreateThread failed: %v", err) + } + h := syscall.Handle(r) + defer syscall.CloseHandle(h) + + switch s, err := syscall.WaitForSingleObject(h, 100); s { + case syscall.WAIT_OBJECT_0: + break + case syscall.WAIT_TIMEOUT: + t.Fatal("timeout waiting for thread to exit") + case syscall.WAIT_FAILED: + t.Fatalf("WaitForSingleObject failed: %v", err) + default: + t.Fatalf("WaitForSingleObject returns unexpected value %v", s) + } + + var ec uint32 + r, _, err = d.Proc("GetExitCodeThread").Call(uintptr(h), uintptr(unsafe.Pointer(&ec))) + if r == 0 { + t.Fatalf("GetExitCodeThread failed: %v", err) + } + if ec != 123 { + t.Fatalf("expected 123, but got %d", ec) + } } I had to disable one test. I don't know why, so we have to resolve that. We cannot make changes now, not until after Go 1.2 is released. Perhaps this change will get accepted. I hope you can use this in the meantime to fix your problem locally. Alex Attachments:
|
cameron.howey, My patch at #3 is not good. Like I said there: "... I had to disable one test. I don't know why, so we have to resolve that. ...". It appears the test fails with "deadlock" message, so we need to rearrange scheduler somewhat before we can apply this patch. I didn't push for it, because I don't have anyone complaining about it. Perhaps, if it is important to you, you could tell us what your difficulty is. Maybe that will convince someone to put some actual effort into it. Thank you. Alex |
I wanted to use Windows COM (specifically the MMDevice and EndpointVolume APIs) from a Go project. If this sounds like a bad idea, it is. So I wrote a simple DLL in C++ for the COM stuff and exported the relevant functions. I have the DLL start another thread and run the COM stuff from there, to avoid worrying about how COM was going to interact with Go's thread-hopping, etc. I use IAudioEndpointVolume::RegisterEndpointNotificationCallback to get notifications when the system volume changes. I wanted to use a Go callback here. So the issue is that the DLL runs a separate thread but needs to use a Go callback in that thread. I was not otherwise using Cgo, instead using syscall to call my DLL. Like I said, using import "C" solved the issue. It works like a champ. But I spent some hours trying to figure it out beforehand (the Go executable just crashed with no message) until I found this on the issue tracker. Possibly you don't hear about this issue much because most Go programmers don't use callbacks in a separate thread in a DLL on Windows. Furthermore, people who ARE doing this might already use import "C", which would hide the problem. |
cameron.howey, > I wanted to use Windows COM (specifically the MMDevice and EndpointVolume APIs) from a Go project. If this sounds like a bad idea, it is. So I wrote a simple DLL in C++ for the COM stuff and exported the relevant functions. ... I would like to see what you are trying to do. You can provide a C program, if you like. I am interested to see, if I can convert it into pure Go. Just for fun. When I have time. Thank you. Alex |
I posted a gist: https://gist.github.com/chowey/6705c5c6b07399e6a73a |
Are there any plans to fix this problem? My specific need is the following: I have cgofuse, a cross-platform FUSE (Filesystem in Userspace) package based on cgo. I am hoping to eliminate the need for cgo on Windows, by using This is to support the go-ipfs project, which would rather avoid the use of a C compiler on Windows. More information here: ipfs/kubo#5003 (comment) Ping @djdv. |
It may be worth mentioning that my original intention was to write a dynamic binding for winfsp separate from cgofuse. This issue prevents either effort from progressing. |
@djdv try |
|
There is no one working on this. If you want this fixed, I suggest you try and fix it yourself. This issue has a test, so all you need to do is adjust runtime package to make test pass. :-) If you need any help, just ask here. We will try and answer any questions you might have. Alex |
@alexbrainman thanks. It is unlikely I will have the time to look into this for a few days, but I am hoping to find some time for experimentation in the near future. |
Despite my earlier statement about needing a few days to look into this I decided to have a look, but have already hit a snag. Running I have traced this problem to the following 2 tests: The OS is Windows 10.0.14393 inside VirtualBox. When the hang happens the OS becomes completely unresponsive and even after waiting an extended period of time it does not come back. The OS does not appear to bug check (no |
The patch below appears to resolve this issue. The patch includes a simple fix in Unfortunately the patch fails multiple deadlock tests: I suspect the reason is that (Please note that until a couple of hours ago I had no idea about G's, M's and P's, so I may be way off in my analysis. The following link helped me get a quick high-level understanding.) Here is the patch: diff --git a/src/runtime/proc.go b/src/runtime/proc.go
index ba76f7c3e7..14944da81f 100644
--- a/src/runtime/proc.go
+++ b/src/runtime/proc.go
@@ -1287,7 +1287,7 @@ func mstart1() {
//go:yeswritebarrierrec
func mstartm0() {
// Create an extra M for callbacks on threads not created by Go.
- if iscgo && !cgoHasExtraM {
+ if (iscgo || GOOS == "windows") && !cgoHasExtraM {
cgoHasExtraM = true
newextram()
}
@@ -1612,7 +1612,7 @@ func allocm(_p_ *p, fn func()) *m {
// put the m back on the list.
//go:nosplit
func needm(x byte) {
- if iscgo && !cgoHasExtraM {
+ if (iscgo || GOOS == "windows") && !cgoHasExtraM {
// Can happen if C/C++ code calls Go from a global ctor.
// Can not throw, because scheduler is not initialized yet.
write(2, unsafe.Pointer(&earlycgocallback[0]), int32(len(earlycgocallback)))
diff --git a/src/runtime/syscall_windows_test.go b/src/runtime/syscall_windows_test.go
index dfde12a211..fd5ecca03d 100644
--- a/src/runtime/syscall_windows_test.go
+++ b/src/runtime/syscall_windows_test.go
@@ -251,7 +251,34 @@ func TestBlockingCallback(t *testing.T) {
}
func TestCallbackInAnotherThread(t *testing.T) {
- // TODO: test a function which calls back in another thread: QueueUserAPC() or CreateThread()
+ d := GetDLL(t, "kernel32.dll")
+ f := func(p uintptr) uintptr {
+ return p
+ }
+ r, _, err := d.Proc("CreateThread").Call(0, 0, syscall.NewCallback(f), 123, 0, 0)
+ if r == 0 {
+ t.Fatalf("CreateThread failed: %v", err)
+ }
+ h := syscall.Handle(r)
+ defer syscall.CloseHandle(h)
+ switch s, err := syscall.WaitForSingleObject(h, 100); s {
+ case syscall.WAIT_OBJECT_0:
+ break
+ case syscall.WAIT_TIMEOUT:
+ t.Fatal("timeout waiting for thread to exit")
+ case syscall.WAIT_FAILED:
+ t.Fatalf("WaitForSingleObject failed: %v", err)
+ default:
+ t.Fatalf("WaitForSingleObject returns unexpected value %v", s)
+ }
+ var ec uint32
+ r, _, err = d.Proc("GetExitCodeThread").Call(uintptr(h), uintptr(unsafe.Pointer(&ec)))
+ if r == 0 {
+ t.Fatalf("GetExitCodeThread failed: %v", err)
+ }
+ if ec != 123 {
+ t.Fatalf("expected 123, but got %d", ec)
+ }
}
type cbDLLFunc int // int determines number of callback parameters |
Actually this problem appears to happen with cgo as well. Consider the following go program: package main
func main() {
select {}
} Run it to find that $ go run deadlock.go
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [select (no cases)]:
main.main()
/Users/billziss/Projects/go/src/deadlock.go:4 +0x20
exit status 2 Now consider the following cgo program: package main
import "C"
func main() {
select {}
} Run it to find that $ go run deadlock.go
^Csignal: interrupt I believe that the problem lies with EDIT: this latest test was done on a Mac with stock Go 1.10. |
The following patch passes all tests on Windows and macOS. (On Windows this excludes the 2 tests diff --git a/src/runtime/proc.go b/src/runtime/proc.go
index ba76f7c3e7..1647044f36 100644
--- a/src/runtime/proc.go
+++ b/src/runtime/proc.go
@@ -1287,7 +1287,7 @@ func mstart1() {
//go:yeswritebarrierrec
func mstartm0() {
// Create an extra M for callbacks on threads not created by Go.
- if iscgo && !cgoHasExtraM {
+ if (iscgo || GOOS == "windows") && !cgoHasExtraM {
cgoHasExtraM = true
newextram()
}
@@ -1612,7 +1612,7 @@ func allocm(_p_ *p, fn func()) *m {
// put the m back on the list.
//go:nosplit
func needm(x byte) {
- if iscgo && !cgoHasExtraM {
+ if (iscgo || GOOS == "windows") && !cgoHasExtraM {
// Can happen if C/C++ code calls Go from a global ctor.
// Can not throw, because scheduler is not initialized yet.
write(2, unsafe.Pointer(&earlycgocallback[0]), int32(len(earlycgocallback)))
@@ -4209,8 +4209,13 @@ func checkdead() {
return
}
+ var run0 int32
+ if !iscgo && cgoHasExtraM {
+ run0 = 1
+ }
+
run := mcount() - sched.nmidle - sched.nmidlelocked - sched.nmsys
- if run > 0 {
+ if run > run0 {
return
}
if run < 0 {
diff --git a/src/runtime/syscall_windows_test.go b/src/runtime/syscall_windows_test.go
index dfde12a211..fd5ecca03d 100644
--- a/src/runtime/syscall_windows_test.go
+++ b/src/runtime/syscall_windows_test.go
@@ -251,7 +251,34 @@ func TestBlockingCallback(t *testing.T) {
}
func TestCallbackInAnotherThread(t *testing.T) {
- // TODO: test a function which calls back in another thread: QueueUserAPC() or CreateThread()
+ d := GetDLL(t, "kernel32.dll")
+ f := func(p uintptr) uintptr {
+ return p
+ }
+ r, _, err := d.Proc("CreateThread").Call(0, 0, syscall.NewCallback(f), 123, 0, 0)
+ if r == 0 {
+ t.Fatalf("CreateThread failed: %v", err)
+ }
+ h := syscall.Handle(r)
+ defer syscall.CloseHandle(h)
+ switch s, err := syscall.WaitForSingleObject(h, 100); s {
+ case syscall.WAIT_OBJECT_0:
+ break
+ case syscall.WAIT_TIMEOUT:
+ t.Fatal("timeout waiting for thread to exit")
+ case syscall.WAIT_FAILED:
+ t.Fatalf("WaitForSingleObject failed: %v", err)
+ default:
+ t.Fatalf("WaitForSingleObject returns unexpected value %v", s)
+ }
+ var ec uint32
+ r, _, err = d.Proc("GetExitCodeThread").Call(uintptr(h), uintptr(unsafe.Pointer(&ec)))
+ if r == 0 {
+ t.Fatalf("GetExitCodeThread failed: %v", err)
+ }
+ if ec != 123 {
+ t.Fatalf("expected 123, but got %d", ec)
+ }
}
type cbDLLFunc int // int determines number of callback parameters |
Did not take you long to come up with a solution. :-)
I have never seen anything like that. If you care, you can create new issue and someone will investigate.
Normally hanged test are debugged by adding -timeout parameter. -timeout forces test to crash, and crash prints stack trace, and you can see where test hangs. But your test hangs whole OS, so I don't see how my advise is useful here.
I do not know, if your program should hang or not. So I created #25538 to confirm.
Yes this works for me too. On Windows 10 here. I am not familiar with scheduler, so I cannot confirm your patch is correct or not. But if you send your change for review https://golang.org/doc/contribute.html someone will check your code. Thank you. Alex |
Thanks for the pointer. The document mentions Gerrit, which I am not familiar with, but it looks like I can now submit the patch as a GitHub PR as well. The Google CLA looks fairly benign, so I am inclined to sign it. Re: the patch itself. A large portion of this patch includes your test @alexbrainman which I would not be able to submit under the Google CLA (I would be violating term 5 or have to follow the process in term 7, which I do not understand). So I can only submit my changes against |
I have now submitted PR # 25575, which includes the changes in |
Change https://golang.org/cl/114755 mentions this issue: |
Yes that should work. Unfortunately your PR #25575 seems stuck - you created and signed CLA 4 hours ago, and Gobot should have created Gerrit CL by now. We'll give it little more time, and maybe ping for help. It is not good that it is weekend now, perhaps we'll have to wait for next week.
I contributed to Go before, so I have signed CLA, and you could take my code as part of your change. But I created https://go-review.googlesource.com/c/go/+/114755 in case we want two separate changes. I don't care how, as long as your change goes in with the test.
Thank you for fixing this. It have very little free time nowdays. Alex |
I see. Obviously I am unfamiliar with the process, so I did not know what to expect. Perhaps the "gobot" needs a kick? :)
You mean to tell me that you do not work for Google, but somehow you have 710+ commits on this project? Respect!
I would prefer it if we kept the changes separate. The CLA states in term 5 that: "You represent that each of Your Contributions is Your original creation...". I find it inadvisable to knowingly violate the CLA on the same day I signed it :)
You are welcome. I very much understand the no-time argument. |
Updates #6751 Change-Id: Ibb176a17e67c67f855bc4f3e5462dddaedaa8a58 Reviewed-on: https://go-review.googlesource.com/114755 Run-TryBot: Alex Brainman <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]>
Perhaps it does need a kick. I pinged Brad on your PR, maybe he will help. It is getting late Friday in West coast of USA, so it might be a while when someone will help - Go Team members are real people. But we will get your change reviewed.
No, I do not work for Google. Alex |
Change https://golang.org/cl/114802 mentions this issue: |
Dear alexbrainman I tried your old below sample but same result.
Could you please give me some solution ? Best Regards, |
Same problem as #6751 ? As you can see, the issue is still opened, so the problem still remains. But @billziss-gh proposed solution - you can try his solution at https://golang.org/cl/114802. Alex |
The text was updated successfully, but these errors were encountered: