Skip to content

Commit

Permalink
Try to close race condition in FreeBSD tests
Browse files Browse the repository at this point in the history
We're seeing frequent test failures in the FileWatching test on
FreeBSD. Here's my theory of what happens:
 - Both the timer callback and the poll callback execute on the same libuv loop
 - They each schedule their respective tasks
 - Whichever task gets scheduled first first determines the result

However, in the test, we expect that, if the poll callback ran, (which we know
because we know there was an event pending), then that result does
actually get delivered to the toplevel task. This PR tries to close
this hole by adding the following condition:

If the task is no longer waiting on the file watcher (because libuv already
scheduled it), then wait for the task to run to completion, independent
of any timeout. I believe this should close the above race condition
and hopefully fix the test.
  • Loading branch information
Keno committed Nov 12, 2020
1 parent af3cd59 commit 2c9160c
Showing 1 changed file with 14 additions and 4 deletions.
18 changes: 14 additions & 4 deletions stdlib/FileWatching/src/FileWatching.jl
Original file line number Diff line number Diff line change
Expand Up @@ -254,6 +254,10 @@ mutable struct _FDWatcher
end
end

function iswaiting(fwd::_FDWatcher, t::Task)
return fwd.notify.waitq === t.queue
end

mutable struct FDWatcher
watcher::_FDWatcher
readable::Bool
Expand Down Expand Up @@ -653,10 +657,10 @@ function poll_fd(s::Union{RawFD, Sys.iswindows() ? WindowsRawSocket : Union{}},
try
if timeout_s >= 0
result::FDEvent = FDEvent()
timer = Timer(timeout_s) do t
notify(wt)
end
@async begin
t = @async begin
timer = Timer(timeout_s) do t
notify(wt)
end
try
result = wait(fdw, readable=readable, writable=writable)
catch e
Expand All @@ -666,6 +670,12 @@ function poll_fd(s::Union{RawFD, Sys.iswindows() ? WindowsRawSocket : Union{}},
notify(wt)
end
wait(wt)
# It's possible that both the timer and the poll fired on the same
# libuv loop. In that case, which event we see here first depends
# on task schedule order. If we can see that the task isn't waiting
# on the file watcher anymore, just let it finish so we can see
# the modification to `result`
iswaiting(fdw, t) || wait(t)
return result
else
return wait(fdw, readable=readable, writable=writable)
Expand Down

0 comments on commit 2c9160c

Please sign in to comment.