Windows directory watcher stops working after some time #37233
Comments
Just to clarify the impact here, even a basic hello world web project (no angular) will currently consume an entire core indefinitely as long as webdev is running due to using the polling watcher. |
How critical is this bug? I probably won't have time to address it this quarter, meaning it probably won't get fixed until after the summer vacation. |
@sortie Can you take a look at dart-lang/webdev#436 to see how critical it is? |
Do we have an idea when this was introduced? |
Afaik it has been this way forever. |
The problem has been around for a very long time. The impact has gone up since the new build system relies on caching to disk. |
Any update on a fix for this? |
related comment: dart-lang/webdev#436 (comment) |
This issue is blocking our company from upgrading to Dart 2 |
FYI @aadilmaan |
This is because of native
import 'dart:io';

void main() async {
  var i = 0;
  var file = File('out.txt');
  await file.create();
  while (true) {
    i++;
    // Pause between writes; shorten this delay to stress the watcher.
    await Future.delayed(Duration(milliseconds: 1000));
    await file.writeAsString('$i');
  }
}
With a sleep of around 100 milliseconds, it works well. Once the sleep drops to 10 milliseconds, the watcher can't keep up. I'm not sure how to deal with that. As for the other issue, the silent shutdown, I have no idea. I can't reproduce it locally. |
Is that the most efficient way to be notified of file changes for Windows? I'm no expert, but what about this: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-findfirstchangenotificationw |
@bkonyi Any idea? Do we have a way to improve/fix it? |
Hm... nope, I'm not terribly familiar with the file system watcher code. I'm wondering if maybe we're seeing a race that's causing events to be dropped or if that's just a limitation of the API itself. I'm not sure why the isolate would be exiting though... it's almost like it thinks it's done with everything on the event queue and there's nothing left to do so it shuts down. Unfortunately, it's hard to say if you can't reproduce it though. |
I hope you don't mind, but I seem to recall reading on Bruce Dawson's blog about him running into this sort of thing, so I asked him for his advice:
If you don't know him, he's an expert on Windows performance: https://randomascii.wordpress.com/ (He also works at Google) I was looking here: https://github.com/chromium/vs-chromium/blob/master/src/Core/Files/DirectoryChangeWatcher.State.cs |
MSDN says the following:
There is obviously an inherent reliability problem with the API - no matter what we do it can always lose events if somebody is making a lot of file system changes. We can try to fight this issue, but I also think that I think the only simple way we could try to cope with this is to try bumping |
Just a simple thought while you are looking for a definitive solution: |
Here is at least one issue where we have discussed changing what we are watching to avoid seeing the updates we make in I think there may have been some other places where it came up. There are some tricky things to work out if we try to do this - for instance if we start up watchers in the directories we care about initially, and a new directory we would care about gets added, we won't have a watcher in it. I think we could solve this if there were some SDK and OS supported way of either:
The other thing to consider would be to move the output somewhere other than |
@jakemac53 Sorry for the delayed reply. For all the issues you posted,
I managed to find the root cause for this. The reason is that Can you try this cl locally to see whether it solves your problem? https://dart-review.googlesource.com/c/sdk/+/111342
I have never triggered any termination when I worked on the issue. Do you know how to reproduce it consistently? |
I don't seem to be able to reproduce this any more... 🤷♂️.
I don't have the sdk set up to build on my windows box (it's my home machine) so I can't easily test, trying to figure out how to just download the version the try bots built though. Or you could land the cl and I could get it from the build bots after they run. |
@zichangg I can now consistently repro the issue where the entire program simply exits - in 3 runs this happened between 100 and 200 events in so it doesn't take long (this is using the two original programs linked at the top). I saw the same behavior regardless of whether I started the watcher script before or after starting the file editing script. The program exits without any sort of stack trace or anything so I unfortunately can't help much with debugging here, I don't know if there are additional vm flags I could provide to get more logging? |
@jakemac53 I failed to reproduce it, whether using powershell, cmd or git bash. Do you have a detailed reproduction? I'm testing with
I don't think there are any specific flags for file watchers. Maybe use observatory and set --pause-isolates-on-exit. |
@jakemac53 You can probably pass --trace-isolates to the VM to print more info. |
Does the "program exiting" behavior reproduce more easily if you pause the watcher (by clicking somewhere in the powershell console that has the watcher running) for 4-5 seconds, then resume it (hit Esc, for example)? |
I will try this again tomorrow with |
@jakemac53 did you have a chance to try it out? I would like to make sure that this issue is not falling through. |
Thanks for the ping, I had indeed forgotten to check this. I can check it when I get home tonight on my windows machine there. |
I tried with
|
I did find
|
@jakemac53 Just wanna double check. Is your reproduction the same as what @aam described?
|
I don't need to pause it to get the reproduction - I run the watch script and then run the file modification script. The watch script dies typically within a couple seconds (after starting the file modification one). |
@zichangg and @jakemac53. A solution for this issue is very important for my company's team to start a project with Dart in a Windows environment. Is there a deadline to close this issue? PS: Just one thought, no thorough investigation: |
The analyzer uses the same watcher - but there is a specific bad interaction with the watcher and package:build due to the high number of file events we cause when doing builds. |
(also anecdotally I have many times seen analyzer stop working on windows and had to restart vscode to get it back, I would be willing to bet this is the reason) |
@jakemac53 can you try the following variant of watcher please?
Does it still exit for you? The explanation I have for why your version exits is that the only active receive port in that watcher is the directory changes listener, but it just stops (perhaps due to internal constraints of buffer overflow handling), which leads to that single receive port being closed, leading to the dart program exiting. |
I think that's the root cause. I tested different buffer sizes with aam's reproduction. With an extremely small buffer size, the program will exit automatically. With a 64k byte buffer, pausing the watcher's powershell for several seconds can reproduce the problem.
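The explanation above can be illustrated with a minimal sketch. It relies on general Dart VM behavior: an isolate exits once it has no open receive ports or pending timers left. `watchKeepingAlive` is a hypothetical helper name, not part of the SDK and not the watcher variant referred to in this comment; it simply holds an extra `ReceivePort` open so the program keeps running even if the watch stream stops delivering events.

```dart
import 'dart:async';
import 'dart:io';
import 'dart:isolate';

/// Hypothetical helper (not an SDK API): watches [dir] while holding an
/// extra ReceivePort open, so the isolate does not exit just because the
/// watch stream's own port was closed. Returns a function that cancels the
/// subscription and releases the keep-alive port.
Future<void> Function() watchKeepingAlive(
    Directory dir, void Function(FileSystemEvent) onEvent) {
  // An open ReceivePort is enough to keep the isolate alive.
  final keepAlive = ReceivePort();
  final sub = dir.watch().listen(
    onEvent,
    onError: (Object e) => stderr.writeln('watch error: $e'),
    onDone: () => stderr.writeln('watch stream closed'),
  );
  return () async {
    await sub.cancel();
    keepAlive.close();
  };
}
```

With a helper like this, a watcher that silently stops no longer takes the whole process down, which makes it possible to tell "the stream went quiet" apart from "the program exited". It does not recover any dropped events.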
For buffer size,
64kb is what we are using now. I don't think we have anything to do here. |
OK - if this is an issue with the underlying OS then we can fall back to trying to mitigate the problem instead. Can we get some sort of specific exception thrown when this happens? Then we can at least message to the user what has happened. That would give us a relatively sane path towards enabling this behind a flag. |
Yeah, something needs to get called when the watcher exits (currently there is no indication that the watcher stopped watching). The patch below allows to get
|
Folks, I'm sorry for the insistence. But would we now have a "light at the end of the tunnel" for this issue? |
This is not a low priority. |
According to https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-readdirectorychangesw, When using ReadDirectoryChangesW, buffer overflows will still return true. It ends up closing the stream without any notification. Throw an exception to notify users. Bug: #37233 Change-Id: I9aebed8b1f30b5e843ad37a51b87d234aa1d8ce6 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/119524 Commit-Queue: Zichang Guo <[email protected]> Reviewed-by: Siva Annamalai <[email protected]> Reviewed-by: Alexander Aprelev <[email protected]>
https://dart-review.googlesource.com/c/sdk/+/119524 landed, it has an example of how one can catch those exceptions (watcher method used in the testWatchOverflow test https://dart-review.googlesource.com/c/sdk/+/119524/16/tests/standalone_2/io/file_system_watcher_test.dart#451). Please see if it works for you. |
To handle the exception, simply surround stream.listen() with runZoned() and provide an onError() callback, which will be invoked if an exception occurs.
runZoned(() {
  watcher.listen((data) async {
    // handle event
  });
}, onError: (error) {
  // handle error
}); |
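For readers on newer SDKs, where the `onError` parameter of `runZoned` is deprecated in favor of `runZonedGuarded`, the same idea might look roughly like the sketch below. `listenGuarded` and its parameter names are hypothetical, chosen for illustration; the sketch works on any stream and makes no claim about the exact exception type the watcher reports.

```dart
import 'dart:async';

/// Hypothetical helper (not an SDK API): listens to [events] inside a
/// guarded zone. Because the subscription has no onError handler of its
/// own, stream errors are reported to the zone and end up in [onProblem].
void listenGuarded<T>(Stream<T> events, void Function(T) onEvent,
    void Function(Object error) onProblem) {
  runZonedGuarded(() {
    events.listen(onEvent);
  }, (Object error, StackTrace stack) => onProblem(error));
}
```

Calling `listenGuarded(Directory('some/path').watch(), handleEvent, handleError)` (with a placeholder path and caller-supplied handlers) would route watcher errors to `handleError`. Passing `onError:` directly to `listen()` is an alternative when only stream errors matter.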
Original issue on the watcher package dart-lang/tools#1713, related issue in build_runner dart-lang/webdev#436.
I created a simple repro below that uses vanilla Directory.watch. You can run each of these scripts at the same time to repro, which exposes a few issues:
Stream closed!! message on stdout, but the process does exit. There is no unhandled exception that surfaces either.
File modification script, edits a single file in a loop:
Watcher script, logs watch events:
As a result of this we have to use a polling watcher, which doesn't perform well.