-
-
Notifications
You must be signed in to change notification settings - Fork 652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pantsd output is inconsistently truncated when piped #7465
Comments
I've edited the title based on the very high likelihood of a relation to #6687. |
After a brief attempt to retro, this is what I get:
I will keep investigating, my next step is to see whether |
|
Yay! |
|
I haven’t been able to repro this on sha
Could you tell me what sha you’re running it from, and what macOS version you have? |
This is on master 5eed2e7. macOS 10.14.4 (Mojave), 2018 MacBook Pro.
With Py2:
So now is the inconsistency related to Py3..? Notably Py2 used to be inconsistent for me too, whereas now it appears deterministic. |
Might be related to the number of cores then? We are running in the same OS and the same commit, the only difference is the computer itself... I don't think I'll have access to a more powerful OSX machine anytime soon, but can investigate vms in my personal Linux Mint (~Ubuntu) laptop. |
Potentially. pants at master · sysctl -n hw.ncpu
12 (Let me know if you'd like me to run any other commands) I do think it's significant that there's a difference between Py2 and Py3. I'm really curious what the cause is, and suspect it may have to do with |
This is probably the case, but it's really hard to verify without access to a repro. Could you run pantsd in debug mode and upload the resulting logs?
|
https://gist.github.com/Eric-Arellano/f5b4c03af61784083c0ea443d1b6b056 |
Thanks! I can't see anything obvious, but could you try this patch? Will only work on python3:
I'm sorry for the dance, if this fixes it then it was probably the buffering. If not, I will try to increase the buffer size to see if it repros on my machine. |
Yes, that diff worked!
Yay! However, when piping there's now consistently really bad performance:
This is only when piping. When outputting normally, it takes about 1.5 seconds. Also, |
Seems like it takes longer for
so when following section (in DaemonPantsRunner._pipe_stdio(...) exits, some outputs have not been received by NailgunStreamStdinReader reader yet.
During my testing, if I provide an extra 1 second (sleep(1)) either at the end of |
The default buffer size is 8192. Instead of using the default one, if we pass in
|
I think that we might have seen a repro of this on Linux as well (which isn't too surprising, since it seems like a timing issue). |
### Problem Dealing with raw file descriptors is confusing at best and error-prone at worst. ### Solution Define a `Pipe` abstraction that owns (read: "managest the lifetimes of") the write and read ends of a pipe, as well as provide convenience methods to query the state of the fds. ### Result It's now a bit more obvious what pipes are open. Part one of the fix for #7465
## Problem As described in #7465, when we pipe the output of a pants run with pantsd into anything, it may truncate output. The root cause: - `NailgunStreamWriter` runs in another thread, and is periodically reading from `std{out,err}` and writing it in the Nailgun socket. Among other uses, we create one of these whenever we need to redirect stdout/err from pants into the client. - In the current implementation of pantsd, this happens in DaemonPantsRunner, whenever we call to `nailgunned_stdio` and either stdout, err or in are not TTYs (there's more nuance, but it's not relevant). We create pipes and give the writing end to the `DaemonPantsRunner`, while the `NailgunStreamWriter` listens to the reading end for output. The `DaemonPantsRunner` then redirects `sys.stdout/err` (and some other stuff) to the reading ends of those pipes, through `stdio_as`. This happens in two places: - When we warm up the v2 product graph, where things like v2 list are computed. - When we actually execute the pants run under pantsd. - When the daemon is finished with `nailgunned_stdio`, it stops the `NailgunStreamWriter` thread that is transmitting output to the client through the socket, and then it joins until it's done. **- The issue:** When the daemon `stop()`s the thread, this stops reading immediately, even if it hasn't finished transmitting every chunk to the client. Therefore, there is a race condition between `NailgunStreamWriter` trying to send all the chunk through the socket and `DaemonPantsRunner` trying to close it. This can be checked by inserting a `time.sleep(0.1)` [here](https://github.com/pantsbuild/pants/blob/3c17cd2b54b468590ddf8f486f37652ed0db6afc/src/python/pants/java/nailgun_io.py#L201), which will make the issue repro consistently. ## Solution Depends on #7740, and includes commits from that. **The only commit worth reviewing is b9d0c95** The `DaemonPantsRunner` must no longer `stop()` the thread, but rather mark the pipes that are is stdout/err as "I'm not going to write to this anymore, please close yourself when you're done reading from this". We create the class `PipedNailgunStreamWriter`, wich is aware that it's reading end belongs to a pipe. Whenever we don't have anything to read, we check if the write end of the pipe is closed. If so, we stop trying to read from that pipe. When no pipes are left, we close ourselves. ## Result Hopefully, `./pants --enable-pantsd list :: | wc -l` should give consistent output, and #7465 should be resolved.
### Problem Dealing with raw file descriptors is confusing at best and error-prone at worst. ### Solution Define a `Pipe` abstraction that owns (read: "managest the lifetimes of") the write and read ends of a pipe, as well as provide convenience methods to query the state of the fds. ### Result It's now a bit more obvious what pipes are open. Part one of the fix for #7465
## Problem As described in #7465, when we pipe the output of a pants run with pantsd into anything, it may truncate output. The root cause: - `NailgunStreamWriter` runs in another thread, and is periodically reading from `std{out,err}` and writing it in the Nailgun socket. Among other uses, we create one of these whenever we need to redirect stdout/err from pants into the client. - In the current implementation of pantsd, this happens in DaemonPantsRunner, whenever we call to `nailgunned_stdio` and either stdout, err or in are not TTYs (there's more nuance, but it's not relevant). We create pipes and give the writing end to the `DaemonPantsRunner`, while the `NailgunStreamWriter` listens to the reading end for output. The `DaemonPantsRunner` then redirects `sys.stdout/err` (and some other stuff) to the reading ends of those pipes, through `stdio_as`. This happens in two places: - When we warm up the v2 product graph, where things like v2 list are computed. - When we actually execute the pants run under pantsd. - When the daemon is finished with `nailgunned_stdio`, it stops the `NailgunStreamWriter` thread that is transmitting output to the client through the socket, and then it joins until it's done. **- The issue:** When the daemon `stop()`s the thread, this stops reading immediately, even if it hasn't finished transmitting every chunk to the client. Therefore, there is a race condition between `NailgunStreamWriter` trying to send all the chunk through the socket and `DaemonPantsRunner` trying to close it. This can be checked by inserting a `time.sleep(0.1)` [here](https://github.com/pantsbuild/pants/blob/3c17cd2b54b468590ddf8f486f37652ed0db6afc/src/python/pants/java/nailgun_io.py#L201), which will make the issue repro consistently. ## Solution Depends on #7740, and includes commits from that. **The only commit worth reviewing is b9d0c95** The `DaemonPantsRunner` must no longer `stop()` the thread, but rather mark the pipes that are is stdout/err as "I'm not going to write to this anymore, please close yourself when you're done reading from this". We create the class `PipedNailgunStreamWriter`, wich is aware that it's reading end belongs to a pipe. Whenever we don't have anything to read, we check if the write end of the pipe is closed. If so, we stop trying to read from that pipe. When no pipes are left, we close ourselves. ## Result Hopefully, `./pants --enable-pantsd list :: | wc -l` should give consistent output, and #7465 should be resolved.
Multiple runs produce inconsistent line counts:
This happens with Python 2, although less severely:
Note, however that removing the pipe, i.e.
PANTS_ENABLE_PANTSD=true ./pants2 --no-v1 --v2 list ::
, leads to the entire output being printed correctly.--
This only happens on macOS, not Ubuntu.
The text was updated successfully, but these errors were encountered: