create WorkunitOutput subsystem which redirects workunit output #8765
accomplish this via introducing OutputTeeingFileBackedRWBuf, which uses a 'tee' subprocess to spread output across multiple files.
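As a rough illustration of the 'tee' idea (not the actual OutputTeeingFileBackedRWBuf code, which is not shown on this page), the sketch below pipes a chunk of output through a single 'tee' subprocess so it lands in several files at once. The function name `tee_to_files` and its `inherit_stdout` parameter are hypothetical, and appending with `-a` is an assumption.

```python
import subprocess
from typing import Sequence


def tee_to_files(data: bytes, paths: Sequence[str], inherit_stdout: bool = False) -> None:
    """Copy `data` into every file in `paths` using one `tee` subprocess.

    Hypothetical helper for illustration only; it is not part of pants.
    """
    # `tee -a` appends to each named file instead of truncating it.
    argv = ["tee", "-a", *paths]
    # tee always echoes its stdin to its own stdout. Passing stdout=None lets the
    # child inherit the parent's stdout; otherwise the echo is discarded.
    stdout = None if inherit_stdout else subprocess.DEVNULL
    subprocess.run(argv, input=data, stdout=stdout, check=True)
```

Calling `tee_to_files(b"line\n", ["a.txt", "b.txt"])` twice should leave both files holding both lines. A long-lived buffer would more likely keep one 'tee' process open and stream into its stdin rather than spawn a process per write.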
I'm not sure we want to be exposing this when the future of workunits in v2 is kind of up in the air, as it's presumably something we'd need to maintain compatibility for, or deprecate, which could constrain v2 direction noticeably... Is there a reason this is currently urgent?
Assuming we do do it, though, the interface of concatenating (and maybe interleaving? I didn't look at synchronisation) all of the outputs together into one file seems weird to me. How would you feel about the "output path" actually being a directory into which we'll put structure (either named files per workunit, or even a directory hierarchy, or something, given they have hierarchy)? If the user then wants to concat them, great, they can.
This would be useful to me for multiple scenarios. Will a work-unit support saving to the build-cache and allow me to retrieve it for cached artifacts? For example, if a build completed with 3 compiler warnings, and later I rebuilt, it would be useful to still be able to retrieve the warnings from the cached artifact.
That would not come from this change. We'd need to include stdout and stderr in those cache bundles (which personally I think we should do), but that would be orthogonal. This would only output this information onto a file on your local disk, only if you didn't hit a cache.
Got it, I think I slightly misunderstood the change then. Although it wouldn't have to be std-in for my purposes as long as I could declare something as a cacheable artifact. For example, the diff-changed goal outputs a git patch to stdout, but I only did that as I am scraping stdout to retrieve it. I would love to be able to write it to a file instead, that I could easily retrieve later from the build cache.
That, you actually can do, as long as you're willing to modify the pants task. Anything you write to the pants/src/python/pants/backend/python/tasks/python_binary_create.py Lines 92 to 101 in 0c39921
Especially after talking with @ShaneDelmore further about what he might want from this kind of option, I'm thinking of a two-fold plan to unify these:
For (2), @ShaneDelmore suggested something like:
> ./pants fetch lspIndexes dataproducts/project/::
Translating that into some idea of pants options could be:
> ./pants fetch --task=compile.lsp-indexes dataproducts/project/::
which would be interpreted as "fetch the cached
I'm going to spend 5 minutes making this into a google doc.
Honestly, I'd probably rather just port scalafix to v2, which gets stdout and stderr caching and replay for free, rather than building an entire new caching mechanism for stdout and stderr specifically for v1, given this is already a solved problem in v2.
Commented on the doc.
Closing due to being stale. Workunits have seen some substantial changes the past few months, from what I can tell.
TODO: test cases!!!
Problem
We would like to be able to get some parts of pants's output in different places, for multiple reasons:
- ./pants run, but direct the process's output somewhere else (instead of amidst all the pants output).
For some background on how this particular interface came to be, see this comment which tangentially introduced the idea of redirecting output: #7071 (comment), although it was buried amidst a larger discussion of output in v2 rules.
Solution
(also see: ./pants help workunit-output)
- Create a WorkunitOutput subsystem and attach it to the RunTracker.
- The --redirections option is a dict mapping file names to "redirection specs", which are just nested dicts. This looks something like: a spec for workunits matching '.*' (everything), which declare any of the TOOL or TEST labels, on the outputs named 'stdout' and 'stderr' (exact match). In this precise command line, that would mean that all pytest output would be appended to a file named 'out.txt'.
- '/dev/stdout' is treated specially, and results in the 'tee' subprocess inheriting the pants stdout. This means that workunit output which would normally be hidden (such as coursier) can be selectively toggled on by the user.
- Accomplish this via introducing OutputTeeingFileBackedRWBuf, which uses a 'tee' subprocess to spread output across multiple files.
Result
Pants can redirect some of its output elsewhere with a command-line option without affecting the rest of the pants run!
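To make the "redirection specs" described under Solution more concrete, here is a minimal sketch of how such a dict might be matched against a workunit's output. The key names (`workunit_name`, `labels`, `outputs`), the helper functions, and the use of a full regex match on the workunit name are all assumptions for illustration; the actual option schema is documented by ./pants help workunit-output and may differ.

```python
import re
from typing import Any, Dict, Iterable, List

# Hypothetical value for --redirections: file name -> nested "redirection spec".
# The key names are invented for this sketch and are not the real pants schema.
REDIRECTIONS: Dict[str, Dict[str, Any]] = {
    "out.txt": {
        "workunit_name": ".*",            # regex: matches every workunit
        "labels": ["TOOL", "TEST"],       # workunit must declare any of these
        "outputs": ["stdout", "stderr"],  # output names, matched exactly
    },
}


def spec_matches(
    spec: Dict[str, Any],
    workunit_name: str,
    workunit_labels: Iterable[str],
    output_name: str,
) -> bool:
    """Return True if one output of one workunit is selected by `spec`."""
    name_ok = re.fullmatch(spec["workunit_name"], workunit_name) is not None
    labels_ok = any(label in spec["labels"] for label in workunit_labels)
    output_ok = output_name in spec["outputs"]
    return name_ok and labels_ok and output_ok


def destinations(
    redirections: Dict[str, Dict[str, Any]],
    workunit_name: str,
    workunit_labels: Iterable[str],
    output_name: str,
) -> List[str]:
    """List the file names that should receive this workunit output."""
    return [
        file_name
        for file_name, spec in redirections.items()
        if spec_matches(spec, workunit_name, workunit_labels, output_name)
    ]
```

Under these assumptions, `destinations(REDIRECTIONS, "pytest", ["TEST"], "stdout")` selects 'out.txt', while a workunit with neither the TOOL nor the TEST label selects nothing.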