Skip to content

Commit

Permalink
Add flag to replay stdout/stderr across builds
Browse files Browse the repository at this point in the history
This adds an experimental flag, --experimental_replay_action_out_err,
which changes Bazel's behavior to replay action stdout/stderr on
subsequent builds, even if the action is cached in Skyframe or in the
action cache.

I made 4 changes for this to work:

1. Add a new directory, bazel-out/_action/ which contains persistent
   stdout/stderr files, which use the action key as the name (so the names
   are also deterministic across server restarts).

2. Report the action output as a warning rather than an info message
   (info messages are not cached by Skyframe but discarded after being
   reported once).

3. Report the warning message to the Skyframe event listener, rather
   than the global reporter.

4. Replay the warning message on a cache hit.

There are a few shortcomings of this approach:

1. It switches the entire ActionExecutionContext to the Skyframe event
   listener if the new flag is set, which could also cause replay for
   other events that are posted to this listener as long as the action
   is a Skyframe hit. When changing configuration options or restarting
   the server, these events are dropped as we currently have no way of
   persisting them in the action cache.

2. Its interaction with action rewinding is not ideal. Action rewinding
   can rerun actions after their stdout/stderr have already been posted
   to the CLI.

   When an action is re-executed after rewinding, all events from that
   action are suppressed, and the empty set is recorded in Skyframe,
   replacing the previous (possibly non-empty) set of events.

   At this time, action rewinding is primarily used in contexts where
   this is not a concern.

3. If the stdout/stderr files are locally modified, then Bazel replays
   the modified files. It might be nicer to have Bazel rerun the
   corresponding actions in this case by recording the stdout/stderr
   hashes in the action cache and checking them on incremental builds.

4. Flipping the flag on/off invalidates neither the Skyframe nor the
   on-disk action cache, and can therefore cause replay of stale or no
   events.

   As a workaround, the flag should be set in the bazelrc. Note that
   some tools may invoke bazel without the bazelrc, which can cause this
   problem to occur even so.

This was highly requested for Apple/iOS development to make tool
warnings more visible to developers, and this implementation may be
sufficient for that use case.

Changes to the output filter are taken into account. However, some
actions opt-in to unconditionally report their output, which is not
supported if the flag is enabled. Currently, that only applies to test
actions.

Personally, I think action out/err replay should be enabled by default.
I find the current behavior highly unintuitive because the visibility of
warning messages depends on the state of the various caches which are
intentionally opaque. That would also address (4) above.

PiperOrigin-RevId: 292344740
  • Loading branch information
ulfjack authored and copybara-github committed Jan 30, 2020
1 parent 9f75340 commit f20b33c
Show file tree
Hide file tree
Showing 12 changed files with 248 additions and 38 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -25,9 +25,11 @@ public final class ActionLogBufferPathGenerator {
private final AtomicInteger actionCounter = new AtomicInteger();

private final Path actionOutputRoot;
private final Path persistentActionOutputRoot;

public ActionLogBufferPathGenerator(Path actionOutputRoot) {
public ActionLogBufferPathGenerator(Path actionOutputRoot, Path persistentActionOutputRoot) {
this.actionOutputRoot = actionOutputRoot;
this.persistentActionOutputRoot = persistentActionOutputRoot;
}

/**
Expand All @@ -39,4 +41,11 @@ public FileOutErr generate(ArtifactPathResolver resolver) {
resolver.convertPath(actionOutputRoot.getRelative("stdout-" + actionId)),
resolver.convertPath(actionOutputRoot.getRelative("stderr-" + actionId)));
}

/** Generates a unique filename for an action to store its output. */
public FileOutErr persistent(String actionKey, ArtifactPathResolver resolver) {
return new FileOutErr(
resolver.convertPath(persistentActionOutputRoot.getRelative("stdout-" + actionKey)),
resolver.convertPath(persistentActionOutputRoot.getRelative("stderr-" + actionKey)));
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -192,6 +192,10 @@ public Path getActionTempsDirectory(Path execRoot) {
return execRoot.getRelative(getRelativeOutputPath()).getRelative("_tmp/actions");
}

public Path getPersistentActionOutsDirectory(Path execRoot) {
return execRoot.getRelative(getRelativeOutputPath()).getRelative("_actions");
}

/** Returns the installed embedded binaries directory, under the shared installBase location. */
public Path getEmbeddedBinariesRoot() {
return serverDirectories.getEmbeddedBinariesRoot();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -446,6 +446,15 @@ public boolean useTopLevelTargetsForSymlinks() {
help = "If this flag is set, use a fork join pool in the abstract queue visitor.")
public boolean useForkJoinPool;

@Option(
name = "experimental_replay_action_out_err",
defaultValue = "false",
documentationCategory = OptionDocumentationCategory.UNDOCUMENTED,
metadataTags = OptionMetadataTag.EXPERIMENTAL,
effectTags = {OptionEffectTag.EXECUTION},
help = "If this flag is set, replay action out/err on incremental builds.")
public boolean replayActionOutErr;

/**
* Converter for jobs: Takes keyword ({@value #FLAG_SYNTAX}). Values must be between 1 and
* MAX_JOBS.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,6 @@
import com.google.devtools.build.lib.skyframe.SkyframeExecutor;
import com.google.devtools.build.lib.util.AbruptExitException;
import com.google.devtools.build.lib.util.LoggingUtil;
import com.google.devtools.build.lib.vfs.FileSystemUtils;
import com.google.devtools.build.lib.vfs.ModifiedFileSet;
import com.google.devtools.build.lib.vfs.OutputService;
import com.google.devtools.build.lib.vfs.Path;
Expand Down Expand Up @@ -447,10 +446,16 @@ private void createActionLogDirectory() throws ExecutorInitException {
}

try {
FileSystemUtils.createDirectoryAndParents(directory);
directory.createDirectoryAndParents();
} catch (IOException e) {
throw new ExecutorInitException("Couldn't create action output directory", e);
}

try {
env.getPersistentActionOutsDirectory().createDirectoryAndParents();
} catch (IOException e) {
throw new ExecutorInitException("Couldn't create persistent action output directory", e);
}
}

/**
Expand Down Expand Up @@ -674,11 +679,11 @@ private Builder createBuilder(
ModifiedFileSet modifiedOutputFiles) {
BuildRequestOptions options = request.getBuildOptions();

Path actionOutputRoot = env.getActionTempsDirectory();
skyframeExecutor.setActionOutputRoot(
env.getActionTempsDirectory(), env.getPersistentActionOutsDirectory());

Predicate<Action> executionFilter =
CheckUpToDateFilter.fromOptions(request.getOptions(ExecutionOptions.class));

skyframeExecutor.setActionOutputRoot(actionOutputRoot);
ArtifactFactory artifactFactory = env.getSkyframeBuildView().getArtifactFactory();
return new SkyframeBuilder(
skyframeExecutor,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -475,6 +475,10 @@ public Path getActionTempsDirectory() {
return getDirectories().getActionTempsDirectory(getExecRoot());
}

public Path getPersistentActionOutsDirectory() {
return getDirectories().getPersistentActionOutsDirectory(getExecRoot());
}

/**
* Returns the working directory of the {@code blaze} client process.
*
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@
import com.google.devtools.build.lib.causes.Cause;
import com.google.devtools.build.lib.causes.LabelCause;
import com.google.devtools.build.lib.clock.BlazeClock;
import com.google.devtools.build.lib.cmdline.Label;
import com.google.devtools.build.lib.cmdline.LabelSyntaxException;
import com.google.devtools.build.lib.cmdline.PackageIdentifier;
import com.google.devtools.build.lib.collect.compacthashset.CompactHashSet;
Expand Down Expand Up @@ -777,7 +778,8 @@ private ActionExecutionValue checkCacheAndExecuteIfNeeded(
metadataHandler,
actionStartTime,
state.allInputs.actionCacheInputs,
clientEnv);
clientEnv,
pathResolver);
}

if (state.token == null) {
Expand Down Expand Up @@ -906,7 +908,8 @@ private ActionExecutionValue checkCacheAndExecuteIfNeeded(
expandedFilesets,
ImmutableMap.copyOf(state.topLevelFilesets),
state.actionFileSystem,
skyframeDepsResult);
skyframeDepsResult,
env.getListener());
ActionExecutionValue result;
try {
result =
Expand Down Expand Up @@ -1305,10 +1308,21 @@ static boolean actionDependsOnBuildId(Action action) {
|| action instanceof NotifyOnActionCacheHit;
}

/** All info/warning messages associated with actions should be always displayed. */
@Override
public String extractTag(SkyKey skyKey) {
return null;
// The return value from this method is only applied to non-error, non-debug events that are
// posted through the EventHandler associated with the SkyFunction.Environment. For those
// events, this setting overrides whatever tag is set.
//
// If action out/err replay is enabled, then we intentionally post through the Environment to
// ensure that the output is replayed on subsequent builds. In that case, we need this to be the
// action owner's label.
//
// Otherwise, Events from action execution are posted to the global Reporter rather than through
// the Environment, so this setting is ignored. Note that the SkyframeActionExecutor manually
// checks the action owner's label against the Reporter's output filter in that case, which has
// the same effect as setting it as a tag on the corresponding event.
return Label.print(((ActionLookupData) skyKey).getActionLookupKey().getLabel());
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,7 @@
import com.google.devtools.build.lib.concurrent.Sharder;
import com.google.devtools.build.lib.concurrent.ThrowableRecordingRunnableWrapper;
import com.google.devtools.build.lib.events.Event;
import com.google.devtools.build.lib.events.EventHandler;
import com.google.devtools.build.lib.events.ExtendedEventHandler;
import com.google.devtools.build.lib.events.Reporter;
import com.google.devtools.build.lib.profiler.Profiler;
Expand Down Expand Up @@ -207,6 +208,7 @@ enum ProgressEventBehavior {
private boolean useAsyncExecution;
private boolean strictConflictChecks;
private boolean hadExecutionError;
private boolean replayActionOutErr;
private MetadataProvider perBuildFileCache;
private ActionInputPrefetcher actionInputPrefetcher;
/** These variables are nulled out between executions. */
Expand Down Expand Up @@ -436,6 +438,7 @@ void prepareForExecution(
this.useAsyncExecution = options.getOptions(BuildRequestOptions.class).useAsyncExecution;
this.strictConflictChecks = options.getOptions(BuildRequestOptions.class).strictConflictChecks;
this.finalizeActions = options.getOptions(BuildRequestOptions.class).finalizeActions;
this.replayActionOutErr = options.getOptions(BuildRequestOptions.class).replayActionOutErr;
this.outputService = outputService;
RemoteOptions remoteOptions = options.getOptions(RemoteOptions.class);
this.bazelRemoteExecutionEnabled = remoteOptions != null && remoteOptions.isRemoteEnabled();
Expand Down Expand Up @@ -636,9 +639,23 @@ public ActionExecutionContext getContext(
Map<Artifact, ImmutableList<FilesetOutputSymlink>> expandedFilesets,
ImmutableMap<Artifact, ImmutableList<FilesetOutputSymlink>> topLevelFilesets,
@Nullable FileSystem actionFileSystem,
@Nullable Object skyframeDepsResult) {
FileOutErr fileOutErr = actionLogBufferPathGenerator.generate(
ArtifactPathResolver.createPathResolver(actionFileSystem, executorEngine.getExecRoot()));
@Nullable Object skyframeDepsResult,
ExtendedEventHandler skyframeCachingEventHandler) {
ArtifactPathResolver artifactPathResolver =
ArtifactPathResolver.createPathResolver(actionFileSystem, executorEngine.getExecRoot());
FileOutErr fileOutErr;
if (replayActionOutErr) {
String actionKey = action.getKey(actionKeyContext);
fileOutErr = actionLogBufferPathGenerator.persistent(actionKey, artifactPathResolver);
try {
fileOutErr.getErrorPath().delete();
fileOutErr.getOutputPath().delete();
} catch (IOException e) {
throw new IllegalStateException(e);
}
} else {
fileOutErr = actionLogBufferPathGenerator.generate(artifactPathResolver);
}
return new ActionExecutionContext(
executorEngine,
createFileCache(metadataProvider, actionFileSystem),
Expand All @@ -647,7 +664,9 @@ public ActionExecutionContext getContext(
metadataHandler,
lostInputsCheck(actionFileSystem, action, outputService),
fileOutErr,
selectEventHandler(progressEventBehavior),
replayActionOutErr && progressEventBehavior.equals(ProgressEventBehavior.EMIT)
? skyframeCachingEventHandler
: selectEventHandler(progressEventBehavior),
clientEnv,
topLevelFilesets,
new ArtifactExpanderImpl(expandedInputs, expandedFilesets),
Expand All @@ -667,7 +686,8 @@ Token checkActionCache(
MetadataHandler metadataHandler,
long actionStartTime,
List<Artifact> resolvedCacheArtifacts,
Map<String, String> clientEnv)
Map<String, String> clientEnv,
ArtifactPathResolver pathResolver)
throws ActionExecutionException {
Token token;
try (SilentCloseable c = profiler.profile(ProfilerTask.ACTION_CHECK, action.describe())) {
Expand Down Expand Up @@ -697,6 +717,20 @@ Token checkActionCache(
eventPosted = true;
}

if (replayActionOutErr) {
// TODO(ulfjack): This assumes that the stdout/stderr files are unmodified. It would be
// better to integrate them with the action cache and rerun the action when they change.
String actionKey = action.getKey(actionKeyContext);
FileOutErr fileOutErr = actionLogBufferPathGenerator.persistent(actionKey, pathResolver);
// Set the mightHaveOutput bit in FileOutErr. Otherwise hasRecordedOutput() doesn't check if
// the file exists and just returns false.
fileOutErr.getOutputPath();
fileOutErr.getErrorPath();
if (fileOutErr.hasRecordedOutput()) {
dumpRecordedOutErr(eventHandler, action, fileOutErr);
}
}

if (action instanceof NotifyOnActionCacheHit) {
NotifyOnActionCacheHit notify = (NotifyOnActionCacheHit) action;
ExtendedEventHandler contextEventHandler =
Expand Down Expand Up @@ -1133,11 +1167,15 @@ private ActionExecutionValue actuallyCompleteAction(
// The .showOutput() method is not necessarily a quick check: in its
// current implementation it uses regular expression matching.
FileOutErr outErrBuffer = actionExecutionContext.getFileOutErr();
if (outErrBuffer.hasRecordedOutput()
&& (action.showsOutputUnconditionally()
|| reporter.showOutput(Label.print(action.getOwner().getLabel())))) {
dumpRecordedOutErr(action, outErrBuffer);
outputAlreadyDumped = true;
if (outErrBuffer.hasRecordedOutput()) {
if (replayActionOutErr) {
dumpRecordedOutErr(actionExecutionContext.getEventHandler(), action, outErrBuffer);
outputAlreadyDumped = true;
} else if ((action.showsOutputUnconditionally()
|| reporter.showOutput(Label.print(action.getOwner().getLabel())))) {
dumpRecordedOutErr(reporter, action, outErrBuffer);
outputAlreadyDumped = true;
}
}

MetadataHandler metadataHandler = actionExecutionContext.getMetadataHandler();
Expand Down Expand Up @@ -1618,7 +1656,7 @@ private void printError(String message, Action action, FileOutErr actionOutput)
message = "Couldn't " + describeAction(action) + ": " + message;
}
Event event = Event.error(action.getOwner().getLocation(), message);
dumpRecordedOutErr(event, actionOutput);
dumpRecordedOutErr(reporter, event, actionOutput);
recordExecutionError();
}
}
Expand All @@ -1640,13 +1678,14 @@ private static String describeAction(Action action) {
* @param action The action whose output is being dumped
* @param outErrBuffer The OutErr that recorded the actions output
*/
private void dumpRecordedOutErr(Action action, FileOutErr outErrBuffer) {
StringBuilder message = new StringBuilder("");
message.append("From ");
message.append(action.describe());
message.append(":");
Event event = Event.info(message.toString());
dumpRecordedOutErr(event, outErrBuffer);
private void dumpRecordedOutErr(
EventHandler eventHandler, Action action, FileOutErr outErrBuffer) {
Event event =
replayActionOutErr
// Info events are not cached in Skyframe, so we make this a warning.
? Event.warn("From " + action.describe() + ":")
: Event.info("From " + action.describe() + ":");
dumpRecordedOutErr(eventHandler, event, outErrBuffer);
}

/**
Expand All @@ -1655,20 +1694,17 @@ private void dumpRecordedOutErr(Action action, FileOutErr outErrBuffer) {
* @param prefixEvent An event to post before dumping the output
* @param outErrBuffer The OutErr that recorded the actions output
*/
private void dumpRecordedOutErr(Event prefixEvent, FileOutErr outErrBuffer) {
private void dumpRecordedOutErr(
EventHandler eventHandler, Event prefixEvent, FileOutErr outErrBuffer) {
// Only print the output if we're not winding down.
if (isBuilderAborting()) {
return;
}
if (outErrBuffer != null && outErrBuffer.hasRecordedOutput()) {
// Bind the output to the prefix event.
// Note: here we temporarily (until the event is handled by the UI) read all
// output into memory; as the output of regular actions (as opposed to test runs) usually is
// short, so this should not be a problem. If it does turn out to be a problem, we have to
// pass the outErrbuffer instead.
reporter.handle(prefixEvent.withStdoutStderr(outErrBuffer));
eventHandler.handle(prefixEvent.withStdoutStderr(outErrBuffer));
} else {
reporter.handle(prefixEvent);
eventHandler.handle(prefixEvent);
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1446,9 +1446,16 @@ public void setClientEnv(Map<String, String> clientEnv) {
}

/** Sets the path for action log buffers. */
@Deprecated
public void setActionOutputRoot(Path actionOutputRoot) {
setActionOutputRoot(actionOutputRoot, actionOutputRoot);
}

/** Sets the path for action log buffers. */
public void setActionOutputRoot(Path actionOutputRoot, Path persistentActionOutputRoot) {
Preconditions.checkNotNull(actionOutputRoot);
this.actionLogBufferPathGenerator = new ActionLogBufferPathGenerator(actionOutputRoot);
this.actionLogBufferPathGenerator =
new ActionLogBufferPathGenerator(actionOutputRoot, persistentActionOutputRoot);
this.skyframeActionExecutor.setActionLogBufferPathGenerator(actionLogBufferPathGenerator);
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -314,7 +314,9 @@ public void initializeSkyframeExecutor(boolean doPackageLoadingChecks) throws Ex
useConfiguration();
setUpSkyframe();
this.actionLogBufferPathGenerator =
new ActionLogBufferPathGenerator(directories.getActionTempsDirectory(getExecRoot()));
new ActionLogBufferPathGenerator(
directories.getActionTempsDirectory(getExecRoot()),
directories.getPersistentActionOutsDirectory(getExecRoot()));
}

public void initializeMockClient() throws IOException {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -223,7 +223,7 @@ protected Builder createBuilder(

Path actionOutputBase = scratch.dir("/usr/local/google/_blaze_jrluser/FAKEMD5/action_out/");
skyframeActionExecutor.setActionLogBufferPathGenerator(
new ActionLogBufferPathGenerator(actionOutputBase));
new ActionLogBufferPathGenerator(actionOutputBase, actionOutputBase));

MetadataProvider cache =
new SingleBuildFileCache(rootDirectory.getPathString(), scratch.getFileSystem());
Expand Down
7 changes: 7 additions & 0 deletions src/test/py/bazel/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -216,6 +216,13 @@ py_test(
deps = [":test_base"],
)

py_test(
name = "action_replay_test",
size = "medium",
srcs = ["action_replay_test.py"],
deps = [":test_base"],
)

py_test(
name = "genrule_test",
size = "medium",
Expand Down
Loading

0 comments on commit f20b33c

Please sign in to comment.