-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for invoking bazel with args from file #10796
Add support for invoking bazel with args from file #10796
Conversation
Todo: Windows quoting seems to be different and happen in WindowsEscapeArg somewhere. Need to test.
👋My first time contributing to the bazel client, C++ isn't my area of expertise. Although I realise the PR isn't in a completely mergeable state (missing tests/documentation), I wanted to open a PR earlier rather than later to ask for feedback (so I don't have to rewrite tests/code etc for code that might need a bit of an overhaul). I'll leave some comments on the files I have open questions/doubts about, please give it a look! |
const std::string &command, const std::vector<std::string> &command_args, | ||
const std::string &invocation_policy, | ||
bool uses_param_file, const std::string &invocation_policy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe refactor Communicate
to take a const OptionProcessor *option_processor
, as the first three arguments are currently:
server->Communicate(
option_processor.GetCommand(),
option_processor.GetCommandArguments(),
option_processor.UsingParamFile()
@@ -715,6 +715,20 @@ static void RunBatchMode( | |||
|
|||
jvm_args_vector.insert(jvm_args_vector.end(), command_arguments.begin(), | |||
command_arguments.end()); | |||
if (option_processor.UsingParamFile()) { | |||
blaze_util::Path param_file = | |||
blaze_util::Path(startup_options.output_base).GetRelative("params"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if these param files are safe to be written to output_base, or if they need unique locations to support multiple clients using the same output_base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll probably want to dig around the client locking code to sort out how multiple clients would react to this.
content = blaze_util::Quoted(*i) : | ||
content += '\n' + blaze_util::Quoted(*i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The java argument file format takes all whitespace as separate arguments, so we need to appropriately quote each.
https://docs.oracle.com/javase/9/tools/java.htm#JSWOR-GUID-4856361B-8BFD-4964-AE84-121F5F6CF111
std::string Quoted(const std::string s) | ||
{ | ||
std::ostringstream ss; | ||
ss << std::quoted( s.c_str() ); | ||
return ss.str(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if there's a better way to achieve quoting that java expects the argfile to have.
|
||
string parameter_file = 7; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a oneof
with repeated bytes arg = 2;
, that way it is clear for the java code which field to parse. Right now the java code attempts to parse field 2, if its empty it attempts to parse this.
I'm not sure where this proto file is used, and if this would be backward compatible change. Any tips on how to do this best welcome.
thanks for your contribution! This seems like a pretty cool feature to have. I'll try to find you a reviewer for this! |
args = builder.build(); | ||
} catch (IOException e) { | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need to modify the Java code?
I think it would be simpler to open the file and do the expansion in option_processor.cc
.
(also skip the change to the proto file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, after doing the expansion in the cpp client we still need to send it to the java server. As the server/client communicate over gRPC this still has a message size limit, see #8609 (comment) for some more context.
The only solution I found for that is making sure the client writes the arguments to a file and instructs the server to read that file, for which we'd need a proto change (or implicitly encode this somehow in the existing field for args).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm told that the gRPC limit is entirely artificial and we can just increase it.
Call .maxInboundMessageSize(1<<30) on NettyServerBuilder here:
NettyServerBuilder.forAddress(address).addService(this).directExecutor().build().start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep a sensible limit on the max inbound message size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michajlo what would you consider a sensible limit? 10/100s of megabytes?
What I like about the file approach is that it doesn't impose another seemingly arbitrarily chosen limit. What I'd like to avoid is ending up with a choice that needs to be increased 6months or a year down the line. Maybe the limit should be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably 10s? Generous arbitrary limits are nice because they provide some healthy pushback to prevent folks from wandering into the minefield of less well defined limits.
I don't think we should assume that the bazel client and bazel server are always running on the same machine (and sharing the same filesystem), so changing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately given the client-server setup and the misc paths to starting up a server or initiating requests, param files can wind up getting tricky. This might require a larger design discussion to make sure we don't accidentally paint ourselves into any corners. I'm happy to review and help you with the design if you decide to take it on, but I'm also going to offer what might be a more direct route to solving your initial problem over on #8609.
for (const string &arg : arg_vector) { | ||
request.add_arg(arg); | ||
if (uses_param_file) { | ||
blaze_util::Path param_file = output_base_.GetRelative("params"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can definitely be races between clients here - we release the client-side lock right after sending the request, so by the time the file is being parsed on the server it could have been replaced by another file.
@@ -715,6 +715,20 @@ static void RunBatchMode( | |||
|
|||
jvm_args_vector.insert(jvm_args_vector.end(), command_arguments.begin(), | |||
command_arguments.end()); | |||
if (option_processor.UsingParamFile()) { | |||
blaze_util::Path param_file = | |||
blaze_util::Path(startup_options.output_base).GetRelative("params"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll probably want to dig around the client locking code to sort out how multiple clients would react to this.
args = builder.build(); | ||
} catch (IOException e) { | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably 10s? Generous arbitrary limits are nice because they provide some healthy pushback to prevent folks from wandering into the minefield of less well defined limits.
using_param_file = true; | ||
string filename = args[1].substr(1, args[1].length()-1); | ||
string contents; | ||
if (!blaze_util::ReadFile(filename, &contents)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid io in SplitCommandLine, and generally as much as possible in options processing besides for rc handling. Can this be lifted to a different abstraction layer? This also avoids mutating args, which could get confusing.
Thanks all for the comments! There's definitely some extra investigation to do here, I agree with @michajlo that it is probably a good idea to have a larger design discussion around it. However, I think that might take some more time and I would like to have a reasonable resolution for #8609 soon. The alternative suggested (a |
An alternative to bazelbuild#10796 to fix bazelbuild#8609. As bazelbuild#8609 (comment) points out there's currently a --query_file to read a query pattern from file, but there's no analogous feature available for build/test. Generic parameter file support attempted in bazelbuild#10796 turns out to be a little more complex than expected due to bazel's client/server architecture and likely requires some design document. This PR would fix help bazelbuild#8609 without rushing into any designs, so we can spend time properly designing generic parameter files while also alleviating the immediate problem with a known/simpler pattern.
An alternative to #10796 to fix #8609. As #8609 (comment) points out there's currently a `--query_file` to read a query pattern from file, but there's no analogous feature available for build/test. Generic parameter file support attempted in #10796 turns out to be a little more complex than expected due to bazel's client/server architecture and likely requires some design document. This PR would fix help #8609 without rushing into any designs, so we can spend time properly designing generic parameter files while also alleviating the immediate problem with a known/simpler pattern. Closes #10856. PiperOrigin-RevId: 298953350
With #10856 landed should this be closed? |
Yes #10856 fixes my immediate issue for #8609. I don't have the time right now to dive into general argfile support and writeup a proposal. Closing out this PR to reflect I'm not actively working on it. If anyone ends up looking at argfile support in the future, I hope this points them in the right direction :). |
An alternative to bazelbuild#10796 to fix bazelbuild#8609. As bazelbuild#8609 (comment) points out there's currently a `--query_file` to read a query pattern from file, but there's no analogous feature available for build/test. Generic parameter file support attempted in bazelbuild#10796 turns out to be a little more complex than expected due to bazel's client/server architecture and likely requires some design document. This PR would fix help bazelbuild#8609 without rushing into any designs, so we can spend time properly designing generic parameter files while also alleviating the immediate problem with a known/simpler pattern. Closes bazelbuild#10856. PiperOrigin-RevId: 298953350
Fixes #8609. Also see https://groups.google.com/d/msg/bazel-discuss/895-OjU2Z2A/mYlw05rpDgAJ. Allows Bazel to be invoked with all arguments coming from a file rather than using commandline arguments. For example:
With the current implementation: only one argfile is allowed, it should be prefixed with an
@
and the first argument (argv[1]). Argfiles should be newline separated (\n
), one argument per line.TODO:
command_server.proto
changes/semantics when using argfile.