util: add util.parseArgs() #35015
Nice, I didn't know if this was ever going to actually happen or not.
There's an IEEE standard for this, which might help.
You mention the possibility of renaming the
Hope to have time for a more thorough review later, but I do agree that command-line argument parsing is a fundamental capability that should be readily available. What I've noticed is that there are basically three major styles:
@DerekNonGeneric Thanks, didn't know about the IEEE thing. I can adopt this terminology if we think it'd be more clear. |
lib/internal/process/parse_args.js
Outdated
    'object',
    options);
}
if (typeof options.expectsValue === 'string') {
is there a reason for this overload? it seems somewhat nonsensical.
It's friendly to support a string if the value can be singular, instead of requiring an array. This sort of behavior is very common.
that just seems confusing to me. can we remove it?
I'd rather not!
This overload seems to be a category error, it should be removed.
I'm -0 on having this overload: it doesn't really bother me, but since we should rename `expectsValue` to something else, and if we rename to something plural (like `optionsWithValues`), it might make more sense/be more intuitive to have this as an Array.
For yargs' chaining API, I like that I can provide options as either an array or string. If I only have a couple options, I'll usually write:
yargs
.boolean('option-1')
.boolean('option-2')
But, if I have many options that I'm configuring, I might write:
yargs.boolean(['option-1', 'option-2', 'option-3', 'option-4'])
I don't have as strong of an opinion in this case, given that this API isn't chaining like yargs (you configure all the options in one go anyway).
I don't think either implementation would be a major usability issue, and am supportive of whatever we land on.
Haven't decided yet where I land on "accept string or not", but if we don't accept string it should throw when string is passed, otherwise we'll get situations where the string is treated as an array (`"foo"` getting treated as `['f', 'o', 'o']` leads to hard to debug issues).
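The pitfall is easy to reproduce. Below is a minimal sketch, assuming a hypothetical helper named `normalizeExpectsValue` (it is not part of this PR), which either accepts a string explicitly or rejects it, so that a string is never silently iterated character by character:

```javascript
// Iterating a string yields its characters, which is the failure mode
// described above when a string is treated as an array of option names.
const accidental = Array.from('foo');

// Hypothetical helper (not part of the PR): normalize `expectsValue`
// to an array of strings, or throw on anything else.
function normalizeExpectsValue(value, { allowString = true } = {}) {
  if (Array.isArray(value)) {
    for (const name of value) {
      if (typeof name !== 'string') {
        throw new TypeError('expectsValue entries must be strings');
      }
    }
    return value.slice();
  }
  if (typeof value === 'string' && allowString) {
    return [value];
  }
  throw new TypeError('expectsValue must be an array of strings');
}
```

With `allowString: false`, passing `'foo'` throws instead of being split, which matches the "throw when string is passed" position; with the default, it becomes `['foo']`.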
I'm -0 too. Either is fine by me, but I guess if it was up to me, I'd prefer to not permit a string initially, because adding it later is no problem but taking it away once it's out there is a big problem. That said, I doubt we'd have to/want to take it away, hence my -0 rather than -1.
I guess I am also a member of the -0 club.
lib/internal/process/parse_args.js
Outdated
const arg = argv[pos];
if (arg === undefined) {
  return result;
}
Might be safer to remove the `_` checks below and move them up here, maybe something like `if (/^-{0,2}_$/.test(arg)) continue`.
can't just do that, because `--_` is ignored, `--_=foo` is ignored, and `--_ foo` is ignored (in the case of `_` in `expectsValue`), so we need to parse our array further before just skipping ahead (or we introduce ambiguity). We could throw if `_` is in `expectsValue` and make this somewhat easier.
doc/api/process.md
Outdated
object supporting the following property:
* `expectsValue` {Array<string>|string} (Optional) One or more argument
strings which _expect a value_ when present
* Returns: {Object} An object having properties corresponding to parsed Options
why not just return `{ options: { ... }, positionals: [...] }`? it gets rid of the awkward `_` handling and is imo much cleaner than magic properties.
I'm not opposed to this, but it is based on userland conventions. Personally I would prefer the magic over nested properties, but that's just me (as a potential consumer of this API).
Yeah I've seen the userland libraries. I'd like to think we can do better 😄
I think I'm with Gus here. Userland convention originated before destructuring, so at the time having a `_` property could be considered better/more attractive than returning `{ options: { ... }, positionals: [...] }`. The explicit object return is, IMO, more intuitive and user-friendly:
const { options, positionals } = processArgs()
`_` might be somewhat familiar to users used to yargs, commander etc, but it won't be intuitive for new users.
I think that `const { options, positionals } = processArgs()` is a pretty elegant compromise, I never loved `_` 😆
This seems like something that should live off of |
+1 that we can do way better than the magic _
Would like someone else to weigh in on the |
I'm interested in feedback on all "possible alternatives" and questions listed in the description. Consolidating it into a list here:
In addition, I would like further input on these:
When answering these, please consider the expected users of this API and the developer experience. What would be easiest to understand at first glance? Once familiar, will the API be ergonomic or tedious? |
@devsnek I will not be changing the API to force |
I don't think confusing apis provide a smoother developer experience. |
@devsnek There's nothing confusing about the API; as I mentioned, this is widely supported across userland. |
Sounds like @nodejs/tsc may need to weigh in. |
Some comments on the issue summary first, will look at the code later.
How much more complicated is it to implement this behavior, and why is it more complicated vs more complex?
String type when there's no repetition seems fine.
I've never seen this behavior in any CLI apps (Node.js or otherwise); IMO it can be removed.
Python uses the terminology "arguments" (https://docs.python.org/3/library/argparse.html), but we don't have to use the same terminology. We need to use a terminology, the one you proposed seems as good as any, as long as we are consistent across our documentation (including documentation about Node.js CLI options) about it we're good.
We tend to avoid reordering things unless necessary because it makes git archaeology harder. |
(still reviewing, commenting on the docs first so I don't lose my comments)
doc/api/process.md
Outdated
* `argv` {Array<string>|Object} (Optional) Array of argument strings; defaults
to [`process.argv.slice(2)`](process_argv). If an Object, the default is used,
and this parameter is considered to be the `options` parameter.
* `options` {Object} (Optional) The `options` parameter, if present, is an
object supporting the following property:
* `expectsValue` {Array<string>|string} (Optional) One or more argument
strings which _expect a value_ when present
That's not how we usually write optional parameters ("If an..., it is ...")
I'm worried the over verbosity might make it more complicated to understand.
I've addressed this somewhat, but I find the linked documentation too terse. 🤷
doc/api/process.md
Outdated
following a `--` (e.g., `['--', 'script.js']`)
* Positionals appear in the Array property `_` of the returned object
* The `_` property will _always_ be present and an Array, even if empty
* If present in the `argv` Array, `--` is discarded and is omitted from the
I think some userland lib has this behavior and I remember finding it very confusing. If users want to pass through everything after `--` to a child process, there's no way of knowing which positionals were after `--`.
This is a good point, though I'm unsure if it's really common enough to worry about here... again, if this API doesn't cut it, there are great userland libraries that will (though, I'm not sure if they do in this case).
though I'm unsure if it's really common enough to worry about here
It's more common than the count behavior :)
IMO if a flag with |
I think the proper solution would be a system you have to pass in all the args you expect, whether they're flags or options, if it's an option whether it's required, whether certain positional arguments are required, etc. |
@devsnek I kinda agree, this API is kind of a mix of low level and high level API: high level due to the magic bits, but low level because the user has to implement so many boilerplate things that are common on CLI tools (--help, options validation, type conversion, etc.). The way it is implemented today it would be possible to add a "strict" mode in the future though, or a more comprehensive parser builder. The API clearly has a goal of simplicity and is not intended to fully replace existing parser modules, nor is intended to cater for more complex use cases without offloading the heavy work to the user, or at least that's my understanding. |
Regarding the "count" feature: Regarding returning diff't data structures (a string for singular option values and an array for multiples), I would rather @bcoe could speak to the relative wisdom of such an API, because that is Regarding whether options should be explicitly defined... maybe, but, a better developer experience is to not throw on unknown options, and let the API consumer decide what to do with them. In a CLI app, throwing an exception should be avoided wherever possible--end-users typically don't need to see a stack trace, and whether the exception message would actually provide actionable context is a crapshoot. I think we'd rather let the API consumer decide how to express such an error (or silently accept unknown options). But otherwise, I'm unclear on what explicit definition of boolean-flag-style options really buys an API consumer. If it doesn't throw (and IMO it should not), should this API silently ignore the unknown options? The API consumer would not know if unknown options were passed, unless we threw them into another bucket property, e.g. Also: @nodejs/tooling in case you missed it |
lib/internal/process/parse_args.js
Outdated
if (typeof options !== 'object' || options === null) {
  throw new ERR_INVALID_ARG_TYPE(
    'options',
    'object',
    options);
}
- if (typeof options !== 'object' || options === null) {
-   throw new ERR_INVALID_ARG_TYPE(
-     'options',
-     'object',
-     options);
- }
+ validateObject(options, 'options');
(from `internal/validators`)
lib/internal/process/parse_args.js
Outdated
if (arg === undefined) {
  return result;
}
if (StringPrototypeStartsWith(arg, '-')) {
perhaps
- if (StringPrototypeStartsWith(arg, '-')) {
+ if (arg[0] === '-') {
? (and same for other single-char startsWith use-cases)
Many popular languages don't have argument parsers in their standard libraries, including Java, R, C, C++, C#, etc. But a few standard libraries do have built-in option parsers.
All popular languages provide option parsers as external libraries. Node.js probably has hundreds via npm, C# has System.CommandLine. Java has many, including Apache Commons CLI. R has optparse. So it seems to me that any modern framework either brings no option parser with it, or a much more powerful parser than the one being proposed here. Also, I am afraid that adding this to Node.js might reduce awareness for better solutions in the ecosystem. See also Sam's concerns in nodejs/tooling#19 (comment), nodejs/tooling#19 (comment), nodejs/tooling#19 (comment). |
With the approach on this PR the developer has to validate all options through |
I would like to chime in from the perspective of someone who is both supportive of this functionality and leads development of yargs (one of the most widely adopted argument parsers in the npm ecosystem).
Thinking from the perspective of my day job
For my job, I work at Google as a DPE. We behave as customer zero for Google Cloud Products:
It's my work around samples that has me advocating for a built-in command line argument parser in Node.js. It allows you to provide an elegant snippet of code to folks that just works:
const {projectId, datasetName} = process.parseArgs()
const bigqueryClient = new BigQuery({
  projectId
});
const [dataset] = await bigqueryClient.createDataset(datasetName);
console.log(`Dataset ${dataset.id} created.`);
It creates confusion for users that they need to bring other dependencies to the table to use an exemplary snippet of code (beyond the code you're writing an example for).
Thinking from the perspective of a Node.js tooling author
Node.js is a powerful platform, and I would love to be able to do more with it without pulling in dependencies: Node.js has great http built-ins, fs built-ins, encryption built-ins, etc., and yet when I write a new library I tend to pull in dependencies like:
{
  "scripts": {
    "precompile": "rimraf ./build",
    "compile": "... some build step"
  }
}
Python exposes a variety of useful built-ins as modules, and allows you to run them as command line applications:
python -m http.server 8000 --bind 127.0.0.1
What if I could do this in Node.js?
Having a built-in command line argument parser, I'm convinced, could help unlock a variety of interesting ideas like this.
In terms of opinions for repeated flags: I have seen too many bugs caused by repeated query params breaking apps that didn't bother to check for isArray. In practice this kind of "almost always a string but if you pass 2+ it becomes an array" leads to one of two things: super verbose defensive code that needs to type check every step of the way OR tools that throw errors because the file named "[a,b]" couldn't be found (or even worse "x.substring isn't a function"). So... I personally would prefer if support for repetition would be dropped (it's the much less often used kind afaict), made explicit via a 2nd option, or if it's always an array and never a string. |
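The "always an array, never a string" alternative can be sketched as follows. This is only an illustration, assuming a hypothetical `collectValues` helper (not part of this PR) that receives already-tokenized `[name, value]` pairs:

```javascript
// Hypothetical helper: group option values so every option maps to an
// array, whether it was passed once or several times.
function collectValues(pairs) {
  const result = Object.create(null);
  for (const [name, value] of pairs) {
    if (result[name] === undefined) {
      result[name] = [];
    }
    result[name].push(value);
  }
  return result;
}

const once = collectValues([['require', 'c.js']]);
const twice = collectValues([['require', 'a.js'], ['require', 'b.js']]);

// Consumers can always iterate without an Array.isArray() check.
for (const file of once.require) {
  console.log(file);
}
```

The cost is that the common single-value case needs an index (`once.require[0]`), but the type of the result never depends on what the end user typed.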
Left a few initial nits, but I feel like this is an awesome start.
I don't think it's uncommon enough for us to drop it (personally I've used it on quite a few CLIs I wrote). Always an array seems like a good option for me.
The current proposal supports arrays :D |
Honestly I'm not sure we're ever going to approach "good" behaviour, much less perfect, if so much guessing and coercion has to be involved. |
@boneskull I think we should leave interpretation to the user. The only thing we should absolutely ensure is that, if the option has a parameter, that it makes it into the result as a string so the user can then process it as they see fit. |
@gabrielschulhof So essentially you're proposing:
(Is that correct?) |
It's important that @tniessen gives the green light to this strategy. |
I don't think I suggested any changes directly, at least none that you implemented. I did highlight problems, but I never said I had solutions to these problems (apart from leaving option parsing to the ecosystem). You are looking for an API that should be as simple as possible, should work with basically zero configuration, and should never throw exceptions based on You expect a solution from me, but I don't have one. I simply don't know how to combine the properties I am concerned about (safety and correctness) with the properties you are focusing on (simplicity and not throwing). This is nothing personal @boneskull, and I understand your frustration. Nobody has proposed any complete solutions, and at this point, I don't even know if a good solution exists :)
As you correctly mentioned in #35015 (comment), this is not safe either, but probably at least closer to being correct. However, it introduces return values with unpredictable types, and I don't understand why this behavior would be preferable over throwing an error. I know that you are not considering that an option (no pun intended), but I personally don't get why. To me, it seems illogical to accept values for flags, which, according to the documentation, don't have values. To use the proposed API safely, developers must check whether values were passed for flags (which don't accept values). If a value is permitted for a flag, why didn't the developer declare it as an option, which takes an optional value? If a value is passed for a flag, most applications should probably throw or display an error. In other words, it seems to me that not throwing an error in this API makes safe usage much more difficult for developers, compared to just handling an error. I simply don't see the logic behind this. This is what I assume safe usage would look like:
const { options, positionals } = util.parseArgs({ optionsWithValues: ['foo'] });
// Since Node.js allows passing anything for flags, we need to check every single flag for a value.
// The value might be 'false' or something similar, which would lead to incorrect behavior if ignored.
// Node.js could absolutely do this itself, but it doesn't want to.
for (const name of ['bar', 'baz', 'qux', 'quux', 'quuz', 'corge', 'grault']) {
if (options[name] !== undefined && options[name] !== true) {
console.error(`I did not expect a value for the flag ${name}, but Node.js accepted one.`);
console.error(`I cannot really help you, other than tell you not to specify a value for this flag.`);
process.exit(1);
}
}
// Okay, after manually checking whether Node.js accepted any values for
// flags that should not accept values, we can finally actually use the values.
const { foo, bar, baz, qux, quux, quuz, corge, grault } = options;
// Oh wait, we also need to make sure no unexpected positionals were passed...
// (Not included here)
I honestly don't understand how this is better than simply handling one exception. I know you don't want to consider this, but I am interested in the reason.
If the PR is somewhat close to implementing somewhat correct behavior, and if the resulting unsafe behavior is properly documented, I'll dismiss my request for changes. That doesn't mean I'll approve the PR, but I also won't stand in the way. There must be some way for the API to be used safely. With its current state, I don't see how that would be possible, and adding inherently unsafe APIs to Node.js does not seem logical to me. As much as I am not a fan of the proposed solution, it seems to allow safe usage of the API, assuming developers are aware of the potential pitfalls caused by it, and implement their own error checking. Of course, you can also leave it to the TSC to overrule my request for changes. Maybe others are fine with landing it as it is, despite the resulting safety and correctness issues. |
fwiw I also raised concerns about correctness and safety, but I think those comments got lost when the PR was updated at some point. |
@mmarchini I haven't rebased... what are you referring to? |
Also as a point of order: given @tniessen objection and the fact that to reach a correct and safe implementation we might end up with material changes to the API, I don't think it's worth discussing Gus objection right now (as the code they're objecting might not exist on a future solution). If that piece of code remains once @tniessen objection is either resolved or dismissed, the TSC must reach a decision on Gus' objection in a timely manner. |
@boneskull I'll try to find my comments, but as I said they disappeared (at least in the code view), so it might take a while to find it. |
Won't be pushing any changes until Tuesday. I plan on implementing this and updating the documentation accordingly. |
Ok, so here are my comments related to correctness and safety (although I didn't use those specific words, and I apologize if I wasn't as clear as @tniessen on my concerns):
(note I didn't make it an explicit objection because I do believe we can have a follow up API with more strict/structured format, but I definitely see why others might think a loose API is not ideal in the first place) |
I don't believe balancing these concerns is impossible. OKAY, what about this: We want to know if an end-user gave us weird or unexpected input. A flag having a value, for instance. e.g., Instead of just returning This way, we can:
This means that Likewise, if |
// called via `node script.js --foo bar baz`
const argv = util.parseArgs();

if (argv.foo === true) {
`argv.options.foo`?
yes
Marking objection explicitly to make sure this doesn't fall through the cracks:
`--` shouldn't be discarded, as it makes it impossible for the developer to provide pass-through behavior for their scripts.
What is the behavior of the API today:
> util.parseArgs(['--foo', '--', '--bar'])
{ options: { foo: true }, positionals: ['--bar'] }
What I expect from the API:
> util.parseArgs(['--foo', '--', '--bar'])
{ options: { foo: true }, positionals: ['--', '--bar'] }
My objection is resolved with the above change in behavior.
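For context, pass-through can also be handled by the consumer before any parsing happens. The sketch below is not the behavior proposed in this PR; it assumes a hypothetical `splitAtTerminator` helper that separates a script's own arguments from those meant to be forwarded:

```javascript
// Hypothetical helper: split argv at the first `--` so that everything
// after it can be forwarded untouched (for example, to a child process).
function splitAtTerminator(argv) {
  const index = argv.indexOf('--');
  if (index === -1) {
    return { own: argv.slice(), passthrough: [] };
  }
  return {
    own: argv.slice(0, index),
    passthrough: argv.slice(index + 1),
  };
}

const { own, passthrough } = splitAtTerminator(['--foo', '--', '--bar']);
```

Here `own` would be handed to the parser and `passthrough` to the child process. Keeping `--` in `positionals`, as requested above, lets a consumer recover the same split after parsing instead.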
@mmarchini I think that is reasonable. |
Edited my comment just to clarify that my objection is resolved with the change above (just to avoid confusion). Also, I'm happy with other solutions for the passthrough problem, I only suggested that one because it's the simplest solution I could think of. Other options I find reasonable:
|
) => {
  if (!ArrayIsArray(argv)) {
    options = argv;
    argv = ArrayPrototypeSlice(process.argv, 2);
Defaulting to `process.argv.slice(2)` is fine but I think the implementation should somewhat also consider the eval use case; right now if I try running `node -p|-e` I get a very unhelpful error:
$ node -p 'require('util').parseArgs()' foo bar
internal/validators.js:122
throw new ERR_INVALID_ARG_TYPE(name, 'string', value);
^
TypeError [ERR_INVALID_ARG_TYPE]: The "id" argument must be of type string. Received an instance of Object
at new NodeError (internal/errors.js:253:15)
at validateString (internal/validators.js:122:11)
at Module.require (internal/modules/cjs/loader.js:972:3)
at require (internal/modules/cjs/helpers.js:88:18)
at [eval]:1:1
at Script.runInThisContext (vm.js:132:18)
at Object.runInThisContext (vm.js:309:38)
at internal/process/execution.js:77:19
at [eval]-wrapper:6:22
at evalScript (internal/process/execution.js:76:60) {
code: 'ERR_INVALID_ARG_TYPE'
}
In the past I've done something like `process.argv.slice(require.main ? 2 : 1)` in order to support it (though there might be better ways to handle the check in core).
IMO `parseArgs` should either handle eval/script properly OR at least throw a useful error instead 😊
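The workaround mentioned above can be written as a small pure function. This is a sketch only: `userArgs` is a hypothetical name, and it assumes, as the comment suggests, that `require.main` is unset when code runs under `node -e` or `node -p`:

```javascript
// Hypothetical helper: choose the user-supplied arguments from an argv
// array. With a script, argv is [execPath, scriptPath, ...args]; under
// `node -e` / `node -p` there is no script path, so args start at index 1.
function userArgs(argv, hasMainModule) {
  return argv.slice(hasMainModule ? 2 : 1);
}

// In a CommonJS script one might call it like this:
// const args = userArgs(process.argv, Boolean(require.main));
```

Taking the flag as a parameter keeps the function testable; how core would detect eval mode is left open here.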
@boneskull Why was this closed? |
@kibertoad there are efforts to reopen the discussion around this feature in the new year. |
Awesome! |
For possible interest, I researched what terminology to use consistently in Commander docs last year. I settled on different terms, but did mention these as used elsewhere. Short version, Commander terminology: The command line arguments are made up of options, option-arguments, commands, and command-arguments. Long version: https://github.com/tj/commander.js/blob/master/docs/terminology.md |
Add a function, `util.parseArgs()`, which accepts an array of arguments and returns a parsed object representation thereof.
Ref: nodejs/tooling#19
Checklist
`make -j4 test` (UNIX), or `vcbuild test` (Windows) passes
Motivation
While this has been discussed at length in the Node.js Tooling Group and its associated issue, let me provide a summary:
`process.argv.slice(2)` is rather awkward boilerplate, especially for those new to Node.js
app.js
`process.parseArgs()` makes handling command-line arguments "natively" much easier. Given that so many tools are being written in Node.js, it makes sense to center this experience.
Design Considerations
First, let me acknowledge that there are many ways to parse command-line arguments. There is no standard, cross-platform, agreed-upon convention. Command-line arguments look different in Windows vs. POSIX vs. GNU, and there's much variation across programs. And these are still conventions, not hard & fast requirements. We can easily be paralyzed attempting to choose what "styles" to support or not. It is certain that there will be someone who agrees that Node.js should have this feature, but should not do it in this way.
But to implement the feature, we have to do it in some way. This is why the way is the way it is:
I have researched the various features and behavior of many popular userland command-line parsing libraries, and have distilled it down to the most commonly supported features, while striving further to trim any features which are not strictly necessary to get the bulk of the work done. While these do not align to, say, POSIX conventions, they do align with end-user expectations of how a Node.js CLI should work. What follows is consideration of a few specific features.
The Requirement of `=` for Options Expecting a Value
For example, one may argue that `--foo=bar` should be the only way to use the value `bar` for the option `foo`; but users of CLI apps built on Node.js expect `--foo bar` to work just as well. There was not a single popular argument-parsing library that did not support this behavior. Thus, `process.parseArgs()` supports this behavior (it cannot be automatic without introducing ambiguity, but I will discuss that later).
Combination of Single-Character Flags
Another one is combining (or concatenating?) "short flags"--those using a single hyphen, like `-v`--where `-vD` would be equivalent to `-v -D`. While this is a POSIX convention, it is not universally supported by the popular command-line parsers. Since it is inherently sugar (and makes the implementation more complicated), we chose not to implement it.
Data Types
Like HTML attribute values (`<tag attr="1">`), command-line arguments are provided to programs as strings, regardless of the data type they imply. While most of the userland arg parsers support some notion of a "data type"--i.e., this argument value is a number, string, or boolean--it is not strictly necessary. It is up to the user to handle the coercion of these values.
Default Behavior: Boolean Flags
The default behavior is to treat anything that looks like an argument (that's mainly "arguments beginning with one or more dashes") as a boolean flag. The presence of one of these arguments implies `true`. From investigation of popular CLI apps, we found that most arguments are treated as boolean flags, so it makes sense for this to be the default behavior. This means that a developer who just wants to know whether something is "on" or "off" will not need to provide any options to `process.parseArgs()`.
Handling Values
Some arguments do need values (e.g., `--require my-script.js`), and in order to eliminate ambiguity, the API consumer must define which arguments expect a value. This is done via the `expectsValue` option to `process.parseArgs()`, which is the only option it accepts.
Possible alternatives:
- Rename `expectsValue` to something else
Repeated Arguments
It's common to need to support multiple values for a single argument, e.g., `--require a.js --require b.js`. In this example, `require` needs to be listed in the `expectsValue` option. The result is an object containing a `require` property whose value is an array of strings: `['a.js', 'b.js']`. In the example of `--require c.js`, the value of the `require` property is a string, `'c.js'`.
When working with boolean flags (those not declared in `expectsValue`), it was trivial to support the case in which repeated arguments result in a count. A single `-v` will result in `{v: true}`, but `-v -v` will result in `{v: 2}`. Either way, the value will be truthy.
Possible alternatives:
- Options (those declared in `expectsValue`) will parse to an Array of strings, even if there is only one string in the Array (e.g., `--require c.js` becomes `{require: ['c.js']}`). That makes the API more consistent at the expense of making the common case (no repetition) slightly more awkward.
Positional Arguments
Arguments after `--` or without a dash prefix are considered "positional". These are placed into the Array property `_` of the returned object. This is a convention used by many other userland argparsers in Node.js. The `_` property is always present, even if empty. This also means that `_` is reserved as a flag/option name (e.g., `--_` will be ignored).
Possible alternatives:
- `_` is provided in `expectsValue`
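The behaviors described in the sections above (boolean flags by default, counts for repeated flags, `expectsValue` options with or without `=`, repeated options collected into an array, and positionals under `_`) can be sketched together in a few dozen lines. This is a minimal illustration and not the implementation in this PR: the name `sketchParseArgs` and its signature are hypothetical, and three details are assumptions made only for the sketch (a lone `-` is treated as positional, an `=value` suffix on a flag not listed in `expectsValue` is dropped, and an option with no following value is left `undefined`).

```javascript
'use strict';

// Hypothetical helper that mirrors the behavior described above.
// It is NOT the code under review in this PR.
function sketchParseArgs(argv, { expectsValue = [] } = {}) {
  const wantsValue = new Set(expectsValue);
  // Null-prototype object so names such as "toString" are ordinary keys.
  const result = Object.create(null);
  result._ = []; // positionals; always present, even if empty

  // Options: first value is a string, repeats turn it into an array.
  const storeValue = (name, value) => {
    if (!(name in result)) {
      result[name] = value;
    } else if (Array.isArray(result[name])) {
      result[name].push(value);
    } else {
      result[name] = [result[name], value];
    }
  };

  // Flags: first occurrence is `true`, repeats become a count (2, 3, ...).
  const countFlag = (name) => {
    if (!(name in result)) {
      result[name] = true;
    } else {
      result[name] = result[name] === true ? 2 : result[name] + 1;
    }
  };

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '--') {
      // Everything after a bare `--` is positional.
      result._.push(...argv.slice(i + 1));
      break;
    }
    // Assumption: a lone `-` counts as a positional.
    if (!arg.startsWith('-') || arg === '-') {
      result._.push(arg);
      continue;
    }
    // Strip the leading dashes, then split `name=value` if `=` is present.
    const body = arg.replace(/^-+/, '');
    const eq = body.indexOf('=');
    const name = eq === -1 ? body : body.slice(0, eq);
    if (name === '_') continue; // `_` is reserved, so `--_` is ignored
    if (wantsValue.has(name)) {
      // Value comes from `--name=value`, otherwise from the next argument.
      // Assumption: a missing next argument is stored as `undefined`.
      const value = eq === -1 ? argv[++i] : body.slice(eq + 1);
      storeValue(name, value);
    } else {
      // Assumption: any `=value` suffix on an undeclared flag is dropped.
      countFlag(name);
    }
  }
  return result;
}

const parsed = sketchParseArgs(
  ['-v', '-v', '--require', 'a.js', '--require=b.js', 'input.txt', '--', '--raw'],
  { expectsValue: ['require'] }
);
// v is 2, require is ['a.js', 'b.js'], and _ is ['input.txt', '--raw'].
console.log(parsed);
```

Note that `--raw` ends up in `_` only because it follows `--`; without the terminator it would be counted as a boolean flag.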
Intended Audience
It is already possible to build great arg parsing modules on top of what Node.js provides; the prickly API is abstracted away by these modules. Thus, `process.parseArgs()` is not necessarily intended for library authors; it is intended for developers of simple CLI tools, ad-hoc scripts, deployed Node.js applications, and learning materials.
It is exceedingly difficult to provide an API which would be both friendly to these Node.js users and extensible enough for libraries to build upon. We chose to prioritize these use cases because they are currently not well-served by Node.js' API.
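For context, this is roughly what an ad-hoc script has to do today with nothing but `process.argv`. The sketch is illustrative only: the `--verbose` and `--port` options, the `readOptions` name, and the default port are all hypothetical.

```javascript
'use strict';

// Hypothetical ad-hoc script options: --verbose (boolean) and --port <n>.
function readOptions(argv) {
  const verbose = argv.includes('--verbose');
  const portIndex = argv.indexOf('--port');
  // Values arrive as strings, so the script has to coerce them itself.
  const port = portIndex !== -1 && portIndex + 1 < argv.length
    ? Number(argv[portIndex + 1])
    : 8080; // hypothetical default
  return { verbose, port };
}

// process.argv[0] is the node binary and process.argv[1] is the script path.
console.log(readOptions(process.argv.slice(2)));
```

Even this small case does not handle `--port=3000` or repeated options, which is the kind of boilerplate a built-in helper is meant to remove for simple tools.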
Questions
In particular, I'm not 100% confident in the terminology I chose for the documentation ("Flags", "Options", "Positionals"). While this does align with other documentation I've read on the subject of CLI arguments, I am unsure if introducing this terminology to our documentation is a Good Idea. Perhaps it can be expressed without new terminology.
I sorted some files around my modification in `node.gyp`, which looked like it wanted to be in order, but was not. It did not seem to affect the build, but I can revert these changes if need be.
Should it be `process.parseArgv()`? While it does parse `process.argv` by default, it does not necessarily need to be used with `process.argv`.
Do I need to do more input validation, throw more exceptions, or take other defensive measures?
Credits
While this is my implementation, the design is a product of work by myself, @bcoe, @ruyadorno, and @nodejs/tooling.