Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OptionParser should allow flag with optional value #5338

Open
straight-shoota opened this issue Dec 1, 2017 · 24 comments
Open

OptionParser should allow flag with optional value #5338

straight-shoota opened this issue Dec 1, 2017 · 24 comments

Comments

@straight-shoota
Copy link
Member

straight-shoota commented Dec 1, 2017

In #4499 it has come up that OptionParser does not allow for a flag to be used either with a value (--flag=value) or without a value (--flag). In the referenced PR it happend that --color a.cr was misinterpreted as flag --color with value a.cr, whereas it is meant to be a value-less flag --color and an unnamed argument a.cr.

Currently, the format --flag [value] allows a flag value to be missing, but this only works if it is the last argument or the next argument is also a flag. It would also match an unnamed argument. There should be a way to prevent that.

Now, obviously, this can't work with an empty space as separater between flag an value as this would always be ambiguous, but it would work if it was enforced to be an equals sign: --foo arg would be an value-less flag, --flag=value would be a flag with value value.

@makenowjust
Copy link
Contributor

this only works if it is the last argument or the next argument is also a flag.

It depends options handler order. For example:

require "option_parser"

def parse(args)
  p [:parse, args]
  OptionParser.parse(args) do |opts|
    opts.on("-a [val]", "") do |val|
      p [:a, val]
    end

    opts.on("-b [val]", "") do |val|
      p [:b, val]
    end

    opts.unknown_args do |args|
      p [:unknown_args, args]
    end
  end
  puts
end

parse(%w(-a -b b))
# [:parse, ["-a", "-b", "b"]
# [:a, "-b"]
# [:unknown_args, ["b"]]

parse(%w(-b -a a))
# [:parse, ["-b", "-a", "a"]
# [:a, "a"]
# [:unknown_args, []]
# NOTE: why "-b" is not called?

@straight-shoota
Copy link
Member Author

straight-shoota commented Dec 1, 2017

Wow, this looks like there are some serious issues with optional argument values.

@urde-graven
Copy link

urde-graven commented Jan 9, 2019

This is patched in commit 0c306ca, which fixes 4 issues - see new specs in this commit against crystal main branch:

Failures:

  1) OptionParser optional option doesn't get separated value if specified with merged flag
     Failure/Error: flag.should eq(expect_value)

       Expected: ""
            got: "123"

     # spec/std/option_parser_spec.cr:15

  2) OptionParser optional option doesn't get separated value that looks like flag if specified with separated flag
     Failure/Error: flag.should eq(expect_value)

       Expected: ""
            got: "-g -h"

     # spec/std/option_parser_spec.cr:15

  3) OptionParser doesn't get value shifted in position because of removing another flag
     Failure/Error: flag.should eq(expect_value)

       Expected: ""
            got: "123"

     # spec/std/option_parser_spec.cr:15

  4) OptionParser should take short-flag's missing information about option from long flag
     Failure/Error: captured_z.should eq ""

       Expected: ""
            got: "14"

Three fixed problems are mentioned here - flag matched as optional value, separate argument matched when optional value is defined together with flag, and shifted argument matched as value because arguments were removed by previous matches. I also added improvement requested for example in #6632, because removing of consumed arguments is now additional work, but this is evidently done poorly to maintain backward compatibility - it should probably work like in Ruby's OptionParser, eg #parse should be non-destructive and #parse! should consume flags and their values:

OptionParser.parse!
(from ruby core)
------------------------------------------------------------------------
  parse!(argv = default_argv, into: nil)

------------------------------------------------------------------------
Same as #parse, but removes switches destructively. Non-option arguments remain in argv.

But I would agree with #4809 (comment), that optionparser needs a rewrite and this could be part of such work

@RX14
Copy link
Contributor

RX14 commented Jan 10, 2019

For now we should probably move OptionParser to be internal to the compiler if nobody wants to fix it. I certainly look at how much code it takes to implement it and i'm worried. It's implemented in an efficient way (callbacks) but I don't think that's probably the best way to implement it.

@urde-graven
Copy link

Something should be done, because these bugs are not minor. Else I would not try to push them upstream - I have not good experience with trying it, I almost always maintain my own fork. And I'm also working on more interesting things now - possibility of building shared libraries in crystal, generating .h files for them and so on. If you want, I can cut down these fixes to minimum, so it will be simpler to apply. Also I can remove new optional argument for non-destructive parsing. This patch should make parsing much faster for huge amount of arguments, but problem is overall algorithm, which scans all non-consumed arguments again and again for each handler instead of going sequentially through arguments and match them against all handlers.

@urde-graven
Copy link

urde-graven commented Jan 17, 2019

What is your decision @RX14?

@RX14
Copy link
Contributor

RX14 commented Jan 18, 2019

I'd like an opinion from the other core team members - I'm too busy with exams to rewrite OptionParser, as much as I would like to.

@urde-graven
Copy link

This is understandable, I also almost started to work on it. But I also know that that would be pointless work. Do you want to rewrite it, rather than apply this fast fix? Or you just don't like something about it? Are you afraid of new bugs?

@asterite
Copy link
Member

Does this really need a rewrite? It seems to be just a bug with OptionParser...

@urde-graven
Copy link

urde-graven commented Jan 18, 2019

I'd think that minimally mentioned algorithm needs rework, which I can do. I didn't started to rewrite it to minimize changes. I meant, that new interface of whole OptionParser would by pointless work from me, because you have probably another ideas. My idea is, that OptionParser in stdlib should be suited for most common simple programs, and also allow to serve as basic building block for more advanced "CLI Builders". So for example it should be also able to stop parsing on first unknown argument (I found this request two times here) and non-destructive parsing.

@RX14
Copy link
Contributor

RX14 commented Jan 18, 2019

@asterite to me, just looking at OptionParser, it seems like a lot of complex code to do a "simple" thing. I might just be blinded by my preconceptions of how easy arguments parsing is.

@urde-graven
Copy link

Have you already decided? I hope I don't push you too much to stress you.

@urde-graven
Copy link

So are you usually trying to similarly discourage new contributors? Or should I take it as a special discriminatory honor? Or you just don't care?

@asterite
Copy link
Member

@urde-graven If you have a patch that fixes some issue in OptionParser, please open a PR. If it behaves like Ruby's OptionParser then it will likely be accepted.

@urde-graven
Copy link

@asterite Only patch already mentioned in #5338 (comment). I have not worked on it further nor opened PR due to the discussion here. If It's ok, I'll open it.

@asterite
Copy link
Member

@urde-graven If it works like Ruby's OptionParser, then yes, please open a PR. Thank you!

@asterite
Copy link
Member

@straight-shoota I'm sorry that it took me so long, but could you explain the issue with some example code and command line invocation?

@urde-graven
Copy link

@asterite

  1. The first problem ("separate argument matched when optional value is defined together with flag") is explained also in Compiler: use Colorize.on_tty_only! #4499 (comment). Example:
require "option_parser"

def parse(args)
  p [:parse, args]
  OptionParser.parse(args) do |opts|
    opts.on("-c[always|never|auto]", "--color=[always|never|auto]", "") do |val|
      p [:color, val]
    end
    opts.unknown_args do |args|
      p [:unknown_args, args]
    end
  end
  puts
end

parse(%w(--color a.cr))
# old OptionParser:
# [:parse, ["--color", "a.cr"]]
# [:color, "a.cr"]
# [:unknown_args, []]
# with patch:
# [:parse, ["--color", "a.cr"]]
# [:color, ""]
# [:unknown_args, ["a.cr"]]

parse(%w(-c a.cr))
# old OptionParser:
# [:parse, ["-c", "a.cr"]]
# [:color, "a.cr"]
# [:unknown_args, []]
# with patch:
# [:parse, ["-c", "a.cr"]]
# [:color, ""]
# [:unknown_args, ["a.cr"]]
  1. Example for "flag matched as optional value" is in OptionParser should allow flag with optional value #5338 (comment) (and the patch also fixes problem with missing call of -b, which I overlooked before). Output of the provided example with patched OptionParser:
[:parse, ["-a", "-b", "b"]]
[:a, ""]
[:b, "b"]
[:unknown_args, []]

[:parse, ["-b", "-a", "a"]]
[:a, "a"]
[:b, ""]
[:unknown_args, []]
  1. Example for the third problem ("shifted argument matched as value because arguments were removed by previous matches") follows:
require "option_parser"

def parse(args)
  p [:parse, args]
  OptionParser.parse(args) do |opts|
    opts.on("-a [val]", "") do |val|
      p [:a, val]
    end
    opts.on("-b [val]", "") do |val|
      p [:b, val]
    end
    opts.unknown_args do |args|
      p [:unknown_args, args]
    end
  end
  puts
end

parse(%w(-b -a a c)
# old OptionParser:
# [:parse, ["-b", "-a", "a", "c"]]
# [:a, "a"]
# [:b, "c"]
# [:unknown_args, []]
# with patch:
# [:parse, ["-b", "-a", "a", "c"]]
# [:a, "a"]
# [:b, ""]
# [:unknown_args, ["c"]]
  1. "OptionParser should take short-flag's missing information about option from long flag" - from spec:
args = ["-f", "12", "-g", "13", "-h", "14"]
captured_f = captured_g = captured_z = nil
parser = OptionParser.parse(args) do |opts|
  opts.on("-f", "--flag X", "some flag") { |flag_value| captured_f = flag_value }
  opts.on("-g", "--gflag=Y", "another flag") { |flag_value| captured_g = flag_value }
  opts.on("-h", "--hopt=[Z]", "optional flag") { |flag_value| captured_z = flag_value }
end
captured_f.should eq "12"
captured_g.should eq "13"
captured_z.should eq ""
args.should eq ["14"]
parser.to_s.should contain "   -f, --flag X"

@asterite
Copy link
Member

@urde-graven Thank you!

I think all the existing problems in OptionParser come from the fact that I don't (didn't?) understand how option parsing work in general. I thought --flag value was valid syntax. But it seems these two are only valid:

  • -xvalue: when -x is defined (single letter) then the value comes right after the first letter
  • --some=value: when --some is defined, the value must be given with right after the name followed by = and the value

With that we can easily allow optional values:

  • for -x[value]: nothing comes after x
  • for --some=[value]: nothing comes after some

Is what I'm saying correct?

Is there any program out there where you can say --flag value?

@straight-shoota
Copy link
Member Author

There is no standard concept for option parsing. There are different styles employed by different programs.

Is there any program out there where you can say --flag value?

Yes, of course. Loads of them. I'd say this is actually more common than --flag=value. And there is nothing wrong with that notation per se. It can only be ambiguous when the value is optional.

@asterite
Copy link
Member

Cool, then. Feel free to continue forward with this, then (I won't have any opinion at all because I don't know about this subject).

@urde-graven
Copy link

@HertzDevil
Copy link
Contributor

HertzDevil commented Dec 4, 2021

There is one related issue; an empty argument is indistinguishable from a missing argument:

require "option_parser"

def parse_args(args)
  OptionParser.parse(args) do |parser|
    parser.on("-c [x]", "--color [x]", "optional arg") { |x| p x }
  end
end

# argument missing from first set of args only
parse_args %w(-c)     # => ""
parse_args ["-c", ""] # => ""

# argument missing from first set of args only
parse_args %w(--color)     # => ""
parse_args %w(--color=)    # => ""
parse_args ["--color", ""] # => ""

That means if a shell script does something like crystal --color=$(... 2>/dev/null), it might accidentally enable colored output because --color= would become equivalent to --color. The case for short options is avoidable because #11537 implies the '' in -c '' is not parsed as the argument to the -c.

Argp for example distinguishes --color from --color=:

#include <stdio.h>
#include <argp.h>

static struct argp_option options[] = {
    {.name = "a", .key = 'a', .arg = "x", .flags = OPTION_ARG_OPTIONAL},
    {0},
};

static error_t parse_opt (int key, char *arg, struct argp_state *state) {
    switch (key) {
    case 'a':
        printf("\"%s\"\n", arg);
        break;
    default:
        return ARGP_ERR_UNKNOWN;
    }
    return 0;
}

static struct argp argp = {.options = options, .parser = parse_opt};

int main(int argc, char **argv) {
    char *args1[] = {"./a.out", "--a"};                 /* ./a.out --a */
    argp_parse(&argp, 2, args1, ARGP_NO_ERRS, 0, NULL); /* "(null)"    */

    char *args2[] = {"./a.out", "--a="};                /* ./a.out --a= */
    argp_parse(&argp, 2, args2, ARGP_NO_ERRS, 0, NULL); /* ""           */

    /* for completeness */
    char *args3[] = {"./a.out", "--a", ""};             /* ./a.out --a '' */
    argp_parse(&argp, 3, args3, ARGP_NO_ERRS, 0, NULL); /* "(null)"       */

    return 0;
}

@straight-shoota
Copy link
Member Author

In order to differentiate no value and empty value, the argument value could represent that with nil and "". But that would require a change to the method signature, because the proc currently only yields String. And it probably doesn't make sense to have this type when a value is required because user code would need to handle the nil case which can never happen.
This is a result of runtime string parsing to determine the optionality of a value.

A possible workaround for that would be to change the method signature, for example by introducing a new parameter for determining optionality which would allow to change the block type to yield String?. This would be backwards compatible because the new block type only applies to the new signature. Method calls using the old signature without option argument won't be affected.

def on(short_flag : String, long_flag : String, description : String, *, optional_value : Bool, &block : String? ->)

An alternative would be an overload that accepts a proc instance instead of capturing a block.

def on(short_flag : String, long_flag : String, description : String, proc : String? ->)

This would be a bit more tedious to use though, because you need to create an explicit Proc instance:

parser.on("-c", "--color", "", -> (value : String?) do
end)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants