Simplifying the standard library with multi-arity functions #418

ghost · 2014-06-16T20:40:30Z

My suggestion is simple: functions that are identical save for an optional parameter (such as all functions that end with _by) could drop that prefix and become multi-arity functions, that is, functions that accept several definitions with a varying number of parameters. For example, this could be the manual entry for min / max / min_by / max_by:

Find the minimum or maximum element of the input array. Passing
a parameter allows you to specify a particular field or
property to examine, e.g. min(.foo) finds the object
with the smallest foo field.

For legacy reasons, min_by(.foo) and max_by(.foo) exist,
with identical behaviour to min(.foo) and max(.foo). These
are considered deprecated and will be dropped in the next major
version.

I've started working on it. Is everyone okay with this? @nicowilliams

The text was updated successfully, but these errors were encountered:

nicowilliams · 2014-06-16T20:50:52Z

I kinda like the "_by" suffix, but I like this too. We should find all opportunities for simplification using multi-arities. For example: join.

ghost · 2014-06-16T21:11:25Z

These are the potential changes I've found so far:

recurse takes over recurse_down and .., the former is deprecated.
join and split get reasonable default arguments.
min and max take over min_by and max_by, which are deprecated.
unique takes over unique_by, which is deprecated.
sort takes over sort_by, which is deprecated.
group replaces group_by, which is deprecated, for consistency.
paths gains an optional argument, so that leaf_paths is the same as paths(values). The former is deprecated. [1]
range allows a one parameter invocation which assumes the range to start at zero.

Paraphrasing whoever played the Napster guy, "Drop the _by. It's cleaner."

[1] I'm not entirely sure if I understand how paths works, but it makes sense in my head. Correct me if this is impossible or not worth the effort.*

nicowilliams · 2014-06-16T21:43:10Z

Re: leaf_paths... that's not quite right. leaf_paths is defined thusly:

def leaf_paths: . as $dot|paths|select(. as $p|$dot|getpath($p)|type|. != "array" and . != "object");

so that paths an optional parameter that could be any of scalars and friends would have to be defined like so:

def paths(node_filter): . as $dot|paths|select(. as $p|$dot|getpath($p)|node_filter);

which works:

$ jq -c 'def paths(node_filter): . as $dot|paths|select(. as $p|$dot|getpath($p)|node_filter); paths(select(type == "string"))'
[[["a", {"b":[11,[{"c":"d"}]]}]]]
[0,0,0]
[0,0,1,"b",1,0,"c"]
$

And that's very nice indeed.

nicowilliams · 2014-06-16T21:44:43Z

range could also gain a third optional argument (step by).

With that, +1 to all of your list.

ghost · 2014-06-17T07:38:52Z

Also, add could take an optional argument as starting point for its internal reduce, as you said over at #280.

nicowilliams · 2014-06-17T14:08:03Z

@slapresta good point!

ghost · 2014-06-17T20:38:32Z

Actually, what we want for leaf_paths is paths(scalars), not paths(values), right?

nicowilliams · 2014-06-17T21:06:41Z

Yes, we should want paths(filter) to filter paths by the value found at that path. Or maybe we should call that paths_to(filter)? I think the latter is clearer.

Here's one way to define leaf_paths in terms of scalars:

def leaf_paths: . as $dot|paths|. as $p|$dot|getpath($p)|scalars|$p;

ghost · 2014-06-17T22:19:31Z

I think paths, without prefix, makes more sense than paths_to, especially if its argument is supposed to be optional.

Anyway, there's a starting point over at #426; perhaps we should keep working over there. I used the definition of paths(node_filter) given by your previous commit and based leaf_paths on that. I did everything on the bullet point list except for range's third argument (which is not defined as a function but as... something else; i guess it requires an understanding of jq's internals I currently lack), add's default value and join and split's default arguments (I'm not sure what these should be or, to be honest, what my idea was when I proposed them)

nicowilliams · 2014-06-17T23:35:39Z

join and split should use "," as the default, IMO.

Regarding add, IMO the best thing to do is to leave it alone. Addition of all all types to null is supported, so it works out alright.

pkoppstein · 2014-06-18T04:07:55Z

@nicowilliams wrote:

join and split should use "," as the default, IMO.

In awk, perl, ruby, javascript and no doubt other jq peers, split accepts regular expressions. Since jq's support for regexps is on the horizon, there is something to be said for holding off establishing a default for jq's split/1, mainly because it should almost surely be a regular expression.

For example, if we go with ", " as join's default, as Nico suggests, then it might make sense to choose " , *" or perhaps "\s,\s_" as the default for split. Or, following awk, python, ruby, etc, split's default would be "\s+". (_)

For your reference, here is an extract from ruby's documentation about split:

If pattern is a single space, str is split on whitespace, with leading whitespace and runs of contiguous whitespace characters ignored.
...
If pattern is omitted, the value of $; is used. If $; is nil (which is the default), str is split on whitespace as if ' ' were specified.

Footnote
(*) Python is a little unusual in having both str.split and re.split, but the default for python's str.split is effectively "\s+" according to the python documentation:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator

nicowilliams added the feature request label Jun 16, 2014

nicowilliams added this to the 1.5 release milestone Jun 16, 2014

ghost mentioned this issue Jun 17, 2014

Simplified standard library; marked redundant filters as duplicated #426

Closed

ghost closed this as completed Jun 17, 2014

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplifying the standard library with multi-arity functions #418

Simplifying the standard library with multi-arity functions #418

ghost commented Jun 16, 2014

nicowilliams commented Jun 16, 2014

ghost commented Jun 16, 2014

nicowilliams commented Jun 16, 2014

nicowilliams commented Jun 16, 2014

ghost commented Jun 17, 2014

nicowilliams commented Jun 17, 2014

ghost commented Jun 17, 2014

nicowilliams commented Jun 17, 2014

ghost commented Jun 17, 2014

nicowilliams commented Jun 17, 2014

pkoppstein commented Jun 18, 2014

Simplifying the standard library with multi-arity functions #418

Simplifying the standard library with multi-arity functions #418

Comments

ghost commented Jun 16, 2014

nicowilliams commented Jun 16, 2014

ghost commented Jun 16, 2014

nicowilliams commented Jun 16, 2014

nicowilliams commented Jun 16, 2014

ghost commented Jun 17, 2014

nicowilliams commented Jun 17, 2014

ghost commented Jun 17, 2014

nicowilliams commented Jun 17, 2014

ghost commented Jun 17, 2014

nicowilliams commented Jun 17, 2014

pkoppstein commented Jun 18, 2014