Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

require explicit predicates in find functions #23812

Merged
merged 1 commit into from
Sep 29, 2017
Merged

require explicit predicates in find functions #23812

merged 1 commit into from
Sep 29, 2017

Conversation

JeffBezanson
Copy link
Member

@JeffBezanson JeffBezanson commented Sep 21, 2017

I also tried out using equalto(x) as a predicate, and I really think we should adopt it. It reads nicely (!equalto(x) also works) and will cut down the number of function types.

fix #23120, fix #19186

Part of #10593.

@JeffBezanson
Copy link
Member Author

Also, I didn't touch findn, findnz, or find for sparse matrices. I suppose for find, we could require that predicates be false for 0 (giving an error otherwise).

@@ -1855,6 +1855,12 @@ end
nothing
end

@deprecate find(x::Number) find(x->x!=0, x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we encourage find(!iszero, x) for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, for Number I guess we should.

test/arrayops.jl Outdated
@@ -427,44 +427,43 @@ end
@test X[Y[end],1] == 5
@test X[end,Y[end]] == 11
end
equalto(y) = x->x==y
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to use Base.equalto for these, or do you specifically need == rather than isequal for these tests?

@ararslan ararslan added collections Data structures holding multiple items, e.g. sets deprecation This change introduces or involves a deprecation labels Sep 21, 2017
@ararslan
Copy link
Member

I really like equalto!

@TotalVerb
Copy link
Contributor

While it shouldn't block anything, I think it's a little strange that equalto is partially applied == and not partially applied isequal.

@JeffBezanson
Copy link
Member Author

That's actually only the version in the tests --- I'll replace that. The one I put in Base uses isequal.

@@ -1855,6 +1855,12 @@ end
nothing
end

@deprecate find(x::Number) find(!iszero, x)
@deprecate findnext(A, v, i::Integer) findnext(x->x==v, A, i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should export equalto and use it in the deprecation targets?

@JeffBezanson JeffBezanson force-pushed the jb/find1 branch 2 times, most recently from f151e89 to 6253be9 Compare September 22, 2017 02:06
@JeffBezanson
Copy link
Member Author

Ok, I went ahead and exported equalto. It's a thing.

Copy link
Member

@ararslan ararslan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jeff does it again! 💯

@ararslan
Copy link
Member

Ah, this needs a NEWS item for the addition of equalto.

@@ -1615,8 +1615,14 @@ julia> findnext(A,3)
function findnext(A, start::Integer)
l = endof(A)
i = start
warned = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this warn everytime the function is called?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was counting on depwarn for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this looks good to me — the warned makes sure the slower depwarn is only called once for each invocation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry for the noise.

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's really a nice simplification of the API.

Also, I didn't touch findn, findnz, or find for sparse matrices. I suppose for find, we could require that predicates be false for 0 (giving an error otherwise).

findn and findnz are special, so it's fine to keep them (at least for now), but having find behave differently for sparse matrices doesn't sound like a good idea to me. We could at least change the signature of the current find(::SparseMatrixCSC) method to find(::Union{typeof(!iszero), typeof(!equalto(0))}, ::SparseMatrixCSC) to ensure consistency and still benefit from an optimized implementation for the common case.

That approach can be generalized this by checking whether the predicate is true or false on zero elements and assuming it's pure. I don't think throwing an error is a good idea: as discussed elsewhere, sparse matrices should behave the same as dense ones everywhere it's possible.

base/array.jl Outdated
@@ -1596,14 +1596,14 @@ cat(n::Integer, x::Integer...) = reshape([x...], (ntuple(x->1, n-1)..., length(x
"""
findnext(A, i::Integer)
Find the next linear index >= `i` of a non-zero element of `A`, or `0` if not found.
Find the next linear index >= `i` of a true element of `A`, or `0` if not found.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add backticks around true (here and in other docstring) since "a true element" may not be completely obvious when you have no idea what the function does.

base/array.jl Outdated
@@ -1596,14 +1596,14 @@ cat(n::Integer, x::Integer...) = reshape([x...], (ntuple(x->1, n-1)..., length(x
"""
findnext(A, i::Integer)
Find the next linear index >= `i` of a non-zero element of `A`, or `0` if not found.
Find the next linear index >= `i` of a true element of `A`, or `0` if not found.
# Examples
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense to add an example using equalto, since people will likely look at this docstring when they need this feature (same for other docstrings). Or just mention it in the description.

base/file.jl Outdated
@@ -271,7 +271,7 @@ function tempname(temppath::AbstractString,uunique::UInt32)
tempp = cwstring(temppath)
tname = Vector{UInt16}(32767)
uunique = ccall(:GetTempFileNameW,stdcall,UInt32,(Ptr{UInt16},Ptr{UInt16},UInt32,Ptr{UInt16}), tempp,temp_prefix,uunique,tname)
lentname = findfirst(tname,0)-1
lentname = findfirst(equalto(0),tname)-1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't iszero work here?

@@ -5604,7 +5604,7 @@ function _getfield_elim_pass!(e::Expr, sv::InferenceState)
if alloc !== false
flen, fnames = alloc
if isa(j, QuoteNode)
j = findfirst(fnames, j.value)
j = findfirst(x->x == j.value, fnames)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't equalto be enough? Same below.

"""
equalto(x)

Create a function that compares its argument to `x` using `isequal`; i.e. returns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[`isequal`](@ref)

@@ -359,7 +359,7 @@ function prune_versions(reqs::Requires, deps::Dict{String,Dict{VersionNumber,Ava
vmaskp[vn] = falses(luds)
end
for (vn,a) in fdepsp
vmind = findfirst(uniqdepssets, a.requires)
vmind = findfirst(Base.equalto(a.requires), uniqdepssets)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Base.?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be necessary since it's exported?

@JeffBezanson
Copy link
Member Author

Thanks for the good review @nalimilan .

I don't think throwing an error is a good idea: as discussed elsewhere, sparse matrices should behave the same as dense ones everywhere it's possible.

The only reason I want to throw an error is so I don't have to implement a new algorithm for finding zeros. That case can be added later.

if A[i] != 0
a = A[i]
if !warned && !(a isa Bool)
depwarn("In the future `findnext` will only work on boolean collections. Use `findnext(x->x!=0, A)` instead.", :findnext)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I've done things like this I've often added breadcrumbs that point to it from deprecated.jl that help make sure they get removed at the appropriate timeframe.

@JeffBezanson
Copy link
Member Author

JeffBezanson commented Sep 22, 2017

Ah --- previously find(f, sparse) called the generic implementation, so we can continue to do that at least for the zero case.

@JeffBezanson
Copy link
Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@garrison
Copy link
Member

In UniqueVectors.jl I override findfirst with a value so that the search can be performed in O(1) time, with a Dict lookup. There is no way to do this by passing a predicate. If this pull request lands, the usage of findfirst in that package will no longer be consistent with its use in Base, so it will probably make sense for me to change the method name to something else. It may also be worth looking through the package ecosystem to see whether other packages rely on overriding findfirst(a, elt) or similar functionality as well.

Overall I am on the fence, leaning toward being -1 about this change. findfirst,findnext,findprev with a value are more specific forms than with a predicate, and UniqueVectors is unlikely to be the only case where overriding these methods for a specific data structure (with improved efficiency) makes sense.

As for #19186, I've never found the argument order confusing, as long as I keep in mind that predicates go first so that the do-block syntax can be supported.

@garrison
Copy link
Member

garrison commented Sep 23, 2017

Or perhaps this would solve my issue:

struct EqualTo{T}
    val::T
end

(x::EqualTo)(y) = isequal(x.val,y)

equalto(x) = EqualTo(x)

Then I could override findfirst(::EqualTo, ::UniqueVector).

@garrison
Copy link
Member

garrison commented Sep 23, 2017

If this PR lands, it would also be good to see a test where !equalto(x) is used. (My EqualTo suggestion above would break this functionality, unless it comes with an additional NotEqualTo type and appropriate methods for !.)

@JeffBezanson
Copy link
Member Author

To make ! work all we'd have to do is make EqualTo a subtype of Function, so +1 to that approach.

What I find confusing about the current methods is that in findnext(x, y), y is an index, but in findfirst(x, y), y is a value to search for. Of course this makes sense if you think about it, but I always have to do an extra mental step when I see those calls.

@JeffBezanson
Copy link
Member Author

Done --- I just made equalto the name of the type.

@@ -972,3 +972,22 @@ julia> filter(!isalpha, str)
```
"""
!(f::Function) = (x...)->!f(x...)

struct equalto{T} <: Function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably use the typical upper camel case style for type names as recommended in the style guide and used in Base, especially since this is user-facing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disagree, since this is meant to be used like a function, equalto(x).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can always define equalto(x) = EqualTo(x), as is done with Iterators.filter returning a Filter object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A problem with conflating equalto with "regular" functions is that it doesn't have its own type in the same way that other functions do. In particular, to specialize on a particular function, you use ::typeof(f), but in this case typeof(equalto) will be DataType. So users will have to remember that to specialize on this function they need ::equalto instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @ararslan, that pattern is quite common, we also use it with zip vs. Zip1/Zip2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but I'm not sure that's such a great pattern. It's confusing to have both filter and Filter. That came from a situation where filter had several methods, only one of which returned a Filter, but then the functions were separated (into Base.filter and Base.Iterators.filter). zip is also different in that it might return either a Zip1 or a Zip2, so it can't be directly identified with either type.

@JeffBezanson
Copy link
Member Author

That's right --- @garrison 's use case would dispatch as findfirst(f::equalto, a).

@ararslan
Copy link
Member

Still seems simpler and more consistent to me to do what we do in Base.Iterators: define the type using the usual conventions and make an appropriately named accessor function. I still don't see why this has to be a special case from that convention.

@JeffBezanson
Copy link
Member Author

But then why don't we have dict(x...) = Dict(x...), rational(x) = Rational(x), etc.?

@ararslan
Copy link
Member

Okay, then if this is supposed to be used as a straight type constructor rather than an accessor function, I think we should go with the stylistic conventions and call it EqualTo.

@JeffBezanson
Copy link
Member Author

How about const EqualTo = equalto? :)

@nalimilan
Copy link
Member

EqualTo sounds fine. At least it makes it clear that it's a type. The fact that it's <: Function is kind of an implementation detail in the context of find* functions.

@JeffBezanson
Copy link
Member Author

Actually I got my last comment backwards. We could call the type EqualTo to follow the type naming convention, then have const equalto = EqualTo for nicer calling syntax for the common cases.

@JeffBezanson JeffBezanson merged commit 7f14cd6 into master Sep 29, 2017
@JeffBezanson JeffBezanson deleted the jb/find1 branch September 29, 2017 03:06
martinholters added a commit to HSU-ANT/ACME.jl that referenced this pull request Sep 29, 2017
martinholters added a commit to martinholters/DSP.jl that referenced this pull request Sep 29, 2017
This fixes the deprecations due to JuliaLang/julia#23812 and makes for
cleaner code anyway. (Truncation of trailing zeros is done in the `Poly`
constructor so there is no need to do it here.)
martinholters added a commit to JuliaDSP/DSP.jl that referenced this pull request Oct 9, 2017
This fixes the deprecations due to JuliaLang/julia#23812 and makes for
cleaner code anyway. (Truncation of trailing zeros is done in the `Poly`
constructor so there is no need to do it here.)
if A[i] != 0
a = A[i]
if !warned && !(a isa Bool)
depwarn("In the future `findnext` will only work on boolean collections. Use `findnext(x->x!=0, A)` instead.", :findnext)
Copy link
Contributor

@GregPlowman GregPlowman Oct 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should start be an argument to findnext in the depwarn message?

depwarn(" ... Use `findnext(x->x!=0, A, start)` instead.", :findnext)

Similarly for findprev?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it should.

garrison added a commit to garrison/UniqueVectors.jl that referenced this pull request Oct 14, 2017
tkluck added a commit to tkluck/julia that referenced this pull request Nov 3, 2017
…e explicit

Since we now need explicit predicates [1], this optimization only works
if we know that the predicate is a function that is false for zero
values. As suggested in that pull request, we could find out by calling
`f(zero(eltype(array)))` and hoping that `f` is pure, but I like being
a bit more conservative and only applying this optimization only to the
case where we *know* `f` is equal to `!iszero`.

For clarity, this commit also renames the helper method
_sparse_findnext()  to _sparse_findnextnz(), because now that the
predicate-less version doesn't exist anymore, the `nz` part isn't
implicit anymore either.

[1]: JuliaLang#23812
@timholy
Copy link
Member

timholy commented Jan 9, 2018

Were these methods deliberately omitted, or was that an oversight? If the latter, I will likely get them as part of moving towards using nothing as a sentinel.

I'm contemplating adding a sentinel as an optional last argument, and because it could be an Integer there are all sorts of ambiguities that crop up. Do we actually need to specify start::Integer? To me it seems like it might be cleaner if the only typed argument is ::Function.

If people like that, I'd be tempted to include that change among the deprecations too, because otherwise I may have to add a bunch of specializations just to resolve ambiguities. There's some risk that this would change method sorting in a breaking way for packages

@nalimilan
Copy link
Member

Good catch! These should definitely be deprecated, I must have missed them.

I'm not sure we need to specify start::Integer, and indeed as a follow-up to #24774 start should also be allowed to be of the type returned by key (which isn't known in advance).

Feel free to put everything in a PR it that makes things simpler for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets deprecation This change introduces or involves a deprecation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make find's default predicate be identity Inconsistent order of arguments in variants of 'find'