-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Nullable lifting infrastructure #18758
Conversation
I would honestly rather this be a method of |
@TotalVerb I'm not opposed to this, as long as there is then a method |
I suppose the issue is that there are two senses of broadcasting here: broadcasting over the nullables themselves (the lifting) and broadcasting the resulting operation over the array. It may be difficult to combine those into one operation, especially one as semantically dense as I really don't like the type "annotations" here. Couldn't inference be applied once, with |
That's where the function broadcast{F}(_f::Lifted{F}, X, Y)
...
for (x, y) in broadcasted_indices
res[i] = broadcast(_f.f, U, x, y)
end
res
end |
Would you please elaborate? I'm not sure I understand.
We'll need to get somebody who's familiar with these things on the line. My naive thinking was that calling |
For A naive implementation of lifted(foo) could be like this: lifted(foo) = (x, y) -> isnull(x) || isnull(y) ? NULL : unsafe_lift(foo, x, y) where The issue with this function is that it is type-unstable, as it could either return stable_ifelse{T,U}(b::Bool, x::Nullable{T}, y::Nullable{U}) =
ifelse(b, Nullable{Union{T,U}}(x), Nullable{Union{T,U}}(y))
lifted(foo) = (x, y) -> stable_ifelse(isnull(x) || isnull(y), NULL, unsafe_lift(foo, x, y)) This works for certain values, and has good generated code as far as I can tell. It's based on a combination of promoting distinct nullable types to a stable one. Even better, there is no inference-dependent behaviour at all here. To me, it's clear that for operations and types that will support it ( Now, for safety, we need to turn this
For the first option, the implementation is obvious. For the second, I prefer a solution similar to what
julia> f(x) = rand(Bool) ? 1 : 1.0
f (generic function with 1 method)
julia> map(f, [1])
1-element Array{Int64,1}:
1
julia> map(f, [1])
1-element Array{Float64,1}:
1.0
julia> map(x -> 1, Int[])
0-element Array{Int64,1}
julia> map(f, Int[])
0-element Array{Union{Float64,Int64},1} So to be consistent, we could do: lifted2(foo) = (x, y) -> isnull(x) || isnull(y) ?
Nullable{Core.Inference.return_type(foo, Tuple{typeof(x).parameters[1], typeof(y).parameters[1]})}() :
unsafe_lift(foo, x, y) But I would actually argue against consistency here, because if we are going to use the actual type in the real case, then we should avoid non-concrete parameters in
Which we can implement as: Base.@pure inferred_concrete_return_type{T,U}(f,::Type{T},::Type{U}) =
let X = Core.Inference.return_type(f,Tuple{T,U})
isleaftype(X) ? X : Union{}
end
lifted2(foo) = function{T,U}(x::Nullable{T}, y::Nullable{U}) -> isnull(x) || isnull(y) ?
Nullable{inferred_concrete_return_type(foo, Tuple{T,U})}() :
unsafe_lift(foo, x, y) As-is, this will not be type stable. However, I think it could be manually made type stable through inference special-casing. This function has some degree of utility outside of nullables, so a case could be made for such a special case. |
@davidagold Inference being run twice is not a concern. However, the result of inference not actually making the function type stable is indeed a concern. If we can't make the function type stable, it's obviously better just to return |
Why does this have to be in base, couldn't this stay in a package? |
You mean the entire nullable infrastructure, or just the lifting part? |
Just the stuff in this PR. It is not really clear to me where this would be used. I guess in |
isnull(x), | ||
ifelse( | ||
isnull(y), | ||
Nullable{Bool}(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this uses a lot of vertical space, would be better with more than a single token per line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since Bool
is safe to evaluate even when missing, you could also use the two-argument form of Nullable
to make this more compact.
Is |
Unless we have another potential use for the term |
if isnull(x) | ||
return Nullable{U}() | ||
else | ||
return Nullable{U}(f(unsafe_get(x))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really related, but I wonder whether the compiler isn't actually smart enough to generate the same code when using get
. Is unsafe_get
really needed except in very specific cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use unsafe_get
here because it is generic over Nullable
and non-Nullable
types, whereas get(x)
for unqualified x
is not defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. That difference is a bit weird though. Maybe we should move to unwrap
and have it work for any scalar.
|
I still think this is a good use of the word |
Some time ago I heard of talk of having "automatic" lifting for elementary operations like If so, is this PR is to allow us to address dispatch issues mostly? Or is it instead of automatic lifting entirely? |
@ViralBShah in Reactive |
@andyferris AFAIK there are no plans for automatic lifting of all functions. But NullableArrays.jl currently includes lifted operators, and I plan to open a PR again to move them into Base (following up #16988). |
I know close to nothing of category theory, but it's still enough that I would expect BUT this gets extremely fishy if different container-likes get merged, like |
I would think that for the case of arrays, what I would want is always |
I think that's why @davidagold proposes |
@TotalVerb Thank you for your thorough explanation. I really appreciate your taking the time to go through those thoughts, and I'd be interested to hear more about the compiler/inference work that would complement the present proposal. @andyferris I'm not exactly sure anymore what precisely "automatic lifting" means. In this case it sounds like it means that But we could also consider "automatic lifting" restricted to a particular context, such as a macro invocation: Given the ambiguity of the phrase "automatic lifting", I've found it more helpful to think in terms of "lifting by some method or other". So, the first approach, in which we define the method |
@TotalVerb's solution sounds good. Though, if returning |
I'd like to raise the question again whether this has to be in base at this point, or whether this could live in a package. Given that this a) doesn't add additional methods to base functions for base types and b) is not used by anything else in base, it seems to me that this could well live in a package, mature there and once we have a better sense whether this is a good strategy or not it could still be put into base at a later point. |
@davidanthoff I think you're probably right that this can mature in a package for now. But having this PR against Base has induced some very helpful discussion, so maybe it's worth keeping open and updating periodically. @TotalVerb I'm sorry I didn't believe you re: inference being run twice: julia> function lift(f, x)
U = Core.Inference.return_type(f, Tuple{eltype(typeof(x))})
if isnull(x)
return Nullable{U}()
else
return Nullable(f(unsafe_get(x)))
end
end
lift (generic function with 1 method)
julia> f(x) = x
f (generic function with 1 method)
julia> @code_warntype lift(f, 1)
Variables:
#self#::#lift
f::#f
x::Int64
U::Type{Int64}
Body:
begin
U::Type{Int64} = $(QuoteNode(Int64)) # line 3:
unless $(QuoteNode(false)) goto 6 # line 4:
return $(Expr(:new, Nullable{Int64}, false))
6: # line 6:
return $(Expr(:new, Nullable{Int64}, true, :(x)))
end::Nullable{Int64} That's really neat. Does this mean that |
""" | ||
immutable Lifted{F} | ||
f::F | ||
cache::Dict{Tuple{Vararg{DataType}}, DataType} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cache should be removed. The compiler already implements it.
NOTE: There are two exceptions to the above: `lift(|, Bool, x, y)` and | ||
`lift(&, Bool, x, y)`. These methods both follow three-valued logic semantics. | ||
""" | ||
function lift(f, U::DataType, x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be Type
instead of DataType
.
No, it just means that inference can infer its return value. |
Given the comments above, it seemed appropriate to do return typing inside Should |
Nullable is no longer in Base |
This PR introduces a generic lifting framework based on the "higher-order lifting" approach. Once this is rebased on top of #18484 it should allow for the following:
That is,
lift(f::F)
returns aLifted{F}
, which, when called on argumentsxs...
, lowers tolift(f, U, xs...)
, whereU
is chosen by type inference. We include the return type parameterU
as an argument tolift
for cases whenU
is invariant over many applications oflift
, e.g. when mapping somef
over a tightly typedNullableArray
. Having aLifted
type will allow us to dispatch such higher-order functions on whether or not anf
is lifted. This in turn will allow us to define, say,in a way that takes advantage of the aforementioned invariance.
This PR also implements three-valued logic semantics for lifted
&
and|
:This PR could perhaps use some fine-tuning with respect to the use of splatting and whether or not to
@inline
thelift(f, U, xs...)
definitions.cc: @johnmyleswhite @nalimilan @quinnj @davidanthoff @JeffBezanson @TotalVerb @vchuravy
EDIT: Tests should pass once this is rebased.