Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Julia objects are falsely marked as subtypes of most Python abstract base classes. #390

Open
LilithHafner opened this issue Oct 23, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@LilithHafner
Copy link
Contributor

LilithHafner commented Oct 23, 2023

Affects: JuliaCall

Describe the bug

An example of this that I ran into is that everything is reported as itterable, even when it cannot be iterated.

This looks pretty bad on the surface:

>>> import collections
>>> isinstance([1,2,3], collections.abc.Iterable)
True
>>> isinstance(1, collections.abc.Iterable)
False
>>> isinstance(sum, collections.abc.Iterable)
False
>>> f = jl.seval("f(x) = x+1")
>>> isinstance(f, collections.abc.Iterable)
True
>>> s = jl.seval(":a")
>>> isinstance(s, collections.abc.Iterable)
True
>>> f
Julia: f (generic function with 1 method)
>>> s
Julia: :a

And indeed it causes bugs when passing non-iterable Julia objects to Python functions which check for iterability and then, if iterable, iterate. For example, this function in PyTorch:

https://github.com/pytorch/pytorch/blob/185e76238d9f634c9c8d6fc6be75aa4fc71e04d3/torch/autograd/gradcheck.py#L78-L87

@LilithHafner LilithHafner added the bug Something isn't working label Oct 23, 2023
@LilithHafner
Copy link
Contributor Author

This is because collections.abc.Iterable is an abstract base class ("abc") which means being an instance of it is defined in terms of having certain methods (analogous to Julia's' hasmethod). In Julia, it would be unidiomatic to define subtyping or even traits in this way because it is common to define generic methods that sometimes fail but are always applicable.

Apparently, this is not the case in python, so by defining the appropriate iteration methods on all wrapped Julia objects, regardless, even, of weather or not they hasmethod those methods, we falsely advertize all Julia objects as members of many abstract base classes.

@LilithHafner LilithHafner changed the title Everything is reported as itterable, even when it cannot be iterated Julia objects are falsely marked as subtypes of most Python abstract base classes. Oct 23, 2023
@cjdoris
Copy link
Collaborator

cjdoris commented Oct 24, 2023

For completeness, the full list of collections.abc types that juliacall.AnyValue is an instance of is Container, Hashable, Iterable, Reversible, Sized, Callable, Collection, Buffer.

This corresponds to the methods __contains__, __hash__, __iter__, __reversed__, __len__ and __buffer__ which we always define, even if they throw for most types.

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 24, 2023

My main observation here is that if you want isinstance(x, Iterable) to work reliably for Julia objects x, then you need a reliable way in Julia to test if an object is iterable, which doesn't exist.

@LilithHafner
Copy link
Contributor Author

I agree that we can't get isinstance to be right all the time. However, it would help my use case (working with existing Python packages that don't accommodate this weakness) if we were wrong a little less often.

While there is no reliable isinstance or hasmethod for Julia objects, there is an unreliable hasmethod which is a strict improvement and solves the examples in the OP.

julia> hasmethod(iterate, Tuple{Array})
true

julia> hasmethod(iterate, Tuple{typeof(sum)})
false

julia> hasmethod(iterate, Tuple{Symbol})
false

Given a Julia known Julia object, it is possible to wrap it in a zero-runtime-overhead Julia object that defines the appropriate methods so that hasmethod is accurate for that object. If PythonCall/JuliaCall use hasmethod, then this can be used as a workaround. Right now the only workaround I know of is to wrap the Julia object in a Python object which, in my case, results in runtime overhead.

Respecting hasmethod will fix many cases of this bug and enable a zero-overhead workaround in the remaining cases.

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 25, 2023

Honestly there's so much runtime overhead in doing anything with wrapped Julia objects I wouldn't worry about that :)

@LilithHafner
Copy link
Contributor Author

My usecase is passing the object into a higher order Julia function. AFAICT the wrapper is stripped away and there is zero overhead with respect to pure Julia.

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 27, 2023

Got you. I think a future version will consider any Python object implementing __julia__() or something as a wrapper of the Julia object it returns. Then you can write your own wrappers with whatever interface you like (e.g. not supporting iteration) but which are simply unwrapped by default when passed back to Julia.

@cjdoris
Copy link
Collaborator

cjdoris commented Oct 27, 2023

Having __iter__ present or not depending on the Julia type requires creating different Python types, whereas currently there is essentially only generic one.

I think I'm in favour of keeping the current one-type-for-all approach (in the name of genericness) plus the above __julia__ trait based way to provide custom wrapper types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants