Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversion from mpz to numpy longdouble is incorrect #507

Closed
pearu opened this issue Aug 25, 2024 · 10 comments · Fixed by #514
Closed

Conversion from mpz to numpy longdouble is incorrect #507

pearu opened this issue Aug 25, 2024 · 10 comments · Fixed by #514

Comments

@pearu
Copy link

pearu commented Aug 25, 2024

As in the title.

Reproducer (using gmpy2 version 2.1.2):

>>> i = 5579686107214117131790972086716881
>>> numpy.longdouble(gmpy2.mpz(i))  # wrong result
5.5796861072141166625e+33
>>> numpy.longdouble(i)             # expected result
5.579686107214117132e+33

Notice that

>>> numpy.longdouble(numpy.double(i))
5.5796861072141166625e+33

that is, it looks like the conversion mpz->longdouble uses mpz->double->longdouble while it should use mpz->int->longdouble.

@skirpichev
Copy link
Contributor

I don't think it's a gmpy2 issue.

numpy.longdouble constructor just don't call __int__() dunder at all:

>>> import gmpy2, numpy
>>> class Spam:
...     def __int__(self):
...         return 123
... 
>>> numpy.longdouble(Spam())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float() argument must be a string or a real number, not 'Spam'

@skirpichev
Copy link
Contributor

Ah, I see. This will work if gmpy2 types will have __array__() dunder.

@pearu, is this The Right Way to coerce scalars like mpz to numpy datatypes?

@pearu
Copy link
Author

pearu commented Aug 26, 2024

@skirpichev , __array__ protocol certainly provides a way to coerce mpz to longdouble:

import gmpy2
import numpy

class BigIntArray:

    def __init__(self, obj):
        self.obj = obj

    def __array__(self, dtype):
        return numpy.array(int(self.obj), dtype=dtype)

i = 5579686107214117131790972086716881
    
x1 = numpy.longdouble(BigIntArray(i))
x2 = numpy.longdouble(BigIntArray(gmpy2.mpz(i)))
print(x1, x2)  # outputs 5.579686107214117132e+33 5.579686107214117132e+33

@pearu
Copy link
Author

pearu commented Aug 26, 2024

Notice that the __array__ protocol adds numpy dependency to gmpy2. Although, this could be made a runtime dependency by using:

    def __array__(self, dtype):
        import numpy
        return numpy.array(int(self.obj), dtype=dtype)

@skirpichev
Copy link
Contributor

I think, that a hard dependence on the numpy is no-go.

Runtime dependency will a kinda work, but... probably this will be to slow. This might be a good idea for something like fmpz_mat (python-flint). But gmpy2 has only scalar types and I believe, that explicit type casts are better.

@pearu
Copy link
Author

pearu commented Aug 26, 2024

Runtime dependency will a kinda work, but... probably this will be to slow.

IMHO, this is acceptable as the issue is about silent incorrectness. In fact, repeated import is equivalent to dictionary access anyway:

In [8]: import numpy

In [9]: %timeit import numpy
64 ns ± 0.114 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [10]: d = {"foo": 2}

In [11]: %timeit d['foo']
24.9 ns ± 0.124 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

while creating longdouble is more expensive than import:

In [24]: %timeit numpy.longdouble(1)
319 ns ± 1.39 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

@skirpichev
Copy link
Contributor

creating longdouble is more expensive than import

(Hmm, I would guess it's because npy_longdouble_from_PyLong() don't do it numerically.)

On another hand, why numpy doesn't call __int__() dunder, if it's provided? FYI: feature was implemented in numpy/numpy#9968.

@pearu
Copy link
Author

pearu commented Aug 26, 2024

On another hand, why numpy doesn't call __int__() dunder, if it's provided?

It should not call __int__ when mpz provides __float__. And it would not call __float__ when __array__ is going to be provided.

@skirpichev
Copy link
Contributor

Sorry, I meant __index__(). This looks like python/cpython#120950.

skirpichev added a commit to skirpichev/gmpy that referenced this issue Sep 13, 2024
skirpichev added a commit to skirpichev/gmpy that referenced this issue Sep 13, 2024
skirpichev added a commit to skirpichev/gmpy that referenced this issue Sep 13, 2024
skirpichev added a commit to skirpichev/gmpy that referenced this issue Sep 13, 2024
skirpichev added a commit to skirpichev/gmpy that referenced this issue Sep 13, 2024
@skirpichev
Copy link
Contributor

pr is ready for review: #514

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants