Complex multiplication returns different results if wrapped in np array #26740

Open
Svalorzen opened this issue Jun 18, 2024 · 12 comments
Labels
33 - Question (Question about NumPy usage or development)

Comments

@Svalorzen

Describe the issue:

I'm still trying to narrow the problem down, but for now it seems that, with newer NumPy versions on x86_64, certain complex products return different results depending on whether the operands are wrapped in a NumPy array or not. Example code and some results I got are below:

My machine:

1.18920906599179332e+00 3.77425653250409567e+01
1.18920906599179510e+00 3.77425653250409567e+01
False
SYS VERSION 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
PLATFORM x86_64
NP VERSION 2.0.0

Google Colab:

1.18920906599179332e+00 3.77425653250409567e+01
1.18920906599179510e+00 3.77425653250409567e+01
False
SYS VERSION 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
PLATFORM x86_64
NP VERSION 1.25.2

Some passing results:

Python 3.9.6 / NumPy 2.0 / arm64 (Apple Silicon):
1.18920906599179510e+00 3.77425653250409567e+01
1.18920906599179510e+00 3.77425653250409567e+01
True
Python 3.9.6 / NumPy 2.0.0 / x86_64 (Rosetta emulation under Apple Silicon):
1.18920906599179332e+00 3.77425653250409567e+01
1.18920906599179332e+00 3.77425653250409567e+01
True

Some online REPL:

1.18920906599179332e+00 3.77425653250409567e+01
1.18920906599179332e+00 3.77425653250409567e+01
True
SYS VERSION 3.8.2 (default, Mar 13 2020, 10:14:16)  [GCC 9.3.0]
PLATFORM x86_64
NP VERSION 1.18.2

Reproduce the code example:

import platform
import numpy as np
import sys

def ff(x):
    return f"{x:.17e}"

x = (-0.7173752049805099+0.6966870282122177j)
y = (25.441646575927734-27.90408706665039j)

xy = x * y                                   # pure Python scalar product
print(ff(xy.real), ff(xy.imag))
xxyy = (np.array([x]) * np.array([y]))[0]    # the same product via NumPy arrays
print(ff(xxyy.real), ff(xxyy.imag))
print(xy == xxyy)

print("SYS VERSION", sys.version)
print("PLATFORM", platform.processor())
print("NP VERSION", np.version.version)

Error message:

No response

Python and NumPy Versions:

2.0.0
3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]

Runtime Environment:

[{'numpy_version': '2.0.0',
'python': '3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]',
'uname': uname_result(system='Linux', node='Luchino', release='5.15.0-112-generic', version='#122-Ubuntu SMP Thu May 23 07:48:21 UTC 2024', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'Zen',
'filepath': '/home/svalorzen/Tests/python/venv/lib/python3.10/site-packages/numpy.libs/libscipy_openblas64_-99b71e71.so',
'internal_api': 'openblas',
'num_threads': 32,
'prefix': 'libscipy_openblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.27'}]

Context for the issue:

I am porting some Python code to C++, and it is quite important that the output stays consistent. In my case, my C++ code produces the result of the "normal" Python multiplication, while the Python code uses numpy arrays and so generates the other result. This breaks my tests, alongside my sanity.

@ngoldbaum
Member

NumPy only guarantees 4 ULP precision for floating point results, so you can't expect NumPy's results to always agree with the results of another math library. That said, it looks like the result you have is 8 ulp off:

In [10]: sympy.N(x*y, 50)
Out[10]: 1.189209065991793323746605892665684223175048828125 + 37.7425653250409567363021778874099254608154296875*I
In [11]: (1.18920906599179510e+00 - sympy.re(sympy.N(x*y, 50)))/math.ulp(1.18920906599179510e+00)
Out[11]: 8.0000000000000000000000000000000000000000000000000

Ping @r-devulap: is my math right? Or is something else going on?

@danra

danra commented Jun 18, 2024

NumPy only guarantees 4 ULP precision for floating point results

Interesting, mind providing the reference?

@mhvk
Contributor

mhvk commented Jun 18, 2024

For multiplication, I would think we do better than 4 ULP, but really the relevant check one needs to do is:

# 1 ULP for the larger numbers that are involved (the components of y are ~25-28)
2**-52 * 25
# 5.551115123125783e-15
xy.real - xxyy.real   # xy and xxyy as in the reproduction code above
# np.float64(-1.7763568394002505e-15)

This is relevant since in the end one computes x.real*y.real - x.imag*y.imag, and both y.real and y.imag are fairly big.
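
A quick sketch of the magnitudes involved (using the x and y from the reproduction code):

import math

x = -0.7173752049805099 + 0.6966870282122177j
y = 25.441646575927734 - 27.90408706665039j

p1 = x.real * y.real   # roughly -18.25
p2 = x.imag * y.imag   # roughly -19.44
print(p1 - p2)                            # the real part, roughly 1.19, after heavy cancellation
print(math.ulp(p2) / math.ulp(p1 - p2))   # 16.0: one ULP of the products spans ~16 ULP of the result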

Note that I had expected to find some way to reproduce the difference by reordering operations, but so far I have failed; it does seem to have something to do with the different implementations for arrays and scalars.

@mhvk
Contributor

mhvk commented Jun 18, 2024

For reference, the implementation certainly looks bog-standard:

NPY_NO_EXPORT void
@TYPE@_multiply(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))
{
    BINARY_LOOP {
        const @ftype@ in1r = ((@ftype@ *)ip1)[0];
        const @ftype@ in1i = ((@ftype@ *)ip1)[1];
        const @ftype@ in2r = ((@ftype@ *)ip2)[0];
        const @ftype@ in2i = ((@ftype@ *)ip2)[1];
        ((@ftype@ *)op1)[0] = in1r*in2r - in1i*in2i;
        ((@ftype@ *)op1)[1] = in1r*in2i + in1i*in2r;
    }
}

But possibly I'm missing some vectorization or similar.

Scalars are done at

static inline int
@name@_ctype_multiply( @type@ a, @type@ b, @type@ *out)
{
    npy_csetreal@c@(out, npy_creal@c@(a) * npy_creal@c@(b) - npy_cimag@c@(a) * npy_cimag@c@(b));
    npy_csetimag@c@(out, npy_creal@c@(a) * npy_cimag@c@(b) + npy_cimag@c@(a) * npy_creal@c@(b));
    return 0;
}

@ngoldbaum
Member

mind providing the reference?

I'm not sure there is one; maybe there should be one somewhere. It came along with adopting the Intel SVML library for writing SIMD-optimized loops. If you search for issues containing the phrase "4 ulp" you'll see other developers discussing this over the years.

@WarrenWeckesser
Member

WarrenWeckesser commented Jun 18, 2024

Note that neither of the displayed real parts of the computed values is the true expected real value. The best value is 1.1892090659917947:

In [74]: x
Out[74]: (-0.7173752049805099+0.6966870282122177j)

In [75]: y
Out[75]: (25.441646575927734-27.90408706665039j)

In [76]: from mpmath import mp

In [77]: mp.dps = 200

In [78]: xmp = mp.mpc(x)

In [79]: ymp = mp.mpc(y)

In [80]: float((xmp*ymp).real)
Out[80]: 1.1892090659917947

The observed value 1.1892090659917933 is the one that is furthest from the ideal, by 6 ULP:

In [82]: (t - 1.1892090659917933)/np.spacing(t)
Out[82]: 6.0
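
Here t is presumably float((xmp*ymp).real) from In [80] above. For reference, a self-contained version of the same check (a sketch, assuming mpmath and NumPy are installed):

import numpy as np
from mpmath import mp

mp.dps = 200
x = -0.7173752049805099 + 0.6966870282122177j
y = 25.441646575927734 - 27.90408706665039j

t = float((mp.mpc(x) * mp.mpc(y)).real)   # closest double to the true real part
for candidate in (1.1892090659917933, 1.1892090659917951):   # the two observed results
    print(candidate, (t - candidate) / np.spacing(t), "ULP")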

@Svalorzen
Author

Svalorzen commented Jun 18, 2024

In case it helps, the 1.1892090659917933 result is what I get on my local machine with C++ using Eigen, so at least for that value there's another platform that more or less agrees with it.

In addition, I've been told that xxxyyy = np.cdouble(x)*np.cdouble(y) might also go one way or the other. On my machine I get the ...933 result; on a friend's machine he gets the ...951 result (we both get ...933 with the "pure" Python product). Again, I don't know whether this information helps narrow anything down.
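
For reference, the three code paths being compared can be put side by side (a sketch; which result each path produces may differ per machine, depending on which loop gets dispatched):

import numpy as np

x = -0.7173752049805099 + 0.6966870282122177j
y = 25.441646575927734 - 27.90408706665039j

print(f"{(x * y).real:.17e}")                             # pure Python complex
print(f"{(np.cdouble(x) * np.cdouble(y)).real:.17e}")     # NumPy scalar path
print(f"{(np.array([x]) * np.array([y]))[0].real:.17e}")  # NumPy array path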

@seberg
Member

seberg commented Jun 18, 2024

The 4 ULP SVML thing is partially a thing of the past, unless I am confused: we did run into issues with sin/cos and eventually switched to the higher precision version since it wasn't that much slower anyway.
(I hope we can enable a fast-but-less-precise mode at some point to allow such optimizations more cleanly.)

In any case, the true number of ULPs depends on the system math libraries, the functions, etc., so some functions may well have less than 4 ULPs of error!
For complex multiplication we compose operations and have a subtraction in there, so a higher ULP error shouldn't be too surprising, I think.

FWIW, we do have SIMD implementations for complex multiplication in numpy/numpy/_core/src/umath/loops_arithm_fp.dispatch.c.src, but I think the only difference will be the use of fused multiply-add (or similar) instructions, which should increase precision if anything.

So, given that (as Warren said) the less precise result is the scalar one that has existed forever, I don't think there is anything to do here.
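
One rough way to see how a fused multiply-add could shift the real part is to emulate a single-rounding FMA with exact Fraction arithmetic; this is only a sketch, and whether the dispatch loop actually contracts the expression this way is an assumption:

from fractions import Fraction

def fma(a, b, c):
    # exact product and sum, then a single rounding back to float, like a hardware FMA
    return float(Fraction(a) * Fraction(b) + Fraction(c))

x = -0.7173752049805099 + 0.6966870282122177j
y = 25.441646575927734 - 27.90408706665039j

plain_real = x.real * y.real - x.imag * y.imag        # textbook formula, two roundings
fused_real = fma(x.real, y.real, -(x.imag * y.imag))  # one possible FMA contraction
print(f"{plain_real:.17e}")
print(f"{fused_real:.17e}")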

@seberg added the 33 - Question label and removed the 00 - Bug label on Jun 18, 2024
@seberg
Member

seberg commented Jun 18, 2024

I don't mind if we try to use the SIMD fused ops for scalars as well, but we won't get to a point where results between Python and NumPy, or across different platforms, are identical that way.

@mhvk
Contributor

mhvk commented Jun 19, 2024

Interesting that the SIMD one is a bit more precise. Agree that we cannot do much here - given the numbers involved, this is a 1 ULP difference.

@rgommers
Member

The 4 ULP SVML thing is partially a thing of the past, unless I am confused: we did run into issues with sin/cos and eventually switched to the higher precision version since it wasn't that much slower anyway.

Agreed, this is correct. We decided that 4 ULP is not acceptable by default, and switched back. Also, 4 ULP was only ever the extreme of a distribution of errors, not a regularly expected difference.

Looks like we can close this?

@rgommers changed the title from "BUG: Complex multiplication returns different results if wrapped in np array" to "Complex multiplication returns different results if wrapped in np array" on Jun 19, 2024
@ngoldbaum
Member

ngoldbaum commented Jun 19, 2024

Thanks for clarifying the math and apologies for my misremembering our policy.
