This was originally noticed in scikit-learn and reported in scikit-learn/scikit-learn#27506. According to scikit-learn/scikit-learn#27506 (comment), it surfaced when Debian started shipping numpy 1.26, which introduced SIMD, and the array involved is not on a 64-bit boundary.
I don't know too much about the SIMD intricacies, but in an ideal world it would be nice to understand whether scikit-learn Cython code is to blame for not creating an array on a 64-bit boundary (should it be 64-bit on a 32-bit OS, by the way?) or whether numpy should be able to deal with unaligned arrays better ...
From scikit-learn/scikit-learn#27506 (comment), here is a way to reproduce. I am guessing there is also a way to reproduce without scikit-learn, for example with a memmap and an offset (we had unaligned-memory array issues in joblib a while ago, where I learned this the hard way; see joblib/joblib#563 if you are really curious).
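A rough sketch of what a numpy-only reproduction attempt could look like. It uses an in-memory buffer with a byte offset rather than a memmap, and the helper `offset_float64_array` is a hypothetical name, not anything from numpy or scikit-learn. It assumes the underlying buffer starts on an 8-byte boundary, which numpy does not promise, and it has not been confirmed to trigger the wrong result on the affected 32-bit setup.

```python
import numpy as np

def offset_float64_array(values, byte_offset=4):
    """Copy `values` into a fresh buffer whose data starts `byte_offset` bytes in.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=np.float64)
    raw = np.zeros(values.nbytes + byte_offset, dtype=np.uint8)
    shifted = np.ndarray(values.shape, dtype=np.float64,
                         buffer=raw, offset=byte_offset)
    shifted[...] = values
    return shifted

values = np.linspace(0.1, 1.0, 10)
shifted = offset_float64_array(values)

address = shifted.__array_interface__["data"][0]
print("f8 alignment:", np.dtype("f8").alignment)
print("address % 8:", address % 8, "aligned flag:", shifted.flags.aligned)
print(-shifted)   # result computed on the offset array
print(-values)    # reference computed on a regular array
```

On an unaffected platform the last two lines should print the same values; a mismatch or a NaN in the first of them would point at the same problem.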
You can tell from the second line that the result is garbage; in particular it contains a NaN (which, I think, is what actually allowed us to notice the problem in scikit-learn).
I haven't taken the time to put together a stand-alone snippet with only numpy; let me know if that would help and I can try to do this. I am also guessing that this reproduces on numpy main, but I haven't tried ...
lesteve changed the title from "BUG: array negation is wrong on 32 bit OS (likely related to SIMD on non memory unaligned array)" to "BUG: array negation is wrong on 32 bit OS (likely related to SIMD on non memory-aligned array)" on Jun 21, 2024
lesteve changed the title from "BUG: array negation is wrong on 32 bit OS (likely related to SIMD on non memory-aligned array)" to "BUG: array negation (with minus sign) is wrong on 32 bit OS (likely related to SIMD on non memory-aligned array)" on Jun 21, 2024
lesteve changed the title from "BUG: array negation (with minus sign) is wrong on 32 bit OS (likely related to SIMD on non memory-aligned array)" to "BUG: array negation (i.e. minus sign) is wrong on 32 bit OS (likely related to SIMD on non memory-aligned array)" on Jun 21, 2024
Also reproducible with the following (presumably the slicing would not be needed if we used f8,S4 as the dtype):
FROM docker.io/debian:sid-slim
RUN apt-get clean && apt-get update && apt-get install -y --no-install-recommends python3-numpy
RUN python3 -c '\
import numpy as np;\
\
impurities = np.random.random(5*4).astype("f8,S3")["f0"][::4]; \
print("f8 alignment:", np.dtype("f8").alignment); \
print(impurities, impurities.flags); \
print(-impurities); \
'
This shows the problem: on the 32-bit system the compiler sets the alignment of f8 to 4, and that is what NumPy guarantees for the inner loop. But the SIMD loop seems to assume that NumPy guarantees an alignment of 8 here.
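To make the 4-versus-8 point concrete, here is a small sketch that prints where each f8 element of such an array actually lives in memory. The helper `describe` is a hypothetical name used only for illustration, and the f8,S4 case is included on the assumption stated above that it behaves like the sliced f8,S3 case. It runs on any platform; only the reported alignment and the aligned flag are expected to differ between 32-bit and 64-bit builds.

```python
import numpy as np

def describe(name, arr):
    """Print the stride and per-element address remainders of a 1-D array.

    Hypothetical helper for illustration only.
    """
    base = arr.__array_interface__["data"][0]
    addresses = [base + i * arr.strides[0] for i in range(arr.shape[0])]
    print(name)
    print("  stride in bytes:", arr.strides[0])
    print("  address % 4:", [a % 4 for a in addresses])
    print("  address % 8:", [a % 8 for a in addresses])
    print("  aligned flag:", arr.flags.aligned)
    return addresses

print("f8 alignment:", np.dtype("f8").alignment)

# Same layout as in the Dockerfile: 11-byte records, every 4th one taken.
sliced = np.zeros(5 * 4, dtype="f8,S3")["f0"][::4]
sliced_addresses = describe("f8,S3 with [::4]", sliced)

# 12-byte records, no slicing.
unsliced = np.zeros(5, dtype="f8,S4")["f0"]
unsliced_addresses = describe("f8,S4 without slicing", unsliced)
```

In both layouts the stride is a multiple of 4 but not of 8, so consecutive elements cannot all sit on 8-byte boundaries even though every one of them can satisfy an alignment of 4.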