BUG: array negation (i.e. minus sign) is wrong on 32 bit OS (likely related to SIMD on non memory-aligned array) #26775

Open
lesteve opened this issue Jun 21, 2024 · 1 comment
Labels
00 - Bug component: SIMD Issues in SIMD (fast instruction sets) code or machinery

@lesteve
Contributor

lesteve commented Jun 21, 2024

Describe the issue:

This was originally noticed in scikit-learn and reported in scikit-learn/scikit-learn#27506.

According to scikit-learn/scikit-learn#27506 (comment), this was noticed when Debian moved to numpy 1.26, which introduced SIMD; in addition, the array is not on a 64-bit boundary.

I don't know much about the SIMD intricacies, but ideally it would be nice to understand whether the scikit-learn Cython code is to blame for not creating the array on a 64-bit boundary (should it be 64-bit, by the way, on a 32-bit OS?) or whether numpy should be able to deal with unaligned arrays better ...

From scikit-learn/scikit-learn#27506 (comment) here is a way to reproduce. I am guessing there is a way to reproduce without scikit-learn, for example with a memmap and an offset (we had unaligned memory array issues in joblib a while ago where I learned this the hard way, see joblib/joblib#563 if you are really curious).
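
A scikit-learn-free reproducer along the lines guessed above might look like the following. This is only a sketch and has not been run on the affected 32-bit build: `make_offset_f8` is a hypothetical helper name, and the assumption is that a float64 array whose data pointer sits at 4 modulo 8 is enough to trigger the problem. It uses a plain in-memory buffer rather than a memmap.

```python
import numpy as np


def make_offset_f8(values, misalign=4):
    """Hypothetical helper: copy `values` into float64 storage whose
    data pointer is `misalign` modulo 8 (4 means 4-byte but not 8-byte aligned)."""
    values = np.asarray(values, dtype=np.float64).ravel()
    nbytes = values.size * 8
    raw = np.zeros(nbytes + 8, dtype=np.uint8)  # 8 spare bytes to slide within
    start = (misalign - raw.ctypes.data) % 8    # byte offset giving the wanted address
    out = raw[start:start + nbytes].view(np.float64)
    out[:] = values
    return out


impurity = make_offset_f8([0.4691358, 0.0, 0.22222222, 0.0, 0.0])
print("address % 8  :", impurity.ctypes.data % 8)
print("flags.aligned:", impurity.flags.aligned)
print("-impurity    :", -impurity)
print("expected     :", -impurity.copy())  # negation of a freshly allocated copy
```

If the last two lines differ, the build is affected. On a 64-bit build they should agree.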

Reproduce the code example:

docker command

docker build --progress plain --platform i386 .

Dockerfile

FROM docker.io/debian:sid-slim

RUN apt-get update && apt-get install -y --no-install-recommends python3-sklearn
RUN python3 -c '\
from sklearn.tree import DecisionTreeClassifier; \
\
X = [[-2, -1], [-1, -1], [-1, -2], [1, 1], [1, 2], [2, 1]]; \
y2 = [[-1, 1], [-1, 1], [-1, 1], [1, 2], [1, 2], [1, 3]]; \
w = [1, 1, 1, 0.5, 0.5, 0.5]; \
\
clf = DecisionTreeClassifier(max_depth=2, min_samples_split=2, criterion="gini", random_state=2); \
clf = clf.fit(X, y2, sample_weight=w); \
impurity = clf.tree_.impurity; \
print("impurity :", impurity); \
print("-impurity:", -impurity); \
'

Output

impurity : [0.4691358  0.         0.22222222 0.         0.        ]
-impurity: [-4.69135802e-001 -1.59149684e-314 -1.50000000e+000 -2.12199579e-314 nan]

You can tell from the second line that the result is garbage; in particular it contains a NaN (which, I think, is what actually allowed us to notice this in scikit-learn).

Python and NumPy Versions:

1.26.4
3.11.9 (main, Apr 10 2024, 13:16:36) [GCC 13.2.0]

Runtime Environment:

[{'numpy_version': '1.26.4',
  'python': '3.11.9 (main, Apr 10 2024, 13:16:36) [GCC 13.2.0]',
  'uname': uname_result(system='Linux', node='buildkitsandbox', release='6.9.5-arch1-1', version='#1 SMP PREEMPT_DYNAMIC Sun, 16 Jun 2024 19:06:37 +0000', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2'],
                     'found': ['SSE3',
                                'SSSE3',
                                'SSE41',
                                'POPCNT',
                                'SSE42',
                                'AVX',
                                'F16C',
                                'FMA3',
                                'AVX2'],
                      'not_found': ['AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_KNM',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL',
                                    'AVX512_SPR']}}]

Context for the issue:

I haven't taken the time to put together a stand-alone snippet using only numpy; let me know if that would help and I can try to do this. I am also guessing that this reproduces on NumPy main, but I haven't tried ...

@ngoldbaum ngoldbaum added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label Jun 21, 2024
@seberg
Member

seberg commented Jun 27, 2024

This also reproduces with NumPy alone (presumably the slicing would not be needed if we used f8,S4 as the dtype):

FROM docker.io/debian:sid-slim

RUN apt-get clean && apt-get update && apt-get install -y --no-install-recommends python3-numpy
RUN python3 -c '\
import numpy as np;\
\
impurities = np.random.random(5*4).astype("f8,S3")["f0"][::4]; \
print("f8 alignment:", np.dtype("f8").alignment); \
print(impurities, impurities.flags); \
print(-impurities); \
'

This shows the problem: on the 32-bit system the compiler sets the alignment of f8 to 4, and that is all NumPy guarantees for the inner loop. The SIMD loop, however, seems to assume that NumPy guarantees an alignment of 8 here.
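
To see which alignment a given build guarantees, the relevant numbers can be printed directly. This is a hedged sketch: it fills a packed `f8,S3` record array field by field instead of using `astype`, and the field name `f0` is just NumPy's default for the first field. The 44-byte stride is a multiple of 4 but not of 8, so the view should count as aligned only where `f8` alignment is 4.

```python
import numpy as np

records = np.zeros(20, dtype="f8,S3")   # packed layout: 8 + 3 = 11 bytes per record
records["f0"] = np.arange(20) / 20.0
impurities = records["f0"][::4]          # every 4th record: stride is 4 * 11 = 44 bytes

stride = impurities.strides[0]
print("f8 alignment reported by NumPy:", np.dtype("f8").alignment)
print("stride:", stride, "| stride % 4 =", stride % 4, "| stride % 8 =", stride % 8)
print("flags.aligned:", impurities.flags.aligned)

# Compare against the negation of a contiguous copy of the same values.
expected = -np.ascontiguousarray(impurities)
print("negation matches a contiguous copy:", np.array_equal(-impurities, expected))
```

If the analysis above holds, the affected 32-bit build would print an `f8` alignment of 4 and `flags.aligned` as true, and the last line would be `False`.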

@seberg seberg added this to the 2.0.1 release milestone Jun 27, 2024