Explore NumPy 2 compatibility #742

Closed
jakirkham opened this issue Jun 25, 2024 · 23 comments · Fixed by #764

@jakirkham
Member

jakirkham commented Jun 25, 2024

Would be good to explore NumPy 2 compatibility in cuCIM and see what changes are needed to support NumPy 1 & 2.
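
One common way to carry both major versions in a single code base is a version flag that gates the few places where behavior differs. This is a minimal sketch and assumes nothing about how cuCIM actually handles it; the name NUMPY_GE_2 is hypothetical.

```python
import numpy as np

# Hypothetical flag (not taken from cuCIM): True when running on NumPy 2.x or later.
NUMPY_GE_2 = int(np.__version__.split(".")[0]) >= 2

if NUMPY_GE_2:
    print("NumPy", np.__version__, "uses NEP 50 promotion rules")
else:
    print("NumPy", np.__version__, "uses legacy value-based promotion")
```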

@jakirkham jakirkham added this to the v24.08.00 milestone Jun 25, 2024
@jakirkham
Member Author

cc @grlee77 (for vis)

@grlee77
Contributor

grlee77 commented Jul 9, 2024

Current issues running cuCIM with NumPy 2.0 and CuPy 13.2

I ran the cuCIM test suite locally with NumPy 2.0 and CuPy 13.2. The remaining compatibility issues encountered are documented here:

1.) Use of cupy.full seems broken due to its use of numpy.can_cast. Here is an example traceback from a function that calls it:

measure/_moments.py:423: in moments_normalized
    nu = cp.full(mu.shape, cp.nan, dtype=mu.dtype)
/home/grelee/mambaforge/envs/np2/lib/python3.10/site-packages/cupy/_creation/basic.py:325: in full
    cupy.copyto(a, fill_value, casting='unsafe')
/home/grelee/mambaforge/envs/np2/lib/python3.10/site-packages/cupy/_manipulation/basic.py:38: in copyto
    can_cast = numpy.can_cast(src, dst.dtype, casting)

Workaround: modify this line to always set can_cast = True (this works for purposes of cuCIM, but isn't the right general solution)
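
For context, NumPy 2 adopted NEP 50, and numpy.can_cast now raises a TypeError when given a Python int, float, or complex. The sketch below shows one way to avoid that error. It is an illustration only and not the change made in CuPy. The helper name can_cast_compat is hypothetical, and judging a Python scalar by its default NumPy dtype is an assumption.

```python
import numpy as np


def can_cast_compat(src, dst_dtype, casting="safe"):
    """Hypothetical helper: can_cast that tolerates Python scalars on NumPy 1 and 2."""
    if isinstance(src, (bool, int, float, complex)):
        if casting == "unsafe":
            # 'unsafe' permits any conversion, so no value inspection is needed.
            return True
        # Assumption: judge a Python scalar by the dtype NumPy would give it.
        src = np.asarray(src).dtype
    return bool(np.can_cast(src, dst_dtype, casting))


# Mirrors the cp.full(..., cp.nan, dtype=...) case from the traceback above.
print(can_cast_compat(float("nan"), np.float32, casting="unsafe"))
```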

2.) use of cp.random.seed(1234) with an integer seed also fails due to numpy.can_cast

metrics/tests/test_structural_similarity.py:19: in <module>
    cp.random.seed(1234)
/home/grelee/mambaforge/envs/np2/lib/python3.10/site-packages/cupy/random/_generator.py:1274: in seed
    get_random_state().seed(seed)
/home/grelee/mambaforge/envs/np2/lib/python3.10/site-packages/cupy/random/_generator.py:813: in seed
    numpy.asarray(seed).astype(numpy.uint64, casting='safe'))
E   TypeError: Cannot cast scalar from dtype('int64') to dtype('uint64') according to the rule 'safe'

Workaround: remove the casting='safe' kwarg passed to astype (this works for purposes of cuCIM, but isn't the right general solution)
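
The TypeError arises because, under NumPy 2, the 'safe' rule is decided from dtypes alone, and int64 to uint64 is not a safe cast regardless of the value. One explicit alternative is to validate the seed and then convert it. The sketch below is illustrative only and is not the fix in the CuPy PR; seed_to_uint64 is a hypothetical name.

```python
import numpy as np


def seed_to_uint64(seed):
    """Hypothetical helper: validate an integer seed, then convert it to uint64."""
    arr = np.asarray(seed)
    if arr.dtype.kind not in "iu":
        raise TypeError(f"seed must be an integer, got dtype {arr.dtype}")
    if arr.dtype.kind == "i" and (arr < 0).any():
        raise ValueError("seed must be non-negative")
    # The value was checked above, so the default (unsafe) cast cannot wrap around.
    return arr.astype(np.uint64)


print(seed_to_uint64(1234))
```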

3.) output array dtype test failure in 3 out of 4 test_chan_vese_extended_output cases

segmentation/tests/test_chan_vese.py::test_chan_vese_extended_output[uint8] FAILED
>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>
segmentation/tests/test_chan_vese.py:48: in test_chan_vese_extended_output
    assert result[1].dtype == float_dtype
E   AssertionError: assert dtype('float64') == <class 'numpy.float32'>
E    +  where dtype('float64') 

This can be fixed by applying the following patch to cuCIM

diff --git a/python/cucim/src/cucim/skimage/segmentation/_chan_vese.py b/python/cucim/src/cucim/skimage/segmentation/_chan_vese.py
index 789518cb..edbbc2ae 100644
--- a/python/cucim/src/cucim/skimage/segmentation/_chan_vese.py
+++ b/python/cucim/src/cucim/skimage/segmentation/_chan_vese.py
@@ -429,6 +429,7 @@ def chan_vese(
         energies = []
     phivar = tol + 1
 
+    dt = cp.asarray(dt, dtype=float_dtype)
     while phivar > tol and i < max_num_iter:
         # Save old level set values
         oldphi = phi
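
A plausible reading of why the patch helps (an assumption here, since the thread does not spell it out): under NumPy 2 promotion rules a float64 scalar is no longer downcast based on its value, so mixing it with a float32 array yields float64. Converting dt to the working dtype up front keeps the result in float32 on both NumPy 1.x and 2.x. The sketch uses NumPy on the CPU as a stand-in for CuPy, and phi is made-up data.

```python
import numpy as np

float_dtype = np.float32
phi = np.ones((4, 4), dtype=float_dtype)  # stand-in for the level set array

# A float64 scalar step size: float32 result on NumPy 1.x, float64 on NumPy 2.x.
dt_scalar = np.float64(0.5)
print((phi * dt_scalar).dtype)

# Same conversion as the patch: a 0-d array already in the working dtype.
dt = np.asarray(0.5, dtype=float_dtype)
print((phi * dt).dtype)  # float32 on both NumPy 1.x and 2.x
```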

4.) One tolerance test failure in stain normalization

tests/unit/core/test_stain_normalizer.py:140: in test_result_value
    cp.testing.assert_allclose(result, expected)
/home/grelee/mambaforge/envs/np2/lib/python3.10/site-packages/cupy/testing/_array.py:24: in assert_allclose
    numpy.testing.assert_allclose(
/home/grelee/mambaforge/envs/np2/lib/python3.10/contextlib.py:79: in inner
    return func(*args, **kwds)
E   AssertionError: 
E   Not equal to tolerance rtol=1e-07, atol=0
E   
E   Mismatched elements: 1 / 6 (16.7%)
E   Max absolute difference among violations: 1.09072533e-07
E   Max relative difference among violations: 5.78286203e-07
E    ACTUAL: array([[0.707107, 0.188614],
E          [0.      , 0.      ],
E          [0.707107, 0.982051]], dtype=float32)
E    DESIRED: array([[0.707107, 0.188613],
E          [0.      , 0.      ],
E          [0.707107, 0.982051]])

Workaround: will pass if the relative tolerance is bumped to rtol=1e-6 with this patch

diff --git a/python/cucim/tests/unit/core/test_stain_normalizer.py b/python/cucim/tests/unit/core/test_stain_normalizer.py
index aa969c52..34a44384 100644
--- a/python/cucim/tests/unit/core/test_stain_normalizer.py
+++ b/python/cucim/tests/unit/core/test_stain_normalizer.py
@@ -137,7 +137,7 @@ class TestStainExtractorMacenko:
                 stain_extraction_pca(image)
         else:
             result = stain_extraction_pca(image)
-            cp.testing.assert_allclose(result, expected)
+            cp.testing.assert_allclose(result, expected, rtol=1e-6)
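
The default rtol=1e-07 is tighter than float32 can guarantee, since float32 machine epsilon is about 1.19e-07, so a last-digit difference between a float32 result and a float64 reference can fail the check. The sketch below reproduces the size of the reported mismatch using made-up values chosen to approximate the reported absolute difference; it is not the actual test data.

```python
import numpy as np

print(np.finfo(np.float32).eps)  # float32 machine epsilon

# Made-up values: the gap approximates the reported absolute difference.
desired = np.array([0.1886134])
actual = desired + 1.09e-7


def passes(rtol):
    """Return True when assert_allclose accepts the pair at this rtol."""
    try:
        np.testing.assert_allclose(actual, desired, rtol=rtol)
    except AssertionError:
        return False
    return True


print(passes(1e-7), passes(1e-6))
```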

@jakirkham
Member Author

Thanks Greg! 🙏

cc @seberg (for awareness)

@grlee77
Contributor

grlee77 commented Jul 9, 2024

Looks like the first issue above (cupy.full) is resolved by
cupy/cupy#8408

and the second (cp.random.seed) seems to be in progress in
cupy/cupy#8407

The last two can be resolved within cuCIM itself

@jakirkham
Member Author

Thanks Greg! 🙏

Tweaked the second link in your comment to reference the relevant PR

rapids-bot bot pushed a commit that referenced this issue Jul 11, 2024
This MR resolves issues 3 and 4 reported [here](#742 (comment)) when testing with NumPy 2.0 and CuPy 13.2 locally (issues 1 and 2 will be addressed in CuPy itself)

For now, leave NumPy pinning as-is. Tests should continue to pass with NumPy 1.x

Authors:
  - Gregory Lee (https://github.com/grlee77)

Approvers:
  - https://github.com/jakirkham

URL: #746
@jakirkham
Member Author

Is there anything left to do on this one? Or is it just waiting for a new CuPy release with the referenced fixes?

@grlee77
Contributor

grlee77 commented Jul 16, 2024

Is there anything left to do on this one? Or is it just waiting for a new CuPy release with the referenced fixes?

I still need to make one small PR here with the fixes suggested in the patches for items 3 and 4 in #742 (comment). That will be fine to do for 24.08 as it will also be compatible with NumPy 1.x

@jakirkham
Member Author

Ah ok. Thought these were resolved in PR ( #746 ). What else is still needed to address 3 & 4?

@grlee77
Contributor

grlee77 commented Jul 17, 2024

Sorry, I forgot that I opened #746 before going on vacation. Everything on our end should be addressed then!

@jakirkham
Member Author

No worries. Great! 🎉

Will leave open to remind us to retest once the CuPy fixes are out

@jakirkham
Member Author

Would it be possible to retest with the CuPy 13.3.0 dev Conda packages in this PR: conda-forge/cupy-feedstock#272 ?

@seberg
Contributor

seberg commented Aug 1, 2024

I am running the 24.10 nightlies with NumPy 2.0.1 (pip) and CuPy 13.3.0dev30 from the feedstock. (I hit one issue during collection with sphinx installed, but it is an unrelated warning that may need suppressing.) I also had to update scikit-image with pip install -U scikit-image.

All looks fine so far (at 20%). I assume it will finish successfully, and I will post if there is a problem. Just posting early to avoid possible duplicate work.

@seberg
Contributor

seberg commented Aug 1, 2024

OK, I did run into an issue, but I don't think it is related:

FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_flat_level_set[float32] - cupy.cuda.compiler.CompileException: /tmp/tmpqk2f_4k5/3cbff8c95978e1a1c3f54e1162841b90dca1429a.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_flat_level_set[float64] - cupy.cuda.compiler.CompileException: /tmp/tmpxbs0ybzd/80dc19d2b985054e167877a3e6a51322b6c7f32d.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_small_disk_level_set - cupy.cuda.compiler.CompileException: /tmp/tmp6p3px9l8/80dc19d2b985054e167877a3e6a51322b6c7f32d.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_simple_shape - cupy.cuda.compiler.CompileException: /tmp/tmpc1lyovlz/80dc19d2b985054e167877a3e6a51322b6c7f32d.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_extended_output[uint8] - cupy.cuda.compiler.CompileException: /tmp/tmp6hu4sono/3cbff8c95978e1a1c3f54e1162841b90dca1429a.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_extended_output[float16] - cupy.cuda.compiler.CompileException: /tmp/tmpa2re68so/3cbff8c95978e1a1c3f54e1162841b90dca1429a.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_extended_output[float32] - cupy.cuda.compiler.CompileException: /tmp/tmpir8vpjiz/3cbff8c95978e1a1c3f54e1162841b90dca1429a.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_extended_output[float64] - cupy.cuda.compiler.CompileException: /tmp/tmpup7m9iz2/80dc19d2b985054e167877a3e6a51322b6c7f32d.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_remove_noise - cupy.cuda.compiler.CompileException: /tmp/tmp0h58gkt_/80dc19d2b985054e167877a3e6a51322b6c7f32d.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_gap_closing - cupy.cuda.compiler.CompileException: /tmp/tmp7b44nftg/80dc19d2b985054e167877a3e6a51322b6c7f32d.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"
FAILED src/cucim/skimage/segmentation/tests/test_chan_vese.py::test_chan_vese_blank_image - cupy.cuda.compiler.CompileException: /tmp/tmppwne918q/80dc19d2b985054e167877a3e6a51322b6c7f32d.cubin.cu(5): catastrophic error: cannot open source file "cooperative_groups.h"

@jakirkham
Member Author

Thanks Sebastian! 🙏

Just to confirm, that is using the 24.08 nightlies of cuCIM?

Asking as I would like to confirm PR ( #746 ), which may have a relevant fix, is included

@seberg
Contributor

seberg commented Aug 1, 2024

This was with the 24.10.00a2 nightlies, so yeah should be included.

@jakirkham
Member Author

Ok thanks! 🙏

Could you please try installing cuda-cudart-dev and rerunning those tests?

@seberg
Contributor

seberg commented Aug 1, 2024

Already have it installed, but 13.2 fails the same way in these tests (plus the extra failures).
One thing I did was that I haphazardly changed the channel order in the cucim env.yaml to prefer the nightly one, dunno if that might cause this (or some other cuda version incompatibility?).

@grlee77
Contributor

grlee77 commented Aug 2, 2024

Thanks for making the CuPy 13.3.0 dev packages @seberg. I tried in a Python 3.10 environment with NumPy 2.0.1 and all tests were passing.

I did not use any env.yaml file; I just manually ran mamba install numpy matplotlib scipy scikit-image pytest, etc.

For cuCIM itself I was on a 24.08 development branch and just installed an in-place build of the Python libs from the python/cucim folder using

pip install rapids-build-backend
pip install -e . -v --no-build-isolation --no-deps

This is the output of mamba list for me:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
alsa-lib                  1.2.12               h4ab18f5_0    conda-forge
aom                       3.9.1                hac33072_0    conda-forge
asttokens                 2.4.1              pyhd8ed1ab_0    conda-forge
attr                      2.5.1                h166bdaf_1    conda-forge
attrs                     23.2.0                   pypi_0    pypi
blosc                     1.21.6               hef167b5_0    conda-forge
brotli                    1.1.0                hd590300_1    conda-forge
brotli-bin                1.1.0                hd590300_1    conda-forge
brunsli                   0.1                  h9c3ff4c_0    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-blosc2                  2.15.1               hc57e6cf_0    conda-forge
ca-certificates           2024.7.4             hbcca054_0    conda-forge
cairo                     1.18.0               hbb29018_2    conda-forge
certifi                   2024.7.4           pyhd8ed1ab_0    conda-forge
charls                    2.4.2                h59595ed_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
contourpy                 1.2.1           py310hd41b1e2_0    conda-forge
coverage                  7.6.0           py310h5b4e0ec_0    conda-forge
cucim-cu12                24.8.0                   pypi_0    pypi
cuda-nvrtc                12.5.82              he02047a_0    conda-forge
cuda-version              12.5                 hd4f0392_3    conda-forge
cupy                      13.3.0dev30     py310h7421b7d_0    file:///home/grelee/src/public/cupy/conda_artifacts/build_artifacts
cupy-core                 13.3.0dev30     py310hfb24ac1_0    file:///home/grelee/src/public/cupy/conda_artifacts/build_artifacts
cycler                    0.12.1             pyhd8ed1ab_0    conda-forge
dav1d                     1.2.1                hd590300_0    conda-forge
dbus                      1.13.6               h5008d03_3    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.2              pyhd8ed1ab_0    conda-forge
execnet                   2.1.1              pyhd8ed1ab_0    conda-forge
executing                 2.0.1              pyhd8ed1ab_0    conda-forge
expat                     2.6.2                h59595ed_0    conda-forge
fastrlock                 0.8.2           py310hc6cd4ac_2    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_2    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.53.1          py310h5b4e0ec_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
gettext                   0.22.5               h59595ed_2    conda-forge
gettext-tools             0.22.5               h59595ed_2    conda-forge
giflib                    5.2.2                hd590300_0    conda-forge
glib                      2.80.3               h8a4344b_1    conda-forge
glib-tools                2.80.3               h73ef956_1    conda-forge
gputil                    1.4.0              pyh9f0ad1d_0    conda-forge
graphite2                 1.3.13            h59595ed_1003    conda-forge
gst-plugins-base          1.24.6               hbaaba92_0    conda-forge
gstreamer                 1.24.6               haf2f30d_0    conda-forge
harfbuzz                  9.0.0                hfac3d4d_0    conda-forge
icu                       73.2                 h59595ed_0    conda-forge
imagecodecs               2024.6.1        py310h51fded0_2    conda-forge
imageio                   2.34.2             pyh12aca89_0    conda-forge
importlib-metadata        8.2.0              pyha770c72_0    conda-forge
iniconfig                 2.0.0              pyhd8ed1ab_0    conda-forge
ipython                   8.26.0             pyh707e725_0    conda-forge
jedi                      0.19.1             pyhd8ed1ab_0    conda-forge
jsonschema                4.23.0                   pypi_0    pypi
jsonschema-specifications 2023.12.1                pypi_0    pypi
jxrlib                    1.1                  hd590300_3    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5           py310hd41b1e2_1    conda-forge
krb5                      1.21.3               h659f571_0    conda-forge
lame                      3.100             h166bdaf_1003    conda-forge
lazy_loader               0.4                pyhd8ed1ab_0    conda-forge
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.40                 hf3520f5_7    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libaec                    1.1.3                h59595ed_0    conda-forge
libasprintf               0.22.5               h661eb56_2    conda-forge
libasprintf-devel         0.22.5               h661eb56_2    conda-forge
libavif16                 1.1.1                h9b56c87_0    conda-forge
libblas                   3.9.0           23_linux64_openblas    conda-forge
libbrotlicommon           1.1.0                hd590300_1    conda-forge
libbrotlidec              1.1.0                hd590300_1    conda-forge
libbrotlienc              1.1.0                hd590300_1    conda-forge
libcap                    2.69                 h0f662aa_0    conda-forge
libcblas                  3.9.0           23_linux64_openblas    conda-forge
libclang-cpp15            15.0.7          default_h127d8a8_5    conda-forge
libclang13                18.1.8          default_h9def88c_1    conda-forge
libcublas                 12.5.3.2             he02047a_0    conda-forge
libcufft                  11.2.3.61            he02047a_0    conda-forge
libcups                   2.3.3                h4637d8d_4    conda-forge
libcurand                 10.3.6.82            he02047a_0    conda-forge
libcusolver               11.6.3.83            he02047a_0    conda-forge
libcusparse               12.5.1.3             he02047a_0    conda-forge
libdeflate                1.20                 hd590300_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libevent                  2.1.12               hf998b51_1    conda-forge
libexpat                  2.6.2                h59595ed_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libflac                   1.4.3                h59595ed_0    conda-forge
libgcc-ng                 14.1.0               h77fa898_0    conda-forge
libgcrypt                 1.11.0               h4ab18f5_1    conda-forge
libgettextpo              0.22.5               h59595ed_2    conda-forge
libgettextpo-devel        0.22.5               h59595ed_2    conda-forge
libgfortran-ng            14.1.0               h69a702a_0    conda-forge
libgfortran5              14.1.0               hc5f4f2c_0    conda-forge
libglib                   2.80.3               h8a4344b_1    conda-forge
libgomp                   14.1.0               h77fa898_0    conda-forge
libgpg-error              1.50                 h4f305b6_0    conda-forge
libhwy                    1.1.0                h00ab1b0_0    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
libjxl                    0.10.3               h66b40c8_0    conda-forge
liblapack                 3.9.0           23_linux64_openblas    conda-forge
libllvm15                 15.0.7               hb3ce162_4    conda-forge
libllvm18                 18.1.8               h8b73ec9_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libnvjitlink              12.5.82              he02047a_0    conda-forge
libogg                    1.3.5                h4ab18f5_0    conda-forge
libopenblas               0.3.27          pthreads_hac2b453_1    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libpq                     16.3                 ha72fbe1_0    conda-forge
libsndfile                1.2.2                hc60ed4a_1    conda-forge
libsqlite                 3.46.0               hde9e2c9_0    conda-forge
libstdcxx-ng              14.1.0               hc0a3c3a_0    conda-forge
libsystemd0               255                  h3516f8a_1    conda-forge
libtiff                   4.6.0                h1dd3fc0_3    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libvorbis                 1.3.7                h9c3ff4c_0    conda-forge
libwebp-base              1.4.0                hd590300_0    conda-forge
libxcb                    1.16                 hd590300_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxkbcommon              1.7.0                h2c5496b_1    conda-forge
libxml2                   2.12.7               h4c95cb1_3    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
libzopfli                 1.0.3                h9c3ff4c_0    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
matplotlib                3.9.1           py310hff52083_0    conda-forge
matplotlib-base           3.9.1           py310h0b1de36_0    conda-forge
matplotlib-inline         0.1.7              pyhd8ed1ab_0    conda-forge
mpg123                    1.32.6               h59595ed_0    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
mysql-common              8.3.0                h70512c7_5    conda-forge
mysql-libs                8.3.0                ha479ceb_5    conda-forge
ncurses                   6.5                  h59595ed_0    conda-forge
networkx                  3.3                pyhd8ed1ab_1    conda-forge
nspr                      4.35                 h27087fc_0    conda-forge
nss                       3.103                h593d115_0    conda-forge
numpy                     2.0.1           py310hf9f9071_0    conda-forge
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openssl                   3.3.1                h4bc722e_2    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
parso                     0.8.4              pyhd8ed1ab_0    conda-forge
pcre2                     10.44                h0f59acf_0    conda-forge
pexpect                   4.9.0              pyhd8ed1ab_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    10.4.0          py310hebfe307_0    conda-forge
pip                       24.2               pyhd8ed1ab_0    conda-forge
pixman                    0.43.2               h59595ed_0    conda-forge
pluggy                    1.5.0              pyhd8ed1ab_0    conda-forge
ply                       3.11               pyhd8ed1ab_2    conda-forge
prompt-toolkit            3.0.47             pyha770c72_0    conda-forge
psutil                    6.0.0           py310hc51659f_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pulseaudio-client         17.0                 hb77b528_0    conda-forge
pure_eval                 0.2.3              pyhd8ed1ab_0    conda-forge
pygments                  2.18.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.1.2              pyhd8ed1ab_0    conda-forge
pyproject-toml            0.0.10                   pypi_0    pypi
pyqt                      5.15.9          py310h04931ad_5    conda-forge
pyqt5-sip                 12.12.2         py310hc6cd4ac_5    conda-forge
pytest                    7.4.4              pyhd8ed1ab_0    conda-forge
pytest-cov                5.0.0              pyhd8ed1ab_0    conda-forge
pytest-lazy-fixture       0.6.3                      py_0    conda-forge
pytest-xdist              3.6.1              pyhd8ed1ab_0    conda-forge
python                    3.10.14         hd12c33a_0_cpython    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    4_cp310    conda-forge
pywavelets                1.6.0           py310h261611a_0    conda-forge
pyyaml                    6.0.1                    pypi_0    pypi
qhull                     2020.2               h434a139_5    conda-forge
qt-main                   5.15.8              h320f8da_24    conda-forge
rapids-build-backend      0.3.2                    pypi_0    pypi
rapids-dependency-file-generator 1.14.0                   pypi_0    pypi
rav1e                     0.6.6                he8a937b_2    conda-forge
readline                  8.2                  h8228510_1    conda-forge
referencing               0.35.1                   pypi_0    pypi
rpds-py                   0.19.1                   pypi_0    pypi
scikit-image              0.24.0          py310hf9f9076_1    conda-forge
scipy                     1.14.0          py310h93e2701_1    conda-forge
setuptools                71.0.4             pyhd8ed1ab_0    conda-forge
sip                       6.7.12          py310hc6cd4ac_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
snappy                    1.2.1                ha2e4443_0    conda-forge
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
svt-av1                   2.1.2                hac33072_0    conda-forge
tifffile                  2024.7.24          pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
tomlkit                   0.13.0                   pypi_0    pypi
tornado                   6.4.1           py310hc51659f_0    conda-forge
traitlets                 5.14.3             pyhd8ed1ab_0    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
tzdata                    2024a                h0c530f3_0    conda-forge
unicodedata2              15.1.0          py310h2372a71_0    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
wheel                     0.43.0             pyhd8ed1ab_1    conda-forge
xcb-util                  0.4.1                hb711507_2    conda-forge
xcb-util-image            0.4.0                hb711507_2    conda-forge
xcb-util-keysyms          0.4.1                hb711507_0    conda-forge
xcb-util-renderutil       0.3.10               hb711507_0    conda-forge
xcb-util-wm               0.4.2                hb711507_0    conda-forge
xkeyboard-config          2.42                 h4ab18f5_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.1.1                hd590300_0    conda-forge
xorg-libsm                1.2.4                h7391055_0    conda-forge
xorg-libx11               1.8.9                hb711507_1    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxrender           0.9.11               hd590300_0    conda-forge
xorg-libxxf86vm           1.1.5                h4bc722e_1    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xf86vidmodeproto     2.3.1             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
zfp                       1.0.1                hac33072_1    conda-forge
zipp                      3.19.2             pyhd8ed1ab_0    conda-forge
zlib                      1.3.1                h4ab18f5_1    conda-forge
zlib-ng                   2.2.1                he02047a_0    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge

@jakirkham
Member Author

Thanks Sebastian and Greg! 🙏

Regarding cooperative_groups.h, this is a header that should be in cuda-cudart-dev. Not sure why that doesn't get picked up. This could be a configuration bug, packaging bug, or perhaps some upstream issue. In any event agree this is unrelated. If we are able to reproduce in a fresh environment, maybe we can open a new cupy-feedstock issue?

In any event, sounds like things are working. We just need a CuPy 13.3.0 release. So will close this out

@grlee77
Contributor

grlee77 commented Aug 20, 2024

I found a single test failure on cuCIM with current CuPy 13.3 dev and NumPy 2.0.1. It occurs in kernel fusion code and a simplified pure CuPy reproducer is this:

import cupy as cp

delta_phi = cp.full((10, 10), 9.999999e-7, dtype=cp.float32)
dt = cp.asarray(0.5, dtype=cp.float32)
mu = cp.asarray(0.1, dtype=cp.float32)

@cp.fuse()
def _fused_kernel(dt, mu, delta_phi):
    return 1 + mu * dt * delta_phi

_fused_kernel(dt, mu, delta_phi)

which gives traceback

TypeError                                 Traceback (most recent call last)
File cupy/_core/fusion.pyx:891, in cupy._core.fusion.Fusion.__call__()

File cupy/_core/fusion.pyx:732, in cupy._core.fusion._FusionHistory.get_fusion()

Cell In[30], line 9, in _fused_kernel(dt, mu, delta_phi)
      7 @cp.fuse()
      8 def _fused_kernel(dt, mu, delta_phi):
----> 9     return 1 + mu * dt * delta_phi

File cupy/_core/fusion.pyx:198, in cupy._core.fusion._FusionVarScalar.__radd__()

File cupy/_core/_kernel.pyx:1241, in cupy._core._kernel.ufunc.__call__()

File cupy/_core/_fusion_thread_local.pyx:35, in cupy._core._fusion_thread_local.call_ufunc()

File cupy/_core/fusion.pyx:949, in cupy._core.fusion._call_ufunc()

File cupy/_core/fusion.pyx:605, in cupy._core.fusion._FusionHistory.call_ufunc()

File cupy/_core/fusion.pyx:554, in cupy._core.fusion._FusionHistory.call_ufunc.can_cast1()

TypeError: can_cast() does not support Python ints, floats, and complex because the result used to depend on the value.
This change was part of adopting NEP 50, we may explicitly allow them again in the future.

During handling of the above exception, another exception occurred:

CUDADriverError                           Traceback (most recent call last)
Cell In[30], line 11
      7 @cp.fuse()
      8 def _fused_kernel(dt, mu, delta_phi):
      9     return 1 + mu * dt * delta_phi
---> 11 _fused_kernel(dt, mu, delta_phi)

File cupy/_core/fusion.pyx:897, in cupy._core.fusion.Fusion.__call__()

File cupy/_core/new_fusion.pyx:162, in cupy._core.new_fusion.Fusion.__call__()

File cupy/_core/_fusion_kernel.pyx:353, in cupy._core._fusion_kernel.FusedKernel.execute()

File cupy/_util.pyx:64, in cupy._util.memoize.decorator.ret()

File cupy/_core/_fusion_kernel.pyx:49, in cupy._core._fusion_kernel._cuda_compile()

File cupy/_core/core.pyx:2271, in cupy._core.core.compile_with_cache()

File ~/mambaforge/envs/cucim_np2/lib/python3.10/site-packages/cupy/cuda/compiler.py:498, in _compile_module_with_cache(source, options, arch, cache_dir, extra_source, backend, enable_cooperative_groups, name_expressions, log_stream, jitify)
    494     return _compile_with_cache_hip(
    495         source, options, arch, cache_dir, extra_source, backend,
    496         name_expressions, log_stream, cache_in_memory)
    497 else:
--> 498     return _compile_with_cache_cuda(
    499         source, options, arch, cache_dir, extra_source, backend,
    500         enable_cooperative_groups, name_expressions, log_stream,
    501         cache_in_memory, jitify)

File ~/mambaforge/envs/cucim_np2/lib/python3.10/site-packages/cupy/cuda/compiler.py:583, in _compile_with_cache_cuda(source, options, arch, cache_dir, extra_source, backend, enable_cooperative_groups, name_expressions, log_stream, cache_in_memory, jitify)
    580 if _is_cudadevrt_needed(options):
    581     # for separate compilation
    582     ls = function.LinkState()
--> 583     ls.add_ptr_data(ptx, 'cupy.ptx')
    584     _cudadevrt = _get_cudadevrt_path()
    585     ls.add_ptr_file(_cudadevrt)

File cupy/cuda/function.pyx:294, in cupy.cuda.function.LinkState.add_ptr_data()

File cupy/cuda/function.pyx:295, in cupy.cuda.function.LinkState.add_ptr_data()

File cupy_backends/cuda/api/driver.pyx:179, in cupy_backends.cuda.api.driver.linkAddData()

File cupy_backends/cuda/api/driver.pyx:63, in cupy_backends.cuda.api.driver.check_status()

CUDADriverError: CUDA_ERROR_NO_BINARY_FOR_GPU: no kernel image is available for execution on the device

Oddly, it works again if parentheses are inserted into the return statement like this

@cp.fuse()
def _fused_kernel(dt, mu, delta_phi):
    return 1 + mu * (dt * delta_phi)

or it works without parentheses if a device scalar is passed in for 1

@cp.fuse()
def _fused_kernel(one, dt, mu, delta_phi):
    return one + mu * dt * delta_phi
_fused_kernel(cp.asarray(1.0, dtype=cp.float32), dt, mu, delta_phi)

@grlee77 grlee77 reopened this Aug 20, 2024
@grlee77
Contributor

grlee77 commented Aug 20, 2024

We can pass a device scalar for the constant in cuCIM to work around the issue in that single kernel, but I wanted to report it here for completeness.

@jakirkham
Member Author

Reproduced the same error Greg saw

FWIW also found this worked, which avoids picking a type for 1 until the type of the product is known. This may be more flexible with different types or promotion behaviors

import cupy as cp

delta_phi = cp.full((10, 10), 9.999999e-7, dtype=cp.float32)
dt = cp.asarray(0.5, dtype=cp.float32)
mu = cp.asarray(0.1, dtype=cp.float32)

@cp.fuse()
def _fused_kernel(dt, mu, delta_phi):
    r = mu * dt * delta_phi
    r += r.dtype.type(1)
    return r

_fused_kernel(dt, mu, delta_phi)
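
The same pattern can be tried without a GPU. Below is a plain NumPy analogue (a stand-in, not fused-kernel code; add_one is a hypothetical name) showing that taking the constant from r.dtype keeps the output dtype equal to the dtype of the product.

```python
import numpy as np


def add_one(product):
    """Hypothetical helper: add 1 using a constant typed like the input."""
    r = product.copy()
    r += r.dtype.type(1)  # constant built from the array's own scalar type
    return r


for dtype in (np.float16, np.float32, np.float64):
    out = add_one(np.full((2, 2), 0.5, dtype=dtype))
    print(dtype.__name__, "->", out.dtype)
```

Because the constant never exists as an untyped Python int, no promotion decision has to be made for it.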

@jakirkham
Member Author

Filed as upstream issue: cupy/cupy#8536

@rapids-bot rapids-bot bot closed this as completed in #764 Aug 22, 2024
@rapids-bot rapids-bot bot closed this as completed in c004b6f Aug 22, 2024
@jakirkham jakirkham modified the milestones: v24.08.00, v24.10.00 Aug 23, 2024