ENH: `np.clip` as currently implemented (numpy 2.0.0) is quite hard to use #26759

seberg · 2024-06-19T19:20:35Z

I just want to add a voice to this thread that np.clip as currently implemented (numpy 2.0.0) is quite hard to use:

>>> np.clip(np.array([0, 255], dtype=np.uint8), 0, 4550)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jni/micromamba/envs/np2/lib/python3.12/site-packages/numpy/_core/fromnumeric.py", line 2247, in clip
    return _wrapfunc(a, 'clip', a_min, a_max, out=out, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jni/micromamba/envs/np2/lib/python3.12/site-packages/numpy/_core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/jni/micromamba/envs/np2/lib/python3.12/site-packages/numpy/_core/_methods.py", line 108, in _clip
    return um.clip(a, min, max, out=out, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OverflowError: Python integer 4550 out of bounds for uint8

I think most users in this scenario would expect to be able to use arrays of any type and Python ints and just have it work — there is nothing unsatisfiable about this expression, including keeping the input dtype.

When you start dealing with numpy scalars and so on, the story is indeed more complicated as noted in the above discussion, but the Python int scenario seems like an easy fix (as noted by @mhvk above) that would unlock a lot of uses already.

Sorry about the noise, I expect there will be lots in this repo this coming week. 😅 Thank you all! 🙏

Originally posted by @jni in #24976 (comment)

The text was updated successfully, but these errors were encountered:

seberg · 2024-06-19T19:23:24Z

Branching off from the other issue. In principle, if clip was a ufunc, doing such an implementation would be a bit tedious. But from the not ridiculously purist side of things, it is actually very easy

clip is implemented in Python (dispatching to minimum, maximum, or _clip). So when the dtype is a NumPy integral, and the value is Python integer, we can check for np.iinfo(dtype).max > max and the same for minimum and just set it to not passed.

That is so easy, we could possibly even consider it a bug fix, but not sure it should be.

mhvk · 2024-06-20T18:11:43Z

I agree we could start by just doing this on the python side -- the C side would be good mostly as a way to generalize what we already do for comparisons, thus serving as an example on how to do things like this more generally. But really, no reason not to just do the simple thing...

asmeurer · 2024-06-24T19:57:14Z

This is somewhat tied to the proposed change to remove type promotion #24976 (which is also necessary for array API compatibility, FYI), since not promoting min or max implies a downcast. The central issue is that downcasting via the usual "wraparound" logic leads to unexpected behavior. A more reasonable behavior for clip specifically would be to downcast by clipping. That would be surprising in most other contexts but for clip itself it should lead to expected behavior. Specifically, x = clip(a, min, max) should satisfy min <= x <= max, where the inequalities are based on pure numerical value (no wraparound from downcasting).

Python scalars erroring is basically the same issue, since ideally you'd want np.clip(np.array([0, 255], dtype=np.uint8), 0, 4550) to give the same result at np.clip(np.array([0, 255], dtype=np.uint8), np.array(0), np.array(4550)).

I think @seberg mentioned that NumPy already has similar logic somewhere for comparing int64 and uint64. For example, this gives the right answer, where a naive cast would not:

>>> np.asarray([-100], dtype=np.int64) < np.asarray([18446744073709551516 - 1], dtype=np.uint64)
array([ True])
>>> np.asarray([-100], dtype=np.int64).astype(np.uint64) < np.asarray([18446744073709551516 - 1], dtype=np.uint64)
array([False])

seberg added this to the 2.0.1 release milestone Jun 19, 2024

seberg modified the milestones: 2.0.1 release, 2.1.0 release Jun 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: `np.clip` as currently implemented (numpy 2.0.0) is quite hard to use #26759

ENH: `np.clip` as currently implemented (numpy 2.0.0) is quite hard to use #26759

seberg commented Jun 19, 2024 •

edited

Loading

seberg commented Jun 19, 2024

mhvk commented Jun 20, 2024

asmeurer commented Jun 24, 2024

ENH: np.clip as currently implemented (numpy 2.0.0) is quite hard to use #26759

ENH: np.clip as currently implemented (numpy 2.0.0) is quite hard to use #26759

Comments

seberg commented Jun 19, 2024 • edited Loading

seberg commented Jun 19, 2024

mhvk commented Jun 20, 2024

asmeurer commented Jun 24, 2024

ENH: `np.clip` as currently implemented (numpy 2.0.0) is quite hard to use #26759

ENH: `np.clip` as currently implemented (numpy 2.0.0) is quite hard to use #26759

seberg commented Jun 19, 2024 •

edited

Loading