Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Numpy 2 #409

Open
jakelishman opened this issue Feb 14, 2024 · 7 comments
Open

Support for Numpy 2 #409

jakelishman opened this issue Feb 14, 2024 · 7 comments

Comments

@jakelishman
Copy link

jakelishman commented Feb 14, 2024

Hello!

With Numpy 2.0.0rc1 moving closer towards a release, has there been any thought about what will be needed to support the later version or if there's anything needed to make it safer to use in the presence of C API changes? I'm interested in helping with the transition to support both 1.x and 2.x, if useful.

For what it's worth, just naively running the test suite against a local build of Numpy 2 (as of numpy/numpy@182ee60) does pretty well, albeit using the deprecated numpy.core.multiarray path over the new numpy._core.multiarray. The failures I observed are:

  • --lib
    • dtype::tests::test_dtype_names (bool_ is now bool)
  • --test array
    • half_bf16_works: ml_dtypes fails to import because it's not ready for Numpy 2
    • copy_to_works: slot 82 PyArray_CopyInto from _ARRAY_API is now NULL in Numpy 2 so this reliably segfaults (though PyArray_CopyInto got moved into slot 50, so it's still there).

Everything else passed for me (macOS 14, Python 3.11, x86_64).

While copy_to_works is the only segfault I saw in the test suite, this is the current set of changes in the generated C API capsule between Numpy 1.26.4 and Numpy 2.0.0 (as of numpy/numpy@182ee60), where the number is the offset into the PyArray_API pointer array:

Removed in 'api_2.0.0.c':
     1: (void *) &PyBigArray_Type
     4: (void *) &PyArrayFlags_Type
    40: (void *) PyArray_SetNumericOps
    41: (void *) PyArray_GetNumericOps
    65: (void *) PyArray_ScalarFromObject
    66: (void *) PyArray_GetCastFunc
    67: (void *) PyArray_FromDims
    68: (void *) PyArray_FromDimsAndDataAndDescr
    81: (void *) PyArray_MoveInto
    82: (void *) PyArray_CopyInto
    83: (void *) PyArray_CopyAnyInto
   103: (void *) PyArray_FillObjectArray
   115: (void *) PyArray_NewFlagsObject
   117: (void *) PyArray_CompareUCS4
   122: (void *) PyArray_FieldNames
   163: (void *) PyArray_As1D
   164: (void *) PyArray_As2D
   171: (void *) PyArray_CopyAndTranspose
   173: (void *) PyArray_TypestrConvert
   197: (void *) PyArray_TypeNumFromName
   201: (void *) _PyArray_SigintHandler
   202: (void *) _PyArray_GetSigintBuf
   208: (void *) PyArray_CompareString
   219: (void *) PyArray_SetDatetimeParseFunction
   278: (void *) PyArray_GetArrayParamsFromObject
   291: (void *) PyDataMem_SetEventHook
   293: (void *) PyArray_MapIterSwapAxes
   294: (void *) PyArray_MapIterArray
   295: (void *) PyArray_MapIterNext
   301: (void *) PyArray_MapIterArrayCopyIfOverlap

Different between 'api_1.26.4.c' and 'api_2.0.0.c':
    50:
      (void *) PyArray_CastTo
      (void *) PyArray_CopyInto
    51:
      (void *) PyArray_CastAnyTo
      (void *) PyArray_CopyAnyInto

Added in 'api_2.0.0.c':
   307: (void *) NpyDatetime_ConvertDatetime64ToDatetimeStruct
   308: (void *) NpyDatetime_ConvertDatetimeStructToDatetime64
   309: (void *) NpyDatetime_ConvertPyDateTimeToDatetimeStruct
   310: (void *) NpyDatetime_GetDatetimeISO8601StrLen
   311: (void *) NpyDatetime_MakeISO8601Datetime
   312: (void *) NpyDatetime_ParseISO8601Datetime
   313: (void *) NpyString_load
   314: (void *) NpyString_pack
   315: (void *) NpyString_pack_null
   316: (void *) NpyString_acquire_allocator
   317: (void *) NpyString_acquire_allocators
   318: (void *) NpyString_release_allocator
   319: (void *) NpyString_release_allocators

All of the entries in the "removed in" are a place where the slot now has a null pointer, so would segfault if called. Slots 82 and 83 (PyArray_CopyInto and PyArray_CopyAnyInto) got moved into slots 50 and 51 respectively.

I've not been following if there were ABI incompatible differences between any functions themselves - I think there was a change from Py_ssize_t to intptr_t (or vice versa) in some, which can be ABI incompatible for some more esoteric platforms.


Would an appropriate way of handling most of the removals be to mark those in impl_array_api as fallible, and have an explicit non-null pointer check on each access so we can safely panic rather than segfaulting? For the high-level safe API, does something like checking the version on the GILOnceCell initialisation and saving the result so functions can check and Err on bad calls work?

Happy to help with any implementation, if it'd be useful.

@adamreichold
Copy link
Member

For the high-level safe API, does something like checking the version on the GILOnceCell initialisation and saving the result so functions can check and Err on bad calls work?

I think if we try to handle the differences between the NumPy versions, then the PyArrayAPI and PyUFuncAPI types are the most reasonable places to try to implement this, i.e. we should check the versions when loading the capsules, store that information inside PY_ARRAY_API and PY_UFUNC_API and make it available to other code.

Would an appropriate way of handling most of the removals be to mark those in impl_array_api as fallible, and have an explicit non-null pointer check on each access so we can safely panic rather than segfaulting?

I am not sure we need this as long as properly handle the differences for those functions which we do call. (Calling them in unsafe after all, so we can put the responsibility to check the version on the caller.) For the copy_to part, it would probably be nice if PyArrayAPI would provide an extra method which calls into slot 81 or 51 depending on the version found during loading.

@stinodego
Copy link

stinodego commented May 24, 2024

I've done a quick test run with NumPy 2.0.0-rc2, and the only thing that breaks for our code base is temporal dtypes. I get a segfault when I try to use those, e.g.:

Datetime::<units::Microseconds>::get_dtype_bound(py) // Worked before, now: Segmentation fault

PyArrayDescr::new_bound(py, intern!(py, "<M8[us]")).unwrap()  // Still works!

So using PyArrayDescr can work around that one, but I cannot create an array with temporal types now:

PyArray1::<Datetime<units::Microseconds>>::from_iter_bound(py, values)  // Segmentation fault

I imagine it's related to this change:
https://numpy.org/devdocs/numpy_2_0_migration_guide.html#the-pyarray-descr-struct-has-been-changed

Would be great if this could be fixed so that we can support NumPy 2.0.0!

@adamreichold
Copy link
Member

I imagine it's related to this change:
https://numpy.org/devdocs/numpy_2_0_migration_guide.html#the-pyarray-descr-struct-has-been-changed

Sounds reasonable, so we probably need adjust either the FFI definitions or use the wrapper functions in

let metadata = &mut *((*dtype.as_dtype_ptr()).c_metadata

@stinodego Would you be up to submitting a patch?

@stinodego
Copy link

Would you be up to submitting a patch?

I'm afraid I don't have to knowhow to submit a fix, and I don't really have the bandwidth to dive into this right now.

@aMarcireau
Copy link

Hi @stinodego,
I ran into the same issue and submitted a PR (#429). Could you test this on your code base and let me know if you run into issues with it?

@stinodego
Copy link

stinodego commented Jun 4, 2024

Hi @stinodego, I ran into the same issue and submitted a PR (#429). Could you test this on your code base and let me know if you run into issues with it?

@aMarcireau I tried running my tests with your PR with default_features = False, features = ["numpy-2"]and everything fails with:

pyo3_runtime.PanicException: the extension was compiled for numpy 2.x but the runtime version is 1.x (ABI 02000000.00000012)

I compiled with your branch and had NumPy 2.0.0rc2 installed.

However, if I leave the default features on, the tests that previously segfaulted now pass with flying colors!

So it looks like there's an issue with your NumPy version check, but the implementation works (at least for my code base).

@aMarcireau
Copy link

Thanks for the feedback. I indeed mixed up the conditions for the runtime version check. I just added a commit to the PR that should fix this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants