Consistent Handling of Type Casting Hierarchy #3950

Open
jthielen opened this issue Apr 7, 2020 · 2 comments
Labels
topic-arrays related to flexible array support

Comments

@jthielen
Contributor

jthielen commented Apr 7, 2020

As brought up in #3643, there appear to be some inconsistencies in how xarray handles other numeric/duck array types with regard to a well-defined type casting hierarchy across operations. For example, consider the following:

Construction/Wrapping

  • Allows
    • xarray.core.indexing.ExplicitlyIndexed
    • pandas.Index
    • Dask array
    • __array_function__ implementers
  • Automatically converts
    • Anything with a values attribute to its values
    • Datetime-like array types
    • Masked arrays
    • Anything else for which np.asarray(data) is valid
  • Doesn't reject any type when trying to wrap (for an upcast type such as a HoloViews Dataset, this may be needed?)
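To make the wrapping rules above concrete, here is a minimal sketch of that kind of dispatch. The helper name `wrap_data`, the order of the checks, and the NaN fill for masked values are assumptions for illustration, not xarray's actual implementation, and special cases such as `pandas.Index` and datetime-like arrays are left out.

```python
import numpy as np


def wrap_data(data):
    """Hypothetical helper: decide what to store when wrapping `data`."""
    # Anything with a `values` attribute is replaced by its values.
    data = getattr(data, "values", data)

    # Masked arrays are converted to plain arrays (assumed: NaN for masked entries).
    if isinstance(data, np.ma.MaskedArray):
        return np.ma.filled(data.astype(float), np.nan)

    # __array_function__ implementers (including ndarray) pass through untouched.
    if hasattr(data, "__array_function__"):
        return data

    # Everything else goes through np.asarray; nothing is rejected here.
    return np.asarray(data)


class Duck:
    """Toy duck array that only advertises __array_function__."""

    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented


duck = Duck()
masked = np.ma.masked_array([1, 2, 3], mask=[False, True, False])
print(wrap_data(duck) is duck)
print(wrap_data(masked))
print(wrap_data([1, 2, 3]))
```

Note that nothing in this sketch lets an upcast type decline to be wrapped, which is the open question in the last bullet.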

Binary Ops

  • Defers based on xarray's internal hierarchy (Dataset, DataArray, Variable), otherwise relies upon methods of underlying data, and then wraps result.

(would be one less category to worry about if refactored to use __array_ufunc__, see #3936 (comment))
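As a rough illustration of how deferral in binary ops works, the sketch below uses two toy classes (`Outer` and `Inner` are hypothetical stand-ins, not xarray or Pint classes). The lower type returns `NotImplemented` when it sees the higher type, so Python falls back to the higher type's reflected method, which then wraps the result.

```python
class Outer:
    """Toy higher-priority wrapper (plays the role of a DataArray-like type)."""

    def __init__(self, data):
        self.data = data

    def __add__(self, other):
        other_data = other.data if isinstance(other, Outer) else other
        # Rely on the underlying data's own method, then wrap the result.
        return Outer(self.data + other_data)

    def __radd__(self, other):
        return Outer(other + self.data)


class Inner:
    """Toy lower-priority type (plays the role of a Quantity-like type)."""

    def __init__(self, magnitude):
        self.magnitude = magnitude

    def __add__(self, other):
        if isinstance(other, Outer):
            # Defer to the type higher in the hierarchy.
            return NotImplemented
        other_mag = other.magnitude if isinstance(other, Inner) else other
        return Inner(self.magnitude + other_mag)

    __radd__ = __add__


result = Inner(1) + Outer(Inner(2))
print(type(result).__name__, type(result.data).__name__, result.data.magnitude)
```

Either operand order ends up with `Outer` on the outside because only `Inner` ever defers.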

__array_ufunc__

  • Allows a list of supported types
    _HANDLED_TYPES = (
        np.ndarray,
        np.generic,
        numbers.Number,
        bytes,
        str,
    ) + dask_array_type

    along with SupportsArithmetic
  • Defers to all other types
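The allowlist check described above can be sketched roughly as follows. The function name `must_defer` is hypothetical, `SupportsArithmetic` here is only a placeholder class standing in for xarray's base class, and the Dask entry is dropped on the assumption that Dask is not installed.

```python
import numbers

import numpy as np

# Same tuple as quoted above, minus dask_array_type (assumed unavailable).
_HANDLED_TYPES = (
    np.ndarray,
    np.generic,
    numbers.Number,
    bytes,
    str,
)


class SupportsArithmetic:
    """Placeholder for xarray's base class of the same name."""


def must_defer(inputs):
    """Return True if any ufunc input falls outside the supported types."""
    allowed = _HANDLED_TYPES + (SupportsArithmetic,)
    return not all(isinstance(x, allowed) for x in inputs)


print(must_defer((np.arange(3), 2.0)))
print(must_defer((np.arange(3), object())))
```

With a check like this, any duck array type missing from the tuple causes deferral, regardless of where that type sits in the casting hierarchy.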

__array_function__

One concrete example of where this has been problematic is with xarray DataArrays and Pint Quantities (#3643). xarray DataArray is above Pint Quantity in the (generally agreed upon) type casting hierarchy, and wrapping and binary ops work properly since Pint Quantities defer and xarray DataArrays handle the operation. However, ufuncs fail because they both attempt to defer to the other. Having a consistent way of handling type compatibility across all relevant areas in xarray should be able to remove these kinds of issues.
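The mutual deferral can be reproduced with toy classes. `InnerLike`, `OuterLike`, and `FixedOuterLike` below are hypothetical stand-ins, not the actual Pint or xarray code. Both toy types return `NotImplemented` for types they do not recognize, so NumPy raises `TypeError`; adding the lower type to the wrapper's handled types is one possible way to resolve it.

```python
import numbers

import numpy as np

_BASIC = (np.ndarray, np.generic, numbers.Number)


class InnerLike:
    """Toy lower-priority duck array (think: a unit-aware array)."""

    def __init__(self, magnitude):
        self.magnitude = np.asarray(magnitude)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Defer to anything that is not a basic type or InnerLike itself.
        if not all(isinstance(x, _BASIC + (InnerLike,)) for x in inputs):
            return NotImplemented
        args = [x.magnitude if isinstance(x, InnerLike) else x for x in inputs]
        return InnerLike(getattr(ufunc, method)(*args, **kwargs))


class OuterLike:
    """Toy higher-priority wrapper that also defers on unknown types."""

    handled = _BASIC

    def __init__(self, data):
        self.data = data

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if not all(isinstance(x, self.handled + (OuterLike,)) for x in inputs):
            return NotImplemented
        # Unwrap, re-dispatch on the underlying data, then wrap the result.
        args = [x.data if isinstance(x, OuterLike) else x for x in inputs]
        return type(self)(getattr(ufunc, method)(*args, **kwargs))


class FixedOuterLike(OuterLike):
    """Same wrapper, but it knows InnerLike sits below it in the hierarchy."""

    handled = _BASIC + (InnerLike,)


try:
    np.add(OuterLike(np.array([1.0])), InnerLike([2.0]))
except TypeError as err:
    print("both deferred:", type(err).__name__)

fixed = np.add(FixedOuterLike(np.array([1.0])), InnerLike([2.0]))
print(type(fixed).__name__, type(fixed.data).__name__, fixed.data.magnitude)
```

The fix only works because exactly one side defers, which is why both libraries need to agree on the hierarchy.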

However, it would be good to keep in mind that the broader ecosystem doesn't yet seem to have an agreed-upon way of doing this, so this would still be treading in uncertain waters for the moment. I've been operating under these assumptions when working with Pint, but I definitely think there is a need for more authoritative guidance.

Also, if I'm mistaken in any of the things mentioned above, please do let me know!

cc @keewis, @shoyer

@max-sixty
Collaborator

Is this still current?

@jthielen
Contributor Author

> Is this still current?

I think both yes and no? Since the big series of discussions back in 2021, I don't think much work ended up happening on cross-ecosystem compatibility with nested arrays specifically, so I would assume many of these issues (particularly with using numpy ufuncs and array functions) still remain. However, a lot of progress on the Array API has happened, so those issues may no longer be a priority, given that the current expectation seems to instead be just using the Array API of the top-level library, rather than having the NumPy APIs handle it all. So, as long as xarray (and more generally, each higher-level library) handles construction/wrapping consistently with Array API behaviors, all should be well?
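As a rough sketch of what "using the Array API of the top-level library" can look like: the code asks the outermost array object for its namespace through `__array_namespace__` (the entry point defined by the Array API standard) and calls functions from that namespace. The helper names `namespace_of` and `add_via_namespace` are hypothetical, and the fallback to NumPy for objects without `__array_namespace__` is an assumption made so the example also runs on older NumPy versions.

```python
import numpy as np


def namespace_of(x):
    """Hypothetical helper: return the Array API namespace that owns `x`."""
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    # Assumed fallback for arrays that predate the Array API standard.
    return np


def add_via_namespace(a, b):
    # Call the top-level library's own function instead of relying on
    # NumPy's ufunc dispatch to sort out nested array types.
    xp = namespace_of(a)
    return xp.add(a, b)


print(add_via_namespace(np.array([1, 2]), np.array([3, 4])))
```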

So, for this issue in particular, my hunch would be to keep it around for now and then revisit once #7848 (and perhaps also other libraries' efforts like hgrecco/pint#1592) are resolved. But, my focus has been diverted away from these efforts for the past several years, so I'd gladly defer to folks who have kept up expertise in this area.
