
Fix #6706 – Update sources for compatibility with NumPy 2 #6724

Merged
merged 30 commits into from
Sep 20, 2024

Conversation

mhucka
Contributor

@mhucka mhucka commented Sep 11, 2024

This is a set of changes necessary to address issue #6706 and support Cirq's migration to NumPy 2. The result makes Cirq compatible with both NumPy 2 and NumPy 1, with the exception of the cirq-rigetti module, which at this time has an incompatible requirement for NumPy 1.

The changes target NumPy 2.0.2 rather than NumPy 2.1. At this time, some package dependency conflicts arise from other packages used by Cirq when NumPy 2.1 is required; this currently limits us to 2.0.2. (Note for the Google Quantum team: Google's internal codebase is about to move to NumPy 2.0.2, not 2.1, so the inability to support NumPy 2.1 is not a problem with respect to this impending change.)

The rest of this text summarizes the changes in this PR.

Avoid a construct deprecated in NumPy 2

The NumPy 2 Migration Guide [explicitly recommends changing](https://numpy.org/doc/stable/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword) constructs of the form `np.array(state, copy=False)` to `np.asarray(state)`.
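For illustration (my own sketch, not code from the PR), the two forms are equivalent in the no-copy case: when no conversion is needed, `np.asarray` returns the input array itself, which is the behavior `np.array(..., copy=False)` used to guarantee:

```python
import numpy as np

state = np.array([1.0, 0.0], dtype=np.complex64)

# np.asarray returns the very same object when no conversion is needed,
# matching what np.array(state, copy=False) did under NumPy 1.
view = np.asarray(state)
assert view is state

# When a dtype change forces a copy, np.asarray copies, as np.array would.
converted = np.asarray(state, dtype=np.complex128)
assert converted is not state
```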

Avoid implicitly converting 2-D arrays of a single value to scalars

NumPy 2 raises deprecation warnings about converting an ndarray of dimension > 0, such as `[[0]]`, to a scalar value like `0`. The solution is to retrieve the value using `.item()` instead.
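A small self-contained sketch of the pattern (my own illustration, not taken from the Cirq sources):

```python
import numpy as np

nested = np.array([[0]])   # a 1x1 ndarray, not a scalar

# Implicit conversions like int(nested) are deprecated in NumPy 2;
# .item() extracts the single element as a plain Python scalar.
value = nested.item()
assert value == 0
assert isinstance(value, int)
```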

Address change in NumPy string representation of scalars

As a consequence of [NEP 51](https://numpy.org/neps/nep-0051-scalar-representation.html), the string representation of scalar numbers changed in NumPy 2 to include type information. This affected printing of Cirq circuit diagrams: instead of seeing numbers like 1.5, you would see `np.float64(1.5)` and similar.

The solution is to use `.item()` on scalars before passing them to anything that needs the scalar's string representation (via `str` or `__repr__()`). So, for example, if `x` is an `np.int64` object, `x.item()` returns the corresponding Python `int`, and the string form of that looks normal.
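A hedged illustration of the fix (the variable names here are my own, not from the PR):

```python
import numpy as np

x = np.int64(3)

# x.item() yields a plain Python int, so its string form carries no
# np.int64(...) wrapper regardless of the NumPy version in use.
y = x.item()
assert isinstance(y, int)
assert str(y) == "3"
assert repr(y) == "3"
```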

Explicitly convert NumPy ndarrays of `np.bool` to Python `bool`

In NumPy 2 (and possibly earlier versions), lines 478-480 in `cirq-google/cirq_google/api/v2/program_pb2.pyi` produced a deprecation warning:

```
DeprecationWarning: In future, it will be an error
for 'np.bool' scalars to be interpreted as an index
```

This warning is somewhat misleading: while it is the case that Booleans are involved, they are not being used as indices.

The fields `rs`, `xs`, and `zs` of CliffordTableau as defined in `cirq-core/cirq/qis/clifford_tableau.py` have type `Optional[np.ndarray]`, and the values in the ndarray have NumPy type `bool` in practice. The protocol buffer version of CliffordTableau defined in `cirq-google/cirq_google/api/v2/program_pb2.pyi` defines those fields as `collections.abc.Iterable[builtins.bool]`. At first blush, you might think they're arrays of Booleans in both cases, but unfortunately, there's a wrinkle: Python defines its built-in `bool` type as being derived from `int` (see PEP 285), while NumPy explicitly does _not_ derive its `bool` from its integer class (see <https://numpy.org/doc/2.0/reference/arrays.scalars.html#numpy.bool>). The warning about converting `np.bool` to index values (i.e., integers) probably arises when the `np.bool` values in the ndarray are coerced into Python Booleans.

At first, I thought the obvious solution would be to use `np.asarray` to convert the values to `builtins.bool`, but this did not work:

```
>>> import numpy as np
>>> import builtins
>>> arr = np.array([True, False], dtype=np.bool)
>>> arr
array([ True, False])
>>> type(arr[0])
<class 'numpy.bool'>
>>> newarr = np.asarray(arr, dtype=builtins.bool)
>>> newarr
array([ True, False])
>>> type(newarr[0])
<class 'numpy.bool'>
```

They still end up being NumPy bools. Some other variations on this approach also failed to produce proper Python Booleans. In the end, what worked was to use `map()` to apply `builtins.bool` to every value in the incoming arrays. This may not be as efficient as possible; a possible future optimization is to look for a more efficient way to cast the types, or to avoid having to do it at all.
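A sketch of the conversion that worked, alongside one candidate for the future optimization mentioned above (`ndarray.tolist()`, which also yields built-in Python scalars; whether it is actually faster here is an assumption to be verified):

```python
import numpy as np

arr = np.array([True, False])

# The approach used in the PR: apply the built-in bool to each element.
as_python = list(map(bool, arr))
assert all(type(v) is bool for v in as_python)

# Possible alternative: ndarray.tolist() converts all elements to
# built-in Python scalars in a single call.
assert all(type(v) is bool for v in arr.tolist())
```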

Address changes in NumPy data type promotion

Note added 2024-09-20: Pavol reasoned convincingly that it would be better to pull the non-essential NumPy 2 type-warning changes into a separate PR at a later date, and to focus this PR on essential compatibility issues only. He amended the PR accordingly. Consequently, the changes described in this section are mostly not part of the final PR. This text is being left in place because it provides details that may be useful for the future PR.

One of the changes in NumPy 2 is to the [behavior of type promotion](https://numpy.org/devdocs/numpy_2_0_migration_guide.html#changes-to-numpy-data-type-promotion). A possible negative impact of the changes is that some operations involving scalar types can lead to lower precision, or even overflow. For example, `uint8(100) + 200` previously (in NumPy < 2.0) produced a `uint16` value, but now results in a `uint8` value and an overflow _warning_ (not error). This can have an impact on Cirq. For example, in Cirq, simulator measurement result values are `uint8`'s, and in some places, arrays of values are summed; this leads to overflows if the sum > 128. It would not be appropriate to change measurement values to be larger than `uint8`, so in cases like this, the proper solution is probably to make sure that where values are summed or otherwise numerically manipulated, `uint16` or larger values are ensured.
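One version-independent way to guard such sums, sketched here as my own illustration (the array name is hypothetical, not from Cirq's code):

```python
import numpy as np

# Measurement-like results stored as uint8, as in Cirq's simulators.
bits = np.array([200, 100], dtype=np.uint8)

# Widening the accumulator dtype during the sum avoids uint8 overflow
# under both the old and the new promotion rules.
total = bits.sum(dtype=np.uint16)
assert int(total) == 300
```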

NumPy 2 offers a new option (`np._set_promotion_state("weak_and_warn")`) to produce warnings where data types are changed. Commit 6cf50eb adds a new command-line option to our pytest framework, such that running

```bash
check/pytest --warn-numpy-data-promotion
```

will turn on this NumPy setting. Running `check/pytest` with this option enabled revealed quite a lot of warnings. The present commit changes code in places where those warnings were raised, in an effort to eliminate as many of them as possible.

It is certainly the case that not all of the type promotion warnings are meaningful. Unfortunately, I sometimes found it difficult to be sure which ones _are_ meaningful, in part because Cirq's code has many layers and makes heavy use of ndarrays, and understanding the impact of a type demotion (say, from `float64` to `float32`) was difficult for me. In view of this, I erred on the side of caution and tried to avoid losses of precision. The changes roughly follow these principles:

* Don't worry about warnings about changes from `complex64` to `complex128`, as this obviously does not reduce precision.

* If a warning involves an operation using an ndarray, change the code to get the actual data type of the data elements in the array rather than use a specific data type. This is the reason some of the changes look like the following, where the code reaches into an ndarray to get the dtype of an element and then later uses the `.type()` method of that dtype to cast the value of something else:

  ```python
  dtype = args.target_tensor.flat[0].dtype
  # ...
  args.target_tensor[subspace] *= dtype.type(x)
  ```

* In cases where the above was not possible, or where it was obvious what the type must always be, the changes add type casts with explicit types like `complex(x)` or `np.float64(x)`.

It is likely that this approach resulted in some unnecessary up-promotion of values and may have impacted run-time performance. Some simple overall timing of check/pytest did not reveal a glaring negative impact of the changes, but that doesn't mean real applications won't be impacted. Perhaps a future review can evaluate whether speedups are possible.

@mhucka mhucka marked this pull request as draft September 11, 2024 03:07
@mhucka mhucka changed the title WIP: fix #6706 – Update sources for compatibility with NumPy 2 Fix #6706 – Update sources for compatibility with NumPy 2 Sep 16, 2024
mhucka and others added 10 commits September 19, 2024 10:24
This adds a new option to make NumPy warn about data promotion behavior that changed in NumPy 2. The new promotion rules can lead to lower-precision results when working with floating-point scalars, and to errors or overflows when working with integer scalars. Invoking pytest with `--warn-numpy-data-promotion` will cause warnings to be emitted when dissimilar data types are used in an operation in such a way that NumPy ends up changing the data type of the result value.

Although this new option for Cirq's pytest code is most useful during Cirq's migration to NumPy 2, the flag will likely remain for some time afterwards too, because developers will undoubtedly need time to adjust to the new NumPy behavior.

For more information about the NumPy warning enabled by this option, see
<https://numpy.org/doc/2.0/numpy_2_0_migration_guide.html#changes-to-numpy-data-type-promotion>.
This updates the minimum NumPy version requirement to 2.0, and updates
a few other packages to versions that are compatible with NumPy 2.0.

Note: NumPy 2.1 was released 3 weeks ago, but at this time, Cirq can
only upgrade to 2.0. This is because (a) Google's internal codebase is
moving to NumPy 2.0.2, not 2.1 yet, and (b) conflicts arise with some
other packages used by Cirq if NumPy 2.1 is required right now. These
considerations will no doubt change in the near future, at which point
we can update Cirq to use NumPy 2.1 or higher.
This commit implements a minor refactoring of three test functions in
one file, to make them all use similar idioms (for greater ease of
reading) and to address the same NumPy 2 data promotion warnings
handled for the remaining files in commit eeeabef.
Mypy flagged a couple of the previous data type declaration changes as
being incompatible with the expected types. Changing them to satisfy
mypy did not affect NumPy data type promotion warnings.
* Sync with new API for checking device family in qcs-sdk-python,
  Ref: rigetti/qcs-sdk-rust#463 in isa.pyi

* Require qcs-sdk-python-0.20.1 which introduced the new family API

Fixes quantumlib#6732
Pytest was happy with the previous approach to declaring the value
types in a couple of expressions, but mypy was not. This new version
satisfies both.
The recent changes support NumPy 2 (as long as cirq-rigetti is removed
manually), but they don't require NumPy 2. We can maintain
compatibility with NumPy 1.x.
Bumps [serve-static](https://github.com/expressjs/serve-static) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together.

Updates `serve-static` from 1.15.0 to 1.16.2
- [Release notes](https://github.com/expressjs/serve-static/releases)
- [Changelog](https://github.com/expressjs/serve-static/blob/v1.16.2/HISTORY.md)
- [Commits](expressjs/serve-static@v1.15.0...v1.16.2)

Updates `express` from 4.19.2 to 4.21.0
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/4.21.0/History.md)
- [Commits](expressjs/express@4.19.2...4.21.0)

---
updated-dependencies:
- dependency-name: serve-static
  dependency-type: indirect
- dependency-name: express
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Michael Hucka <mhucka@caltech.edu>
In the current version of pytest (8.3.3) with the pytest-asyncio
module version 0.24.0, we see the following warnings at the beginning
of a pytest run:

```
warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))

..../lib/python3.10/site-packages/pytest_asyncio/plugin.py:208:
PytestDeprecationWarning: The configuration option
"asyncio_default_fixture_loop_scope" is unset. The event loop scope for
asynchronous fixtures will default to the fixture caching scope. Future
versions of pytest-asyncio will default the loop scope for asynchronous
fixtures to function scope. Set the default fixture loop scope explicitly in
order to avoid unexpected behavior in the future. Valid fixture loop scopes
are: "function", "class", "module", "package", "session"
```

A [currently-open issue and discussion over in the pytest-asyncio
repo](pytest-dev/pytest-asyncio#924) suggests that
this is an undesired side-effect of a recent change in pytest-asyncio and is
not actually a significant warning. Moreover, the discussion suggests the
warning will be removed or changed in the future.

In the meantime, the warning is confusing because it makes it sound like
something is wrong. This simple PR silences the warning by adding a suitable
pytest init option to `pyproject.toml`.

codecov bot commented Sep 20, 2024

Codecov Report

Attention: Patch coverage is 96.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 97.83%. Comparing base (484df6f) to head (cab0fb2).
Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| conftest.py | 80.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6724      +/-   ##
==========================================
- Coverage   97.83%   97.83%   -0.01%     
==========================================
  Files        1077     1077              
  Lines       92524    92537      +13     
==========================================
+ Hits        90523    90535      +12     
- Misses       2001     2002       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

In commit eb98361 I added the import of kahypar, which (at least at the time) appeared to be imported by Quimb. Double-checking this import in clean environments revealed that, in fact, nothing depends on kahypar.

I'm taking it out via a separate commit because right now this package is causing our GitHub Actions commit checks to fail, and I want to leave a record of what caused the failures and how they were resolved.
@mhucka mhucka marked this pull request as ready for review September 20, 2024 16:49
@mhucka mhucka merged commit 3fefe29 into quantumlib:main Sep 20, 2024
33 of 34 checks passed