Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start incremental migration from rust-cpython to PyO3 #12110

Merged
merged 7 commits into from
Jun 8, 2021

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented May 24, 2021

Rust-cpython vs PyO3

While developing my talk on Rust native extensions, I found PyO3 has had lots of traction recently, including Python cryptography now using it. PyO3 started as a fork, but has since diverged: https://pyo3.rs/v0.13.2/rust_cpython.html.

We went with rust-cpython instead of PyO3 for Rust FFI in #9593 because, at the time, PyO3 required the Rust nightly compiler. While this was a huge win over CFFI, PyO3 would now bring us several benefits:

The community has been very responsive too: PyO3/pyo3#1625, and they have an active Gitter room.

Incremental migration

To reduce risk and complexity, we use an incremental migration to PyO3 by building two distinct native extensions. At first, native_engine_pyo3 will only have self-contained functionality, e.g. PyStubCAS and, soon, the Nailgun server.

Because the migration is incremental, we can more safely improve our FFI implementation along the way, rather than merely preserving the status quo. For example, this PR makes the PyStubCAS API more ergnonomic.

When we reach critical mass with PyO3, native_engine_pyo3 will be renamed to native_engine and native_engine will be changed to native_engine_cpython. (Or, we'll remove rust-cpython in one big swoop). This will result in some churn in our Python files (due to import module changing), but I argue that's worth the reduction in risk.

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@Eric-Arellano Eric-Arellano changed the title WIP: Start incremental migration from rust-cpython to PyO3 Start incremental migration from rust-cpython to PyO3 May 24, 2021
@Eric-Arellano Eric-Arellano marked this pull request as ready for review May 24, 2021 16:07
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@Eric-Arellano
Copy link
Contributor Author

I think the parallelism mechanism is similar to Rust-CPython iiuc: py.allow_threads(||): https://pyo3.rs/v0.13.2/parallelism.html.

I didn't totally understand the section on the GIL and mutability. It'd be helpful if someone could check over that and confirm this is consistent with what we'd expect: https://pyo3.rs/v0.13.2/types.html.

Are there other questions we should be asking in deciding if we want PyO3 vs. Rust-CPython? I imagine a performance benchmark, but not sure how to get a meaningful one because porting the core functionality of our extension (rule graph scheduler) is non-trivial. Fwit, PyO3 claims we can use references rather than ownership much more than with rust-cpython: https://pyo3.rs/v0.13.2/rust_cpython.html#ownership-and-lifetimes.

@@ -0,0 +1,33 @@
// Copyright 2021 Pants project contributors (see CONTRIBUTORS.md).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to change this file organization at all for our FFI? I copied and pasted engine/src's layout, but we can use this as an opportunity to reorganize things.

For one, I think that interface.rs was way too big before, largely due to how hard it was with Rust-CPython to use types declared in other files. That is solved by PyO3 thanks to using normal Rust structs with normal visibility rules. We definitely want to break up interface.rs, although I'm not sure how? For example, externs/interface/nailgun.rs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we do want to change it, it probably makes sense to have that reorganization be a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I would agree, but I think we can make some improvements along the way given this:

Because the migration is incremental, we can more safely improve our FFI implementation along the way, rather than merely preserving the status quo. For example, this PR makes the PyStubCAS API more ergnonomic.

@Eric-Arellano
Copy link
Contributor Author

Eric-Arellano commented May 24, 2021

Can use the Python stable ABI again, meaning we only need to build one wheel per platform, rather than platform x interpreter. See https://pyo3.rs/v0.13.2/building_and_distribution.html#py_limited_apiabi3.

Oh, @stuhood is this a blocker for ABI3?

The buffer API is not supported.

@tdyas
Copy link
Contributor

tdyas commented May 25, 2021

Is the intent that engine_pyo3 eventually is merged back into engine?

@Eric-Arellano
Copy link
Contributor Author

Is the intent that engine_pyo3 eventually is merged back into engine?

Yes:

When we reach critical mass with PyO3, native_engine_pyo3 will be renamed to native_engine and native_engine will be changed to native_engine_cpython. (Or, we'll remove rust-cpython in one big swoop).

Copy link
Contributor

@tdyas tdyas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't resist taking a quick look at this after you linked it from the PyO3 issue. I'm excited to see this happening; please feel welcome to ping me if you want me to check in on any future reviews 🚀

// (e.g. Python 3.7 vs 3.8).
println!("cargo:rerun-if-env-changed=PY");

if cfg!(target_os = "macos") {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this bit won't be necessary on the upcoming PyO3 0.14 (as it'll be done upstream for you).

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks!

@jsirois
Copy link
Contributor

jsirois commented Jun 7, 2021

Of the 4 justifications only the fourth is interesting. IIRC cpython did not use abi3 for perf reasons. Do you have any information on that? Are we trading one binary per OS for a loss in perf? This would be good to have a solid handle on before doing an incremental migration. Alternatively, a complete migration and resulting apples-apples perf numbers would of course be a fine avenue too.

@Eric-Arellano
Copy link
Contributor Author

only the fourth is interesting.

Does "interesting" == "compelling" here? If so, I disagree that the other reasons aren't compelling:

Uses procedural macros, rather than a custom macro_rules! DSL. This is more natural to work with and better understood by IDEs, rustfmt, etc.

There's inherent value in things being easier to understand, and our project attempts to prioritize understanding and maintainability where possible.

There's also inherent value in integrating with tooling like IDE IntelliSense.

Compare testutil.rs from before and after. The new code is more like typical Rust and is easier to develop and maintain. Further, PyCharm Rust did not understand the old macro, but understands this new code.

Great documentation

We have identified as a project that we prioritize bringing in new contributors. One of the best ways to do this is to make our code easier to understand.

Better error handling, especially due to not requiring Python

This allows us to write more idiomatic and maintainable Rust code. For example, PyO3 works with ? but rust-python does not.

PyO3 has had lots of traction recently

(This is a 5th reason). Better traction often (but not always) translates to a couple benefits:

--

IIRC cpython did not use abi3 for perf reasons.

I tried looking through their GitHub and couldn't find relevant references to "abi3" or "pep 384" explaining their rationale. But Stu explained it in #9593 as:

The cpython crate does not use PEP-384, and updating it to do so would be a significant undertaking: for example, the PyObject::call interface relies on PyTuple, which is not present under PEP-384. It would likely also erase some of the performance and usability benefit of this API.

Iiuc, the performance thing is speculation from Stu. PyO3's docs on ABI 3 do not mention any perf hit, only these 3 limitations: https://pyo3.rs/v0.13.2/building_and_distribution.html#missing-features. I do see in Cython's issue some performance concerns with PEP 384, but not clear how much and if it would hit Pants and PyO3.

Either way, I'm not proposing PyO3 exclusively because of PEP 384. It's not clear to me if we can even use it because we use PyBuffer, which the PEP says is not stable, and I'm not sure if we can work around that. Fwict, the direction I want to explore is using PyPy and/or distributing via PyOxidizer, both of which make the stable ABI irrelevant. In the meantime, PyO3 possibly would allow us to go back to a single release per OS, if we want it, can work around PyBuffer, and benchmark is fine.

@davidhewitt
Copy link

Iiuc, the performance thing is speculation from Stu. PyO3's docs on ABI 3 do not mention any perf hit, only these 3 limitations: https://pyo3.rs/v0.13.2/building_and_distribution.html#missing-features. I do see in Cython's issue some performance concerns with PEP 384, but not clear how much and if it would hit Pants and PyO3.

ABI3 is definitely slower than building for a known Python version. We've been doing a lot of optimization work for 0.14 which makes use of APIs which aren't available for ABI3. There was already some perf gap and I'm certain it's gotten substantially wider, though I haven't yet made concrete benchmark numbers to back that claim up.

(The docs for main branch do actually add a fourth note about optimizations: https://pyo3.rs/main/building_and_distribution.html#missing-features)

@jsirois
Copy link
Contributor

jsirois commented Jun 7, 2021

Ok, thanks @davidhewitt. @Eric-Arellano I think that takes the last justification off the table and makes this all simpler to think about. Is it worth the disruption for nicer APIs? Probably, but at least the motivation is now crisp.

@Eric-Arellano
Copy link
Contributor Author

I think that takes the last justification off the table and makes this all simpler to think about. Is it worth the disruption for nicer APIs?

Fair enough - I agree.

This is where an incremental migration comes in: reduce the size of the disruption. Because each change will be small, we can also clean up our FFI code with each PR, e.g. implementing this ticket Stu opened #11722.

@stuhood
Copy link
Sponsor Member

stuhood commented Jun 7, 2021

Another aspect worth considering here is whether Py03 eases or complicates embedding Python moving forward (or whether being mid-migration might do so). Entirely possible that it is neutral, but #7369 would be a higher priority than an API change, I think. It would make the ABI3 question moot, for example (since we'd embed a precise version).

@Eric-Arellano
Copy link
Contributor Author

@stuhood see #7369 (comment) for PyOxidizer maintainer's response. Does that sound good?

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That risk seems manageable, especially since using PyOxidizer isn't scheduled, and if we think that we can push through a migration in less than ... 3 months or so?

@Eric-Arellano
Copy link
Contributor Author

and if we think that we can push through a migration in less than ... 3 months or so?

Hopefully less! This migration is the main side project I want to do this summer, especially for me to gain more experience with Rust native extensions in prep for my EuroPython talk about the topic.

@Eric-Arellano Eric-Arellano merged commit 7a36876 into pantsbuild:main Jun 8, 2021
@Eric-Arellano Eric-Arellano deleted the pyo3 branch June 8, 2021 19:31
Eric-Arellano added a commit to Eric-Arellano/pants that referenced this pull request Jun 9, 2021
Eric-Arellano added a commit that referenced this pull request Nov 14, 2021
See #12110 for the original motivation.

## Why migrate

Beyond the APIs requiring much less boilerplate and being more ergonomic (e.g. the `.call0()` and `call1()` variants of `.call()`), I think the biggest benefit is clearer modeling for when the GIL is held.

With PyO3, the two [core types](https://pyo3.rs/v0.15.0/types.html) are `&PyAny` and `PyObject` (aka `Py<PyAny`). `&PyAny` is always a reference with the same lifetime as the `Python` token, which represents when the GIL is used. Whenever you want to do Python operations in Rust, like `.getattr()`, you use `&PyAny`. `PyObject` and any `Py<T>` are GIL-independent.

In contrast, rust-cpython only had `PyObject` and `&PyObject`. It passed around `Python` as an argument to any Python functions like `.getattr()`.

--

An additional benefit is that using proc macros instead of declarative macros means PyCharm no longer hangs for me when working in the `engine/` crate! PyCharm does much better w/ proc macros, and things feel zippy now like autocomplete working.

## Migration strategy

I tried originally doing an incremental strategy, most recently with #12451, which proved technically infeasible.

This PR completely swaps Rust-Cpython with PyO3, but tries to minimize the diff. It only makes changes when it was too difficult to get working with PyO3. For example, `interface.rs` is far too big, but this keeps the same organization as before.

For now, we still have the `native_engine_pyo3` extension along with `native_engine`. That will be consolidated in a followup.

## Benchmark

Before:
```
❯ hyperfine -w 1 -r 10 './pants --no-pantsd list ::'
Benchmark #1: ./pants --no-pantsd list ::
  Time (mean ± σ):      2.170 s ±  0.025 s    [User: 1.786 s, System: 0.303 s]
  Range (min … max):    2.142 s …  2.227 s    10 runs
```

After:

```
❯ hyperfine -w 1 -r 10 './pants --no-pantsd list ::'
Benchmark #1: ./pants --no-pantsd list ::
  Time (mean ± σ):      2.193 s ±  0.060 s    [User: 1.798 s, System: 0.292 s]
  Range (min … max):    2.144 s …  2.348 s    10 runs
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants