Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[internal] Port match_path_globs to PyO3 #12225

Merged
merged 2 commits into from
Jul 22, 2021

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented Jun 18, 2021

This shows how we can convert types defined in Python into Rust, e.g. PathGlobs.

Whereas we used to use a free function to do the conversion and asked for PyObject in the FFI function, now we implement the trait FromPyObject on a wrapper type PyPathGlobs so that we can request it directly in the function signature. This is more idiomatic and makes the #[pyfunction] more clear what it does.

We also now directly use PyO3's .getattr() and .extract() at callsites, rather than using our custom getattr() functions that wrapped those with Rust-Cpython. This removes a layer of indirection and doesn't noticeably increase boilerplate at callsites - it in fact reduces some because it's easier to chain .getattr() multiple times.

At least for now, we don't use Value, which is an Arc around a Rust-CPython PyObject. We were using that because cloning with Rust-CPython requires the GIL. Here, we're able to clone w/o requiring the GIL (in PyPathGlobs.parse()), so I went with a simpler implementation. Some benefits of avoiding Value are a) better type safety and b) avoiding a lock. We may end up needing Value at the end of the day, but we can add this back.

Benchmark

PyO3 does have slightly worse performance with this diff:

diff --git a/src/python/pants/bin/pants_loader.py b/src/python/pants/bin/pants_loader.py
index 10c8b406b..ec9ef50bf 100644
--- a/src/python/pants/bin/pants_loader.py
+++ b/src/python/pants/bin/pants_loader.py
@@ -126,4 +126,8 @@ def main() -> None:


 if __name__ == "__main__":
-    main()
+    from pants.engine.internals.native_engine import match_path_globs
+    from pants.engine.fs import PathGlobs
+
+    x = match_path_globs(PathGlobs(["**/*.txt"]), tuple(f"foo/{i}.txt" for i in range(1_000_000)))
+    # main()

Before:

❯ hyperfine --warmup 1 --runs=100 './pants'
Benchmark #1: ./pants
  Time (mean ± σ):     712.3 ms ±   4.6 ms    [User: 587.7 ms, System: 103.3 ms]
  Range (min … max):   701.0 ms … 722.9 ms    100 runs

After:

❯ hyperfine --warmup 1 --runs=100 './pants'
Benchmark #1: ./pants
  Time (mean ± σ):     732.7 ms ±   8.0 ms    [User: 610.8 ms, System: 100.2 ms]
  Range (min … max):   720.5 ms … 778.0 ms    100 runs

Another after benchmark run with fewer apps on my machine shows 726.1 ms ± 4.9 ms [User: 610.0 ms, System: 94.8 ms]. Note that User is still the same, and only System changes. So this seems to be an actual slowdown.

I am not yet sure why, but can profile if this is concerning.

[ci skip-build-wheels]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]

impl PyPathGlobs {
fn parse(self) -> PyResult<PreparedPathGlobs> {
self.0.clone().parse().map_err(|e| {
Copy link
Contributor Author

@Eric-Arellano Eric-Arellano Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! This clone() might explain the slowdown in the contrived benchmark. We clone an Arc instead with Rust-CPython.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, that's not it. And the PathGlobs is cheap in the example. It's the paths: Vec<String> that's so huge and would be painful to clone.

Copy link
Sponsor Member

@stuhood stuhood Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this fn takes ownership of self, it shouldn't need to clone: it can consume self (including self.0).

EDIT: Oh. But we clone to render the error message. Hm.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the error type is String - presumably the glob which failed to parse? You might not need to clone the rest of the globs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

presumably the glob which failed to parse? You might not need to clone the rest of the globs.

Yeah, although we want to show what self was in the error message so it's more useful. Not technically necessary, but I think it's fine because the PathGlobs object is usually pretty small (a vec of a few strings).

@stuhood thoughts?

Copy link
Contributor Author

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @davidhewitt on the benchmark. Lmk if it'd help to do a profile.


impl PyPathGlobs {
fn parse(self) -> PyResult<PreparedPathGlobs> {
self.0.clone().parse().map_err(|e| {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, that's not it. And the PathGlobs is cheap in the example. It's the paths: Vec<String> that's so huge and would be painful to clone.

Comment on lines 69 to 71
.into_iter()
.filter(|p| path_globs.matches(&PathBuf::from(p)))
.collect(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is different than Rust-CPython, which first maps to PathBuf, then filters, then converts back to strings. I keep the original strings for less work. Lmk if we need the maps.

@davidhewitt
Copy link

I'll take a more detailed look at this on the weekend (probably Sunday). We should be able to avoid a slowdown, so I'll investigate and propose solutions.

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for now, we don't use Value, which is an Arc around a Rust-CPython PyObject. We were using that because cloning with Rust-CPython requires the GIL.

Yea, I would be surprised if cloning a Python object with Py03 does not also require acquiring the GIL. AFAIK, adjusting the reference counts of Python objects always does.

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks really nice! :)

src/rust/engine/engine_pyo3/src/externs/interface/fs.rs Outdated Show resolved Hide resolved
.getattr("glob_match_error_behavior")?
.getattr("value")?
.extract()?;
let match_behavior = StrictGlobMatching::create(match_behavior_str, description_of_origin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was kind of hoping for an API closer to serde::Deserialize here, where a derive macro would to most of the field-mapping work for you, and you'd be able to write something like:

let path_globs: PathGlobs = pyo3_serde::from_object(obj)?;

rather than having to get each field by field-name and put them together yourself... That would probably necessitatedescription_of_origin being a field on glob_match_error_behavior rather than a sibling field in obj.

I guess this is PyO3/pyo3#566 and I definitely wouldn't suggest implementing it on the critical path of this conversion, but it would be nice to avoid that boilerplate... :)

Copy link
Sponsor Member

@stuhood stuhood Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also worth calling out that this wrapping of a Python object is an alternative to making the object native as we do for a few types in

py_class!(pub class PyDigest |py| {

I'm kindof curious what making the object native would look like with Py03's API... it might be less awkward than with rust-cpython's, and I expect that it would be more performant than double-constructing objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I didn't go with a native class because we already have examples of that and to land the port successfully I want to avoid rewriting everything to do that. But after the port is finished, it makes sense.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be able to do something like this to avoid a lot of the getattr boilerplate:

use pyo3::FromPyObject;

#[derive(FromPyObject)]
struct GlobMatchErrorBehavior<'a> {
    value: &'a str,
}

#[derive(FromPyObject)]
struct Conjunction<'a> {
    value: &'a str,
}

#[derive(FromPyObject)]
struct PyPathGlobs<'a> {
    globs: Vec<String>,
    description_of_origin: String,
    glob_match_error_behaviour: GlobMatchErrorBehavior<'a>,
    conjunction: Conjunction<'a>,
}

Copy link

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't had a chance to assess performance yet, however some general comments...


impl PyPathGlobs {
fn parse(self) -> PyResult<PreparedPathGlobs> {
self.0.clone().parse().map_err(|e| {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the error type is String - presumably the glob which failed to parse? You might not need to clone the rest of the globs.

.getattr("glob_match_error_behavior")?
.getattr("value")?
.extract()?;
let match_behavior = StrictGlobMatching::create(match_behavior_str, description_of_origin)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be able to do something like this to avoid a lot of the getattr boilerplate:

use pyo3::FromPyObject;

#[derive(FromPyObject)]
struct GlobMatchErrorBehavior<'a> {
    value: &'a str,
}

#[derive(FromPyObject)]
struct Conjunction<'a> {
    value: &'a str,
}

#[derive(FromPyObject)]
struct PyPathGlobs<'a> {
    globs: Vec<String>,
    description_of_origin: String,
    glob_match_error_behaviour: GlobMatchErrorBehavior<'a>,
    conjunction: Conjunction<'a>,
}

@Eric-Arellano
Copy link
Contributor Author

Great! Using PyO3 0.14.1 and Daniel's suggestion of Path::new instead of PathBuf::from, I now get PyO3 as slightly faster but essentially equivalent:

Before

❯ hyperfine --warmup 1 --runs=100 './pants'
  Time (mean ± σ):     698.7 ms ±   4.6 ms    [User: 584.0 ms, System: 94.4 ms]
  Range (min … max):   679.7 ms … 714.6 ms    100 runs

After:

❯ hyperfine --warmup 1 --runs=100 './pants'
  Time (mean ± σ):     697.4 ms ±   5.0 ms    [User: 581.5 ms, System: 95.4 ms]
  Range (min … max):   685.8 ms … 718.2 ms    100 runs

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@Eric-Arellano Eric-Arellano merged commit 889446b into pantsbuild:main Jul 22, 2021
@Eric-Arellano Eric-Arellano deleted the pyo3-pathglobs branch July 22, 2021 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants