
rayon_1_0_0::sort perf regression #56283

Closed

meven opened this issue Nov 27, 2018 · 8 comments
Assignees: pnkfelix
Labels: I-slow (Problems and improvements with respect to performance of generated code), P-medium (Medium priority), regression-from-stable-to-nightly (Performance or correctness regression from stable to nightly), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue)

Comments

meven commented Nov 27, 2018

According to the work done on lolbench, there was a perf regression in
rayon_1_0_0::sort::demo_merge_sort_descending and rayon_1_0_0::sort::par_sort_descending between nightly 2018-11-03 and nightly 2018-11-04.

The benchmarks slowed down by roughly 100%, with branch_instructions per iteration growing by close to 200%.

cc @anp

https://lolbench.rs/benchmarks/rayon-1-0-0-sort-demo-merge-sort-descending.html
https://lolbench.rs/benchmarks/rayon-1-0-0-sort-par-sort-descending.html
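
For reference, the regressed workload is roughly the sketch below (a minimal approximation of the benchmark pattern, not the actual lolbench harness; the input size is illustrative). rayon's par_sort is a stable parallel merge sort that allocates temporary buffers, which is what makes it sensitive to the global allocator:

extern crate rayon; // rayon = "1.0" in Cargo.toml

use rayon::prelude::*;

fn main() {
    // Descending input, as in the *_descending benchmarks.
    let mut v: Vec<u64> = (0..1_000_000u64).rev().collect();
    // par_sort allocates temporary merge buffers, so its throughput
    // depends heavily on the global allocator.
    v.par_sort();
    assert!(v.windows(2).all(|w| w[0] <= w[1]));
}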

@Centril Centril added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Nov 27, 2018
@nagisa nagisa added regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 27, 2018
@pnkfelix pnkfelix self-assigned this Nov 29, 2018
pnkfelix commented Nov 29, 2018

Tagging as P-high for the initial investigation of the scope of the performance regression.

@pnkfelix pnkfelix added the P-high High priority label Nov 29, 2018

nikic commented Nov 29, 2018

Commit range is 8b09631...04fdb44, which notably includes the jemalloc removal. Most likely this is just an allocator regression, and the lost performance can be recovered by explicitly enabling jemalloc.

pnkfelix commented Dec 3, 2018

I can observe the ~2.0x slowdown on my Linux desktop.

My Mac does not exhibit it nearly as cleanly.

I'm going to see if switching on jemallocator manually (by adding it as a #[global_allocator] in rayon-demo's main.rs) eliminates the regression.

  • Update: yes, turning on #[global_allocator] static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc; eliminates the regression.

pnkfelix commented Dec 3, 2018

Okay, so I think that confirms the hypothesis from @nikic.

You can see from the description of PR #55238 that switching from jemalloc to the glibc allocator was expected to cause regressions. That included the note that you can get the performance back (and then some) by switching to jemalloc-sys from crates.io. The Rust stdlib doesn't do this by default, but as far as I can tell, it is what we are recommending as the path for clients who want to maximize performance on this axis?

pnkfelix commented Dec 3, 2018

In any case, my personal recommendation is that Rayon should consider turning on jemalloc for its demo code.

Here is the necessary diff:

diff --git a/rayon-demo/Cargo.toml b/rayon-demo/Cargo.toml
index 897b5c8..e1aba34 100644
--- a/rayon-demo/Cargo.toml
+++ b/rayon-demo/Cargo.toml
@@ -5,6 +5,7 @@ authors = ["Niko Matsakis <niko@alum.mit.edu>"]
 publish = false

 [dependencies]
+jemallocator = "0.1.8"
 rayon = { path = "../" }
 cgmath = "0.16"
 docopt = "1"
diff --git a/rayon-demo/src/main.rs b/rayon-demo/src/main.rs
index 84619c0..e21d7b0 100644
--- a/rayon-demo/src/main.rs
+++ b/rayon-demo/src/main.rs
@@ -1,5 +1,9 @@
 #![cfg_attr(test, feature(test))]

+extern crate jemallocator;
+#[global_allocator]
+static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
+
 use std::env;
 use std::io;
 use std::io::prelude::*;
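
On the 2018 edition, the same opt-in is slightly shorter, since the extern crate line is no longer needed. A minimal standalone sketch (assuming jemallocator is already listed in Cargo.toml):

use jemallocator::Jemalloc;

// Route every heap allocation in this binary through jemalloc.
#[global_allocator]
static ALLOC: Jemalloc = Jemalloc;

fn main() {
    // Any allocation below now goes through jemalloc.
    let v: Vec<u32> = (0..1_000).collect();
    println!("{} elements", v.len());
}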

pnkfelix commented Dec 3, 2018

(I think there isn't much more to investigate here with respect to either the scope of the regression or its source. Downgrading from P-high to P-medium, and nominating for discussion at the next compiler meeting, with an inclination to close as "not-a-bug"/"wont-fix".)

@pnkfelix pnkfelix added I-nominated P-medium Medium priority and removed P-high High priority labels Dec 3, 2018

anp commented Dec 3, 2018

FWIW, I hope to support measuring against multiple allocators in lolbench in the future; I'm just not there yet.

pnkfelix commented Dec 6, 2018

Closing as wontfix.
