Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add build metrics to rustbuild #93717

Merged
merged 4 commits into from
Jun 5, 2022
Merged

Conversation

pietroalbini
Copy link
Member

This PR adds a new module of rustbuild, ci_profiler, whose job is to gather as much information as possible about the CI build as possible and store it in a JSON file uploaded to ci-artifacts. Right now for each step it collects:

  • Type name and debug representation of the Step object.
  • Duration of the step (excluding child steps).
  • Systemwide CPU stats for the duration of the step (both single core and all cores).
  • Which child steps were executed.

This is capable of replacing both the scripts to collect CPU stats and the [TIMING] lines in build logs (not yet removed, until we port our tooling to use the CI profiler). The format is also extensible to be able in the future to collect more information.

r? @Mark-Simulacrum

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 6, 2022
@pietroalbini
Copy link
Member Author

@bors try

@bors
Copy link
Contributor

bors commented Feb 6, 2022

⌛ Trying commit 23e9357c0f89dab9db81f8d12f317968734b25ce with merge 74513354f36e2f00af506239e4feaa74bf715e30...

@pietroalbini
Copy link
Member Author

Some open questions:

  • I'm not really happy with the "CI profiler" name but couldn't come up with anything better, ideas?
  • To avoid installing sysinfo on local builds this uses a new cargo feature, enabled by x.py when the config is set. Is that overkill?
  • The sampling frequency for CPU stats right now is 1s, should we shrink it to 100ms?

src/bootstrap/ci_profiler.rs Outdated Show resolved Hide resolved
src/bootstrap/ci_profiler.rs Outdated Show resolved Hide resolved
src/bootstrap/ci_profiler.rs Outdated Show resolved Hide resolved
src/bootstrap/ci_profiler.rs Outdated Show resolved Hide resolved
assert!(state.running_steps.is_empty(), "steps are still executing");

let steps = std::mem::take(&mut state.finished_steps);
let json = vec![JsonNode::RustbuildInvocation {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should define the top-level structure of this file to be extensible (i.e., an object, rather than an array).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the format should be extensible, but I think the schema I have achieves most of that goal.

As discussed in other comments, we want to track multiple x.py invocations into a single file. But, if we had a top-level object, the object wouldn't be able to store information that differs between invocations, such as the wall clock time of the whole invocation.

In the current schema, the role of the top-level object is in the RustbuildInvocation node, and all the metadata would be stored there. Having the top-level array then allows to easily store multiple of those invocations into a single file.

The only metadata I can see us wanting to store in the top level is maybe a format version number, but I'm not sure it's worth changing the schema just for that. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least some consumers -- e.g., total wall time analysis like my rustc-ci-timing scripts -- don't care about individual rustbuild invocations.

I also think in general a top-level object future-proofs us against any expansion we want to do, and in general unless there's a strong motivation to avoid it I'd prefer to have it (even if the only field is a "invocations" array).

As I mentioned elsewhere there are some properties that are more machine-dependent than invocation-specific that I'd like to add in the future (e.g., CPU model, disk space, etc.) that are useful to have on a per-job level IMO. Those are not really well-mapped to an individual x.py invocation. Likewise, I think it is useful to have the ability to represent e.g. CPU time spent outside x.py invocations, for example if we start trying to track the duration of AWS uploads or docker builds. Both of those don't map well to "RustbuildInvocations", and while we can have an array of differently-typed objects, that gets somewhat unwieldy fast, IMO.

@Mark-Simulacrum
Copy link
Member

I'm not really happy with the "CI profiler" name but couldn't come up with anything better, ideas?

Moving away from the CI naming might be one approach, something like "metrics" or "build-summary" -- in some sense, we're not doing anything unique to CI, and one might imagine that eventually we'd like to collect these (opt-in) for folks building to measure some sense of what machines/platforms people are using for rustc development, what their build performance looks like, something like that.

To avoid installing sysinfo on local builds this uses a new cargo feature, enabled by x.py when the config is set. Is that overkill?

I think this is a good idea. I debated a little over whether to hand-roll the logic (moving it from Python to Rust code), since we already have a libc dep it's probably not that hard, but it seems likely that we'll want to include more things (more on that in a little bit).

The sampling frequency for CPU stats right now is 1s, should we shrink it to 100ms?

I think 1s is probably more than plenty -- I'd be worried about bootstrap starting to use more CPU time than warranted the lower we go just collecting stats. (Note that we don't optimize bootstrap, so it's code is good to keep simple; in practice it's probably not important but I want to avoid anything hugely complicated).


As I mentioned in one of the inline comments, I would like the JSON file to be an object so it's extensible. Right off the bat, I think we should try to include the following extra bits:

  • A total wall time (Instant started and stopped around the whole build)
    • I want to avoid losing (in aggregate) time if e.g. the x.py code switching between steps starts taking a long time for whatever reason. It's also much easier to pull a single number than trying to do a recursive traversal and sum up a bunch of tiny measurements.

In the long run I'd love to include things like RAM, CPU core count, CPU model, etc; but that can come later. Let's just make sure our format is extensible so those deserializing can just have Option<...> or similar around fields rather than maintaining entirely separate structs for each kind.

@Mark-Simulacrum
Copy link
Member

For the CPU stats, I think the current design is OK, but I'd probably prefer we restructure so that instead of each Step trying to have a list of measurements taken during its runtime, we instead move things such that we have a single list of CPU measurements, recorded over the full build, and then each step just has a Range<usize> indicating which of those belong to it. We can still have the ad-hoc logic you created to force re-starting the collection right before a Step begins, of course.

Ideally I think we may want to switch over to having that Vec -- similar to the current CPU collection -- have tuples of (time, usage), rather than pretending that we're recording at some set frequency. With the step-start interrupts there's not a consistent frequency regardless.


We may also want to, instead of trying to replace the current CPU measurement, leave that Python script alone for now. At least on builders that do significant work outside of rustbuild, the current approach will either (a) leave gaps in total wall time / CPU measurements or (b) require some work to bring that in. the src/ci/pgo.sh script has a lot of work outside of the core x.py loop, but which is (I believe) captured by the CPU python script.

I'm not sure whether the python script captures docker image re-builds today, but similarly it would be nice to avoid relying on x.py being the root of truth here -- I'm not sure how to best structure this, maybe you have some thoughts.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Feb 6, 2022

💔 Test failed - checks-actions

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 6, 2022
@pietroalbini
Copy link
Member Author

Moving away from the CI naming might be one approach, something like "metrics" or "build-summary" -- in some sense, we're not doing anything unique to CI, and one might imagine that eventually we'd like to collect these (opt-in) for folks building to measure some sense of what machines/platforms people are using for rustc development, what their build performance looks like, something like that.

metrics / build.metrics is a great name, I'll switch to it.

A total wall time (Instant started and stopped around the whole build)

I'll add that.

In the long run I'd love to include things like RAM, CPU core count, CPU model, etc

I'll see how hard it is to fetch that information with the sysinfo crate.

We may also want to, instead of trying to replace the current CPU measurement, leave that Python script alone for now. At least on builders that do significant work outside of rustbuild, the current approach will either (a) leave gaps in total wall time / CPU measurements or (b) require some work to bring that in. the src/ci/pgo.sh script has a lot of work outside of the core x.py loop, but which is (I believe) captured by the CPU python script.

I'm not sure whether the python script captures docker image re-builds today, but similarly it would be nice to avoid relying on x.py being the root of truth here -- I'm not sure how to best structure this, maybe you have some thoughts.

Now that you mention that, I think the Python script and the CPU information collected by this PR can and probably should coexhist:

  1. The Python script tracks CPU usage across the whole build, even when x.py is not executed. Arguably seeing a graph of the CPU usage of a builder should be a feature the CI platform itself provides (for example CircleCI does that and it's extremely useful). This would be used mostly for "general trends" like "we're using 100% of the macOS CPU all the time".

  2. The tracking done in this PR is specific to x.py steps, and is something that's not trivial to pull out of the logs emitted by the Python script (it's definitely doable, just annoying and not that accurate). This would be used mostly when analyzing a report generated by the data in this PR, to see which steps are not that parallel or don't benefit from many cores.

For the CPU stats, I think the current design is OK, but I'd probably prefer we restructure so that instead of each Step trying to have a list of measurements taken during its runtime, we instead move things such that we have a single list of CPU measurements, recorded over the full build, and then each step just has a Range indicating which of those belong to it. We can still have the ad-hoc logic you created to force re-starting the collection right before a Step begins, of course.

Ideally I think we may want to switch over to having that Vec -- similar to the current CPU collection -- have tuples of (time, usage), rather than pretending that we're recording at some set frequency. With the step-start interrupts there's not a consistent frequency regardless.

I agree this would be the right approach for use case 1), but for 2) I kinda prefer the current implementation.

@Mark-Simulacrum
Copy link
Member

I'm not opposed to indefinitely keeping both, I think that approach definitely has its benefits -- both kinds of data are useful.

Anyway, I think I'm still in favor of a top-level object (with CPU name etc) -- metrics are not necessarily per-invocation.

@bors
Copy link
Contributor

bors commented Feb 13, 2022

☔ The latest upstream changes (presumably #93713) made this pull request unmergeable. Please resolve the merge conflicts.

@pietroalbini pietroalbini added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 13, 2022
@pietroalbini pietroalbini changed the title Add CI profiler to rustbuild Add build metrics to rustbuild Feb 13, 2022
@pietroalbini
Copy link
Member Author

Addressed all review comments.

@rust-log-analyzer

This comment has been minimized.

@pietroalbini
Copy link
Member Author

@bors try to generate a report

@bors
Copy link
Contributor

bors commented Feb 13, 2022

⌛ Trying commit bbd125b96f07e92c1df7b3c46fefd9f8cd8c4ca6 with merge 463b48e8399de823c8ce7f7c9a30190c17751b7e...

src/ci/scripts/upload-artifacts.sh Outdated Show resolved Hide resolved
src/bootstrap/metrics.rs Outdated Show resolved Hide resolved
// new step starts. This ensures the first sample for each step is collected one interval
// after the step started executing, instead of a random time between the start of the step
// and the expected interval (depending on when the last sample was collected, as part of
// the previous step).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment sort of suggests that a step will always get a sample, but for short-lived steps that's not the case.

I think we also need to consider that at least in theory, we should be hitting the reset at step exit as well -- we probably often have situations where we have an outer step that maybe spends 1ms on CPU and an inner step spending 30s on CPU, for example, followed by processing by the outer step (say for 100ms). This sort of interleaving is why I'm a little doubtful about the reset trigger at all, since it doesn't really seem like it's going to produce good results to me.

@bors
Copy link
Contributor

bors commented Feb 13, 2022

☀️ Try build successful - checks-actions
Build commit: 463b48e8399de823c8ce7f7c9a30190c17751b7e (463b48e8399de823c8ce7f7c9a30190c17751b7e)

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 14, 2022
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 22, 2022
@Mark-Simulacrum
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented May 23, 2022

📌 Commit 836b2d2 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 23, 2022
@bors
Copy link
Contributor

bors commented May 24, 2022

⌛ Testing commit 836b2d2 with merge f8a37e6cf0ad737b3d9f4f614370f51e3edf3562...

@bors
Copy link
Contributor

bors commented May 24, 2022

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 24, 2022
@rust-log-analyzer

This comment has been minimized.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 24, 2022
@pietroalbini
Copy link
Member Author

Huh that failure is weird. @GuillaumeGomez any idea why the sysinfo could be failing this way?

thread 'main' panicked at 'attempt to multiply with overflow', C:\Users\runneradmin\.cargo\registry\src\github.com-1285ae84e5963aae\sysinfo-0.23.11\src\windows\system.rs:167:34
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

@GuillaumeGomez
Copy link
Member

Seems like some multiplications didn't get the right method used and it's still present: https://github.com/GuillaumeGomez/sysinfo/blob/master/src/windows/system.rs#L171-L178

Do you prefer I make a 0.23.x release for fixing this bug or do you prefer that I send a PR where sysinfo is used to update it to 0.24.x?

@pietroalbini
Copy link
Member Author

@GuillaumeGomez sysinfo is being introduced by this PR, so a fix would be necessary to land it. I can upgrade the PR to 0.24 without a problem, so feel free to avoid backporting the fix to 0.23 :)

@GuillaumeGomez
Copy link
Member

Great! I fix the issue in GuillaumeGomez/sysinfo#757. I'll ping you once a made a release with it then.

@GuillaumeGomez
Copy link
Member

I published the 0.24.1 version.

@pietroalbini
Copy link
Member Author

Bumped sysinfo version to include the fix.

@bors r=Mark-Simulacrum

@bors
Copy link
Contributor

bors commented Jun 4, 2022

📌 Commit 70cdd7e has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 4, 2022
@bors
Copy link
Contributor

bors commented Jun 5, 2022

⌛ Testing commit 70cdd7e with merge 6dadfc0...

@bors
Copy link
Contributor

bors commented Jun 5, 2022

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 6dadfc0 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 5, 2022
@bors bors merged commit 6dadfc0 into rust-lang:master Jun 5, 2022
@rustbot rustbot added this to the 1.63.0 milestone Jun 5, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6dadfc0): comparison url.

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results
  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: mixed results
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
4.0% 4.0% 1
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
-3.2% -3.2% 1
All 😿🎉 (primary) N/A N/A 0

Cycles

This benchmark run did not return any relevant results for this metric.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@pietroalbini pietroalbini deleted the pa-ci-profiler branch June 6, 2022 07:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.