-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Experimental] Performance test new paging-based measureme data sink. #76436
[Experimental] Performance test new paging-based measureme data sink. #76436
Conversation
@bors try |
@michaelwoerister: 🔑 Insufficient privileges: not in try users |
@bors try |
⌛ Trying commit d9054f8d414a52d189b26a0fa92f614740d596b2 with merge 12da0f82045ff20d2ae70c83c926bae9c22df223... |
💔 Test failed - checks-actions |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
8fd2459
to
ca1bb18
Compare
OK, this seems to pass all checks now. @Xanewok, would you be so kind and start another perf run for me? |
Sure! Let's see if this one-liner works... @bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit ca1bb18264167719f514e66b94870f465036810f with merge cc857b33d54a8ee4ee103623dc1de441b8cf6009... |
Try build finished but intentionally not queueing this on perf, as it won't work. I'm working on patch that we can land to make this work side-by-side with regular builds. |
Looks like this currently fails in rustc, when running on helloworld
|
Indeed it does. |
ca1bb18
to
e5d80ea
Compare
@bors try |
⌛ Trying commit e5d80ea with merge da6b924ee19d83e73682a095f60d8bc79de4cdc3... |
☀️ Try build successful - checks-actions, checks-azure |
Queued da6b924ee19d83e73682a095f60d8bc79de4cdc3 with parent 5a6b426, future comparison URL. |
I've put a new summarize on the perf collector which seemed to work locally (the collector supports 2 different simultaneous summarizes). I am uncertain if the upload logic will break, but we'll tackle that if necessary. |
Thanks @Mark-Simulacrum, I was able to do a full rustc-perf run with this version locally. Let's see if something else breaks. One way or the other, I'll need to rewrite the whole thing anyway before it can be merged. The current version has quite a bit of room for improvement in terms of code quality. |
Okay had to patch perf.rlo a bit but it should collect now. The site is likely to show just noise -- we've likely not changed overhead for collection being off, and perf.rlo only enables self-profile for one out of 2-3 runs of each benchmark. I will be able to manually pull up data from the database for the maximum rather than minimum statistic which should give us some idea of how much of a regression/win this is. |
Finished benchmarking try commit (da6b924ee19d83e73682a095f60d8bc79de4cdc3): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
Alright, those numbers look encouraging. I'll put together a version that doesn't use a background thread tomorrow. |
Interesting, yes. There seem to be a few minor regressions. Maybe there is a bit of a pattern here: Most regressions seem to be scenarios that are quick to execute and where there's little time spent in LLVM and the linker. Maybe the background thread becomes the bottleneck towards the end, when it is still writing stuff to disk and the rest of the program is only waiting on it to finish? How did you generate those numbers? Is there a special mode in the collector? |
@Mark-Simulacrum I just pushed a version that writes data to disk in a blocking way. That data format is not affected by this so the tooling on the server should still be compatible. |
@bors try @rust-timer queue
The collector runs rustc 2-3 times: once for self-profile, and then 1-2 times without it. By default, the perf.rust-lang.org frontend shows the minimum of each stat across those runs. Since self-profile does impose some overhead, that's usually not the self-profile run. The page I posted switched min to max (manually editing the SQL query). I guess it might create a recursive "self feeding" problem, but in theory if we record events for the serialization as well, then crox should be able to show us when those are being serialized and to what extent we're waiting on them. I agree that it seems likely we're waiting at the end. |
Awaiting bors try build completion |
⌛ Trying commit 2ad078b with merge 7479e5973d66928be9e1bbce6cd377ff0957c094... |
☀️ Try build successful - checks-actions, checks-azure |
Queued 7479e5973d66928be9e1bbce6cd377ff0957c094 with parent 3f5e617, future comparison URL. |
Huh, that basically means we are not measuring anything useful with these perf runs, right?
There's definitely some things that we can record. For things that are application specific (e.g. the handling of query keys before they make it into However, the final flushing of the file would need some kind of special handling. Right now we do that in |
Running the query is not hard so I don't mind doing that. I'm also thinking about exposing a toggle for "just self-profile" results (we do technically have enough information) on the UI, though I'll probably not get around to it for some time. |
Finished benchmarking try commit (7479e5973d66928be9e1bbce6cd377ff0957c094): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
Yeah, this looks actually better than the more complicated version. That's good to know. These things so often are so counterintuitive :) Thanks a lot for getting those numbers. I think it's really great that perf.rlo is using an SQL database now, btw. |
I'm going to close this PR, we got the numbers we need for deciding on how to forward. |
This PR will try to make sure that the new
measureme
file format (as described in rust-lang/measureme#128) does not regress performance while doing self-profiling.The perf.rlo run is actually expected to fail (because the tools on the perf server cannot handle the new format yet) but hopefully we'll still get the top-level timings that should show potential regressions.
The implementation of the new format as referenced by this PR is not production ready yet, but it should give a good indication of final performance. Once we know if this method of performance testing actually yields results, we can also test an alternative implementation that is slightly simpler.
cc @wesleywiser @Mark-Simulacrum