Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICE compiling rustc-serialize under sccache 'failed to acquire jobserver token' #42867

Closed
rillian opened this issue Jun 23, 2017 · 44 comments
Closed
Labels
C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️

Comments

@rillian
Copy link
Contributor

rillian commented Jun 23, 2017

Failure building Firefox with rustc 1.20.0-nightly (ab5bec2 2017-06-22).

I can't reproduced on a local build with --enable-stylo so it's something to do with the sccache wrapper or the build environment.

checking for rustc... /home/worker/workspace/build/src/rustc/bin/rustc
checking for cargo... /home/worker/workspace/build/src/rustc/bin/cargo
checking rustc version... 1.20.0-nightly
checking cargo version... 0.21.0
[...]
error: internal compiler error: unexpected panic
note: the compiler unexpectedly panicked. this is a bug.
note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
note: rustc 1.20.0-nightly (ab5bec255 2017-06-22) running on x86_64-unknown-linux-gnu
note: run with `RUST_BACKTRACE=1` for a backtrace
thread 'rustc' panicked at 'failed to acquire jobserver token: Error { repr: Os { code: 9, message: "Bad file descriptor" } }', /checkout/src/libcore/result.rs:860
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
             at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at /checkout/src/libstd/sys_common/backtrace.rs:71
   2: std::panicking::default_hook::{{closure}}
             at /checkout/src/libstd/sys_common/backtrace.rs:60
             at /checkout/src/libstd/panicking.rs:355
   3: std::panicking::default_hook
             at /checkout/src/libstd/panicking.rs:365
   4: std::panicking::rust_panic_with_hook
             at /checkout/src/libstd/panicking.rs:549
   5: std::panicking::begin_panic
             at /checkout/src/libstd/panicking.rs:511
   6: std::panicking::begin_panic_fmt
             at /checkout/src/libstd/panicking.rs:495
   7: rust_begin_unwind
             at /checkout/src/libstd/panicking.rs:471
   8: core::panicking::panic_fmt
             at /checkout/src/libcore/panicking.rs:69
   9: core::result::unwrap_failed
  10: rustc_trans::back::write::execute_work
  11: rustc_trans::back::write::run_passes
  12: rustc_driver::driver::phase_5_run_llvm_passes
  13: rustc_driver::driver::compile_input
  14: rustc_driver::run_compiler
error: Could not compile `rustc-serialize`.
Caused by:
  process didn't exit successfully: `/home/worker/workspace/build/src/sccache2/sccache /home/worker/workspace/build/src/rustc/bin/rustc --crate-name rustc_serialize /home/worker/workspace/build/src/third_party/rust/rustc-serialize/src/lib.rs --crate-type lib --emit=dep-info,link -C opt-level=2 -C debuginfo=2 -C metadata=eaa267b04fe4074c -C extra-filename=-eaa267b04fe4074c --out-dir /home/worker/workspace/build/src/obj-firefox/toolkit/library/release/deps -C linker=/home/worker/workspace/build/src/build/cargo-linker -L dependency=/home/worker/workspace/build/src/obj-firefox/toolkit/library/release/deps --cap-lints allow` (exit code: 101)
@Mark-Simulacrum Mark-Simulacrum added the I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️ label Jun 23, 2017
@rillian
Copy link
Contributor Author

rillian commented Jun 23, 2017

On the (linux-hosted) macOS build, the same panic happens building the simd crate. It must be something to with the jobserver rather than the specific input code.

@glandium
Copy link
Contributor

related to rust-lang/cargo#1744 ?

@glandium
Copy link
Contributor

The "failed to acquire jobserver token" message only appears in cargo, and comes from that issue I just mentioned.

@rillian
Copy link
Contributor Author

rillian commented Jun 23, 2017

The jobserver moved into the compiler's parallel support recently. See #42682.

@glandium
Copy link
Contributor

Err, I had failed to update my rust tree properly, which is why I still was seeing the old code.

@alexcrichton
Copy link
Member

Hm sounds bad! @rillian do you know if there's a way to reproduce this short of "check out gecko and build everything"? I'm trying some local smaller test cases but can't seem to get the same assertion to show up.

@rillian
Copy link
Contributor Author

rillian commented Jun 25, 2017

It's even harder than that. I've only seen it in the automation builds, so we need something from that environment. I couldn't reproduce with a local gecko build. Probably the best a approach is to use a taskcluster image.

I can investigate further, but it will take a few days.

@alexcrichton
Copy link
Member

Hm ok, I'll try expanding this a bit. Yes jobserver support was recently added to both rustc and Cargo with the intention of being able to create their own and integrate with foreign jobservers. Both rustc and Cargo have the same support code, a jobserver crate.

Jobserver inheritance is specified through the MAKEFLAGS env var which the crate will parse and attempt to interpret. On unix this means it'll attempt to parse the string as file descriptors and interpret those. Note, though, that it attempts to validate that the file descriptor is indeed valid by checking fstat that it's a pipe. Both cargo and rustc ignore misconfigured MAKEFLAGS and fall back to creating their own jobserver if it looks like a funny env var.

So what's most likely to be happening here is that MAKEFLAGS is passed into rustc, it passes validation, and then later on it turns out it was actually invalid (for whatever reason) which causes issues. The error here, "bad file descriptor," I... don't know how that could happen! That seems to me like a file descriptor was closed but rustc afaik doesn't do that.

Some questions that may help narrow this down:

  • Does this only happen with an sccache wrapper?
  • If so, what revision of sccache?
  • If so, is the sccache server started elsewhere?
  • If not, what's the process tree look like when this step in the build process is called? Is make somewhere in the history? Is cargo the one expected to be creating jobservers? (etc)

@glandium
Copy link
Contributor

sccache has a client-server model, and it's the server that does actual compilations, invoking rustc. The problem might be that the server and the client are not running in the same make environment or something related to that.

In any case, cargo/rustc should probably avoid failing in that case...

Cc: @luser

@rillian
Copy link
Contributor Author

rillian commented Jun 27, 2017

I wasn't able to reproduce with sccache disabled.

@alexcrichton
Copy link
Member

@rillian do you know how sccache is started in the build system? Is the server started manually way near the beginning?

Additionally is this perhaps invoked by make or otherwise? Or is this run in a "standalone" fashion?

@glandium
Copy link
Contributor

I /think/ the sccache server starts during the make run, but it is possible other compilations happen during a /separate/ make run. (specifically, things related to gtests would)

@glandium
Copy link
Contributor

@luser would likely know better.

@rillian
Copy link
Contributor Author

rillian commented Jun 27, 2017

Afaict we rely on the first sccache invocation to start it. There is code to explicitly shut down any running server before we start and after we finish. I don't see any spurious invocations of sccache --stop-server in the log though. Just once at the start of the failed run, so don't think that's it.

Is it possible traversing a make invocation boundary is closing the file descriptor rust (or sccache?) received from the environment?

Our builds are say they're using sccache 9155425cfc038d6a60deb50816055f4e93b93ad1, but I don't see that commit in the history. Maybe it's a fork.

@rillian
Copy link
Contributor Author

rillian commented Jun 27, 2017

Yes, our scache build is from luser/sccache@9155425.

@alexcrichton
Copy link
Member

Hm ok I'm still not sure how this can happen... It sounds like sccache is started early enough that it doesn't inherit any file descriptors. This means that rustc runs with --jobserver-fds=... but those are invalid fds. This should be detected by the jobserver crate unless using fstat is just broken?

@rillian
Copy link
Contributor Author

rillian commented Jul 10, 2017

I never got time to investigate, but I can still reproduce with today's 1.20.0-nightly.

@glandium
Copy link
Contributor

glandium commented Jul 13, 2017

So, there are several things going on, but I'll just start with the one that's really relevant to cargo and rust:

  • As mentioned in Add support for make jobservers cargo#1744 (comment) make is kind of a jerk, and sets the environment variables with --jobserver-fds even when it doesn't give out the pipes. Which means, as mentioned in that comment, the very first thing that any jobserver client needs to do is to check whether the file descriptors are there or not, because anything after the process startup may open file descriptors and get those numbers. This also means ICE'ing is not the best thing rustc and cargo can do in that (unfortunately) common case.
  • The Firefox build system is complex. When the first call to sccache happens, which spawns the sccache server, we're in configure, running from make, running from mach, and that make instance is without a jobserver (essentially because the makefile is bonkers and doesn't work with parallelism). So we don't get MAKEFLAGS set with --jobserver-fds, and don't hit the rust ICE as long as we keep using that sccache server, We might be able to get away with enabling the jobserver and adding a .NOTPARALLEL: to that makefile, but make would still not pass the fds to configure (we'd have to prefix the configure call with a +).
  • For some lucky reason (I guess that has to do with make export taking too long ; edit: In fact, it's because the preflight rule from build/sccache.mk runs after configure, killing any sccache server from configure, although the original intent was for it to run before configure...), by the time we reach the first compilation, we end up launching a new sccache server, which then inherits the MAKEFLAGS with --jobserver-fds because we're now in a make instance with a jobserver (launched from the one without from above, don't ask, I told you it was complex). But because it's not prefixed with a +, make doesn't pass the fds, and rust barfs. So in the Firefox build system, we'd need to prefix all invocations of sccache from a Makefile with a +, as well as the one for cargo. The latter would be necessary whether or not sccache is enabled for the build.

For all these reasons above, when I tried this morning with 1.20.0-nightly on a recent tree, it didn't fail, because luck. And as mentioned to @alexcrichton on irc, cargo also didn't actually use the job server, for the same reasons.

@glandium
Copy link
Contributor

FWIW, other than those issues, the jobserver client support in cargo and rustc seems to work.

@alexcrichton
Copy link
Member

Hm so I'm still confused how rustc/cargo are using a broken jobserver. There should be enough guards in place all along the way to prevent usage of a misconfigured or broken jobserver, it's expected that we get MAKEFLAGS without actually getting file descriptors in a sense.

The error above is EBADF returned from a call to read, which is very confusing to me. This means that the file descriptor is closed in the Rust process, which given the validation I mentioned previously that the jobserver crate does seems quite odd. Even if random file descriptors are leaked into a rustc process to fool it about the jobserver, how are the file descriptors closed in the rustc process?

@glandium
Copy link
Contributor

EBADF doesn't necessarily mean the fd is closed. It means it's not a valid fd for reading. Which means it could be a valid write end of a pipe. As a matter of fact, I've seen that happen in sccache itself, but sccache doesn't try to use the fds from MAKEFLAGS, so that wasn't a problem: the process had MAKEFLAGS=--jobserver-fds=5,6, and per /proc/pid/fd, 6 and 7 were end points of the same pipe (which I presume, come from a rust channel).

Anyways, here's the likely scenario:

  • Make invokes cargo with MAKEFLAGS=--jobserver-fds=n,m but doesn't pass the fds.
  • Cargo starts with fd 0, 1, 2 respectively for stdin, stdout and stderr, and nothing else.
  • Cargo does stuff, which opens fds, and ends up opening fds that happen to have number n and/or m
  • Cargo tries to use that after checking MAKEFLAGS and kaboom.

That's a simplified scenario. The real-world problem that this issue was opened for goes one level deeper, with rustc.

The jobserver code does check that the fds are end points of pipes, but the fact is, by the time the fds are checked, they might well be.

Disclaimer: I haven't actually seen what fds were open in a rustc process, but this seems like a reasonable explanation. The safest way IMHO, that the jobserver code can be initialized is from a lazy_static (which doesn't even guarantee it would be first, but at least, the chances of something else opening fds are slim)

@alexcrichton
Copy link
Member

Ah I didn't actually know you could get EBADF for the wrong end of a pipe! I guess that makes sense though, and sounds quite plausible. I'm still having difficulty constructing how this is possible to happen though. In Rust all files are atomically opened with CLOEXEC so nothing from sccache should leak into spawned processes of sccache. Additionally practically the first thing Cargo does is call Config::default which'll hit jobserver initialization. In rustc we build s session quite early which'll hit initialization.

I wonder if fds are being leaked into sccache itself from make and/or configure? Is there a way to check what the fds are of sccache itself when this happens?

It does sound like regardless that the something crate needs to be even more resilient to to a "misconfigured" environment. I'm not really sure what this would look like in terms of a reasonable implementation though... I wonder if sccache could just scrub env vars for a quick fix?

@glandium
Copy link
Contributor

I wonder if fds are being leaked into sccache itself from make and/or configure?

There are, but I think they are red herrings... as a matter of fact, when the fds from make are passed down (and still other fds leaked), as noted in #42867 (comment) , this actually works properly.

@glandium
Copy link
Contributor

I actually double checked last comment by taking the same Firefox tree from @rillian's try push, and applying bug 1367940 on top of it. https://treeherder.mozilla.org/#/jobs?repo=try&revision=b7b58c097d8c0b4340ee88c982c663655b51afcb

@Mark-Simulacrum Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 22, 2017
@staktrace
Copy link
Contributor

I'm running into this same failure (ICE with 'failed to acquire jobserver token') when doing a standalone webrender build (no gecko/mozilla-central involved) with sccache. I don't get the error without sccache.

The machine it's running on is also pretty minimal - it just has the bare minimum of things needed to build webrender. Latest stable rust/cargo (1.20.0) and latest sccache (0.2.1). The build is run on the machine through taskcluster integration, but the machine is in a server farm we rented from macstadium and that we're managing manually. You can see the full taskcluster log at https://tools.taskcluster.net/groups/BqgBtyCvQUqgDaQAJglrUA/tasks/Q8UMqP9TR7-va2ztvs-lSw/runs/0/logs/public%2Flogs%2Flive.log

It would be great to get this resolved as we want to use this setup for webrender CI and being able to use sccache means we'll get much faster turnaround.

@alexcrichton
Copy link
Member

@staktrace hm that's a very curious error... The OS error there is "Resource temporarily unavailable" with code 35 which seems to correspond to EDEADLK. Apparently calling read led to EDEADLK?

Are you running sccache manually ahead of time? Or is it just getting spawned naturally through the first invocation?

@staktrace
Copy link
Contributor

@alexcrichton It is getting spawned naturally via the first invocation.

@alexcrichton
Copy link
Member

I... still have no idea how to explain this.

Using sccache sort of fundamentally can't work with how jobservers are designed, but rustc should have a check to prevent it from using bad fds. This means that somehow bad fds are leaking in, and I have no idea how to explain that.

In any case none of this is really supposed to work anyway (sccache working with jobservers) so I think we need to either:

  1. Remove any jobserver-related env vars from spawned processes in sccache
  2. Configure sccache with its own jobserver which is passed to spawned processes.

I'm personally a little in favor of (2)

@aneeshusa
Copy link

I haven't looked into this at all yet, but wanted to mention I'm getting similar errors when trying to enable this on Servo CI.

alexcrichton added a commit to alexcrichton/sccache that referenced this issue Sep 27, 2017
This commit alters the main sccache server to operate and orchestrate its own
GNU make style jobserver. This is primarily intended for interoperation with
rustc itself.

The Rust compiler currently has a multithreaded mode where it will execute code
generation and optimization on the LLVM side of things in parallel. This
parallelism, however, can overload a machine quickly if not properly accounted
for (e.g. if 10 rustcs all spawn 10 threads...). The usage of a GNU make style
jobserver is intended to arbitrate and rate limit all these rustc instances to
ensure that one build's maximal parallelism never exceeds a particular amount.

Currently for Rust Cargo is the primary driver for setting up a jobserver. Cargo
will create this and manage this per compilation, ensuring that any one `cargo
build` invocation never exceeds a maximal parallelism. When sccache enters the
picture, however, the story gets slightly more odd.

The jobserver implementation on Unix relies on inheritance of file descriptors
in spawned processes. With sccache, however, there's no inheritance as the
actual rustc invocation is spawned by the server, not the client. In this case
the env vars used to configure the jobsever are usually incorrect.

To handle this problem this commit bakes a jobserver directly into sccache
itself. The jobserver then overrides whatever jobserver the client has
configured in its own env vars to ensure correct operation. The settings of each
jobserver may be misconfigured (there's no way to configure sccache's jobserver
right now), but hopefully that's not too much of a problem for the forseeable
future.

The implementation here was to provide a thin wrapper around the `jobserver`
crate with a futures-based interface. This interface was then hooked into the
mock command infrastructure to automatically acquire a jobserver token when
spawning a process and automatically drop the token when the process exits.
Additionally, all spawned processes will now automatically receive a configured
jobserver.

cc rust-lang/rust#42867, the original motivation for this commit
@alexcrichton
Copy link
Member

I've opened mozilla/sccache#185 with my personal preferred strategy of adding a jobserver to sccache.

jrobsonchase pushed a commit to jrobsonchase/sccache that referenced this issue Oct 2, 2017
This commit alters the main sccache server to operate and orchestrate its own
GNU make style jobserver. This is primarily intended for interoperation with
rustc itself.

The Rust compiler currently has a multithreaded mode where it will execute code
generation and optimization on the LLVM side of things in parallel. This
parallelism, however, can overload a machine quickly if not properly accounted
for (e.g. if 10 rustcs all spawn 10 threads...). The usage of a GNU make style
jobserver is intended to arbitrate and rate limit all these rustc instances to
ensure that one build's maximal parallelism never exceeds a particular amount.

Currently for Rust Cargo is the primary driver for setting up a jobserver. Cargo
will create this and manage this per compilation, ensuring that any one `cargo
build` invocation never exceeds a maximal parallelism. When sccache enters the
picture, however, the story gets slightly more odd.

The jobserver implementation on Unix relies on inheritance of file descriptors
in spawned processes. With sccache, however, there's no inheritance as the
actual rustc invocation is spawned by the server, not the client. In this case
the env vars used to configure the jobsever are usually incorrect.

To handle this problem this commit bakes a jobserver directly into sccache
itself. The jobserver then overrides whatever jobserver the client has
configured in its own env vars to ensure correct operation. The settings of each
jobserver may be misconfigured (there's no way to configure sccache's jobserver
right now), but hopefully that's not too much of a problem for the forseeable
future.

The implementation here was to provide a thin wrapper around the `jobserver`
crate with a futures-based interface. This interface was then hooked into the
mock command infrastructure to automatically acquire a jobserver token when
spawning a process and automatically drop the token when the process exits.
Additionally, all spawned processes will now automatically receive a configured
jobserver.

cc rust-lang/rust#42867, the original motivation for this commit
@whitslack
Copy link

So I believe that EAGAIN primarily has to do with nonblocking mode, which I think means that something is setting these file descriptors to nonblocking mode, which would certainly cause problems! As to what is setting them to nonblocking mode... is a good question.

Make sets its jobserver pipe file descriptors to non-blocking mode when it's configured to use pselect.

@luser
Copy link
Contributor

luser commented Oct 5, 2017

Yeah, the EAGAIN in cargo is almost 100% reproducible using gmake master to build Firefox. Changing that #ifdef HAVE_PSELECT to #if 0 makes it go away.

@alexcrichton
Copy link
Member

@whitslack excellent point! I've got a fix for your and @luser's issue at rust-lang/jobserver-rs#2

alexcrichton added a commit to alexcrichton/sccache that referenced this issue Oct 6, 2017
This commit alters the main sccache server to operate and orchestrate its own
GNU make style jobserver. This is primarily intended for interoperation with
rustc itself.

The Rust compiler currently has a multithreaded mode where it will execute code
generation and optimization on the LLVM side of things in parallel. This
parallelism, however, can overload a machine quickly if not properly accounted
for (e.g. if 10 rustcs all spawn 10 threads...). The usage of a GNU make style
jobserver is intended to arbitrate and rate limit all these rustc instances to
ensure that one build's maximal parallelism never exceeds a particular amount.

Currently for Rust Cargo is the primary driver for setting up a jobserver. Cargo
will create this and manage this per compilation, ensuring that any one `cargo
build` invocation never exceeds a maximal parallelism. When sccache enters the
picture, however, the story gets slightly more odd.

The jobserver implementation on Unix relies on inheritance of file descriptors
in spawned processes. With sccache, however, there's no inheritance as the
actual rustc invocation is spawned by the server, not the client. In this case
the env vars used to configure the jobsever are usually incorrect.

To handle this problem this commit bakes a jobserver directly into sccache
itself. The jobserver then overrides whatever jobserver the client has
configured in its own env vars to ensure correct operation. The settings of each
jobserver may be misconfigured (there's no way to configure sccache's jobserver
right now), but hopefully that's not too much of a problem for the forseeable
future.

The implementation here was to provide a thin wrapper around the `jobserver`
crate with a futures-based interface. This interface was then hooked into the
mock command infrastructure to automatically acquire a jobserver token when
spawning a process and automatically drop the token when the process exits.
Additionally, all spawned processes will now automatically receive a configured
jobserver.

cc rust-lang/rust#42867, the original motivation for this commit
jrobsonchase pushed a commit to jrobsonchase/sccache that referenced this issue Oct 10, 2017
This commit alters the main sccache server to operate and orchestrate its own
GNU make style jobserver. This is primarily intended for interoperation with
rustc itself.

The Rust compiler currently has a multithreaded mode where it will execute code
generation and optimization on the LLVM side of things in parallel. This
parallelism, however, can overload a machine quickly if not properly accounted
for (e.g. if 10 rustcs all spawn 10 threads...). The usage of a GNU make style
jobserver is intended to arbitrate and rate limit all these rustc instances to
ensure that one build's maximal parallelism never exceeds a particular amount.

Currently for Rust Cargo is the primary driver for setting up a jobserver. Cargo
will create this and manage this per compilation, ensuring that any one `cargo
build` invocation never exceeds a maximal parallelism. When sccache enters the
picture, however, the story gets slightly more odd.

The jobserver implementation on Unix relies on inheritance of file descriptors
in spawned processes. With sccache, however, there's no inheritance as the
actual rustc invocation is spawned by the server, not the client. In this case
the env vars used to configure the jobsever are usually incorrect.

To handle this problem this commit bakes a jobserver directly into sccache
itself. The jobserver then overrides whatever jobserver the client has
configured in its own env vars to ensure correct operation. The settings of each
jobserver may be misconfigured (there's no way to configure sccache's jobserver
right now), but hopefully that's not too much of a problem for the forseeable
future.

The implementation here was to provide a thin wrapper around the `jobserver`
crate with a futures-based interface. This interface was then hooked into the
mock command infrastructure to automatically acquire a jobserver token when
spawning a process and automatically drop the token when the process exits.
Additionally, all spawned processes will now automatically receive a configured
jobserver.

cc rust-lang/rust#42867, the original motivation for this commit
@jrmuizel
Copy link
Contributor

Any update on this?

@alexcrichton
Copy link
Member

@luser want to take a look at mozilla/sccache#185 and see if you agree with it?

luser pushed a commit to mozilla/sccache that referenced this issue Jan 18, 2018
This commit alters the main sccache server to operate and orchestrate its own
GNU make style jobserver. This is primarily intended for interoperation with
rustc itself.

The Rust compiler currently has a multithreaded mode where it will execute code
generation and optimization on the LLVM side of things in parallel. This
parallelism, however, can overload a machine quickly if not properly accounted
for (e.g. if 10 rustcs all spawn 10 threads...). The usage of a GNU make style
jobserver is intended to arbitrate and rate limit all these rustc instances to
ensure that one build's maximal parallelism never exceeds a particular amount.

Currently for Rust Cargo is the primary driver for setting up a jobserver. Cargo
will create this and manage this per compilation, ensuring that any one `cargo
build` invocation never exceeds a maximal parallelism. When sccache enters the
picture, however, the story gets slightly more odd.

The jobserver implementation on Unix relies on inheritance of file descriptors
in spawned processes. With sccache, however, there's no inheritance as the
actual rustc invocation is spawned by the server, not the client. In this case
the env vars used to configure the jobsever are usually incorrect.

To handle this problem this commit bakes a jobserver directly into sccache
itself. The jobserver then overrides whatever jobserver the client has
configured in its own env vars to ensure correct operation. The settings of each
jobserver may be misconfigured (there's no way to configure sccache's jobserver
right now), but hopefully that's not too much of a problem for the forseeable
future.

The implementation here was to provide a thin wrapper around the `jobserver`
crate with a futures-based interface. This interface was then hooked into the
mock command infrastructure to automatically acquire a jobserver token when
spawning a process and automatically drop the token when the process exits.
Additionally, all spawned processes will now automatically receive a configured
jobserver.

cc rust-lang/rust#42867, the original motivation for this commit
@luser
Copy link
Contributor

luser commented Jan 18, 2018

I finally merged @alexcrichton's change into sccache. If someone wants to test to see if this indeed fixes the issues in Servo CI and elsewhere I'd be happy to tag a new sccache release to get new binaries.

@aneeshusa
Copy link

@luser I'd definitely be interested in trying out the latest in Servo CI! If you cut a new release (or even a pre-release/beta since we're testing) and we can grab some new binaries, we can give it a whirl.

bors-servo pushed a commit to servo/servo that referenced this issue Jan 28, 2018
Re-enable sccache for Linux builds

As far as I know, sccache is working properly
on the non-cross-compiling Linux builders.
For safety, only enable it for the builders that run on PRs,
to avoid breaking our nightly generation and scheduled test runs.

This will also allow testing new versions of sccache more easily.
This implements my suggestion from #19858 (comment), and should also let us
handle testing a new sccache: rust-lang/rust#42867 (comment) (our current version of sccache [seems to be 2018-01-09](https://github.com/servo/saltfs/blob/f50214b8fa012e03616ecae1ef2913e6fe9044da/servo-build-dependencies/ci-map.jinja#L5)).

<!-- Please describe your changes on the following line: -->

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [ ] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [x] These changes do not require tests because they change the CI configuration

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/19883)
<!-- Reviewable:end -->
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jan 29, 2018
…usa:reenable-sccache-partially); r=jdm

As far as I know, sccache is working properly
on the non-cross-compiling Linux builders.
For safety, only enable it for the builders that run on PRs,
to avoid breaking our nightly generation and scheduled test runs.

This will also allow testing new versions of sccache more easily.
This implements my suggestion from servo/servo#19858 (comment), and should also let us
handle testing a new sccache: rust-lang/rust#42867 (comment) (our current version of sccache [seems to be 2018-01-09](https://github.com/servo/saltfs/blob/f50214b8fa012e03616ecae1ef2913e6fe9044da/servo-build-dependencies/ci-map.jinja#L5)).

<!-- Please describe your changes on the following line: -->

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [ ] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [x] These changes do not require tests because they change the CI configuration

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

Source-Repo: https://github.com/servo/servo
Source-Revision: 78ffce1cbe5fcce4d057b69c3cbf0cd2bc2b449c

--HG--
extra : subtree_source : https%3A//hg.mozilla.org/projects/converted-servo-linear
extra : subtree_revision : 04b98e4452ebe655c59d54f42827b6f3c29b0cd9
@staktrace
Copy link
Contributor

FWIW my STR from #42867 (comment) seem to be fixed with the latest published sccache (0.2.6)

@alexcrichton
Copy link
Member

Hurray!

aidanhs pushed a commit to aidanhs/sccache that referenced this issue May 17, 2018
This commit alters the main sccache server to operate and orchestrate its own
GNU make style jobserver. This is primarily intended for interoperation with
rustc itself.

The Rust compiler currently has a multithreaded mode where it will execute code
generation and optimization on the LLVM side of things in parallel. This
parallelism, however, can overload a machine quickly if not properly accounted
for (e.g. if 10 rustcs all spawn 10 threads...). The usage of a GNU make style
jobserver is intended to arbitrate and rate limit all these rustc instances to
ensure that one build's maximal parallelism never exceeds a particular amount.

Currently for Rust Cargo is the primary driver for setting up a jobserver. Cargo
will create this and manage this per compilation, ensuring that any one `cargo
build` invocation never exceeds a maximal parallelism. When sccache enters the
picture, however, the story gets slightly more odd.

The jobserver implementation on Unix relies on inheritance of file descriptors
in spawned processes. With sccache, however, there's no inheritance as the
actual rustc invocation is spawned by the server, not the client. In this case
the env vars used to configure the jobsever are usually incorrect.

To handle this problem this commit bakes a jobserver directly into sccache
itself. The jobserver then overrides whatever jobserver the client has
configured in its own env vars to ensure correct operation. The settings of each
jobserver may be misconfigured (there's no way to configure sccache's jobserver
right now), but hopefully that's not too much of a problem for the forseeable
future.

The implementation here was to provide a thin wrapper around the `jobserver`
crate with a futures-based interface. This interface was then hooked into the
mock command infrastructure to automatically acquire a jobserver token when
spawning a process and automatically drop the token when the process exits.
Additionally, all spawned processes will now automatically receive a configured
jobserver.

cc rust-lang/rust#42867, the original motivation for this commit
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 2, 2019
…usa:reenable-sccache-partially); r=jdm

As far as I know, sccache is working properly
on the non-cross-compiling Linux builders.
For safety, only enable it for the builders that run on PRs,
to avoid breaking our nightly generation and scheduled test runs.

This will also allow testing new versions of sccache more easily.
This implements my suggestion from servo/servo#19858 (comment), and should also let us
handle testing a new sccache: rust-lang/rust#42867 (comment) (our current version of sccache [seems to be 2018-01-09](https://github.com/servo/saltfs/blob/f50214b8fa012e03616ecae1ef2913e6fe9044da/servo-build-dependencies/ci-map.jinja#L5)).

<!-- Please describe your changes on the following line: -->

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [ ] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [x] These changes do not require tests because they change the CI configuration

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

Source-Repo: https://github.com/servo/servo
Source-Revision: 78ffce1cbe5fcce4d057b69c3cbf0cd2bc2b449c

UltraBlame original commit: d897da50fc624ee27af7c16c02e368b92f4a6f5e
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 2, 2019
…usa:reenable-sccache-partially); r=jdm

As far as I know, sccache is working properly
on the non-cross-compiling Linux builders.
For safety, only enable it for the builders that run on PRs,
to avoid breaking our nightly generation and scheduled test runs.

This will also allow testing new versions of sccache more easily.
This implements my suggestion from servo/servo#19858 (comment), and should also let us
handle testing a new sccache: rust-lang/rust#42867 (comment) (our current version of sccache [seems to be 2018-01-09](https://github.com/servo/saltfs/blob/f50214b8fa012e03616ecae1ef2913e6fe9044da/servo-build-dependencies/ci-map.jinja#L5)).

<!-- Please describe your changes on the following line: -->

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [ ] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [x] These changes do not require tests because they change the CI configuration

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

Source-Repo: https://github.com/servo/servo
Source-Revision: 78ffce1cbe5fcce4d057b69c3cbf0cd2bc2b449c

UltraBlame original commit: d897da50fc624ee27af7c16c02e368b92f4a6f5e
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 2, 2019
…usa:reenable-sccache-partially); r=jdm

As far as I know, sccache is working properly
on the non-cross-compiling Linux builders.
For safety, only enable it for the builders that run on PRs,
to avoid breaking our nightly generation and scheduled test runs.

This will also allow testing new versions of sccache more easily.
This implements my suggestion from servo/servo#19858 (comment), and should also let us
handle testing a new sccache: rust-lang/rust#42867 (comment) (our current version of sccache [seems to be 2018-01-09](https://github.com/servo/saltfs/blob/f50214b8fa012e03616ecae1ef2913e6fe9044da/servo-build-dependencies/ci-map.jinja#L5)).

<!-- Please describe your changes on the following line: -->

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [ ] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #__ (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [x] These changes do not require tests because they change the CI configuration

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

Source-Repo: https://github.com/servo/servo
Source-Revision: 78ffce1cbe5fcce4d057b69c3cbf0cd2bc2b449c

UltraBlame original commit: d897da50fc624ee27af7c16c02e368b92f4a6f5e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-ICE Issue: The compiler panicked, giving an Internal Compilation Error (ICE) ❄️
Projects
None yet
Development

No branches or pull requests

9 participants