ENV not propagated to process-wrapper #4137
Do you mean this command?

Why does it require the PATH and LD_LIBRARY_PATH env vars?

Hi, I am sorry about the confusion. I have edited the original post to clarify it. In fact, I would like to make two points:

@aehlig Can we propagate these env vars to process-wrapper?

A simple question: why are some rules wrapped with process-wrapper while others aren't? Is this because of sandboxing? I don't want to dig through the codebase, so I'm asking you directly.
@meteorcloudy @aehlig I tried adding --action_env yesterday to force the Bazel executor to take PATH and LD_LIBRARY_PATH, but I didn't succeed: it turns out that --action_env sometimes works and sometimes doesn't. I am wondering why Bazel doesn't honor the shell envs in Java rules when it needs to run a C++ utility like process-wrapper. I tested process-wrapper with cc rules and Python rules and they work, but when building Java header jars with process-wrapper, Bazel doesn't honor the PATH and LD_LIBRARY_PATH envs. I appreciate the idea of sandboxing everything into an isolated environment, but a successful practice of this would be either
We're not going to forward env variables by default: doing so is fundamentally incompatible with remote execution and remote caching, which are both important to us (and many of our users). There is a separate issue, #3320, tracking that --action_env is not forwarded to all actions.
@ulfjack Then why do cc_rules respect shell envs? This behavior will break a lot of local compilation in customized environments, and at the user level there is no way to control whether envs are passed to a specific rule, so there is no way to fix it. The actual problem here is compiling a Java header jar, which doesn't need host envs at all; however, when the rule is wrapped by process-wrapper, the process-wrapper tool itself needs to link against the C++ library. This is a little ridiculous, since the helper breaks the very work it is helping with. I also don't really understand why passing envs would disturb remote execution: if users make the remote environment identical to the local one, why is it a problem?
It's a two-step process: there's a piece of code that's responsible for configuring the C++ toolchain that's separate from the rules. The rules themselves don't respect shell envs. The code that's configuring the toolchain can be swapped out as a whole for code that does not look at the local envs. If there are specific issues with the existing code for local execution, then we can discuss that separately. It's correct that --action_env is not working for all actions right now; there's a separate bug for that. That said, it seems pretty unusual to require LD_LIBRARY_PATH to run basic binaries. The problem isn't the remote environment, but the local one. Any env variable that we forward to the remote machines as part of remote execution poisons the remote cache. For example, if you and your colleague both have an env variable USERNAME, and that's forwarded to the remote cache, then you cannot get any cache hits from your colleague and vice versa. Even worse, if you have env variables that are more volatile (changing quickly), you can't even get cache hits from yourself.
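To see why forwarded env variables poison the remote cache, note that a cache key must cover everything that can affect an action's output, including any forwarded environment. A minimal sketch (the key derivation here is illustrative, not Bazel's actual scheme):

```shell
# Sketch: the same command line plus a user-specific forwarded variable
# hashes to different cache keys for different users, defeating sharing.
cmd='gcc -c foo.c -o foo.o'
key_alice=$(printf '%s USERNAME=%s' "$cmd" alice | sha256sum | cut -d' ' -f1)
key_bob=$(printf '%s USERNAME=%s' "$cmd" bob | sha256sum | cut -d' ' -f1)
# Without the forwarded variable, everyone computes the same key:
key_shared=$(printf '%s' "$cmd" | sha256sum | cut -d' ' -f1)
echo "alice: $key_alice"
echo "bob:   $key_bob"
```

The two per-user keys never match, so alice can never reuse bob's cached result for an identical compile.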
I think the problem is pretty serious. I'm sure I have set "build --action_env" in my .bazelrc.
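For reference, the kind of .bazelrc entry being referred to would look like this (note that, per this issue, not all actions honored these flags at the time):

```
# .bazelrc sketch: forward the local PATH and LD_LIBRARY_PATH to actions
build --action_env=PATH
build --action_env=LD_LIBRARY_PATH
```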
@ulfjack Is there any progress on the bug "action_env is not working for all actions"?

What makes you think that the error you posted is due to env variables?

@ulfjack

What kind of setup do you have such that cp isn't in /bin or /usr/bin? You should check whether this works with a more recent Bazel release. If I read the code correctly, this is from a genrule, and genrules should already forward the action env, even if not all actions do.

@ulfjack

Did you try with a more recent Bazel release?

@ulfjack I used Bazel 0.10.0.
Is there any news about this issue? We have to build Bazel with a gcc other than /usr/bin/gcc. Then we run into the problem where process-wrapper fails for targets, since the local /usr/lib64/libstdc++.so is too old:

Our machine park runs several different Linux distributions, and the local gcc is generally too old for Bazel. So building Bazel with the local compiler and installing it locally is not an option for us. Instead we build all our tools with our own gcc, installed on a network disk, which produces binaries that work on all the other platforms. Is it possible to build Bazel in a way where process-wrapper and the other internal tools become independent of the environment, for instance by linking them statically or passing --rpath to the linker?
My problem turned out to have a simple solution: We can link process-wrapper statically when we build Bazel (by updating src/main/tools/BUILD). |
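The kind of edit described might look like this in src/main/tools/BUILD (a hedged sketch; attribute values other than linkopts are illustrative, not the actual file contents):

```
cc_binary(
    name = "process-wrapper",
    srcs = ["process-wrapper.cc"],  # illustrative
    # Link statically so the tool needs no libstdc++ at runtime:
    linkopts = ["-static"],
)
```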
@emusand That's what I did. Apparently, the current Bazel implementation does not favor customized Linux environments, and I don't think this will be fixed.

I have recently updated a bunch of actions to correctly take --action_env into account; the changes should all be in 0.14.0. If you know about specific actions that still don't do it correctly, please do let me know. I would also be open to merging a patch that makes process-wrapper statically linked by default. (And possibly change it so that it's a pure C binary and doesn't need to link against libstdc++.) I certainly want Bazel to work better with unusual setups, within reason, but I won't be able to do all the work myself.

@emusand: can you share how you were able to get around this, and what you had to add in the BUILD file?
Hi Rahul, I added the link option "-static" to statically link all tool binaries. First I added linkopts = ['-static'] to the tool cc_binary rules in the BUILD files. Then I realized that I could instead just add --linkopt=-static to the bazel build command line, without patching the BUILD files.
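A sketch of the command-line variant described in the reply above, when building Bazel from a source checkout (the target label is the standard bootstrap target; adjust for your tree):

```
bazel build --linkopt=-static //src:bazel
```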
Any update?

Any update? This is still an issue as of the latest release.
I think @philwo was suggesting that local actions get the local PATH forwarded, even if it's not used for remote actions. (That wouldn't help for LD_LIBRARY_PATH.) If someone has a repro for me, I may be able to take a look.
A repro is a bit tough: shove a really old gcc into /usr and install a new one into /usr/local. Even if you set PATH to /usr/local/bin and LD_LIBRARY_PATH to /usr/local/lib, you'll hit certain steps that try to grab libstdc++ from /usr/lib (I have hit this with Ray and TensorFlow). If you start from a CentOS 6.5 docker image, yum install gcc (4.8) into /usr/bin, then manually install gcc 4.9.3 into /usr/local/bin, you'll be unable to compile TensorFlow 1.* or Ray, even though both are compatible with 4.9.3 (and even if you set up your paths to point to /usr/local/bin/gcc).
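The repro above could be sketched as a Dockerfile (untested; versions and paths are taken from the comment, and the gcc 4.9.3 build steps are elided):

```
FROM centos:6.5
RUN yum install -y gcc            # installs gcc 4.8 into /usr/bin
# ...build and install gcc 4.9.3 into /usr/local (steps elided)...
ENV PATH=/usr/local/bin:$PATH
ENV LD_LIBRARY_PATH=/usr/local/lib64
# Building TensorFlow 1.x or Ray at this point fails: process-wrapper
# still picks up the old libstdc++ from /usr/lib.
```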
I am experiencing this issue (building Ray via Bazel on a ppc64le machine, Summit at Oak Ridge). I tried the "link static" suggestion and my system did not like that (all sorts of missing libraries). I am wondering if there is a different workaround. E.g., if I wanted to hack the Bazel code/configuration (I am building Bazel from source), what would I change to implement the suggestion above, i.e. to simply pass PATH and LD_LIBRARY_PATH to the step that uses process-wrapper? There is a hint above about "use_default_shell_env = True", but I'm not sure I can follow it. Would I just have to set it to True in every call to run_shell()/run() in Bazel's *.bzl files? (I tried something like that to no effect, but I may have done it wrong.) Something more? Thank you!
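For orientation, use_default_shell_env is a real parameter of ctx.actions.run and ctx.actions.run_shell in Starlark; a hypothetical rule implementation that opts into the configured shell environment might look roughly like this (the rule logic and names are made up for illustration):

```
def _copy_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".out")
    ctx.actions.run_shell(
        outputs = [out],
        inputs = ctx.files.srcs,
        command = "cp $1 $2",
        arguments = [ctx.files.srcs[0].path, out.path],
        # Inherit the configured shell environment (PATH etc.)
        # instead of running with a scrubbed environment:
        use_default_shell_env = True,
    )
    return [DefaultInfo(files = depset([out]))]
```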
I have encountered very similar problems building Ray via Bazel (v1.1.0). I tried different methods and successfully solved it (no need to hack the Bazel code, just a bunch of env variable settings; details are shared below), and I believe the approach can be applied to building other projects like TensorFlow. I am on a ppc64le machine using a gcc (v7.3.0) at a customized location, because the gcc in /usr/bin or /usr/local/bin is too old and I have no root privilege to upgrade it. Before we continue, make sure that
Rebuild Bazel (in order to statically link process-wrapper with libstdc++)

The first (and perhaps most important) problem I needed to deal with is the `GLIBCXX_3.4.21' not found error. An example of the error info is:

As indicated by this very GitHub issue, it is caused by env vars (especially LD_LIBRARY_PATH) not being propagated to process-wrapper when we use Bazel to build a project like Ray. My workaround is to make process-wrapper statically linked with libstdc++. Let's dive into the detailed steps. First, before rebuilding Bazel, we can check that process-wrapper is indeed dynamically linked to libstdc++; the fact that libstdc++ appears in the output confirms the dynamic linking.

There are a few things to note in the above command. (1) I used the env variables BAZEL_LINKOPTS and BAZEL_LINKLIBS to instruct Bazel to statically link libstdc++ into its tools. Notice that both variables are needed.

Build Ray (using the new Bazel we just rebuilt)

Now we can build Ray (v0.7.7) using the new Bazel that we just rebuilt in the previous step. Assuming that other library dependencies of Ray, such as Apache Arrow, have been properly installed (details on how to install them can be found here), I used the following command to build Ray:

Notice that I am still using the env variables BAZEL_LINKOPTS and BAZEL_LINKLIBS to instruct Bazel to statically link libstdc++. When the above step succeeds, the build completes without the GLIBCXX error.

Now you can use Ray in Python!
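The environment described above can be summarized as follows (all paths are assumptions; adjust to your gcc installation):

```shell
# Point the build at the custom toolchain (hypothetical install prefix):
export CC=/opt/gcc-7.3.0/bin/gcc
export CXX=/opt/gcc-7.3.0/bin/g++
# Tell Bazel's C++ toolchain to link libstdc++ and libgcc statically,
# so tools like process-wrapper don't depend on LD_LIBRARY_PATH at runtime:
export BAZEL_LINKLIBS='-l%:libstdc++.a'
export BAZEL_LINKOPTS='-static-libstdc++:-static-libgcc'
```

With these set, both the Bazel rebuild and the subsequent Ray build pick up the same statically linked C++ runtime.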
@forestliurui This looks SUPER helpful, I will try to use the same process to build Ray in my environment.

@timkpaine Thanks. I am happy if this could be helpful to others. @pgraf Maybe this is also helpful in your case.
I just ran into this again: The
So only a very generic failure. After days of digging I modified the Bazel sources and found that there are log files with that execution's stdout/stderr, which told me it is process-wrapper again. So take this as another data point, and please do something about this.
Previously, we hardcoded the envs of the XML generation action, which caused problems for process-wrapper because it's dynamically linked to some system libraries and the required PATH or LD_LIBRARY_PATH were not set. This change propagates the envs we set for the actual test action to the XML file generation action, to make sure the env vars are correctly set and can also be controlled by --action_env and --test_env. Fixes bazelbuild#4137 Fixes bazelbuild#12579 Closes bazelbuild#12659. PiperOrigin-RevId: 347596753
For the record, the isolation feature that causes the environment variables to be unset has to do with the following code (a Guix build phase that patches Bazel's sources):

    (add-after 'unpack 'disable-isolation
      (lambda _
        ;; XXX: By default, Bazel clears all the environment variables
        ;; but PATH, which causes GCC to not find its include files.
        (substitute* "src/main/java/com/google/devtools/build/lib/util/CommandFailureUtils.java"
          ;; this is purely cosmetic
          (("\\? \"env - ")
           "? \"env "))
        (substitute* "src/main/java/com/google/devtools/build/lib/shell/JavaSubprocessFactory.java"
          (("builder.environment\\(\\).clear\\(\\);" all)
           (string-append "// (disabled by Guix)" all)))))
Please provide the following information. The more we know about your system and use case, the more easily and likely we can help.
Description of the problem / feature request / question:
I have a problem using Bazel to build rules_scala on a custom platform. I am sure it is because the shell envs are not propagated correctly to a subprocess spawned by some of Bazel's actions. Please see the log information that I provide in the last section.
If possible, provide a minimal example to reproduce the problem:
Environment info
Operating System:
CentOS 6.7
Bazel version (output of bazel info release): 0.8.1

If bazel info release returns "development version" or "(@non-git)", please tell us what source tree you compiled Bazel from; a git commit hash is appreciated (git rev-parse HEAD):

Have you found anything relevant by searching the web? (e.g. StackOverflow answers, GitHub issues, email threads on the bazel-discuss Google group)

Anything else, information or logs or outputs that would be helpful? (If they are large, please upload as an attachment or provide a link.)
The command I ran to build rules_scala is
bazel build -s --verbose_failures --sandbox_debug //src/...
From the log message, you can see that the last command doesn't have PATH and LD_LIBRARY_PATH propagated while the others do. I highlighted the command that reports the error: