-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build outcome is inaccurate and differs on different platforms #4108
Comments
It seems that inaccuracy and bad buid outcome is caused by error prone. If I disable all checks on Windows by passing this bazel option:
|
Do you have a minimal repro for the part of this that only happens on windows? I don't have access to a windows machine right now, would you mind adding |
@cushon Sure, it took me quite some time to extract the minimal reproducer for this breakage from Gerrit Code Review and gwtorm. See this repro https://github.com/davido/bazel-annotation-processor-breakage. I also was able to eliminate all external deps, except I also filed this issue upstream: [1].
Looking into the verbose output, as expected, the root cause is the missing dependency on indirect dependency on Windows on Windows
Whereas on Linux indirect dependency on Linux
|
Thanks for taking the time to make the repro! I think there's a second bug here preventing Can you share the contents of |
Here is the content of bazel-bin/java/com/google/gerrit/reviewdb/libserver-hjar.jdeps:
Indeed, passing
Passing
|
Can you attach |
Done libserver-hjar.zip. |
|
@meteorcloudy - I'm surprised Bazel is using |
Yes, for some reasons we still use bazel/src/main/java/com/google/devtools/build/lib/vfs/WindowsPathFragment.java Lines 53 to 58 in 755e176
I wonder where does FYI, @laszlocsomor |
The Out of curiosity, why is |
@meteorcloudy : FYI, I didn't add that TODO, @haxorz did. |
@cushon @meteorcloudy @laszlocsomor Do we understand what is going on here, and why the transitive class path is broken on Windows and why
It took me so long time to track that down, create the reproducer and find a workaround: pass Should we warn the Bazel community (mailing list or Bazel main page where it claims that Windows is supported), and tell them to stop using Bazel on Windows, as it is inherently broken? Not to mention, that inacurate build outcome (passing on Linux but broken on Windows, that claimed to be supported platform) is at least P1 for me, if not P0! |
@davido Thanks for spending so much time debugging this! As you can see, this bug is hidden deeply and it didn't happen very often, that's why we missed it. @cushon Can you briefly summarize the problem, is there any workaround for this issue before we completely switching to |
@cushon, @meteorcloudy : I'll look into changing the separator we use on Windows to |
|
@dslomov :
AFAIK that's not universally the case. API functions like CreateFile{A,W} only support |
The problem is that There's a fall back path in case the heuristic is too aggressive that retries with a full classpath, but that wasn't happening due to the Error Prone issue that was spun off into google/error-prone#827.
Note that the comparison is happening in JavaBuilder, not in Bazel, so we probably don't want to use
@dslomov can you expand on this? They're intermediate build artifacts that should only be used in the context of a single build, and a single build configuration. I wouldn't expect them to be more platform-independent than params files. |
@laszlocsomor @meteorcloudy do you have a sense of how long it will take to change the separator on Windows? (I'm wondering if we should be looking for a work-around.) |
Ah I see so the contents of it is between JavaBuilder and JavaBuilder :) (or other Java tools) |
"But they do support it" (cue Jack Sparrow meme :)). For relative paths (as is the case here) '/' and '\' are supported and equivalent. |
Are there disadvantages to also using the platform-dependent name separator in Bazel? |
+1. I would define this issue as release blocker and wait with releasing 0.8 until it is fixed. |
@davido is it a regression from 0.7.0? |
I don't think it is a regression, but, given that bazel build outcome differs on diferent platform and given that a workaround for "\" vs. "/" comaprisson mismatch should be trivial, it should be fixed prior 0.8 is released. |
@cushon As you suggested, the workaround is here: [1]. I tested it with my reproducer on both platforms Linux and Windows. It works as expected. |
@davido, thanks for your patch! |
Our release policy is clear: not a regression => no cherry-pick and no release blockage (https://bazel.build/support.html#releases) |
The more uniform platform-independent commands are, the better (for remote execution, action cache, etc). |
OTOH, The release policy is also claiming that:
Given that nothing causes more pain, frustration, and disappointment than unfulfilled expectations. As mentioned in the original report to this issue, Linux
Windows
Needless to say, I was shocked to find out, that a simple So my POV that we are talking about critical and high-impact bug: reliability and accuracy of build outcome. What else can compromise a build tool, when its outcome is flaky? Very recently, Bazel team removed the notice from the FAQ, that Bazel on Windows platform is highly experimented, bleeding edge and may produce unexpected results. Last but not least, Bazel has this slogan on its main page:
I suggest, you remove reliably untill I'm really not amused with your assessment of that issue (as you may have noticed). One reason for it, is because Gerrit Code Review project's founder promissed me in person that Bazel team is going to be very supportive for Gerrit Code Review issues, for reasons (that I'm not going to enter into here). |
David, I agree this is an important bug to fix, and we will fix it. I think it should block the next (0.9) release. In this case, we just have no evidence that many users are hit by this specific issue: the issue is not a regression, it has been present in more than one release, we build multiple large projects with Bazel very regularly (including on ci.bazel.build), we have multiple large customers with substantial Java codebases, and nobody yet complained. There are workarounds (disable errorprone) that will get you through until this issue is fixed. There is no justification whatsoever to block 0.8 release on this. |
I was looking into adding a test, for the CL above, as was requested in review. It seems that a similar test already exists. So that I was curious, whether or not this test still works with the diff applied. Unfortunately, the |
Thanks. I abandoned my CL. |
This avoids #4108. PiperOrigin-RevId: 177096864
The work-around was submitted as 0b2352d. |
Works like a charm, thanks! |
Update:
|
I have no active proposal to use backslashes for Windows. In fact, I am
saying we should *not* do that unless absolutely necessary. Forward slashes
are put into command lines in (guesstimate) ~10,000 places in the bazel
code base, it would be very difficult to find them all, fix them all, and
keep them fixed.
We would have to vet every command line constructed and make sure artifacts
are used directly, or when path strings are used they would have to be
funneled through some object that knows what the execution platform is
going to be. We don't have such an object, the analysis phase has no idea
where actions are going to get executed. In skylark paths are stuck
straight on as strings, so there's simply no way to know what a path is.
The whole API would have to be remade to stop returning plain strings,
which isn't backwards compatible.
This can all be done, but keep in mind a decision to use backslashes
throughout will probably cost order of magnitude 1 SWE year.
…On Tue, Nov 28, 2017 at 11:10 AM, László Csomor ***@***.***> wrote:
Update:
- @cushon <https://github.com/cushon> submitted a principled fix to
JavaBuilder in 0b2352d
<0b2352d>,
which fixed this bug.
- I looked at changing the code to use '' on Windows, but gave up
because there are hundreds of call sites and I didn't want to go ahead and
audit all, and also:
- @tomlu <https://github.com/tomlu> is working on a design proposal to
tighten path semantics inside Bazel, and I'd rather he did the work, not me
;)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#4108 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABbnSrJiFyWzFk14icxWJWBXg-7OJKT9ks5s7DCGgaJpZM4Qh6tb>
.
|
it will succeed, where the
libdiff.jar-2.params
- file on Linux is here: http://paste.openstack.org/show/626680.where the
libdiff.jar-2.params
- file in Windows is here: http://paste.openstack.org/show/626690.It turns out, that the root cause is missing transitive dependency on
gwtorm.jar
that was fixed here: [1]. I've created a reproducer: [2].It seems that the problem is reported by error prone, in compilation unit, that was generated by annotation processor (auto-value). I wrote this issue to improve the misleading error message: #4105.
This issue is to track the different build outcomes on Linux and Windows: success vs. failure, correspondingly.
The text was updated successfully, but these errors were encountered: