Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passing tests reported as failures in summary #141

Closed
mattmoor opened this issue Aug 9, 2020 · 20 comments · Fixed by #191
Closed

Passing tests reported as failures in summary #141

mattmoor opened this issue Aug 9, 2020 · 20 comments · Fixed by #191
Labels
bug Something isn't working test2json-bug A bug in test2json which impacts gotestsum

Comments

@mattmoor
Copy link

mattmoor commented Aug 9, 2020

We are using gotestsum to run our testing for Knative, and digging into some recent "failures", I can't really make sense of why gotestsum is reporting them as failed.

The failure log I have been digging through can be downloaded from here, the file you want is "build-log.txt" (it is large).

In this test run I see the following summary:

=== Skipped
=== SKIP: test/e2e TestProbeWhitelist (0.00s)
    probe_whitelist_test.go:38: RequestAuthentication dropped the regex support. We need to rewrite this test. See https://github.com/istio/istio/issues/16585


=== Failed
=== FAIL: test/conformance/api/v1alpha1 TestContainerExitingMsg/http (unknown)

=== FAIL: test/conformance/api/v1alpha1 TestContainerExitingMsg/tcp (unknown)

However, if I scan back through the test logs, I see that all of the flavors of TestContainerExitingMsg have passed, here's one cluster:

--- PASS: TestContainerExitingMsg (0.00s)
    --- PASS: TestContainerExitingMsg/http (40.29s)
    --- PASS: TestContainerExitingMsg/tcp (172.49s)

I've been trying to make sense of the pieces of information here, but basically all I've managed so far is that (unknown) seems to tell us that the elapsed time is unknown, which seems to point to a TestCase terminated by the end() method here.

Any pointers would be appreciated. 🙏

@dnephin dnephin added the bug Something isn't working label Aug 9, 2020
@dnephin
Copy link
Member

dnephin commented Aug 9, 2020

unknown indicates that gotestsum thinks there was a test that was missing a PASS/FAIL event when the test run completed. It was added in #104 to mitigate golang/go#38382

But I also don't see an incomplete test here. It seems like TestContainerExitingMsg was run 3 times, and passed 3 times.

If you can add a --jsonfile flag, that will save the source test2json output that gotestsum receives. We might get a better idea from that log, or at least find a way to reproduce the issue.

Are you running gotestsum from master? Do you know when this first started happening? Are you running with -count=3 or is something else causing the tests to run multiple times?

Does gotestsum exit with 0? I think that's what I'm seeing in the log, but I'm not sure what else is wrapping the exit status. It seems like go test is pretty confident the tests passed, so this might be a bug I haven't seen before in gotestsum.

@mattmoor
Copy link
Author

mattmoor commented Aug 9, 2020

I believe we only started using this recently, and I'm not sure if it's from master or not without some digging.

Adding --jsonfile and uploading that artifact should be doable, I can try to knock that off tomorrow so when it hits again we have that.

thanks for the quick response!

cc @chaodaiG @chizhg

@mattmoor
Copy link
Author

@chizhg has added this to our CI, so 🤞

@coryrc
Copy link

coryrc commented Aug 10, 2020

We seem to be using master. We install via GO111MODULE=on go get -u gotest.tools/gotestsum

root@ca18d95f09c0:/workspace# ls -ld `which gotestsum`
-rwxr-xr-x 1 root root 8207819 Aug  8 02:31 /go/bin/gotestsum
root@ca18d95f09c0:/workspace# gotestsum --version
gotestsum version master

@mattmoor
Copy link
Author

Alright, we got a run of this here which seems to exhibit the problem. The build artifacts are here and these now contain json_XXXXX.json files for each of the junit files.

I haven't cracked any of these yet, but I'm not sure they'll make sense to me 😅

@dnephin
Copy link
Member

dnephin commented Aug 11, 2020

Perfect! Here's what I'm seeing:

$ grep TestContainerExitingMsg logs_ci-knative-serving-contour-latest_1292985076908298240_artifacts_json_yGQGuuT2.txt | grep '"Action":"pass"'
{"Time":"2020-08-11T01:03:41.641597357Z","Action":"pass","Package":"knative.dev/serving/test/conformance/api/v1","Test":"TestContainerExitingMsg","Elapsed":0}
{"Time":"2020-08-11T01:04:01.649236258Z","Action":"pass","Package":"knative.dev/serving/test/conformance/api/v1beta1","Test":"TestContainerExitingMsg/tcp","Elapsed":27.63}
{"Time":"2020-08-11T01:04:01.649252188Z","Action":"pass","Package":"knative.dev/serving/test/conformance/api/v1beta1","Test":"TestContainerExitingMsg/http","Elapsed":160.37}
{"Time":"2020-08-11T01:04:01.649257987Z","Action":"pass","Package":"knative.dev/serving/test/conformance/api/v1beta1","Test":"TestContainerExitingMsg","Elapsed":0.01}
{"Time":"2020-08-11T01:05:09.935705543Z","Action":"pass","Package":"knative.dev/serving/test/conformance/api/v1alpha1","Test":"TestContainerExitingMsg/http","Elapsed":179.23}
{"Time":"2020-08-11T01:05:09.985943885Z","Action":"pass","Package":"knative.dev/serving/test/conformance/api/v1alpha1","Test":"TestContainerExitingMsg/tcp","Elapsed":194.07}
{"Time":"2020-08-11T01:05:09.985992005Z","Action":"pass","Package":"knative.dev/serving/test/conformance/api/v1alpha1","Test":"TestContainerExitingMsg","Elapsed":0}

For some reason, test2json has Output events for the tests passing, but it never sent an Action: pass event for those 2 sub tests in the knative.dev/serving/test/conformance/api/v1 package.

From what I can tell, this is a new test2json bug that I haven't seen before. Maybe related to parallel test. If it is not happening on every run, it may have to do with the ordering of the tests. I haven't looked at the golang issue tracker to see if this issue has been reported already.

@dnephin dnephin added the test2json-bug A bug in test2json which impacts gotestsum label Aug 11, 2020
@mattmoor
Copy link
Author

@dnephin if I wanted to try and fix this, where might I start looking? This has really been flaring up in recent runs of a suite I've been trying to stomp out flakes in, and so I'd love to see this fixed 😅 . I'd be happy to do some legwork here, I am just not sure what/where I'd need to get started.

thanks for any pointers 🙏

@dnephin
Copy link
Member

dnephin commented Aug 13, 2020

The bug is likely somewhere in here: https://github.com/golang/go/blob/master/src/cmd/internal/test2json/test2json.go

If it is related to parallelism some things to try might be to remove the t.Parallel, and to try running with go1.15 to see if it was already fixed.

I took a quick look on the golang issue tracker, but I did not see any reports of this problem. If we can reproduce it with a smaller test case opening an issue at https://github.com/golang/go/issues/new would be great.

@mattmoor
Copy link
Author

Alright, I got a little test2json build locally and snipped the build output from the artifacts linked above.

cat just-build.log | test2json | grep TestContainerExitingMsg | grep '"Action":"run"'
{"Action":"run","Test":"TestContainerExitingMsg"}
{"Action":"run","Test":"TestContainerExitingMsg"}
{"Action":"run","Test":"TestContainerExitingMsg"}
{"Action":"run","Test":"TestContainerExitingMsg/http"}
{"Action":"run","Test":"TestContainerExitingMsg/tcp"}
{"Action":"run","Test":"TestContainerExitingMsg/http"}
{"Action":"run","Test":"TestContainerExitingMsg/tcp"}
{"Action":"run","Test":"TestContainerExitingMsg/http"}
{"Action":"run","Test":"TestContainerExitingMsg/tcp"}

cat just-build.log | test2json | grep TestContainerExitingMsg | grep '"Action":"pass"'
{"Action":"pass","Test":"TestContainerExitingMsg"}
{"Action":"pass","Test":"TestContainerExitingMsg/tcp"}
{"Action":"pass","Test":"TestContainerExitingMsg/http"}
{"Action":"pass","Test":"TestContainerExitingMsg"}
{"Action":"pass","Test":"TestContainerExitingMsg/http"}
{"Action":"pass","Test":"TestContainerExitingMsg/tcp"}
{"Action":"pass","Test":"TestContainerExitingMsg"}

Time to start aimlessly poking at this until it works!

@mattmoor
Copy link
Author

So the problem section is this sequence here:

--- PASS: TestContainerExitingMsg (0.00s)
=== CONT  TestBlueGreenRoute
    kubelogs.go:201: I 01:01:58.381 webhook-98759d99-pfkgh [conversion/conversion.go:133] [serving-tests/blue-green-route-atsgeusz-q56cd] Converting [kind=Revision group=serving.knative.dev version=v1] to version serving.knative.dev/v1beta1
    --- PASS: TestContainerExitingMsg/http (49.33s)
    --- PASS: TestContainerExitingMsg/tcp (165.25s)

The summary split with a === token screws things up and (I believe) causes the subsequent --- PASS lines to get swallowed as output of TestBlueGreenRoute.

@dnephin
Copy link
Member

dnephin commented Aug 13, 2020

Ah yes, that sounds right. Nice find! Kind of related to golang/go#38063

@mattmoor
Copy link
Author

mattmoor commented Aug 14, 2020

Unfortunately, I think this happens even without -v... Last night was mostly clean, but this morning I do see a single run with:

=== Failed
=== FAIL: test/conformance/runtime TestMustNotContainerConstraints/TestReadinessHTTPProbePort (unknown)

=== FAIL: test/conformance/runtime TestMustNotContainerConstraints/TestLivenessHTTPProbePort (unknown)

=== FAIL: test/conformance/runtime TestMustNotContainerConstraints/TestArbitraryPortName (unknown)

=== FAIL: test/conformance/runtime TestMustNotContainerConstraints/TestLivenessTCPProbePort (unknown)

=== FAIL: test/conformance/runtime TestMustNotContainerConstraints/TestMountPropagation (unknown)

=== FAIL: test/conformance/runtime TestMustNotContainerConstraints/TestReadinessTCPProbePort (unknown)

I haven't cracked the artifacts, but I'd guess we'd see something similar to the test from yesterday.

@innovia
Copy link

innovia commented Aug 17, 2020

I had the same issue

Deleting project "terratest-host-project-gozq07"...--- PASS: TestGCPEnableHostProject (128.43s)

in my case it was the issue where the ---PASS did not appear in a separate line

i fixed the message before that to be adding extra \n new line and it pass

@MatthewDolan
Copy link

MatthewDolan commented Sep 2, 2020

In my case it was the issue where the ---PASS did not appear in a separate line.

I experienced the same issue. In my case, the failure was intermittent and when I was able to track down and compare the success cases from the failure cases, it was because there was a race between the test runner writing the pass output and a logging library writing a log line in a background goroutine that was in the process of being shutdown.

The ---PASS ... string was being inserted in the middle of the log line.

There doesn't seem to be a very elegant way to resolve this issue for projects that have a global logger. The recommendation would probably be to use t.Log(...) when logging during tests when possible, but that's a really significant lift for a project that hasn't been passing around logging objects from the beginning.

This might not be quite a bug with this runner specifically, but it is manifesting in developers not trusting the output of the test runner because the test run will pass and the runner will show a failure. Wanted to share in case others come across a similar issue.

@dnephin
Copy link
Member

dnephin commented Sep 3, 2020

I've worked with some projects that have a similar problem with logs and output. If you aren't running tests in parallel you can set the global logger to write to a buffer, and only print logs from the buffer when a test ends (or when a test fails). Then reset the logger back to the default afterward.

func TestSomething(t *testing.T) {
    patchLogger(t)
    ....
}

func patchLogger(t *testing.T) {
    orig := log.Writer()
    buf := new(bytes.Buffer)
    log.SetOutput(buf)

    t.Cleanup(func() {
        // optionally check t.Failed here if you only want to print logs on failure
        
        t.Log(buf.String())
        log.SetOutput(orig)
    })
}

Another options would be to always set the log output to ioutil.Discard, although that might make test failures harder to debug. I'm not sure if setting the output to os.Stderr would help, or if that has the same problem.

It would be nice if the go toolchain changed how the test binaries write framing events. If they were sent to a file or pipe then stdout/stderr would not mess with the tests results.

@hello-ming
Copy link

I got a similar issue, here is the info

╰─ cat main_test.go
package main

import (
        "fmt"
        "testing"
)

func TestSth(t *testing.T) {
        fmt.Printf("%s", "OK")
}
╰─ go test -v ./
=== RUN   TestSth
OK--- PASS: TestSth (0.00s)
PASS
ok      github.com/fatih/vim-go-tutorial        0.548s
╰─ gotestsum -f testname --debug ./
exec: [go test -json ./]
go test pid: 63519
PASS . (cached)
=== RUN   TestSth
OK--- PASS: TestSth (0.00s)
FAIL TestSth (-1.00s)

=== Failed
=== FAIL: . TestSth (unknown)
OK--- PASS: TestSth (0.00s)


DONE 1 tests, 1 failure in 0.170s

--- PASS is not in a new line

when change the ouput line in test case
from
fmt.Printf("%s", "OK")
to
fmt.Printf("%s\n", "OK")

everything is fine

so maybe gotestsum can handle this situation.

@dnephin
Copy link
Member

dnephin commented Sep 20, 2020

There is a related issue on the go issue tracker for output with missing newlines: golang/go#38063

It sounds like the Go team considers this a problem with the test, and it may not be fixed.

How do you see gotestsum fixing this problem? It may be possible to search every output string for the --PASS or --FAIL pattern, but that is potentially expensive and error prone. Considering this is ultimately a bug in the test suite (with a very easy fix), and the stance taken by the Go team, I'm hesitant to introduce a lot of complexity to fix this problem.

chaodaiG added a commit to chaodaiG/test-infra that referenced this issue Oct 11, 2021
The t.Parallel here could possibly cause test output swallow --- PASS, which is basically what all go test result to json output relies on.

Related: gotestyourself/gotestsum#141
@JoseAlban
Copy link

In my case, I managed to fix the passing-tests-but-reported-as-failed issue after adding a newline to stdin buffer read:

func UserConfirmedViaTerminal(in *bufio.Scanner, ask string) bool {
	fmt.Println(ask)
	fmt.Print("[y/N]: ")
	if err := RequireScan(in); err != nil {
		return false
	}
	answer := strings.TrimSpace(in.Text())
	answer = strings.ToLower(answer)

	// Necessary to let `gotestsum` (and dependencies) parse the test output properly, and not mis-report as failure
	// see https://github.com/gotestyourself/gotestsum/issues/141
	fmt.Println()

	return answer == "y" || answer == "yes"
}

as otherwise output was "corrupted".

Before:

=== RUN   TestPulumiRefresh/Resolves_pending_operations_properly
[refresh] Do you want to apply to all these stacks?
[y/N]: --- PASS: TestPulumiRefresh (0.00s)

[...]

=== Failed
=== FAIL: local/pulumi TestPulumiRefresh (unknown)

[...]
DONE 15 tests, 4 failures in 0.309s

After:

=== RUN   TestPulumiRefresh
=== RUN   TestPulumiRefresh/Resolves_pending_operations_properly
[refresh] Do you want to apply to all these stacks?
[y/N]:
--- PASS: TestPulumiRefresh (0.00s)

[...]
DONE 15 tests in 2.093s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working test2json-bug A bug in test2json which impacts gotestsum
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants