Add support for local test caching. #4660

Merged

Conversation

@jsirois (Contributor) commented Jun 8, 2017

Engage the invalidation machinery to "cache" successful test runs. If a
test run has previously gone green, its execution will be skipped and
only a summary of its successful result will be printed; eg:

```
tests.python.pants_test.base.build_root ......   SUCCESS
tests.python.pants_test.base.payload    ......   SUCCESS
```

This is a behavior change in that previously, summaries were only
printed for `--no-fast` runs. To avoid the surprise of no test output at
all on a fully cached successful re-run, the summary lines are now
printed for `--fast` runs as well.

NB: This change does not use the artifact cache, just the local
invalidation system; so successful test runs are not shared between
developers yet.
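
As a rough sketch of the mechanism described above - assuming the v1 `Task.invalidated` API and the task's existing `_run_pytest` helper; the summary-printing helper named here is hypothetical - the flow looks roughly like:

```
def _execute_tests(self, all_targets):
  # Only targets the invalidation machinery reports as invalid actually run;
  # targets with a previously-green run are skipped entirely.
  with self.invalidated(all_targets, invalidate_dependents=True) as invalidation_check:
    invalid_tgts = [tgt for vts in invalidation_check.invalid_vts for tgt in vts.targets]
    result = self._run_pytest(invalid_tgts)
  # Summary lines are printed for --fast runs as well, so a fully cached
  # re-run still produces the "<test> ...... SUCCESS" output shown above.
  self._print_summary(all_targets)  # hypothetical stand-in for the summary printing
  return result
```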

@jsirois (Contributor, Author) commented Jun 8, 2017

This is the first concrete step towards test caching (#4587).

```
invalidate_dependents=True) as invalidation_check:

invalid_tgts = [tgt for vts in invalidation_check.invalid_vts for tgt in vts.targets]
return self._run_pytest(invalid_tgts)
```
@jsirois (Contributor, Author) commented on the excerpt above:

This line does not raise on failure - it always returns a PytestResult. As a result, both test failures and successes are cached. Needs fixing...
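
One way to address this - a sketch only, assuming the invalid targets are marked valid when the `invalidated` block exits cleanly and that the result object exposes a `success` flag, and using `TaskError` from `pants.base.exceptions` - would be to raise inside the block on failure:

```
with self.invalidated(targets, invalidate_dependents=True) as invalidation_check:
  invalid_tgts = [tgt for vts in invalidation_check.invalid_vts for tgt in vts.targets]
  result = self._run_pytest(invalid_tgts)
  if not result.success:  # assumed attribute on PytestResult
    # Raising here prevents the invalidation machinery from marking the failing
    # targets valid, so only green runs get "cached".
    raise TaskError('pytest failed for the targets above')
return result
```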

@kwlzn (Member) commented Jun 9, 2017

IIUC, in a cold-then-warm run flow, run A will produce pytest output and a summary - while runs B and beyond will produce only the summary.

Thinking as a user, it feels like this could be a potentially awkward UX. Ideally, runs A and B would appear identical but net the runtime speedup of caching. Then, if I wanted to look again at the output of e.g. the -vvs passthrough mode, or at printed warnings, I could easily see that on a warm rerun without having to pass different flags to invalidate.

I'm not very familiar with the task caching capabilities in pants, but would it not be possible to e.g. capture the pytest section's output/raw xml files to disk and then respew that run over run?

I could see how it might be useful to have an up-front indication that you're hitting the cache in the form of less output, but I think we could potentially indicate that in an alternate way?
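
For what it's worth, the capture-and-respew idea could look something like the following self-contained sketch (not what this PR does, and the function name and capture path are hypothetical): on a cold run, capture the tool's console output to a file; on a warm run, replay the captured bytes instead of re-running.

```
import os
import subprocess
import sys

def run_or_replay(cmd, capture_path):
  """Run `cmd`, teeing its output to `capture_path`, or replay a prior capture."""
  if os.path.exists(capture_path):
    # Warm run: re-emit the previously captured output instead of re-running the tool.
    with open(capture_path) as f:
      sys.stdout.write(f.read())
    return 0
  # Cold run: execute the tool, capture its combined stdout/stderr, and emit it.
  proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
  out, _ = proc.communicate()
  with open(capture_path, 'w') as f:
    f.write(out.decode())
  sys.stdout.write(out.decode())
  return proc.returncode
```

The hard part called out in the reply below isn't the mechanics, though - it's that replayed output can embed machine-specific paths and other state once results are shared via a remote cache.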

@jsirois (Contributor, Author) commented Jun 9, 2017

IIUC, in a cold-then-warm run flow...

We could cache outputs, but then you'd also see paths from someone else's machine as soon as caching support is enabled. I'm open to ideas, I guess, but in general cold/warm run flows are in fact awkward and we've just grown used to it. When you javac code with warnings you get warning output on the console - when you hit the cache on a re-javac, you see no warnings, etc. It is true that certain tasks are more "leafy" and end-user interactive, but I'm not seeing a way to preserve output correctly without an inordinate amount of work. "Inordinate" here is clearly the thing up for debate. If we decide preserving output is important enough to do, then the next item for debate is the fidelity of that output.

@stuhood (Member) commented Jun 9, 2017

I'm definitely fine with producing reduced output on the second run, as doing anything else would make hitting the cache more expensive than is strictly necessary. For example: if we were to render all of the output of cache hits for compilation in a completely warm case, we'd dump out about 20MB of output to the screen, rather than a series of dots or actual usable progress (TODO).

@kwlzn (Member) commented Jun 9, 2017

you'd also see paths from someone else's machine as well as soon as caching support is enabled

When you javac code with warnings you get warnings console output - when you hit the cache on a re-javac, you see no warnings, etc.

Ah, ok - thanks for the details. If there are key challenges to making that work, and at least some precedent for this behavior on the JVM side, then I'm also good with it.

@jsirois (Contributor, Author) commented Jun 9, 2017

... and at least some precedent for this behavior on the jvm side, then I'm also good w/ it.

Cool. Yeah - there is precedent from every task we cache today: Ivy resolves, Go dep resolves, etc. The single odd case is a leaf verb you would normally interact with: `repl` for sure, but also sometimes `run` (if the binary cares about local machine state) and `test` (just because folks often add print lines or enter a debugger). That said, they normally only do so for failing tests, which are not cached.

@jsirois jsirois modified the milestone: 1.4.0 Jun 12, 2017
@jsirois jsirois force-pushed the jsirois/issues/4587/python2/test_caching branch from 8787173 to 28b0694 on June 12, 2017 20:40
@jsirois (Contributor, Author) commented Jun 12, 2017

OK - one legit test failure at least, and then failures depending on how many times you run the test (junit xml missing). I'll need to fold use of `vt.results_dir` - aka caching - into this change instead of keeping it separate, I think. Reviewers can hold off a bit longer.
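
For the record, folding in `vt.results_dir` could look roughly like the following sketch (the `junitxml_path` keyword and per-target file layout are hypothetical): point each run's junit xml at the versioned target's results dir so it persists across, and can be located on, fully cached re-runs.

```
import os

with self.invalidated(targets, invalidate_dependents=True) as invalidation_check:
  for vt in invalidation_check.invalid_vts:
    # Write results under the versioned target's results_dir so a warm re-run can
    # still find the junit xml without re-executing pytest.
    junitxml = os.path.join(vt.results_dir, 'junitxml', '{}.xml'.format(vt.target.id))
    self._run_pytest(vt.targets, junitxml_path=junitxml)  # hypothetical kwarg
```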

@jsirois jsirois force-pushed the jsirois/issues/4587/python2/test_caching branch from 28b0694 to 300fea1 on June 12, 2017 22:11
@jsirois (Contributor, Author) commented Jun 12, 2017

... Reviewers can hold off a bit longer.

Actually - all is good now, PTAL. There is a pre-existing failure mode when either the old or new pytest_run test is run with `--coverage` turned on; I'd like to address that separately since it's longstanding.

@jsirois jsirois requested review from kwlzn and benjyw June 12, 2017 23:28
@kwlzn (Member) left a comment

lgtm!

@benjyw (Contributor) commented Jun 13, 2017

Agreed re not worrying about different output on cold vs. hot runs.

However, it is generally desirable to capture tool logging/console output in a more principled way than we currently do, so that it's always available in a well-known .pants.d location. But that's future work, obvi.

@benjyw (Contributor) left a comment

I love it!

Perhaps there should be a `--force` flag that always reruns the tests? This is a major change in behavior, so it would be good to have an escape hatch.

In fact, I wonder if there simply should be a global, recursive `--force` flag. Then you can set `--force` globally to treat everything as if invalidated, or set it per-task to just do so for that task. Then the flag can be examined in the invalidation logic itself, instead of each task having to know about it.

Obviously this would have to be in a followup change.

Thoughts?
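
For concreteness, the per-task escape hatch could look something like this sketch (assuming the v1 options API's `register_options`/`get_options`; the class name is hypothetical, the global, recursive variant would instead live in the shared invalidation logic, and none of this is part of this PR):

```
from pants.task.task import Task  # pants 1.x task base class


class CachedTestTask(Task):  # hypothetical task, for illustration only
  @classmethod
  def register_options(cls, register):
    super(CachedTestTask, cls).register_options(register)
    register('--force', type=bool, default=False,
             help='Always run tests, even if a previous run was successful.')

  def execute(self):
    targets = self.context.targets()
    with self.invalidated(targets, invalidate_dependents=True) as invalidation_check:
      if self.get_options().force:
        tgts_to_run = targets  # escape hatch: ignore previously cached green runs
      else:
        tgts_to_run = [t for vts in invalidation_check.invalid_vts for t in vts.targets]
      self._run_pytest(tgts_to_run)
```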

@jsirois (Contributor, Author) commented Jun 13, 2017

In fact, I wonder if there simply should be a global, recursive --force flag. Then you can set --force ...

Yeah - this seems to make more sense: it beats `./pants invalidate` - which I assume no one knows about or runs - and better, it allows for targeted re-runs. I've filed #4673.

@jsirois jsirois merged commit 8023691 into pantsbuild:master Jun 13, 2017
@jsirois jsirois deleted the jsirois/issues/4587/python2/test_caching branch June 13, 2017 17:56