Add support for local test caching. #4660

Merged

Conversation

@jsirois (Contributor) commented Jun 8, 2017

Engage the invalidation machinery to "cache" successful test runs. If a
test run has previously gone green, its execution will be skipped and
only a summary of its successful result will be printed; eg:

```
tests.python.pants_test.base.build_root ......   SUCCESS
tests.python.pants_test.base.payload    ......   SUCCESS
```

This is a behavior change in that previously, summaries were only
printed for `--no-fast` runs. To avoid the surprise of no test output at
all on a fully cached successful re-run, the summary lines are now
printed for `--fast` runs as well.

NB: This change does not use the artifact cache, just the local
invalidation system; so successful test runs are not shared between
developers yet.
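
As a rough sketch of the mechanism described above - assuming the v1 `Task.invalidated` API and the task's existing `_run_pytest` helper; the summary-printing helper named here is hypothetical - the flow looks roughly like:

```
def _execute_tests(self, all_targets):
  # Only targets the invalidation machinery reports as invalid actually run;
  # targets with a previously-green run are skipped entirely.
  with self.invalidated(all_targets, invalidate_dependents=True) as invalidation_check:
    invalid_tgts = [tgt for vts in invalidation_check.invalid_vts for tgt in vts.targets]
    result = self._run_pytest(invalid_tgts)
  # Summary lines are printed for --fast runs as well, so a fully cached
  # re-run still produces the "<test> ...... SUCCESS" output shown above.
  self._print_summary(all_targets)  # hypothetical stand-in for the summary printing
  return result
```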

@jsirois (Contributor, Author) commented Jun 8, 2017

This is the first concrete step towards test caching (#4587).

```
invalidate_dependents=True) as invalidation_check:

invalid_tgts = [tgt for vts in invalidation_check.invalid_vts for tgt in vts.targets]
return self._run_pytest(invalid_tgts)
```
@jsirois (Contributor, Author) commented on the excerpt above:

This line does not raise on failure - it always returns a PytestResult. As a result, both test failures and successes are cached. Needs fixing...
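
One way to address this - a sketch only, assuming the invalid targets are marked valid when the `invalidated` block exits cleanly and that the result object exposes a `success` flag, and using `TaskError` from `pants.base.exceptions` - would be to raise inside the block on failure:

```
with self.invalidated(targets, invalidate_dependents=True) as invalidation_check:
  invalid_tgts = [tgt for vts in invalidation_check.invalid_vts for tgt in vts.targets]
  result = self._run_pytest(invalid_tgts)
  if not result.success:  # assumed attribute on PytestResult
    # Raising here prevents the invalidation machinery from marking the failing
    # targets valid, so only green runs get "cached".
    raise TaskError('pytest failed for the targets above')
return result
```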

@kwlzn (Member) commented Jun 9, 2017

IIUC, in a cold-then-warm run flow, run A will produce pytest output and a summary - while runs B and beyond will produce only the summary.

Thinking as a user, it feels like this could be a potentially awkward UX. Ideally, runs A and B would appear identical but net the runtime speedup of caching. Then, if I wanted to look again at the output of e.g. the -vvs passthrough mode, or at printed warnings, I could easily see that on a warm rerun without having to pass different flags to invalidate.

I'm not very familiar with the task caching capabilities in pants, but would it not be possible to e.g. capture the pytest section's output/raw xml files to disk and then respew that run over run?

I could see how it might be useful to have an up-front indication that you're hitting the cache in the form of less output, but I think we could potentially indicate that in an alternate way?
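
For what it's worth, the capture-and-respew idea could look something like the following self-contained sketch (not what this PR does, and the function name and capture path are hypothetical): on a cold run, capture the tool's console output to a file; on a warm run, replay the captured bytes instead of re-running.

```
import os
import subprocess
import sys

def run_or_replay(cmd, capture_path):
  """Run `cmd`, teeing its output to `capture_path`, or replay a prior capture."""
  if os.path.exists(capture_path):
    # Warm run: re-emit the previously captured output instead of re-running the tool.
    with open(capture_path) as f:
      sys.stdout.write(f.read())
    return 0
  # Cold run: execute the tool, capture its combined stdout/stderr, and emit it.
  proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
  out, _ = proc.communicate()
  with open(capture_path, 'w') as f:
    f.write(out.decode())
  sys.stdout.write(out.decode())
  return proc.returncode
```

The hard part called out in the reply below isn't the mechanics, though - it's that replayed output can embed machine-specific paths and other state once results are shared via a remote cache.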

@jsirois (Contributor, Author) commented Jun 9, 2017

IIUC, in a cold-then-warm run flow...

We could cache outputs, but then you'd also see paths from someone else's machine as soon as caching support is enabled. I'm open to ideas, I guess, but in general cold/warm run flows are in fact awkward and we've just grown used to it. When you javac code with warnings you get warning output on the console - when you hit the cache on a re-javac, you see no warnings, etc. It is true that certain tasks are more "leafy" and end-user interactive, but I'm not seeing a way to preserve output correctly without an inordinate amount of work. "Inordinate" here is clearly the thing up for debate. If we decide preserving output is important enough to do, then the next item for debate is the fidelity of that output.

@stuhood (Member) commented Jun 9, 2017

I'm definitely fine with producing reduced output on the second run, as doing anything else would make hitting the cache more expensive than is strictly necessary. For example: if we were to render all of the output of cache hits for compilation in a completely warm case, we'd dump out about 20MB of output to the screen, rather than a series of dots or actual usable progress (TODO).

@kwlzn (Member) commented Jun 9, 2017

you'd also see paths from someone else's machine as well as soon as caching support is enabled

When you javac code with warnings you get warnings console output - when you hit the cache on a re-javac, you see no warnings, etc.

Ah, ok - thanks for the details. If there are key challenges to making that work, and at least some precedent for this behavior on the JVM side, then I'm also good with it.

@jsirois (Contributor, Author) commented Jun 9, 2017

... and at least some precedent for this behavior on the jvm side, then I'm also good w/ it.

Cool. Yeah - there is precedent from every task we cache today: Ivy resolves, Go dep resolves, etc. The single odd case is a leaf verb you would normally interact with: `repl` for sure, but also sometimes `run` (if the binary cares about local machine state) and `test` (just because folks often add print lines or enter a debugger). That said, they normally only do so for failing tests, which are not cached.

@jsirois jsirois modified the milestone: 1.4.0 Jun 12, 2017
@jsirois jsirois force-pushed the jsirois/issues/4587/python2/test_caching branch from 8787173 to 28b0694 on June 12, 2017 20:40
@jsirois (Contributor, Author) commented Jun 12, 2017

OK - one legit test failure at least, and then failures depending on how many times you run the test (junit xml missing). I'll need to fold use of `vt.results_dir` - aka caching - into this change instead of keeping it separate, I think. Reviewers can hold off a bit longer.
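
For the record, folding in `vt.results_dir` could look roughly like the following sketch (the `junitxml_path` keyword and per-target file layout are hypothetical): point each run's junit xml at the versioned target's results dir so it persists across, and can be located on, fully cached re-runs.

```
import os

with self.invalidated(targets, invalidate_dependents=True) as invalidation_check:
  for vt in invalidation_check.invalid_vts:
    # Write results under the versioned target's results_dir so a warm re-run can
    # still find the junit xml without re-executing pytest.
    junitxml = os.path.join(vt.results_dir, 'junitxml', '{}.xml'.format(vt.target.id))
    self._run_pytest(vt.targets, junitxml_path=junitxml)  # hypothetical kwarg
```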

@jsirois jsirois force-pushed the jsirois/issues/4587/python2/test_caching branch from 28b0694 to 300fea1 on June 12, 2017 22:11
@jsirois (Contributor, Author) commented Jun 12, 2017

... Reviewers can hold off a bit longer.

Actually - all is good now, PTAL. There is a pre-existing failure mode when either the old or new pytest_run test is run with `--coverage` turned on; I'd like to address that separately since it's longstanding.

@jsirois jsirois requested review from kwlzn and benjyw June 12, 2017 23:28
@kwlzn (Member) left a comment

lgtm!

@benjyw (Contributor) commented Jun 13, 2017

Agreed re not worrying about different output on cold vs. hot runs.

However, it is generally desirable to capture tool logging/console output in a more principled way than we currently do, so that it's always available in a well-known .pants.d location. But that's future work, obvi.

@benjyw (Contributor) left a comment

I love it!

Perhaps there should be a `--force` flag that always reruns the tests? This is a major change in behavior, so it would be good to have an escape hatch.

In fact, I wonder if there simply should be a global, recursive `--force` flag. Then you can set `--force` globally to treat everything as if invalidated, or set it per-task to just do so for that task. Then the flag can be examined in the invalidation logic itself, instead of each task having to know about it.

Obviously this would have to be in a followup change.

Thoughts?
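
For concreteness, the per-task escape hatch could look something like this sketch (assuming the v1 options API's `register_options`/`get_options`; the class name is hypothetical, the global, recursive variant would instead live in the shared invalidation logic, and none of this is part of this PR):

```
from pants.task.task import Task  # pants 1.x task base class


class CachedTestTask(Task):  # hypothetical task, for illustration only
  @classmethod
  def register_options(cls, register):
    super(CachedTestTask, cls).register_options(register)
    register('--force', type=bool, default=False,
             help='Always run tests, even if a previous run was successful.')

  def execute(self):
    targets = self.context.targets()
    with self.invalidated(targets, invalidate_dependents=True) as invalidation_check:
      if self.get_options().force:
        tgts_to_run = targets  # escape hatch: ignore previously cached green runs
      else:
        tgts_to_run = [t for vts in invalidation_check.invalid_vts for t in vts.targets]
      self._run_pytest(tgts_to_run)
```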

@jsirois (Contributor, Author) commented Jun 13, 2017

In fact, I wonder if there simply should be a global, recursive --force flag. Then you can set --force ...

Yeah - this seems to make more sense: it beats `./pants invalidate` - which I assume no one knows about or runs - and better, it allows for targeted re-runs. I've filed #4673.

@jsirois jsirois merged commit 8023691 into pantsbuild:master Jun 13, 2017
@jsirois jsirois deleted the jsirois/issues/4587/python2/test_caching branch June 13, 2017 17:56