Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Targets always have a Snapshot #5994

Merged

Conversation

illicitonion
Copy link
Contributor

@illicitonion illicitonion commented Jun 21, 2018

Unless they're somehow specially generated, their Snapshot will be
populated during parsing in parallel in the engine.

If they weren't generated during parsing, the scheduler arg will be used
to synchronously capture one on demand.

This enables the sources of all Targets to be used in remote execution.

@illicitonion illicitonion force-pushed the dwagnerhall/lazyfileset/snapshotseverywhere branch 2 times, most recently from 41536af to 1a81185 Compare June 21, 2018 15:46
@illicitonion illicitonion changed the title WIP Targets always have a Snapshot Jun 21, 2018
@illicitonion
Copy link
Contributor Author

@stuhood This is now ready for review.

The first two commits are #5993 and #5997 and the last two are independently reviewable.

@illicitonion illicitonion force-pushed the dwagnerhall/lazyfileset/snapshotseverywhere branch from 1a81185 to 82341e7 Compare June 21, 2018 23:13
Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good!

I do think we'll need to take advantage of the batch Snapshot capture though.

"""
:param rel_root: The root for the given filespec, relative to the buildroot.
:param filespec: A filespec as generated by `FilesetRelPathWrapper`, which represents
what globs or file list it came from. Must be relative to buildroot.
:param files: A list of matched files, with declared order and duplicates preserved.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "duplicates" bit was a lie anyway.

# Build the target and handle duplicate sources.
if not vt.valid:
if self._do_validate_sources_present(vt.target):
self.execute_codegen(vt.target, vt.results_dir)
self._handle_duplicate_sources(vt.target, vt.results_dir)
sources = self._capture_sources(vt.target, vt.results_dir)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a very good usecase for using the batchness of the capture_snapshots method, and I think we're going to need to here to get adequate performance on a large graph.

If you split out a "generate and collect sources" loop before a "inject synthetic targets" loop, you could make one call to capture_snapshots for all targets in between the two loops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

'deprecated, and now takes a slow path. Instead, class {} should have its sources '
'argument populated by the engine, either by using the standard parsing pipeline, or by '
'requesting a SourcesField product.'
'the v2 engine.').format(self.__class__.__name__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/requesting a SourcesField product.the v2 engine./requesting a SourcesField product from the v2 engine./ ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@illicitonion
Copy link
Contributor Author

For my reference: I can reproduce the CI failures by running:
./pants run contrib/go/examples/src/go/distance

Unless they're somehow specially generated, their Snapshot will be
popualted during parsing in parallel in the engine.

If they weren't generated during parsing, the scheduler arg will be used
to synchronously capture one on demand.
@illicitonion illicitonion force-pushed the dwagnerhall/lazyfileset/snapshotseverywhere branch from 82341e7 to 5d9ac89 Compare June 26, 2018 17:37
Copy link
Member

@kwlzn kwlzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@stuhood stuhood merged commit 6229c41 into pantsbuild:master Jun 27, 2018
@illicitonion illicitonion deleted the dwagnerhall/lazyfileset/snapshotseverywhere branch August 15, 2018 03:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants