-
-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[internal] Split shell targets into atom vs generator #12957
[internal] Split shell targets into atom vs generator #12957
Conversation
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust]
all_expanded_targets = await Get(Targets, AddressSpecs([DescendantAddresses("")])) | ||
shell_tgts = tuple(tgt for tgt in all_expanded_targets if tgt.has_field(ShellSources)) | ||
all_targets = await Get(Targets, AddressSpecs([DescendantAddresses("")])) | ||
shell_tgts = tuple(tgt for tgt in all_targets if tgt.has_field(ShellSourceField)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this file mean that dep inference only happens for the file-based atomic targets, i.e. ShellSourceTarget
and ShellTestTarget
.
# Because a FieldSet corresponds to a file address, there should be exactly 1 file in the | ||
# sources. This assumption allows us to simplify determining which shell to use via inspecting | ||
# the shebang. | ||
if len(field_set_digest_content) != 1: | ||
raise AssertionError( | ||
f"The file address {request.field_set.address} had sources != 1, which is unexpected: " | ||
f"{field_set_sources.snapshot.files}. Please file a bug at " | ||
"https://github.com/pantsbuild/pants/issues/new with this error message copied." | ||
) | ||
original_test_file_content = field_set_digest_content[0] | ||
updated_test_file_content = add_source_shunit2(original_test_file_content) | ||
# `ShellTestSourceField` validates that there's exactly one file. | ||
test_file_content = field_set_digest_content[0] | ||
updated_test_file_content = add_source_shunit2(test_file_content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm excited about this cleanup in particular. Rule authors don't need to know about "file targets" anymore, or even generated targets! The rule behaves the same whether the file-based target was generated or explicitly declared.
class ShellSources(Sources): | ||
class ShellSourceField(Sources): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not in love with this name, particularly wondering if it should be Source
vs. Sources
.
I think the superclass will probably always be Sources
, which we cannot rename to Source
because there are non-file based atomic targets like go_package
which need >1 source file. So, weird for subclass to be Source
and parent to be Sources
.
But ShellSourcesField
also feels wrong - we want to express in the type system that Shell operates with files as its atoms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I'm thinking about how to model subclasses like this expressing with the type system that they are file-based, while still having Sources
as a superclass. Something like either a class-decorator or subclassing SourceField
(which itself subclasses Sources
). It would rename the field alias to be source
and expect a single str
, rather than list[str]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ShellSingleSourceField
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's better, even it the extra characters are annoying.
Also, I tried this weekend experimenting with subclasses of Sources
abusing the class to be source: str
instead of sources: list[str]
, and it was chaotic. Pretty sure it violates Liskov Substitution and is a bad idea. If we want a source
vs. sources
field, then I think we need two distinct classes and for all our supporting code (graph.py
etc) to work with both. That's not worth doing for now, so I think we stick with sources
and set expected_num_files = 1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ShellSourcesField
is fine - the fact that we operate on single sources at a time is somewhat an implementation choice. So I don't think we want to encode that in the name. Imagine a language where the theoretical atom is a single file, but in practice we choose to act on a few files together for whatever reason. There is no reason that code consuming a ShellSourcesField
should care that it happens to represent a single source? That decision has been made upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sgtm. Thanks Benjy!
class ShellGeneratingSources(Sources): | ||
uses_source_roots = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB that the Target API needs to view these sources as different than the file-based targets for all our rules to do the right thing. This split is what unlocks all the PR's benefits.
I anticipate that this will soon subclass something like GeneratingSources
, rather than Sources
. See #12953 for that idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a bit premature to do this until we know what https://docs.google.com/document/d/17XvhHlx8jBn4K9qFRZPe7qaKJQjTfiKMKYRubnzWxjA/edit will look like.
In particular: an unnamed target never actually needs to exist as a Target object if there is a shim for it during BUILD file parsing. And for a named Target generator, after BUILD file parsing, it shouldn't need a Sources field because it will be sources-less, with dependencies on its generated Targets instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, the idea of GeneratingSources is that it's not normal Sources. It's not representing ownership of the files, only a relationship that those files trigger generation.
Either way, this PR improves the modeling by having the generator sources be a different subclass than the atom sources.
class Shunit2TestsTimeout(IntField): | ||
class Shunit2TestTimeoutField(IntField): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I could go back, I would have added Field
to the end of most field names. A couple extra characters, but I think it brings clarity - it's often confusing imo what is a target's field vs. a rule's helper type, e.g. EntryPoint
from pants.backend.target_types
.
default = ("*_test.sh", "test_*.sh", "tests.sh") | ||
|
||
|
||
class Shunit2TestsGeneratorTarget(Target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is verbose, but I like it for making abundantly clear its purpose.
I thought about using Macro
instead of GeneratorTarget
, but that's a misnomer. An important part of https://docs.google.com/document/d/1HpJn2jTWf5sKob6zhe4SqHZ7KWBlX4a6OJOwnuw-IHo/edit#heading=h.7qrnzjjahn5v is realizing that target generators are targets themselves.
class ShellSources(Sources): | ||
class ShellSourceField(Sources): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ShellSingleSourceField
?
…rget # Conflicts: # src/python/pants/backend/shell/lint/shell_fmt.py # src/python/pants/backend/shell/register.py
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
1b85305
to
299ab17
Compare
* [internal] Run pyupgrade on src/python/pants/backend/python ([#13073](#13073)) * [internal] Re-enable some skipped JVM tests. ([#13074](#13074)) * [internal] Use `DownloadedExternalModules` when analyzing external Go packages ([#13076](#13076)) * [internal] Use `DownloadedExternalModules` during Go target generation ([#13070](#13070)) * [internal] Replace deprecated use of `[pytest] junit_xml_dir` with `[test] xml_dir. ([#13069](#13069)) * [internal] Add `DownloadedExternalModules` for Go ([#13068](#13068)) * [internal] Always use jars on the user classpath, and generalize transitive classpath building ([#13061](#13061)) * Add failing tests for Go external modules ([#13065](#13065)) * [internal] java: fix version in test ([#13064](#13064)) * [internal] Skip additional inference tests ([#13062](#13062)) * [internal] java: enable cycles for file-level targets generated by `java_sources` ([#13058](#13058)) * [internal] Add a `@logging` decorator for tests. ([#13060](#13060)) * [internal] Improve compatibility of nailgun with append only caches, and use them for Coursier ([#13046](#13046)) * [internal] Stop using `go.sum` when generating `_go_external_package` targets ([#13052](#13052)) * [internal] Rename `go_module` target to `go_mod` ([#13053](#13053)) * [internal] Refactor `go/util_rules/external_module.py` ([#13051](#13051)) * [internal] go: add analyzer and rules for test sources ([#13041](#13041)) * [Internal] Refactoring how we integrate with dockerfile ([#13027](#13027)) * [internal] Simplify `go/package_binary.py` ([#13045](#13045)) * [internal] Refactor `OwningGoMod` ([#13042](#13042)) * [internal] Refactor `go_mod.py` ([#13039](#13039)) * [internal] Record metadata on engine-aware params ([#13040](#13040)) * [internal] Test discovery of `go` binary ([#13038](#13038)) * [internal] Extract directory setup for terraform linters / formatters into a separate rule. ([#13037](#13037)) * [internal] java: register dependency inference rules ([#13035](#13035)) * [internal] Add `strutil.bullet_list()` to DRY formatting ([#13031](#13031)) * Minor cleanups for the autoflake linter / formatter. ([#13032](#13032)) * Ensure XML results recorded for both pytest and junit ([#13025](#13025)) * [internal] go: refactor compilation into separate rule ([#13019](#13019)) * [internal] go: refactor link step into separate rule ([#13022](#13022)) * [internal] go: enable plugin in repo and cleanup test project ([#13018](#13018)) * [internal] go: use colon to separate binary name and version ([#13020](#13020)) * [internal] tweak formatting of help text for sourcefile-validation subsystem. ([#13016](#13016)) * [internal] Use system-installed Go rather than installing via Pants ([#13007](#13007)) * Move the `process-execution-local-cleanup` hint to a more specific location. ([#13013](#13013)) * [internal] Split shell targets into atom vs generator ([#12957](#12957)) * Install Go in CI ([#13011](#13011)) * Refresh maintainers list. ([#13004](#13004)) * [internal] Refactor setup of GOROOT and `import_analysis.py` ([#13000](#13000)) * Infer dependencies on COPY of pex binaries for `docker_image`s. ([#12920](#12920)) * Prepare 2.7.0. ([#12995](#12995)) * [internal] jvm: skip JDK tests unless env var set ([#12994](#12994)) * [internal] jvm: limit caching of JDK setup processes ([#12992](#12992)) * [internal] Async-ify `NailgunPool::connect` and `nailgun::CommandRunner`. ([#12990](#12990)) * [internal] Replace `java_library` with `java_source` and `java_sources`, and add `java_test` ([#12976](#12976)) * Prepare 2.7.0rc5. ([#12987](#12987)) * [internal] terraform: refactor parser script into its own file ([#12985](#12985)) * [internal] jvm/java: ensure JDK downloaded in one process ([#12972](#12972)) * add JDK to GitHub Actions CI ([#12977](#12977)) * [internal] Re-enable the `clippy::used_underscore_binding` check. ([#12983](#12983)) * [internal] Use target generation for `_go_external_package` ([#12929](#12929)) * [internal] Bump CI token expiration threshold. ([#12974](#12974)) * [internal] Re-enable the Java backend. ([#12971](#12971)) * [internal] Implement `@union`s via `Query`s ([#12966](#12966)) * Remove `Enriched*Result` classes in favor of `EngineAwareReturnType.cacheable` ([#12970](#12970)) * [internal] Remove spurious `python_tests` directive ([#12968](#12968)) * [internal] Python coverage report generation uses precomputed addresses. ([#12965](#12965)) * Add PackageRootedDependencyMap for mapping inferred Java dependencies ([#12964](#12964))
Demos the technique described in #12954. The end user sees no change, including
./pants filter --target-type=python_library
and./pants peek
behaving the same way.But our internal representation now uses the changes from the proposal. The major benefit is that now our rules don't need to worry about
Address.is_file_target
- instead, the Target API ensures that no matter what, the rules will only ever operate on files as atoms. Even with fixes like #12952, we'll do the right thing.This change also frees us up to add an
overrides
field to the target generator safely.[ci skip-rust]