[internal] Split shell targets into atom vs generator #12957

Eric-Arellano · 2021-09-18T18:02:51Z

Demos the technique described in #12954. The end user sees no change, including ./pants filter --target-type=python_library and ./pants peek behaving the same way.

But our internal representation now uses the changes from the proposal. The major benefit is that now our rules don't need to worry about Address.is_file_target - instead, the Target API ensures that no matter what, the rules will only ever operate on files as atoms. Even with fixes like #12952, we'll do the right thing.

This change also frees us up to add an overrides field to the target generator safely.

[ci skip-rust]

# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust]

Eric-Arellano · 2021-09-18T18:04:03Z

src/python/pants/backend/shell/dependency_inference.py

-    all_expanded_targets = await Get(Targets, AddressSpecs([DescendantAddresses("")]))
-    shell_tgts = tuple(tgt for tgt in all_expanded_targets if tgt.has_field(ShellSources))
+    all_targets = await Get(Targets, AddressSpecs([DescendantAddresses("")]))
+    shell_tgts = tuple(tgt for tgt in all_targets if tgt.has_field(ShellSourceField))


The changes in this file mean that dep inference only happens for the file-based atomic targets, i.e. ShellSourceTarget and ShellTestTarget.

Eric-Arellano · 2021-09-18T18:05:52Z

src/python/pants/backend/shell/shunit2_test_runner.py

-    # Because a FieldSet corresponds to a file address, there should be exactly 1 file in the
-    # sources. This assumption allows us to simplify determining which shell to use via inspecting
-    # the shebang.
-    if len(field_set_digest_content) != 1:
-        raise AssertionError(
-            f"The file address {request.field_set.address} had sources != 1, which is unexpected: "
-            f"{field_set_sources.snapshot.files}. Please file a bug at "
-            "https://github.com/pantsbuild/pants/issues/new with this error message copied."
-        )
-    original_test_file_content = field_set_digest_content[0]
-    updated_test_file_content = add_source_shunit2(original_test_file_content)
+    # `ShellTestSourceField` validates that there's exactly one file.
+    test_file_content = field_set_digest_content[0]
+    updated_test_file_content = add_source_shunit2(test_file_content)


I'm excited about this cleanup in particular. Rule authors don't need to know about "file targets" anymore, or even generated targets! The rule behaves the same whether the file-based target was generated or explicitly declared.

Eric-Arellano · 2021-09-18T18:08:51Z

src/python/pants/backend/shell/target_types.py

-class ShellSources(Sources):
+class ShellSourceField(Sources):


I'm not in love with this name, particularly wondering if it should be Source vs. Sources.

I think the superclass will probably always be Sources, which we cannot rename to Source because there are non-file based atomic targets like go_package which need >1 source file. So, weird for subclass to be Source and parent to be Sources.

But ShellSourcesField also feels wrong - we want to express in the type system that Shell operates with files as its atoms.

Also, I'm thinking about how to model subclasses like this expressing with the type system that they are file-based, while still having Sources as a superclass. Something like either a class-decorator or subclassing SourceField (which itself subclasses Sources). It would rename the field alias to be source and expect a single str, rather than list[str].

ShellSingleSourceField ?

I think that's better, even it the extra characters are annoying.

Also, I tried this weekend experimenting with subclasses of Sources abusing the class to be source: str instead of sources: list[str], and it was chaotic. Pretty sure it violates Liskov Substitution and is a bad idea. If we want a source vs. sources field, then I think we need two distinct classes and for all our supporting code (graph.py etc) to work with both. That's not worth doing for now, so I think we stick with sources and set expected_num_files = 1.

I think ShellSourcesField is fine - the fact that we operate on single sources at a time is somewhat an implementation choice. So I don't think we want to encode that in the name. Imagine a language where the theoretical atom is a single file, but in practice we choose to act on a few files together for whatever reason. There is no reason that code consuming a ShellSourcesField should care that it happens to represent a single source? That decision has been made upstream.

Sgtm. Thanks Benjy!

Eric-Arellano · 2021-09-18T18:10:23Z

src/python/pants/backend/shell/target_types.py

+class ShellGeneratingSources(Sources):
+    uses_source_roots = False


NB that the Target API needs to view these sources as different than the file-based targets for all our rules to do the right thing. This split is what unlocks all the PR's benefits.

I anticipate that this will soon subclass something like GeneratingSources, rather than Sources. See #12953 for that idea.

It seems a bit premature to do this until we know what https://docs.google.com/document/d/17XvhHlx8jBn4K9qFRZPe7qaKJQjTfiKMKYRubnzWxjA/edit will look like.

In particular: an unnamed target never actually needs to exist as a Target object if there is a shim for it during BUILD file parsing. And for a named Target generator, after BUILD file parsing, it shouldn't need a Sources field because it will be sources-less, with dependencies on its generated Targets instead.

To be clear, the idea of GeneratingSources is that it's not normal Sources. It's not representing ownership of the files, only a relationship that those files trigger generation.

Either way, this PR improves the modeling by having the generator sources be a different subclass than the atom sources.

Eric-Arellano · 2021-09-18T18:12:10Z

src/python/pants/backend/shell/target_types.py

-class Shunit2TestsTimeout(IntField):
+class Shunit2TestTimeoutField(IntField):


If I could go back, I would have added Field to the end of most field names. A couple extra characters, but I think it brings clarity - it's often confusing imo what is a target's field vs. a rule's helper type, e.g. EntryPoint from pants.backend.target_types.

Eric-Arellano · 2021-09-18T18:16:34Z

src/python/pants/backend/shell/target_types.py

+    default = ("*_test.sh", "test_*.sh", "tests.sh")
+
+
+class Shunit2TestsGeneratorTarget(Target):


This name is verbose, but I like it for making abundantly clear its purpose.

I thought about using Macro instead of GeneratorTarget, but that's a misnomer. An important part of https://docs.google.com/document/d/1HpJn2jTWf5sKob6zhe4SqHZ7KWBlX4a6OJOwnuw-IHo/edit#heading=h.7qrnzjjahn5v is realizing that target generators are targets themselves.

# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]

tdyas · 2021-09-19T15:42:24Z

src/python/pants/backend/shell/target_types.py

-class ShellSources(Sources):
+class ShellSourceField(Sources):


ShellSingleSourceField ?

Eric-Arellano · 2021-09-25T16:03:47Z

Bump @benjyw @stuhood re the naming of things. I'd like to do this same type of PR for the rest of the codebase once this lands.

…rget # Conflicts: # src/python/pants/backend/shell/lint/shell_fmt.py # src/python/pants/backend/shell/register.py

# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]

…rget

* [internal] Run pyupgrade on src/python/pants/backend/python ([#13073](#13073)) * [internal] Re-enable some skipped JVM tests. ([#13074](#13074)) * [internal] Use `DownloadedExternalModules` when analyzing external Go packages ([#13076](#13076)) * [internal] Use `DownloadedExternalModules` during Go target generation ([#13070](#13070)) * [internal] Replace deprecated use of `[pytest] junit_xml_dir` with `[test] xml_dir. ([#13069](#13069)) * [internal] Add `DownloadedExternalModules` for Go ([#13068](#13068)) * [internal] Always use jars on the user classpath, and generalize transitive classpath building ([#13061](#13061)) * Add failing tests for Go external modules ([#13065](#13065)) * [internal] java: fix version in test ([#13064](#13064)) * [internal] Skip additional inference tests ([#13062](#13062)) * [internal] java: enable cycles for file-level targets generated by `java_sources` ([#13058](#13058)) * [internal] Add a `@logging` decorator for tests. ([#13060](#13060)) * [internal] Improve compatibility of nailgun with append only caches, and use them for Coursier ([#13046](#13046)) * [internal] Stop using `go.sum` when generating `_go_external_package` targets ([#13052](#13052)) * [internal] Rename `go_module` target to `go_mod` ([#13053](#13053)) * [internal] Refactor `go/util_rules/external_module.py` ([#13051](#13051)) * [internal] go: add analyzer and rules for test sources ([#13041](#13041)) * [Internal] Refactoring how we integrate with dockerfile ([#13027](#13027)) * [internal] Simplify `go/package_binary.py` ([#13045](#13045)) * [internal] Refactor `OwningGoMod` ([#13042](#13042)) * [internal] Refactor `go_mod.py` ([#13039](#13039)) * [internal] Record metadata on engine-aware params ([#13040](#13040)) * [internal] Test discovery of `go` binary ([#13038](#13038)) * [internal] Extract directory setup for terraform linters / formatters into a separate rule. ([#13037](#13037)) * [internal] java: register dependency inference rules ([#13035](#13035)) * [internal] Add `strutil.bullet_list()` to DRY formatting ([#13031](#13031)) * Minor cleanups for the autoflake linter / formatter. ([#13032](#13032)) * Ensure XML results recorded for both pytest and junit ([#13025](#13025)) * [internal] go: refactor compilation into separate rule ([#13019](#13019)) * [internal] go: refactor link step into separate rule ([#13022](#13022)) * [internal] go: enable plugin in repo and cleanup test project ([#13018](#13018)) * [internal] go: use colon to separate binary name and version ([#13020](#13020)) * [internal] tweak formatting of help text for sourcefile-validation subsystem. ([#13016](#13016)) * [internal] Use system-installed Go rather than installing via Pants ([#13007](#13007)) * Move the `process-execution-local-cleanup` hint to a more specific location. ([#13013](#13013)) * [internal] Split shell targets into atom vs generator ([#12957](#12957)) * Install Go in CI ([#13011](#13011)) * Refresh maintainers list. ([#13004](#13004)) * [internal] Refactor setup of GOROOT and `import_analysis.py` ([#13000](#13000)) * Infer dependencies on COPY of pex binaries for `docker_image`s. ([#12920](#12920)) * Prepare 2.7.0. ([#12995](#12995)) * [internal] jvm: skip JDK tests unless env var set ([#12994](#12994)) * [internal] jvm: limit caching of JDK setup processes ([#12992](#12992)) * [internal] Async-ify `NailgunPool::connect` and `nailgun::CommandRunner`. ([#12990](#12990)) * [internal] Replace `java_library` with `java_source` and `java_sources`, and add `java_test` ([#12976](#12976)) * Prepare 2.7.0rc5. ([#12987](#12987)) * [internal] terraform: refactor parser script into its own file ([#12985](#12985)) * [internal] jvm/java: ensure JDK downloaded in one process ([#12972](#12972)) * add JDK to GitHub Actions CI ([#12977](#12977)) * [internal] Re-enable the `clippy::used_underscore_binding` check. ([#12983](#12983)) * [internal] Use target generation for `_go_external_package` ([#12929](#12929)) * [internal] Bump CI token expiration threshold. ([#12974](#12974)) * [internal] Re-enable the Java backend. ([#12971](#12971)) * [internal] Implement `@union`s via `Query`s ([#12966](#12966)) * Remove `Enriched*Result` classes in favor of `EngineAwareReturnType.cacheable` ([#12970](#12970)) * [internal] Remove spurious `python_tests` directive ([#12968](#12968)) * [internal] Python coverage report generation uses precomputed addresses. ([#12965](#12965)) * Add PackageRootedDependencyMap for mapping inferred Java dependencies ([#12964](#12964))

Similar to #12957, this implements the technique from #12954. `./pants filter --target-type` and `./pants peek` behave the same as before, but our internal implementation differentiates between the atom target vs. its generator. [ci skip-rust] [ci skip-build-wheels]

[internal] Split shell targets into atom vs generator

42fdc29

# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust]

Eric-Arellano requested review from stuhood, benjyw and tdyas September 18, 2021 18:02

Eric-Arellano commented Sep 18, 2021

View reviewed changes

Fix fmt

dbd78da

# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]

Eric-Arellano mentioned this pull request Sep 20, 2021

Add Java dependency analysis types and launcher using javaparser library. #12890

Merged

tdyas approved these changes Sep 20, 2021

View reviewed changes

Eric-Arellano mentioned this pull request Sep 20, 2021

Java: use java_source vs java_sources #12962

Closed

Eric-Arellano added 2 commits September 26, 2021 02:29

Merge branch 'main' of github.com:pantsbuild/pants into shell-test-ta…

a7abc52

…rget # Conflicts: # src/python/pants/backend/shell/lint/shell_fmt.py # src/python/pants/backend/shell/register.py

Review feedback: call it Sources field still

299ab17

# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]

Eric-Arellano force-pushed the shell-test-target branch from 1b85305 to 299ab17 Compare September 26, 2021 06:33

Eric-Arellano enabled auto-merge (squash) September 26, 2021 06:33

Eric-Arellano added 2 commits September 26, 2021 12:37

Merge branch 'main' of github.com:pantsbuild/pants into shell-test-ta…

da75b7d

…rget

Fix shell_command_test.py

4c86bd0

Eric-Arellano merged commit 906099f into pantsbuild:main Sep 26, 2021

Eric-Arellano deleted the shell-test-target branch September 27, 2021 04:46

wisechengyi mentioned this pull request Oct 2, 2021

2.8.0.dev3 prep #13081

Merged

Eric-Arellano mentioned this pull request Oct 6, 2021

[internal] Split Protobuf targets into atom vs. generator #13134

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[internal] Split shell targets into atom vs generator #12957

[internal] Split shell targets into atom vs generator #12957

Eric-Arellano commented Sep 18, 2021

Eric-Arellano Sep 18, 2021

Eric-Arellano Sep 18, 2021

Eric-Arellano Sep 18, 2021

Eric-Arellano Sep 18, 2021

tdyas Sep 19, 2021

Eric-Arellano Sep 20, 2021

benjyw Sep 25, 2021

Eric-Arellano Sep 26, 2021

Eric-Arellano Sep 18, 2021

stuhood Sep 25, 2021

Eric-Arellano Sep 26, 2021

Eric-Arellano Sep 18, 2021

Eric-Arellano Sep 18, 2021

tdyas Sep 19, 2021

Eric-Arellano commented Sep 25, 2021

		class ShellSources(Sources):
		class ShellSourceField(Sources):

		class ShellGeneratingSources(Sources):
		uses_source_roots = False

		class Shunit2TestsTimeout(IntField):
		class Shunit2TestTimeoutField(IntField):

		default = ("_test.sh", "test_.sh", "tests.sh")


		class Shunit2TestsGeneratorTarget(Target):

[internal] Split shell targets into atom vs generator #12957

[internal] Split shell targets into atom vs generator #12957

Conversation

Eric-Arellano commented Sep 18, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Eric-Arellano commented Sep 25, 2021