Associate cli arguments with executables and refactor llvm/gcc c/c++ toolchain selection #6217

cosmicexplorer · 2018-07-23T07:43:56Z

Problem

#5951 explains the problem addressed by moving CLI arguments to individual Executable objects -- this greatly reduces the difficulty of generating appropriate command lines for the executables invoked. In this PR, it removes a significant amount of repeated boilerplate.

Additionally, we weren't distinguishing between a Linker to link the compiled object files of gcc or g++ vs clang or clang++. We were attempting to generate a linker object which would work with any of gcc, g++, clang, or clang++, and this wasn't really feasible. Along with the above, this made it extremely difficult and error-prone to generate correct command lines / environments for executing the linker, which led to e.g. not being able to find crti.o (one symptom of this problem).

Solution

Introduce CToolchain and CppToolchain in environment.py, which can be generated from LLVMCToolchain, LLVMCppToolchain, GCCCToolchain, or GCCCppToolchain. These toolchain datatypes are created in native_toolchain.py, where a single @rule for each ensures that no C or C++ compiler that someone can request was made without an accompanying linker, which will be configured to work with the compiler.
Introduce the extra_args property to the Executable mixin in environment.py, which Executable subclasses can override simply by declaring a datatype field named extra_args. This is used in native_toolchain.py to ensure platform-specific arguments and environment variables are set in the same @rule which produces a paired compiler and linker -- there is a single place to look to see where all the process invocation environment variables and command-line arguments are set for a given toolchain.
Introduce the ArchiveFileMapper subsystem and use it to declare sets of directories to resolve within our BinaryTool archives GCC and LLVM. This subsystem allows globbing (and checks that there is a unique expansion), which makes it robust to e.g. platform-specific paths to things like include or lib directories.
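
The pairing and extra_args ideas above can be sketched roughly as follows. This is a minimal illustration, not the code in environment.py or native_toolchain.py: it uses namedtuple in place of Pants's datatype, and the names PlainTool, as_argv, and make_c_toolchain are hypothetical.

```python
from collections import namedtuple


class Executable(object):
    """Hypothetical stand-in for the Executable mixin described above."""

    @property
    def extra_args(self):
        # Default: no extra arguments. A subclass overrides this simply by
        # declaring a field named extra_args, which comes first in the MRO.
        return []

    def as_argv(self, args):
        # Executable name, then toolchain-level arguments, then per-call arguments.
        return [self.exe_filename] + list(self.extra_args) + list(args)


class CCompiler(namedtuple('CCompiler', ['exe_filename', 'extra_args']), Executable):
    pass


class Linker(namedtuple('Linker', ['exe_filename', 'extra_args']), Executable):
    pass


class PlainTool(namedtuple('PlainTool', ['exe_filename']), Executable):
    """Declares no extra_args field, so the mixin default applies."""


# A compiler is only ever handed out together with a linker configured for it.
CToolchain = namedtuple('CToolchain', ['c_compiler', 'c_linker'])


def make_c_toolchain(is_osx):
    # Platform-specific arguments are decided in exactly one place.
    platform_args = ['-mmacosx-version-min=10.11'] if is_osx else []
    return CToolchain(
        c_compiler=CCompiler('clang', platform_args),
        c_linker=Linker('clang', platform_args))
```

With this shape, a task only needs toolchain.c_compiler.as_argv(['-c', 'a.c']) and never assembles platform flags itself.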

Result

Removes several FIXMEs, including heavily-commented parts of test_native_toolchain.py. Partially addresses #5951 -- setup_py.py still generates its own execution environment from scratch, and this could be made more hygienic in the future. As noted in #6179 and #6205, this PR seems to immediately fix the CI failures in those PRs.

jsirois

This is a lot to grok at once, so a bit of a superficial skim.

jsirois · 2018-07-23T14:08:47Z

src/python/pants/backend/native/subsystems/binaries/gcc.py

-  def c_compiler(self):
-    exe_filename = 'gcc'
-    path_entries = self.path_entries()
+  _PLATFORM_INTERMEDIATE_DIRNAME = {


Are these solid for all 64-bit Linux and OSX? They look a bit specific. If they are just enough to get CI tests working, that's a step forward, but they deserve a TODO with an issue pointer. On the other hand, if they are solid (they don't look so), they deserve a comment pointing off to GCC docs that show they are.

So this is actually only a path into the specific gcc archive we provide, which is why I was thinking it was ok to lean on. However, the ParseSearchDirs approach actually may be able to get this -- doing it manually results in:

> PATH="${HOME}/.cache/pants/bin/gcc/mac/10.13/7.3.0/gcc/bin" g++ -print-search-dirs | grep -P '^libraries' | sed -re 's#^libraries: =##g' | tr ':' '\n' | sort | uniq | parallel -n1 readlink -f
/Users/dmcclanahan/.cache/pants/bin/gcc/mac/10.13/7.3.0/gcc/lib/gcc/x86_64-apple-darwin17.5.0/7.3.0
/Users/dmcclanahan/.cache/pants/bin/gcc/mac/10.13/7.3.0/gcc/lib
/Users/dmcclanahan/.cache/pants/bin/gcc/mac/10.13/7.3.0/gcc/lib/gcc

This looks like it can be manipulated to find the include dirs we want instead of generating those paths ourselves, which would allow us to revert many of the changes to this file.
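
For reference, the text-processing half of that pipeline could look something like the sketch below. It only parses the "libraries: =" line from output that has already been captured; the function name parse_search_dirs is hypothetical and this is not the ParseSearchDirs implementation in the PR. Resolving symlinks (the readlink -f step) is left out.

```python
def parse_search_dirs(output, key='libraries'):
    """Return the de-duplicated directories listed on the '<key>: =' line.

    `output` is assumed to be the captured stdout of `g++ -print-search-dirs`.
    """
    prefix = key + ': ='
    for line in output.splitlines():
        if line.startswith(prefix):
            dirs = []
            # Entries are colon-separated, as the `tr ':' '\n'` step above assumes.
            for entry in line[len(prefix):].split(':'):
                if entry and entry not in dirs:
                    dirs.append(entry)
            return dirs
    # No matching line was found.
    return []
```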

OK, that would be excellent.

But this clarification also points to this issue pantsbuild/binaries#78; ie: x86_64-apple-darwin17.5.0 is a bit specific, which would be OK iff it corresponded to 10.11.

The fact that this entire class refers to the specifics of our GCC build is worth pointing out in a class level doc most likely.

Yes, 100% agreed.

To clarify, 17.5.0 is from the LLVM binary release for OSX -- this is the package we provide for all OSX (because it's the package LLVM provides for all OSX), and is not an artifact of any build process we perform ourselves. I am making the ParseSearchDirs approach work regardless, just wanted to clear up why I thought that path was ok.

I was having difficulty getting the ParseSearchDirs solution to work, so e67044c scraps that entirely and uses a helper method _get_check_single_path_by_glob() which is used to locate directories within our provided gcc distribution.

I was annoyed that we couldn't just invoke the binary -- although there may be a better way in the future, adding /lib and /include to directory paths (as I mentioned could be doable above) ended up getting way too long and needlessly complex.

The process of using globs was split off into another pants.backend.native.utils subsystem named ArchiveFileMapper, and docstrings describing that GCC and LLVM rely on the current specific layout of the archive were added as well, in 3051235.
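
The "glob, then require a unique expansion" check can be sketched as below. This is an illustration under assumptions, not the ArchiveFileMapper code: the names resolve_single_path and ArchiveGlobError are hypothetical.

```python
import glob
import os


class ArchiveGlobError(Exception):
    """Raised when a glob does not expand to exactly one path."""


def resolve_single_path(archive_root, pattern):
    """Expand `pattern` relative to `archive_root` and insist on one match."""
    matches = sorted(glob.glob(os.path.join(archive_root, pattern)))
    if len(matches) != 1:
        raise ArchiveGlobError(
            'expected exactly one match for {!r} under {!r}, got: {}'
            .format(pattern, archive_root, matches))
    return matches[0]
```

A pattern such as 'lib/gcc/*/7.3.0' then tolerates a platform-specific directory name in the middle, while still failing loudly if the archive layout changes.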

jsirois · 2018-07-23T15:00:19Z

src/python/pants/backend/native/config/environment.py

+
+  @abstractproperty
+  def as_c_toolchain(self):
+    """???"""


Please doc or pass.

I think this should just be a pass, will make that change.

Removed in 3441e74!

jsirois · 2018-07-23T15:02:33Z

src/python/pants/backend/native/config/environment.py

+
+  @abstractproperty
+  def as_cpp_toolchain(self):
+    """???"""


Ditto doc or pass.

Removed in 3441e74!

jsirois · 2018-07-23T15:06:13Z

src/python/pants/backend/python/tasks/setup_py.py

@@ -92,12 +92,11 @@ class SetupPyNativeTools(datatype([
  """


-@rule(SetupPyNativeTools, [Select(CCompiler), Select(CppCompiler), Select(Linker), Select(Platform)])
-def get_setup_py_native_tools(c_compiler, cpp_compiler, linker, platform):
+@rule(SetupPyNativeTools, [Select(LLVMCToolchain), Select(LLVMCppToolchain), Select(Platform)])


This seems like a step back - SetupPyNativeTools which only cares about a CppToolchain and a CToolchain is not LLVM specific and cannot be produced for GCC. Am I missing something or is a TODO/comment missing?

Totally right! In 3441e74 I moved all the rules and datatypes into python_native_code.py. The rules select_c_toolchain_for_local_dist_compilation and select_cpp_toolchain_for_local_dist_compilation mean that LLVM is used as the default toolchain, but SetupPyNativeTools only requires an injected CToolchain and CppToolchain.

jsirois · 2018-07-23T15:25:33Z

src/python/pants/backend/native/config/environment.py

+class LLVMCToolchain(datatype([
+    ('llvm_c_compiler', LLVMCCompiler),
+    ('llvm_c_linker', LLVMCLinker),
+]), CToolchainProvider):


I'm not understanding why class LLVMCCompiler(datatype([('c_compiler', CCompiler)])): pass vs just class LLVMCCompiler(CCompiler): pass, etc. It seems to me this would eliminate the need for both CToolchainProvider and CppToolchainProvider.

You're completely right. 3441e74 does exactly this. That removed ~70 lines from this diff.

jsirois · 2018-07-23T15:50:36Z

src/python/pants/backend/native/subsystems/binaries/gcc.py

-  yield GCCCCompiler(gcc.c_compiler())
+@rule(GCCCCompiler, [Select(GCC), Select(Platform)])
+def get_gcc(gcc, platform):
+  yield GCCCCompiler(gcc.c_compiler(platform))


Why a yield here and below instead of a return? Afaict that just causes a little extra work looping through generator send gymnastics for what will be a blocking call either way:

pants/src/rust/engine/src/nodes.rs

Lines 861 to 871 in b42d305

    deps
      .then(move |deps_result| match deps_result {
        Ok(deps) => externs::call(&externs::val_for(&func.0), &deps),
        Err(failure) => Err(failure),
      })
      .then(move |task_result| match task_result {
        Ok(val) => {
          if externs::satisfied_by(&context.core.types.generator, &val) {
            Self::generate(context, entry, val)
          } else {
            ok(val)

It had slipped my mind that return was allowed for an @rule -- this is using return now.

It absolutely makes sense, but I wasn't aware before that yield was interpreted to mean this (hadn't thought about it in a while) -- this makes the execution process of @rules more clear to me.
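
A simplified Python model of the check in the Rust excerpt above: if calling the rule function produces a generator, it has to be driven step by step; otherwise the value is used as-is. This is only a sketch of the idea, not the engine's implementation, and Get, run_rule, and resolve_get here are hypothetical stand-ins.

```python
import inspect


class Get(object):
    """Hypothetical stand-in for a request yielded from inside a rule."""

    def __init__(self, product):
        self.product = product


def run_rule(func, args, resolve_get):
    result = func(*args)
    if not inspect.isgenerator(result):
        # Plain `return`: the value is already the rule's product.
        return result
    # Generator: answer each yielded Get, and treat the first yielded
    # non-Get value as the rule's product.
    yielded = next(result)
    while isinstance(yielded, Get):
        yielded = result.send(resolve_get(yielded))
    return yielded
```

In this model a rule that only does `yield SomeProduct(...)` takes the generator path for no benefit, which is the extra work the review comment points at.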

jsirois · 2018-07-23T15:51:56Z

src/python/pants/backend/native/subsystems/xcode_cli_tools.py

-      library_dirs=[])
+      library_dirs=[],
+      linking_library_dirs=[],
+      extra_args=['-mmacosx-version-min=10.11'])


The 10.11 seems arbitrary / tied to Pants' current min version support. Does this deserve to be lifted out to a constant used to generate all such flags?

It does! And it was, in MIN_OSX_VERSION_ARG from b2a750b.

jsirois · 2018-07-23T15:57:41Z

src/python/pants/backend/native/tasks/cpp_compile.py

    return NativeToolchain.scoped_instance(self)

  def get_compile_settings(self):
    return CppCompileSettings.scoped_instance(self)

+  @memoized_property
+  def _cpp_toolchain(self):
+    llvm_cpp_toolchain = self._request_single(LLVMCppToolchain, self._native_toolchain)


Another instance of regressing(?) to LLVM lock-in here, in a presumably generic CppCompile class.

Yep! This was fixed in 6843579! Thanks for describing the regressions!

jsirois · 2018-07-23T15:58:17Z

src/python/pants/backend/native/tasks/link_shared_libraries.py

    return NativeToolchain.scoped_instance(self)

+  @memoized_property
+  def _cpp_toolchain(self):
+    llvm_cpp_toolchain = self._request_single(LLVMCppToolchain, self._native_toolchain)


More lock in.

Also should be fixed by 6843579!

stuhood

John has provided a very useful review for this, so I'll stay out of it. But will say that what you've done to inline things into src/python/pants/backend/native/subsystems/native_toolchain.py makes a lot of sense to me, and removes a bit of unnecessary abstraction.

stuhood · 2018-07-23T18:40:44Z

src/python/pants/backend/native/subsystems/binaries/gcc.py

-  def c_compiler(self):
-    exe_filename = 'gcc'
-    path_entries = self.path_entries()
+  _PLATFORM_INTERMEDIATE_DIRNAME = {


The fact that this entire class refers to the specifics of our GCC build is worth pointing out in a class level doc most likely.

CMLivingston · 2018-07-23T19:01:01Z

src/python/pants/backend/native/subsystems/binaries/binutils.py

@@ -30,7 +30,9 @@ def linker(self):
    return Linker(
      path_entries=self.path_entries(),
      exe_filename='ld',
-      library_dirs=[])
+      library_dirs=[],


Slightly confused about the distinction between this param and the one below it. This is slightly ambiguous to me as well:

  @abstractproperty
  def library_dirs(self):
    """Directories containing shared libraries required for a subprocess to run."""

Maybe add an abstract property with a docstring to the Linker class?

This makes things a lot neater! Added LinkerMixin in b2a750b!

CMLivingston · 2018-07-23T19:10:18Z

src/python/pants/backend/native/subsystems/native_toolchain.py

+  libc_dev = yield Get(LibcDev, NativeToolchain, native_toolchain)
+  working_linker = Linker(
+    path_entries=(base_linker.path_entries + working_c_compiler.path_entries),
+    exe_filename=working_c_compiler.exe_filename,


So we are still linking through the compiler frontend? Based on path_entries, it looks like we will hit our linker first; however, why not explicitly invoke it? I know you explained this to me once but I'm having a tough time recalling...I think it had to do with critical search dirs that we are unable to find by directly executing the linker?

It has been very difficult to invoke the linker directly when I've tried, for C or C++. If you try adding -v to the extra_args for a Linker somewhere you can see the (extremely long) generated command line -- I have not been able to make it work yet, after trying for a while, and as far as I can tell there is no reason we would need to invoke the linker directly, unless the compiler frontend is pulling in something we don't want (which is what we are using the extra_args to avoid, with arguments such as -nostdinc++).

(to clarify, I would love to invoke the linker directly, but it has never borne fruit any time I've tried. if there's a way to do this that I'm missing I'm all ears)

(what I said to you before was a condensed version of that)

Sure, I don't think this is a problem if we are hitting our linker regardless. If we find that the frontend is making things too difficult to build/debug, we can lean into direct invocation.
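
To make the arrangement in the diff concrete: the linker object runs the compiler frontend as its executable, but puts the base linker's directories first on the PATH so that the frontend finds the provided ld. The sketch below is illustrative only; Tool, linker_via_compiler_frontend, and invocation_path are hypothetical names, and the directories are made up.

```python
from collections import namedtuple

# Hypothetical minimal shape shared by the compiler and linker objects.
Tool = namedtuple('Tool', ['path_entries', 'exe_filename', 'extra_args'])


def linker_via_compiler_frontend(base_linker, compiler, extra_args=()):
    return Tool(
        # Base linker directories come first, so its `ld` wins a PATH lookup.
        path_entries=list(base_linker.path_entries) + list(compiler.path_entries),
        # The process actually executed is the compiler driver.
        exe_filename=compiler.exe_filename,
        extra_args=list(extra_args))


def invocation_path(tool):
    # PATH value for the subprocess, in search order.
    return ':'.join(tool.path_entries)
```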

CMLivingston · 2018-07-23T19:12:28Z

src/python/pants/backend/native/subsystems/native_toolchain.py

+    '-x', 'c++', '-std=c++11',
+    # These mean we don't use any of the headers from our LLVM distribution.
+    '-nobuiltininc',
+    '-nostdinc++',


This implies that we will always be relying on system headers/includes to be there for any calls to std::*, correct?

No, the opposite -- these flags stop clang from searching for or using the system headers (because we don't control what it finds in that way) except for the ones we provide via *_INCLUDE_PATH. We add the system headers that we control / know about specifically later in the @rule, either from XCodeCLITools or GCC.

Expanded the definition of these more in 973f26e. Also, -nobuiltininc may not exist, or may not exist on Linux? I'm not sure, but it's not being recognized now -- will see if it's required on OSX.

clang++ --help is the only way I've been getting information about what any of these options do.

CMLivingston · 2018-07-23T19:23:18Z

src/python/pants/backend/native/subsystems/native_toolchain.py

+      extra_args=llvm_cpp_compiler_args)
+    linking_library_dirs = provided_gpp.library_dirs + provided_clang.library_dirs
+    # Ensure we use libstdc++, provided by g++, during the linking stage.
+    linker_extra_args=['-stdlib=libstdc++']


Am I correct in that we are using the provided lib dirs for linking, but not using them for compilation? This is for LLVM-based toolchains only, correct?

linking_library_dirs aren't used in compilation, only the compiler object is used. This is only for LLVM toolchains -- this @rule is producing the single LLVMCppToolchain. The -stdlib argument is only recognized on clang, see the libcxx site (the new LLVM C++ standard library that we are turning off with this option, because it is not complete on Linux yet). See the LinkerMixin introduced in b2a750b which was added in response to your other review comment, which makes the difference between these directories more clear.

CMLivingston

Thanks for circling back to fix these TODOs. A few questions from me, but I would like to hear your take on the llvm-locking happening in the native tasks (w.r.t. John's comments) before I approve these changes.

~~It would also be a great idea to get one more member of the Build team to give this a look to promote knowledge sharing of the native compilation support.~~


cosmicexplorer · 2018-07-24T04:13:33Z

I removed some arguments to clang and clang++ while iterating on Linux that turned out to be necessary for OSX, but it should all be sorted out now (we'll see what travis has to say about that). All comments above should have been addressed.


cosmicexplorer · 2018-07-24T07:48:01Z

Green, and OP updated.

cosmicexplorer · 2018-07-24T09:02:52Z

Also broke out #6224 for further documentation of what we are doing and why we do what we are doing in native_toolchain.py.

CMLivingston · 2018-07-24T17:25:47Z

src/python/pants/backend/native/config/environment.py

-
-    if self.include_dirs:
-      ret['C_INCLUDE_PATH'] = create_path_env_var(self.include_dirs)
+    ret = super(CCompiler, self).as_invocation_environment_dict.copy()

    ret['CC'] = self.exe_filename


Have you been able to verify that setup.py compilation is taking place with the correct exe? I recall we hit some cases where it wasn't actually respecting this variable.

With the correct exe_filename? There is no testing that the setup.py compilation does anything other than succeed, currently, but if you make the build fail (e.g. by inserting a syntax error into a C/C++ source), pex will print the stdout and stderr to the terminal stderr, and you can see that modifying exe_filenames (or e.g. selecting GCCCToolchain instead of the llvm one in python_native_code.py) will change the compiler used for the generated command line to build setup.py native sources.

At this point, I'm not sure what we should be testing wrt setup.py compilation other than success. If you have some itches you want to scratch, an issue just listing what tests we should add, or a PR doing that, would be great so I or someone else can address it in full. This wouldn't need to be more than a few sentences.

And that may have been the case (wrt not respecting CC), but I don't remember the situation, and the issue may have been e.g. that we were adding the wrong entries to the PATH.
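
For context, the environment being discussed has roughly this shape: PATH built from the tool's path entries, CC set to the executable name, and C_INCLUDE_PATH set only when there are include directories. This is a sketch based on the diff above, not the actual as_invocation_environment_dict; c_compiler_environment and join_path_entries are hypothetical names.

```python
def join_path_entries(entries):
    # Colon-separated, as PATH-style variables are on Linux and OSX.
    return ':'.join(entries)


def c_compiler_environment(exe_filename, path_entries, include_dirs):
    env = {
        'PATH': join_path_entries(path_entries),
        # setup.py builds are expected to pick the compiler up from CC.
        'CC': exe_filename,
    }
    if include_dirs:
        env['C_INCLUDE_PATH'] = join_path_entries(include_dirs)
    return env
```

If CC names the right executable but PATH lists the wrong directories, a different binary with the same name could still be picked up, which matches the suspicion raised above.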

jsirois

A general unease I have with new and existing code here is the proliferation of tiny rules that don't obviously do anything expensive. Some definitely just construct a datatype instance from other datatype instances - which is not good. Others are less clear. The problem though is that each of these tiny rules will run on a different machine in the future! That's completely unnecessary and although it would be nice to not have to think about that when writing rules, I think - currently at least - you do have to consider that.

jsirois · 2018-07-24T17:47:00Z

src/python/pants/backend/python/subsystems/python_native_code.py

@@ -129,3 +139,157 @@ def check_build_for_current_platform_only(self, targets):
      'native code. Please ensure that the platform arguments in all relevant targets and build '
      'options are compatible with the current platform. Found targets for platforms: {}'
      .format(str(platforms_with_sources)))
+
+
+@rule(CToolchain, [Select(PythonNativeCode)])


CToolchain but you limit to LLVMCToolchain here (GCCCToolchain is ruled out). Why is this OK? Is this required to avoid the central problem that you have 2 producers for CToolchain components in GCC and LLVM registered rules with the engine, and to request a CCompiler straight up would thus blow up with an ambiguous producer? A comment / TODO to address would be good.

100% correct on why this is required. This concern was vaguely addressed with the TODO(#4020): These classes are performing the work of variants. I can add your comment to the usage here to make it clear how variants would improve the ux.

A comment would be great, but this is not just UX, it's a bug. There is currently no way to select GCC.

Hm, my comment was under the assumption that it was possible to select GCC for C/C++ elsewhere (this is done in test_native_toolchain.py) -- we just currently elect to use the LLVM toolchain when compiling python_dist()s without making that configurable yet. Am I missing something?

I just added 4f5194c which removes the rules which provide a bare CToolchain or CppToolchain, and expands the explanation of why the wrappers are necessary/how to request them. EDIT: the right commit is ac543fe now.

Sure, but in the test you select GCC explicitly. What's the point of the generic CToolchain if you know you're getting an LLVMCToolchain or GCCCToolchain toolchain because you ask for one of those directly by type. There is perhaps more abstraction than needed.

jsirois · 2018-07-24T17:49:47Z

src/python/pants/backend/native/config/environment.py

+class CToolchain(datatype([('c_compiler', CCompiler), ('c_linker', Linker)])): pass
+
+
+class LLVMCToolchain(datatype([('c_toolchain', CToolchain)])): pass


I'm still missing the need for indirection in CToolchain and CppToolchain variants. Why not just subtype directly?:

class CToolchain(datatype([('c_compiler', CCompiler), ('c_linker', Linker)])): pass

class LLVMCToolchain(CToolchain): pass

class GCCCToolchain(CToolchain): pass

Because then I can't figure out how to do e.g. this in python_native_code.py:

@rule(CToolchain, [Select(PythonNativeCode)])
def select_c_toolchain_for_local_dist_compilation(python_native_code):
  return Get(LLVMCToolchain, NativeToolchain, python_native_code.native_toolchain)

@rule(CppToolchain, [Select(PythonNativeCode)])
def select_cpp_toolchain_for_local_dist_compilation(python_native_code):
  return Get(LLVMCppToolchain, NativeToolchain, python_native_code.native_toolchain)

because afaik the engine matches on an Exactly constraint when type-checking @rule return types. Just now, when trying the edits in your comment and changing python_native_code.py to the above, I get this error message excerpt when running ./pants test tests/python/pants_test/backend/python/tasks:python_native_code_testing -- -vsk test_python_create_platform_specific_distribution:

E ExecutionError: Received unexpected Throw state(s):
E Computing Select(<pants.backend.python.subsystems.python_native_code.PythonNativeCode object at 0x11141ad90>, =SetupPyNativeTools)
E Noop(No task was available to compute the value.)

cc @stuhood: is it correct to say this is happening because of an Exactly type constraint on rule results? Or is there something else I'm missing?

I assume by return Get(... you mean yield Get(.... Gets are only meant to work with yields.

Yep. This will give a more useful error post #5788... but currently returning the wrong type from a rule triggers that Noop behaviour.

Correct -- I had tried both, and yield Get(...) also fails, with a StopIteration error, but I had neglected to mention that above.
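
The difference between the two designs under an exact-type check can be illustrated as follows. This assumes, as stated in the thread, that a rule's result must be exactly the declared type; satisfies_exactly is a hypothetical helper, not the engine's Exactly constraint, and namedtuple stands in for datatype.

```python
from collections import namedtuple


class CToolchain(namedtuple('CToolchain', ['c_compiler', 'c_linker'])):
    pass


class SubclassedLLVMCToolchain(CToolchain):
    """The 'just subtype directly' alternative from the review comment."""


class WrappedLLVMCToolchain(namedtuple('WrappedLLVMCToolchain', ['c_toolchain'])):
    """The wrapper form from the diff: holds a plain CToolchain as a field."""


def satisfies_exactly(declared_type, value):
    # Exact match: a subclass instance does not count.
    return type(value) is declared_type
```

Under that assumption, a rule declared to produce CToolchain can hand back wrapped.c_toolchain, but not an instance of the subclass, even though isinstance would accept it.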

jsirois · 2018-07-24T17:57:10Z

src/python/pants/backend/python/subsystems/python_native_code.py

+# TODO: could this kind of @rule be automatically generated?
+@rule(SetupPyNativeTools, [Select(CToolchain), Select(CppToolchain), Select(Platform)])
+def get_setup_py_native_tools(c_toolchain, cpp_toolchain, platform):
+  yield SetupPyNativeTools(


This rule was removed entirely in 9846475!

jsirois · 2018-07-24T18:03:29Z

src/python/pants/backend/python/subsystems/python_native_code.py

+    # only creating an environment at the very end.
+    native_tools = self.setup_py_native_tools
+    if native_tools:
+      # TODO: an as_tuple() method for datatypes could make this destructuring cleaner!


I think this is actually a sign you should eliminate the middleman. See comment below RE SetupPyNativeTools rule.

I have so far ended up keeping the SetupPyExecutionEnvironment for now, but removing all the rules associated with it. This is because the right environment is difficult to create simply by composing together other executable environments (for now) -- but I did expand the comment here and remove the TODO in 7ebfe51.

jsirois · 2018-07-24T18:06:28Z

src/python/pants/backend/python/subsystems/python_native_code.py

+  """
+
+
+# TODO: could this kind of @rule be automatically generated?


Any time you have a rule like this I think it's a sign you don't need a rule like this - for 2 reasons:

You're likely creating a middleman wrapper struct - if so, let the folks needing the struct components ask for them directly.

You're just running non-io python code - this is generally an abuse of parallelism and inefficient, just run that code where you need to run it, not off in a rule.

This was a very helpful comment! Along these lines, I was actually able to remove all the rules from the python backend in 9846475 and request LLVMCToolchain/etc directly.


cosmicexplorer · 2018-07-24T20:40:39Z

In response to the above and other comments on individual lines in the diff, I was able to remove all of the rules from the python backend. I would strongly appreciate not having to mix concerns of the execution model with the @rule interface -- I'm really digging the use as a generic typed dependency injection facility. I will see if it would be reasonable to intersperse a non-futurized/parallelized execution model at some point in the future into the one we have right now to avoid having to run certain logic in the coroutine execution model, if only because, incredibly, that is something I was thinking a lot about before I ever started working on pants.

Eric-Arellano

Looks Py3 compliant to me.


cosmicexplorer added 7 commits July 22, 2018 18:07

introduce CToolchain and CppToolchain to pair related tools

f330514

make the rest of the native backend subsystem testing work

9e36cd0

make python_dist() compilation work with the new native backend

b0e7b52

fix python dist integration tests

8193ca9

move argument generation into the executable objects themselves

1c0c880

make everything work on osx

6e3bcbc

remove/update TODOs/FIXMEs

9a16efb

cosmicexplorer requested review from stuhood, jsirois, benjyw and CMLivingston July 23, 2018 07:43

cosmicexplorer added needs-cherrypick native labels Jul 23, 2018

cosmicexplorer added this to the 1.9.x milestone Jul 23, 2018

remove unnecessary RootRules

2bc7698

cosmicexplorer mentioned this pull request Jul 23, 2018

move extension/modification of process argv in the native backend into Executable subclassing instead of in each task #5951

Closed

jsirois reviewed Jul 23, 2018

stuhood reviewed Jul 23, 2018

CMLivingston reviewed Jul 23, 2018

cosmicexplorer added 7 commits July 23, 2018 14:39

refactor ParseSearchDirs

900a476

add FIXME

78fe179

refactor gcc subsystem to use globs instead of guessing platform-specific directories

e67044c

remove GCCCCompiler, GCCCLinker, etc

3441e74

inject CToolchain and CppToolchain and select which with other @rules

6843579

centralize -mmacosx-version-min=10.11 argument creation

b2a750b

also introduce `LinkerMixin` as per review comments

clarify some comments on the clang args we add

973f26e

cosmicexplorer added 3 commits July 23, 2018 20:56

split off indexing into BinaryTool archives using the new ArchiveFileMapper

3051235

remove -nostdinc from clang args

09526d2

make searching gcc BinaryTool paths work on osx

8f34e37

cosmicexplorer added 3 commits July 23, 2018 21:21

fix up docstring formatting

8fd7baa

add the environment to the compiler or linker invocation so we can reproduce errors more easily

e5a5d7e

only add --gcc-toolchain argument to clang and clang++ on linux

f4a8a7d

This was referenced Jul 24, 2018

Add basic native task unit tests. #6179

Merged

Fix pydist native sources selection #6205

Merged

CMLivingston reviewed Jul 24, 2018

jsirois approved these changes Jul 24, 2018

cosmicexplorer added 3 commits July 24, 2018 12:55

remove all python backend rules

9846475

remove TODO in SetupPyExecutionEnvironment

7ebfe51

remove CToolchain/CppToolchain rules and add explanatory comment in environment.py

ac543fe

cosmicexplorer force-pushed the refactor-llvm-gcc-toolchain-selection branch from 4f5194c to ac543fe Compare July 24, 2018 20:32

CMLivingston approved these changes Jul 24, 2018

Eric-Arellano approved these changes Jul 24, 2018

cosmicexplorer merged commit 2fc57cb into pantsbuild:master Jul 24, 2018

cosmicexplorer removed the needs-cherrypick label Aug 9, 2018

	deps
	.then(move |deps_result| match deps_result {
	Ok(deps) => externs::call(&externs::val_for(&func.0), &deps),
	Err(failure) => Err(failure),
	})
	.then(move |task_result| match task_result {
	Ok(val) => {
	if externs::satisfied_by(&context.core.types.generator, &val) {
	Self::generate(context, entry, val)
	} else {
	ok(val)

		class CToolchain(datatype([('c_compiler', CCompiler), ('c_linker', Linker)])): pass


		class LLVMCToolchain(datatype([('c_toolchain', CToolchain)])): pass

		"""


		# TODO: could this kind of @rule be automatically generated?
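
The wrapper-type pattern in the docstring fragment above can be sketched in plain Python. This is a minimal illustration, not the code from the PR: `collections.namedtuple` stands in for the pants `datatype` helper, the field names on `CCompiler` and `Linker` are assumptions, and `select_c_toolchain` is a hypothetical function playing the role of the `@rule` that unwraps a vendor-specific toolchain.

```python
from collections import namedtuple


# Hypothetical stand-ins for the Executable types; field names are assumptions.
class CCompiler(namedtuple('CCompiler', ['exe_filename', 'extra_args'])): pass
class Linker(namedtuple('Linker', ['exe_filename', 'extra_args'])): pass


# A compiler is always paired with the linker configured to work with it.
class CToolchain(namedtuple('CToolchain', ['c_compiler', 'c_linker'])): pass


# Thin wrappers that record which vendor produced the toolchain.
class LLVMCToolchain(namedtuple('LLVMCToolchain', ['c_toolchain'])): pass
class GCCCToolchain(namedtuple('GCCCToolchain', ['c_toolchain'])): pass


def select_c_toolchain(wrapper):
  """Hypothetical unwrapping step: return the CToolchain inside either wrapper."""
  if not isinstance(wrapper, (LLVMCToolchain, GCCCToolchain)):
    raise TypeError('expected a wrapped C toolchain, got {!r}'.format(wrapper))
  return wrapper.c_toolchain


# Illustrative values only; the argument below is the one named in the commit list.
llvm = LLVMCToolchain(CToolchain(
  c_compiler=CCompiler('clang', ('-mmacosx-version-min=10.11',)),
  c_linker=Linker('clang', ('-mmacosx-version-min=10.11',)),
))
toolchain = select_c_toolchain(llvm)
```

Because the compiler and linker only ever travel together inside a `CToolchain`, a consumer cannot obtain one without the other, which is the property the PR description asks of the real rules.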
