Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic native task unit tests. #6179

Conversation

jsirois
Copy link
Contributor

@jsirois jsirois commented Jul 18, 2018

Running a ./pants lint across the whole repo revealed an issue
accessing a non-existant self.linker.platform attribute in
LinkSharedLibraries. Add basic unit tests to exercise task execution
with a full cache miss and then a full hit.

@stuhood stuhood added this to the 1.9.x milestone Jul 18, 2018
@jsirois jsirois force-pushed the link_shared_libraries/fix_invalid_platform_ref branch from 55a4f4e to 1b013e2 Compare July 18, 2018 19:27
@jsirois
Copy link
Contributor Author

jsirois commented Jul 18, 2018

NB: When I wrote these tests there were 2 issues, self.linker.platform in LinkSharedLibraries and self.workunit_name in NativeCompile. Creating this PR forced a rebase which revealed these 2 issues have been fixed in the meantime. I think this stands on its own though since there is a missing product dep declaration and the tests caught both prior errors.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 19, 2018

This is hitting:

                             rc = process.wait()
                             if rc != 0:
                               workunit.set_outcome(WorkUnit.FAILURE)
                               raise self.LinkSharedLibrariesError(
                                 "Error linking native objects with command {} for request {}. Exit code was: {}."
                     >           .format(cmd, link_request, rc))
                     E         LinkSharedLibrariesError: Error linking native objects with command [u'g++', u'-shared', u'-o', u'/tmp/tmpNSxjs5_BUILD_ROOT/.pants.d/pants_backend_native_tasks_link_shared_libraries_LinkSharedLibraries/5f0164f78d1c/src.cpp.test.test/current/libtest.so', u'/tmp/tmpNSxjs5_BUILD_ROOT/.pants.d/cpp_compile/4826ac0c340a/src.cpp.test.test/current/test.o'] for request LinkSharedLibraryRequest(linker=Linker(path_entries=[u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/bin', u'/home/travis/.cache/pants/bin/binutils/linux/x86_64/2.30/binutils/bin', u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/bin', u'/home/travis/.cache/pants/bin/binutils/linux/x86_64/2.30/binutils/bin', u'/home/travis/.cache/pants/bin/binutils/linux/x86_64/2.30/binutils/bin'], exe_filename=g++, library_dirs=[u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/lib/gcc/x86_64-pc-linux-gnu/7.3.0', u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/lib/gcc', u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/lib64', u'/lib64', u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/lib', u'/lib', u'/usr/lib', u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/lib/gcc/x86_64-pc-linux-gnu/7.3.0', u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/lib/gcc', u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/lib64', u'/lib64', u'/home/travis/.cache/pants/bin/gcc/linux/x86_64/7.3.0/gcc/lib', u'/lib', u'/usr/lib']), object_files=[u'/tmp/tmpNSxjs5_BUILD_ROOT/.pants.d/cpp_compile/4826ac0c340a/src.cpp.test.test/current/test.o'], native_artifact=NativeArtifact(lib_name=test), output_dir=/tmp/tmpNSxjs5_BUILD_ROOT/.pants.d/pants_backend_native_tasks_link_shared_libraries_LinkSharedLibraries/5f0164f78d1c/src.cpp.test.test/current, external_libs_info=<pants.backend.native.tasks.native_external_library_fetch.NativeExternalLibraryFiles object at 0x7f762dc3b690>). Exit code was: 1.
                     
                     pants/backend/native/tasks/link_shared_libraries.py:201: LinkSharedLibrariesError
                     -------------- Captured stderr call --------------
                      98% .................................................  242190 KB 6.910s
                     100% .................................................. 39181 KB 1.384s
                     
                     Starting workunit cpp-compile
                     
                     Starting workunit link-shared-libraries
                     /home/travis/.cache/pants/bin/binutils/linux/x86_64/2.30/binutils/bin/ld: cannot find crti.o: No such file or directory
                     /home/travis/.cache/pants/bin/binutils/linux/x86_64/2.30/binutils/bin/ld: cannot find -lm
                     collect2: error: ld returned 1 exit status

I'll dig a bit on the crti.o.

@jsirois
Copy link
Contributor Author

jsirois commented Jul 19, 2018

I find:

$ docker run --rm -it ubuntu:trusty bash -c 'apt-get update && apt-get install -y lib32stdc++6 lib32z1 lib32z1-dev gcc-multilib python-dev openssl libssl-dev && find $(sudo ldconfig -v 2>/dev/null | grep -v ^$'\t' | cut -d: -f1) -name crti.o'
...
./usr/lib/x86_64-linux-gnu/crti.o
./usr/libx32/crti.o
./usr/lib32/crti.o

And afaict, the existing search algorithm only consults this (outside amending the search path manually via flags):

$ docker run --rm -it ubuntu:trusty bash -c 'apt-get update && apt-get install build-essential && gcc -print-search-dirs'
...
install: /usr/lib/gcc/x86_64-linux-gnu/4.8/
programs: =/usr/lib/gcc/x86_64-linux-gnu/4.8/:/usr/lib/gcc/x86_64-linux-gnu/4.8/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.8/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../x86_64-linux-gnu/bin/x86_64-linux-gnu/4.8/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../x86_64-linux-gnu/bin/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../x86_64-linux-gnu/bin/
libraries: =/usr/lib/gcc/x86_64-linux-gnu/4.8/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../x86_64-linux-gnu/lib/x86_64-linux-gnu/4.8/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../x86_64-linux-gnu/lib/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../x86_64-linux-gnu/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/4.8/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../lib/:/lib/x86_64-linux-gnu/4.8/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/4.8/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../x86_64-linux-gnu/lib/:/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../:/lib/:/usr/lib/

That has none of the right answers. Just /lib, /usr/lib and a bunch of **/gcc/* dirs. Since systems can be configured so differently, it seems to me there is no way around consulting the host system ld.so cache.

@cosmicexplorer I think you've banged your head against this a good deal. Anything obvious you think I'm missing re searching for crti.o?

@cosmicexplorer
Copy link
Contributor

Gah, I was trying really hard to not make this a problem to inflict on anyone else. Indeed #5998 fixed the linker.platform issue (and this PR seems really necessary to avoid that happening again)! I will look into this after making an issue to track followup work for the 1.9.0rc0 release.

@cosmicexplorer
Copy link
Contributor

cosmicexplorer commented Jul 19, 2018

@jsirois I'm not certain how the ld.so cache would help in locating crti.o, except if you mean that it is probably co-located with libc.so (which seems reasonable)? EDIT: sorry, didn't read your command closely enough, I see that now.

It looks like the output of that command contains /usr/lib/x86_64-linux-gnu/ in the libraries paths, which (1) is where I was seeing crti.o get resolved to before (2) is where you're seeing it resolved to now. It seems there was an additional error introduced when linking shared libraries -- the error you're seeing should not be occurring if we tell the linker to produce a shared object (I'm pretty sure) -- crti.o is libc information needed to create an executable entry point. I will repro this with your command and fix it.

@cosmicexplorer
Copy link
Contributor

Ok, I'm 90% sure that we shouldn't be seeing this error with -shared turned on, but I'm not sure about that. If that's the case, then that is another thing to fix. Out for the next few hours but fixing this is my goal for the release today.

cosmicexplorer added a commit that referenced this pull request Jul 20, 2018
### Problem

Trying to release `1.9.0rc0` failed, and pypi won't accept another release to the same version. #6192 will have more information about what happened here after we can get this release out.

### Solution

Make a noop release for `1.9.0rc1`.

### Result

We finally have a release candidate for 1.9.0. I plan to fast-follow with a hotfix release `rc2` containing the changes from both #6201 and #6179, as these both describe bugged functionality that would make it very difficult to use `python_dist()` and the new C/C++ targets.
cosmicexplorer added a commit that referenced this pull request Jul 20, 2018
### Problem

Trying to release `1.9.0rc0` failed, and pypi won't accept another release to the same version. #6192 will have more information about what happened here after we can get this release out.

### Solution

Make a noop release for `1.9.0rc1`.

### Result

We finally have a release candidate for 1.9.0. I plan to fast-follow with a hotfix release `rc2` containing the changes from both #6201 and #6179, as these both describe bugged functionality that would make it very difficult to use `python_dist()` and the new C/C++ targets.
@cosmicexplorer
Copy link
Contributor

So after doing a little more research, I'm convinced I was mistaken on the purpose of crti.o. While the documentation on any of this is of course severely lacking, I think I had misconstrued mentions of "global constructors" or "init" as strictly referring to a main function, and therefore that crti.o would be necessary specifically for an executable, but not a shared library. My vague half-understanding as of now is that crti.o provides hooks which are necessary for any ELF object -- since this appears to be true for any shared object file anyone has ever distributed, my concern that this would somehow link the produced shared object to the specific version of libc providing crti.o on the host which produces the shared library seems unfounded, or workaroundable.

I believe part of the reason for my confusion is that the "init" facility provided by crti.o is used in executables for linking up the main method to executable invocation (this is a complete guess). Another alternative interpretation is that crti.o is necessary regardless, and hooking up the main method is totally separate. The resources that led to me to this tentative conclusion are this page on the OSDev Wiki (which does not provide much information about what crti.o is for) and this stackoverflow response (which does).

(this information should be cataloged in an issue somewhere, probably, or maybe a README in the native backend. i'll take responsibility for doing that)

Regardless of the correct interpretation, it now seems (more) clear that crti.o is indeed necessary for creating shared object files, so I will now attempt to figure out why it's not being found. As mentioned above, parsing the output of g++ -print-search-dirs actually seems to be finding where crti.o should be on travis -- but obviously this isn't working, so I will figure out why that is happening. This error appears to be the only thing blocking #6205 as well, so this planned hotfix 1.9.0rc2 should be good to go after this is figured out.

@cosmicexplorer
Copy link
Contributor

cosmicexplorer commented Jul 24, 2018

This travis run on my fork of this PR, rebased onto #6217, is now magically passing everything, because we fix the linker provided to LinkSharedLibraries.

Also, this could resolve #5934 when merged.

cosmicexplorer added a commit that referenced this pull request Jul 24, 2018
…toolchain selection (#6217)

### Problem

#5951 explains the problem addressed by moving CLI arguments to individual `Executable` objects -- this reduces greatly the difficulty in generating appropriate command lines for the executables invoked. In this PR, it can be seen to remove a significant amount of repeated boilerplate.

Additionally, we weren't distinguishing between a `Linker` to link the compiled object files of `gcc` or `g++` vs `clang` or `clang++`. We were attempting to generate a linker object which would work with *any of* `gcc`, `g++`, `clang`, or `clang++`, and this wasn't really feasible. Along with the above, this made it extremely difficult and error-prone to generate correct command lines / environments for executing the linker, which led to e.g. not being able to find `crti.o` (as one symptom addressed by this problem).

### Solution

- Introduce `CToolchain` and `CppToolchain` in `environment.py`, which can be generated from `LLVMCToolchain`, `LLVMCppToolchain`, `GCCCToolchain`, or `GCCCppToolchain`. These toolchain datatypes are created in `native_toolchain.py`, where a single `@rule` for each ensures that no C or C++ compiler that someone can request was made without an accompanying linker, which will be configured to work with the compiler.
- Introduce the `extra_args` property to the `Executable` mixin in `environment.py`, which `Executable` subclasses can just declare a datatype field named `extra_args` in order to override. This is used in `native_toolchain.py` to ensure platform-specific arguments and environment variables are set in the same `@rule` which produces a paired compiler and linker -- there is a single place to look at to see where all the process invocation environment variables and command-line arguments are set for a given toolchain.
- Introduce the `ArchiveFileMapper` subsystem and use it to declare sets of directories to resolve within our BinaryTool archives `GCC` and `LLVM`. This subsystem allows globbing (and checks that there is a unique expansion), which makes it robust to e.g. platform-specific paths to things like include or lib directories.

### Result

Removes several FIXMEs, including heavily-commented parts of `test_native_toolchain.py`. Partially addresses #5951 -- `setup_py.py` still generates its own execution environment from scratch, and this could be made more hygienic in the future. As noted in #6179 and #6205, this PR seems to immediately fix the CI failures in those PRs.
@jsirois jsirois force-pushed the link_shared_libraries/fix_invalid_platform_ref branch 3 times, most recently from 660c600 to 28c24d9 Compare July 25, 2018 02:29
@jsirois jsirois changed the title Fix native tasks. Add basic native task unit tests. Jul 25, 2018
@jsirois
Copy link
Contributor Author

jsirois commented Jul 25, 2018

@cosmicexplorer please take another look. Now green rebased on top of your fix with 1 added test for the C compilation task.

Running a `./pants lint` across the whole repo revealed an issue
accessing a non-existant `self.linker.platform` attribute in
`LinkSharedLibraries`. Add basic unit tests to exercise task execution
with a full cache miss and then a full hit.
@jsirois jsirois force-pushed the link_shared_libraries/fix_invalid_platform_ref branch from 28c24d9 to 5258fc9 Compare July 26, 2018 17:57
@jsirois
Copy link
Contributor Author

jsirois commented Jul 26, 2018

@cosmicexplorer please take a final look. Re-based on top a conflicting change from @illicitonion

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cosmicexplorer is under the weather today, but this looks sane to me. Thanks!

@jsirois jsirois merged commit e667ae7 into pantsbuild:master Jul 27, 2018
@jsirois
Copy link
Contributor Author

jsirois commented Jul 27, 2018

Thanks @stuhood this is in master. The cherry-picked to 1.9.x was not clean, so I'll leave that to you @cosmicexplorer to decide whether worth it or not. I suspect not since this turned into just adding tests with the original 2 production issues fixed by subsequent PRs.

@jsirois jsirois deleted the link_shared_libraries/fix_invalid_platform_ref branch July 27, 2018 01:14
cosmicexplorer added a commit that referenced this pull request Jul 28, 2018
…toolchain selection (#6217)

### Problem

#5951 explains the problem addressed by moving CLI arguments to individual `Executable` objects -- this reduces greatly the difficulty in generating appropriate command lines for the executables invoked. In this PR, it can be seen to remove a significant amount of repeated boilerplate.

Additionally, we weren't distinguishing between a `Linker` to link the compiled object files of `gcc` or `g++` vs `clang` or `clang++`. We were attempting to generate a linker object which would work with *any of* `gcc`, `g++`, `clang`, or `clang++`, and this wasn't really feasible. Along with the above, this made it extremely difficult and error-prone to generate correct command lines / environments for executing the linker, which led to e.g. not being able to find `crti.o` (as one symptom addressed by this problem).

### Solution

- Introduce `CToolchain` and `CppToolchain` in `environment.py`, which can be generated from `LLVMCToolchain`, `LLVMCppToolchain`, `GCCCToolchain`, or `GCCCppToolchain`. These toolchain datatypes are created in `native_toolchain.py`, where a single `@rule` for each ensures that no C or C++ compiler that someone can request was made without an accompanying linker, which will be configured to work with the compiler.
- Introduce the `extra_args` property to the `Executable` mixin in `environment.py`, which `Executable` subclasses can just declare a datatype field named `extra_args` in order to override. This is used in `native_toolchain.py` to ensure platform-specific arguments and environment variables are set in the same `@rule` which produces a paired compiler and linker -- there is a single place to look at to see where all the process invocation environment variables and command-line arguments are set for a given toolchain.
- Introduce the `ArchiveFileMapper` subsystem and use it to declare sets of directories to resolve within our BinaryTool archives `GCC` and `LLVM`. This subsystem allows globbing (and checks that there is a unique expansion), which makes it robust to e.g. platform-specific paths to things like include or lib directories.

### Result

Removes several FIXMEs, including heavily-commented parts of `test_native_toolchain.py`. Partially addresses #5951 -- `setup_py.py` still generates its own execution environment from scratch, and this could be made more hygienic in the future. As noted in #6179 and #6205, this PR seems to immediately fix the CI failures in those PRs.
cosmicexplorer pushed a commit that referenced this pull request Jul 28, 2018
Running a `./pants lint` across the whole repo revealed an issue
accessing a non-existant `self.linker.platform` attribute in
`LinkSharedLibraries`. Add basic unit tests to exercise task execution
with a full cache miss and then a full hit.
CMLivingston pushed a commit to CMLivingston/pants that referenced this pull request Aug 27, 2018
### Problem

Trying to release `1.9.0rc0` failed, and pypi won't accept another release to the same version. pantsbuild#6192 will have more information about what happened here after we can get this release out.

### Solution

Make a noop release for `1.9.0rc1`.

### Result

We finally have a release candidate for 1.9.0. I plan to fast-follow with a hotfix release `rc2` containing the changes from both pantsbuild#6201 and pantsbuild#6179, as these both describe bugged functionality that would make it very difficult to use `python_dist()` and the new C/C++ targets.
CMLivingston pushed a commit to CMLivingston/pants that referenced this pull request Aug 27, 2018
…toolchain selection (pantsbuild#6217)

### Problem

pantsbuild#5951 explains the problem addressed by moving CLI arguments to individual `Executable` objects -- this reduces greatly the difficulty in generating appropriate command lines for the executables invoked. In this PR, it can be seen to remove a significant amount of repeated boilerplate.

Additionally, we weren't distinguishing between a `Linker` to link the compiled object files of `gcc` or `g++` vs `clang` or `clang++`. We were attempting to generate a linker object which would work with *any of* `gcc`, `g++`, `clang`, or `clang++`, and this wasn't really feasible. Along with the above, this made it extremely difficult and error-prone to generate correct command lines / environments for executing the linker, which led to e.g. not being able to find `crti.o` (as one symptom addressed by this problem).

### Solution

- Introduce `CToolchain` and `CppToolchain` in `environment.py`, which can be generated from `LLVMCToolchain`, `LLVMCppToolchain`, `GCCCToolchain`, or `GCCCppToolchain`. These toolchain datatypes are created in `native_toolchain.py`, where a single `@rule` for each ensures that no C or C++ compiler that someone can request was made without an accompanying linker, which will be configured to work with the compiler.
- Introduce the `extra_args` property to the `Executable` mixin in `environment.py`, which `Executable` subclasses can just declare a datatype field named `extra_args` in order to override. This is used in `native_toolchain.py` to ensure platform-specific arguments and environment variables are set in the same `@rule` which produces a paired compiler and linker -- there is a single place to look at to see where all the process invocation environment variables and command-line arguments are set for a given toolchain.
- Introduce the `ArchiveFileMapper` subsystem and use it to declare sets of directories to resolve within our BinaryTool archives `GCC` and `LLVM`. This subsystem allows globbing (and checks that there is a unique expansion), which makes it robust to e.g. platform-specific paths to things like include or lib directories.

### Result

Removes several FIXMEs, including heavily-commented parts of `test_native_toolchain.py`. Partially addresses pantsbuild#5951 -- `setup_py.py` still generates its own execution environment from scratch, and this could be made more hygienic in the future. As noted in pantsbuild#6179 and pantsbuild#6205, this PR seems to immediately fix the CI failures in those PRs.
CMLivingston pushed a commit to CMLivingston/pants that referenced this pull request Aug 27, 2018
Running a `./pants lint` across the whole repo revealed an issue
accessing a non-existant `self.linker.platform` attribute in
`LinkSharedLibraries`. Add basic unit tests to exercise task execution
with a full cache miss and then a full hit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants