Update BOLT #92

Merged
merged 38 commits into from
Dec 7, 2020

Conversation

shintaro-iwasaki
Collaborator

To fff1abc406d56401f37a1ef4431583f2e75b5039

jprotze and others added 30 commits December 4, 2020 09:32
OpenMP 5.1 adds an extra enum entry for ompt_scope_t, which makes the related
switch statement incomplete.
Also adding cases for newly added barrier variants.

Differential Revision: https://reviews.llvm.org/D90758

cherry-pick: 96eaacc917a2998e733c8141ca6713f88fd2333c
llvm/llvm-project@96eaacc
This patch adds omp_calloc implementation according to OpenMP 5.1
specification.

Differential Revision: https://reviews.llvm.org/D90967

cherry-pick: 938f1b858104956e1e2d298bbf774a54554e4a9f
llvm/llvm-project@938f1b8
Differential Revision: https://reviews.llvm.org/D91478

cherry-pick: 9bcef58b63776c490fd902290f0efc580e3970bc
llvm/llvm-project@9bcef58
…quential part

This introduces the new `ARCHER_OPTIONS` flag `ignore_serial=0|1` to disable
analysis and logging of memory accesses in the sequential part of the OpenMP
application.

In the sequential part of an OpenMP program no data race is possible, unless
there is non-OpenMP concurrency (such as pthreads, MPI, ...). For the latter
reason, this is not active by default.

Besides reducing the runtime overhead for the sequential part of the program,
this reduces the memory overhead for sequential initialization. In combination
with `flush_shadow=1`, this can allow analysis of applications that run close
to the limit of available memory but only access smaller parts of shared
memory during each OpenMP parallel region.

A problem with this approach is that Archer only becomes active when the OpenMP
runtime is initialized, which might be after the serial initialization of the
application. In that case, it helps to call, for example, `omp_get_max_threads()`
at the beginning of main.

Differential Revision: https://reviews.llvm.org/D90473

cherry-pick: fdc9dfc8e47750fa27b7e7bde2f62c9af68b99e5
llvm/llvm-project@fdc9dfc
cherry-pick: 8647c669a4a3193558ce0f2f398ffe04b80ad886
llvm/llvm-project@8647c66
This patch adds omp_realloc function implementation according to
OpenMP 5.1 specification.

Differential Revision: https://reviews.llvm.org/D90971

cherry-pick: 5439db05e74044a239c0fd37f8594b6b67dd3c02
llvm/llvm-project@5439db0
Differential Revision: https://reviews.llvm.org/D91105

cherry-pick: 44a11c342caa70efe9f9d07db3e66dd48f701aca
llvm/llvm-project@44a11c3
Summary:
This patch adds support for passing in the original declaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped; otherwise it takes the name of the variable declaration as a fallback. The information is passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;"

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D89802

cherry-pick: 97e55cfef5b86b1b190b6f3f57ca2a89ec61c14f
llvm/llvm-project@97e55cf
Summary:
This patch adds basic support for printing the source location and names for the mapped variables. This patch does not support names for custom mappers. This is based on D89802.

Reviewers: jdoerfert

Differential Revision: https://reviews.llvm.org/D90172

cherry-pick: 5378c6a4bf9422db0e2d6232dc2dc43222d0ab6b
llvm/llvm-project@5378c6a
Summary:
Add support for passing source locations to libomptarget runtime functions using the ident_t struct present in the rest of the libomp API. This will allow the runtime system to give much more insightful error messages and debugging values.

Reviewers: jdoerfert grokos

Differential Revision: https://reviews.llvm.org/D87946

cherry-pick: da8bec47ab8c755c8272ecf79d80c887bca8b781
llvm/llvm-project@da8bec4
This patch is the runtime support for https://reviews.llvm.org/D84192.

In order not to modify the tgt_target_data_update information but still be
able to pass the extra information for a non-contiguous map item (offset,
count, and stride for each dimension), this patch overloads arg when
the maptype is set as OMP_TGT_MAPTYPE_DESCRIPTOR. The original arg is for
passing the pointer information; the overloaded arg, however, is an
array of descriptor_dim:

```
struct descriptor_dim {
  int64_t offset;
  int64_t count;
  int64_t stride;
};
```

and the array size is the dimension size. In addition, since we
have count and stride information in descriptor_dim, we can replace/overload the
arg_size parameter by using dimension size.
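
For illustration only, here is a minimal sketch of how an array of descriptor_dim could be walked to list the positions it covers. The helper name `enumerate_positions` is hypothetical, and the rule that dimension d contributes `(offset + i) * stride` to a linear position is an assumption made for this example; the runtime may interpret the fields differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct descriptor_dim {
  int64_t offset;
  int64_t count;
  int64_t stride;
};

/* Hypothetical helper, not part of libomptarget.
 * ASSUMPTION: for i in [0, count), dimension `dim` adds
 * (offset + i) * stride to the linear position.
 * Stores up to `cap` positions in `out` and returns the running total of
 * positions visited; call it with dim = 0, base = 0, n = 0. */
size_t enumerate_positions(const struct descriptor_dim *dims, int ndims,
                           int dim, int64_t base, int64_t *out, size_t cap,
                           size_t n) {
  assert(dims != NULL && out != NULL && ndims > 0);
  if (dim == ndims) { /* past the innermost dimension: record the position */
    if (n < cap)
      out[n] = base;
    return n + 1;
  }
  for (int64_t i = 0; i < dims[dim].count; ++i) {
    int64_t pos = base + (dims[dim].offset + i) * dims[dim].stride;
    n = enumerate_positions(dims, ndims, dim + 1, pos, out, cap, n);
  }
  return n;
}
```

Under that assumption, the descriptors `{1, 2, 6}` and `{1, 2, 2}` applied to a row-major array with 6 elements per row select rows 1 and 2 and columns 2 and 4, that is, four positions in total.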

Reviewed By: grokos, tianshilei1992

Differential Revision: https://reviews.llvm.org/D82245

cherry-pick: 7036fe8a0cffcefaa542f6dde756b7aa2f9c91b5
llvm/llvm-project@7036fe8
Patch by tlwilmar (Terry Wilmarth)

Differential Revision: https://reviews.llvm.org/D91189

cherry-pick: 9cfad5f9c5bfd985f1bc8b0954f58013c5236e58
llvm/llvm-project@9cfad5f
This reverts commit 9cfad5f9c5bfd985f1bc8b0954f58013c5236e58.

cherry-pick: 5644f734d6068f6e75ecd9856e5f837190543667
llvm/llvm-project@5644f73
Adjusted external reference for Darwin/AARCH64 link compatibility.
Made size directive conditional only if __ELF__ defined.

Patch by Michael_Pique <mpique@icloud.com>

Differential Revision: https://reviews.llvm.org/D88252

cherry-pick: 7b5254223acbf2ef9cd278070c5a84ab278d7e5f
llvm/llvm-project@7b52542
OpenMP 5.1 introduces the new env variable
OMP_TOOL_VERBOSE_INIT=(disabled|stdout|stderr|<filename>) to enable verbose
loading and initialization of OMPT tools.
This env variable helps to understand the cause when loading of a tool fails
(e.g., undefined symbols or a dependency not in LD_LIBRARY_PATH).
Output of OMP_TOOL_VERBOSE_INIT is added for OMP_DISPLAY_ENV.
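
As a rough sketch of the four accepted forms, the value could be classified as shown here. The enum and the function `classify_verbose_init` are hypothetical names and not the runtime's code. Matching is case-sensitive and an unset or empty value is treated as disabled; both are simplifying assumptions.

```c
#include <assert.h>
#include <string.h>

enum verbose_init_target {
  VERBOSE_DISABLED,
  VERBOSE_STDOUT,
  VERBOSE_STDERR,
  VERBOSE_FILE
};

/* Hypothetical classifier for the value of OMP_TOOL_VERBOSE_INIT.
 * ASSUMPTIONS: case-sensitive keywords; NULL or "" means disabled.
 * On VERBOSE_FILE, *filename points at the given value; otherwise NULL. */
enum verbose_init_target classify_verbose_init(const char *value,
                                               const char **filename) {
  assert(filename != NULL);
  *filename = NULL;
  if (value == NULL || value[0] == '\0' || strcmp(value, "disabled") == 0)
    return VERBOSE_DISABLED;
  if (strcmp(value, "stdout") == 0)
    return VERBOSE_STDOUT;
  if (strcmp(value, "stderr") == 0)
    return VERBOSE_STDERR;
  *filename = value; /* anything else is taken as a file name */
  return VERBOSE_FILE;
}
```

A caller would typically pass the result of `getenv("OMP_TOOL_VERBOSE_INIT")`, which is NULL when the variable is unset.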

Tests for this patch are integrated into the different existing tool loading
tests, making these tests more verbose. An Archer specific verbose test is
integrated into an existing Archer test.

Patch prepared by: Isabel Thärigen

Differential Revision: https://reviews.llvm.org/D91464

cherry-pick: b281a05dacb485d3c3c9cc7f7f5e8fb858ac67bc
llvm/llvm-project@b281a05
This is an alternative approach to address inconsistencies pointed out in: D90078
This patch makes sure that the return address is reset when leaving the scope.
In some cases, I had to move the macro out of an if-statement to have it in the
right scope; in other cases I added an additional block to restrict the scope.

This patch does not handle inconsistencies that might occur if the return
address is still set when we call into the application.

Test case (repeated_calls.c) provided by @hbae

Differential Revision: https://reviews.llvm.org/D91692

cherry-pick: 6d3b81664a4b79b32ed2c2f46b21ab0dca9029cc
llvm/llvm-project@6d3b816
Commit https://reviews.llvm.org/rG7b5254223acbf2ef9cd278070c5a84ab278d7e5f
broke the build for some architectures, because macro KMP_PREFIX_UNDERSCORE
was defined only for x86, x86_64 and aarch64. This patch defines it for other
architectures (as a no-op).

Differential Revision: https://reviews.llvm.org/D92027

cherry-pick: 9e3e332d273b80b5167ac35f8dcfa7178e45c5e9
llvm/llvm-project@9e3e332
[libomptarget][cuda] Detect missing symbols in plugin at build time

Passes -z,defs to the linker. Error on unresolved symbol references.

Otherwise, those unresolved symbols present as target code running on the host
as the plugin fails to load. This is significantly harder to debug than a link
time error. Flag matches that passed by amdgcn and ve plugins.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D92143

cherry-pick: 89a0f48c58f82262c7ce2b9ca51ffad0ffc559ea
llvm/llvm-project@89a0f48
The test had a chance to finish the first task before the second task is
created. In this case, the dependences-pair event would not trigger.

cherry-pick: cdf9401df84ef382467d1ca1c1c458c11fd6043a
llvm/llvm-project@cdf9401
The test would fail with gcc when built with the debug flag.

cherry-pick: 723be4042a3aa38523c60b1dd96b20448053c41e
llvm/llvm-project@723be40
Since __kmp_task_finish is not executed for proxy tasks,
move mutexinoutset dependency code to __kmp_release_deps,
which is executed for all task kinds.

Differential Revision: https://reviews.llvm.org/D92326

cherry-pick: f6f28b44ad48e35d1300693d9c34f47782b519a4
llvm/llvm-project@f6f28b4
…on SIGTERM

With the change to using shared memory, there were a few problems that needed to be fixed.
- The previous filename that was used for SHM only used the process id. Given that the process id
  is usually a 16-bit number, this was causing some conflicts on machines. Thus we add
  the UID to the name to prevent this.
- It appears that under some conditions (SIGTERM, etc.) the shared memory files were not getting
  cleaned up. Added a call to clean up the shm files under those conditions. For this, the user
  needs to set the environment variable KMP_HANDLE_SIGNALS to true.
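
The naming change in the first item can be sketched as follows. The helper `make_shm_name` and the `/example_shm_` prefix are hypothetical and only show the idea of combining the process id with the UID; the name the runtime actually uses is not given here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: builds a shared-memory object name from both the
 * process id and the user id, so that two users whose processes happen to
 * share a pid do not collide. The prefix is made up for this example.
 * Returns the name length, or -1 if the buffer is too small. */
int make_shm_name(char *buf, size_t len, long pid, long uid) {
  assert(buf != NULL && len > 0);
  int n = snprintf(buf, len, "/example_shm_%ld_%ld", pid, uid);
  if (n < 0 || (size_t)n >= len)
    return -1;
  return n;
}
```

On a POSIX system the caller would pass `(long)getpid()` and `(long)getuid()` from `<unistd.h>`.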

Patch by Erdner, Todd <todd.erdner@intel.com>

Differential Revision: https://reviews.llvm.org/D91869

cherry-pick: 9615890db576721fbd73ae77d81d39435e83b4b4
llvm/llvm-project@9615890
cherry-pick: fd3d1b09c12f1419292172627dbca9929f0daf39
llvm/llvm-project@fd3d1b0
Added UNLIKELY hint to one-time or rarely executed branches.
This improves performance of the library on some tasking benchmarks.
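
For readers unfamiliar with such hints, here is a minimal sketch of the technique on a one-time initialization branch. It assumes GCC or Clang, where `__builtin_expect` is available, and falls back to a plain expression elsewhere. The macro definition and the `lookup_square` function are illustrative and are not taken from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hint: tells the compiler the condition is expected to be false. */
#if defined(__GNUC__) || defined(__clang__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

static int table[16];
static int table_ready = 0;

/* Hypothetical lookup with a branch that is taken only on the first call.
 * Not thread-safe; the point is only where the hint goes. */
int lookup_square(size_t i) {
  assert(i < 16);
  if (UNLIKELY(!table_ready)) { /* one-time initialization */
    for (size_t k = 0; k < 16; ++k)
      table[k] = (int)(k * k);
    table_ready = 1;
  }
  return table[i];
}
```

The hint does not change behavior; it only lets the compiler lay out the common path as the fall-through case.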

Differential Revision: https://reviews.llvm.org/D92322

cherry-pick: 6bf84871e9382fe7bde1117194bc15abb2b09f68
llvm/llvm-project@6bf8487
These changes add support for Intel's umonitor/umwait usage in wait
code, for architectures that support those intrinsic functions. Usage of
umonitor/umwait is off by default, but can be turned on by setting the
KMP_USER_LEVEL_MWAIT environment variable.

Differential Revision: https://reviews.llvm.org/D91189

cherry-pick: e0665a9050840650809fa4eb6ef23bd8f5adfbf0
llvm/llvm-project@e0665a9
Removes MaxParallelLevel references from rtl.cpp and drops
resulting dead code.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D92463

cherry-pick: afc09c6fe44ecf99e5946b7fe08013f592504448
llvm/llvm-project@afc09c6
[libomptarget][amdgpu] Address compiler warnings, drive by fixes

Initialize some variables, remove unused ones.
Changes the debug printing condition to align with the aomp test suite.

Differential Revision: https://reviews.llvm.org/D92559

cherry-pick: ae9d96a656a17fa782ccaa9ba10d4570f497e855
llvm/llvm-project@ae9d96a
This patch enables use of the entry on Windows.

Differential Revision: https://reviews.llvm.org/D92618

cherry-pick: c4a22224d97120df55f5d1d753e7f839fdb1da2f
llvm/llvm-project@c4a2222
JonChesterfield and others added 8 commits December 7, 2020 14:30
cherry-pick: f628eef98acd24f8eb6a52d67ee887bb18f04bca
llvm/llvm-project@f628eef
D91692 missed various locations in kmp_gsupport, where the scope for
OMPT_STORE_RETURN_ADDRESS is too narrow, i.e. the scope ends before the OMPT
callback is called in some nested function.

This patch fixes the scoping issue, so that all OMPT tests pass when the
tests are built with gcc.

Differential Revision: https://reviews.llvm.org/D92121

cherry-pick: a148216b31292e52c0229dae98f52d3b2c350400
llvm/llvm-project@a148216
Check the pointer returned by strchr, as it can be NULL if the input string
is malformed. Introduced new function __kmp_str_loc_numbers
for fast parsing of only the numbers in the location string.
Also cleaned up the __kmp_str_loc_init declaration and usage:
- changed type of init_fname parameter to bool;
- changed input from true to false in places where fname is not used.
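
The kind of check described above can be sketched like this. The function `parse_loc_numbers` is a hypothetical stand-in and not the actual __kmp_str_loc_numbers. The layout `;file;func;line;col;;` is assumed here for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a numbers-only location parser.
 * ASSUMED layout: ";file;func;line;col;;".
 * Returns 0 on success, -1 if the string has too few fields. */
int parse_loc_numbers(const char *psource, int *line, int *col) {
  assert(line != NULL && col != NULL);
  *line = 0;
  *col = 0;
  if (psource == NULL)
    return -1;
  const char *p = psource;
  /* Skip the leading ';' and the two text fields: three separators. */
  for (int i = 0; i < 3; ++i) {
    p = strchr(p, ';');
    if (p == NULL) /* malformed input: never dereference a NULL result */
      return -1;
    ++p;
  }
  *line = atoi(p);
  p = strchr(p, ';');
  if (p == NULL)
    return -1;
  *col = atoi(p + 1);
  return 0;
}
```

Skipping the text fields without copying them is what makes a numbers-only parse cheaper than a full one.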

Differential Revision: https://reviews.llvm.org/D90962

cherry-pick: 22558c8501eaf5e7547ee13fa5a009efdec6dc90
llvm/llvm-project@22558c8
cherry-pick: fff1abc406d56401f37a1ef4431583f2e75b5039
llvm/llvm-project@fff1abc
@shintaro-iwasaki
Collaborator Author

The test coverage of BOLT is the same as that of the official LLVM OpenMP: https://jenkins-pmrs.cels.anl.gov/job/bolt-llvmproj-review-centos/14/.

I will merge this PR.

@shintaro-iwasaki shintaro-iwasaki merged commit 8f81ee5 into pmodels:main Dec 7, 2020