Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cmake for AMDGPU platform #13801

Merged
merged 10 commits into from
Nov 23, 2018
Merged

Conversation

sabreshao
Copy link
Contributor

No description provided.

sabreshao and others added 3 commits October 10, 2018 07:27
Enable whole archieve build for pybind library.

Disable two warning.

Rollback to C++11.

[Don't merge!]
Link RCCL to WA gpu kernel loading issue.

Update eigen to fix build failure.

Add more include directories.

Fix O3 build failure.

Update eigen.

fix tensor_util_test segment fault issue

add more macro check in hip.cmake.
we may consider refine hip.cmake to inherit all add_definitions() in parrent scope, in the future.

Signed-off-by: carlushuang <carlus.huang@amd.com>

Fix rocRAND load.

Update eigen to fix gru_unit_op and reduce_op.

Add HIP support to testing.

Update eigen to support int16 and int8 in arg min and arg max.

Conflicts:
	paddle/fluid/pybind/CMakeLists.txt

Conflicts:
	paddle/fluid/pybind/CMakeLists.txt
Signed-off-by: carlushuang <carlus.huang@amd.com>

Conflicts:
	paddle/fluid/operators/CMakeLists.txt
@sabreshao
Copy link
Contributor Author

@dzhwinter @luotao1

Conflicts:
	paddle/fluid/operators/CMakeLists.txt
@Xreki Xreki added the AMD label Oct 31, 2018
Conflicts:
	CMakeLists.txt
Copy link
Contributor

@Xreki Xreki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for so many questions, because we don't have an AMD GPU, we cannot test the PR.

@@ -17,7 +17,7 @@ if(WITH_AMD_GPU)
extern_eigen3
${EXTERNAL_PROJECT_LOG_ARGS}
GIT_REPOSITORY "https://github.com/sabreshao/hipeigen.git"
GIT_TAG 0cba03ff9f8f9f70bbd92ac5857b031aa8fed6f9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change the GIT_TAG?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two reasones:

  1. Rebase to the eigen used by CUDA build.
  2. Several commits to the repo to fix unittest failure such as test_gru_unit_op and test_reduce_op.

GIT_TAG 5bd41b96ab8d8343330fb2c3e1b96775bde3b3fc
PREFIX ${ROCPRIM_SOURCE_DIR}
UPDATE_COMMAND ""
CMAKE_ARGS -DCMAKE_CXX_COMPILER=/opt/rocm/bin/hcc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since rocprim is a header-only library, does the set of cxx compiler here have an actual effect? Do the codes which include rocprim also need to be compiled by hcc?

I think it's better not use an absolute path here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some header files are generated in building, though compilation is not needed.

@@ -0,0 +1,39 @@
if (NOT WITH_AMD_GPU)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please add some comment to introduce the package rocprim here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.


add_dependencies(rocprim extern_rocprim)

LIST(APPEND external_project_dependencies rocprim)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is not needed, because rocprim is not used in Paddle v2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

target_link_libraries(${TARGET_NAME} ${hip_library_DEPS})
if("${hip_library_DEPS}" MATCHES "ARCHIVE_START")
# Support linking flags: --whole-archive (Linux) / -force_load (MacOS).
# WARNING: Please don't use ARCHIVE_START&ARCHIVE_END if TARGET_NAME will be linked by other libraries.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not suggested to use whole-archive here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a WA towards HIP/HCC toolchain, some symbol will be missing without "whole-archive".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paddle Fluid uses USE_OP to ensure all operators linked, maybe whole-archive here can be removed in the same way. This can be done in the future. By the way, please format the codes in hip_library.

cmake/hip.cmake Outdated
set(HIP_HCC_FLAGS "${HIP_HCC_FLAGS} -fPIC -DPADDLE_WITH_HIP -std=c++11" )

if(NOT WITH_PYTHON)
set(HIP_HCC_FLAGS "${HIP_HCC_FLAGS} -DPADDLE_NO_PYTHON")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where to use HIP_HCC_FLAGS and where to use hcc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HIP/HCC toolchain will refer the flag HIP_HCC_FLAGS. When a libarry/executable is marked as hip_library/hip_exe, source files with postfix .cu will be compiled by HIP/HCC toolchain.

DEPS ${PYBIND_DEPS}
${GLOB_OP_LIB})
DEPS ARCHIVE_START ${PYBIND_DEPS}
${GLOB_OP_LIB} ARCHIVE_END)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add ARCHIVE_START here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a WA towards HIP/HCC toolchain, some symbol will be missing without "whole-archive".

@@ -254,6 +254,9 @@ op_library(cross_entropy_op DEPS cross_entropy)
if(WITH_GPU)
op_library(softmax_with_cross_entropy_op DEPS cross_entropy softmax cub)
op_library(sequence_softmax_op DEPS cub)
elseif(WITH_AMD_GPU)
op_library(softmax_with_cross_entropy_op DEPS cross_entropy softmax rocprim)
op_library(sequence_softmax_op DEPS rocprim)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The header file is not included in the operators ".cc", so why need to depend on rocprim here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of including cub, HIP/HCC build will refer to the header files of rocprim.

@sabreshao
Copy link
Contributor Author

@Xreki Can you have a look?

Conflicts:
	paddle/fluid/operators/CMakeLists.txt
	paddle/fluid/pybind/CMakeLists.txt
	paddle/testing/paddle_gtest_main.cc
@carlushuang
Copy link

@Xreki Hi please take a look at this PR. The Cmake system is at the root of enabling AMDGPU support, and later AMDGPU related code have a strong dependency on this one.

Copy link
Contributor

@Xreki Xreki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please teel me how to update the version of hip? My build failed due to the following error:

CMake Error at /usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:148 (message):                                                                                               
  Could NOT find HIP: Found unsuitable version "1.5.18073", but required is
  at least "1.5.18263" (found /opt/rocm)

target_link_libraries(${TARGET_NAME} ${hip_library_DEPS})
if("${hip_library_DEPS}" MATCHES "ARCHIVE_START")
# Support linking flags: --whole-archive (Linux) / -force_load (MacOS).
# WARNING: Please don't use ARCHIVE_START&ARCHIVE_END if TARGET_NAME will be linked by other libraries.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paddle Fluid uses USE_OP to ensure all operators linked, maybe whole-archive here can be removed in the same way. This can be done in the future. By the way, please format the codes in hip_library.

cmake/hip.cmake Outdated
set(HIP_HCC_FLAGS "${HIP_HCC_FLAGS} -fPIC -DPADDLE_WITH_HIP -std=c++11" )

if(NOT WITH_PYTHON)
set(HIP_HCC_FLAGS "${HIP_HCC_FLAGS} -DPADDLE_NO_PYTHON")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, PADDLE_NO_PYTHON is only used in legacy Paddle, so there is no need to add this line. So as WITH_TIMER and WITH_DOUBLE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless flag will be removed. Can you point out Paddle coding style, since pre-commit hook doesn't touch cmakefile?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For cmake file, just need to keep the right indentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@carlushuang
Copy link

carlushuang commented Nov 22, 2018

@Xreki Hi

CMake Error at /usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
Could NOT find HIP: Found unsuitable version "1.5.18073", but required is
at least "1.5.18263" (found /opt/rocm)

For this question, let my sync some environment settings for yours.

  1. Please type in following command to check the required hip version:
/opt/rocm/bin/hipconfig --version
/opt/rocm/hip/bin/hipconfig --version

Please check the print is "1.5.18263".

  1. /opt/rocm/hip/lib/cmake/hip/hip-config-version.cmake
    check the "set(PACKAGE_VERSION "1.5.18263")" if it is "1.5.18263"

  2. For possible compatibility, please check is there multiple "hipconfig" or multiple hip-config-version.cmake install on your build system, in case with different version. I do think it 's because the build system has installed multiple version of rocm in the system

Conflicts:
	CMakeLists.txt
Copy link
Contributor

@Xreki Xreki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Xreki Xreki merged commit 61c5f13 into PaddlePaddle:develop Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants