Skip to content

Commit

Permalink
AMD ROCm support (#5)
Browse files Browse the repository at this point in the history
* added rocm support

* added rocm build workflow

* rocm version

* updated readme

* rocm 5.6

* remove unnecessary packages

* fix for rocm6.0

* fix

* fix extra_compile_args
  • Loading branch information
IlyasMoutawwakil authored Jan 26, 2024
1 parent 4bb0c02 commit 3aed4bf
Show file tree
Hide file tree
Showing 5 changed files with 246 additions and 92 deletions.
131 changes: 122 additions & 9 deletions .github/workflows/build.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -32,11 +32,11 @@ jobs:
const script = require('.github/workflows/scripts/github_create_release.js')
await script(github, context, core)
build_wheels:
name: Build AWQ
build_cuda_wheels:
name: Build AWQ with CUDA
runs-on: ${{ matrix.os }}
needs: release

strategy:
matrix:
os: [ubuntu-20.04, windows-latest]
Expand All @@ -48,7 +48,7 @@ jobs:
env:
PYPI_CUDA_VERSION: "12.1.1"
CUDA_VERSION: ${{ matrix.cuda }}

steps:
- name: Free Disk Space
uses: jlumbroso/free-disk-space@v1.3.0
Expand All @@ -61,7 +61,7 @@ jobs:
large-packages: false
docker-images: true
swap-storage: false

- uses: actions/checkout@v3

- uses: actions/setup-python@v3
Expand All @@ -78,7 +78,7 @@ jobs:
use-mamba: true
add-pip-as-python-dependency: true
auto-activate-base: false

- name: Install Dependencies
run: |
# Install CUDA toolkit
Expand All @@ -87,7 +87,7 @@ jobs:
# Env variables
$env:CUDA_PATH = $env:CONDA_PREFIX
$env:CUDA_HOME = $env:CONDA_PREFIX
# Install torch
$cudaVersion = $env:CUDA_VERSION.Replace('.', '')
$cudaVersionPytorch = $cudaVersion.Substring(0, $cudaVersion.Length - 1)
Expand All @@ -113,9 +113,122 @@ jobs:
}
python setup.py sdist bdist_wheel
- name: Upload Assets
uses: shogo82148/actions-upload-release-asset@v1
with:
upload_url: ${{ needs.release.outputs.upload_url }}
asset_path: ./dist/*.whl

build_rocm_wheels:
name: Build AWQ with ROCm
runs-on: ${{ matrix.os }}
needs: release

strategy:
matrix:
os: [ubuntu-20.04]
python: ["3.8", "3.9", "3.10", "3.11"]
rocm: ["5.6.1", "5.7.1"] # we build only for rocm5.6 & 5.7 to match PyTorch 2.1.0 and PyTorch 2.2 nightly
defaults:
run:
shell: bash
env:
ROCM_VERSION: ${{ matrix.rocm }}

steps:
- uses: actions/checkout@v3

- name: Free Disk Space
run: |
df -h
echo "Removing large packages"
sudo apt-get remove -y '^dotnet-.*'
sudo apt-get remove -y 'php.*'
sudo apt-get remove -y azure-cli google-chrome-stable firefox powershell mono-devel
df -h
sudo apt-get autoremove -y >/dev/null 2>&1
sudo apt-get clean
sudo apt-get autoremove -y >/dev/null 2>&1
sudo apt-get autoclean -y >/dev/null 2>&1
df -h
echo "https://github.com/actions/virtual-environments/issues/709"
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
df -h
echo "remove big /usr/local"
sudo rm -rf "/usr/local/share/boost"
sudo rm -rf /usr/local/lib/android >/dev/null 2>&1
df -h
sudo rm -rf /usr/share/dotnet/sdk > /dev/null 2>&1
sudo rm -rf /usr/share/dotnet/shared > /dev/null 2>&1
sudo rm -rf /usr/share/swift > /dev/null 2>&1
df -h
- uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python }}

- name: Setup Mamba
uses: conda-incubator/setup-miniconda@v2.2.0
with:
activate-environment: "build"
python-version: ${{ matrix.python }}
mamba-version: "*"
use-mamba: false
channels: conda-forge,defaults
channel-priority: true
add-pip-as-python-dependency: true
auto-activate-base: false

- name: Set up ROCm
run: |
echo "Using python:"
python --version
which python
if [[ "${{ matrix.rocm }}" == "5.4.2" ]]; then
export ROCM_DL_FILE=amdgpu-install_5.4.50402-1_all.deb
elif [[ "${{ matrix.rocm }}" == "5.6.1" ]]; then
export ROCM_DL_FILE=amdgpu-install_5.6.50601-1_all.deb
elif [[ "${{ matrix.rocm }}" == "5.7.1" ]]; then
export ROCM_DL_FILE=amdgpu-install_5.7.50701-1_all.deb
else
echo Unknown rocm version
exit 1
fi
curl -O https://repo.radeon.com/amdgpu-install/${{ matrix.rocm }}/ubuntu/focal/$ROCM_DL_FILE
sudo dpkg -i $ROCM_DL_FILE
sudo DEBIAN_FRONTEND=noninteractive amdgpu-install --usecase=rocm --no-dkms --no-32 -y
- name: Install Dependencies
run: |
sudo apt-get update
sudo apt-get install -y --no-install-recommends rocsparse-dev rocthrust-dev rocblas-dev hipblas-dev hipsparse-dev
python -m pip install --upgrade build setuptools wheel
if [[ "${{ matrix.rocm }}" == "5.7.1" ]]; then
echo "Using PyTorch nightly"
python -m pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/rocm5.7
elif [[ "${{ matrix.rocm }}" == "5.6.1" ]]; then
echo "Using PyTorch stable"
python -m pip install torch --index-url https://download.pytorch.org/whl/rocm5.6
else
echo Unknown rocm version for python install
exit 1
fi
- name: Build Wheel
run: |
echo "Using python for build:"
python --version
which python
ROCM_VERSION=${{ matrix.rocm }} python setup.py sdist bdist_wheel
- name: Upload Assets
uses: shogo82148/actions-upload-release-asset@v1
with:
upload_url: ${{ needs.release.outputs.upload_url }}
asset_path: ./dist/*.whl
asset_path: ./dist/*.whl
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -159,3 +159,6 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

*hip*
!hip_compact.hip
21 changes: 19 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,21 +5,38 @@ AutoAWQ Kernels is a new package that is split up from the [main repository](htt
## Requirements

- Windows: Must use WSL2.
- GPU: Must be compute capability 7.5 or higher.
- CUDA Toolkit: Must be 11.8 or higher.

- NVIDIA:
- GPU: Must be compute capability 7.5 or higher.
- CUDA Toolkit: Must be 11.8 or higher.
- AMD:
- ROCm: Must be 5.6 or higher.

## Install

### Install from PyPi

The package is available on PyPi with CUDA 12.1.1 wheels:

```
pip install autoawq-kernels
```

### Install release wheels

For ROCm and other CUDA versions, you can use the wheels published at each [release](https://github.com/casper-hansen/AutoAWQ_kernels/releases/):

```
pip install https://github.com/casper-hansen/AutoAWQ_kernels/releases/download/v0.0.2/autoawq_kernels-0.0.2+rocm561-cp310-cp310-linux_x86_64.whl
```

### Build from source
You can also build from source:

```
git clone https://github.com/casper-hansen/AutoAWQ_kernels
cd AutoAWQ_kernels
pip install -e .
```

To build for ROCm, you need to first install the following packages `rocsparse-dev hipsparse-dev rocthrust-dev rocblas-dev hipblas-dev`.
4 changes: 2 additions & 2 deletions awq_ext/exllama/hip_compat.cuh
Original file line number Diff line number Diff line change
Expand Up @@ -9,9 +9,9 @@ __device__ __forceinline__ __half __compat_hrcp(__half x) {
static_cast<_Float16>(__builtin_amdgcn_rcph(static_cast<__half_raw>(x).data))};
}

// ROCm 6.0 compatible from: /opt/rocm-6.0.0/include/hip/amd_detail/amd_hip_fp16.h:1708
__device__ __forceinline__ __half2 __compat_h2rcp(__half2 x) {
return _Float16_2{static_cast<_Float16>(__builtin_amdgcn_rcph(x.x)),
static_cast<_Float16>(__builtin_amdgcn_rcph(x.y))};
return _Float16_2{_Float16_2{static_cast<_Float16>(1.0f), static_cast<_Float16>(1.0f)} / x.data};
}

#define hrcp __compat_hrcp
Expand Down
Loading

0 comments on commit 3aed4bf

Please sign in to comment.