[Kernel] (2/N) Machete - Integrate into CompressedTensorsWNA16 and GPTQMarlin #7701

Merged on Sep 23, 2024 (27 commits)
Commits
458d69e
squash-patch changes
LucasWilkinson Jul 31, 2024
1ee3608
remove gptq support
LucasWilkinson Aug 30, 2024
ab7507e
formatting + fixes
LucasWilkinson Aug 30, 2024
68ff26d
add gptq_marlin support back
LucasWilkinson Aug 31, 2024
7b9e8b2
remove extra prints
LucasWilkinson Aug 31, 2024
30f1056
add machete act ordering
LucasWilkinson Sep 6, 2024
3bbb902
udpate heuristic
LucasWilkinson Sep 6, 2024
196a9f2
add to tests
LucasWilkinson Sep 6, 2024
38f5b84
update benchmark
LucasWilkinson Sep 6, 2024
c59449b
tweak for llama 405b
LucasWilkinson Sep 6, 2024
3048911
env var for disabling kernels
LucasWilkinson Sep 10, 2024
df7c4c0
format + mypy
LucasWilkinson Sep 11, 2024
6f3f707
yapf format
LucasWilkinson Sep 11, 2024
90b8e03
refactor
LucasWilkinson Sep 11, 2024
c264c7a
add g_idx back
LucasWilkinson Sep 11, 2024
2d25a9a
clean-up
LucasWilkinson Sep 11, 2024
62508c5
review comments
LucasWilkinson Sep 12, 2024
84cfdb2
fix codespell
LucasWilkinson Sep 12, 2024
c452a86
TorchDynamo Compatability
LucasWilkinson Sep 13, 2024
096dd4a
add permute cols opcheck
LucasWilkinson Sep 13, 2024
a98f691
fix correctness test
LucasWilkinson Sep 16, 2024
7c02bcf
bug in filtering kernels by compute capability
LucasWilkinson Sep 16, 2024
95a85c9
Merge remote-tracking branch 'origin/main' into lwilkinson/machete-en…
LucasWilkinson Sep 20, 2024
a019473
add requirements.txt
LucasWilkinson Sep 20, 2024
306b283
Merge branch 'main' into lwilkinson/machete-end2end
mgoin Sep 21, 2024
e32bfc5
[dbrx] refactor dbrx experts to extend FusedMoe class (#8518)
divakar-amd Sep 21, 2024
05752e9
[Kernel][Bugfix] Delete some more useless code in marlin_moe_ops.cu (…
tlrmchlsmth Sep 21, 2024
add permute cols opcheck
LucasWilkinson committed Sep 13, 2024
commit 096dd4af132f3171e4ea22ae59a71f503e7a0ebc
15 changes: 15 additions & 0 deletions tests/kernels/test_permute_cols.py
@@ -0,0 +1,15 @@
import pytest
import torch

from tests.kernels.utils import opcheck
from vllm._custom_ops import permute_cols


@pytest.mark.parametrize('shape', [(1, 512), (544, 4096), (67, 8192)])
@pytest.mark.parametrize('dtype', [torch.bfloat16, torch.float16])
def test_permute_cols(shape, dtype):
    x = torch.randn(shape, dtype=dtype).cuda()
    perm = torch.randperm(x.shape[1]).to(torch.int).cuda()
    opcheck(torch.ops._C.permute_cols, (x, perm))
    y = permute_cols(x, perm)
    torch.testing.assert_close(y, x[:, perm])
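
The final assertion in this test pins down what `permute_cols` is expected to compute: output column `j` is input column `perm[j]`, the same as `x[:, perm]`. The sketch below restates that contract in plain Python on nested lists. It is only an illustration of the semantics, not the CUDA kernel from this PR, and the name `permute_cols_reference` is hypothetical (it does not appear in the change).

```python
from typing import List, Sequence


def permute_cols_reference(rows: Sequence[Sequence[float]],
                           perm: Sequence[int]) -> List[List[float]]:
    """Hypothetical reference: out[i][j] = rows[i][perm[j]] (like x[:, perm])."""
    num_cols = len(perm)
    # perm must contain every column index exactly once.
    if sorted(perm) != list(range(num_cols)):
        raise ValueError("perm must be a permutation of the column indices")
    # Every row must have exactly as many columns as perm has entries.
    for row in rows:
        if len(row) != num_cols:
            raise ValueError("each row must have len(perm) columns")
    # Gather columns in the order given by perm, row by row.
    return [[row[j] for j in perm] for row in rows]


x = [[10.0, 11.0, 12.0], [20.0, 21.0, 22.0]]
perm = [2, 0, 1]
y = permute_cols_reference(x, perm)
```

Here `y` holds column 2 of `x` first, then column 0, then column 1, which is the gather direction the test checks with `x[:, perm]`.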
12 changes: 12 additions & 0 deletions vllm/_custom_ops.py
@@ -576,6 +576,18 @@ def machete_prepack_B(b_q_weight: torch.Tensor,
    return torch.ops._C.machete_prepack_B(b_q_weight, b_type)


# TODO: has to be a better way to do this
try:
    torch.ops._C.permute_cols  # noqa B018

    @torch.library.register_fake("_C::permute_cols")
    def _permute_cols_fake(a: torch.Tensor,
                           perm: torch.Tensor) -> torch.Tensor:
        return torch.empty_like(a)
except Exception:
    pass


def permute_cols(a: torch.Tensor, perm: torch.Tensor) -> torch.Tensor:
    return torch.ops._C.permute_cols(a, perm)
