Add Split QuickGelu Fusion #20344
base: main
Conversation
// Trying to find Split->QuickGelu->Mul Path
std::vector<const Node::EdgeEnd*> edges;
std::vector<graph_utils::EdgeEndToMatch> quickgelu_mul_path{
    {0, 0, "QuickGelu", {1, 11, 13}, kOnnxDomain},
The activation can be another elementwise operator instead of QuickGelu; maybe we can add a TODO here to support other operators.
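For reference, a minimal PyTorch sketch of the computation the matched Split -> QuickGelu -> Mul path performs (assumptions: the split is into two equal halves along the last axis, the activation uses the usual QuickGelu alpha of 1.702, and which half feeds the activation is illustrative rather than taken from this diff):

import torch

class SplitQuickGeluMul(torch.nn.Module):
    # Illustrative module: Split -> QuickGelu -> Mul, the path the matcher above looks for.
    def forward(self, x):
        a, b = x.chunk(2, dim=-1)                   # Split along the last axis into two halves
        quick_gelu = b * torch.sigmoid(1.702 * b)   # QuickGelu(b) = b * sigmoid(1.702 * b)
        return a * quick_gelu                       # Mul of the untouched half with the activated half

# e.g. an input of shape (76, 54, 1368) produces an output of shape (76, 54, 684)

Any other elementwise activation (Gelu, Sigmoid, ...) could sit in the QuickGelu slot, which is what the suggested TODO would cover.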
"fused " + split_node.Name() + " and " + quickgelu_node.Name() + " and " + mul.Name() + " into SplitQuickGelu"; | ||
|
||
std::string op_type = "SplitQuickGelu"; | ||
Node& fused_node = graph.AddNode(graph.GenerateNodeName(op_type), |
nit: add the transformer name into the node's name to make future debugging easier, so it is clear which graph transformer generated the node.
…/add_split_quickgelu_fusion
# inp = torch.randn((76, 54, 1368), device='cuda:0')
# out = temp(inp)

from onnxruntime.training.ortmodule import ORTModule
Check warning (Code scanning / lintrunner): RUFF/E402, module-import-not-at-top-of-file. See https://docs.astral.sh/ruff/rules/module-import-not-at-top-of-file
# out = temp(inp)

from onnxruntime.training.ortmodule import ORTModule
import time
Check warning (Code scanning / lintrunner): RUFF/E402, module-import-not-at-top-of-file. See https://docs.astral.sh/ruff/rules/module-import-not-at-top-of-file
def compare_torch_ort_perf(inp, num_steps=1000):
    model2 = ORTModule(temp)
    ort_temp_out = model2(inp)
Check warning (Code scanning / lintrunner): RUFF/F841, unused-variable. See https://docs.astral.sh/ruff/rules/unused-variable
print("Current input shape:", inp.shape) | ||
torch.cuda.synchronize() | ||
start = time.time() | ||
for i in range(num_steps): |
Check warning (Code scanning / lintrunner): RUFF/B007, unused-loop-control-variable. See https://docs.astral.sh/ruff/rules/unused-loop-control-variable
print("Total time torch:", time.time() - start) | ||
torch.cuda.synchronize() | ||
start = time.time() | ||
for i in range(num_steps): |
Check warning (Code scanning / lintrunner): RUFF/B007, unused-loop-control-variable. See https://docs.astral.sh/ruff/rules/unused-loop-control-variable
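A possible cleanup of the benchmark helper that would address the E402, F841, and B007 warnings above (a sketch only: temp is assumed to be the torch.nn.Module built earlier in the test script, and the loop bodies stand in for whatever the original loops run):

import time

import torch
from onnxruntime.training.ortmodule import ORTModule  # imports at module top address E402


def compare_torch_ort_perf(model, inp, num_steps=1000):
    # Time num_steps forward passes through the plain torch module and its ORTModule wrapper.
    ort_model = ORTModule(model)
    ort_model(inp)  # warm-up call; not binding the result avoids the F841 unused-variable warning
    print("Current input shape:", inp.shape)

    torch.cuda.synchronize()
    start = time.time()
    for _ in range(num_steps):  # underscore loop variable addresses B007
        model(inp)
    torch.cuda.synchronize()
    print("Total time torch:", time.time() - start)

    torch.cuda.synchronize()
    start = time.time()
    for _ in range(num_steps):
        ort_model(inp)
    torch.cuda.synchronize()
    print("Total time ORTModule:", time.time() - start)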
Description
Add Split QuickGelu Fusion
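One rough way to confirm the fusion fires is to look for the fused op type in a saved optimized ONNX graph (a sketch; the file path is a placeholder, and whether the node appears depends on the execution provider and optimization settings):

import onnx

def count_split_quickgelu_nodes(model_path):
    # Count fused SplitQuickGelu nodes in a saved (optimized) ONNX model.
    model = onnx.load(model_path)
    return sum(node.op_type == "SplitQuickGelu" for node in model.graph.node)

# print(count_split_quickgelu_nodes("optimized_model.onnx"))  # placeholder path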
Motivation and Context