
[DML EP] Add DML implementation for BiasGelu #13795

Merged: 2 commits into main from user/pavignol/add-dml-bias-gelu on Dec 1, 2022

Conversation

@PatriceVignola (Contributor)

Description

Add DML implementation for BiasGelu

@sumitsays (Contributor)

ADD1 only optimizes the Relu and PRelu activations at the shader level. For all other operators it dispatches a separate shader, which is equivalent to executing the decomposed form of BiasGelu. So I don't think it will add any performance benefit.

@PatriceVignola (Contributor, Author)

> ADD1 only optimizes the Relu and PRelu activations at the shader level. For all other operators it dispatches a separate shader, which is equivalent to executing the decomposed form of BiasGelu. So I don't think it will add any performance benefit.

Sure, but it still won't be worse than calling both operators separately. And the current behavior is that it falls back to the CPU, which is very bad. We could also optimize the Gelu activation into ADD1 if necessary in the future.
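For context on the decomposed form being discussed: the BiasGelu contrib op is an elementwise Add followed by the exact (erf-based) Gelu. A minimal NumPy sketch of that reference semantics, assuming the erf-based Gelu rather than the tanh approximation used by FastGelu (function and variable names here are illustrative, not the DML kernel itself):

```python
import numpy as np
from scipy.special import erf

def bias_gelu_reference(x: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Decomposed BiasGelu: elementwise Add, then exact (erf-based) Gelu."""
    y = x + bias                                      # the Add half
    return 0.5 * y * (1.0 + erf(y / np.sqrt(2.0)))    # Gelu(y)

# Example: a 1-D bias broadcast over the last axis of the input.
x = np.random.randn(2, 4).astype(np.float32)
bias = np.random.randn(4).astype(np.float32)
print(bias_gelu_reference(x, bias))
```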

@sumitsays (Contributor) previously approved these changes on Dec 1, 2022 and left a comment:


Thank you.

@fdwr (Contributor) left a comment:


Pat: Do you know why it now falls back to the CPU, given (I thought anyway, the last time we looked, Sumit) there was a functional decomposition to Add and Gelu, which called DML? Maybe I misremember, but this is a general concern: if DML is priority #1 in the execution provider list, any decomposable operators should go to it first, rather than to a fused CPU version. I had some emails with Scott McKay long ago about this that I'll dredge up. Maybe there's a bug elsewhere to fix, and we can delete this temporary kernel later.
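As an aside on the provider-list priority mentioned above: in the ONNX Runtime Python API, priority is simply the order of the providers list passed to the session, so DML is tried before the CPU fallback when it is listed first. A small sketch (the model path is hypothetical):

```python
import onnxruntime as ort

# Providers are tried in list order, so DML has priority over the CPU fallback here.
sess = ort.InferenceSession(
    "model_with_biasgelu.onnx",   # hypothetical model containing a BiasGelu node
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())       # providers actually enabled for this session
```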

@PatriceVignola (Contributor, Author)

> Pat: Do you know why it now falls back to the CPU, given (I thought anyway, the last time we looked, Sumit) there was a functional decomposition to Add and Gelu, which called DML? Maybe I misremember, but this is a general concern: if DML is priority #1 in the execution provider list, any decomposable operators should go to it first, rather than to a fused CPU version. I had some emails with Scott McKay long ago about this that I'll dredge up. Maybe there's a bug elsewhere to fix, and we can delete this temporary kernel later.

I'm not familiar with how op decomposition works. In this case, it's not even a "fusion": BiasGelu is hardcoded in the ONNX model. Does decomposition work in this case? And where would I find the logic in the code?

@PatriceVignola PatriceVignola merged commit e9b92fd into main Dec 1, 2022
@PatriceVignola PatriceVignola deleted the user/pavignol/add-dml-bias-gelu branch December 1, 2022 17:23
@sumitsays (Contributor)

> Pat: Do you know why it now falls back to the CPU, given (I thought anyway, the last time we looked, Sumit) there was a functional decomposition to Add and Gelu, which called DML? Maybe I misremember, but this is a general concern: if DML is priority #1 in the execution provider list, any decomposable operators should go to it first, rather than to a fused CPU version. I had some emails with Scott McKay long ago about this that I'll dredge up. Maybe there's a bug elsewhere to fix, and we can delete this temporary kernel later.

> I'm not familiar with how op decomposition works. In this case, it's not even a "fusion": BiasGelu is hardcoded in the ONNX model. Does decomposition work in this case? And where would I find the logic in the code?

No such decomposition happens. If `BiasGelu` is hardcoded, then we need a dedicated `BiasGelu` kernel registration to make sure it won't fall back to the CPU.
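One way to confirm whether the node still falls back is to run the model with verbose session logging and check which execution provider each node was assigned to during graph partitioning. A rough sketch, assuming a model that contains the BiasGelu node (the file name is hypothetical):

```python
import onnxruntime as ort

so = ort.SessionOptions()
so.log_severity_level = 0   # VERBOSE; partitioning logs include per-node provider assignment

# With a dedicated DML BiasGelu kernel registered, the node should stay on DML
# instead of being assigned to the CPU execution provider.
sess = ort.InferenceSession(
    "model_with_biasgelu.onnx",   # hypothetical
    sess_options=so,
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
```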

@fdwr (Contributor) commented on Dec 2, 2022:

> No such decomposition happens. If `BiasGelu` is hardcoded, then we need a dedicated `BiasGelu` kernel registration to make sure it won't fall back to the CPU.

Alrighty. I must have been thinking of one of the many other *elu's then, like Selu.

| Function | Since version | Function version |
| --- | --- | --- |
| Celu | 12 | 12 |
| Elu | 6, 1 | 18 |
| PRelu | 16, 9, 7, 6, 1 | 16 |
| Relu | 14, 13, 6, 1 | 18 |
| Selu | 6, 1 | 18 |

henrywu2019 pushed a commit to henrywu2019/onnxruntime that referenced this pull request Dec 26, 2022
### Description
Add DML implementation for BiasGelu