Implement CUDA IsInf-10,20 #19772

Merged
merged 7 commits into from
Mar 5, 2024
Conversation

yuslepukhin
Member

Description

Implement IsInf-10,20 for CUDA.
Add FP16 types on CPU as well.
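
For readers unfamiliar with the operator, here is a minimal host-side sketch of the element-wise semantics, assuming the ONNX `IsInf` attributes `detect_positive` and `detect_negative` (both enabled by default). It is not the code from this PR: the helper names `IsInfFloat`, `IsInfHalfBits`, and `IsInfElementwise` are hypothetical, and FP16 values are represented here by their raw 16-bit IEEE 754 pattern rather than by the onnxruntime half type. A CUDA kernel would apply the same per-element predicate, one thread per element.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not the onnxruntime implementation).
// Returns true when v is an infinity whose sign is selected by the attributes.
inline bool IsInfFloat(float v, bool detect_positive, bool detect_negative) {
  if (!std::isinf(v)) return false;  // finite values and NaN are never infinite
  return v > 0.0f ? detect_positive : detect_negative;
}

// Hypothetical helper for FP16 given as raw bits.
// IEEE 754 binary16 layout: 1 sign bit, 5 exponent bits, 10 mantissa bits.
// Infinity has all exponent bits set and a zero mantissa (0x7C00 / 0xFC00).
inline bool IsInfHalfBits(std::uint16_t bits, bool detect_positive,
                          bool detect_negative) {
  if ((bits & 0x7FFFu) != 0x7C00u) return false;  // finite or NaN
  const bool negative = (bits & 0x8000u) != 0;
  return negative ? detect_negative : detect_positive;
}

// Element-wise application over a float tensor flattened into a vector.
inline std::vector<bool> IsInfElementwise(const std::vector<float>& x,
                                          bool detect_positive = true,
                                          bool detect_negative = true) {
  std::vector<bool> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = IsInfFloat(x[i], detect_positive, detect_negative);
  }
  return out;
}
```

Checking the FP16 bit pattern directly avoids a conversion to `float`, which is one plausible way to handle half precision when no native half arithmetic is available; the actual PR may take a different approach.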

Motivation and Context

Certain models underperform because IsInf is not available on CUDA.

xadupre
xadupre previously approved these changes Mar 5, 2024
@yuslepukhin yuslepukhin marked this pull request as ready for review March 5, 2024 17:53
liqunfu
liqunfu previously approved these changes Mar 5, 2024
tianleiwu
tianleiwu previously approved these changes Mar 5, 2024
@yuslepukhin yuslepukhin dismissed stale reviews from tianleiwu and liqunfu via 94c0da7 March 5, 2024 19:17
@yuslepukhin yuslepukhin merged commit 1e78bce into main Mar 5, 2024
95 checks passed
@yuslepukhin yuslepukhin deleted the yuslepukhin/isinf_20 branch March 5, 2024 21:33
zz002 pushed a commit to zz002/onnxruntime that referenced this pull request Mar 7, 2024
@snnn
Member

snnn commented Mar 11, 2024

#17724

5 participants