
Update DirectML 1.5.1 to 1.8.0 for ORT1.10 #9765

Merged on Nov 20, 2021 (138 commits)


@fdwr fdwr commented Nov 16, 2021

Description:

Motivation and Context

  • Why is this change required? Bug fixes, perf improvements, additional int64 data type support.

fdwr and others added 30 commits September 22, 2020 19:08
MaskRCNN failed when `Cast` tried to execute `Xor` on an empty tensor (a zero in its dimensions). This is perfectly legal and should be treated as a nop.

Ultimately DML itself should treat this case as a nop, just as C's `memcpy` treats a count of 0 as a nop. For now I'm only addressing it in ORT, because enabling it in DML would affect more operators for consistency (we should probably add a flag to tensor validation incrementally so that operators can be opted in gradually).
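
As a rough illustration of the nop behavior described above, here is a minimal Python sketch (not the ORT or DML code). The names `is_empty_shape` and `run_elementwise` are hypothetical, and the tensor is modeled as a flat list alongside its shape.

```python
from math import prod

def is_empty_shape(shape):
    """True when the tensor holds no elements, i.e. any dimension is 0."""
    # prod(()) is 1, so a scalar (rank 0) is correctly treated as non-empty.
    return prod(shape) == 0

def run_elementwise(op, shape, data):
    """Hypothetical dispatch helper: apply `op` to every element.

    An empty tensor is treated as a nop, in the same spirit as
    memcpy with a count of 0: nothing is dispatched and nothing fails.
    """
    if is_empty_shape(shape):
        return []
    return [op(x) for x in data]

# An Xor-with-1 over an empty tensor does nothing instead of failing.
empty_result = run_elementwise(lambda x: x ^ 1, (0, 4), [])
full_result = run_elementwise(lambda x: x ^ 1, (3,), [0, 1, 0])
```

The guard lives in the caller here, which mirrors the choice in this commit of handling the case in ORT rather than inside DML.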

Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/5195850

Related work items: #27469839, #28761382
When used in ORT, a common method shouldn't copy and return initializer data

Related work items: #29514403
Tensors that contain 0-sized dimensions were being broadcast to higher dimensions, which made it impossible to remove them from the graph. A 0-sized dimension represents an empty tensor, so whatever operator needs to broadcast it shouldn't try to call into DML.
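
To see why broadcasting does not make an empty tensor any less empty, the sketch below computes a NumPy-style broadcast shape in plain Python. It assumes the usual trailing-dimension alignment rule; `broadcast_shape` is a hypothetical helper, not a function from the DML EP.

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    """Broadcast two shapes, aligned from the trailing dimension.

    Two dimensions are compatible when they are equal or one of them is 1.
    """
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(f"cannot broadcast {a} with {b}")
    return tuple(reversed(out))

# The 0-sized dimension survives the broadcast to rank 3,
# so the result is still an empty tensor.
result = broadcast_shape((0, 3), (2, 1, 3))
still_empty = 0 in result
```

Because the zero survives in the output shape, a check for emptiness can be made before any broadcast is attempted.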
This extends a workaround needed to match node inputs with Tensors to the EP code handling constant input upload.

This was causing issues in a couple of models, including EfficientDet, although that model still fails due to this bug:
https://microsoft.visualstudio.com/OS/_workitems/edit/29970551

Related work items: #29706035
GPU timeouts have already been disabled in command queues created by WinML, but not in the ones created by the DML EP within the ORT API.
…ut size

New validation [here](https://microsoft.visualstudio.com/DefaultCollection/WindowsAI/_git/WindowsAI/pullrequest/5354070?_a=files&path=%2Fdml%2FSharedValidation%2FDmlBatchNormalizationOperatorValidator.h) causes some BatchNorm cases to fail (e.g. OnnxConformanceTestsTaef::BatchNormalization (BatchNormalization_2x2x2)). I'm unsure how long this bug existed, but based on Nick's investigation, it apparently still worked anyway.

Related work items: #27678610
Update 8D BatchNorm

Related work items: #27678610
0 is a valid value in Tile's "repeats" parameter. The CPU kernel handles it fine, so the DML EP should too.
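
A small sketch of the shape arithmetic involved, assuming the ONNX Tile rule that each output dimension is the input dimension multiplied by the matching entry in "repeats". `tile_output_shape` is a hypothetical helper written for illustration only.

```python
def tile_output_shape(input_shape, repeats):
    """Output shape of Tile: output_dim[i] = input_dim[i] * repeats[i]."""
    if len(input_shape) != len(repeats):
        raise ValueError("repeats must have one entry per input dimension")
    if any(r < 0 for r in repeats):
        raise ValueError("repeats must be non-negative")
    return tuple(d * r for d, r in zip(input_shape, repeats))

# A repeat count of 0 is accepted and simply yields an empty output.
shape_with_zero = tile_output_shape((2, 3), (2, 0))
shape_regular = tile_output_shape((2, 3), (2, 2))
```

With a zero repeat the output has a 0-sized dimension, so the kernel has nothing to compute and can return an empty tensor.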

Related work items: #29970551
…r into DmlDev

I'm just doing this manually this once because some commits were squashed, and I want to be sure they merge cleanly for the nightly automatic merge. "git merge upstream/master"
fdwr and others added 5 commits November 15, 2021 21:17
Add opset 13 for ops which are unchanged.

The culled list, removing any operators that changed signature (beyond just data type) or any that we don't support:

Abs-13
Add-13
ArgMax-13
ArgMin-13
Cast-13
Ceil-13
Clip-13
Concat-13
Constant-13
DepthToSpace-13
Div-13
Equal-13
Erf-13
Exp-13
Expand-13
Flatten-13
Floor-13
Gather-13
GatherElements-13
GatherND-13
Gemm-13
Greater-13
Identity-13
IsNaN-13
LRN-13
Less-13
Log-13
MatMul-13
Max-13
Mean-13
MeanVarianceNormalization-13
Min-13
Mod-13
Mul-13
Neg-13
Pad-13
Pow-13
Reciprocal-13
ReduceL1-13
ReduceL2-13
ReduceLogSum-13
ReduceLogSumExp-13
ReduceMax-13
ReduceMean-13
ReduceMin-13
ReduceProd-13
ReduceSumSquare-13
Relu-13
Reshape-13
Scatter-13
ScatterElements-13
ScatterND-13
Sigmoid-13
Sign-13
Slice-13
SpaceToDepth-13
Sqrt-13
Sub-13
Sum-13
Tanh-13
Tile-13
Transpose-13
Upsample-13
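
One way to picture why an operator that is unchanged in opset 13 still needs an explicit entry: a node resolves to the newest operator version that does not exceed the model's opset, and a provider can only claim the node if one of its registered version ranges covers that version. The Python sketch below is an illustration under those assumptions, not ORT's registration code. `resolve_schema_version` and `provider_supports` are hypothetical names, and the version numbers are sample data.

```python
def resolve_schema_version(since_versions, model_opset):
    """Newest operator version that does not exceed the model's opset, or None."""
    candidates = [v for v in since_versions if v <= model_opset]
    return max(candidates) if candidates else None

def provider_supports(kernel_ranges, schema_version):
    """True if some registered (start, end) range, inclusive, covers the version."""
    return any(start <= schema_version <= end for start, end in kernel_ranges)

# Sample data: an operator whose schema was re-versioned at 13
# even though its behavior did not change.
since_versions = [1, 6, 13]
ranges_before = [(6, 12)]            # provider registration without opset 13
ranges_after = [(6, 12), (13, 13)]   # provider registration with opset 13 added

version_for_opset_13 = resolve_schema_version(since_versions, 13)
supported_before = provider_supports(ranges_before, version_for_opset_13)
supported_after = provider_supports(ranges_after, version_for_opset_13)
```

In this model the node is not claimed until the opset-13 range is registered, even though the kernel behind it is identical.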

Related work items: #36946821
@fdwr fdwr force-pushed the user/dwayner/DML1.8forORT1.10 branch from 90720cb to e0ffc30 Compare November 19, 2021 13:14
@fdwr fdwr force-pushed the user/dwayner/DML1.8forORT1.10 branch from 22f650b to 289b1bd Compare November 19, 2021 13:35
@fdwr fdwr requested review from adtsai and jeffbloo November 19, 2021 13:40
@fdwr fdwr marked this pull request as ready for review November 19, 2021 13:41
@fdwr fdwr changed the title [DRAFT] Update DirectML 1.5.1 to 1.8.0 for ORT1.10 Update DirectML 1.5.1 to 1.8.0 for ORT1.10 Nov 19, 2021
jeffbloo previously approved these changes Nov 19, 2021
@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@fdwr
Contributor Author

fdwr commented Nov 20, 2021

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline

@azure-pipelines

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

@azure-pipelines

Azure Pipelines successfully started running 9 pipeline(s).

@fdwr
Contributor Author

fdwr commented Nov 20, 2021

/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-amd-gpu-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

@azure-pipelines

Azure Pipelines successfully started running 6 pipeline(s).


10 participants