Set parallel count to avoid OOM in training GPU packaging pipeline (#20255)

### Description
Make the compilation work on the Azure CPU agent by reducing the parallel
count.



### Motivation and Context
The OOM issue mentioned in #20244 was caused by too little memory per
parallel build job (a low memory/parallel_count ratio).
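
For background (not part of this commit): the usual way to avoid this class of OOM is to cap build parallelism by available RAM rather than CPU count alone. A minimal Python sketch of that idea follows; the `safe_parallel_count` helper and the 4 GB-per-job figure are illustrative assumptions, not values taken from this pipeline.

```python
import os

def safe_parallel_count(mem_per_job_gb: float = 4.0) -> int:
    """Illustrative: cap build parallelism by available RAM, not just CPUs."""
    cpus = os.cpu_count() or 1
    try:
        # Total physical memory via POSIX sysconf (works on Linux agents).
        total_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return cpus  # non-POSIX platform: fall back to the CPU count
    return max(1, min(cpus, int(total_gb // mem_per_job_gb)))

# e.g. a 32 GB agent at 4 GB per job yields 8, matching --parallel 8 below
print(safe_parallel_count())
```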
mszhanyi authored Apr 10, 2024
1 parent 280b263 commit 0acde11
Showing 1 changed file with 3 additions and 2 deletions.
```diff
@@ -11,7 +11,8 @@ resources:
 stages:
 - template: templates/py-packaging-training-cuda-stage.yml
   parameters:
-    build_py_parameters: --enable_training --update --build
+    # cap the parallel count to limit peak memory use across build threads and avoid OOM
+    build_py_parameters: --enable_training --update --build --parallel 8
     torch_version: '2.1.0'
     opset_version: '17'
     cuda_version: '12.2'
@@ -20,4 +21,4 @@ stages:
     agent_pool: Onnxruntime-Linux-GPU
     upload_wheel: 'yes'
     debug_build: false
-    build_pool_name: 'onnxruntime-Linux-GPU'
+    build_pool_name: 'onnxruntime-Ubuntu2204-AMD-CPU'
```
