Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591

yzygitzh · 2023-12-11T13:39:43Z

Description
Add in-place metrics for NCCL/RCCL benchmark for latency measurement.
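
For context, nccl-tests and rccl-tests binaries print each result row with an out-of-place group (time, algbw, busbw, #wrong) followed by an in-place group with the same columns. The sketch below shows one way the in-place group could be read next to the out-of-place one. It is not the code from this PR: `parse_perf_row`, the dictionary keys, and the sample row are hypothetical and exist only to illustrate the column layout.

```python
# Illustrative row in the nccl-tests layout; the values are made up, not measured.
# Columns: size count type redop root | time algbw busbw #wrong | time algbw busbw #wrong
#                                     |      out-of-place       |        in-place
SAMPLE_ROW = "8 2 float sum -1 28.50 0.00 0.00 0 28.02 0.00 0.00 0"


def parse_perf_row(line):
    """Split one result row into out-of-place and in-place measurements.

    Hypothetical helper: the last eight fields are assumed to be the
    out-of-place group followed by the in-place group, so the leading
    columns (which vary between collectives and versions) are not indexed.
    """
    fields = line.split()
    if len(fields) < 9:
        raise ValueError("expected a size column plus two result groups")

    def to_group(cols):
        # cols = [time (us), algbw (GB/s), busbw (GB/s), #wrong]; #wrong is skipped
        return {
            "time_us": float(cols[0]),
            "algbw_gbps": float(cols[1]),
            "busbw_gbps": float(cols[2]),
        }

    tail = fields[-8:]
    return {
        "size_bytes": int(fields[0]),
        "out_of_place": to_group(tail[0:4]),
        "in_place": to_group(tail[4:8]),
    }


result = parse_perf_row(SAMPLE_ROW)
print(result["out_of_place"]["time_us"], result["in_place"]["time_us"])
```

Counting columns from the end of the row keeps the sketch independent of how many descriptive columns precede the timings.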

docs/user-tutorial/benchmarks/micro-benchmarks.md

superbench/benchmarks/micro_benchmarks/cuda_nccl_bw_performance.py

codecov · 2023-12-12T08:34:43Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (606ff19) 86.11% compared to head (96adc71) 86.12%.

Additional details and impacted files

@@               Coverage Diff                @@
##           release/0.10     #591      +/-   ##
================================================
+ Coverage         86.11%   86.12%   +0.01%     
================================================
  Files                97       97              
  Lines              6873     6878       +5     
================================================
+ Hits               5919     5924       +5     
  Misses              954      954

| Flag | Coverage Δ | |
|---|---|---|
| cpu-python3.6-unit-test | `71.83% <100.00%> (+0.02%)` | ⬆️ |
| cpu-python3.7-unit-test | `71.83% <100.00%> (+0.02%)` | ⬆️ |
| cpu-python3.8-unit-test | `72.24% <100.00%> (+0.02%)` | ⬆️ |
| cuda-unit-test | `84.15% <100.00%> (+0.01%)` | ⬆️ |
| directx-unit-test | `35.27% <8.33%> (-0.02%)` | ⬇️ |

tests/benchmarks/micro_benchmarks/test_cuda_nccl_bw_performance.py

**Description**
Cherry-pick bug fixes from v0.10.0 to main.

**Major Revisions**
* Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
* Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
* Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
* Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
* Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
* CI/CD - Add ndv5 topo file #597
* Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
* Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
* Dockerfile - Bug fix for rocm docker build and deploy #598
* Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
* Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
* Monitor - Upgrade pyrsmi to amdsmi python library. #601
* Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
* Dockerfile - Add rocm6.0 dockerfile #602
* Bug Fix - Bug fix for latest megatron-lm benchmark #600
* Docs - Upgrade version and release note #606

Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>

yzygitzh requested review from cp5555 and a team as code owners December 11, 2023 13:39

yzygitzh changed the base branch from main to release/0.10 December 11, 2023 14:05

yukirora reviewed Dec 11, 2023

docs/user-tutorial/benchmarks/micro-benchmarks.md (outdated, resolved)

cp5555 reviewed Dec 11, 2023

superbench/benchmarks/micro_benchmarks/cuda_nccl_bw_performance.py (outdated, resolved)

yzygitzh force-pushed the ziyue/add-in-place-for-nccl branch from 55aaee1 to 9106e32 Compare December 12, 2023 07:54

cp5555 added the benchmarks (SuperBench Benchmarks) and micro-benchmarks (Micro Benchmark Test for SuperBench Benchmarks) labels Dec 12, 2023

yukirora reviewed Dec 12, 2023

tests/benchmarks/micro_benchmarks/test_cuda_nccl_bw_performance.py (outdated, resolved)

cp5555 approved these changes Dec 12, 2023

add in-place for nccl benchmark

203776b

yzygitzh force-pushed the ziyue/add-in-place-for-nccl branch from 2a32186 to 203776b Compare December 13, 2023 00:49

yzygitzh added 2 commits December 13, 2023 01:38

fix test

c06ef7a

Merge branch 'release/0.10' into ziyue/add-in-place-for-nccl

96adc71

yukirora approved these changes Dec 13, 2023

cp5555 changed the title ~~Benchmarks: Microbenchmark - Add in-place metrics for NCCL/RCCL benchmark for latency measurement~~ Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark Dec 13, 2023

cp5555 mentioned this pull request Dec 13, 2023

V0.10.0 Release Plan #559

Closed

30 tasks

yzygitzh merged commit 27374ad into release/0.10 Dec 13, 2023
20 checks passed

yzygitzh deleted the ziyue/add-in-place-for-nccl branch December 13, 2023 17:39

yukirora mentioned this pull request Dec 14, 2023

V0.10.0 Test Plan #585

Closed

29 tasks

abuccts pushed a commit that referenced this pull request Jan 3, 2024

Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark (…

5b904a5

…#591) **Description** Add in-place metrics for NCCL/RCCL benchmark for latency measurement.

abuccts mentioned this pull request Jan 3, 2024

Release - SuperBench v0.10.0 #607

Merged
