Pulse · microsoft/onnxruntime · GitHub

June 24, 2024 – July 1, 2024

Overview

46 Active pull requests

38 Active issues

1 Release published by 1 person

v1.18.1 ONNX Runtime v1.18.1
published Jun 28, 2024

26 Pull requests merged by 16 people

Templatize publishing nuget package
#21199 merged Jul 2, 2024
Update install docs and add troubleshooting page
#21210 merged Jul 1, 2024
CoreML: Disable 1D ML Program matmul due to bug in coreml
#21186 merged Jun 29, 2024
Initial PR for VSINPU execution provider
#20903 merged Jun 29, 2024
Update upstream packaging pipeline name to make it more meaningful.
#21154 merged Jun 29, 2024
Update the functions in tensorprotoutils.h to use std::filesystem::path instead
#20920 merged Jun 29, 2024
Update nuget to Use Nuget 6.10.x
#21209 merged Jun 29, 2024
[VitisAI] Align TensorProto_DataType with onnx1.16
#21067 merged Jun 29, 2024
OVEP options to disable CPU fallback at compile time
#21166 merged Jun 28, 2024
Add QNN UTs for QNN Pad Op with FP16 data on HTP backend
#21142 merged Jun 28, 2024
Add FP32 and INT4 test in Llama2
#21187 merged Jun 27, 2024
Fix typo in Python code block on home page
#21196 merged Jun 27, 2024
Add compatibility for NumPy 2.0
#21085 merged Jun 27, 2024
[WebNN EP] Remove useless variable unpacked_tensors_
#21189 merged Jun 27, 2024
support for layernorm in webgpu pre opset-17
#21121 merged Jun 27, 2024
[Fix] Throws one exception when Llama2 parity_check fails
#21160 merged Jun 27, 2024
[WebNN EP] Fixed bug in Expand implementation
#21163 merged Jun 27, 2024
add split3inner
#19886 merged Jun 27, 2024
[ROCm] Extend the Pipeline restriction time
#21158 merged Jun 27, 2024
[ROCm] Disable ck_tile in Debug build
#21178 merged Jun 27, 2024
Check for unit test log severity override earlier
#21177 merged Jun 27, 2024
Rollback 19832, Remove shape_input_merge Fusion
#21179 merged Jun 26, 2024
Convert scalars to 1D to satisfy ML Program requirements.
#21159 merged Jun 26, 2024
Skip softmax BF16 test for ROCm
#21162 merged Jun 26, 2024
[WebNN EP] Support rest Reduction ops for TFLite backend
#21135 merged Jun 26, 2024
[WebNN EP] Support more Normalization ops for TFLite backend
#21151 merged Jun 25, 2024

20 Pull requests opened by 20 people

Add warning for scale being too small to quantize bias
#21155 opened Jun 25, 2024
[js/webgpu] Refactor trace
#21161 opened Jun 25, 2024
[QNN EP] Initial INT4 support
#21171 opened Jun 25, 2024
Replace Android CI vmImage: 'MacOS-12' with vmImage: 'ubuntu-latest'
#21172 opened Jun 25, 2024
[build] allow MPI on Unix when NCCL is disabled
#21175 opened Jun 25, 2024
[Optimizer] DQ + MatMul to MatMulNBits support
#21180 opened Jun 26, 2024
Keep QDQ nodes w/ nonpositive scale around MaxPool
#21182 opened Jun 26, 2024
Enable AVX NE CONVERT for FP16 to FP32 cast
#21183 opened Jun 26, 2024
[ROCm] fix: obtain AMD GPU memory info through rocm_smi library
#21190 opened Jun 27, 2024
[MLAS] AArch64 SQNBitGemm CompInt8 initial multi-row implementation
#21193 opened Jun 27, 2024
[WebNN EP] Release WebNN MLGraphBuilder after Compile to free memory
#21200 opened Jun 28, 2024
[Training] Fix Overflow Handling in Cast Infer for ORTModule.
#21202 opened Jun 28, 2024
Delete path.h
#21211 opened Jun 29, 2024
Use cuda memset async
#21216 opened Jul 1, 2024
[VSINPU]Code improvement && Slice/Dropout OP support
#21217 opened Jul 1, 2024
Implementation of set membership
#21222 opened Jul 1, 2024
onnxruntime shared lib inside python package
#21223 opened Jul 1, 2024
Add debugging helper to dump string, vector and thread id
#21224 opened Jul 1, 2024
Fix type-punned pointer error during compilation on Loongson LA64.
#21225 opened Jul 1, 2024
Fix ETW Sink Initialize improperly locking
#21226 opened Jul 1, 2024

13 Issues closed by 13 people

onnxruntime-gpu not working with my gpu / setup
#21215 closed Jul 2, 2024
onnxruntime::python::CreateExecutionProviderInstance CUDA_PATH is set but CUDA wasn't able to be loaded.
#21218 closed Jul 1, 2024
Microsoft.ML.OnnxRuntime.Gpu 1.18.0 not working with NVIDIA CUDA 11.6
#20916 closed Jul 1, 2024
Onnx Model run failed in a loop
#21213 closed Jun 30, 2024
"Protobuf parsing failed" error when loading a quantized Mistral model
#20113 closed Jun 28, 2024
ORT 1.18 crashes on exit after using Cuda EP to run inference on a specific model
#21207 closed Jun 28, 2024
Support Numpy v2.0
#21063 closed Jun 27, 2024
[Mobile] Cocoapods release archive zips are missing
#21181 closed Jun 27, 2024
[Build] Support CUDA 12 onnxruntime-gpu pypi package
#20745 closed Jun 26, 2024
[Build] "utf8_range::utf8_validity" does not exist
#21174 closed Jun 26, 2024
[Feature Request] prebuilt package with cudnn 9 support
#20829 closed Jun 25, 2024
[Inference] Could not find an implementation for groupnorm
#20661 closed Jun 25, 2024
[Bug] The accuracy of the A16W16 quantized model is very poor if per_channel is True
#21000 closed Jun 25, 2024

25 Issues opened by 23 people

SIGSEGV on CoreMLExecutionProvider when using dynamic batch
#21227 opened Jul 2, 2024
[Mobile] QNN failed to finalize QNN graph for attention layer
#21221 opened Jul 1, 2024
Inference result different between cuda and cpu
#21220 opened Jul 1, 2024
[TensorRT EP] OOM (RAM) when loading ONNX model
#21219 opened Jul 1, 2024
[Mobile] QNN HTP Backend Setup on Android Device
#21214 opened Jun 30, 2024
[Documentation] How to Configure CUDA 12.* and cuDNN for GPU with ONNX Runtime and C# on Windows 11
#21212 opened Jun 29, 2024
[Transformers Optimizer] CLIP-ViT encoder attention not getting fused
#21208 opened Jun 28, 2024
[Web] `Error: using ceil() in shape computation is not yet supported for AveragePool`
#21206 opened Jun 28, 2024
Initialization crash using OnnxRuntime 17.0 (previously working on 16.3)
#21205 opened Jun 28, 2024
[Build] ‘struct onnxruntime::ProviderHostCPU’ has no member named ‘UpsampleBase__AdjustOutputSizeAsPolicy’
#21204 opened Jun 28, 2024
[Build] Build python interface for Onnxruntime-qnn on aarch64 Linux
#21203 opened Jun 28, 2024
[Documentation] phi-3 vision tutorial lacks samples for languages that are actually used for desktop development.
#21198 opened Jun 27, 2024
[Documentation] Setup the CUDA Environment is not detailed enough
#21197 opened Jun 27, 2024
[Performance] Mapfile support for certain external data files is not working
#21195 opened Jun 27, 2024
Issue with performing shape inference using symbolic_shape_infer.py with Phi-3 ONNX Models
#21194 opened Jun 27, 2024
[Build] How to build for Android armeabi platform?
#21192 opened Jun 27, 2024
Cannot create arena allocator with Environment::CreateAndRegisterAllocator on MAC M2 with clang
#21191 opened Jun 27, 2024
QDQ removal optimization from around MaxPool changes results with negative scale
#21176 opened Jun 26, 2024
ORT 1.18.1 Release Candidates available for testing
#21173 opened Jun 25, 2024
CoreML EP inference result is improperly scaled
#21170 opened Jun 25, 2024
Can onnxruntime.quantization.quantize_dynamic() work with onnx-trt?
#21169 opened Jun 25, 2024
[Performance] Increased memory usage when loading from bytes
#21165 opened Jun 25, 2024
[Feature Request] Add DFT support for CUDAExecutionProvider
#21164 opened Jun 25, 2024
[E:onnxruntime:, qnn_execution_provider.cc:591 GetCapability] QNN SetupBackend failed qnn_backend_manager.cc:334 InitializeBackend Failed to initialize backend
#21157 opened Jun 25, 2024
[Performance] Failed to run Whisper inference after optimization with Dml EP
#21156 opened Jun 25, 2024

65 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Implement FlashAttention for CPU
#20805 commented on Jun 28, 2024 • 28 new comments
Enablement of onnxruntime for AIX and fixing issues related to big-endian platform.
#21133 commented on Jul 1, 2024 • 11 new comments
[WebNN EP] Support Einsum op
#19558 commented on Jun 26, 2024 • 11 new comments
Implementation of IOBinding in Mixtral MoE Parity Script
#21153 commented on Jun 28, 2024 • 10 new comments
VitisAI EP Context Model
#20926 commented on Jun 26, 2024 • 10 new comments
[Performance] CUDA kernel not found in registries for Op type: ScatterND
#21148 commented on Jul 1, 2024 • 7 new comments
[Feature Request] Move graph compilation behind higher transformers (graph optimization)
#20915 commented on Jun 28, 2024 • 6 new comments
onnxruntime shape mismatch during quantization of yolov8 models
#21048 commented on Jun 26, 2024 • 6 new comments
Update OpenVino CI Ubuntu to 22.04
#21127 commented on Jun 29, 2024 • 5 new comments
How to implement FP16 inference for a YOLOv5 model with onnxruntime in C++
#20395 commented on Jul 1, 2024 • 5 new comments
Enabling S8S8 and S8U8 handling in QGemm for AVX2 and AVX-VNNI
#21123 commented on Jul 1, 2024 • 4 new comments
[CPU] SparseAttention op
#21110 commented on Jul 1, 2024 • 4 new comments
TensorrtExecutionProvider slower than CUDAExecutionProvider: Faster-rcnn [Performance]
#17434 commented on Jun 27, 2024 • 3 new comments
Add GQA support for ROCm
#21032 commented on Jul 2, 2024 • 3 new comments
[Mobile] android prod crash: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)
#20828 commented on Jun 27, 2024 • 2 new comments
Update custom Triton kernel documentation and examples
#20883 commented on Jun 26, 2024 • 2 new comments
Segmentation fault while loading CUDA Provider
#16146 commented on Jun 27, 2024 • 2 new comments
[Build] quantization unittest failed when run all tests
#20821 commented on Jul 1, 2024 • 2 new comments
[TensorRT EP] Update TRT10.0 deprecated api
#20989 commented on Jul 1, 2024 • 2 new comments
Execution Provider bridge for TFLite Delegates for Coral Edge TPUs
#10248 commented on Jun 28, 2024 • 2 new comments
Index put loop model regression with ort==1.18
#20855 commented on Jun 30, 2024 • 1 new comment
[Documentation Request] Required cuDNN version for OnnxRuntime 1.18
#20784 commented on Jun 30, 2024 • 1 new comment
Added requested install instructions to ORT ROCm Python.
#21124 commented on Jun 27, 2024 • 1 new comment
[Feature Request] Assess performance capability before a model is loaded
#20998 commented on Jul 1, 2024 • 1 new comment
[Training] Compiling ONNX Runtime for MIPS32 Linux for On-Device Training Capabilities
#20884 commented on Jul 1, 2024 • 1 new comment
[Training] The gradient builder has not been registered for node with op type MatMulNBits
#20781 commented on Jul 1, 2024 • 1 new comment
Jeffmend move vitis ai ep to main ep list
#16289 commented on Jun 26, 2024 • 1 new comment
[Nuget] Add netstandard* to buildTransitive folders (microsoft#17010)
#19242 commented on Jun 26, 2024 • 1 new comment
[js/webnn] Enable user-supplied MLContext
#20600 commented on Jun 25, 2024 • 1 new comment
Fix several tiny typos
#20699 commented on Jun 25, 2024 • 1 new comment
Register PRelu into quantization ops
#20686 commented on Jun 25, 2024 • 1 new comment
[Build] Propagate build option for CUDA minimal to TRT
#20695 commented on Jun 25, 2024 • 1 new comment
Whether CUDA12.4 and cudnn9 matches onnxruntime-win-x64-cuda12-1.17.1
#20223 commented on Jun 25, 2024 • 1 new comment
[Performance] Whisper model inference results incorrect after Transformer Optimizer
#21150 commented on Jun 25, 2024 • 1 new comment
RunAsync C# API crashes without any error
#19140 commented on Jun 25, 2024 • 1 new comment
Got segmentation fault error when using 'InferenceSession' API
#11964 commented on Jun 25, 2024 • 1 new comment
Quantized ONNX Model Still Has Float32 Input/Output Tensors
#21138 commented on Jun 25, 2024 • 1 new comment
[Mobile] Segmentation fault after repeated inference
#21082 commented on Jun 25, 2024 • 1 new comment
[Build] build python wheel fails
#21145 commented on Jun 25, 2024 • 1 new comment
Error running quantize_dynamic: Failed to find proper ai.onnx domain
#15563 commented on Jun 25, 2024 • 1 new comment
[Build] passing --arm64 to ci_build/build.py has error in arm64 host
#20814 commented on Jun 26, 2024 • 1 new comment
[Web] `executionProviders` chain for `webnn` fallback does not work on init error
#20729 commented on Jun 26, 2024 • 1 new comment
[Web] The YOLOv8 segmentation model with batching option is not running on the GPU?
#20710 commented on Jun 26, 2024 • 1 new comment
OpenCL and Mali GPU support left out of all execution providers
#20896 commented on Jun 26, 2024 • 1 new comment
RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3
#20464 commented on Jun 27, 2024 • 1 new comment
DML cannot use device_id = 1 , run_with_iobinding failed.
#21092 commented on Jun 27, 2024 • 1 new comment
[Web] I can’t use onnxruntime-web to load an ONNX model in a React web app
#20846 commented on Jun 29, 2024 • 1 new comment
[Documentation] The documentation for early versions is missing
#20850 commented on Jun 29, 2024 • 1 new comment
Non-zero status code returned while running Add node. Name:'Add_221'
#20861 commented on Jun 29, 2024 • 1 new comment
[Discussion] ORT GPU binaries do not contain DML
#20638 commented on Jun 29, 2024 • 1 new comment
ONNXRuntime 1.18 crashing with TensorRT EP when dealing with big inputs
#21001 commented on Jun 28, 2024 • 1 new comment
terminate called after throwing an instance of 'Ort::Exception' what(): Invalid input name: ��veSU
#20568 commented on Jun 28, 2024 • 1 new comment
Onnxruntime-directml 1.18.0 broken multithreading inference session
#20713 commented on Jun 28, 2024 • 1 new comment
Please Add webpack and typescript configuration
#20822 commented on Jun 28, 2024 • 1 new comment
[Build] --external_graph_transformer_path doesn't. --test_external_transformer_example removed from build.py?
#20751 commented on Jun 28, 2024 • 1 new comment
Java API Docs for GenerateAPI
#21125 commented on Jun 27, 2024 • 0 new comments
Adds ATen fallback for scaled_dot_product_attention
#21107 commented on Jun 27, 2024 • 0 new comments
Replace inline pip install with pip install from requirements*.txt
#21106 commented on Jun 25, 2024 • 0 new comments
Connecting fp16xq4 gemm kernels (optimized for A100) to MatMulNBits<fp16> operator
#21083 commented on Jun 27, 2024 • 0 new comments
Update pool to MacOS-13
#17361 commented on Jun 29, 2024 • 0 new comments
[Performance] Running YOLOv8-seg.onnx with Dynamic Batch Size on GPU
#21103 commented on Jun 27, 2024 • 0 new comments
[Documentation] Typo in tutorials at the top of the official webpage
#21146 commented on Jun 27, 2024 • 0 new comments
Add Split QuickGelu Fusion
#20344 commented on Jun 27, 2024 • 0 new comments
Mlas int4 int8 with avx2/512
#20687 commented on Jun 28, 2024 • 0 new comments
[Jvm] Native crash during createSession: std::bad_cast
#21147 commented on Jun 27, 2024 • 0 new comments