Insights: microsoft/onnxruntime
Overview
1 Release published by 1 person
-
v1.18.1 ONNX Runtime v1.18.1
published
Jun 28, 2024
26 Pull requests merged by 16 people
-
Templatize publishing nuget package
#21199 merged
Jul 2, 2024 -
Update install docs and add troubleshooting page
#21210 merged
Jul 1, 2024 -
CoreML: Disable 1D ML Program matmul due to bug in coreml
#21186 merged
Jun 29, 2024 -
Initial PR for VSINPU execution provider
#20903 merged
Jun 29, 2024 -
Update upstream packaging pipeline name to make it more meaningful.
#21154 merged
Jun 29, 2024 -
Update the functions in tensorprotoutils.h to use std::filesystem::path instead
#20920 merged
Jun 29, 2024 -
Update nuget to Use Nuget 6.10.x
#21209 merged
Jun 29, 2024 -
[VitisAI] Align TensorProto_DataType with onnx1.16
#21067 merged
Jun 29, 2024 -
OVEP options to disable CPU fallback at compile time
#21166 merged
Jun 28, 2024 -
Add QNN UTs for QNN Pad Op with FP16 data on HTP backend
#21142 merged
Jun 28, 2024 -
Add FP32 and INT4 test in Llama2
#21187 merged
Jun 27, 2024 -
Fix typo in Python code block on home page
#21196 merged
Jun 27, 2024 -
Add compatibility for NumPy 2.0
#21085 merged
Jun 27, 2024 -
[WebNN EP] Remove useless variable unpacked_tensors_
#21189 merged
Jun 27, 2024 -
support for layernorm in webgpu pre opset-17
#21121 merged
Jun 27, 2024 -
[Fix] Throws one exception while Llama2 parity_check fails
#21160 merged
Jun 27, 2024 -
[WebNN EP] Fixed bug in Expand implementation
#21163 merged
Jun 27, 2024 -
add split3inner
#19886 merged
Jun 27, 2024 -
[ROCm] Extend the Pipeline restriction time
#21158 merged
Jun 27, 2024 -
[ROCm] Disable ck_tile in Debug build
#21178 merged
Jun 27, 2024 -
Check for unit test log severity override earlier
#21177 merged
Jun 27, 2024 -
Rollback 19832, Remove shape_input_merge Fusion
#21179 merged
Jun 26, 2024 -
Convert scalars to 1D to satisfy ML Program requirements.
#21159 merged
Jun 26, 2024 -
Skip softmax BF16 test for ROCm
#21162 merged
Jun 26, 2024 -
[WebNN EP] Support rest Reduction ops for TFLite backend
#21135 merged
Jun 26, 2024 -
[WebNN EP] Support more Normalization ops for TFLite backend
#21151 merged
Jun 25, 2024
20 Pull requests opened by 20 people
-
Add warning for scale being too small to quantize bias
#21155 opened
Jun 25, 2024 -
[js/webgpu] Refactor trace
#21161 opened
Jun 25, 2024 -
[QNN EP] Initial INT4 support
#21171 opened
Jun 25, 2024 -
Replace Android CI vmImage: 'MacOS-12' with vmImage: 'ubuntu-latest'
#21172 opened
Jun 25, 2024 -
[build] allow MPI on Unix when NCCL is disabled
#21175 opened
Jun 25, 2024 -
[Optimizer] DQ + MatMul to MatMulNBits support
#21180 opened
Jun 26, 2024 -
Keep QDQ nodes w/ nonpositive scale around MaxPool
#21182 opened
Jun 26, 2024 -
Enable AVX NE CONVERT for FP16 to FP32 cast
#21183 opened
Jun 26, 2024 -
[ROCm] fix: obtain AMD GPU memory info through rocm_smi library
#21190 opened
Jun 27, 2024 -
[MLAS] AArch64 SQNBitGemm CompInt8 initial multi-row implementation
#21193 opened
Jun 27, 2024 -
[WebNN EP] Release WebNN MLGraphBuilder after Compile to free memory
#21200 opened
Jun 28, 2024 -
[Training] Fix Overflow Handling in Cast Infer for ORTModule.
#21202 opened
Jun 28, 2024 -
Delete path.h
#21211 opened
Jun 29, 2024 -
Use cuda memset async
#21216 opened
Jul 1, 2024 -
[VSINPU]Code improvement && Slice/Dropout OP support
#21217 opened
Jul 1, 2024 -
Implementation of set membership
#21222 opened
Jul 1, 2024 -
onnxruntime shared lib inside python package
#21223 opened
Jul 1, 2024 -
Add debugging helper to dump string, vector and thread id
#21224 opened
Jul 1, 2024 -
Fix type-punned pointer error during compilation on Loongson LA64.
#21225 opened
Jul 1, 2024 -
Fix ETW Sink Initialize improperly locking
#21226 opened
Jul 1, 2024
13 Issues closed by 13 people
-
onnxruntime-gpu not working with my gpu / setup
#21215 closed
Jul 2, 2024 -
onnxruntime::python::CreateExecutionProviderInstance CUDA_PATH is set but CUDA wasn't able to be loaded.
#21218 closed
Jul 1, 2024 -
Microsoft.ML.OnnxRuntime.Gpu 1.18.0 not working with NVIDIA CUDA 11.6
#20916 closed
Jul 1, 2024 -
Onnx Model run failed in a loop
#21213 closed
Jun 30, 2024 -
"Protobuf parsing failed" error when loading a quantized Mistral model
#20113 closed
Jun 28, 2024 -
ORT 1.18 crashes on exit after using Cuda EP to run inference on a specific model
#21207 closed
Jun 28, 2024 -
Support Numpy v2.0
#21063 closed
Jun 27, 2024 -
[Mobile] Cocoapods release archive zips are missing
#21181 closed
Jun 27, 2024 -
[Build] Support CUDA 12 onnxruntime-gpu pypi package
#20745 closed
Jun 26, 2024 -
[Build] "utf8_range::utf8_validity" does not exist
#21174 closed
Jun 26, 2024 -
[Feature Request] prebuilt package with cudnn 9 support
#20829 closed
Jun 25, 2024 -
[Inference] Could not find an implementation for groupnorm
#20661 closed
Jun 25, 2024 -
[Bug] The accuracy of the A16W16 quantized model is very poor if per_channel is True
#21000 closed
Jun 25, 2024
25 Issues opened by 23 people
-
SIGSEGV on CoreMLExecutionProvider when using dynamic batch
#21227 opened
Jul 2, 2024 -
[Mobile] QNN failed to finalize QNN graph for attention layer
#21221 opened
Jul 1, 2024 -
Inference result different between cuda and cpu
#21220 opened
Jul 1, 2024 -
[TensorRT EP] OOM (RAM) when loading ONNX model
#21219 opened
Jul 1, 2024 -
[Mobile] QNN HTP Backend Setup on Android Device
#21214 opened
Jun 30, 2024 -
[Documentation] How to Configure CUDA 12.* and cuDNN for GPU with ONNX Runtime and C# on Windows 11
#21212 opened
Jun 29, 2024 -
[Transformers Optimizer] CLIP-ViT encoder attention not getting fused
#21208 opened
Jun 28, 2024 -
[Web] `Error: using ceil() in shape computation is not yet supported for AveragePool`
#21206 opened
Jun 28, 2024 -
Initialization crash using OnnxRuntime 17.0 (previously working on 16.3)
#21205 opened
Jun 28, 2024 -
[Build] ‘struct onnxruntime::ProviderHostCPU’ has no member named ‘UpsampleBase__AdjustOutputSizeAsPolicy’
#21204 opened
Jun 28, 2024 -
[Build] Build python interface for Onnxruntime-qnn on aarch64 Linux
#21203 opened
Jun 28, 2024 -
[Documentation] Setup the CUDA Environment is not detailed enough
#21197 opened
Jun 27, 2024 -
[Performance] Mapfile support for certain external data files is not working
#21195 opened
Jun 27, 2024 -
Issue with performing shape inference using symbolic_shape_infer.py with Phi-3 ONNX Models
#21194 opened
Jun 27, 2024 -
[Build] How to build for Android armeabi platform?
#21192 opened
Jun 27, 2024 -
Cannot create arena allocator with Environment::CreateAndRegisterAllocator on MAC M2 with clang
#21191 opened
Jun 27, 2024 -
QDQ removal optimization from around MaxPool changes results with negative scale
#21176 opened
Jun 26, 2024 -
ORT 1.18.1 Release Candidates available for testing
#21173 opened
Jun 25, 2024 -
CoreML EP inference result is improperly scaled
#21170 opened
Jun 25, 2024 -
Can onnxruntime.quantization.quantize_dynamic() work with onnx-trt?
#21169 opened
Jun 25, 2024 -
[Performance] Increased memory usage when loading from bytes
#21165 opened
Jun 25, 2024 -
[Feature Request] Add DFT support for CUDAExecutionProvider
#21164 opened
Jun 25, 2024 -
[Performance] Failed to run Whisper inference after optimization with Dml EP
#21156 opened
Jun 25, 2024
65 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Implement FlashAttention for CPU
#20805 commented on
Jun 28, 2024 • 28 new comments -
Enablement of onnxruntime for AIX and fixing issues related to big-endian platform.
#21133 commented on
Jul 1, 2024 • 11 new comments -
[WebNN EP] Support Einsum op
#19558 commented on
Jun 26, 2024 • 11 new comments -
Implementation of IObinding in Mixtral MoE Parity Script
#21153 commented on
Jun 28, 2024 • 10 new comments -
VitisAI EP Context Model
#20926 commented on
Jun 26, 2024 • 10 new comments -
[Performance] CUDA kernel not found in registries for Op type: ScatterND
#21148 commented on
Jul 1, 2024 • 7 new comments -
[Feature Request] Move graph compilation behind higher transformers (graph optimization)
#20915 commented on
Jun 28, 2024 • 6 new comments -
onnxruntime shape mismatch during quantization of yolov8 models
#21048 commented on
Jun 26, 2024 • 6 new comments -
Update OpenVino CI Ubuntu to 22.04
#21127 commented on
Jun 29, 2024 • 5 new comments -
How to implement FP16 inference for a yolov5 model with onnxruntime in C++
#20395 commented on
Jul 1, 2024 • 5 new comments -
Enabling S8S8 and S8U8 handling in QGemm for AVX2 and AVX-VNNI
#21123 commented on
Jul 1, 2024 • 4 new comments -
[CPU] SparseAttention op
#21110 commented on
Jul 1, 2024 • 4 new comments -
TensorrtExecutionProvider slower than CUDAExecutionProvider: Faster-rcnn [Performance]
#17434 commented on
Jun 27, 2024 • 3 new comments -
Add GQA support for ROCm
#21032 commented on
Jul 2, 2024 • 3 new comments -
[Mobile] android prod crash: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)
#20828 commented on
Jun 27, 2024 • 2 new comments -
Update custom Triton kernel documentation and examples
#20883 commented on
Jun 26, 2024 • 2 new comments -
Segmentation fault while loading CUDA Provider
#16146 commented on
Jun 27, 2024 • 2 new comments -
[Build] quantization unittest failed when run all tests
#20821 commented on
Jul 1, 2024 • 2 new comments -
[TensorRT EP] Update TRT10.0 deprecated api
#20989 commented on
Jul 1, 2024 • 2 new comments -
Execution Provider bridge for TFLite Delegates for Coral Edge TPUs
#10248 commented on
Jun 28, 2024 • 2 new comments -
Index put loop model regression with ort==1.18
#20855 commented on
Jun 30, 2024 • 1 new comment -
[Documentation Request] Required cuDNN version for OnnxRuntime 1.18
#20784 commented on
Jun 30, 2024 • 1 new comment -
Added requested install instructions to ORT ROCm Python.
#21124 commented on
Jun 27, 2024 • 1 new comment -
[Feature Request] Assess performance capability before a model is loaded
#20998 commented on
Jul 1, 2024 • 1 new comment -
[Training] Compiling ONNX Runtime for MIPS32 Linux for On-Device Training Capabilities
#20884 commented on
Jul 1, 2024 • 1 new comment -
[Training] The gradient builder has not been registered for node with op type MatMulNBits
#20781 commented on
Jul 1, 2024 • 1 new comment -
Jeffmend move vitis ai ep to main ep list
#16289 commented on
Jun 26, 2024 • 1 new comment -
[Nuget] Add netstandard* to buildTransitive folders (microsoft#17010)
#19242 commented on
Jun 26, 2024 • 1 new comment -
[js/webnn] Enable user-supplied MLContext
#20600 commented on
Jun 25, 2024 • 1 new comment -
Fix several tiny typos
#20699 commented on
Jun 25, 2024 • 1 new comment -
Register PRelu into quantization ops
#20686 commented on
Jun 25, 2024 • 1 new comment -
[Build] Propagate build option for CUDA minimal to TRT
#20695 commented on
Jun 25, 2024 • 1 new comment -
Whether CUDA12.4 and cudnn9 matches onnxruntime-win-x64-cuda12-1.17.1
#20223 commented on
Jun 25, 2024 • 1 new comment -
[Performance] Whisper model inference results incorrect after Transformer Optimizer
#21150 commented on
Jun 25, 2024 • 1 new comment -
RunAsync C# API crashes without any error
#19140 commented on
Jun 25, 2024 • 1 new comment -
Got segmentation fault error when using 'InferenceSession' API
#11964 commented on
Jun 25, 2024 • 1 new comment -
Quantized ONNX Model Still Has Float32 Input/Output Tensors
#21138 commented on
Jun 25, 2024 • 1 new comment -
[Mobile] Segmentation fault after repeated inference
#21082 commented on
Jun 25, 2024 • 1 new comment -
[Build] build python wheel fails
#21145 commented on
Jun 25, 2024 • 1 new comment -
Error running quantize_dynamic: Failed to find proper ai.onnx domain
#15563 commented on
Jun 25, 2024 • 1 new comment -
[Build] passing --arm64 to ci_build/build.py has error in arm64 host
#20814 commented on
Jun 26, 2024 • 1 new comment -
[Web] `executionProviders` chain for `webnn` fallback does not work on init error
#20729 commented on
Jun 26, 2024 • 1 new comment -
[Web] The YOLOv8 segmentation model with batching option is not running on the GPU?
#20710 commented on
Jun 26, 2024 • 1 new comment -
OpenCL and Mali GPU support left out of all execution providers
#20896 commented on
Jun 26, 2024 • 1 new comment -
RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3
#20464 commented on
Jun 27, 2024 • 1 new comment -
DML cannot use device_id = 1 , run_with_iobinding failed.
#21092 commented on
Jun 27, 2024 • 1 new comment -
[Web] I can’t use onnxruntime-web to load an onnx model in a react web
#20846 commented on
Jun 29, 2024 • 1 new comment -
[Documentation] The documentation for early versions is missing
#20850 commented on
Jun 29, 2024 • 1 new comment -
Non-zero status code returned while running Add node. Name:'Add_221'
#20861 commented on
Jun 29, 2024 • 1 new comment -
[Discussion] ORT GPU binaries do not contain DML
#20638 commented on
Jun 29, 2024 • 1 new comment -
ONNXRuntime 1.18 crashing with TensorRT EP when dealing with big inputs
#21001 commented on
Jun 28, 2024 • 1 new comment -
terminate called after throwing an instance of 'Ort::Exception' what(): Invalid input name: ��veSU
#20568 commented on
Jun 28, 2024 • 1 new comment -
Onnxruntime-directml 1.18.0 broken multithreading inference session
#20713 commented on
Jun 28, 2024 • 1 new comment -
Please Add webpack and typescript configuration
#20822 commented on
Jun 28, 2024 • 1 new comment -
[Build] --external_graph_transformer_path doesn't. --test_external_transformer_example removed from build.py?
#20751 commented on
Jun 28, 2024 • 1 new comment -
Java API Docs for GenerateAPI
#21125 commented on
Jun 27, 2024 • 0 new comments -
Adds ATen fallback for scaled_dot_product_attention
#21107 commented on
Jun 27, 2024 • 0 new comments -
Replace inline pip install with pip install from requirements*.txt
#21106 commented on
Jun 25, 2024 • 0 new comments -
Connecting fp16xq4 gemm kernels (optimized for A100) to MatMulNBits<fp16> operator
#21083 commented on
Jun 27, 2024 • 0 new comments -
Update pool to MacOS-13
#17361 commented on
Jun 29, 2024 • 0 new comments -
[Performance] Running YOLOv8-seg.onnx with Dynamic Batch Size on GPU
#21103 commented on
Jun 27, 2024 • 0 new comments -
[Documentation] Typo in tutorials at the top of the official webpage
#21146 commented on
Jun 27, 2024 • 0 new comments -
Add Split QuickGelu Fusion
#20344 commented on
Jun 27, 2024 • 0 new comments -
Mlas int4 int8 with avx2/512
#20687 commented on
Jun 28, 2024 • 0 new comments -
[Jvm] Native crash during createSession: std::bad_cast
#21147 commented on
Jun 27, 2024 • 0 new comments