Insights: google/XNNPACK
Overview
- 30 Merged pull requests
- 15 Open pull requests
- 0 Closed issues
- 2 New issues
30 Pull requests merged by 6 people
-
Add `f32-vgelu` microkernels.
#6677 merged
Jul 4, 2024 -
QB4W Neondot F16/F32 GEMM Kernels
#6664 merged
Jul 4, 2024 -
Mark `generate_build_identifier_py` incompatible with wasm platforms.
#6678 merged
Jul 3, 2024 -
QB4W AVX GEMM Kernels
#6621 merged
Jul 3, 2024 -
Update test and benchmark generation for blockwise kr>2
#6557 merged
Jul 3, 2024 -
QB4W SSE2/SSE41 GEMM Kernels
#6576 merged
Jul 3, 2024 -
Remove unused variable in binary-elementwise-config
#6667 merged
Jul 2, 2024 -
F32 copysign subgraph API
#6586 merged
Jul 2, 2024 -
F32 copysign op
#6585 merged
Jul 2, 2024 -
Add SIMD copysign microkernels
#6581 merged
Jul 2, 2024 -
Clean up `#include`s in the `src/` subdirectory.
#6661 merged
Jul 2, 2024 -
Sort order of `#include` statements in `test/*.cc` and `test/*.h` files.
#6659 merged
Jul 2, 2024 -
NEON qs8-rsum use 16 byte one's and zero's instead of 15
#6662 merged
Jul 2, 2024 -
Change avx256 build for Visual Studio to /arch:AVX512
#6660 merged
Jul 1, 2024 -
Fix `simd/f32-wasmsimd.h` static initializer types.
#6658 merged
Jul 1, 2024 -
qs8-rsum ssse3 use load not loadu for params
#6647 merged
Jul 1, 2024 -
Detect avx10 and enable avx256skx
#6648 merged
Jul 1, 2024 -
qs8-rsum avx512 vpmaddwd to sum 16 bit accumulators to 32 bit
#6649 merged
Jul 1, 2024 -
AVX10 QS8/QD8 GEMM/IGEMM
#6650 merged
Jul 1, 2024 -
Added standalone rsum HVX
#6615 merged
Jul 1, 2024 -
Only copy cvt params if they are set
#6655 merged
Jul 1, 2024 -
Fix a few `#include <xnnpack/...>` to `#include "src/xnnpack/..."` stragglers.
#6652 merged
Jul 1, 2024 -
Fix `FullyConnectedTestQD8F32QC4W` which was accidentally setting the `kernel` to zeros.
#6651 merged
Jul 1, 2024 -
Add f32 vrelu RVV implementation microkernels, tests and config changes
#6597 merged
Jul 1, 2024 -
Fix `generate-enum.py` to use `#include "..."` instead of `#include <...>`.
#6643 merged
Jul 1, 2024 -
AVX10 build files support and qs8-rsum
#6645 merged
Jun 29, 2024 -
Add scripts for running Hexagon executable on the simulator and on an adb-connected device.
#6644 merged
Jun 28, 2024 -
Comment out KleidiAI repository until they have official Bazel support.
#6642 merged
Jun 28, 2024 -
Fix a few `#include <xnnpack/...>` to `#include "src/xnnpack/..."` stragglers.
#6640 merged
Jun 28, 2024 -
Add vectorized `f32-vlog` microkernels.
#6614 merged
Jun 28, 2024
15 Pull requests opened by 5 people
-
HVX `F32-raddstoreexpminusmax` microkernels for Softmax
#6646 opened
Jun 30, 2024 -
Add `xnn_cmpeq_f32` to the portable SIMD wrappers.
#6654 opened
Jul 1, 2024 -
Rename `xnn_shift(l|r)_f32` to `xnn_shift_(left|right)_f32`, add `xnn_shift_right_signed_f32`.
#6656 opened
Jul 1, 2024 -
Fix `f32-vlog-rational-3-3` microkernels to return `NaN` on negative inputs and `-Inf` on `0.0f`.
#6657 opened
Jul 1, 2024 -
Add f32 vlrelu RVV implementation microkernels, tests and config changes.
#6665 opened
Jul 2, 2024 -
Remove duplicate test generator
#6669 opened
Jul 2, 2024 -
Fix duplicate test and benchmark generation
#6670 opened
Jul 2, 2024 -
X16-PACKW for avx512 GEMM's that need NR=32 and NR=64
#6671 opened
Jul 2, 2024 -
Add AVX10 qd8/qs8 GEMM microkernels and enable 5x8 for avx10 vs 3x8 for avx2
#6672 opened
Jul 2, 2024 -
QB4W Fully Connected Operator
#6673 opened
Jul 3, 2024 -
QB4W Fully Connected Subgraph
#6674 opened
Jul 3, 2024 -
HVX `f32-qs8-vcvt` microkernels
#6675 opened
Jul 3, 2024 -
Add `f32-vgelu` kernels using Newton-Raphson iteration instead of division.
#6680 opened
Jul 3, 2024 -
QB4W Neoni8mm F16/F32 GEMM Kernels
#6681 opened
Jul 4, 2024 -
Use a single `max_abx_x` instead of `min_x` and `max_x`.
#6682 opened
Jul 4, 2024
2 Issues opened by 2 people
-
`f32-vgelu` for HVX build fails
#6683 opened
Jul 4, 2024 -
Low XNNPACK speedup for ARM Cortex A73
#6653 opened
Jul 1, 2024
4 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
QB4W AVX2 GEMM Kernels
#6618 commented on
Jul 4, 2024 • 1 new comment -
QB4W Development
#6502 commented on
Jul 4, 2024 • 0 new comments -
Hook up the new KleidiAI GEMM microkernels to the `fully-connected` operator.
#6575 commented on
Jul 3, 2024 • 0 new comments -
AVX512SKX QB4 Kernels [F16]
#6636 commented on
Jul 1, 2024 • 0 new comments