
v2.12.0

@KodiaqQ released this 31 Jul 12:28

Post-training Quantization:

Features:

  • (OpenVINO, PyTorch, ONNX) Excluded comparison operators from the quantization scope for nncf.ModelType.TRANSFORMER.
  • (OpenVINO, PyTorch) Changed the representation of symmetrically quantized weights from an unsigned integer with a fixed zero-point to a signed data type without a zero-point in the nncf.compress_weights() method.
  • (OpenVINO) Extended pattern support of the AWQ algorithm as part of nncf.compress_weights(). This allows applying AWQ to a wider range of models.
  • (OpenVINO) Introduced the nncf.CompressWeightsMode.E2M1 mode option of nncf.compress_weights() for the new MXFP4 precision (Experimental); see the sketch after this list.
  • (OpenVINO) Added support for models with BF16 precision in the nncf.quantize() method.
  • (PyTorch) Added quantization support for torch.addmm.
  • (PyTorch) Added quantization support for torch.nn.functional.scaled_dot_product_attention.
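
A minimal sketch of the experimental E2M1 (MXFP4) weight compression path described above, assuming an OpenVINO IR on disk; the model path and the group_size value are illustrative choices, not prescribed by this release:

```python
import openvino as ov

import nncf

# Read an OpenVINO IR model (the path is an illustrative placeholder).
model = ov.Core().read_model("model.xml")

# Experimental MXFP4 weight compression via the new E2M1 mode.
# group_size=32 is an assumed example value; tune it for your model.
compressed_model = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.E2M1,
    group_size=32,
)

ov.save_model(compressed_model, "model_e2m1.xml")
```

For the extended AWQ support, the same nncf.compress_weights() call can additionally take awq=True together with a calibration dataset passed via the dataset argument.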

Fixes:

  • (OpenVINO, PyTorch, ONNX) Fixed the FastBiasCorrection and BiasCorrection algorithms to correctly support transposed MatMul layers.
  • (OpenVINO) Fixed nncf.IgnoredScope() functionality for models with If operation.
  • (OpenVINO) Fixed patterns with PReLU operations.
  • Fixed a runtime error when importing NNCF without the Matplotlib package installed.

Improvements:

  • Reduced the amount of memory required for applying nncf.compress_weights() to OpenVINO models.
  • Improved logging when a non-empty nncf.IgnoredScope() is provided; see the sketch after this list.
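
As a usage reference for the ignored-scope items above, here is a minimal sketch of passing a non-empty nncf.IgnoredScope() to nncf.quantize(); the model path, node names, regex pattern, and calibration data are hypothetical placeholders:

```python
import openvino as ov

import nncf

model = ov.Core().read_model("model.xml")  # illustrative path

# Exclude nodes from quantization by exact name, regex pattern, or operation type
# (all values below are hypothetical examples, not taken from a real model).
ignored_scope = nncf.IgnoredScope(
    names=["final_matmul"],
    patterns=[".*embedding.*"],
    types=["Multiply"],
)


def transform_fn(data_item):
    # Convert a raw data item into the model's input format (model-specific).
    return data_item


calibration_items = [...]  # representative inputs for calibration (model-specific)
calibration_dataset = nncf.Dataset(calibration_items, transform_fn)

quantized_model = nncf.quantize(
    model,
    calibration_dataset,
    model_type=nncf.ModelType.TRANSFORMER,
    ignored_scope=ignored_scope,
)
```

The improved logging mentioned above applies when such a non-empty ignored scope is supplied to the quantization call.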

Tutorials:

Compression-aware training:

Fixes:

  • (PyTorch) Fixed an issue with wrapping for operators without a patched state.

Requirements:

  • Updated the TensorFlow version to 2.15. This version requires Python 3.9-3.11.

Acknowledgements

Thanks for contributions from the OpenVINO developer community:
@Lars-Codes