
Tags: huggingface/optimum-quanto


v0.2.2

chore: release 0.2.2

v0.2.1

release: 0.2.1

v0.2.0

release: 0.2.0

New:
- requantize helper by @calmitchell617,
- StableDiffusion example by @thliang01,
- improved linear backward path,
- AWQ int4 kernels.
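The requantize helper restores a quantized model from serialized state without re-running quantization from the original float weights. A minimal concept sketch in NumPy (hypothetical helper names for illustration — not the library's actual API):

```python
import numpy as np

# Hypothetical names for illustration: serialize a quantized tensor as its
# integer data plus scale, then rebuild it later without touching the
# original float weights or re-running quantization.
def save_quantized(w):
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return {"data": q, "scale": scale}  # what gets serialized

def load_quantized(state):
    # "Requantizing" a fresh model: restore the quantized weights directly
    # from the saved state instead of quantizing float weights again.
    return state["data"].astype(np.float32) * state["scale"]

w = np.linspace(-1.0, 1.0, 8, dtype=np.float32)
state = save_quantized(w)
w_hat = load_quantized(state)
```

The point of the helper is that only the integer data and scales need to travel with the checkpoint; the float weights are never required at load time.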

v0.1.0

release: 0.1.0

- group-wise quantization,
- safe serialization.
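Group-wise quantization assigns one scale per small group of weights instead of one per tensor, so an outlier only degrades precision within its own group. A minimal NumPy sketch of the idea (illustrative names, not the library API):

```python
import numpy as np

def quantize_groupwise(w, group_size=4):
    """Quantize a 1-D float array to int8 with one scale per group."""
    w = w.reshape(-1, group_size)                      # split into groups
    scales = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scales[scales == 0] = 1.0                          # avoid division by zero
    q = np.clip(np.round(w / scales), -127, 127).astype(np.int8)
    return q, scales

def dequantize_groupwise(q, scales):
    return (q.astype(np.float32) * scales).reshape(-1)

# One group of small values, one group containing large outliers:
w = np.array([0.1, -0.2, 0.05, 0.15, 10.0, -20.0, 5.0, 15.0], dtype=np.float32)
q, s = quantize_groupwise(w)
w_hat = dequantize_groupwise(q, s)
# Per-group scales keep the small-magnitude group precise despite the outliers.
```

With a single per-tensor scale, the 20.0 outlier would force a coarse step of roughly 0.16 onto every weight; per-group scales confine that cost to the outlier's group.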

v0.0.13

release: 0.0.13

New features:

- new `QConv2d` quantized module,
- official support for `float8` weights.

Bug fixes:

- fix `QbitsTensor.to()`, which was not moving the inner tensors,
- prevent shallow `QTensor` copies that left inner tensors unmoved when
  loading weights.

v0.0.12

release: 0.0.12

0.0.11

chore: bump version

0.0.10

release: 0.0.10

New features:

- calibration streamline option to remove spurious quantize/dequantize ops,
- calibration debug mode.

0.0.9

release: 0.0.9

New features:

- weight and activation quantization parameters,
- `float8` activations.

0.0.8

release: 0.0.8

New features:

- weight-only quantization,
- integer matmul acceleration on CUDA.
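In weight-only quantization, weights are stored as low-bit integers plus scales while activations stay in floating point, and the weights are dequantized (or consumed directly by integer kernels on supported hardware) at matmul time. A hedged NumPy sketch with per-output-channel int8 weights (illustrative, not the library's implementation):

```python
import numpy as np

def quantize_weights(w):
    """Per-output-channel symmetric int8 quantization (weights only)."""
    scales = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scales), -127, 127).astype(np.int8)
    return q, scales

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16)).astype(np.float32)  # (out_features, in_features)
x = rng.standard_normal((2, 16)).astype(np.float32)  # activations stay float

q, scales = quantize_weights(w)
# Dequantize on the fly at matmul time; activations are never quantized.
y = x @ (q.astype(np.float32) * scales).T
y_ref = x @ w.T  # full-precision reference
```

Storing `q` and `scales` cuts weight memory roughly 4x versus float32, and the output stays close to the full-precision reference because only the weights carry quantization error.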

Bug fixes:

- actually use float16 weights,
- avoid float16 overflows,
- correct device placement,
- robust serialization.