Added int4 config for Mixtral-8x7B weight compression (#255)
Evaluated the word perplexity of Mixtral-8x7B on wikitext and found an int4 config for weight compression with a negligible increase (~0.4) in perplexity over the original model.

| Mixtral 8x7B | word_ppl on wikitext |
| -- | -- |
| Torch CPU | 5.17 |
| OV CPU | 5.17 |
| sym_g128_r100 (default) | 5.98 |
| sym_g128_r90 | 5.60 |
| sym_g128_r80 (added) | 5.55 |
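The config names encode the quantization scheme: "sym" is symmetric int4, "g128" is a group size of 128 values per shared scale, and "rNN" is the ratio of weights compressed to int4 (the remainder stays at a higher precision). The sketch below is a minimal, illustrative implementation of symmetric group-wise int4 quantization only; it is an assumption for clarity, not NNCF's actual code, and the function names are hypothetical.

```python
# Illustrative sketch of symmetric group-wise int4 quantization
# ("sym_g128"): NOT the NNCF implementation, just the idea.

def quantize_group_sym_int4(group):
    """Quantize one group of floats to signed 4-bit levels [-8, 7]
    with a single shared scale (the "sym" part)."""
    scale = max(abs(v) for v in group) / 7 or 1.0  # avoid div-by-zero
    q = [max(-8, min(7, round(v / scale))) for v in group]
    return q, scale

def dequantize(q, scale):
    """Map int4 levels back to floats."""
    return [v * scale for v in q]

def compress_sym(weights, group_size=128):
    """Quantize a flat weight list group-by-group (the "g128" part),
    returning the dequantized (lossy) reconstruction."""
    out = []
    for i in range(0, len(weights), group_size):
        q, s = quantize_group_sym_int4(weights[i:i + group_size])
        out.extend(dequantize(q, s))
    return out
```

Smaller groups and a lower int4 ratio both reduce quantization error, which is why r80 recovers perplexity relative to the r100 default at the cost of a larger compressed model.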