0.22.0 Release #440

guillaume-be · 2024-01-20T09:24:47Z

Added

Added a new_with_tokenizer constructor for SentenceEmbeddingsModel, allowing custom tokenizers to be passed to sentence embeddings pipelines.
Support for Tokenizers in pipelines, allowing loading tokenizer.json and special_token_map.json tokenizer files.
(BREAKING) Most model configurations can now take an optional kind parameter to specify the model weight precision. If not provided, it defaults to full precision on CPU, or to the precision of the serialized weights otherwise.
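
The defaulting rule for the optional kind parameter can be sketched as a small function. This is only an illustration of the rule as stated in the entry above, not the library's code: the `Device` and `Precision` enums and the `resolve_precision` function are hypothetical stand-ins (rust-bert itself relies on the `tch` types for devices and kinds).

```rust
/// Hypothetical stand-in for the target device (not the `tch` type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Device {
    Cpu,
    Gpu,
}

/// Hypothetical stand-in for the weight precision (not the `tch` type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Half,
    Float,
}

/// Pick the precision used to load the weights:
/// - an explicit `requested` value always wins;
/// - otherwise full precision on CPU;
/// - otherwise the precision the weights were serialized with.
fn resolve_precision(
    requested: Option<Precision>,
    device: Device,
    serialized: Precision,
) -> Precision {
    match (requested, device) {
        (Some(kind), _) => kind,
        (None, Device::Cpu) => Precision::Float,
        (None, _) => serialized,
    }
}
```

Under this sketch, half-precision weights loaded on CPU without an explicit kind come out as full precision, while the same weights loaded on an accelerator keep their serialized precision.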

Fixed

(BREAKING) Fixed the keyword extraction pipeline for n-gram sizes > 2. Added a new configuration option, tokenizer_forbidden_ngram_chars, to specify characters that should be excluded from n-grams (this allows filtering out n-grams that span multiple sentences).
Improved MPS device compatibility by setting the sparse_grad flag to false for gather operations.
Updated ONNX runtime backend version to 1.15.x
Fixed an issue with incorrect results for QA models whose tokenizer does not use segment ids.
Fixed an issue with GPT-J, which was incorrectly tracking gradients for the attention bias.
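
The idea behind tokenizer_forbidden_ngram_chars from the keyword extraction entry above can be illustrated with a short, self-contained sketch. It assumes that an n-gram is dropped whenever any of its tokens contains a forbidden character; the function name `filtered_ngrams` and this exact matching rule are assumptions made for illustration and may differ from what the pipeline does internally.

```rust
/// Build word n-grams of size `n`, skipping every n-gram in which at least
/// one token contains a forbidden character (for example sentence punctuation).
fn filtered_ngrams(tokens: &[&str], n: usize, forbidden: &[char]) -> Vec<String> {
    // `windows(0)` panics, so treat a zero size as "no n-grams".
    if n == 0 {
        return Vec::new();
    }
    tokens
        .windows(n)
        .filter(|window| {
            !window
                .iter()
                .any(|token| token.chars().any(|c| forbidden.contains(&c)))
        })
        .map(|window| window.join(" "))
        .collect()
}
```

With the tokens `["good", "model", ".", "It", "works"]` and `'.'` as the only forbidden character, the bigrams that survive are "good model" and "It works"; the candidates that straddle the sentence boundary are discarded.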

Changed

(BREAKING) Upgraded to torch 2.1 (via tch 0.14.0).
(BREAKING) Text generation traits and pipelines (including conversation, summarization and translation) now return a Result for improved error handling
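
For callers, the move to Result-returning generation mostly means propagating or handling an error where a plain value used to be returned. The sketch below shows that pattern with entirely hypothetical names (`Summarize`, `FirstSentence`, `GenerationError`, `summary_lengths`); it does not reproduce the library's traits or error type.

```rust
use std::fmt;

/// Hypothetical error type standing in for the library's error enum.
#[derive(Debug, PartialEq)]
struct GenerationError(String);

impl fmt::Display for GenerationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "generation failed: {}", self.0)
    }
}

impl std::error::Error for GenerationError {}

/// Hypothetical trait with the new shape: generation returns a `Result`.
trait Summarize {
    fn summarize(&self, inputs: &[&str]) -> Result<Vec<String>, GenerationError>;
}

/// Toy implementation: the "summary" is the first sentence of each input.
struct FirstSentence;

impl Summarize for FirstSentence {
    fn summarize(&self, inputs: &[&str]) -> Result<Vec<String>, GenerationError> {
        inputs
            .iter()
            .map(|text| {
                text.split('.')
                    .next()
                    .map(str::trim)
                    .filter(|sentence| !sentence.is_empty())
                    .map(str::to_string)
                    .ok_or_else(|| GenerationError("empty input".to_string()))
            })
            .collect()
    }
}

/// Caller code: `?` forwards the error instead of assuming success.
fn summary_lengths(
    model: &impl Summarize,
    inputs: &[&str],
) -> Result<Vec<usize>, GenerationError> {
    let summaries = model.summarize(inputs)?;
    Ok(summaries.iter().map(|s| s.len()).collect())
}
```

Code that previously consumed the returned value directly would, under this pattern, either add `?` in a function that itself returns a Result or match on the outcome explicitly.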

guillaume-be · 2024-01-20T09:32:11Z

Fixes #438

guillaume-be added 2 commits January 20, 2024 09:19

Fix Clippy warnings

7be90c8

bump version, updated dependencies and changelog

5bdbb28

guillaume-be merged commit c3a3f39 into main Jan 20, 2024
11 checks passed

guillaume-be deleted the 0_22_0_release branch January 20, 2024 09:42