Merge pull request mistralai#164 from mistralai/Add-codestral

Add codestral
vortex4242 · May 29, 2024 · commit bf21b82
2 parents fac6b9a + 9243fa8
Showing 1 changed file with 69 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -9,6 +9,7 @@ This repository contains minimal code to run our 7B, 8x7B and 8x22B models.
 Blog 7B: [https://mistral.ai/news/announcing-mistral-7b/](https://mistral.ai/news/announcing-mistral-7b/)\
 Blog 8x7B: [https://mistral.ai/news/mixtral-of-experts/](https://mistral.ai/news/mixtral-of-experts/)\
 Blog 8x22B: [https://mistral.ai/news/mixtral-8x22b/](https://mistral.ai/news/mixtral-8x22b/)
+Blog Codestral 22B: [https://mistral.ai/news/codestral](https://mistral.ai/news/codestral/)
 
 Discord: [https://discord.com/invite/mistralai](https://discord.com/invite/mistralai)\
 Documentation: [https://docs.mistral.ai/](https://docs.mistral.ai/)\
@@ -35,17 +36,19 @@ cd $HOME/mistral-inference && poetry install .
 
 | Name        | Download | md5sum |
 |-------------|-------|-------|
-| 7B Instruct v3 | https://models.mistralcdn.com/mistral-7b-v0-3/mistral-7B-Instruct-v0.3.tar | `80b71fcb6416085bcb4efad86dfb4d52` |
+| 7B Instruct | https://models.mistralcdn.com/mistral-7b-v0-3/mistral-7B-Instruct-v0.3.tar | `80b71fcb6416085bcb4efad86dfb4d52` |
 | 8x7B Instruct | https://models.mistralcdn.com/mixtral-8x7b-v0-1/Mixtral-8x7B-v0.1-Instruct.tar (**Updated model coming soon!**) | `8e2d3930145dc43d3084396f49d38a3f` |
 | 8x22B Instruct | https://models.mistralcdn.com/mixtral-8x22b-v0-3/mixtral-8x22B-Instruct-v0.3.tar | `471a02a6902706a2f1e44a693813855b` |
 | 7B Base | https://models.mistralcdn.com/mistral-7b-v0-3/mistral-7B-v0.3.tar | `0663b293810d7571dad25dae2f2a5806` |
 | 8x7B |     **Updated model coming soon!**       | - |
 | 8x22B | https://models.mistralcdn.com/mixtral-8x22b-v0-3/mixtral-8x22B-v0.3.tar | `a2fa75117174f87d1197e3a4eb50371a` |
+| Codestral 22B | https://models.mistralcdn.com/codestral-22b-v0-1/codestral-22B-v0.1.tar | `a5661f2f6c6ee4d6820a2f68db934c5d` |
 
 Note: 
 - **Important**:
   - `mixtral-8x22B-Instruct-v0.3.tar` is exactly the same as [Mixtral-8x22B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1), only stored in `.safetensors` format
   - `mixtral-8x22B-v0.3.tar` is the same as [Mixtral-8x22B-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-v0.1), but has an extended vocabulary of 32768 tokens.
+  - `codestral-22B-v0.1.tar` is released under a custom non-commercial license, the [Mistral AI Non-Production License (MNPL)](https://mistral.ai/licences/MNPL-0.1.md).
 - All of the models listed above support function calling. For example, Mistral 7B Base/Instruct v3 is a minor update to Mistral 7B Base/Instruct v2, with the addition of function calling capabilities.
 - The "coming soon" models will include function calling as well. 
 - You can download the previous versions of our models from our [docs](https://docs.mistral.ai/getting-started/open_weight_models/#downloading).
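
The table above lists an md5sum for each archive. A minimal sketch of checking a downloaded file against that value, using only the Python standard library, could look like the following. The helper names `md5sum` and `verify_download` are hypothetical and not part of this repository.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that multi-gigabyte archives
    do not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the value from the table.

    Hypothetical helper: surrounding whitespace and letter case in
    `expected_md5` are ignored before comparing.
    """
    return md5sum(path) == expected_md5.strip().lower()
```

For example, assuming the Codestral archive was saved under its original file name, `verify_download("codestral-22B-v0.1.tar", "a5661f2f6c6ee4d6820a2f68db934c5d")` should return `True` for an intact download.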
@@ -77,7 +80,7 @@ tar -xf Mixtral-8x7B-v0.1-Instruct.tar -C $M8x7B_DIR
 
 ## Usage
 
-The following sections give an overview of how to run the model from the Command-line interface or from Python.
+The following sections give an overview of how to run the model from the command-line interface (CLI) or directly within Python.
 
 ### CLI
 
@@ -115,6 +118,38 @@ torchrun --nproc-per-node 2 --no-python mistral-chat $M8x7B_DIR --instruct
 
 *Note*: Change `--nproc-per-node` to more GPUs if necessary (*e.g.* for 8x22B).
 
+- **Chat as Code Assistant**
+
+To use [Codestral](https://mistral.ai/news/codestral/) as a coding assistant, run the following `mistral-chat` command.
+Make sure `$M22B_CODESTRAL` is set to a valid path to the downloaded Codestral folder, e.g. `$HOME/mistral_models/Codestral-22B-v0.1`.
+
+```sh
+mistral-chat $M22B_CODESTRAL --instruct --max_tokens 256
+```
+
+If you prompt it with *"Write me a function that computes fibonacci in Rust"*, the model should generate something along the following lines:
+
+```
+Sure, here's a simple implementation of a function that computes the Fibonacci sequence in Rust. This function takes an integer `n` as an argument and returns the `n`th Fibonacci number.
+
+fn fibonacci(n: u32) -> u32 {
+    match n {
+        0 => 0,
+        1 => 1,
+        _ => fibonacci(n - 1) + fibonacci(n - 2),
+    }
+}
+
+fn main() {
+    let n = 10;
+    println!("The {}th Fibonacci number is: {}", n, fibonacci(n));
+}
+
+This function uses recursion to calculate the Fibonacci number. However, it's not the most efficient solution because it performs a lot of redundant calculations. A more efficient solution would use a loop to iteratively calculate the Fibonacci numbers.
+```
+
+You can continue chatting afterwards, *e.g.* with *"Translate it to Python"*.
+
 ### Python
 
 - *Instruction Following*:
@@ -183,6 +218,38 @@ result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
 print(result)
 ```
 
+- *Fill-in-the-middle (FIM)*:
+
+Make sure to have `mistral-common >= 1.2.0` installed:
+```sh
+pip install --upgrade mistral-common
+```
+
+You can simulate code in-filling, where the model generates the code between a given prefix and suffix, as follows:
+
+```py
+from mistral_inference.model import Transformer
+from mistral_inference.generate import generate
+from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
+from mistral_common.tokens.instruct.request import FIMRequest
+
+tokenizer = MistralTokenizer.from_model("codestral-22b")
+model = Transformer.from_folder("./mistral_22b_codestral")
+
+prefix = """def add("""
+suffix = """    return sum"""
+
+request = FIMRequest(prompt=prefix, suffix=suffix)
+
+tokens = tokenizer.encode_fim(request).tokens
+
+out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
+result = tokenizer.decode(out_tokens[0])
+
+middle = result.split(suffix)[0].strip()
+print(middle)
+```
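
The last two lines of the example above keep only the text generated before the suffix. The sketch below isolates that post-processing step so it can be tried without loading a model. The `decoded` string is a hypothetical stand-in for decoded model output, not an actual Codestral generation, and the helpers `extract_middle` and `assemble` are illustrative names that do not exist in `mistral-inference` or `mistral-common`.

```python
def extract_middle(decoded: str, suffix: str) -> str:
    """Return the text that precedes the first occurrence of `suffix`.

    Mirrors `result.split(suffix)[0].strip()` from the example above.
    An empty suffix is handled separately because str.split("") raises
    ValueError.
    """
    if not suffix:
        return decoded.strip()
    return decoded.split(suffix)[0].strip()


def assemble(prefix: str, middle: str, suffix: str) -> str:
    """Join prefix, generated middle, and suffix into one snippet.

    Assumption: the suffix starts on its own line, so a newline is
    re-inserted after the stripped middle.
    """
    return f"{prefix}{middle}\n{suffix}"


# Hypothetical decoded output, used only to illustrate the string handling.
decoded = "a, b):\n    sum = a + b\n    return sum"
prefix = "def add("
suffix = "    return sum"

middle = extract_middle(decoded, suffix)
print(assemble(prefix, middle, suffix))
```

Note that `strip()` also removes leading whitespace from the generated middle, so a completion that begins with indentation may need different handling.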
+
 ### One-file-ref
 
 If you want a self-contained implementation, look at `one_file_ref.py`, or run it with