Merge pull request mistralai#164 from mistralai/Add-codestral
Add codestral
patrickvonplaten committed May 29, 2024
2 parents fac6b9a + 9243fa8 commit bf21b82
Showing 1 changed file with 69 additions and 2 deletions.
71 changes: 69 additions & 2 deletions README.md
@@ -9,6 +9,7 @@ This repository contains minimal code to run our 7B, 8x7B and 8x22B models.
Blog 7B: [https://mistral.ai/news/announcing-mistral-7b/](https://mistral.ai/news/announcing-mistral-7b/)\
Blog 8x7B: [https://mistral.ai/news/mixtral-of-experts/](https://mistral.ai/news/mixtral-of-experts/)\
Blog 8x22B: [https://mistral.ai/news/mixtral-8x22b/](https://mistral.ai/news/mixtral-8x22b/)\
Blog Codestral 22B: [https://mistral.ai/news/codestral/](https://mistral.ai/news/codestral/)

Discord: [https://discord.com/invite/mistralai](https://discord.com/invite/mistralai)\
Documentation: [https://docs.mistral.ai/](https://docs.mistral.ai/)\
@@ -35,17 +36,19 @@ cd $HOME/mistral-inference && poetry install .

| Name | Download | md5sum |
|-------------|-------|-------|
| 7B Instruct v3 | https://models.mistralcdn.com/mistral-7b-v0-3/mistral-7B-Instruct-v0.3.tar | `80b71fcb6416085bcb4efad86dfb4d52` |
| 7B Instruct | https://models.mistralcdn.com/mistral-7b-v0-3/mistral-7B-Instruct-v0.3.tar | `80b71fcb6416085bcb4efad86dfb4d52` |
| 8x7B Instruct | https://models.mistralcdn.com/mixtral-8x7b-v0-1/Mixtral-8x7B-v0.1-Instruct.tar (**Updated model coming soon!**) | `8e2d3930145dc43d3084396f49d38a3f` |
| 8x22B Instruct | https://models.mistralcdn.com/mixtral-8x22b-v0-3/mixtral-8x22B-Instruct-v0.3.tar | `471a02a6902706a2f1e44a693813855b` |
| 7B Base | https://models.mistralcdn.com/mistral-7b-v0-3/mistral-7B-v0.3.tar | `0663b293810d7571dad25dae2f2a5806` |
| 8x7B | **Updated model coming soon!** | - |
| 8x22B | https://models.mistralcdn.com/mixtral-8x22b-v0-3/mixtral-8x22B-v0.3.tar | `a2fa75117174f87d1197e3a4eb50371a` |
| Codestral 22B | https://models.mistralcdn.com/codestral-22b-v0-1/codestral-22B-v0.1.tar | `a5661f2f6c6ee4d6820a2f68db934c5d` |

Note:
- **Important**:
- `mixtral-8x22B-Instruct-v0.3.tar` is exactly the same as [Mixtral-8x22B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1), only stored in `.safetensors` format
- `mixtral-8x22B-v0.3.tar` is the same as [Mixtral-8x22B-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-v0.1), but has an extended vocabulary of 32768 tokens.
- `codestral-22B-v0.1.tar` has a custom non-commercial license, called [Mistral AI Non-Production (MNPL) License](https://mistral.ai/licences/MNPL-0.1.md)
- All of the models listed above support function calling. For example, Mistral 7B Base/Instruct v3 is a minor update to Mistral 7B Base/Instruct v2, with the addition of function calling capabilities.
- The "coming soon" models will include function calling as well.
- You can download the previous versions of our models from our [docs](https://docs.mistral.ai/getting-started/open_weight_models/#downloading).
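
The `md5sum` column can be used to check a finished download before extracting it. Below is a minimal sketch in Python, assuming the archive was saved under its original file name in the current directory. The helper name `md5_of_file` is ours for illustration and is not part of `mistral-inference`.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so multi-GB archives are never loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("codestral-22B-v0.1.tar")  # assumed download location
    expected = "a5661f2f6c6ee4d6820a2f68db934c5d"  # md5sum from the table above
    if archive.exists():
        actual = md5_of_file(archive)
        print("OK" if actual == expected else f"MISMATCH: {actual}")
    else:
        print(f"{archive} not found")
```

Reading in chunks keeps memory use constant regardless of archive size, which matters for the larger checkpoints.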
@@ -77,7 +80,7 @@ tar -xf Mixtral-8x7B-v0.1-Instruct.tar -C $M8x7B_DIR

## Usage

The following sections give an overview of how to run the model from the Command-line interface or from Python.
The following sections give an overview of how to run the model from the Command-line interface (CLI) or directly within Python.

### CLI

@@ -115,6 +118,38 @@ torchrun --nproc-per-node 2 --no-python mistral-chat $M8x7B_DIR --instruct

*Note*: Increase `--nproc-per-node` to use more GPUs if necessary (*e.g.* for 8x22B).

- **Chat as Code Assistant**

To use [Codestral](https://mistral.ai/news/codestral/) as a coding assistant, run the following command using `mistral-chat`.
Make sure `$M22B_CODESTRAL` is set to a valid path to the downloaded Codestral folder, *e.g.* `$HOME/mistral_models/Codestral-22B-v0.1`.

```sh
mistral-chat $M22B_CODESTRAL --instruct --max_tokens 256
```
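
If you launch the command from a script, a small pre-flight check can catch an unset or mistyped `$M22B_CODESTRAL` early. The sketch below only assumes that the variable should name an existing directory. The function name `resolve_model_dir` is hypothetical, and the check does not inspect the folder's contents.

```python
import os
from pathlib import Path


def resolve_model_dir(var_name="M22B_CODESTRAL"):
    """Return the directory named by an environment variable, or raise."""
    raw = os.environ.get(var_name)
    if not raw:
        raise RuntimeError(f"${var_name} is not set")
    # Expand a leading "~" in case the variable was set without shell expansion.
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"${var_name} points to {path}, which is not a directory")
    return path
```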

If you prompt it with *"Write me a function that computes fibonacci in Rust"*, the model should generate something along the following lines:

```sh
Sure, here's a simple implementation of a function that computes the Fibonacci sequence in Rust. This function takes an integer `n` as an argument and returns the `n`th Fibonacci number.

fn fibonacci(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn main() {
    let n = 10;
    println!("The {}th Fibonacci number is: {}", n, fibonacci(n));
}

This function uses recursion to calculate the Fibonacci number. However, it's not the most efficient solution because it performs a lot of redundant calculations. A more efficient solution would use a loop to iteratively calculate the Fibonacci numbers.
```

You can continue chatting afterwards, *e.g.* with *"Translate it to Python"*.
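
For reference, the loop-based approach mentioned at the end of the sample answer could look like the following in Python. This is a hand-written illustration, not actual model output, and the model's own translation may differ.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number iteratively, with fibonacci(0) == 0."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    # After k iterations, a holds the k-th Fibonacci number.
    for _ in range(n):
        a, b = b, a + b
    return a


print(fibonacci(10))  # 55
```

Unlike the recursive version, this runs in linear time because each value is computed once.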

### Python

- *Instruction Following*:
@@ -183,6 +218,38 @@ result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
print(result)
```

- *Fill-in-the-middle (FIM)*:

Make sure to have `mistral-common >= 1.2.0` installed:
```
pip install --upgrade mistral-common
```

You can simulate code in-filling, where the model completes the code between a given prefix and suffix, as follows.

```py
from mistral_inference.model import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.request import FIMRequest

# Load the Codestral tokenizer and the weights from the extracted model folder.
tokenizer = MistralTokenizer.from_model("codestral-22b")
model = Transformer.from_folder("./mistral_22b_codestral")

# The model is asked to generate the code that belongs between prefix and suffix.
prefix = """def add("""
suffix = """ return sum"""

request = FIMRequest(prompt=prefix, suffix=suffix)

tokens = tokenizer.encode_fim(request).tokens

# temperature=0.0 keeps the completion deterministic.
out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])

# Keep only the text that precedes the first occurrence of the suffix.
middle = result.split(suffix)[0].strip()
print(middle)
```
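
The final post-processing step can be tried without loading the model. The sketch below applies the same `split`/`strip` logic to a made-up decoded string. Both the helper name `extract_middle` and the sample text are hypothetical, since the real output depends on the model and tokenizer.

```python
def extract_middle(decoded: str, suffix: str) -> str:
    """Return the text before the first occurrence of `suffix`, stripped."""
    return decoded.split(suffix)[0].strip()


suffix = " return sum"
# Hypothetical decoded output in which the model repeats the suffix.
decoded = "a, b):\n    sum = a + b\n return sum"

print(extract_middle(decoded, suffix))
```

If the suffix never appears in the decoded text, `split` returns the whole string as its only element, so the full completion is kept.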

### One-file-ref

If you want a self-contained implementation, look at `one_file_ref.py`, or run it with