
ggml : add RPC backend #6829

Merged: 19 commits on May 14, 2024
readme : trim trailing whitespace
rgerganov committed May 14, 2024
commit df54adabeade8d4b909cea09dddda4cf3d100c83
examples/rpc/README.md: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
## Overview

-The `rpc-server` allows running a `ggml` backend on a remote host. 
+The `rpc-server` allows running a `ggml` backend on a remote host.
The RPC backend communicates with one or several instances of `rpc-server` and offloads computations to them.
This can be used for distributed LLM inference with `llama.cpp` in the following way:
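As a rough end-to-end sketch of that flow (hostnames, the model path, and generation flags below are placeholders, and it assumes the `--rpc` client option described in the full README rather than anything shown in this diff):

```bash
# On each remote host, expose a local ggml backend over TCP:
host-a$ bin/rpc-server 0.0.0.0 50052
host-b$ bin/rpc-server 0.0.0.0 50052

# On the main host, point llama.cpp at the servers so layers
# are offloaded to them:
main$ bin/main -m model.gguf -p "Hello, my name is" -n 64 \
      --rpc host-a:50052,host-b:50052 -ngl 99
```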

@@ -37,9 +37,9 @@ cd build-rpc-cuda
cmake .. -DLLAMA_CUDA=ON -DLLAMA_RPC=ON
make -j
```

Then, start the `rpc-server` with the backend:

```bash
$ bin/rpc-server 0.0.0.0 50052
create_backend: using CUDA backend
```
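As a quick sanity check of the step above (a sketch, not part of this diff; the model path is a placeholder), the server can be bound to loopback and a client run on the same machine:

```bash
# Server bound to localhost only, running in the background:
$ bin/rpc-server 127.0.0.1 50052 &

# Client on the same machine, offloading all layers to the server:
$ bin/main -m model.gguf -p "Hello, my name is" -n 64 \
    --rpc 127.0.0.1:50052 -ngl 99
```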