OpenAssistant Inference

Preliminary implementation of the inference engine for OpenAssistant. This is intended strictly for local development, though you may have limited success using it to self-host OA. There is no guarantee that this will not change in the future; in fact, expect it to change.

Development Variant 1 (docker compose)

The services of the inference stack are prefixed with "inference-" in the unified compose file.
Before building them, make sure Docker's BuildKit backend is enabled. See the FAQ for more info.
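
BuildKit is the default in recent Docker releases; if your setup predates that, it can be enabled per invocation with a standard Docker environment variable (this is generic Docker behavior, not a project-specific mechanism):

DOCKER_BUILDKIT=1 docker compose --profile inference build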

To build the services, run:

docker compose --profile inference build

Spin up the stack:

docker compose --profile inference up -d

Tail the logs:

docker compose logs -f    \
    inference-server      \
    inference-worker

Note: The compose file contains bind mounts that let you develop on the modules of the inference stack, and on the oasst-shared package, without rebuilding.
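
For orientation, such a mount looks roughly like this in docker-compose.yaml; the paths below are illustrative, the actual entries live in the repo's compose file:

inference-server:
  volumes:
    - ./inference/server:/opt/inference/server  # illustrative paths only
    - ./oasst-shared:/opt/oasst-shared          # illustrative paths only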

Note: You can change the model by editing the MODEL_CONFIG_NAME variable in the docker-compose.yaml file. Valid model names can be found in model_configs.py.
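
The relevant entry looks roughly like this (illustrative excerpt; <model-name> is a placeholder for one of the names from model_configs.py):

inference-worker:
  environment:
    - MODEL_CONFIG_NAME=<model-name>  # pick a name from model_configs.py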

Note: You can spin up any number of workers by adjusting the number of replicas of the inference-worker service to your liking.
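
One way to do that without editing the compose file is docker compose's standard --scale flag; for example, to run two workers:

docker compose --profile inference up -d --scale inference-worker=2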

Note: Please wait for the inference-text-generation-server service to output {"message":"Connected"} before starting to chat.
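
You can watch for that message with the same logs command as above:

docker compose logs -f inference-text-generation-server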

Run the text client and start chatting:

cd text-client
pip install -r requirements.txt
python __main__.py
# You'll soon see a `User:` prompt, where you can type your prompts.

Distributed Testing

We run distributed load tests using the locust Python package.

pip install locust
cd tests/locust
locust

Navigate to http://0.0.0.0:8089/ to view the locust UI.
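
If you want to adapt the load test, a locust user is an ordinary Python class. Below is a minimal, self-contained sketch that hits /openapi.json (an endpoint the server is known to expose, see the API Docs section below); the project's real tasks live in tests/locust:

from locust import HttpUser, task

class InferenceUser(HttpUser):
    # Minimal sketch only; the actual load-test tasks are defined
    # in tests/locust, not here.
    @task
    def fetch_openapi(self):
        self.client.get("/openapi.json")

Point it at the inference server with, for example, locust -f locustfile.py --host http://localhost:8000.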

API Docs

To update the API docs, run the command below once the inference server is running; it downloads the inference OpenAPI JSON into the relevant folder under /docs:

wget localhost:8000/openapi.json -O docs/docs/api/inference-openapi.json

Then make a PR to have the updated docs merged.