Skip to content

Latest commit

 

History

History
143 lines (112 loc) · 5.37 KB

README.md

File metadata and controls

143 lines (112 loc) · 5.37 KB

Server

A cargo workspace which contains bleep, the Rust package which powers bloop's search and code navigation.

Setup

Install

Dependencies:

  • rust
  • openssl
  • onnxruntime
  • universal-ctags

Follow these instructions and verify the installation with ctags --version. You should see something like this:

Universal Ctags 5.9.0(6f10588d), Copyright (C) 2015-2022 Universal Ctags Team
Universal Ctags is derived from Exuberant Ctags.
Exuberant Ctags 5.8, Copyright (C) 1996-2009 Darren Hiebert
  Compiled: Aug 19 2022, 11:23:10
  URL: https://ctags.io/
  Optional compiled features: +wildcards, +regex, +gnulib_fnmatch, +gnulib_regex, +iconv, +option-directory, +xpath, +json, +interactive, +yaml, +case-insensitive-filenames, +packcc, +optscript

Make sure that +json is in the list of compiled features.

Natural Language Answers

To execute natural language queries bleep needs to connect to a instance of Qdrant, a vector search database. You can start Qdrant on port 6334 (where bleep expects it by default) with:

docker run -p 6333:6333 -p 6334:6334 \
    -e QDRANT__SERVICE__GRPC_PORT="6334" \
    qdrant/qdrant

You can run bleep without linking it to a Qdrant instance but calls to the /answer endpoint will return an error.

You'll also need to run a local instance of answer_api (bleep expects it on port 7879 by default) which handles requests to the explanation API.

Build

cargo build -p bleep --release

Usage

To index and search all the repos in a directory (say, /path/to/source) run (from this repo's root dir):

$ cargo run -p bleep --release -- \
  --source-dir /path/to/dir

bloop will recursively scan /path/to/source for repositories and start indexing them. It will also start a webserver. The location of the search index can be specified with the --index-dir parameter. By default it is stored in the system cache.

bloop periodically checks for changes to local and remote repos and automatically reindexes if a change is detected. Indexing and polling can be disabled by passing the --disable-background and --disable-fsevents flags.

The log level can be customized by setting the BLOOP_LOG env var.

Sync GitHub

To sync GitHub repos, first create a GitHub Client ID. Then call bleep with the --github-client-id <token> parameter.

Arguments

Options:
  -c, --config-file <CONFIG_FILE>
          If a config file is given, it will override _all_ command line parameters!
  -d, --source-dir <DIRECTORY>
          Directory where repositories are located
  -s, --state-file <STATE_FILE>
          State file for all repositories
      --credentials <CREDENTIALS>
          Credentials store for external providers
  -v, --version-file <VERSION_FILE>
          Version of the current schema
  -i, --index-dir <INDEX_DIR>
          Directory to store indexes [default: /Users/gabriel/Library/Caches/ai.bloop.bleep]
      --index-only
          Quit after indexing the specified repos
      --disable-background
          Disable periodic reindexing, and `git pull` on remote repositories
      --disable-fsevents
          Disable system-native notification backends to detect new git commits immediately
  -b, --buffer-size <BUFFER_SIZE>
          Size of memory to use for file indexes [default: 100000000]
  -r, --repo-buffer-size <REPO_BUFFER_SIZE>
          Size of memory to use for repo indexes [default: 30000000]
  -m, --max-threads <MAX_THREADS>
          Maximum number of parallel background threads [default: 8]
      --host <HOST>
          Bind the webserver to `<port>` [default: 127.0.0.1]
      --port <PORT>
          Bind the webserver to `<host>` [default: 7878]
      --qdrant-url <QDRANT_URL>
          URL for the qdrant server [default: http://127.0.0.1:6334]
      --answer-api-url <ANSWER_API_URL>
          URL for the answer-api [default: http://127.0.0.1:7879]
      --model-dir <MODEL_DIR>
          Path to the embedding model directory [default: model]
      --github-client-id <GITHUB_CLIENT_ID>
          Github Client ID for OAuth connection to private repos
      --segment-key <SEGMENT_KEY>
          Segment write key
      --max-chunk-tokens <MAX_CHUNK_TOKENS>
          Maximum number of tokens in a chunk (should be the model's input size) [default: 256]
      --overlap <OVERLAP>
          Chunking strategy [possible values: 1, 50%]
  -h, --help
          Print help information
  -V, --version
          Print version information

Query

With the server running you can start searching your code:

$ curl -v "localhost:7878/q?q=anyhow%20path:webserver%20repo:bloop" | jq

You can get answers in natural language by querying the answer endpoint:

curl "localhost:7878/answer?q=what%20does%20the%20query%20parser%20do?"

You can check which repos are indexed and their status:

$ curl -v "localhost:7878/api/repos/indexed" | jq

OpenAPI

You can view OpenAPI documentation (and railroad diagrams for the query language) here.

bleep uses utoipa to automatically generate an OpenAPI spec. To can manually access the OpenAPI spec with this endpoint:

$ curl "localhost:7878/api-doc/openapi.yaml"