Commit
[AI] API for defining pools of AI related endpoints (#10100)
Co-authored-by: soloio-bulldozer[bot] <48420018+soloio-bulldozer[bot]@users.noreply.github.com>
Co-authored-by: Shashank Ram <21697719+shashankram@users.noreply.github.com>
3 people authored Sep 26, 2024
1 parent c46ed68 commit 2a290f0
Showing 22 changed files with 2,138 additions and 452 deletions.
5 changes: 5 additions & 0 deletions changelog/v1.18.0-beta23/ai-prioritized-endpoints.yaml
@@ -0,0 +1,5 @@
changelog:
  - type: NEW_FEATURE
    issueLink: https://github.com/solo-io/solo-projects/issues/6957
    description: >-
      Add an API to allow configuring prioritized pools of LLM backends.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

13 changes: 9 additions & 4 deletions install/helm/gloo/crds/gateway.solo.io_v1_RouteOption.yaml
@@ -35,10 +35,6 @@ spec:
properties:
ai:
properties:
backupModels:
items:
type: string
type: array
defaults:
items:
properties:
@@ -149,6 +145,9 @@ spec:
promptTemplate:
type: string
type: object
routeType:
type: string
x-kubernetes-int-or-string: true
semanticCache:
properties:
datastore:
@@ -1573,6 +1572,12 @@ spec:
nullable: true
type: integer
type: object
retriableStatusCodes:
items:
maximum: 4294967295
minimum: 0
type: integer
type: array
retryBackOff:
properties:
baseInterval:
13 changes: 9 additions & 4 deletions install/helm/gloo/crds/gateway.solo.io_v1_RouteTable.yaml
@@ -145,10 +145,6 @@ spec:
properties:
ai:
properties:
backupModels:
items:
type: string
type: array
defaults:
items:
properties:
@@ -259,6 +255,9 @@ spec:
promptTemplate:
type: string
type: object
routeType:
type: string
x-kubernetes-int-or-string: true
semanticCache:
properties:
datastore:
@@ -1683,6 +1682,12 @@ spec:
nullable: true
type: integer
type: object
retriableStatusCodes:
items:
maximum: 4294967295
minimum: 0
type: integer
type: array
retryBackOff:
properties:
baseInterval:
Original file line number Diff line number Diff line change
@@ -1342,6 +1342,12 @@ spec:
nullable: true
type: integer
type: object
retriableStatusCodes:
items:
maximum: 4294967295
minimum: 0
type: integer
type: array
retryBackOff:
properties:
baseInterval: