[AI] Semantic Cache API Updates, (mode, score_threshold) #9931

EItanya · 2024-08-22T13:53:04Z

Description

AI Semantic caching API updates

API changes

Added a mode field to semantic caching to allow the user to selectively add new entries to the cache.
Added a score_threshold field to semantic caching to allow the user to decide how similar a prompt must be to a cached prompt for caching to trigger.
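
As a rough illustration of how a score_threshold could gate cache hits, the sketch below treats the score as a cosine distance between prompt embeddings, so a lower value means more similar prompts (matching the field comment in ai.proto quoted in the review). The function names, the choice of cosine distance, and the in-memory list of entries are assumptions for illustration only, not the extension's actual implementation.

```python
import math
from typing import List, Optional, Tuple

Vector = List[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: a zero vector is treated as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def lookup(entries: List[Tuple[Vector, str]], query: Vector,
           score_threshold: float) -> Optional[str]:
    """Return the cached response closest to the query, or None on a miss.

    A hit requires distance <= score_threshold, so a lower threshold
    demands more similar prompts.
    """
    best: Optional[Tuple[float, str]] = None
    for embedding, response in entries:
        distance = cosine_distance(embedding, query)
        if distance <= score_threshold and (best is None or distance < best[0]):
            best = (distance, response)
    return best[1] if best is not None else None
```

With a single cached embedding `[1.0, 0.0]`, a query of `[1.0, 1.0]` sits at a distance of roughly 0.29, so it would hit with a threshold of 0.5 and miss with a threshold of 0.1.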

solo-changelog-bot · 2024-08-22T15:26:10Z

Issues linked to changelog:
https://github.com/solo-io/solo-projects/issues/6783

shashankram · 2024-08-22T15:33:46Z

projects/gloo/api/v1/enterprise/options/ai/ai.proto

+    // two queries need to be in order to return a cached result.
+    // The lower the number, the more similar the queries need to be for a cache hit.
+    //
+    // +kubebuilder:validation:Minimum=0


Our code gen via solo-kit likely isn't enabling kubebuilder markers; we may want to do that.

That's a much bigger issue. I agree with you, but probably not for right now.

shashankram · 2024-08-22T15:34:30Z

projects/gloo/api/v1/enterprise/options/ai/ai.proto

+  enum Mode {
+    // Read and write to the cache as a part of the request/response lifecycle
+    READ_WRITE = 0;
+    // Only read from the cache, do not write to it. Data will be written to the cache outside the request/response cycle.


nit: drop the period or add it to all comments for consistency
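
For readers skimming the thread, the difference between the two modes can be sketched as below. This is a toy exact-match cache, not the semantic cache itself; the class name, the `handle` method, and the numeric value of READ_ONLY are assumptions (the diff excerpt only shows `READ_WRITE = 0`).

```python
from enum import Enum
from typing import Callable, Dict


class Mode(Enum):
    READ_WRITE = 0  # value taken from the diff excerpt above
    READ_ONLY = 1   # assumed value; the excerpt does not show it


class PromptCache:
    """Toy exact-match cache; a real semantic cache would match on embeddings."""

    def __init__(self, mode: Mode) -> None:
        self.mode = mode
        self.entries: Dict[str, str] = {}

    def handle(self, prompt: str, call_llm: Callable[[str], str]) -> str:
        """Serve from the cache when possible, otherwise call the model."""
        if prompt in self.entries:
            return self.entries[prompt]
        response = call_llm(prompt)
        if self.mode is Mode.READ_WRITE:
            # Only READ_WRITE stores new entries during the request.
            self.entries[prompt] = response
        return response
```

In READ_ONLY mode the cache is only ever populated from outside `handle`, which mirrors the enum comment that data is written outside the request/response cycle.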

shashankram · 2024-08-22T15:35:25Z

projects/gloo/api/v1/enterprise/options/ai/ai.proto

  // Which data store to use
  DataStore datastore = 1;
  // Model to use to get embeddings for prompt
  Embedding embedding = 2;
  // Time before data in the cache is considered expired
  uint32 ttl = 3;
+  // Cache mode to use: READ_WRITE or READ_ONLY


Maybe add another comment line on what it does, similar to the enum description.

Why? If a user wants more detail they can click into the enum; this is just meant as high level.

Makes it easier in the doc to not click another link, nothing more.

github-actions · 2024-08-22T15:40:20Z

Visit the preview URL for this PR (updated for commit 0dc82d3):

https://gloo-edge--pr9931-eitanya-ai-cache-v2-tvom25a5.web.app

(expires Mon, 02 Sep 2024 20:06:05 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: 77c2b86e287749579b7ff9cadb81e099042ef677

shashankram · 2024-08-22T16:58:44Z

projects/gateway2/helm/gloo-gateway/templates/gateway/proxy-deployment.yaml

@@ -249,6 +249,8 @@ spec:
 {{- if (($gateway.aiExtension).enabled) }}
      - name: gloo-ai-extension
        image: "{{ template "gloo-gateway.gateway.image" $gateway.aiExtension.image }}"
+        args:
+        - "extproc"


The entire spec is opaque, so perhaps this should be exposed via GatewayParameters as well. However, I would rather not make this change; use an env var instead and let the process read that.

This is still opaque. Essentially, I wanted to re-use the program rather than creating a new image for the apiserver. If we make this configurable the user can accidentally not include this, but it's necessary 100% of the time if this is enabled, so it belongs here. Really, this entire spec should be located alongside the code rather than in a different repo.

It isn't, because "extproc" doesn't make sense without knowing about the container being used and its implementation. Why not use an env var, since it lets you do exactly what you are intending?

What do you mean by "use an env var"? Where would we use one that's opaque to the user? Putting it in the Params requires the user to put it in the params, but I specifically want to avoid the user having to do anything. I can name it sidecar if extproc is specifically bad in this case, but I don't think that's what you're saying.

You can default the impl to use a sidecar and we can set an env var when running as an apiservice?
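
The suggestion in the last reply could look roughly like the sketch below: the program defaults to sidecar (extproc) behaviour and switches only when an environment variable is set, so the Helm template needs no extra args. The variable name `AI_EXTENSION_RUN_MODE` and the mode strings are hypothetical; nothing in this PR defines them.

```python
import os
from typing import Mapping, Optional

DEFAULT_RUN_MODE = "extproc"                 # sidecar behaviour when nothing is set
ALLOWED_RUN_MODES = ("extproc", "apiserver")  # hypothetical mode names
RUN_MODE_ENV_VAR = "AI_EXTENSION_RUN_MODE"   # hypothetical variable name


def resolve_run_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the run mode from the environment, defaulting to the sidecar."""
    env = os.environ if env is None else env
    raw = env.get(RUN_MODE_ENV_VAR, "").strip().lower()
    mode = raw or DEFAULT_RUN_MODE  # unset or empty falls back to the default
    if mode not in ALLOWED_RUN_MODES:
        raise ValueError(
            f"{RUN_MODE_ENV_VAR} must be one of {ALLOWED_RUN_MODES}, got {raw!r}"
        )
    return mode
```

Under this arrangement the sidecar container needs no configuration at all, and only the apiserver deployment would set the variable.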

Co-authored-by: Shashank Ram <shashank.ram@solo.io> Co-authored-by: soloio-bulldozer[bot] <48420018+soloio-bulldozer[bot]@users.noreply.github.com>

shashankram and others added 2 commits August 8, 2024 13:44

WIP: cache proto updates

ce6a0bb

Merge branch 'main' of https://github.com/solo-io/gloo into eitanya/a…

7fe0ae4

…i-cache-v2

github-actions bot added the "keep pr updated" (signals bulldozer to keep pr up to date with base branch) and "work in progress" (signals bulldozer to keep pr open, don't auto-merge) labels Aug 22, 2024

Merge main into eitanya/ai-cache-v2

09a5e8b

EItanya changed the title ~~[AI] Semantic Cache API Updates~~ [AI] Semantic Cache API Updates, (mode, score_threshold) Aug 22, 2024

EItanya marked this pull request as ready for review August 22, 2024 15:19

EItanya requested a review from a team as a code owner August 22, 2024 15:19

EItanya added 2 commits August 22, 2024 15:25

changelog

dede543

Merge branch 'eitanya/ai-cache-v2' of https://github.com/solo-io/gloo …

2c794d5

…into eitanya/ai-cache-v2

shashankram approved these changes Aug 22, 2024

add argument to the ai-extension container to run extproc

bd9e6fb

shashankram reviewed Aug 22, 2024

soloio-bulldozer bot and others added 8 commits August 22, 2024 17:06

Merge refs/heads/main into eitanya/ai-cache-v2

907d74b

Merge refs/heads/main into eitanya/ai-cache-v2

d7a78bf

Merge refs/heads/main into eitanya/ai-cache-v2

7bcab48

Merge refs/heads/main into eitanya/ai-cache-v2

983949b

revert changing proxy deployment

683515c

Merge branch 'eitanya/ai-cache-v2' of https://github.com/solo-io/gloo …

f701108

…into eitanya/ai-cache-v2

Merge refs/heads/main into eitanya/ai-cache-v2

ec1394f

Merge refs/heads/main into eitanya/ai-cache-v2

0539f7d

sam-heilbron approved these changes Aug 23, 2024

soloio-bulldozer bot added 4 commits August 23, 2024 21:10

Merge refs/heads/main into eitanya/ai-cache-v2

eb35536

Merge refs/heads/main into eitanya/ai-cache-v2

66f896a

Merge refs/heads/main into eitanya/ai-cache-v2

13a3e09

Merge refs/heads/main into eitanya/ai-cache-v2

0dc82d3

EItanya merged commit 34b8664 into main Aug 26, 2024
18 checks passed

EItanya deleted the eitanya/ai-cache-v2 branch August 26, 2024 20:32
