Envoy to pass hits_addend to RateLimitService #12969

wwillsey · 2020-09-03T15:45:42Z

Description

The RLS v3 API describes the RateLimitService as able to ingest a hits_addend field that determines the number of tokens to use for the rate limiting request.
Envoy should provide a way to extract a value from a request header (or some other mechanism) to populate this field on a per-request basis. If hits_addend is only static, then it is effectively the same as modifying the rate limit.

Use case

In the HTTP Rate Limit Filter, allow configuring a request header that contains an integer hits_addend value to send with the rate limit request, allowing for greater configurability of rate limiting capabilities.

wwillsey · 2020-09-03T15:47:08Z

Hey @mattklein123, I've created this issue to follow up on envoyproxy/ratelimit#167. Please let me know if you think any more details would be helpful.

Thanks!

mattklein123 · 2020-09-03T16:04:48Z

Yeah this makes sense to me. Marking help wanted.

medalliaerlich · 2021-01-28T12:51:04Z

any news regarding this?

sc0ttbeardsley · 2022-11-01T17:39:04Z

Pinterest is interested in this also. cc @fishcakez @JuniorHsu

lizzzcai · 2023-07-05T04:13:41Z

Hi, any news regarding this? We would like to use it to limit tokens per minute for the LLM use case, as LLM APIs are usually limited by tokens per minute rather than requests per second.

PeterL328 · 2023-08-18T22:03:34Z

Related work.
I updated the ratelimit client to support the hits_addend field (#28939). Some extra work would be required so users can configure ratelimit sidecar to send hits_addend

PeterL328 · 2023-08-18T22:04:48Z

@lizzzcai In case you are using the OpenAI API, I think they limit on request token + response token. So further work would be required either in the ratelimit filter or another new filter so the response token can be sent to the ratelimit sidecar on the response flow.

lizzzcai · 2023-08-24T09:00:47Z

Hi @PeterL328, thanks for the update. I will follow your other PR for progress.

> In case you are using the OpenAI API, I think they limit on request token + response token.

In our case, we are using Azure OpenAI. However, I think the limit is not on the response tokens, at least for Azure OpenAI. For our case we use the prompt text tokens + max_tokens (the maximum number of tokens that may be returned) in the request.

Reference: Azure OpenAI

As each request is received, Azure OpenAI computes an estimated max processed-token count that includes the following:

Prompt text and count
The max_tokens parameter setting
The best_of parameter setting

As requests come into the deployment endpoint, the estimated max-processed-token count is added to a running token count of all requests that is reset each minute. If at any time during that minute, the TPM rate limit value is reached, then further requests will receive a 429 response code until the counter resets.

PeterL328 · 2023-09-07T01:48:58Z

Hi @lizzzcai,
We use the OpenAI API and also Azure OpenAI. I believe both report the total tokens consumed (request + response tokens) in the response body.

Yeah, you can use max_tokens as a stand-in for the response, but it will not be accurate if you plan to track actual usage.

EItanya · 2024-05-16T02:05:38Z

I have opened #34184 as a potential solution to setting hits_addend in an unobtrusive way.

zirain · 2024-08-08T13:42:20Z

After #34184 merged, are we able to close this?

OS-ramamurtisubramanian · 2024-08-13T04:56:10Z

Hi @EItanya, I'm trying to use the hits addend with istio. Can you please provide me an example of how to configure this as an EnvoyFilter?

I was trying to use the set filter state filter to set the envoy.ratelimit.hits_addend filter state from a request header, but it was not working.

I get the following error.

Error adding/updating listener(s) virtualInbound: 'envoy.ratelimit.hits_addend' does not have an object factory.

zirain · 2024-08-13T06:37:47Z

please use master branch

OS-ramamurtisubramanian · 2024-08-14T04:54:26Z

Hi @zirain, I managed to build and use the pilot and proxyv2 images of Istio from the master branch.

I am trying to create the EnvoyFilter objects.
envoyfilter_hits_addend.txt

Is this the correct way to set the envoy.ratelimit.hits_addend filter state from a request header called hits, before the rate limit filter?
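
For reference, a minimal sketch of what this could look like in plain Envoy terms. It is not the attached EnvoyFilter, whose contents are not shown in this thread. It assumes a build that includes #34184, that the set_filter_state filter sits ahead of the rate limit filter in the chain, and that the value is read from the request header with the %REQ(...)% format operator. The header name hits is taken from the question above. The snippet is only a fragment of http_filters, and behavior when the header is absent or non-numeric is not covered here.

```yaml
http_filters:
# Assumption: this filter runs before envoy.filters.http.ratelimit,
# so the filter state is already set when the rate limit request is built.
- name: envoy.filters.http.set_filter_state
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.set_filter_state.v3.Config
    on_request_headers:
    - object_key: envoy.ratelimit.hits_addend
      format_string:
        text_format_source:
          # Reads the "hits" request header; the value is expected to be an unsigned integer.
          inline_string: "%REQ(hits)%"
- name: envoy.filters.http.ratelimit
  # typed_config intentionally omitted in this sketch; use your existing rate limit filter settings.
```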

zirain · 2024-08-14T06:55:30Z

Be careful about inserting a filter based on something that is created by another EnvoyFilter.

gcalmettes · 2024-10-08T16:56:00Z

Seeing the same problem as the one described by @OS-ramamurtisubramanian on the latest v1.31.2, when trying to use the envoy.filters.http.set_filter_state filter to set the envoy.ratelimit.hits_addend state key.

          - name: envoy.filters.http.set_filter_state
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.set_filter_state.v3.Config
              on_request_headers:
              - object_key: envoy.ratelimit.hits_addend
                format_string:
                  text_format_source:
                    inline_string: "0"

Error log is:

[main] [source/server/server.cc:412] error initializing config '  /etc/envoy/envoy.yaml': 'envoy.ratelimit.hits_addend' does not have an object factory

Is there another configuration to add?

zirain · 2024-10-08T22:59:49Z

I cannot recall, but can you give it a try with the main branch?

gcalmettes · 2024-10-09T16:16:17Z

@zirain I just tried using a freshly built envoy binary from the main branch.

> ./envoy --version          

./envoy  version: 51e253405a2be7f94df8c0ba78bd884dc79bb8a5/1.32.0-dev/Modified/DEBUG/BoringSSL

Configuration tested:

admin:
  address:
    socket_address: { address: 127.0.0.1, port_value: 9901 }

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 127.0.0.1, port_value: 10000 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: some_service }
          http_filters:
          - name: envoy.filters.http.set_filter_state
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.set_filter_state.v3.Config
              on_request_headers:
              - object_key: envoy.ratelimit.hits_addend
                format_string:
                  text_format_source:
                    inline_string: "0"
          - name: envoy.filters.http.ratelimit
            typed_config:
              '@type': type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
              domain: rpm
              enable_x_ratelimit_headers: DRAFT_VERSION_03
              failure_mode_deny: false
              rate_limit_service:
                grpc_service:
                  envoy_grpc:
                    cluster_name: ratelimit
                transport_api_version: V3
              rate_limited_as_resource_exhausted: true
              request_type: external
              stage: 0
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  clusters:
  - name: some_service
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: some_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 1234
  - name: ratelimit
    connect_timeout: 1s
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: ratelimit
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 5001

same error:

[2024-10-09 18:14:52.204][619209][info][main] [source/server/server.cc:871] runtime: {}
[2024-10-09 18:14:52.206][619209][info][admin] [source/server/admin/admin.cc:65] admin address: 127.0.0.1:9901
[2024-10-09 18:14:52.209][619209][info][config] [source/server/configuration_impl.cc:168] loading tracing configuration
[2024-10-09 18:14:52.209][619209][info][config] [source/server/configuration_impl.cc:124] loading 0 static secret(s)
[2024-10-09 18:14:52.209][619209][info][config] [source/server/configuration_impl.cc:130] loading 2 cluster(s)
[2024-10-09 18:14:52.231][619209][info][config] [source/server/configuration_impl.cc:138] loading 1 listener(s)
[2024-10-09 18:14:52.241][619209][info][config] [source/server/configuration_impl.cc:168] loading tracing configuration
[2024-10-09 18:14:52.241][619209][info][config] [source/server/configuration_impl.cc:124] loading 0 static secret(s)
[2024-10-09 18:14:52.241][619209][info][config] [source/server/configuration_impl.cc:130] loading 2 cluster(s)
[2024-10-09 18:14:52.258][619209][info][config] [source/server/configuration_impl.cc:138] loading 1 listener(s)
[2024-10-09 18:14:52.266][619209][critical][main] [source/server/server.cc:412] error initializing config '  envoy-basic.yaml': 'envoy.ratelimit.hits_addend' does not have an object factory
[2024-10-09 18:14:52.268][619209][info][main] [source/server/server.cc:1042] exiting
'envoy.ratelimit.hits_addend' does not have an object factory

zirain · 2024-10-09T23:17:00Z

I'm not sure how you built it; I cannot reproduce it on my machine.

bazel build envoy
cp bazel-bin/source/exe/envoy-static /usr/local/bin/envoy-dev
envoy-dev -c envoy.yaml

gcalmettes · 2024-10-10T13:29:56Z

@zirain, sorry, I must have missed something in my first build (I was using the docker script provided). Trying with your command indeed works. Thank you!
It's very useful to set a different hitsAddend value per filter for different domains when multiple ratelimit filters are chained.

mattklein123 added area/ratelimit help wanted Needs help! labels Sep 3, 2020

guicassolato mentioned this issue Feb 28, 2023

RateLimitPolicy v2 Kuadrant/architecture#8

Merged

lizzzcai mentioned this issue Aug 24, 2023

Send ratelimit request to ratelimit service on response flow with hits_addend #29161

Closed
