diff --git a/rfcs/0000-rlp-v2.md b/rfcs/0000-rlp-v2.md new file mode 100644 index 00000000..58c5f591 --- /dev/null +++ b/rfcs/0000-rlp-v2.md @@ -0,0 +1,1321 @@ +# RFC RateLimitPolicy v2 + +- Feature Name: `rlp-v2` +- Start Date: 2023-02-02 +- RFC PR: [Kuadrant/architecture#8](https://github.com/Kuadrant/architecture/pull/8) +- Issue tracking: N/A + +## Summary +[summary]: #summary + +Proposal of new API for the Kuadrant's `RateLimitPolicy` (RLP) CRD, for improved UX. + +### Motivation +[motivation]: #motivation + +The [`RateLimitPolicy`](https://pkg.go.dev/github.com/kuadrant/kuadrant-operator@v0.2.1/api/v1beta1#RateLimitPolicy) API (v1beta1), particularly its [`RateLimit`](https://pkg.go.dev/github.com/kuadrant/kuadrant-operator@v0.2.1/api/v1beta1#RateLimit) type used in `ratelimitpolicy.spec.rateLimits`, designed in part to fit the underlying implementation based on the Envoy [Rate limit](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/rate_limit_filter) filter, has been proven to be complex, as well as somewhat limiting for the extension of the API for other platforms and/or for supporting use cases of not contemplated in the original design. + +Users of the `RateLimitPolicy` will immediately recognize elements of Envoy's Rate limit API in the definitions of the `RateLimit` type, with almost 1:1 correspondence between the [`Configuration`](https://pkg.go.dev/github.com/kuadrant/kuadrant-operator@v0.2.1/api/v1beta1#Configuration) type and its [counterpart in the Envoy configuration](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-msg-config-route-v3-ratelimit). Although compatibility between those continue to be desired, leaking such implementation details to the level of the API can be avoided to provide a better abstraction for activators ("matchers") and payload ("descriptors"), stated by users in a seamless way. + +Furthermore, the [`Limit`](https://pkg.go.dev/github.com/kuadrant/kuadrant-operator@v0.2.1/api/v1beta1#Limit) type – used as well in the RLP's `RateLimit` type – implies presently a logical relationship between its inner concepts – i.e. conditions and variables on one side, and limits themselves on the other – that otherwise could be shaped in a different manner, to provide clearer understanding of the meaning of these concepts by the user and avoid repetition. I.e., one limit definition contains multiple rate limits, and not the other way around. + +### Goals + +1. Decouple the API from the underlying implementation - i.e. provide a more generic and more user-friendly abstraction +2. Prepare the API for upcoming changes in the Gateway API Policy Attachment specification +3. Improve consistency of the API with respect to Kuadrant's [AuthPolicy](https://pkg.go.dev/github.com/kuadrant/kuadrant-operator@v0.2.1/api/v1beta1#AuthPolicy) CRD - i.e. same language, similar UX + +### Current WIP to consider + +1. Policy attachment update ([kubernetes-sigs/gateway-api#1565](https://github.com/kubernetes-sigs/gateway-api/pull/1565)) +2. *No* merging of policies ([kuadrant/architecture#10](https://github.com/Kuadrant/architecture/pull/10)) +3. A single Policy scoped to HTTPRoutes and HTTPRouteRule ([kuadrant/architecture#4](https://github.com/Kuadrant/architecture/pull/4)) - future +4. Implement `skip_if_absent` for the RequestHeaders action ([kuadrant/wasm-shim#29](https://github.com/Kuadrant/wasm-shim/issues/29)) + +### Highlights + +- `spec.rateLimits[]` replaced with `spec.limits{: }` +- `spec.rateLimits.limits` replaced with `spec.limits..rates` +- `spec.rateLimits.limits.maxValue` replaced with `spec.limits..rates.limit` +- `spec.rateLimits.limits.seconds` replaced with `spec.limits..rates.duration` + `spec.limits..rates.unit` +- `spec.rateLimits.limits.conditions` replaced with `spec.limits..when`, structured field based on _well-known selectors_, mainly for expressing conditions not related to the HTTP route (although not exclusively) +- `spec.rateLimits.limits.variables` replaced with `spec.limits..counters`, based on _well-known selectors_ +- `spec.rateLimits.rules` replaced with `spec.limits..triggers`, for selecting (or "sub-targeting") HTTPRouteRules that trigger the limit +- new matcher `spec.limits..triggers.hostnames[]` +- `spec.rateLimits.configurations` removed – descriptor actions configuration (previously `spec.rateLimits.configurations.actions`) generated from `spec.limits..when.selector` ∪ `spec.limits..counters` and unique identifier of the limit (associated with `spec.limits..triggers`) +- Limitador conditions composed of "soft" `spec.limits..when` conditions + a "hard" condition that binds the limit to its trigger HTTPRouteRules + +For detailed differences between current and vew RLP API, see [Comparison to current RateLimitPolicy](#comparison-to-current-ratelimitpolicy). + +## Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +### Examples of RLPs based on the new API + +Given the following network resources: + +```yaml +apiVersion: gateway.networking.k8s.io/v1alpha2 +kind: Gateway +metadata: + name: istio-ingressgateway + namespace: istio-system +spec: + gatewayClassName: istio + listeners: + - hostname: + - "*.acme.com" +--- +apiVersion: gateway.networking.k8s.io/v1alpha2 +kind: HTTPRoute +metadata: + name: toystore + namespace: toystore +spec: + parentRefs: + - name: istio-ingressgateway + namespace: istio-system + hostnames: + - "*.toystore.acme.com" + rules: + - matches: + - path: + type: PathPrefix + value: "/toys" + method: GET + - path: + type: PathPrefix + value: "/toys" + method: POST + backendRefs: + - name: toystore + port: 80 + - matches: + - path: + type: PathPrefix + value: "/assets/" + backendRefs: + - name: toystore + port: 80 + filters: + - type: ResponseHeaderModifier + responseHeaderModifier: + set: + - name: Cache-Control + value: "max-age=31536000, immutable" +``` + +The following are examples of RLPs targeting the route and the gateway. Each example is independent from the other. + +#### Example 1. Minimal example - network resource targeted entirely without filtering, unconditional and unqualified rate limiting + +In this example, all traffic to `*.toystore.acme.com` will be limited to 5rps, regardless of any other attribute of the HTTP request (method, path, headers, etc), without any extra "soft" conditions (conditions non-related to the HTTP route), across all consumers of the API (unqualified rate limiting). + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore-infra-rl + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + base: # user-defined name of the limit definition - future use for handling hierarchical policy attachment + - rates: # at least one rate limit required + - limit: 5 + unit: second +``` + +
+ How is this RLP implemented under the hood? + + ```yaml + gateway_actions: + - rules: + - paths: ["/toys*"] + methods: ["GET"] + hosts: ["*.toystore.acme.com"] + - paths: ["/toys*"] + methods: ["POST"] + hosts: ["*.toystore.acme.com"] + - paths: ["/assets/*"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-infra-rl/base" + ``` + + ```yaml + limits: + - conditions: + - ratelimit.binding == "toystore/toystore-infra-rl/base" + max_value: 5 + seconds: 1 + namespace: TDB + ``` +
+ +#### Example 2. Targeting specific route rules, with counter qualifiers, multiple rates per limit definition and "soft" conditions + +In this example, a distinct limit will be associated ("bound") to each individual HTTPRouteRule of the targeted HTTPRoute, by using the `triggers` field for selecting (or "sub-targeting") the HTTPRouteRule. + +The following limit definitions will be bound to each HTTPRouteRule: +- `/toys*` → 50rpm, enforced per username (counter qualifier) and only in case the user is not an admin ("soft" condition). +- `/assets/*` → 5rpm / 100rp12h + +Each set of trigger matches in the RLP will be matched to **all** HTTPRouteRules whose HTTPRouteMatches is a superset of the set of trigger matches in the RLP. For every HTTPRouteRule matched, the HTTPRouteRule will be bound to the corresponding limit definition that specifies that trigger. In case no HTTPRouteRule is found containing at least one HTTPRouteMatch that is identical to some set of matching rules of a particular limit definition, the limit definition is considered invalid and reported as such in the status of RLP. + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore-per-endpoint + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + toys: + rates: + - limit: 50 + duration: 1 + unit: minute + counters: + - auth.identity.username + triggers: + - matches: # matches the 1st HTTPRouteRule (i.e. GET or POST to /toys*) + - path: + type: PathPrefix + value: "/toys" + when: + - selector: auth.identity.group + operator: neq + value: admin + + assets: + rates: + - limit: 5 + duration: 1 + unit: minute + - limit: 100 + duration: 12 + unit: hour + triggers: + - matches: # matches the 2nd HTTPRouteRule (i.e. /assets/*) + - path: + type: PathPrefix + value: "/assets/" +``` + +
+ How is this RLP implemented under the hood? + + ```yaml + gateway_actions: + - rules: + - paths: ["/toys*"] + methods: ["GET"] + hosts: ["*.toystore.acme.com"] + - paths: ["/toys*"] + methods: ["POST"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-per-endpoint/toys" + - metadata: + descriptor_key: "auth.identity.group" + metadata_key: + key: "envoy.filters.http.ext_authz" + path: + - segment: + key: "identity" + - segment: + key: "group" + - metadata: + descriptor_key: "auth.identity.username" + metadata_key: + key: "envoy.filters.http.ext_authz" + path: + - segment: + key: "identity" + - segment: + key: "username" + - rules: + - paths: ["/assets/*"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-per-endpoint/assets" + ``` + + ```yaml + limits: + - conditions: + - ratelimit.binding == "toystore/toystore-per-endpoint/toys" + - auth.identity.group != "admin" + variables: + - auth.identity.username + max_value: 50 + seconds: 60 + namespace: TBD + - conditions: + - ratelimit.binding == "toystore/toystore-per-endpoint/assets" + max_value: 5 + seconds: 60 + namespace: TBD + - conditions: + - ratelimit.binding == "toystore/toystore-per-endpoint/assets" + max_value: 100 + seconds: 43200 # 12 hours + namespace: TBD + ``` +
+ +#### Example 3. Targeting a subset of an HTTPRouteRule - HTTPRouteMatch missing + +Consider a 150rps rate limit set on requests to `GET /toys/special`. Such specific application endpoint is covered by the first HTTPRouteRule in the HTTPRoute (as a subset of `GET` or `POST` to any path that starts with `/toys`). However, to avoid binding limits to HTTPRouteRules that are more permissive than the actual intended scope of the limit, the RateLimitPolicy controller requires trigger matches to find identical matching rules explicitly defined amongst the sets of HTTPRouteMatches of the HTTPRouteRules potentially targeted. + +As a consequence, by simply defining a trigger match for `GET /toys/special` in the RLP, the `GET|POST /toys*` HTTPRouteRule will NOT be bound to the limit definition. In order to ensure the limit definition is properly bound to a routing rule that strictly covers the `GET /toys/special` application endpoint, first the user has to modify the spec of the HTTPRoute by adding an explicit HTTPRouteRule for this case: + +```yaml +apiVersion: gateway.networking.k8s.io/v1alpha2 +kind: HTTPRoute +metadata: + name: toystore + namespace: toystore +spec: + parentRefs: + - name: istio-ingressgateway + namespace: istio-system + hostnames: + - "*.toystore.acme.com" + rules: + - matches: + - path: + type: PathPrefix + value: "/toys" + method: GET + - path: + type: PathPrefix + value: "/toys" + method: POST + backendRefs: + - name: toystore + port: 80 + - matches: + - path: + type: PathPrefix + value: "/assets/" + backendRefs: + - name: toystore + port: 80 + filters: + - type: ResponseHeaderModifier + responseHeaderModifier: + set: + - name: Cache-Control + value: "max-age=31536000, immutable" + - matches: # new (more specific) HTTPRouteRule added + - path: + type: Exact + value: "/toys/special" + method: GET + backendRefs: + - name: toystore + port: 80 +``` + +After that, the RLP can target the new HTTPRouteRule strictly: + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore-special-toys + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + specialToys: + rates: + - limit: 150 + unit: second + triggers: + - matches: # matches the new HTTPRouteRule (i.e. GET /toys/special) + - path: + type: Exact + value: "/toys/special" + method: GET +``` + +
+ How is this RLP implemented under the hood? + + ```yaml + gateway_actions: + - rules: + - paths: ["/toys/special"] + methods: ["GET"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-special-toys/specialToys" + ``` + + ```yaml + limits: + - conditions: + - ratelimit.binding == "toystore/toystore-special-toys/specialToys" + max_value: 150 + seconds: 1 + namespace: TBD + ``` +
+ +#### Example 4. Targeting a subset of an HTTPRouteRule - HTTPRouteMatch found + +This example is similar to [Example 3](#example-3-targeting-a-subset-of-an-httprouterule---httproutematch-missing). Consider the use case of setting a 150rpm rate limit on requests to `GET /toys*`. + +The targeted application endpoint is covered by the first HTTPRouteRule in the HTTPRoute (as a subset of `GET` or `POST` to any path that starts with `/toys`). However, unlike in the previous example where, at first, no HTTPRouteRule included an explicit HTTPRouteMatch for `GET /toys/special`, in this example the HTTPRouteMatch for the targeted application endpoint `GET /toys*` does exist explicitly in one of the HTTPRouteRules, thus the RateLimitPolicy controller would find no problem to bind the limit definition to the HTTPRouteRule. That would nonetheless cause a unexpected behavior of the limit triggered not strictly for `GET /toys*`, but also for `POST /toys*`. + +To avoid extending the scope of the limit beyiond desired, with no extra "soft" conditions, again the user must modify the spec of the HTTPRoute, so an exclusive HTTPRouteRule exists for the `GET /toys*` application endpoint: + +```yaml +apiVersion: gateway.networking.k8s.io/v1alpha2 +kind: HTTPRoute +metadata: + name: toystore + namespace: toystore +spec: + parentRefs: + - name: istio-ingressgateway + namespace: istio-system + hostnames: + - "*.toystore.acme.com" + rules: + - matches: # first HTTPRouteRule split into two – one for GET /toys*, other for POST /toys* + - path: + type: PathPrefix + value: "/toys" + method: GET + backendRefs: + - name: toystore + port: 80 + - matches: + - path: + type: PathPrefix + value: "/toys" + method: POST + backendRefs: + - name: toystore + port: 80 + - matches: + - path: + type: PathPrefix + value: "/assets/" + backendRefs: + - name: toystore + port: 80 + filters: + - type: ResponseHeaderModifier + responseHeaderModifier: + set: + - name: Cache-Control + value: "max-age=31536000, immutable" +``` + +The RLP can then target the new HTTPRouteRule strictly: + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toy-readers + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + toyReaders: + rates: + - limit: 150 + unit: second + triggers: + - matches: # matches the new more speficic HTTPRouteRule (i.e. GET /toys*) + - path: + type: PathPrefix + value: "/toys" + method: GET +``` + +
+ How is this RLP implemented under the hood? + + ```yaml + gateway_actions: + - rules: + - paths: ["/toys*"] + methods: ["GET"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toy-readers/toyReaders" + ``` + + ```yaml + limits: + - conditions: + - ratelimit.binding == "toystore/toy-readers/toyReaders" + max_value: 150 + seconds: 1 + namespace: TBD + ``` +
+ +#### Example 5. One limit triggered by multiple HTTPRouteRules + +In this example, both HTTPRouteRules, i.e. `GET|POST /toys*` and `/assets/*`, are targeted by the same limit of 50rpm per username. + +Because the HTTPRoute has no other rule, this is technically equivalent to targeting the entire HTTPRoute and therefore similar to [Example 1](#example-1-minimal-example---network-resource-targeted-entirely-without-filtering-unconditional-and-unqualified-rate-limiting). However, if the HTTPRoute had other rules or got other rules added afterwards, this would ensure the limit applies only to the two original route rules. + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore-per-user + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + toysOrAssetsPerUsername: + rates: + - limit: 50 + duration: 1 + unit: minute + counters: + - auth.identity.username + triggers: + - matches: + - path: + type: PathPrefix + value: "/toys" + method: GET + - path: + type: PathPrefix + value: "/toys" + method: POST + - matches: + - path: + type: PathPrefix + value: "/assets/" +``` + +
+ How is this RLP implemented under the hood? + + ```yaml + gateway_actions: + - rules: + - paths: ["/toys*"] + methods: ["GET"] + hosts: ["*.toystore.acme.com"] + - paths: ["/toys*"] + methods: ["POST"] + hosts: ["*.toystore.acme.com"] + - paths: ["/assets/*"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-per-user/toysOrAssetsPerUsername" + - metadata: + descriptor_key: "auth.identity.username" + metadata_key: + key: "envoy.filters.http.ext_authz" + path: + - segment: + key: "identity" + - segment: + key: "username" + ``` + + ```yaml + limits: + - conditions: + - ratelimit.binding == "toystore/toystore-per-user/toysOrAssetsPerUsername" + variables: + - auth.identity.username + max_value: 50 + seconds: 60 + namespace: TBD + ``` +
+ +#### Example 6. Multiple limit definitions targeting the same HTTPRouteRule + +In case multiple limit definitions target a same HTTPRouteRule, only the first limit definition will be bound to the HTTPRouteRule, thus "shadowing" the other limit definitions for that one HTTPRouteRule. + +E.g., the following RLP will set 20rps on `GET|POST /toys*` and 100rps on `/assets/*`: + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore-per-endpoint + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + readToys: + rates: + - limit: 50 + unit: second + counters: + - auth.identity.username + triggers: + - matches: # matches the 1st HTTPRouteRule (i.e. GET or POST to /toys*) + - path: + type: PathPrefix + value: "/toys" + method: GET + + assets: + rates: + - limit: 100 + unit: second + triggers: + - matches: # matches the 1st HTTPRouteRule (i.e. GET or POST to /toys*) but it's shadowed by 'readToys' + - path: + type: PathPrefix + value: "/toys" + method: POST + - matches: # matches the 2nd HTTPRouteRule (i.e. /assets/*) + - path: + type: PathPrefix + value: "/assets/" +``` + +
+ How is this RLP implemented under the hood? + + ```yaml + gateway_actions: + - rules: + - paths: ["/toys*"] + methods: ["GET"] + hosts: ["*.toystore.acme.com"] + - paths: ["/toys*"] + methods: ["POST"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-per-endpoint/readToys" + - metadata: + descriptor_key: "auth.identity.username" + metadata_key: + key: "envoy.filters.http.ext_authz" + path: + - segment: + key: "identity" + - segment: + key: "username" + - rules: + - paths: ["/assets/*"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-per-endpoint/assets" + ``` + + ```yaml + limits: + - conditions: # actually applies to GET|POST /toys* + - ratelimit.binding == "toystore/toystore-per-endpoint/readToys" + variables: + - auth.identity.username + max_value: 50 + seconds: 1 + namespace: TBD + - conditions: # does not apply to POST /toys* + - ratelimit.binding == "toystore/toystore-per-endpoint/assets" + max_value: 100 + seconds: 1 + namespace: TBD + ``` +
+ +#### Example 7. Limits triggered for specific hostnames + +In the previous examples, the limit definitions and therefore the counters were set indistinctly for all hostnames – i.e. no matter if the request is sent to `games.toystore.acme.com` or `dolls.toystore.acme.com`, the same counters are expected to be affected. In this example on the other hand, a 1000rpd rate limit is set for requests to `/assets/*` only when the hostname matches `games.toystore.acme.com`. + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore-per-hostname + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + games: + rates: + - limit: 1000 + unit: day + triggers: + - matches: + - path: + type: PathPrefix + value: "/assets/" + hostnames: + - games.toystore.acme.com +``` + +
+ How is this RLP implemented under the hood? + + ```yaml + gateway_actions: + - rules: + - paths: ["/assets/*"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-per-hostname/games" + - request_headers: + descriptor_key: "context.request.http.host" + header_name: ":authority" + ``` + + ```yaml + limits: + - conditions: + - ratelimit.binding == "toystore/toystore-per-hostname/games" + - context.request.http.host == "games.toystore.acme.com" + max_value: 1000 + seconds: 86400 # 1 day + namespace: TBD + ``` +
+ +#### Example 8. Targeting the Gateway + +> **Note:** Additional meaning and context may be given to this use case in the future, when discussing [defaults and overrides](https://gateway-api.sigs.k8s.io/references/policy-attachment/#hierarchy). + +Targeting a Gateway is a shortcut to targeting all individual HTTPRoutes referencing the gateway as parent. This differs from [Example 1](#example-1-minimal-example---network-resource-targeted-entirely-without-filtering-unconditional-and-unqualified-rate-limiting) nonetheless because, by targeting the gateway rather than an individual HTTPRoute, the RLP applies automatically to all HTTPRoutes pointing to the gateway, including routes created before and after the creation of the RLP. Moreover, all those routes will share the same limit counters specified in the RLP. + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: gw-rl + namespace: istio-ingressgateway +spec: + targetRef: + group: gateway.networking.k8s.io + kind: Gateway + name: istio-ingressgateway + limits: + base: + - rates: + - limit: 5 + unit: second +``` + +
+ How is this RLP implemented under the hood? + + ```yaml + gateway_actions: + - rules: + - paths: ["/toys*"] + methods: ["GET"] + hosts: ["*.toystore.acme.com"] + - paths: ["/toys*"] + methods: ["POST"] + hosts: ["*.toystore.acme.com"] + - paths: ["/assets/*"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "istio-system/gw-rl/base" + ``` + + ```yaml + limits: + - conditions: + - ratelimit.binding == "istio-system/gw-rl/base" + max_value: 5 + seconds: 1 + namespace: TDB + ``` +
+ +### Comparison to current RateLimitPolicy + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
CurrentNewReason
1:1 relation between Limit (the object) and the actual Rate limit (the value) (spec.rateLimits.limits)Rate limit becomes a detail of Limit where each limit may define one or more rates (1:N) (spec.limits.<limit-name>.rates) +
    +
  • It allows to reuse when conditions and counters for groups of rate limits
  • +
+
Parsed spec.rateLimits.limits.conditions field, directly exposing the Limitador's APIStructured spec.limits.<limit-name>.when condition field composed of 3 well-defined properties: selector, operator and value + +
spec.rateLimits.configurations as a list of "variables assignments" and direct exposure of Envoy's RL descriptor actions APIDescriptor actions composed from selectors used in the limit definitions (spec.limits.<limit-name>.when.selector and spec.limits.<limit-name>.counters) plus a fixed identifier of the route rules (spec.limits.<limit-name>.triggers) +
    +
  • Abstract the Envoy-specific concepts of "actions" and "descriptors"
  • +
  • No risk of mismatching descriptors keys between "actions" and actual usage in the limits
  • +
  • No user-defined generic descriptors (e.g. "limited = 1")
  • +
  • Source value of the selectors defined from an implicit "context" data structure
  • +
+
Key-value descriptorsStructured descriptors referring to a contextual well-known data structure + +
Limitador conditions independent from the route rulesArtificial Limitador condition injected to bind routes and corresponding limits +
    +
  • Ensure the limit is enforced only for corresponding selected HTTPRouteRules
  • +
+
translate(spec.rateLimits.rules) ⊂ httproute.spec.rulesspec.limits.<limit-name>.triggers.matches ⊆ httproute.spec.rules.matches +
    +
  • HTTPRouteRule selector (via HTTPRouteMatch subset)
  • +
  • Gateway API language
  • +
  • Preparation for inherited policies and defaults & overrides
  • +
+
spec.rateLimits.limits.secondsspec.limits.<limit-name>.rates.duration and spec.limits.<limit-name>.rates.unit +
    +
  • Support for more units beyond seconds
  • +
  • duration: 1 by default
  • +
+
spec.rateLimits.limits.variablesspec.limits.<limit-name>.counters +
    +
  • Improved (more specific) naming
  • +
+
spec.rateLimits.limits.maxValuespec.limits.<limit-name>.rates.limit +
    +
  • Improved (more generic) naming
  • +
+
+ +## Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +By completely dropping out the `configurations` field from the RLP, [composing the RL descriptor actions](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/rate_limit_filter#composing-actions) is now done based essentially on the selectors listed in the `when` conditions and the `counters`, plus an artificial condition used to bind the HTTPRouteRules to the corresponding limits to trigger in Limitador. + +The descriptor actions composed from the selectors in the "soft" `when` conditions and counter qualifiers originate from the direct references these selectors make to paths within a well-known data structure that stores information about the context (HTTP request and ext-authz filter). These selectors in "soft" `when` conditions and counter qualifiers are thereby called _well-known selectors_. + +Other descriptor actions might be composed by the RLP controller to define additional RL conditions to bind HTTPRouteRules and corresponding limits. + +### Well-known selectors + +Each selector used in a `when` condition or counter qualifier is a direct reference to a path within a well-known data structure that stores information about the `context` (L4 and L7 data of the original request handled by the proxy), as well as `auth` data (dynamic metadata occasionally exported by the external authorization filter and injected by the proxy into the rate-limit filter). + +The well-known data structure for building RL descriptor actions resembles Authorino's ["Authorization JSON"](https://github.com/Kuadrant/authorino/blob/main/docs/architecture.md#the-authorization-json), whose `context` component consists of Envoy's [`AttributeContext`](https://pkg.go.dev/github.com/envoyproxy/go-control-plane/envoy/service/auth/v3?utm_source=gopls#AttributeContext) type of the external authorization API (marshalled as JSON). Compared to the more generic [`RateLimitRequest`](https://pkg.go.dev/github.com/envoyproxy/go-control-plane@v0.11.0/envoy/service/ratelimit/v3#RateLimitRequest) struct, the `AttributeContext` provides a more structured and arguibly more intuitive relation between the data sources for the RL descriptors actions and their corresponding key names through which the values are referred within the RLP, in a context of predominantly serving for HTTP applications. + +To keep compatibility with the [Envoy Rate Limit API](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/rate_limit_filter), the well-known data structure can optionally be extended with the `RateLimitRequest`, thus resulting in the following final structure. + +```yaml +context: # Envoy's Ext-Authz `CheckRequest.AttributeContext` type + source: + address: … + service: … + … + destination: + address: … + service: … + … + request: + http: + host: … + path: … + method: … + headers: {…} + +auth: # Dynamic metadata exported by the external authorization service + +ratelimit: # Envoy's Rate Limit `RateLimitRequest` type + domain: … # generated by the Kuadrant controller + descriptors: {…} # descriptors configured by the user directly in the proxy (not generated by the Kuadrant controller, if allowed) + hitsAddend: … # only in case we want to allow users to refer to this value in a policy +``` + +### Mechanics of generating RL descriptor actions + +From the perspective of a user who writes a RLP, the selectors used in then `when` and `counters` fields are paths to the well-known data structure (see [_Well-known selectors_](#well-known-selectors)). While desiging a policy, the user intuitively pictures the well-known data structure and states each limit definition having in mind the possible values assumed by each of those paths in the data plane. For example, + +The user story: +> _Each distinct user (`auth.identity.username`) can send no more than 1rps to the same HTTP path (`context.request.http.path`)._ + +...materializes as the following RLP: + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + dolls: + rates: + - limit: 1 + unit: second + counters: + - auth.identity.username + - context.request.http.path +``` + +The following selectors are to be interpreted by the RLP controller: +- `auth.identity.username` +- `context.request.http.path` + +The RLP controller uses a map to translate each selector into its corresponding descriptor action. (Roughly described:) + +``` +context.source.address → source_cluster(...) # TBC +context.source.service → source_cluster(...) # TBC +context.destination... → destination_cluster(...) +context.destination... → destination_cluster(...) +context.request.http. → request_headers(header_name: ":") +context.request... → ... +auth. → metadata(key: "envoy.filters.http.ext_authz", path: ) +ratelimit.domain → +``` + +...to yield effectively: + +```yaml +rate_limits: +- actions: + - metadata: + descriptor_key: "auth.identity.username" + metadata_key: + key: "envoy.filters.http.ext_authz" + path: + - segment: + key: "identity" + - segment: + key: "username" + - request_headers: + descriptor_key: "context.request.http.path" + header_name: ":path" +``` + +### Artificial Limitador condition for `triggers` + +For each limit definition that explicitly or implicitly defines a `triggers` field, the RLP controller will generate an artificial Limitador condition that ensures that the limit applies only when the filterred rules are honoured when serving the request. This can be implemented with a 2-step procedure: +1. generate an unique identifier of the limit - i.e. `//` +2. associate a `generic_key` type descriptor action with each `HTTPRouteRule` targeted by the limit – i.e. `{ descriptor_key: "ratelimit.binding", descriptor_value: }`. + +For example, given the following RLP: + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore-non-admin-users + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + toys: + matches: + - path: + type: PathPrefix + value: "/toys" + method: GET + - path: + type: PathPrefix + value: "/toys" + method: POST + rates: + - limit: 50 + duration: 1 + unit: minute + when: + - selector: auth.identity.group + operator: neq + value: admin + + assets: + matches: + - path: + type: PathPrefix + value: "/assets/" + rates: + - limit: 5 + duration: 1 + unit: minute + when: + - selector: auth.identity.group + operator: neq + value: admin +``` + +Apart from the following descriptor action associated with both routes: + +```yaml +- metadata: + descriptor_key: "auth.identity.group" + metadata_key: + key: "envoy.filters.http.ext_authz" + path: + - segment: + key: "identity" + - segment: + key: "group" +``` + +...and its corresponding Limitador condition: + +``` +auth.identity.group != "admin" +``` + +The following additional artificial descriptor actions will be generated: + +```yaml +# associated with route rule GET|POST /toys* +- generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-non-admin-users/toys" + +# associated with route rule /assets/* +- generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-non-admin-users/assets" +``` + +...and their corresponding Limitador conditions. + +In the end, the following Limitador configuration is yielded: + +```yaml +- conditions: + - ratelimit.binding == "toystore/toystore-non-admin-users/toys" + - auth.identity.group != "admin" + max_value: 50 + seconds: 60 + namespace: TBD + +- conditions: + - ratelimit.binding == "toystore/toystore-non-admin-users/assets" + - auth.identity.group != "admin" + max_value: 5 + seconds: 60 + namespace: TBD +``` + +### Support in wasm shim and Envoy RL API + +This proposal tries to keep compatibility with the [Envoy API for rate limit](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/rate_limit_filter) and does not introduce any new requirement that otherwise would require the use of [wasm shim](https://github.com/Kuadrant/wasm-shim) to be implemented. + +In the case of implementation of this proposal in the wasm shim, all types of matchers supported by the [HTTPRouteMatch](https://gateway-api.sigs.k8s.io/references/spec/#gateway.networking.k8s.io/v1beta1.HTTPRouteMatch) type of Gateway API must be also supported in the `rate_limit_policies.gateway_actions.rules` field of the [wasm plugin configuration](https://github.com/Kuadrant/kuadrant-operator/blob/faf41ff6b08df27946a663c34a5736476578dea5/pkg/rlptools/wasm_utils.go#L109). These include matchers based on path (prefix, exact), headers, query string parameters and method. + +## Drawbacks +[drawbacks]: #drawbacks + +**HTTPRoute editing occasionally required**
+Need to duplicate rules that don't explicitly include a matcher wanted for the policy, so that matcher can be added as a special case for each of those rules. + +**Risk of over-targeting**
+Some HTTPRouteRules might need to be split into more specific ones so a limit definition is not bound to beyond intended (e.g. target `method: GET` when the route matches `method: POST|GET`). + +**Prone to consistency issues**
+Typos and updates to the HTTPRoute can easily cause a mismatch and invalidate a RLP. + +**Two types of conditions – "hard" `triggers` and "soft" `when` conditions**
+Although with different meanings (evaluates in the gateway vs. evaluated in Limitador) and meant for expressing different types of rules (HTTPRouteRule selectors for "hard" triggers vs. "soft" conditions based on attributes not related to the HTTP request), users might still perceive these as two ways of expressing conditions and find difficult to understand at first that "soft" conditions do not accept expressions related to attributes of the HTTP request. + +## Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +### Targeting full HTTPRouteRules + +Requiring users to specify full HTTPRouteRule matches in the RLP (as opposed to any subset of HTTPRoureMatches of targeted HTTPRouteRules – current proposal) contains some of the same [drawbacks](#drawbacks) of this proposal, such as HTTPRoute editing occasionally required and prone to consistency issues. If, on one hand, it eliminates the risk of over-targeting, on the other hand, it does it at the cost of requiring excessively verbose policies written by the users, to the point of sometimes expecting user to have to specify trigger matching rules that are significantly more than what's originally and strictly intended. + +E.g.: + +On a HTTPRoute that contains the following HTTPRouteRules (simplified representation): + +``` +{ header: x-canary=true } → backend-canary +{ * } → backend-rest +``` + +Where the user wants to define a RLP that targets `{ method: POST }`. First, the user needs to edit the HTTPRoute and duplicate the HTTPRouteRules: + +``` +{ header: x-canary=true, method: POST } → backend-canary +{ header: x-canary=true } → backend-canary +{ method: POST } → backend-rest +{ * } → backend-rest +``` + +Then, user needs to include the following trigger in the RLP so only full HTTPRouteRules are specified: + +``` +{ header: x-canary=true, method: POST } +{ method: POST } +``` + +The first matching rule of the trigger (i.e. `{ header: x-canary=true, method: POST }`) is beoynd the original user intent of targeting simply `{ method: POST }`. + +This issue can be even more concerning in the case of targeting gateways with multiple child HTTPRoutes. All the HTTPRoutes would have to be fixed and the HTTPRouteRules that cover for all the cases in all HTTPRoutes listed in the policy targeting the gateway. + +### Limit shadowing vs. All limit definitions apply + +The proposed [binding between limit definition and HTTPRouteRules that trigger the limit](#artificial-limitador-condition-for-triggers) was thought so only one limit definition can be bound to an HTTPRouteRule that triggers the limit. Once a limit is bound to the HTTPRouteRule, that limit "shadows" any other limit definition that tries to be bound to the same HTTPRouteRule. + +An alternative to such limit shadowing approach led by the first bound limit definition is allowing all limit definitions that targets (via trigger selectors) an HTTPRouteRule to be bound to that HTTPRouteRule. While the first approach ("limit shadowing") causes an artificial Limitador condition of the form `ratelimit.binding == "//"`, the alternative approach can be implemented by generating a descriptor of the following form instead: `// == "1"`. + +The downside of allowing multiple bindings to the same HTTPRouteRule is that all limits would apply in Limitador, thus making status report potently very hard. Moreover, the most restritive rate limit strategy implemented by Limitador might not be as obvious to users setting multiple limit definitions as it is when setting multiple rate limits within a single limit definition. + +### Writing "soft" `when` conditions based on attributes of the HTTP request + +As a first steps, users will not be able to write "soft" `when` conditions to selective apply rate limit definitions based on attributes of the HTTP request that otherwise could be specified using the "hard" `triggers` field of the RLP instead. + +On one hand, using `when` conditions for route filtering would make it easy to define limits when the HTTPRoute cannot be modified to include the special rule. On the other hand, users would miss information in the status. An HTTPRouteRule for `GET|POST /toys*`, for example, that is targeted with an additional "soft" `when` condition that specifies that the method must be equal to `GET` and the path exactly equal to `/toys/special` (see [Example 3](#example-3-targeting-a-subset-of-an-httprouterule---httproutematch-missing)) would be reported as rate limited with extra details that this is in fact only for `GET /toys/special`. For small deployments, this might be considered acceptable; however it would easily explode to unmanageable number of cases for deployments with only a few limit definitions and HTTPRouteRules. + +Moreover, by not specifying a more strict HTTPRouteRule for `GET /toys/special`, the RLP controller would bind the limit definition to other rules that would cause the rate limit filter to invoke the rate limit service (Limitador) for cases other than strictly `GET /toys/special`. Even if the rate limits would still be ensured to apply in Limitador only for `GET /toys/special` (due to the presence of a hypothetical "soft" `when` condition), an extra no-op hop to the rate limit service would happen. This is avoided with the current imposed limitation. + +Example of "soft" `when` conditions for rate limit based on attributes of the HTTP request (NOT SUPPORTED): + +```yaml +apiVersion: kuadrant.io/v2beta1 +kind: RateLimitPolicy +metadata: + name: toystore-special-toys + namespace: toystore +spec: + targetRef: + group: gateway.networking.k8s.io + kind: HTTPRoute + name: toystore + limits: + specialToys: + rates: + - limit: 150 + unit: second + triggers: + - matches: # matches the original HTTPRouteRule GET|POST /toys* + - path: + type: PathPrefix + value: "/toys" + method: GET + when: + - selector: context.request.http.method # cannot omit this selector or POST /toys/special would also be rate limited + operator: eq + value: GET + - selector: context.request.http.path + operator: eq + value: /toys/special +``` + +
+ How is this RLP would be implemented under the hood if supported? + + ```yaml + gateway_actions: + - rules: + - paths: ["/toys*"] + methods: ["GET"] + hosts: ["*.toystore.acme.com"] + - paths: ["/toys*"] + methods: ["POST"] + hosts: ["*.toystore.acme.com"] + configurations: + - generic_key: + descriptor_key: "ratelimit.binding" + descriptor_value: "toystore/toystore-special-toys/specialToys" + - request_headers: + descriptor_key: "context.request.http.method" + header_name: ":method" + - request_headers: + descriptor_key: "context.request.http.path" + header_name: ":path" + ``` + + ```yaml + limits: + - conditions: + - ratelimit.binding == "toystore/toystore-special-toys/specialToys" + - context.request.http.method == "GET" + - context.request.http.path == "/toys/special" + max_value: 150 + seconds: 1 + namespace: TBD + ``` +
+ +### Possible variations for the selectors (conditions and counter qualifiers) + +The main drivers behind the proposed design for the selectors (conditions and counter qualifiers), based on (i) structured condition expressions composed of fields `selector`, `operator`, and `value`, and (ii) `when` conditions and `counters` separated in two distinct fields (variation "C" below), are: +1. consistency with the Authorino `AuthConfig` API, which also specifies [`when`](https://github.com/Kuadrant/authorino/blob/main/docs/features.md#common-feature-conditions-when) conditions expressed in `selector`, `operator`, and `value` fields; +2. explicit user intent, without subtle distinction of meaning based on presence of optional fields. + +Nonetheless here are a few alternative variations to consider: + + + + + + + + + + + + + + + + + + + + + +
Structured condition expressionsParsed condition expressions
Single field + A +
+selectors:
+  - selector: context.request.http.method
+    operator: eq
+    value: GET
+  - selector: auth.identity.username
+
+ B +
+selectors:
+  - context.request.http.method == "GET"
+  - auth.identity.username
+
Distinct fields + C ⭐️ +
+when:
+  - selector: context.request.http.method
+    operator: eq
+    value: GET
+counters:
+  - auth.identity.username
+
+ D +
+when:
+  - context.request.http.method == "GET"
+counters:
+  - auth.identity.username
+
+ +⭐️ Variation adopted for the examples and (so far) final design proposal. + +## Prior art +[prior-art]: #prior-art + +Most implementations currently orbiting around Gateway API (e.g. Istio, Envoy Gateway, etc) for added RL functionality seem to have been leaning more to the direct route extension pattern instead of Policy Attachment. That might be an option particularly suitable for gateway implementations (gateway providers) and for those aiming to avoid dealing with defaults and overrides. + +## Unresolved questions +[unresolved-questions]: #unresolved-questions + +1. In case a limit definition lists triggers such that some of the triggers can be bound to HTTPRouteRules and some cannot (see [Example 6](#example-6-multiple-limit-definitions-targeting-the-same-httprouterule)), do we bind the valid triggers and ignore the invalid ones or the limit definition is invalid altogether and bound to no HTTPRouteRule at all? +2. What should we fill domain/namespace with, if no longer with the hostname? This can be useful for multi-tenancy. +3. How do we support lists of hostnames in Limitador conditions (single counter)? Should we open an issue for a new `in` operator? +4. What "soft" condition `operator`s do we need to support (e.g. `eq`, `neq`, `exists`, `nexists`, `matches`)? +5. Do we need special field to define shared counters across clusters/Limitador instances or that's to be solved at another layer (`Limitador`, `Kuadrant` CRDs, MCTC)? + +## Future possibilities +[future-possibilities]: #future-possibilities + +- Port `triggers` and the semantics around it to the `AuthPolicy` API (aka "KAP v2"). +- Defaults and overrides, either along the lines of [architecture#4](https://github.com/Kuadrant/architecture/pull/4) or [architecture#10](https://github.com/Kuadrant/architecture/pull/10).