Rate Limit Quota (proto)

This extension has the qualified name envoy.filters.http.rate_limit_quota

Note

This extension is work-in-progress. Functionality is incomplete and it is not intended for production use.

This extension has an unknown security posture and should only be used in deployments where both the downstream and upstream are trusted.

Warning

This API feature is currently work-in-progress. API features marked as work-in-progress are not considered stable, are not covered by the threat model, are not supported by the security team, and are subject to breaking changes. Do not use this feature without understanding each of the previous points.

Rate Limit Quota configuration overview.

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaFilterConfig

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaFilterConfig proto]

Configures the Rate Limit Quota filter.

Can be overridden in the per-route and per-host configurations. The more specific definition completely overrides the less specific definition.

{
  "rlqs_server": {...},
  "domain": ...,
  "bucket_matchers": {...},
  "filter_enabled": {...},
  "filter_enforced": {...},
  "request_headers_to_add_when_not_enforced": []
}

rlqs_server

(config.core.v3.GrpcService, REQUIRED) Configures the gRPC Rate Limit Quota Service (RLQS) RateLimitQuotaService.

domain

(string, REQUIRED) The application domain to use when calling the service. This enables sharing the quota server between different applications without fear of overlap. E.g., “envoy”.
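
As a rough illustration of the two fields above, the fragment below points the filter at an RLQS server through an Envoy-managed gRPC cluster. The cluster name rlqs_cluster is a placeholder that is assumed to be defined elsewhere in the bootstrap, and the fragment is incomplete on its own because bucket_matchers is also required.

```yaml
# Sketch only: "rlqs_cluster" is a hypothetical cluster name.
rlqs_server:
  envoy_grpc:
    cluster_name: rlqs_cluster
# Application domain sent to the RLQS server.
domain: envoy
```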

bucket_matchers

(.xds.type.matcher.v3.Matcher, REQUIRED) The match tree to use for grouping incoming requests into buckets.

Example:

matcher_list:
  matchers:
  # Assign requests with header['env'] set to 'staging' to the bucket { name: 'staging' }
  - predicate:
      single_predicate:
        input:
          typed_config:
            '@type': type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
            header_name: env
        value_match:
          exact: staging
    on_match:
      action:
        typed_config:
          '@type': type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
          bucket_id_builder:
            bucket_id_builder:
              name:
                string_value: staging

  # Assign requests with header['user_group'] set to 'admin' to the bucket { acl: 'admin_users' }
  - predicate:
      single_predicate:
        input:
          typed_config:
            '@type': type.googleapis.com/xds.type.matcher.v3.HttpAttributesCelMatchInput
        custom_match:
          typed_config:
            '@type': type.googleapis.com/xds.type.matcher.v3.CelMatcher
            expr_match:
              # Shortened for illustration purposes. The parsed CEL expression belongs here:
              # request.headers['user_group'] == 'admin'
              parsed_expr: {}
    on_match:
      action:
        typed_config:
          '@type': type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
          bucket_id_builder:
            bucket_id_builder:
              acl:
                string_value: admin_users

# Catch-all clause for the requests not matched by any of the matchers.
# In this example, deny all requests.
on_no_match:
  action:
    typed_config:
      '@type': type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
      no_assignment_behavior:
        fallback_rate_limit:
          blanket_rule: DENY_ALL

Attention

The first matched group wins. Once the request is matched into a bucket, matcher evaluation ends.

Use the on_no_match field to assign the catch-all bucket. If a request is not matched into any bucket, and there’s no on_no_match field configured, the request will be ALLOWED by default. It will NOT be reported to the RLQS server.

Refer to the Unified Matcher API documentation for more information on matcher trees.
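
Instead of denying unmatched requests, on_no_match can place them into a catch-all bucket so that they are still reported to the RLQS server. The following is a minimal sketch; the bucket name default and the 60s interval are arbitrary illustrative values.

```yaml
# Catch-all clause: assign every unmatched request to the bucket { name: 'default' }.
on_no_match:
  action:
    typed_config:
      '@type': type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
      bucket_id_builder:
        bucket_id_builder:
          name:
            string_value: default   # illustrative bucket name
      reporting_interval: 60s       # illustrative value
```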

filter_enabled

(config.core.v3.RuntimeFractionalPercent) If set, this will enable – but not necessarily enforce – the rate limit for the given fraction of requests.

Defaults to 100% of requests.

filter_enforced

(config.core.v3.RuntimeFractionalPercent) If set, this will enforce the rate limit decisions for the given fraction of requests. For requests that are not enforced, the filter will still obtain the quota and include it in the load computation; however, the request will always be allowed regardless of the outcome of quota application. This allows validation or testing of the rate limiting service infrastructure without disrupting existing traffic.

Note: this only applies to the fraction of enabled requests.

Defaults to 100% of requests.
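
To make the interaction between the two fractions concrete, here is a sketch that enables the filter for 50% of requests and enforces decisions for 10% of those. Because filter_enforced applies only to enabled requests, roughly 50% × 10% = 5% of all requests would have rate limit decisions enforced. The runtime keys are hypothetical names chosen for this example.

```yaml
filter_enabled:
  default_value:
    numerator: 50
    denominator: HUNDRED
  runtime_key: rate_limit_quota.filter_enabled    # hypothetical runtime key
filter_enforced:
  default_value:
    numerator: 10
    denominator: HUNDRED
  runtime_key: rate_limit_quota.filter_enforced   # hypothetical runtime key
```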

request_headers_to_add_when_not_enforced

(repeated config.core.v3.HeaderValueOption) Specifies a list of HTTP headers that should be added to each request that has been rate limited and is also forwarded upstream. This can only occur when the filter is enabled but not enforced.
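
For instance, an upstream service could use such a header to spot requests that would have been rejected under full enforcement. The header name below is an assumption made for this sketch, not a name defined by the filter.

```yaml
request_headers_to_add_when_not_enforced:
- header:
    key: x-rate-limit-quota-not-enforced   # hypothetical header name
    value: "true"
```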

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaOverride

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaOverride proto]

Per-route and per-host configuration overrides. The more specific definition completely overrides the less specific definition.

{
  "domain": ...,
  "bucket_matchers": {...}
}

domain

(string) The application domain to use when calling the service. This enables sharing the quota server between different applications without fear of overlap. E.g., “envoy”.

If empty, inherits the value from the less specific definition.

bucket_matchers

(.xds.type.matcher.v3.Matcher) The match tree to use for grouping incoming requests into buckets.

If set, fully overrides the bucket matchers provided on the less specific definition. If not set, inherits the value from the less specific definition.

See usage example: RateLimitQuotaFilterConfig.bucket_matchers.
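
A per-route override might look like the sketch below. It assumes the filter is registered in the HTTP filter chain under the name envoy.filters.http.rate_limit_quota, and the route prefix, cluster, and domain values are placeholders. Since this extension is work-in-progress, verify that overrides are honored by the Envoy version in use.

```yaml
routes:
- match:
    prefix: /api                 # placeholder route
  route:
    cluster: api_backend         # placeholder cluster
  typed_per_filter_config:
    envoy.filters.http.rate_limit_quota:
      '@type': type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaOverride
      # Overrides the domain for this route only.
      domain: api
      # bucket_matchers is left unset, so it is inherited from the
      # less specific definition.
```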

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings proto]

Rate Limit Quota Bucket Settings to apply on the successful bucket_matchers match.

Specify this message in the Matcher.OnMatch.action field of the bucket_matchers matcher tree to assign the matched requests to the Quota Bucket. Usage example: RateLimitQuotaFilterConfig.bucket_matchers.

{
  "bucket_id_builder": {...},
  "reporting_interval": {...},
  "deny_response_settings": {...},
  "no_assignment_behavior": {...},
  "expired_assignment_behavior": {...}
}

bucket_id_builder

(extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.BucketIdBuilder) BucketId builder.

BucketId is a map from the string key to the string value which serves as the bucket identifier common to both the control plane and the data plane.

While BucketId is always static, BucketIdBuilder allows populating map values with the dynamic properties associated with each individual request.

Example 1: static fields only

BucketIdBuilder:

bucket_id_builder:
  name:
    string_value: my_bucket
  hello:
    string_value: world

Produces the following BucketId for all requests:

bucket:
  name: my_bucket
  hello: world

Example 2: static and dynamic fields

bucket_id_builder:
  name:
    string_value: my_bucket
  env:
    custom_value:
      typed_config:
        '@type': type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
        header_name: environment

In this example, the value of BucketId key env is substituted from the environment request header.

This is equivalent to the following pseudo-code:

name: 'my_bucket'
env: $header['environment']

For example, a request with the HTTP header environment set to staging will produce the following BucketId:

bucket:
  name: my_bucket
  env: staging

A request with the HTTP header environment set to prod will produce:

bucket:
  name: my_bucket
  env: prod

Note

The order of BucketId keys does not matter. Buckets { a: 'A', b: 'B' } and { b: 'B', a: 'A' } are identical.

If not set, requests will NOT be reported to the server, and will always be limited according to the no_assignment_behavior configuration.

reporting_interval

(Duration, REQUIRED) The interval at which the data plane (RLQS client) is to report quota usage for this bucket.

When the first request is matched to a bucket with no assignment, the data plane is to report the request immediately in the RateLimitQuotaUsageReports message. For the RLQS server, this signals that the data plane is now subscribed to the quota assignments in this bucket, and the server will start sending assignments as described in the RLQS documentation.

After sending the initial report, the data plane is to continue reporting the bucket usage at the interval specified in this field.

If for any reason the RLQS client doesn’t receive the initial assignment for the reported bucket, the data plane will eventually consider the bucket abandoned and stop sending the usage reports. This is explained in more detail in Rate Limit Quota Service (RLQS).
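
As an illustration, the bucket settings below would have the data plane send the initial report as soon as the first request is matched and then report usage every 10 seconds. The bucket name and interval are arbitrary example values.

```yaml
bucket_id_builder:
  bucket_id_builder:
    name:
      string_value: my_bucket
# Usage for this bucket is reported every 10 seconds after the initial report.
reporting_interval: 10s
```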

deny_response_settings

(extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.DenyResponseSettings) Customize the deny response to the requests over the rate limit. If not set, the filter will be configured as if an empty message is set, and will behave according to the defaults specified in DenyResponseSettings.

no_assignment_behavior

(extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.NoAssignmentBehavior) Configures the behavior in the “no assignment” state: after the first request has been matched to the bucket, and before the RLQS server returns the first quota assignment.

If not set, the default behavior is to allow all requests.

expired_assignment_behavior

(extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.ExpiredAssignmentBehavior) Configures the behavior in the “expired assignment” state: the bucket’s assignment has expired, and cannot be refreshed.

If not set, the bucket is abandoned when its active assignment expires. The process of abandoning the bucket, and restarting the subscription is described in the AbandonAction message.

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.NoAssignmentBehavior

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.NoAssignmentBehavior proto]

Configures the behavior after the first request has been matched to the bucket, and before the RLQS server returns the first quota assignment.

{
  "fallback_rate_limit": {...}
}

fallback_rate_limit

(type.v3.RateLimitStrategy, REQUIRED) Apply the pre-configured rate limiting strategy until the server sends the first assignment.
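
Besides the blanket_rule shown in the bucket_matchers example, RateLimitStrategy also defines a requests_per_time_unit option. The sketch below caps a bucket at 100 requests per second while no assignment has arrived; the numbers are illustrative, and because the extension is work-in-progress, support for each strategy should be checked against the Envoy version in use.

```yaml
no_assignment_behavior:
  fallback_rate_limit:
    requests_per_time_unit:
      requests_per_time_unit: 100   # illustrative limit
      time_unit: SECOND
```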

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.ExpiredAssignmentBehavior

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.ExpiredAssignmentBehavior proto]

Specifies the behavior when the bucket’s assignment has expired, and cannot be refreshed for any reason.

{
  "expired_assignment_behavior_timeout": {...},
  "fallback_rate_limit": {...},
  "reuse_last_assignment": {...}
}

expired_assignment_behavior_timeout

(Duration) Limit the time ExpiredAssignmentBehavior is applied. If the server doesn’t respond within this duration:

  1. Selected ExpiredAssignmentBehavior is no longer applied.

  2. The bucket is abandoned. The process of abandoning the bucket is described in the AbandonAction message.

  3. If a new request is matched into the bucket that has become abandoned, the data plane restarts the subscription to the bucket. The process of restarting the subscription is described in the AbandonAction message.

If not set, defaults to zero, and the bucket is abandoned immediately.

fallback_rate_limit

(type.v3.RateLimitStrategy) Apply the rate limiting strategy to all requests matched into the bucket until the RLQS server sends a new assignment, or the expired_assignment_behavior_timeout runs out.

Precisely one of fallback_rate_limit, reuse_last_assignment must be set.

reuse_last_assignment

(extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.ExpiredAssignmentBehavior.ReuseLastAssignment) Reuse the last active assignment until the RLQS server sends a new assignment, or the expired_assignment_behavior_timeout runs out.

Precisely one of fallback_rate_limit, reuse_last_assignment must be set.
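
Putting the three fields together, this sketch keeps applying the last assignment for up to two minutes after it expires, after which the bucket is abandoned. The 120s timeout is an arbitrary example value.

```yaml
expired_assignment_behavior:
  # Stop applying the expired-assignment behavior after two minutes.
  expired_assignment_behavior_timeout: 120s
  # Exactly one of fallback_rate_limit or reuse_last_assignment is set.
  reuse_last_assignment: {}
```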

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.ExpiredAssignmentBehavior.ReuseLastAssignment

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.ExpiredAssignmentBehavior.ReuseLastAssignment proto]

Reuse the last known quota assignment, effectively extending it for the duration specified in the expired_assignment_behavior_timeout field.

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.DenyResponseSettings

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.DenyResponseSettings proto]

Customize the deny response to the requests over the rate limit.

{
  "http_status": {...},
  "http_body": {...},
  "grpc_status": {...},
  "response_headers_to_add": []
}

http_status

(type.v3.HttpStatus) HTTP response code used to deny HTTP requests (gRPC excluded). Defaults to 429 (StatusCode.TooManyRequests).

http_body

(BytesValue) HTTP response body used to deny HTTP requests (gRPC excluded). If not set, an empty body is returned.

grpc_status

(Status) Configure the deny response for gRPC requests over the rate limit. Allows specifying the RPC status code and the error message. Defaults to a Status with the RPC code UNAVAILABLE and an empty message.

To identify gRPC requests, Envoy checks that the Content-Type header is application/grpc, or one of the various application/grpc+ values.

Note

The HTTP code for a gRPC response is always 200.

response_headers_to_add

(repeated config.core.v3.HeaderValueOption) Specifies a list of HTTP headers that should be added to each response for requests that have been rate limited. Applies both to plain HTTP, and gRPC requests. The headers are added even when the rate limit quota was not enforced.
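
The fields above could be combined roughly as follows. This sketch replaces the default 429 with 503 for plain HTTP requests and returns RESOURCE_EXHAUSTED (code 8) for gRPC requests. The message text and header name are made up for illustration, and http_body is left unset so that an empty body is returned.

```yaml
deny_response_settings:
  http_status:
    code: ServiceUnavailable   # HTTP 503 instead of the default 429
  grpc_status:
    code: 8                    # RESOURCE_EXHAUSTED
    message: quota exceeded    # illustrative message
  response_headers_to_add:
  - header:
      key: x-quota-denied      # hypothetical header name
      value: "true"
```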

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.BucketIdBuilder

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.BucketIdBuilder proto]

BucketIdBuilder makes it possible to build BucketId with values substituted from the dynamic properties associated with each individual request. See usage examples in the documentation for the bucket_id_builder field.

{
  "bucket_id_builder": {...}
}

bucket_id_builder

(repeated map<string, extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.BucketIdBuilder.ValueBuilder>) The map translated into the BucketId map.

The string key of this map becomes the key of the BucketId map as is.

The ValueBuilder value for the key can be:

  • static StringValue string_value: becomes the value in the BucketId map as is.

  • dynamic TypedExtensionConfig custom_value: evaluated for each request. Must produce a string output, which becomes the value in the BucketId map.

See usage examples in the documentation for the bucket_id_builder field.

extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.BucketIdBuilder.ValueBuilder

[extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings.BucketIdBuilder.ValueBuilder proto]

Produces the value of the BucketId map.

{
  "string_value": ...,
  "custom_value": {...}
}

string_value

(string) Static string value — becomes the value in the BucketId map as is.

Precisely one of string_value, custom_value must be set.

custom_value

(config.core.v3.TypedExtensionConfig) Dynamic value — evaluated for each request. Must produce a string output, which becomes the value in the BucketId map. For example, extensions with the envoy.matching.http.input category can be used.

Precisely one of string_value, custom_value must be set.