Rate Limit Quota (Work-In-Progress)

Attention

The Rate Limit Quota filter is under active development and is not yet ready for use. Its capabilities and configuration structures may change.

Global rate limiting architecture overview
v3 API reference

This filter provides an implementation of the global rate limit quota protocol. The rate limit quota service (RLQS) provides quota assignments to each Envoy instance connected to it. In addition to enforcing rate limit quota assignments, this filter periodically reports request rates to the RLQS, allowing the RLQS server to rebalance quota assignments between Envoy instances based on the real-time load of each instance. When quota assignments change, the RLQS proactively pushes the new assignment to Envoy.

The HTTP rate limit quota filter calls the rate limit quota service when it is configured in the HTTP connection manager filter chain. The filter configuration defines the RLQS service and the quota buckets that will receive quota assignments from the server. Quota buckets are defined by a set of matchers that determine whether a request is subject to the rate limit quota assigned to that bucket. Each matcher can produce multiple buckets by means of the bucket_id_builder. The bucket ID builder allows quota buckets to be generated either dynamically, based on request attributes such as a request header value, or statically, based on the configuration.

If a request does not match any of the matchers, the quota assignment for the “catch all” bucket configured by the on_no_match field of the bucket_matchers is applied. If the on_no_match configuration is not provided, unmatched requests are not rate limited (i.e. the filter fails open).
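The matching flow described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `header_equals`, `select_bucket`, and the tuple-based matcher list are hypothetical names and are not part of Envoy. The bucket IDs mirror the map-of-strings shape used in Example 1.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration only; these are not Envoy APIs.
Headers = Dict[str, str]
BucketId = Dict[str, str]
Predicate = Callable[[Headers], bool]


def header_equals(name: str, value: str) -> Predicate:
    """Build a predicate that is true when header `name` equals `value`."""
    return lambda headers: headers.get(name) == value


def select_bucket(
    headers: Headers,
    matchers: List[Tuple[Predicate, BucketId]],
    on_no_match: Optional[BucketId] = None,
) -> Optional[BucketId]:
    """Return the bucket ID of the first matcher whose predicate holds.

    Falls back to the catch-all bucket. Returns None when no catch-all
    is configured, meaning the request is not rate limited (fail-open).
    """
    for predicate, bucket_id in matchers:
        if predicate(headers):
            return bucket_id
    return on_no_match


matchers = [
    (header_equals("deployment", "prod"), {"name": "prod-rate-limit-quota"}),
    (header_equals("deployment", "staging"), {"name": "staging-rate-limit-quota"}),
]
default_bucket = {"name": "default-rate-limit-quota"}
```

Calling `select_bucket({"deployment": "dev"}, matchers, default_bucket)` falls through to the catch-all bucket, while omitting `default_bucket` yields `None`, which models the fail-open case.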

Bucket definitions can be overridden by the virtual host or route configurations. The more specific definition completely overrides the less specific definition.

Initially, all of Envoy’s quota assignments are empty. The rate limit quota filter requests a quota assignment from the RLQS the first time a request matches a bucket. The behavior of the filter while it waits for the initial assignment is determined by the no_assignment_behavior value. In this state, requests can either all be allowed or all be denied until a quota assignment is received.
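As a rough illustration of the waiting state, the sketch below models a bucket that applies its configured no-assignment rule until an assignment arrives. The `Bucket` class is hypothetical, and for brevity it assumes the assignment itself is also a blanket allow/deny rule, which is a simplification of what an RLQS can assign.

```python
from enum import Enum
from typing import Optional


class BlanketRule(Enum):
    ALLOW_ALL = "ALLOW_ALL"
    DENY_ALL = "DENY_ALL"


class Bucket:
    """Hypothetical per-bucket state; not an Envoy class."""

    def __init__(self, no_assignment_rule: BlanketRule) -> None:
        self.no_assignment_rule = no_assignment_rule
        # Stays None until the RLQS sends the first assignment.
        self.assignment: Optional[BlanketRule] = None

    def allow_request(self) -> bool:
        """Use the assignment if present, else the no-assignment rule."""
        rule = self.assignment if self.assignment is not None else self.no_assignment_rule
        return rule is BlanketRule.ALLOW_ALL


staging = Bucket(BlanketRule.DENY_ALL)
denied_while_waiting = not staging.allow_request()
staging.assignment = BlanketRule.ALLOW_ALL  # simulated RLQS response
allowed_after_assignment = staging.allow_request()
```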

A quota assignment may have an associated time to live (TTL). The RLQS is expected to update the assignment before the TTL runs out. If the RLQS fails to update the assignment and its TTL expires, the filter can be configured either to continue using the last quota assignment or to fall back to a value predefined in the expired assignment configuration.
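The TTL handling can be pictured with the following sketch. The `Assignment` dataclass and `effective_limit` function are hypothetical, and the assignment is assumed to be a plain requests-per-second number. Passing `fallback_rps=None` stands in for "keep using the last assignment".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assignment:
    """Hypothetical quota assignment; not an Envoy type."""
    requests_per_second: int
    received_at: float           # seconds on any monotonic clock
    ttl: Optional[float] = None  # None means the assignment never expires


def effective_limit(
    assignment: Assignment, now: float, fallback_rps: Optional[int] = None
) -> int:
    """Return the limit to enforce at time `now`.

    If the TTL has run out and a fallback is configured, use the fallback;
    otherwise keep using the last assignment.
    """
    expired = (
        assignment.ttl is not None
        and now - assignment.received_at >= assignment.ttl
    )
    if expired and fallback_rps is not None:
        return fallback_rps
    return assignment.requests_per_second


current = Assignment(requests_per_second=500, received_at=0.0, ttl=30.0)
```

With this model, `effective_limit(current, 31.0, fallback_rps=1000)` switches to the fallback, while `effective_limit(current, 31.0)` keeps the last assigned value.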

The rate limit quota filter reports the request load for each bucket to the RLQS at the configured reporting_interval. The RLQS may rebalance quota assignments based on the request load that each Envoy instance receives and push new quota assignments to the Envoy instances.
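One way to think about the periodic reports is as per-bucket counters that are flushed once per reporting interval. The sketch below is an assumption-laden model: `UsageReporter` is a hypothetical class, and splitting the load into allowed and denied counts is an illustrative choice rather than a statement about the report format.

```python
from collections import defaultdict
from typing import Dict, Tuple


class UsageReporter:
    """Hypothetical per-bucket counters, flushed once per reporting interval."""

    def __init__(self) -> None:
        self._allowed: Dict[str, int] = defaultdict(int)
        self._denied: Dict[str, int] = defaultdict(int)

    def record(self, bucket_name: str, allowed: bool) -> None:
        """Count one request against the named bucket."""
        counters = self._allowed if allowed else self._denied
        counters[bucket_name] += 1

    def flush(self) -> Dict[str, Tuple[int, int]]:
        """Return {bucket: (allowed, denied)} and reset all counters."""
        names = set(self._allowed) | set(self._denied)
        report = {name: (self._allowed[name], self._denied[name]) for name in names}
        self._allowed.clear()
        self._denied.clear()
        return report


reporter = UsageReporter()
reporter.record("prod-rate-limit-quota", allowed=True)
reporter.record("prod-rate-limit-quota", allowed=True)
reporter.record("prod-rate-limit-quota", allowed=False)
first_report = reporter.flush()
second_report = reporter.flush()
```

A real client would call something like `flush()` from a timer that fires every reporting_interval and send the result to the RLQS; the timer and transport are omitted here.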

Failure modes

If the connection to the RLQS server fails, the filter falls back either to the no assignment behavior, if it has not yet received a rate limit quota, or to the expired assignment behavior, if the connection could not be re-established by the time the existing quota expired.

If the RLQS client doesn’t receive the initial bucket assignment (for any reason, including RLQS server connection failures) within a predetermined [1] time, such buckets will eventually be purged from memory. Subsequent requests matched into the bucket will re-initialize the bucket in the “no assignment” state, restarting the reports. This is explained in more detail in Rate Limit Quota Service (RLQS).
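The two failure-mode rules above can be summarized in a small sketch. `BucketState`, `behavior`, and `purge_unassigned` are hypothetical names, `max_wait` is a placeholder for the predetermined time mentioned in the text, and assignments without a TTL are not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BucketState:
    """Hypothetical bucket bookkeeping; not an Envoy type."""
    created_at: float
    assignment_expires_at: Optional[float] = None  # None: no assignment yet


def behavior(bucket: BucketState, now: float) -> str:
    """Pick which configured behavior applies while the RLQS is unreachable."""
    if bucket.assignment_expires_at is None:
        return "no_assignment_behavior"
    if now >= bucket.assignment_expires_at:
        return "expired_assignment_behavior"
    return "current_assignment"


def purge_unassigned(
    buckets: Dict[str, BucketState], now: float, max_wait: float
) -> None:
    """Drop buckets that never received an assignment within `max_wait` seconds."""
    stale = [
        name
        for name, state in buckets.items()
        if state.assignment_expires_at is None and now - state.created_at >= max_wait
    ]
    for name in stale:
        del buckets[name]


buckets = {
    "waiting": BucketState(created_at=0.0),
    "assigned": BucketState(created_at=0.0, assignment_expires_at=100.0),
}
```

After `purge_unassigned(buckets, now=60.0, max_wait=30.0)` only the `assigned` bucket remains. A later request matching the purged bucket would simply create a fresh `BucketState` in the no-assignment state.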

Example 1

In this example, the HTTP connection manager has the following bucket definitions in the rate limit quota filter configuration. This configuration enables a rate limit quota filter with three buckets. Note that a bucket ID is a map of key-value pairs.

Bucket id name: prod-rate-limit-quota for all requests with the deployment: prod header present. Until RLQS assigns a quota, all requests are allowed.
Bucket id name: staging-rate-limit-quota for all requests with the deployment: staging header present. Until RLQS assigns a quota, all requests are denied.
Bucket id name: default-rate-limit-quota for all other requests. Until RLQS assigns a quota, all requests are allowed. If an assignment expires, a fallback quota of 1K RPS is applied.

rate-limit-quota-filter-configuration.yaml

              rlqs_server:
                envoy_grpc:
                  cluster_name: rate_limit_quota_service
              domain: "acme-services"
              bucket_matchers:
                matcher_list:
                  matchers:
                  - predicate:
                      single_predicate:
                        input:
                          name: request-headers
                          typed_config:
                            "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
                            header_name: deployment
                        value_match:
                          exact: prod
                    on_match:
                      action:
                        name: prod-bucket
                        typed_config:
                          "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                          bucket_id_builder:
                            bucket_id_builder:
                              "name":
                                string_value: "prod-rate-limit-quota"
                          reporting_interval: 60s
                          no_assignment_behavior:
                            fallback_rate_limit:
                              blanket_rule: ALLOW_ALL
                  - predicate:
                      single_predicate:
                        input:
                          name: request-headers
                          typed_config:
                            "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
                            header_name: deployment
                        value_match:
                          exact: staging
                    on_match:
                      action:
                        name: staging-bucket
                        typed_config:
                          "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                          bucket_id_builder:
                            bucket_id_builder:
                              "name":
                                string_value: "staging-rate-limit-quota"
                          reporting_interval: 60s
                          no_assignment_behavior:
                            fallback_rate_limit:
                              blanket_rule: DENY_ALL
                # The "catch all" bucket settings
                on_no_match:
                  action:
                    name: default-bucket
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                      bucket_id_builder:
                        bucket_id_builder:
                          "name":
                            string_value: "default-rate-limit-quota"
                      reporting_interval: 60s
                      deny_response_settings:
                        http_status_code: 429
                      no_assignment_behavior:
                        fallback_rate_limit:
                          blanket_rule: ALLOW_ALL
                      expired_assignment_behavior:
                        fallback_rate_limit:
                          requests_per_time_unit:
                            requests_per_time_unit: 1000
                            time_unit: SECOND
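To make the fallback_rate_limit of 1000 requests per second in the catch-all bucket more concrete, here is a minimal fixed-window limiter. This is an illustrative technique only: `FixedWindowLimiter` is a hypothetical class, and the source does not say which algorithm the filter uses to enforce a requests-per-time-unit limit.

```python
class FixedWindowLimiter:
    """Hypothetical limiter: at most `limit` requests per `window` seconds."""

    def __init__(self, limit: int, window: float = 1.0) -> None:
        self.limit = limit
        self.window = window
        self._window_start = 0.0
        self._count = 0

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` fits the quota."""
        # Start a new window once the current one has fully elapsed.
        if now - self._window_start >= self.window:
            self._window_start = now
            self._count = 0
        if self._count < self.limit:
            self._count += 1
            return True
        return False


# Matches the example's fallback: 1000 requests per one-second window.
fallback = FixedWindowLimiter(limit=1000, window=1.0)
decisions = [fallback.allow(now=5.0) for _ in range(1001)]
```

In this run the first 1000 calls within the same window are allowed and the 1001st is rejected. A denied request would then receive the status configured in deny_response_settings.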

Rate Limit Quota Override

The rate limit quota filter configuration can be overridden at the virtual host or route level using the RateLimitQuotaOverride configuration. The more specific configuration fully overrides the less specific one.
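The "fully overrides" rule means the configurations are never merged: the most specific one that is present is used as a whole. A minimal sketch follows, with `resolve_config` as a hypothetical helper and plain dictionaries standing in for the real configuration messages.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_config(
    filter_cfg: T, vhost_cfg: Optional[T] = None, route_cfg: Optional[T] = None
) -> T:
    """Return the most specific configuration present, without merging."""
    if route_cfg is not None:
        return route_cfg
    if vhost_cfg is not None:
        return vhost_cfg
    return filter_cfg


# Hypothetical stand-ins for the filter, virtual host, and route settings.
filter_level = {"domain": "acme-services", "buckets": ["prod", "staging", "default"]}
vhost_level = {"domain": "acme-services", "buckets": ["vhost-only"]}
route_level = {"domain": "acme-services", "buckets": ["route-only"]}
```

Note that `resolve_config(filter_level, vhost_level)` returns the virtual host settings unchanged; none of the filter-level buckets are inherited.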

Matcher extensions

TODO

Statistics

The rate limit quota filter outputs statistics in the cluster.<route target cluster>.rate_limit_quota. namespace. 429 responses, or responses with the configured rate limited status, are emitted to the normal cluster dynamic HTTP statistics.

Name	Type	Description
buckets	Counter	Total number of quota buckets created
assignments	Counter	Total rate limit assignments received from the rate limit quota service
error	Counter	Total errors contacting the rate limit quota service
over_limit	Counter	Total requests that exceeded assigned rate limit
no_assigment	Counter	Total requests to which the no_assignment_behavior was applied
expired_assigment	Counter	Total requests to which the expired_assignment_behavior was applied