Rate Limit Quota (Work-In-Progress)

Attention

The Rate Limit Quota filter is under active development and is not yet ready for use. Its capabilities and configuration structures may change.

This filter implements the global rate limit quota protocol. The rate limit quota service (RLQS) provides quota assignments to each Envoy instance connected to it. In addition to enforcing rate limit quota assignments, this filter periodically reports request rates to the RLQS, allowing the RLQS server to rebalance quota assignments between Envoy instances based on the real-time load of each instance. When quota assignments change, the RLQS proactively pushes the new assignment to Envoy.

The HTTP rate limit quota filter calls the rate limit quota service when it is configured in the HTTP connection manager filter chain. The filter configuration defines the RLQS server and the quota buckets that will receive quota assignments from it. Quota buckets are defined by a set of matchers that determine whether a request is subject to the rate limit quota assigned to that bucket. Each matcher can produce multiple buckets by means of the bucket_id_builder. The bucket ID builder allows quota buckets to be generated either dynamically, based on request attributes such as a request header value, or statically, based on the configuration.
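As a minimal sketch of the two bucket ID styles, the fragment below builds a bucket ID with one static key and one dynamic key. It assumes that the bucket ID builder accepts a custom_value entry (a typed extension config evaluated per request) as an alternative to string_value; that field name should be checked against the API reference. The "tenant" key, the tenant-header extension name and the x-tenant header are hypothetical and chosen only for illustration.

```yaml
# Sketch only: fragment of a RateLimitQuotaBucketSettings message.
bucket_id_builder:
  bucket_id_builder:
    # Static: every matching request gets the same value for this key.
    "name":
      string_value: "prod-rate-limit-quota"
    # Dynamic (assumed custom_value field): the value is read from the
    # hypothetical x-tenant request header, so each distinct header value
    # produces its own bucket.
    "tenant":
      custom_value:
        name: tenant-header
        typed_config:
          "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
          header_name: x-tenant
```

Under these assumptions, requests carrying x-tenant: a and x-tenant: b would map to two separate buckets that share the same name value but differ in the tenant value, and each bucket would receive its own quota assignment.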

If a request does not match any of the matchers, the quota assignment for the “catch all” bucket configured by the on_no_match field of bucket_matchers is applied. If on_no_match is not configured, unmatched requests are not rate limited (i.e. the filter fails open).

Bucket definitions can be overridden by the virtual host or route configurations. The more specific definition completely overrides the less specific definition.

Initially, an Envoy instance has no quota assignments. The rate limit quota filter requests a quota assignment from the RLQS the first time a request matches a bucket. The behavior of the filter while it waits for the initial assignment is determined by the no_assignment_behavior value. In this state, requests can all be allowed immediately, all be denied, or be enqueued until a quota assignment is received.

A quota assignment may have an associated time to live (TTL). The RLQS is expected to update the assignment before the TTL runs out. If the RLQS fails to update the assignment and its TTL expires, the filter can be configured either to continue using the last quota assignment or to fall back to a value predefined in the expired assignment configuration.
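The two expiry options can be sketched as the following fragment. The field names reuse_last_assignment and expired_assignment_behavior_timeout are assumptions about the v3 API and should be verified against the API reference; the 300s value is arbitrary.

```yaml
# Sketch only: fragment of a RateLimitQuotaBucketSettings message.
expired_assignment_behavior:
  # Assumed field: bounds how long this behavior applies after the TTL expires.
  expired_assignment_behavior_timeout: 300s
  # Option A: keep enforcing the last assignment received from the RLQS.
  reuse_last_assignment: {}
  # Option B (use instead of option A): fall back to a fixed limit.
  # fallback_rate_limit:
  #   requests_per_time_unit:
  #     requests_per_time_unit: 1000
  #     time_unit: SECOND
```

Option A favors continuity when the last assignment is still a reasonable estimate. Option B favors a known, conservative limit when stale assignments could overload the upstream.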

The rate limit quota filter reports the request load for each bucket to the RLQS at the configured reporting_interval. The RLQS may rebalance quota assignments based on the request load that each Envoy instance receives and push new quota assignments to the instances.

When the connection to the RLQS server fails, the filter falls back either to the no assignment behavior, if it has not yet received a quota assignment, or to the expired assignment behavior, if the connection could not be re-established by the time the existing assignment expired.

Example 1

In this example, the rate limit quota filter in the HTTP connection manager filter chain is configured with three buckets. Note that a bucket ID is a map of key-value pairs; each bucket below is identified by a single name key.

  1. Bucket ID name: prod-rate-limit-quota, for all requests that carry the deployment: prod header. Until the RLQS assigns a quota, all requests are allowed.

  2. Bucket ID name: staging-rate-limit-quota, for all requests that carry the deployment: staging header. Until the RLQS assigns a quota, all requests are denied.

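  3. Bucket ID name: default-rate-limit-quota, for all other requests. Until the RLQS assigns a quota, all requests are allowed. If an assignment expires without being refreshed, a fallback limit of 1,000 requests per second is applied. Denied requests receive HTTP status 429.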

    rlqs_server:
      envoy_grpc:
        cluster_name: rate_limit_quota_service
    domain: "acme-services"
    bucket_matchers:
      matcher_list:
        matchers:
        # Bucket 1: requests with the header "deployment: prod"
        - predicate:
            single_predicate:
              input:
                name: request-headers
                typed_config:
                  "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
                  header_name: deployment
              value_match:
                exact: prod
          on_match:
            action:
              name: prod-bucket
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                bucket_id_builder:
                  bucket_id_builder:
                    "name":
                      string_value: "prod-rate-limit-quota"
                reporting_interval: 60s
                no_assignment_behavior:
                  blanket_rule: ALLOW_ALL
        # Bucket 2: requests with the header "deployment: staging"
        - predicate:
            single_predicate:
              input:
                name: request-headers
                typed_config:
                  "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
                  header_name: deployment
              value_match:
                exact: staging
          on_match:
            action:
              name: staging-bucket
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                bucket_id_builder:
                  bucket_id_builder:
                    "name":
                      string_value: "staging-rate-limit-quota"
                reporting_interval: 60s
                no_assignment_behavior:
                  blanket_rule: DENY_ALL
      # Bucket 3: the "catch all" bucket settings
      on_no_match:
        action:
          name: default-bucket
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
            bucket_id_builder:
              bucket_id_builder:
                "name":
                  string_value: "default-rate-limit-quota"
            reporting_interval: 60s
            deny_response_settings:
              http_status_code: 429
            no_assignment_behavior:
              blanket_rule: ALLOW_ALL
            expired_assignment_behavior:
              fallback_rate_limit:
                requests_per_time_unit:
                  requests_per_time_unit: 1000
                  time_unit: SECOND

Rate Limit Quota Override

The rate limit quota filter configuration can be overridden at the virtual host or route level using the RateLimitQuotaOverride configuration. The more specific configuration fully overrides the less specific configuration.
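A route-level override might look like the sketch below. It assumes that RateLimitQuotaOverride carries its own domain and bucket_matchers fields, and that the key under typed_per_filter_config matches the name given to the filter in the HTTP filter chain (assumed here to be envoy.filters.http.rate_limit_quota). The route prefix, cluster name and bucket names are hypothetical.

```yaml
# Sketch only: a route that replaces the listener-level bucket definitions.
routes:
- match:
    prefix: "/checkout"
  route:
    cluster: checkout_service
  typed_per_filter_config:
    # Assumed to match the filter's name in the HTTP filter chain.
    envoy.filters.http.rate_limit_quota:
      "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaOverride
      domain: "acme-services"
      bucket_matchers:
        matcher_list:
          matchers:
          - predicate:
              single_predicate:
                input:
                  name: request-headers
                  typed_config:
                    "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
                    header_name: deployment
                value_match:
                  exact: prod
            on_match:
              action:
                name: checkout-prod-bucket
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                  bucket_id_builder:
                    bucket_id_builder:
                      "name":
                        string_value: "checkout-prod-rate-limit-quota"
                  reporting_interval: 60s
```

Because the override replaces the less specific configuration completely and this sketch defines no on_no_match action, requests on this route without the deployment: prod header would not be rate limited, rather than falling through to a default bucket defined at the listener level.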

Matcher extensions

TODO

Statistics

The rate limit quota filter outputs statistics in the cluster.<route target cluster>.rate_limit_quota. namespace. Responses with status 429, or with the configured rate limited status, are counted in the normal cluster dynamic HTTP statistics.

Name               Type     Description
buckets            Counter  Total number of quota buckets created
assignments        Counter  Total rate limit assignments received from the rate limit quota service
error              Counter  Total errors contacting the rate limit quota service
over_limit         Counter  Total requests that exceeded the assigned rate limit
no_assigment       Counter  Total requests to which the no_assignment_behavior was applied
expired_assigment  Counter  Total requests to which the expired_assignment_behavior was applied