Rate Limit Quota (Work-In-Progress)
Attention
The Rate Limit Quota filter is under active development and is not yet ready for use. Its capabilities and configuration structures may change.
Global rate limiting architecture overview
This filter provides an implementation of the global rate limit quota protocol. The rate limit quota service (RLQS) provides quota assignments to each Envoy instance connected to it. In addition to enforcing those assignments, the filter periodically reports request rates to the RLQS, allowing the RLQS server to rebalance quota assignments between Envoy instances based on the real-time load of each instance. When quota assignments change, the RLQS proactively pushes the new assignment to Envoy.
The HTTP rate limit quota filter calls the rate limit quota service when it is configured in the HTTP connection manager filter chain. The filter configuration defines the RLQS service and the quota buckets that will receive quota assignments from the server. Quota buckets are defined by a set of matchers that determine whether a request is subject to the rate limit quota assigned to a bucket. Each matcher can produce multiple buckets by means of the bucket_id_builder. The bucket ID builder allows quota buckets to be generated either dynamically, based on request attributes such as a request header value, or statically, based on the configuration.
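To illustrate the two ways of populating a bucket ID, the fragment below sketches a bucket_id_builder that combines a static entry with a dynamic one. It assumes that a dynamic entry is expressed through a custom_value field holding a typed extension config, and that the HttpRequestHeaderMatchInput extension can be used there. The tenant key, the tenant-header name, and the x-tenant header are hypothetical, and because the filter is still under development the field names may differ in your Envoy version.

```yaml
bucket_id_builder:
  bucket_id_builder:
    # Static entry: every request matched to this bucket gets the same value.
    "name":
      string_value: "prod-rate-limit-quota"
    # Dynamic entry (assumed syntax): the value is read from the request's
    # x-tenant header, so each distinct header value yields its own bucket.
    "tenant":
      custom_value:
        name: tenant-header
        typed_config:
          "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
          header_name: x-tenant
```

Under these assumptions, requests carrying x-tenant: acme and x-tenant: globex would map to two separate bucket IDs, each of which would receive its own quota assignment from the RLQS.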
If a request does not match any set of matchers, the quota assignment for the “catch all” bucket configured by the on_no_match field of the bucket_matchers is applied. If the on_no_match configuration is not provided, unmatched requests are not rate limited (i.e. fail-open).
Bucket definitions can be overridden by the virtual host or route configurations. The more specific definition completely overrides the less specific definition.
Initially, an Envoy instance has no quota assignments. The rate limit quota filter requests a quota assignment from the RLQS the first time a request matches a bucket.
The behavior of the filter while it waits for the initial assignment is determined by the no_assignment_behavior value. In this state, all requests are either immediately allowed, immediately denied, or enqueued until a quota assignment is received.
A quota assignment may have an associated time to live (TTL). The RLQS is expected to update the assignment before the TTL runs out. If the RLQS fails to update the assignment and its TTL expires, the filter can be configured either to continue using the last quota assignment or to fall back to a value predefined in the expired assignment configuration.
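The two options for an expired assignment might be configured as in the sketch below, shown as two alternative YAML documents. The reuse_last_assignment field name is an assumption based on the filter's v3 API and is not confirmed by this page; the fallback limit of 100 requests per second is an arbitrary illustrative value.

```yaml
# Option 1 (assumed field name): keep enforcing the last assignment received.
expired_assignment_behavior:
  reuse_last_assignment: {}
---
# Option 2: fall back to a predefined limit once the assignment's TTL expires.
expired_assignment_behavior:
  fallback_rate_limit:
    requests_per_time_unit:
      requests_per_time_unit: 100
      time_unit: SECOND
```

The first option favors continuity when the RLQS is briefly unavailable, while the second bounds the load a bucket can admit without a fresh assignment.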
The rate limit quota filter reports the request load for each bucket to the RLQS at the configured reporting_interval. The RLQS may rebalance quota assignments based on the request load that each Envoy receives and push new quota assignments to Envoy instances.
When the connection to the RLQS server fails, the filter falls back either to the no assignment behavior, if it has not yet received a rate limit quota, or to the expired assignment behavior, if the connection could not be re-established by the time the existing quota expired.
Example 1
In this example the HTTP connection manager has the following bucket definitions in the rate limit quota filter configuration. This configuration enables a rate limit quota filter with 3 buckets. Note that bucket ID is a map of key-value pairs.
- Bucket ID name: prod-rate-limit-quota for all requests with the deployment: prod header present. Until the RLQS assigns a quota, all requests are allowed.
- Bucket ID name: staging-rate-limit-quota for all requests with the deployment: staging header present. Until the RLQS assigns a quota, all requests are denied.
- Bucket ID name: default-rate-limit-quota for all other requests. Until the RLQS assigns a quota, all requests are allowed. If an assignment expires without being refreshed, a fallback quota of 1,000 requests per second is applied.
    rlqs_server:
      envoy_grpc:
        cluster_name: rate_limit_quota_service
    domain: "acme-services"
    bucket_matchers:
      matcher_list:
        matchers:
        # Requests with the header "deployment: prod"
        - predicate:
            single_predicate:
              input:
                name: request-headers
                typed_config:
                  "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
                  header_name: deployment
              value_match:
                exact: prod
          on_match:
            action:
              name: prod-bucket
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                bucket_id_builder:
                  bucket_id_builder:
                    "name":
                      string_value: "prod-rate-limit-quota"
                reporting_interval: 60s
                no_assignment_behavior:
                  blanket_rule: ALLOW_ALL
        # Requests with the header "deployment: staging"
        - predicate:
            single_predicate:
              input:
                name: request-headers
                typed_config:
                  "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
                  header_name: deployment
              value_match:
                exact: staging
          on_match:
            action:
              name: staging-bucket
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                bucket_id_builder:
                  bucket_id_builder:
                    "name":
                      string_value: "staging-rate-limit-quota"
                reporting_interval: 60s
                no_assignment_behavior:
                  blanket_rule: DENY_ALL
      # The "catch all" bucket settings
      on_no_match:
        action:
          name: default-bucket
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
            bucket_id_builder:
              bucket_id_builder:
                "name":
                  string_value: "default-rate-limit-quota"
            reporting_interval: 60s
            deny_response_settings:
              http_status_code: 429
            no_assignment_behavior:
              blanket_rule: ALLOW_ALL
            expired_assignment_behavior:
              fallback_rate_limit:
                requests_per_time_unit:
                  requests_per_time_unit: 1000
                  time_unit: SECOND
Rate Limit Quota Override
The rate limit quota filter configuration can be overridden at the virtual host or route level using the RateLimitQuotaOverride configuration. The more specific configuration fully overrides the less specific configuration.
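A route-level override might look like the sketch below. It assumes that the override is attached through typed_per_filter_config under the filter name envoy.filters.http.rate_limit_quota, and that RateLimitQuotaOverride accepts domain and bucket_matchers fields in the same shape as the filter configuration. The route prefix, cluster name, header name, and bucket names are hypothetical.

```yaml
routes:
- match:
    prefix: "/checkout"
  route:
    cluster: checkout_service
  typed_per_filter_config:
    # Assumed filter name used as the per-route configuration key.
    envoy.filters.http.rate_limit_quota:
      "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaOverride
      domain: "acme-services"
      bucket_matchers:
        matcher_list:
          matchers:
          # Only requests on this route with "deployment: prod" are limited.
          - predicate:
              single_predicate:
                input:
                  name: request-headers
                  typed_config:
                    "@type": type.googleapis.com/envoy.type.matcher.v3.HttpRequestHeaderMatchInput
                    header_name: deployment
                value_match:
                  exact: prod
            on_match:
              action:
                name: checkout-prod-bucket
                typed_config:
                  "@type": type.googleapis.com/envoy.extensions.filters.http.rate_limit_quota.v3.RateLimitQuotaBucketSettings
                  bucket_id_builder:
                    bucket_id_builder:
                      "name":
                        string_value: "checkout-prod-rate-limit-quota"
                  reporting_interval: 60s
                  no_assignment_behavior:
                    blanket_rule: DENY_ALL
```

Because the more specific definition replaces the less specific one entirely, a route configured this way would not inherit any buckets from the HTTP connection manager level. Since this override defines no on_no_match, other requests on the route would not be rate limited.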
Matcher extensions
TODO
Statistics
The rate limit quota filter outputs statistics in the cluster.<route target cluster>.rate_limit_quota. namespace. 429 responses, or responses with the configured rate limited status, are emitted to the normal cluster dynamic HTTP statistics.
| Name | Type | Description |
|---|---|---|
| buckets | Counter | Total number of quota buckets created |
| assignments | Counter | Total rate limit assignments received from the rate limit quota service |
| error | Counter | Total errors contacting the rate limit quota service |
| over_limit | Counter | Total requests that exceeded the assigned rate limit |
| no_assignment | Counter | Total requests to which the no_assignment_behavior was applied |
| expired_assignment | Counter | Total requests to which the expired_assignment_behavior was applied |