Local rate limit
Local rate limiting architecture overview
This filter should be configured with the type URL type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit.
The HTTP local rate limit filter applies a token bucket rate limit when the request’s route or virtual host has a per filter local rate limit configuration.
If the local rate limit token bucket is checked, and there are no tokens available, a 429 response is returned (the response is configurable). The local rate limit filter then sets the x-envoy-ratelimited response header. Additional response headers can be configured to be returned.
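As a minimal sketch of the configurable response mentioned above, the fragment below sets the filter's status field so that rate limited requests receive a 503 instead of the default 429. The token bucket sizes are illustrative assumptions, not recommendations.

```yaml
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  token_bucket:            # illustrative sizes only
    max_tokens: 100
    tokens_per_fill: 100
    fill_interval: 1s
  filter_enabled:
    runtime_key: local_rate_limit_enabled
    default_value:
      numerator: 100
      denominator: HUNDRED
  filter_enforced:
    runtime_key: local_rate_limit_enforced
    default_value:
      numerator: 100
      denominator: HUNDRED
  status:
    code: ServiceUnavailable   # returned instead of 429 when a request is rate limited
```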
Request headers can be configured to be added to forwarded requests to the upstream when the local rate limit filter is enabled but not enforced.
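The enabled-but-not-enforced mode could look roughly like the fragment below: every request checks the token bucket, none is rejected, and requests that would have been limited are forwarded upstream with an extra header. The header name x-local-rate-limit-would-reject is a hypothetical name chosen for illustration, and the token bucket sizes are assumptions.

```yaml
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  token_bucket:            # illustrative sizes only
    max_tokens: 100
    tokens_per_fill: 100
    fill_interval: 1s
  filter_enabled:          # check the token bucket for 100% of requests
    runtime_key: local_rate_limit_enabled
    default_value:
      numerator: 100
      denominator: HUNDRED
  filter_enforced:         # enforce the decision for 0% of requests
    runtime_key: local_rate_limit_enforced
    default_value:
      numerator: 0
      denominator: HUNDRED
  request_headers_to_add_when_not_enforced:
  - append_action: OVERWRITE_IF_EXISTS_OR_ADD
    header:
      key: x-local-rate-limit-would-reject   # hypothetical header name
      value: 'true'
```

With a setup like this, the upstream service or its access logs can show which requests would be rejected before enforcement is turned on.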
Depending on the value of the config local_rate_limit_per_downstream_connection, the token bucket is either shared across all workers or on a per connection basis. This results in the local rate limits being applied either per Envoy process or per downstream connection. By default the rate limits are applied per Envoy process.
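For comparison with the default, a sketch of a per-connection limiter is shown below. Each downstream connection gets its own token bucket, so the limit applies to each connection separately rather than to the Envoy process as a whole. The bucket sizes are illustrative assumptions.

```yaml
http_filters:
- name: envoy.filters.http.local_ratelimit
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
    stat_prefix: http_local_rate_limiter
    token_bucket:          # illustrative: 50 requests per second per connection
      max_tokens: 50
      tokens_per_fill: 50
      fill_interval: 1s
    filter_enabled:
      runtime_key: local_rate_limit_enabled
      default_value:
        numerator: 100
        denominator: HUNDRED
    filter_enforced:
      runtime_key: local_rate_limit_enforced
      default_value:
        numerator: 100
        denominator: HUNDRED
    local_rate_limit_per_downstream_connection: true
```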
Example configuration
Example filter configuration for a globally set rate limiter (i.e., all vhosts/routes share the same token bucket):
    http_filters:
    - name: envoy.filters.http.local_ratelimit
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
        stat_prefix: http_local_rate_limiter
        token_bucket:
          max_tokens: 10000
          tokens_per_fill: 1000
          fill_interval: 1s
        filter_enabled:
          runtime_key: local_rate_limit_enabled
          default_value:
            numerator: 100
            denominator: HUNDRED
        filter_enforced:
          runtime_key: local_rate_limit_enforced
          default_value:
            numerator: 100
            denominator: HUNDRED
        response_headers_to_add:
        - append_action: OVERWRITE_IF_EXISTS_OR_ADD
          header:
            key: x-local-rate-limit
            value: 'true'
        local_rate_limit_per_downstream_connection: false
Example filter configuration for a globally disabled rate limiter but enabled for a specific route:
    http_filters:
    - name: envoy.filters.http.local_ratelimit
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
        stat_prefix: http_local_rate_limiter
The route specific configuration:
    route_config:
      name: local_route
      virtual_hosts:
      - name: local_service
        domains: ["*"]
        routes:
        - match: { prefix: "/path/with/rate/limit" }
          route: { cluster: service_protected_by_rate_limit }
          typed_per_filter_config:
            envoy.filters.http.local_ratelimit:
              "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
              stat_prefix: http_local_rate_limiter
              token_bucket:
                max_tokens: 10000
                tokens_per_fill: 1000
                fill_interval: 1s
              filter_enabled:
                runtime_key: local_rate_limit_enabled
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              filter_enforced:
                runtime_key: local_rate_limit_enforced
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              response_headers_to_add:
              - append_action: OVERWRITE_IF_EXISTS_OR_ADD
                header:
                  key: x-local-rate-limit
                  value: 'true'
        - match: { prefix: "/" }
          route: { cluster: default_service }
Note that if this filter is configured as globally disabled and there are no virtual host or route level token buckets, no rate limiting will be applied.
Using rate limit descriptors for local rate limiting
Rate limit descriptors can be used to override local per-route rate limiting. A route’s rate limit actions generate descriptor entries, which are matched against the local descriptors in the filter config’s descriptor list. When a local descriptor’s entries match the entries generated by the route’s rate limit actions, that descriptor’s token bucket decides whether the request is rate limited. If no descriptor matches, the default token bucket is used.
All matched local descriptors are sorted by tokens per second, and tokens are consumed in that order. In most cases, once one descriptor is limited, the remaining descriptors do not consume tokens. This does not always hold. For example, take two descriptors A and B, where A allows 3 requests per second and B allows 20 requests per 10 seconds. B is stricter than A in tokens per second (2 versus 3), so B is consulted first. As a result, if more than 3 requests arrive within one second, the requests that A rejects have already consumed tokens from B. Note that the global token bucket is not part of this sort, so we suggest configuring it with a larger limit than any descriptor.
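The A and B scenario described above could be configured roughly as in the fragment below. It is a sketch under assumptions: the descriptor keys and values are hypothetical, and the route is assumed to carry two separate rate limit entries so that a single request can match both local descriptors.

```yaml
# On the route: two rate limit entries, each producing a one-entry descriptor.
rate_limits:
- actions:
  - request_headers:
      header_name: x-envoy-downstream-service-cluster
      descriptor_key: client_cluster
- actions:
  - request_headers:
      header_name: ":path"
      descriptor_key: path

# In the filter config: the two local descriptors from the example.
descriptors:
- entries:                 # descriptor A: 3 requests per second
  - key: client_cluster
    value: foo             # hypothetical value
  token_bucket:
    max_tokens: 3
    tokens_per_fill: 3
    fill_interval: 1s
- entries:                 # descriptor B: 20 requests per 10 seconds (2 per second on average)
  - key: path
    value: /foo/bar        # hypothetical value
  token_bucket:
    max_tokens: 20
    tokens_per_fill: 20
    fill_interval: 10s
```

Under these settings, a burst of 5 matching requests within one second would take 5 tokens from B but only 3 from A, which is the over-consumption the paragraph above warns about.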
Example filter configuration using descriptors:
    route_config:
      name: local_route
      virtual_hosts:
      - name: local_service
        domains: ["*"]
        routes:
        - match: { prefix: "/foo" }
          route:
            cluster: service_protected_by_rate_limit
            rate_limits:
            - actions: # any actions in here
              - request_headers:
                  header_name: x-envoy-downstream-service-cluster
                  descriptor_key: client_cluster
              - request_headers:
                  header_name: ":path"
                  descriptor_key: path
          typed_per_filter_config:
            envoy.filters.http.local_ratelimit:
              "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
              stat_prefix: test
              token_bucket:
                max_tokens: 1000
                tokens_per_fill: 1000
                fill_interval: 60s
              filter_enabled:
                runtime_key: test_enabled
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              filter_enforced:
                runtime_key: test_enforced
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              response_headers_to_add:
              - append_action: OVERWRITE_IF_EXISTS_OR_ADD
                header:
                  key: x-test-rate-limit
                  value: 'true'
              descriptors:
              - entries:
                - key: client_cluster
                  value: foo
                - key: path
                  value: /foo/bar
                token_bucket:
                  max_tokens: 10
                  tokens_per_fill: 10
                  fill_interval: 60s
              - entries:
                - key: client_cluster
                  value: foo
                - key: path
                  value: /foo/bar2
                token_bucket:
                  max_tokens: 100
                  tokens_per_fill: 100
                  fill_interval: 60s
        - match: { prefix: "/" }
          route: { cluster: default_service }
In this example, requests are rate-limited for routes prefixed with “/foo” as follows. If requests come from a downstream service cluster “foo” for “/foo/bar” path, then 10 req/min are allowed. But if they come from a downstream service cluster “foo” for “/foo/bar2” path, then 100 req/min are allowed. Otherwise, 1000 req/min are allowed.
Statistics
The local rate limit filter outputs statistics in the <stat_prefix>.http_local_rate_limit. namespace. 429 responses – or the configured status code – are emitted to the normal cluster dynamic HTTP statistics.
Name | Type | Description |
---|---|---|
enabled | Counter | Total number of requests for which the rate limiter was consulted |
ok | Counter | Total under limit responses from the token bucket |
rate_limited | Counter | Total responses without an available token (but not necessarily enforced) |
enforced | Counter | Total number of requests for which rate limiting was applied (e.g., 429 returned) |
Runtime
The HTTP local rate limit filter supports the following runtime fractional settings:
- http_filter_enabled: % of requests that will check the local rate limit decision, but not enforce it, for a given route_key specified in the local rate limit configuration. Defaults to 0.
- http_filter_enforcing: % of requests that will enforce the local rate limit decision for a given route_key specified in the local rate limit configuration. Defaults to 0. This can be used to test what would happen before fully enforcing the outcome.
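As an illustration of testing before enforcing, the bootstrap fragment below overrides the runtime keys named in the first example configuration (local_rate_limit_enabled and local_rate_limit_enforced) through a static runtime layer. It is a sketch that assumes those keys and assumes integer runtime values are read as percentages; the layer name is hypothetical.

```yaml
layered_runtime:
  layers:
  - name: static_layer_0             # hypothetical layer name
    static_layer:
      local_rate_limit_enabled: 100  # check the token bucket for all requests
      local_rate_limit_enforced: 0   # but reject none of them yet
```

In this state the rate_limited counter would be expected to grow while enforced stays flat, which gives an estimate of the impact before the enforced percentage is raised.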