Local rate limit

The HTTP local rate limit filter applies a token bucket rate limit when the request’s route or virtual host has a per-filter local rate limit configuration.

If the local rate limit token bucket is checked, and there are no tokens available, a 429 response is returned (the response is configurable). The local rate limit filter also sets the x-envoy-ratelimited header. Additional response headers may be configured.

Note

The token bucket is shared across all workers, thus the rate limits are applied per Envoy process.
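The token bucket behaviour described above can be sketched in a few lines of Python. This is an illustrative model only, not Envoy's implementation: the class `TokenBucket` and the helper `handle_request` are hypothetical names, the parameters mirror the `max_tokens`, `tokens_per_fill` and `fill_interval` fields used in the example configurations, and the bucket is assumed to start full.

```python
import time


class TokenBucket:
    """Minimal token bucket: holds up to max_tokens, refilled in fixed steps."""

    def __init__(self, max_tokens, tokens_per_fill, fill_interval_s, now=time.monotonic):
        self.max_tokens = max_tokens
        self.tokens_per_fill = tokens_per_fill
        self.fill_interval_s = fill_interval_s
        self._now = now                # injectable clock, handy for testing
        self.tokens = max_tokens       # assumption: the bucket starts full
        self._last_fill = now()

    def _refill(self):
        # Add tokens_per_fill for every whole fill interval that has elapsed,
        # never exceeding max_tokens.
        elapsed = self._now() - self._last_fill
        intervals = int(elapsed // self.fill_interval_s)
        if intervals > 0:
            self.tokens = min(self.max_tokens,
                              self.tokens + intervals * self.tokens_per_fill)
            self._last_fill += intervals * self.fill_interval_s

    def try_consume(self):
        """Take one token if available; return False when the bucket is empty."""
        self._refill()
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


def handle_request(bucket):
    """Hypothetical helper: 200 when a token was available, otherwise 429."""
    return 200 if bucket.try_consume() else 429
```

With `max_tokens: 10000`, `tokens_per_fill: 1000` and `fill_interval: 1s`, a model like this allows a burst of up to 10000 requests and then a sustained 1000 requests per second.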

Example configuration

Example filter configuration for a globally set rate limiter (i.e. all vhosts/routes share the same token bucket):

name: envoy.filters.http.local_ratelimit
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  token_bucket:
    max_tokens: 10000
    tokens_per_fill: 1000
    fill_interval: 1s
  filter_enabled:
    runtime_key: local_rate_limit_enabled
    default_value:
      numerator: 100
      denominator: HUNDRED
  filter_enforced:
    runtime_key: local_rate_limit_enforced
    default_value:
      numerator: 100
      denominator: HUNDRED
  response_headers_to_add:
    - append: false
      header:
        key: x-local-rate-limit
        value: 'true'

Example filter configuration for a globally disabled rate limiter but enabled for a specific route:

name: envoy.filters.http.local_ratelimit
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter

The route specific configuration:

route_config:
  name: local_route
  virtual_hosts:
  - name: local_service
    domains: ["*"]
    routes:
    - match: { prefix: "/path/with/rate/limit" }
      route: { cluster: service_protected_by_rate_limit }
      typed_per_filter_config:
        envoy.filters.http.local_ratelimit:
          "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
          token_bucket:
            max_tokens: 10000
            tokens_per_fill: 1000
            fill_interval: 1s
          filter_enabled:
            runtime_key: local_rate_limit_enabled
            default_value:
              numerator: 100
              denominator: HUNDRED
          filter_enforced:
            runtime_key: local_rate_limit_enforced
            default_value:
              numerator: 100
              denominator: HUNDRED
          response_headers_to_add:
            - append: false
              header:
                key: x-local-rate-limit
                value: 'true'
    - match: { prefix: "/" }
      route: { cluster: default_service }

Note that if this filter is configured as globally disabled and there are no virtual host or route level token buckets, no rate limiting will be applied.

Using rate limit descriptors for local rate limiting

Rate limit descriptors can be used to override the local per-route rate limit. A route’s rate limit actions generate descriptor entries, which are matched against the local descriptors in the filter configuration’s descriptor list. If a local descriptor’s entries match the entries generated by the route’s rate limit actions, that descriptor’s token bucket decides whether the request is rate limited. If there are no matching descriptor entries, the default token bucket is used.

Example filter configuration using descriptors:

route_config:
  name: local_route
  virtual_hosts:
  - name: local_service
    domains: ["*"]
    routes:
    - match: { prefix: "/foo" }
      route: { cluster: service_protected_by_rate_limit }
      typed_per_filter_config:
        envoy.filters.http.local_ratelimit:
          "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
          stat_prefix: test
          token_bucket:
            max_tokens: 1000
            tokens_per_fill: 1000
            fill_interval: 60s
          filter_enabled:
            runtime_key: test_enabled
            default_value:
              numerator: 100
              denominator: HUNDRED
          filter_enforced:
            runtime_key: test_enforced
            default_value:
              numerator: 100
              denominator: HUNDRED
          response_headers_to_add:
            - append: false
              header:
                key: x-test-rate-limit
                value: 'true'
          descriptors:
          - entries:
            - key: client_cluster
              value: foo
            - key: path
              value: /foo/bar
            token_bucket:
              max_tokens: 10
              tokens_per_fill: 10
              fill_interval: 60s
          - entries:
            - key: client_cluster
              value: foo
            - key: path
              value: /foo/bar2
            token_bucket:
              max_tokens: 100
              tokens_per_fill: 100
              fill_interval: 60s
    - match: { prefix: "/" }
      route: { cluster: default_service }
    rate_limits:
    - actions: # any actions in here
      - request_headers:
          header_name: x-envoy-downstream-service-cluster
          descriptor_key: client_cluster
      - request_headers:
          header_name: ":path"
          descriptor_key: path

In this example, requests to routes with the prefix “/foo” are rate limited as follows. If a request comes from the downstream service cluster “foo” for the path “/foo/bar”, 10 req/min are allowed. If it comes from the downstream service cluster “foo” for the path “/foo/bar2”, 100 req/min are allowed. Otherwise, 1000 req/min are allowed.
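The bucket selection in this example can be modelled roughly as follows. This is a sketch under stated assumptions rather than Envoy's actual matching code: `build_descriptor` and `limit_for` are hypothetical helpers, descriptor values are assumed to match exactly, and a request missing one of the headers is assumed to produce no descriptor and so fall back to the default bucket.

```python
# Per-minute limits copied from the example configuration above.
DEFAULT_LIMIT = 1000
DESCRIPTOR_LIMITS = {
    (("client_cluster", "foo"), ("path", "/foo/bar")): 10,
    (("client_cluster", "foo"), ("path", "/foo/bar2")): 100,
}

# The two request_headers actions: (header_name, descriptor_key).
ACTIONS = [
    ("x-envoy-downstream-service-cluster", "client_cluster"),
    (":path", "path"),
]


def build_descriptor(headers):
    """Turn request headers into descriptor entries, one entry per action."""
    entries = []
    for header_name, descriptor_key in ACTIONS:
        if header_name not in headers:
            return None  # assumption: a missing header yields no descriptor
        entries.append((descriptor_key, headers[header_name]))
    return tuple(entries)


def limit_for(headers):
    """Return the requests-per-minute limit that applies to this request."""
    descriptor = build_descriptor(headers)
    # Fall back to the default token bucket when no descriptor matches.
    return DESCRIPTOR_LIMITS.get(descriptor, DEFAULT_LIMIT)
```

For instance, `limit_for({"x-envoy-downstream-service-cluster": "foo", ":path": "/foo/bar"})` selects the 10 req/min bucket, while the same path requested from a different cluster falls through to the default 1000 req/min bucket.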

Statistics

The local rate limit filter outputs statistics in the <stat_prefix>.http_local_rate_limit. namespace. 429 responses – or the configured status code – are emitted to the normal cluster dynamic HTTP statistics.

Name          Type     Description
enabled       Counter  Total number of requests for which the rate limiter was consulted
ok            Counter  Total under limit responses from the token bucket
rate_limited  Counter  Total responses without an available token (but not necessarily enforced)
enforced      Counter  Total number of requests for which rate limiting was applied (e.g. 429 returned)

Runtime

The HTTP local rate limit filter supports the following runtime fractional settings:

http_filter_enabled

Percentage of requests for which the local rate limit decision is checked, but not enforced, for a given runtime_key specified in the local rate limit configuration. Defaults to 0.

http_filter_enforcing

Percentage of requests for which the local rate limit decision is enforced, for a given runtime_key specified in the local rate limit configuration. Defaults to 0. This can be used to test what would happen before fully enforcing the outcome.
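One way to picture how the two fractions interact with the counters listed under Statistics is the sketch below. It is a simplified model, not Envoy's code: the function `process` is hypothetical, and the order of the checks is an assumption inferred from the counter descriptions.

```python
import random
from collections import Counter


def process(has_token, enabled_pct, enforced_pct, stats, rng=random.random):
    """Return the status for one request and update the counters in stats."""
    # Only the enabled fraction of requests consults the rate limiter.
    if rng() * 100 >= enabled_pct:
        return 200
    stats["enabled"] += 1

    if has_token:
        stats["ok"] += 1
        return 200

    # No token available: counted as rate limited, whether or not enforced.
    stats["rate_limited"] += 1
    if rng() * 100 >= enforced_pct:
        return 200  # over the limit, but the decision is not enforced
    stats["enforced"] += 1
    return 429


stats = Counter()
# Enabled for all requests but enforced for none: over-limit requests are
# counted as rate_limited and still pass through.
status = process(has_token=False, enabled_pct=100, enforced_pct=0, stats=stats)
```

Running with enforcement at 0 and watching `rate_limited` gives an estimate of how many requests would be rejected once enforcement is raised.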