Composite cluster

The composite cluster type provides retry-aware cluster selection, allowing different retry attempts to automatically target different upstream clusters. Unlike the standard aggregate cluster, which uses health-based selection, the composite cluster uses the retry attempt count to deterministically select which sub-cluster to route to.

Use cases

The composite cluster addresses several important scenarios:

  • Retry-based progression: Use a different cluster for each retry attempt (primary → secondary → tertiary).

  • AI Gateway failover: Route initial requests to preferred providers and retries to fallbacks.

  • Cost optimization: Try expensive, high-performance services first, then fall back to cheaper alternatives.

Configuration

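The composite cluster is configured through the ClusterConfig message (envoy.extensions.clusters.composite.v3.ClusterConfig), which is supplied as the typed_config of the cluster's cluster_type.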

Example configuration

The following example shows a composite cluster with three sub-clusters:

name: composite_cluster
connect_timeout: 0.25s
lb_policy: CLUSTER_PROVIDED
cluster_type:
  name: envoy.clusters.composite
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.clusters.composite.v3.ClusterConfig
    clusters:
    - name: primary_cluster
    - name: secondary_cluster
    - name: fallback_cluster

In this configuration:

  • Initial requests (attempt 1) go to primary_cluster.

  • First retries (attempt 2) go to secondary_cluster.

  • Second retries (attempt 3) go to fallback_cluster.

  • Further retry attempts (attempt 4+) will fail with no host available.
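The entries under clusters are names only, so the sub-clusters presumably need to exist as ordinary clusters in the same configuration, as is the case for the aggregate cluster. The following is a minimal sketch assuming a static bootstrap configuration. The hostname and port are hypothetical placeholders, and only one sub-cluster is written out in full.

```yaml
static_resources:
  clusters:
  # Composite cluster from the example above.
  - name: composite_cluster
    connect_timeout: 0.25s
    lb_policy: CLUSTER_PROVIDED
    cluster_type:
      name: envoy.clusters.composite
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.clusters.composite.v3.ClusterConfig
        clusters:
        - name: primary_cluster
        - name: secondary_cluster
        - name: fallback_cluster
  # Sub-cluster targeted by attempt 1. Hostname and port are hypothetical.
  - name: primary_cluster
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: primary_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: primary.example.internal
                port_value: 8080
  # secondary_cluster and fallback_cluster would be defined the same way.
```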

Cluster selection

The composite cluster uses a sequential selection strategy based on retry attempt count:

  • Initial request (attempt 1): Uses the first cluster.

  • First retry (attempt 2): Uses the second cluster.

  • Second retry (attempt 3): Uses the third cluster.

  • Further retries: Fail with no host available.

When the attempt number exceeds the number of configured sub-clusters, the request fails with no host available. Set num_retries in the route's retry policy to the number of sub-clusters minus one (for example, 2 for three sub-clusters) so that every attempt maps to a configured cluster.
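As a sketch of the matching rule above, a composite cluster with two sub-clusters pairs with num_retries: 1, so that the two possible attempts map onto the two entries. The cluster names below are hypothetical and echo the AI gateway failover use case. The two fragments belong in the cluster and route configuration respectively.

```yaml
# Cluster fragment: two sub-clusters, so at most two attempts can be served.
cluster_type:
  name: envoy.clusters.composite
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.clusters.composite.v3.ClusterConfig
    clusters:
    - name: preferred_provider   # attempt 1 (hypothetical name)
    - name: fallback_provider    # attempt 2 (hypothetical name)
---
# Route fragment: one retry gives 2 total attempts, matching the 2 sub-clusters.
retry_policy:
  retry_on: "5xx,connect-failure"
  num_retries: 1
```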

Retry policy coordination

The composite cluster only moves to the next sub-cluster when the router retries a request, so the route must carry a retry policy whose retry count matches the number of sub-clusters:

retry_policy:
  retry_on: "5xx,gateway-error,connect-failure,refused-stream"
  num_retries: 2  # Enables attempts 1, 2, and 3 (3 total attempts)
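
For context, the sketch below shows where this policy might sit in a route configuration. The route configuration name, virtual host name, domains, prefix, and per_try_timeout value are illustrative assumptions. The route simply targets the composite cluster by name.

```yaml
route_config:
  name: local_route              # hypothetical name
  virtual_hosts:
  - name: backend                # hypothetical name
    domains: ["*"]
    routes:
    - match:
        prefix: "/"
      route:
        cluster: composite_cluster
        retry_policy:
          retry_on: "5xx,gateway-error,connect-failure,refused-stream"
          num_retries: 2         # 3 total attempts, one per sub-cluster
          per_try_timeout: 2s    # assumed value; bounds each attempt separately
```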

Important considerations

  • Sub-cluster independence: Each sub-cluster maintains its own health checking, load balancing, and outlier detection. If a selected sub-cluster has no healthy hosts available, the request will fail according to that sub-cluster’s load balancing behavior, potentially triggering another retry attempt if configured.

  • Deterministic routing: The same retry attempt will always target the same cluster given identical configuration. Cluster selection is based solely on retry attempt count, not on the health status of sub-clusters.

  • Thread-local clustering: Cluster selection runs on the worker thread handling the request, using thread-local cluster state, which keeps selection off any shared, cross-thread path.

  • Sub-cluster health: Unlike the aggregate cluster, the composite cluster does not consider sub-cluster health when selecting which cluster to use. Each retry attempt targets a specific cluster based on attempt count, regardless of whether that cluster has healthy endpoints available.
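
Because each sub-cluster is configured independently, health checking and outlier detection are set on the sub-cluster itself rather than on the composite cluster. A sketch for one sub-cluster follows. The health check path, thresholds, intervals, and endpoint address are assumed values for illustration only.

```yaml
name: secondary_cluster
connect_timeout: 0.25s
type: STRICT_DNS
lb_policy: ROUND_ROBIN
# Active health checking for this sub-cluster only (assumed values).
health_checks:
- timeout: 1s
  interval: 5s
  unhealthy_threshold: 3
  healthy_threshold: 2
  http_health_check:
    path: /healthz
# Passive ejection of misbehaving hosts in this sub-cluster (assumed values).
outlier_detection:
  consecutive_5xx: 5
  interval: 10s
  base_ejection_time: 30s
load_assignment:
  cluster_name: secondary_cluster
  endpoints:
  - lb_endpoints:
    - endpoint:
        address:
          socket_address:
            address: secondary.example.internal   # hypothetical host
            port_value: 8080
```

These settings influence which hosts inside secondary_cluster receive attempt 2. They do not change which sub-cluster the composite cluster selects.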

Comparison with aggregate cluster

Feature              Aggregate Cluster        Composite Cluster
Selection basis      Health status            Retry attempt count
Primary use case     Health-based failover    Retry progression
Overflow handling    Health-dependent         Fails request
Predictability       Health-dependent         Fully deterministic
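
For contrast, an aggregate cluster over the same sub-clusters could be declared as in the sketch below, which follows the aggregate cluster's configuration shape (a plain list of cluster names rather than a list of named entries). With this configuration, traffic moves to secondary_cluster based on the health of primary_cluster rather than on the retry attempt number.

```yaml
name: aggregate_cluster
connect_timeout: 0.25s
lb_policy: CLUSTER_PROVIDED
cluster_type:
  name: envoy.clusters.aggregate
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
    clusters:
    # Listed in priority order; selection depends on sub-cluster health.
    - primary_cluster
    - secondary_cluster
    - fallback_cluster
```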