Administration interface

Envoy exposes a local administration interface that can be used to query and modify different aspects of the server:

v3 API reference

Attention

The administration interface in its current form allows destructive operations to be performed (e.g., shutting down the server) and potentially exposes private information (e.g., stats, cluster names, cert info, etc.). It is critical that access to the administration interface is only allowed via a secure network. It is also critical that hosts that access the administration interface are only attached to the secure network (i.e., to avoid CSRF attacks). This involves setting up an appropriate firewall or, ideally, only allowing access to the administration listener via localhost. You can additionally restrict which admin paths are reachable using allow_paths. This can be accomplished with a configuration like the following:

admin-interface.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 127.0.0.1
      port_value: 9901
  allow_paths:
  - exact: /ready
  - prefix: /stats
  profile_path: /tmp/envoy.prof

All mutations must be sent as HTTP POST operations. When a mutation is requested via GET, the request has no effect, and an HTTP 400 (Invalid Request) response is returned.
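As a minimal sketch of the POST requirement, the following Python snippet builds (but does not send) a POST request for a mutating endpoint using only the standard library. The base address is assumed to be the 127.0.0.1:9901 listener from the example configuration, and the helper name admin_post_request is hypothetical.

```python
import urllib.request

ADMIN_BASE = "http://127.0.0.1:9901"  # assumed: the address from the example config


def admin_post_request(path: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a mutating admin endpoint."""
    # Without method="POST", urllib defaults to GET for a request with no body.
    return urllib.request.Request(ADMIN_BASE + path, method="POST")


req = admin_post_request("/reset_counters")
print(req.get_method(), req.full_url)
# To send it against a running Envoy: urllib.request.urlopen(req, timeout=5)
```

Building the request separately from sending it makes it easy to check the method before anything reaches the server.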

Note

For an endpoint with ?format=json, it dumps data as a JSON-serialized proto. Fields with default values are not rendered. For example, for /clusters?format=json, the priority field of the circuit breaker thresholds is omitted when its value is the DEFAULT priority, as shown below:

{
 "thresholds": [
  {
   "max_connections": 1,
   "max_pending_requests": 1024,
   "max_requests": 1024,
   "max_retries": 1
  },
  {
   "priority": "HIGH",
   "max_connections": 1,
   "max_pending_requests": 1024,
   "max_requests": 1024,
   "max_retries": 1
  }
 ]
}
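Because default-valued fields are omitted, a client has to supply the default itself when reading the JSON. The following Python sketch does this for the thresholds sample above; the variable names are illustrative only.

```python
import json

# The thresholds sample from above, condensed onto fewer lines.
sample = json.loads("""
{
 "thresholds": [
  {"max_connections": 1, "max_pending_requests": 1024,
   "max_requests": 1024, "max_retries": 1},
  {"priority": "HIGH", "max_connections": 1, "max_pending_requests": 1024,
   "max_requests": 1024, "max_retries": 1}
 ]
}
""")

# A missing "priority" key means the field holds its default value.
priorities = [t.get("priority", "DEFAULT") for t in sample["thresholds"]]
print(priorities)
```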

GET /: Render an HTML home page with a table of links to all available options. This can be disabled by compiling Envoy with --define=admin_html=disabled in which case an error message is printed. Disabling the HTML mode reduces the Envoy binary size.

GET /help: Print a textual table of all available options.

GET /certs: List out all loaded TLS certificates, including file name, serial number, subject alternate names and days until expiration in JSON format conforming to the certificate proto definition.

GET /clusters

List out all configured cluster manager clusters. This information includes all discovered upstream hosts in each cluster along with per host statistics. This is useful for debugging service discovery issues.

Cluster manager information

version_info string – the version info string of the last loaded CDS update. If Envoy does not have CDS set up, the output will read version_info::static.

Cluster wide information

Circuit breaker settings for all priority settings.
Information about outlier detection if a detector is installed. Currently, the average success rate and the ejection threshold are presented. Both of these values could be -1 if there was not enough data to calculate them in the last interval.
added_via_api flag – false if the cluster was added via static configuration, true if it was added via the CDS api.

Per host statistics

Name	Type	Description
cx_total	Counter	Total connections
cx_active	Gauge	Total active connections
cx_connect_fail	Counter	Total connection failures
rq_total	Counter	Total requests
rq_timeout	Counter	Total timed out requests
rq_success	Counter	Total requests with non-5xx responses
rq_error	Counter	Total requests with 5xx responses
rq_active	Gauge	Total active requests
healthy	String	The health status of the host. See below
weight	Integer	Load balancing weight (1-100)
zone	String	Service zone
canary	Boolean	Whether the host is a canary
success_rate	Double	Request success rate (0-100). -1 if there was not enough request volume in the interval to calculate it

Host health status

A host is either healthy or unhealthy because of one or more different failing health states. If the host is healthy the healthy output will be equal to healthy.

If the host is not healthy, the healthy output will be composed of one or more of the following strings:

/failed_active_hc: The host has failed an active health check.

/failed_eds_health: The host was marked unhealthy by EDS.

/failed_outlier_check: The host has failed an outlier detection check.
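A small Python sketch of how a client might split the healthy value into its failing states. It assumes that multiple states are concatenated with their leading slashes (for example /failed_active_hc/failed_outlier_check), which is inferred from the strings listed above; parse_health is a hypothetical helper name.

```python
def parse_health(healthy: str) -> list[str]:
    """Return the failing health states encoded in a host's `healthy` value."""
    if healthy == "healthy":
        return []
    # Assumption: each failing state is introduced by a leading slash.
    return [flag for flag in healthy.split("/") if flag]


print(parse_health("healthy"))
print(parse_health("/failed_active_hc/failed_outlier_check"))
```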

GET /clusters?format=json: Dump the /clusters output in a JSON-serialized proto. See the definition for more information.

GET /clusters?filter=regex

Filters the returned clusters to those with names matching the regular expression regex. Compatible with format. Performs partial matching by default, so /clusters?filter=service will return all clusters containing the word service. Full-string matching can be specified with begin- and end-line anchors (e.g., /clusters?filter=^my-service-cluster$).

By default, the regular expression is evaluated using the Google RE2 engine.
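The difference between partial and anchored matching can be illustrated with Python's re module. This is only a stand-in for RE2, whose syntax differs in places, and the cluster names are hypothetical.

```python
import re


def filter_names(names: list[str], pattern: str) -> list[str]:
    """Keep the names in which the pattern matches anywhere (partial match)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.search(name)]


names = ["my-service-cluster", "my-service-cluster-canary", "database"]
print(filter_names(names, "service"))               # partial match
print(filter_names(names, "^my-service-cluster$"))  # anchored, full-string match
```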

GET /config_dump: Dump currently loaded configuration from various Envoy components as JSON-serialized proto messages. See the response definition for more information.

Warning

Configuration may include TLS certificates. Before dumping the configuration, Envoy will attempt to redact the private_key and password fields from any certificates it finds. This relies on the configuration being a strongly-typed protobuf message. If your Envoy configuration uses deprecated config fields (of type google.protobuf.Struct), please update to the recommended typed_config fields (of type google.protobuf.Any) to ensure sensitive data is redacted properly.

Warning

The underlying proto is marked v2alpha and hence its contents, including the JSON representation, are not guaranteed to be stable.

GET /config_dump?include_eds: Dump currently loaded configuration including EDS. See the response definition for more information.

GET /config_dump?mask={}: Specify a subset of fields that you would like to be returned. The mask is parsed as a Protobuf::FieldMask and applied to each top level dump such as BootstrapConfigDump and ClustersConfigDump. This behavior changes if both resource and mask query parameters are specified. See below for details.

GET /config_dump?resource={}: Dump only the currently loaded configuration that matches the specified resource. The resource must be a repeated field in one of the top level config dumps such as static_listeners from ListenersConfigDump or dynamic_active_clusters from ClustersConfigDump. If you need a non-repeated field, use the mask query parameter documented above. If you want only a subset of fields from the repeated resource, use both as documented below.

GET /config_dump?name_regex={}

Dump only the currently loaded configurations whose names match the specified regex. Can be used with both resource and mask query parameters.

For example, /config_dump?name_regex=.*substring.* would return all resource types whose name field matches the given regex.

Per resource, the matched name field is:

For ECDS config dump, the matched name field is the corresponding filter name, which is stored in:

envoy.config.core.v3.TypedExtensionConfig.name

GET /config_dump?resource={}&mask={}

When both resource and mask query parameters are specified, the mask is applied to every element in the desired repeated field so that only a subset of fields are returned. The mask is parsed as a Protobuf::FieldMask.

For example, get the names of all active dynamic clusters with /config_dump?resource=dynamic_active_clusters&mask=cluster.name
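The query parameters above can be combined programmatically. The following Python sketch composes a /config_dump URL from whichever of resource, mask and name_regex are supplied; the helper name and the base address are assumptions, and the URL is only built, not fetched.

```python
from urllib.parse import urlencode


def config_dump_url(base, resource=None, mask=None, name_regex=None):
    """Compose a /config_dump URL from the optional query parameters."""
    params = {"resource": resource, "mask": mask, "name_regex": name_regex}
    # Drop the parameters that were not supplied, then percent-encode the rest.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base}/config_dump" + (f"?{query}" if query else "")


url = config_dump_url(
    "http://127.0.0.1:9901",  # assumed admin address
    resource="dynamic_active_clusters",
    mask="cluster.name",
)
print(url)
```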

GET /contention: Dump current Envoy mutex contention stats (MutexStats) in JSON format, if mutex tracing is enabled. See --enable-mutex-tracing.

POST /cpuprofiler: Enable or disable the CPU profiler. Requires compiling with gperftools. The output file can be configured by admin.profile_path.

POST /heapprofiler: Enable or disable the Heap profiler. Requires compiling with gperftools. The output file can be configured by admin.profile_path.

GET /heap_dump: Dump current heap profile of Envoy process. The output content is parsable binary by the pprof tool. Requires compiling with tcmalloc (default).

GET /peak_heap_dump: Dump peak heap profile of Envoy process. This captures the heap state at peak memory usage. The output content is parsable binary by the pprof tool. Requires compiling with tcmalloc (default).

POST /allocprofiler: Enable or disable the allocation profiler. The output content is parsable binary by the pprof tool. Requires compiling with tcmalloc (default).

POST /healthcheck/fail: Fail inbound health checks. This requires the use of the HTTP health check filter. This is useful for draining a server prior to shutting it down or doing a full restart. Invoking this command will universally fail health check requests regardless of how the filter is configured (pass through, etc.).

POST /healthcheck/ok: Negate the effect of POST /healthcheck/fail. This requires the use of the HTTP health check filter.

GET /hot_restart_version: See --hot-restart-version.

GET /init_dump: Dump current information of unready targets of various Envoy components as JSON-serialized proto messages. See the response definition for more information.

GET /init_dump?mask={}

When the mask query parameter is specified, the mask value is the desired component whose unready targets should be dumped. The mask is parsed as a Protobuf::FieldMask.

For example, get the unready targets of all listeners with /init_dump?mask=listener

GET /listeners: List out all configured listeners. This information includes the names of listeners as well as the addresses that they are listening on. If a listener is configured to listen on port 0, then the output will contain the actual port that was allocated by the OS.

GET /listeners?format=json: Dump the /listeners output in a JSON-serialized proto. See the definition for more information.

POST /logging

Enable/disable logging levels for different loggers.

If the default component logger is used, the logger name should be exactly the component name.

To change the logging level across all loggers, set the query parameter as level=<desired_level>.
To change a particular logger’s level, set the query parameter like so, <logger_name>=<desired_level>.
To change multiple logging levels at once, set the query parameter as paths=<logger_name1>:<desired_level1>,<logger_name2>:<desired_level2>.
To list the loggers, send a POST request to the /logging endpoint without a query parameter.

If --enable-fine-grain-logging is set, the logger is represented by the path of the file it belongs to (to be specific, the path determined by __FILE__), so the logger list will show a list of file paths, and the specific path should be used as <logger_name> to change the log level.

Fine-grain loggers also support matching by file basename and by the glob characters * and ?. For example, suppose the following loggers are active at trace level.

source/server/admin/admin_filter.cc: trace
source/common/event/dispatcher_impl.cc: trace
source/common/network/tcp_listener_impl.cc: trace
source/common/network/udp_listener_impl.cc: trace

/logging?paths=source/common/event/dispatcher_impl.cc:debug will make the level of source/common/event/dispatcher_impl.cc be debug.
/logging?group=http:info will make the level of all loggers in the http group be info.
/logging?admin_filter=info will make the level of source/server/admin/admin_filter.cc be info, and other unmatched loggers will be the default trace.
/logging?paths=source/common*:warning will make the level of source/common/event/dispatcher_impl.cc, source/common/network/tcp_listener_impl.cc be warning. Other unmatched loggers will be the default trace, e.g., admin_filter.cc, even if it was updated to info by the previous POST update.
/logging?paths=???_listener_impl:info will make the level of source/common/network/tcp_listener_impl.cc, source/common/network/udp_listener_impl.cc be info.
/logging?paths=???_listener_impl:info,tcp_listener_impl:warning will make the level of source/common/network/tcp_listener_impl.cc be info, since the first match takes effect.
/logging?level=info will change the default verbosity level to info. All the unmatched loggers in the following update will be this default level.
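The first-match and default-level behaviour in the examples above can be approximated in Python. This is a rough model rather than Envoy's implementation: it assumes a pattern may match the full path, the basename, or the basename without its extension, and it uses fnmatch as a stand-in for Envoy's glob handling. The function names are hypothetical.

```python
from fnmatch import fnmatchcase
import posixpath

LOGGERS = [
    "source/server/admin/admin_filter.cc",
    "source/common/event/dispatcher_impl.cc",
    "source/common/network/tcp_listener_impl.cc",
    "source/common/network/udp_listener_impl.cc",
]


def matches(pattern: str, path: str) -> bool:
    """Assumed rule: match the full path, the basename, or the extension-less basename."""
    base = posixpath.basename(path)
    stem = posixpath.splitext(base)[0]
    return any(fnmatchcase(candidate, pattern) for candidate in (path, base, stem))


def apply_paths(paths: str, default: str = "trace") -> dict[str, str]:
    """Resolve a paths=<pattern>:<level>,... value; the first matching rule wins."""
    rules = [item.rsplit(":", 1) for item in paths.split(",")]
    return {
        path: next((level for pattern, level in rules if matches(pattern, path)), default)
        for path in LOGGERS
    }


for path, level in apply_paths("???_listener_impl:info,tcp_listener_impl:warning").items():
    print(path, level)
```

In this model both listener files resolve to info because the first rule already matches them, and the two remaining loggers fall back to the default level.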

GET /memory: Prints current memory allocation / heap usage, in bytes. Useful in lieu of printing all /stats and filtering to get the memory-related statistics.

GET /memory/tcmalloc: Dumps the current TCMalloc stats.

POST /quitquitquit: Cleanly exit the server.

POST /reset_counters: Reset all counters to zero. This is useful along with GET /stats during debugging. Note that this does not drop any data sent to statsd. It just affects local output of the GET /stats command.

POST /drain_listeners

Drains all listeners.

POST /drain_listeners?inboundonly

Drains all inbound listeners. traffic_direction field in Listener is used to determine whether a listener is inbound or outbound. May not be effective for network filters like Redis, Mongo, or Thrift.

POST /drain_listeners?graceful

When draining listeners, enter a graceful drain period prior to closing listeners. This behaviour and its duration are configurable via server options or CLI (--drain-time-s and --drain-strategy).

POST /drain_listeners?graceful&skip_exit

When draining listeners, do not exit after the drain period. This must be used with graceful.

Attention

This operation directly stops the matched listeners on workers. Once listeners in a given traffic direction are stopped, listener additions and modifications in that direction are not allowed.

GET /server_info

Outputs a JSON message containing information about the running server.

Sample output looks like:

{
  "version": "b050513e840aa939a01f89b07c162f00ab3150eb/1.9.0-dev/Modified/DEBUG",
  "state": "LIVE",
  "command_line_options": {
    "base_id": "0",
    "concurrency": 8,
    "config_path": "config.yaml",
    "config_yaml": "",
    "allow_unknown_static_fields": false,
    "admin_address_path": "",
    "local_address_ip_version": "v4",
    "log_level": "info",
    "component_log_level": "",
    "log_format": "[%Y-%m-%d %T.%e][%t][%l][%n] %v",
    "log_path": "",
    "hot_restart_version": false,
    "service_cluster": "",
    "service_node": "",
    "service_zone": "",
    "mode": "Serve",
    "disable_hot_restart": false,
    "enable_mutex_tracing": false,
    "restart_epoch": 0,
    "file_flush_interval": "10s",
    "drain_time": "600s",
    "parent_shutdown_time": "900s",
    "cpuset_threads": false
  },
  "uptime_current_epoch": "6s",
  "uptime_all_epochs": "6s",
  "node": {
    "id": "node1",
    "cluster": "cluster1",
    "user_agent_name": "envoy",
    "user_agent_build_version": {
      "version": {
        "major_number": 1,
        "minor_number": 15,
        "patch": 0
      }
    },
    "metadata": {},
    "extensions": [],
    "client_features": [],
    "listening_addresses": []
  }
}

See the ServerInfo proto for an explanation of the output.

GET /ready

Outputs a string and error code reflecting the state of the server. 200 is returned for the LIVE state, and 503 otherwise. This can be used as a readiness check.

Example output:

LIVE

See the state field of the ServerInfo proto for an explanation of the output.

GET /stats

Outputs all statistics on demand. This command is very useful for local debugging. Histograms will output the computed quantiles, i.e., P0, P25, P50, P75, P90, P99, P99.9 and P100. The output for each quantile will be in the form of (interval,cumulative) where the interval value represents the summary since the last flush. By default, a timer is set up to flush in intervals defined by stats_flush_interval, defaulting to 5 seconds. If stats_flush_on_admin is specified, stats are flushed when this endpoint is queried and a timer will not be used. The cumulative value represents the summary since the start of the Envoy instance. “No recorded values” in the histogram output indicates that it has not been updated with a value. See here for more information.

GET /stats?usedonly

Outputs statistics that Envoy has updated (counters incremented at least once, gauges changed at least once, and histograms added to at least once).

Only outputs statistics that are internally marked as hidden.

Hidden stats will be shown alongside non-hidden stats.

Hidden stats will be excluded from the output. This is the default behavior.

GET /stats?filter=regex

Filters the returned stats to those with names matching the regular expression regex. Compatible with usedonly. Performs partial matching by default, so /stats?filter=server will return all stats containing the word server. Full-string matching can be specified with begin- and end-line anchors (e.g., /stats?filter=^server.concurrency$).

By default, the regular expression is evaluated using the Google RE2 engine.

GET /stats?histogram_buckets=cumulative

Changes histogram output to display cumulative buckets with upper bounds (e.g. B0.5, B1, B5, …). The output for each bucket will be in the form of (interval,cumulative) (e.g. B0.5(0,0)). All values below the upper bound are included even if they are placed into other buckets. Compatible with usedonly and filter.

GET /stats?histogram_buckets=disjoint

Changes histogram output to display disjoint buckets with upper bounds (e.g. B0.5, B1, B5, …). The output for each bucket will be in the form of (interval,cumulative) (e.g. B0.5(0,0)). Buckets do not include values from other buckets with smaller upper bounds; the previous bucket’s upper bound acts as a lower bound. Compatible with usedonly and filter.

GET /stats?histogram_buckets=detailed

Shows histograms as both percentile summary data, and raw bucket data.

Example output:

http.admin.downstream_rq_time:
totals=1,0.25:25, 2,0.25:9 intervals=1,0.25:2, 2,0.25:3 summary=P0(1,1) P25(1.0625,1.034) P50(2.0166,1.068) P75(2.058,2.005) P90(2.083,2.06) P95(2.091,2.08) P99(2.09,2.09) P99.5(2.099,2.098) P99.9(2.099,2.099) P100(2.1,2.1)

Each bucket is shown as lower_bound,width:count. In the above example there are two buckets. totals contains the accumulated data-points since the binary was started. intervals shows the new data points since the previous stats flush.

Compatible with usedonly and filter.
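The lower_bound,width:count notation of the detailed text output can be parsed with a short regular expression. The Python sketch below handles only the shape shown in the example above; the helper name parse_buckets is hypothetical.

```python
import re

# One bucket is written as lower_bound,width:count
BUCKET = re.compile(r"([0-9.eE+-]+),([0-9.eE+-]+):([0-9]+)")


def parse_buckets(text: str) -> list[tuple[float, float, int]]:
    """Parse 'lower_bound,width:count' entries into (lower, width, count) tuples."""
    return [(float(lo), float(width), int(count))
            for lo, width, count in BUCKET.findall(text)]


totals = parse_buckets("1,0.25:25, 2,0.25:9")  # the totals= value from the example
print(totals)
print(sum(count for _, _, count in totals))    # data points since the binary started
```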

GET /stats?format=html

Renders stats using HTML for a web browser, providing form fields to incrementally modify the filter, toggle used-only mode, control the types of stats displayed, and also toggle into another format.

This format is disabled if Envoy is compiled with --define=admin_html=disabled.

GET /stats?format=active-html

Renders stats continuously, displaying the top 50 stats ordered by frequency of changes. In this format, used-only mode is implied. You can incrementally adjust the filter, the subset of types, the number of stats displayed, and the interval between updates.

After using this mode, be sure to close the browser tab to avoid placing periodic load on the server as stats are updated regularly.

This format is disabled if Envoy is compiled with --define=admin_html=disabled.

GET /stats?format=json

Outputs /stats in JSON format. This can be used for programmatic access of stats. Counters and Gauges will be in the form of a set of (name,value) pairs. Histograms will be under the element “histograms”, that contains “supported_quantiles” which lists the quantiles supported and an array of computed_quantiles that has the computed quantile for each histogram.

If a histogram is not updated during an interval, the output will have null for all the quantiles.

Example histogram output:

{
  "histograms": {
    "supported_quantiles": [
      0, 25, 50, 75, 90, 95, 99, 99.9, 100
    ],
    "computed_quantiles": [
      {
        "name": "cluster.external_auth_cluster.upstream_cx_length_ms",
        "values": [
          {"interval": 0, "cumulative": 0},
          {"interval": 0, "cumulative": 0},
          {"interval": 1.0435787, "cumulative": 1.0435787},
          {"interval": 1.0941565, "cumulative": 1.0941565},
          {"interval": 2.0860023, "cumulative": 2.0860023},
          {"interval": 3.0665233, "cumulative": 3.0665233},
          {"interval": 6.046609, "cumulative": 6.046609},
          {"interval": 229.57333,"cumulative": 229.57333},
          {"interval": 260,"cumulative": 260}
        ]
      },
      {
        "name": "http.admin.downstream_rq_time",
        "values": [
          {"interval": null, "cumulative": 0},
          {"interval": null, "cumulative": 0},
          {"interval": null, "cumulative": 1.0435787},
          {"interval": null, "cumulative": 1.0941565},
          {"interval": null, "cumulative": 2.0860023},
          {"interval": null, "cumulative": 3.0665233},
          {"interval": null, "cumulative": 6.046609},
          {"interval": null, "cumulative": 229.57333},
          {"interval": null, "cumulative": 260}
        ]
      }
    ]
  }
}
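Since the quantile labels and the per-histogram values are stored in parallel arrays, a client typically pairs them by position. The following Python sketch does so for a trimmed version of the sample above (four quantiles instead of nine); the function name is hypothetical.

```python
def cumulative_quantiles(stats: dict) -> dict[str, dict[str, float]]:
    """Map each histogram name to {"P<quantile>": cumulative value}."""
    hist = stats["histograms"]
    labels = [f"P{q:g}" for q in hist["supported_quantiles"]]
    return {
        h["name"]: {label: v["cumulative"] for label, v in zip(labels, h["values"])}
        for h in hist["computed_quantiles"]
    }


# Trimmed sample; "interval" is None (JSON null) when nothing was recorded.
sample = {
    "histograms": {
        "supported_quantiles": [0, 50, 99.9, 100],
        "computed_quantiles": [
            {
                "name": "http.admin.downstream_rq_time",
                "values": [
                    {"interval": None, "cumulative": 0},
                    {"interval": None, "cumulative": 1.0435787},
                    {"interval": None, "cumulative": 229.57333},
                    {"interval": None, "cumulative": 260},
                ],
            }
        ],
    }
}

result = cumulative_quantiles(sample)
print(result["http.admin.downstream_rq_time"])
```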

GET /stats?format=json&usedonly

Outputs statistics that Envoy has updated (counters incremented at least once, gauges changed at least once, and histograms added to at least once) in JSON format.

GET /stats?format=json&histogram_buckets=cumulative

Changes histogram output to display cumulative buckets with upper bounds. All values below the upper bound are included even if they are placed into other buckets. Compatible with usedonly and filter.

Example histogram output:

{
  "histograms": [
    {
      "name": "example_histogram",
      "buckets": [
        {"upper_bound": 1, "interval": 0, "cumulative": 0},
        {"upper_bound": 2, "interval": 0, "cumulative": 1},
        {"upper_bound": 3, "interval": 1, "cumulative": 3},
        {"upper_bound": 4, "interval": 1, "cumulative": 3}
      ]
    },
    {
      "name": "other_example_histogram",
      "buckets": [
        {"upper_bound": 0.5, "interval": 0, "cumulative": 0},
        {"upper_bound": 1, "interval": 0, "cumulative": 0},
        {"upper_bound": 5, "interval": 0, "cumulative": 0},
        {"upper_bound": 10, "interval": 0, "cumulative": 0},
        {"upper_bound": 25, "interval": 0, "cumulative": 0},
        {"upper_bound": 50, "interval": 0, "cumulative": 0},
        {"upper_bound": 100, "interval": 0, "cumulative": 0},
        {"upper_bound": 250, "interval": 0, "cumulative": 0},
        {"upper_bound": 500, "interval": 0, "cumulative": 0},
        {"upper_bound": 1000, "interval": 0, "cumulative": 0},
        {"upper_bound": 2500, "interval": 0, "cumulative": 100},
        {"upper_bound": 5000, "interval": 0, "cumulative": 300},
        {"upper_bound": 10000, "interval": 0, "cumulative": 600},
        {"upper_bound": 30000, "interval": 0, "cumulative": 600},
        {"upper_bound": 60000, "interval": 0, "cumulative": 600},
        {"upper_bound": 300000, "interval": 0, "cumulative": 600},
        {"upper_bound": 600000, "interval": 0, "cumulative": 600},
        {"upper_bound": 1800000, "interval": 0, "cumulative": 600},
        {"upper_bound": 3600000, "interval": 0, "cumulative": 600}
      ]
    }
  ]
}

GET /stats?format=json&histogram_buckets=disjoint

Changes histogram output to display disjoint buckets with upper bounds. Buckets do not include values from other buckets with smaller upper bounds; the previous bucket’s upper bound acts as a lower bound. Compatible with usedonly and filter.

Example histogram output:

{
  "histograms": [
    {
      "name": "example_histogram",
      "buckets": [
        {"upper_bound": 1, "interval": 0, "cumulative": 0},
        {"upper_bound": 2, "interval": 0, "cumulative": 1},
        {"upper_bound": 3, "interval": 1, "cumulative": 2},
        {"upper_bound": 4, "interval": 0, "cumulative": 0}
      ]
    },
    {
      "name": "other_example_histogram",
      "buckets": [
        {"upper_bound": 0.5, "interval": 0, "cumulative": 0},
        {"upper_bound": 1, "interval": 0, "cumulative": 0},
        {"upper_bound": 5, "interval": 0, "cumulative": 0},
        {"upper_bound": 10, "interval": 0, "cumulative": 0},
        {"upper_bound": 25, "interval": 0, "cumulative": 0},
        {"upper_bound": 50, "interval": 0, "cumulative": 0},
        {"upper_bound": 100, "interval": 0, "cumulative": 0},
        {"upper_bound": 250, "interval": 0, "cumulative": 0},
        {"upper_bound": 500, "interval": 0, "cumulative": 0},
        {"upper_bound": 1000, "interval": 0, "cumulative": 0},
        {"upper_bound": 2500, "interval": 0, "cumulative": 100},
        {"upper_bound": 5000, "interval": 0, "cumulative": 200},
        {"upper_bound": 10000, "interval": 0, "cumulative": 300},
        {"upper_bound": 30000, "interval": 0, "cumulative": 0},
        {"upper_bound": 60000, "interval": 0, "cumulative": 0},
        {"upper_bound": 300000, "interval": 0, "cumulative": 0},
        {"upper_bound": 600000, "interval": 0, "cumulative": 0},
        {"upper_bound": 1800000, "interval": 0, "cumulative": 0},
        {"upper_bound": 3600000, "interval": 0, "cumulative": 0}
      ]
    }
  ]
}
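The relationship between the two bucket modes is a running difference: each disjoint count is the cumulative count minus the cumulative count of the previous bucket. A minimal Python sketch, using the cumulative values of example_histogram from the cumulative output shown earlier (the function name is hypothetical):

```python
def to_disjoint(cumulative_counts: list[int]) -> list[int]:
    """Convert cumulative bucket counts into disjoint (per-bucket) counts."""
    previous = 0
    disjoint = []
    for count in cumulative_counts:
        disjoint.append(count - previous)  # values that fall only in this bucket
        previous = count
    return disjoint


# Cumulative counts of example_histogram for upper bounds 1, 2, 3 and 4.
print(to_disjoint([0, 1, 3, 3]))
```

The result agrees with the cumulative fields of example_histogram in the disjoint output above.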

GET /stats?format=json&histogram_buckets=detailed

Shows histograms as both percentile summary data and raw bucket data.

Example output:

{
  "stats": [
    {
      "histograms": {
        "supported_percentiles": [0, 25, 50, 75, 90, 95, 99, 99.5, 99.9, 100],
        "details": [
          {
            "name": "http.admin.downstream_rq_time",
            "percentiles": [
              { "interval": null, "cumulative": 1 },
              { "interval": null, "cumulative": 1.0351851851851852 },
              { "interval": null, "cumulative": 1.0703703703703704 },
              { "interval": null, "cumulative": 2.0136363636363637 },
              { "interval": null, "cumulative": 2.0654545454545454 },
              { "interval": null, "cumulative": 2.0827272727272725 },
              { "interval": null, "cumulative": 2.0965454545454545 },
              { "interval": null, "cumulative": 2.098272727272727 },
              { "interval": null, "cumulative": 2.0996545454545457 },
              { "interval": null, "cumulative": 2.1 }
            ],
            "totals": [
              { "lower_bound": 1, "width": 0.25, "count": 25 },
              { "lower_bound": 2, "width": 0.25, "count": 9 }
            ],
            "intervals": [
              { "lower_bound": 1, "width": 0.25, "count": 2 },
              { "lower_bound": 2, "width": 0.25, "count": 3 }
            ]
          }
        ]
      }
    }
  ]
}

Compatible with usedonly and filter.

GET /stats?format=prometheus

or alternatively,

GET /stats/prometheus

Outputs /stats in Prometheus format. This can be used to integrate with a Prometheus server.

The output will either be the protobuf format or the v0.0.4 text format, depending on the value of the Accept header. A prometheus scrape configuration specifies the desired protocol:

scrape_configs:
  - scrape_protocols:
    - 'PrometheusProto'
    - 'PrometheusText0.0.4'

GET /stats/prometheus?histogram_buckets=prometheusnative&native_histogram_max_buckets=20

Outputs histograms as Prometheus native histograms. This is only available when using the protobuf exposition format.

This mode ignores configured histogram bucket limits and generates a sparse histogram representation which will use a maximum number of buckets, with accuracy adjusted to that number. The default value is 20 if no value for native_histogram_max_buckets is specified.

GET /stats?format=prometheus&usedonly

You can optionally pass the usedonly URL query parameter to only get statistics that Envoy has updated (counters incremented at least once, gauges changed at least once, and histograms added to at least once).

GET /stats?format=prometheus&text_readouts

The optional text_readouts query parameter is used to get all stats including text readouts. Text readout stats are returned in gauge format. These gauges always have the value 0. Each gauge record has an additional label named text_value that contains the value of the text readout.

Warning

Every unique combination of key-value label pair represents a new time series in Prometheus, which can dramatically increase the amount of data stored. Text readout stats create a new label value every time the value of the text readout stat changes, which could create an unbounded number of time series.

GET /stats?format=prometheus&histogram_buckets=summary

Optional histogram_buckets query parameter is used to control how histogram metrics get reported. If unset, histograms get reported as the “histogram” prometheus metric type, but can also be used to emit prometheus “summary” metrics if set to summary. Each emitted summary is over the interval of the last stats_flush_interval.

Example histogram output:

# TYPE envoy_server_initialization_time_ms histogram
envoy_server_initialization_time_ms_bucket{le="0.5"} 0
envoy_server_initialization_time_ms_bucket{le="1"} 0
envoy_server_initialization_time_ms_bucket{le="5"} 0
envoy_server_initialization_time_ms_bucket{le="10"} 0
envoy_server_initialization_time_ms_bucket{le="25"} 0
envoy_server_initialization_time_ms_bucket{le="50"} 0
envoy_server_initialization_time_ms_bucket{le="100"} 0
envoy_server_initialization_time_ms_bucket{le="250"} 1
envoy_server_initialization_time_ms_bucket{le="500"} 1
envoy_server_initialization_time_ms_bucket{le="1000"} 1
envoy_server_initialization_time_ms_bucket{le="2500"} 1
envoy_server_initialization_time_ms_bucket{le="5000"} 1
envoy_server_initialization_time_ms_bucket{le="10000"} 1
envoy_server_initialization_time_ms_bucket{le="30000"} 1
envoy_server_initialization_time_ms_bucket{le="60000"} 1
envoy_server_initialization_time_ms_bucket{le="300000"} 1
envoy_server_initialization_time_ms_bucket{le="600000"} 1
envoy_server_initialization_time_ms_bucket{le="1800000"} 1
envoy_server_initialization_time_ms_bucket{le="3600000"} 1
envoy_server_initialization_time_ms_bucket{le="+Inf"} 1
envoy_server_initialization_time_ms_sum{} 115.000000000000014210854715202
envoy_server_initialization_time_ms_count{} 1

Example summary output:

# TYPE envoy_server_initialization_time_ms summary
envoy_server_initialization_time_ms{quantile="0"} 110.00000000000001
envoy_server_initialization_time_ms{quantile="0.25"} 112.50000000000001
envoy_server_initialization_time_ms{quantile="0.5"} 115.00000000000001
envoy_server_initialization_time_ms{quantile="0.75"} 117.50000000000001
envoy_server_initialization_time_ms{quantile="0.9"} 119.00000000000001
envoy_server_initialization_time_ms{quantile="0.95"} 119.50000000000001
envoy_server_initialization_time_ms{quantile="0.99"} 119.90000000000002
envoy_server_initialization_time_ms{quantile="0.995"} 119.95000000000002
envoy_server_initialization_time_ms{quantile="0.999"} 119.99000000000001
envoy_server_initialization_time_ms{quantile="1"} 120.00000000000001
envoy_server_initialization_time_ms_sum{} 115.000000000000014210854715202
envoy_server_initialization_time_ms_count{} 1
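For quick inspection of the text output, the sample lines can be split into name, labels and value with a small parser. The Python sketch below covers only the line shape shown in the examples above (a metric name followed by a brace-enclosed label set); it is not a complete parser for the Prometheus text format, and parse_samples is a hypothetical name.

```python
import re

SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)


def parse_samples(text: str) -> list[tuple[str, dict, float]]:
    """Return (metric name, labels, value) for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# TYPE" comments
        match = SAMPLE.match(line)
        if match is None:
            continue  # line shape not covered by this sketch
        labels = dict(re.findall(r'(\w+)="([^"]*)"', match["labels"]))
        samples.append((match["name"], labels, float(match["value"])))
    return samples


text = """# TYPE envoy_server_initialization_time_ms summary
envoy_server_initialization_time_ms{quantile="0.5"} 115.00000000000001
envoy_server_initialization_time_ms_count{} 1
"""
samples = parse_samples(text)
for sample in samples:
    print(sample)
```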

GET /stats/recentlookups

This endpoint helps Envoy developers debug potential contention issues in the stats system. Initially, only the count of StatName lookups is accumulated, not the specific names that are being looked up. In order to see specific recent requests, you must enable the feature by POSTing to /stats/recentlookups/enable. There may be approximately 40-100 nanoseconds of added overhead per lookup.

When enabled, this endpoint emits a table of stat names that were recently accessed as strings by Envoy. Ideally, strings should be converted into StatNames, counters, gauges, and histograms by Envoy code only during startup or when receiving a new configuration via xDS. This is because when stats are looked up as strings they must take a global symbol table lock. During startup this is acceptable, but in response to user requests on high core-count machines, this can cause performance issues due to mutex contention.

See source/docs/stats.md for more details.

Note also that actual mutex contention can be tracked via GET /contention.

POST /stats/recentlookups/enable

Turns on collection of recent lookup of stat-names, thus enabling /stats/recentlookups.

See source/docs/stats.md for more details.

POST /stats/recentlookups/disable

Turns off collection of recent stat-name lookups, thus disabling /stats/recentlookups. It also clears the list of lookups. However, the total count, visible as the stat server.stats_recent_lookups, is not cleared and continues to accumulate.

See source/docs/stats.md for more details.

POST /stats/recentlookups/clear

Clears all outstanding lookups and counts. This clears all recent lookups data as well as the count, but collection continues if it is enabled.

See source/docs/stats.md for more details.
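The enable, read, and disable steps described above can be scripted. The following is a minimal sketch, not an official client: it assumes the admin listener is reachable at 127.0.0.1:9901, and the helper names admin_request and sample_recent_lookups are hypothetical.

```python
import urllib.request

ADMIN = "http://127.0.0.1:9901"  # assumed admin address; adjust to your bootstrap config


def admin_request(path, mutate=False):
    """Build a request for an admin path; mutations must be sent as POST."""
    method = "POST" if mutate else "GET"
    return urllib.request.Request(ADMIN + path, method=method)


def sample_recent_lookups(opener=urllib.request.urlopen):
    """Enable collection, read the lookup table once, then disable collection."""
    opener(admin_request("/stats/recentlookups/enable", mutate=True)).close()
    try:
        with opener(admin_request("/stats/recentlookups")) as resp:
            return resp.read().decode("utf-8")
    finally:
        # Disabling also clears the list of lookups, as described above.
        opener(admin_request("/stats/recentlookups/disable", mutate=True)).close()
```

Calling sample_recent_lookups() against a running Envoy returns the table as text. The opener parameter exists only so the HTTP call can be replaced in tests.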

GET /runtime: Outputs all runtime values on demand in JSON format. See here for more information on how these values are configured and utilized. The output include the list of the active runtime override layers and the stack of layer values for each key. Empty strings indicate no value, and the final active value from the stack also is included in a separate key. Example output:

{
  "layers": [
    "disk",
    "override",
    "admin"
  ],
  "entries": {
    "my_key": {
      "layer_values": [
        "my_disk_value",
        "",
        ""
      ],
      "final_value": "my_disk_value"
    },
    "my_second_key": {
      "layer_values": [
        "my_second_disk_value",
        "my_disk_override_value",
        "my_admin_override_value"
      ],
      "final_value": "my_admin_override_value"
    }
  }
}
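To see which layer supplies each final value, the layer stack can be walked in order. The sketch below assumes that later layers take precedence over earlier ones, which is what the example output suggests. The function name winning_layers is hypothetical, and the sample data mirrors the example rather than real server output.

```python
import json

# Sample shaped like the example output above.
SAMPLE = """
{
  "layers": ["disk", "override", "admin"],
  "entries": {
    "my_key": {
      "layer_values": ["my_disk_value", "", ""],
      "final_value": "my_disk_value"
    },
    "my_second_key": {
      "layer_values": ["my_second_disk_value", "my_disk_override_value",
                       "my_admin_override_value"],
      "final_value": "my_admin_override_value"
    }
  }
}
"""


def winning_layers(runtime):
    """Map each key to (layer name, value) for the last layer with a non-empty value."""
    layers = runtime["layers"]
    result = {}
    for key, entry in runtime["entries"].items():
        winner = None
        for name, value in zip(layers, entry["layer_values"]):
            if value != "":  # an empty string means the layer sets no value
                winner = (name, value)
        result[key] = winner
    return result


if __name__ == "__main__":
    for key, winner in winning_layers(json.loads(SAMPLE)).items():
        print(key, winner)
```

For each key in this sample, the value found this way matches the reported final_value.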

POST /runtime_modify?key1=value1&key2=value2&keyN=valueN: Adds or modifies runtime values as passed in query parameters. To delete a previously added key, use an empty string as the value. Note that deletion only applies to overrides added via this endpoint; values loaded from disk can be modified via override but not deleted.

Attention

Use the /runtime_modify endpoint with care. Changes are effective immediately. It is critical that the admin interface is properly secured.
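As an illustration of the query-parameter format, the following sketch builds a /runtime_modify request without sending it. The admin address and the helper name runtime_modify_request are assumptions, and the keys are the placeholder names from the example output, not real runtime keys.

```python
from urllib.parse import urlencode
from urllib.request import Request

ADMIN = "http://127.0.0.1:9901"  # assumed admin address


def runtime_modify_request(overrides):
    """Build a POST request for /runtime_modify from a dict of key/value pairs.

    An empty-string value deletes an override previously added via this endpoint.
    """
    query = urlencode(overrides)  # percent-encodes keys and values
    return Request(f"{ADMIN}/runtime_modify?{query}", method="POST")


# Set one override and delete another; nothing is sent until urlopen is called.
req = runtime_modify_request({"my_key": "42", "my_second_key": ""})
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would apply the change, so the same care applies as for any other use of this endpoint.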

GET /hystrix_event_stream

This endpoint is intended to be used as the stream source for the Hystrix dashboard. A GET to this endpoint triggers a stream of statistics from Envoy in text/event-stream format, as expected by the Hystrix dashboard.

If invoked from a browser or a terminal, the response is shown as a continuous stream, sent at intervals defined by the Bootstrap stats_flush_interval.

This handler is enabled only when a Hystrix sink is enabled in the config file as documented here.

As Envoy’s and Hystrix’s resiliency mechanisms differ, some of the statistics shown in the dashboard had to be adapted:

Thread pool rejections - Generally similar to what’s called short circuited in Envoy, and counted by upstream_rq_pending_overflow, although the term thread pool is not accurate for Envoy. Both in Hystrix and Envoy, the result is rejected requests which are not passed upstream.
Circuit breaker status (closed or open) - In Envoy, a circuit opens based on the current number of connections or queued requests, so there is no sleeping window for the circuit breaker and the open/closed state is momentary. Hence, the circuit breaker status is set to “forced closed”.
Short-circuited (rejected) - The term exists in Envoy but refers to requests not sent because a limit (queue or connections) was exceeded, while in Hystrix it refers to requests not sent because of a high percentage of service unavailable responses during some time frame. In Envoy, a service unavailable response triggers outlier detection, which removes a node from the load balancer pool, but requests are not rejected as a result. Therefore, this counter is always set to ‘0’.
Latency information represents data since last flush. Mean latency is currently not available.
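For consumers other than the Hystrix dashboard, the event stream can be read line by line. The sketch below is a simplified reader that assumes each event carries a JSON object on a single data: line. The function name iter_event_data is hypothetical, and the sample payload fields are illustrative only, not Envoy's actual output.

```python
import json


def iter_event_data(lines):
    """Yield decoded JSON payloads from the data: lines of an event stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:  # skip empty data fields
                yield json.loads(payload)


# Hypothetical sample lines; the field names are illustrative, not Envoy's payload.
sample = [
    b'data: {"type": "example", "name": "cluster_a"}\n',
    b"\n",
    b": comment line\n",
    b'data: {"type": "example", "name": "cluster_b"}\n',
]
events = list(iter_event_data(sample))
print([event["name"] for event in events])
```

The response object returned by urllib.request.urlopen can be iterated line by line, so it can be passed to iter_event_data directly. Because the stream is continuous, such a loop runs until the connection is closed.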

POST /tap

This endpoint is used for configuring an active tap session. It is only available if a valid tap extension has been configured, and that extension has been configured to accept admin configuration. See:

HTTP tap filter configuration

POST /reopen_logs: Triggers a reopen of all access logs. Behavior is similar to SIGUSR1 handling.