
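PromQL Expression Reference

The PromQL expressions in this document can be used to configure alerts.

For more information about querying the Prometheus time series database, see the official Prometheus documentation.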

Cluster Metrics

Cluster CPU Utilization

Catalog and expressions:

Detail

1 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance))

Summary

1 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])))

Cluster Load Average

Catalog and expressions:

Detail

<table><tr><td>load1</td><td>sum(node_load1) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)</td></tr><tr><td>load5</td><td>sum(node_load5) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)</td></tr><tr><td>load15</td><td>sum(node_load15) by (instance) / count(node_cpu_seconds_total{mode="system"}) by (instance)</td></tr></table>

Summary

<table><tr><td>load1</td><td>sum(node_load1) / count(node_cpu_seconds_total{mode="system"})</td></tr><tr><td>load5</td><td>sum(node_load5) / count(node_cpu_seconds_total{mode="system"})</td></tr><tr><td>load15</td><td>sum(node_load15) / count(node_cpu_seconds_total{mode="system"})</td></tr></table>

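The load expressions divide a load average by the number of `node_cpu_seconds_total{mode="system"}` series. Node exporter emits one such series per CPU core, so the result is load per core. The following Python lines are a minimal sketch of that arithmetic; the function name and the sample numbers are illustrative assumptions, not part of this reference.

```python
def load_per_core(load_average: float, cpu_series_count: int) -> float:
    """Normalize a load average by the number of CPU cores.

    cpu_series_count stands in for count(node_cpu_seconds_total{mode="system"}).
    """
    if cpu_series_count <= 0:
        raise ValueError("cpu_series_count must be positive")
    return load_average / cpu_series_count


# Hypothetical node: a load1 of 6.0 on a machine with 4 cores.
ratio = load_per_core(6.0, 4)
print(ratio)  # 1.5
```

A value above 1 suggests that more tasks are runnable than there are cores to run them, which is one reasonable starting point for an alert threshold.
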
Cluster Memory Utilization

Catalog and expressions:

Detail

1 - sum(node_memory_MemAvailable_bytes) by (instance) / sum(node_memory_MemTotal_bytes) by (instance)

Summary

1 - sum(node_memory_MemAvailable_bytes) / sum(node_memory_MemTotal_bytes)

Cluster Disk Utilization

Catalog and expressions:

Detail

(sum(node_filesystem_size_bytes{device!="rootfs"}) by (instance) - sum(node_filesystem_free_bytes{device!="rootfs"}) by (instance)) / sum(node_filesystem_size_bytes{device!="rootfs"}) by (instance)

Summary

(sum(node_filesystem_size_bytes{device!="rootfs"}) - sum(node_filesystem_free_bytes{device!="rootfs"})) / sum(node_filesystem_size_bytes{device!="rootfs"})

Cluster Disk I/O

Catalog and expressions:

Detail

<table><tr><td>read</td><td>sum(rate(node_disk_read_bytes_total[5m])) by (instance)</td></tr><tr><td>written</td><td>sum(rate(node_disk_written_bytes_total[5m])) by (instance)</td></tr></table>

Summary

<table><tr><td>read</td><td>sum(rate(node_disk_read_bytes_total[5m]))</td></tr><tr><td>written</td><td>sum(rate(node_disk_written_bytes_total[5m]))</td></tr></table>

Cluster Network Packets

Catalog and expressions:

Detail

<table><tr><td>receive-dropped</td><td>sum(rate(node_network_receive_drop_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m])) by (instance)</td></tr><tr><td>receive-errs</td><td>sum(rate(node_network_receive_errs_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m])) by (instance)</td></tr><tr><td>receive-packets</td><td>sum(rate(node_network_receive_packets_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m])) by (instance)</td></tr><tr><td>transmit-dropped</td><td>sum(rate(node_network_transmit_drop_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m])) by (instance)</td></tr><tr><td>transmit-errs</td><td>sum(rate(node_network_transmit_errs_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m])) by (instance)</td></tr><tr><td>transmit-packets</td><td>sum(rate(node_network_transmit_packets_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m])) by (instance)</td></tr></table>

Summary

<table><tr><td>receive-dropped</td><td>sum(rate(node_network_receive_drop_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))</td></tr><tr><td>receive-errs</td><td>sum(rate(node_network_receive_errs_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))</td></tr><tr><td>receive-packets</td><td>sum(rate(node_network_receive_packets_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))</td></tr><tr><td>transmit-dropped</td><td>sum(rate(node_network_transmit_drop_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))</td></tr><tr><td>transmit-errs</td><td>sum(rate(node_network_transmit_errs_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))</td></tr><tr><td>transmit-packets</td><td>sum(rate(node_network_transmit_packets_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))</td></tr></table>

Cluster Network I/O

Catalog and expressions:

Detail

<table><tr><td>receive</td><td>sum(rate(node_network_receive_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m])) by (instance)</td></tr><tr><td>transmit</td><td>sum(rate(node_network_transmit_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m])) by (instance)</td></tr></table>

Summary

<table><tr><td>receive</td><td>sum(rate(node_network_receive_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))</td></tr><tr><td>transmit</td><td>sum(rate(node_network_transmit_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*"}[5m]))</td></tr></table>

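The network expressions exclude loopback and container-related interfaces with a `device!~` regex matcher. Prometheus anchors such a regex to the whole label value, and any whitespace around `|` becomes part of the alternatives. The Python sketch below builds the matcher from a list of patterns and approximates the matching with `re.fullmatch`; the helper names are hypothetical, and Python's `re` only approximates the RE2 engine that Prometheus uses.

```python
import re

# Interface patterns excluded by the network expressions in this section.
EXCLUDED = ["lo", "veth.*", "docker.*", "flannel.*", "cali.*", "cbr.*"]


def device_exclusion_matcher(patterns):
    """Return a PromQL label matcher that excludes the given device patterns."""
    return 'device!~"' + "|".join(patterns) + '"'


def is_excluded(device, patterns):
    """Approximate Prometheus' fully anchored regex match for one device name."""
    return re.fullmatch("|".join(patterns), device) is not None


print(device_exclusion_matcher(EXCLUDED))
print(is_excluded("veth1a2b3c", EXCLUDED))  # True
print(is_excluded("eth0", EXCLUDED))        # False
```

A pattern written with spaces, such as `lo | veth.*`, would no longer exclude `lo`, because the trailing space becomes part of the alternative and the anchored match then fails.
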
Node Metrics

Node CPU Utilization

Catalog and expressions:

Detail

avg(irate(node_cpu_seconds_total{mode!="idle", instance=~"$instance"}[5m])) by (mode)

Summary

1 - (avg(irate(node_cpu_seconds_total{mode="idle", instance=~"$instance"}[5m])))

Node Load Average

Catalog and expressions:

Detail

<table><tr><td>load1</td><td>sum(node_load1{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})</td></tr><tr><td>load5</td><td>sum(node_load5{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})</td></tr><tr><td>load15</td><td>sum(node_load15{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})</td></tr></table>

Summary

<table><tr><td>load1</td><td>sum(node_load1{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})</td></tr><tr><td>load5</td><td>sum(node_load5{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})</td></tr><tr><td>load15</td><td>sum(node_load15{instance=~"$instance"}) / count(node_cpu_seconds_total{mode="system",instance=~"$instance"})</td></tr></table>

Node Memory Utilization

Catalog and expressions:

Detail

1 - sum(node_memory_MemAvailable_bytes{instance=~"$instance"}) / sum(node_memory_MemTotal_bytes{instance=~"$instance"})

Summary

1 - sum(node_memory_MemAvailable_bytes{instance=~"$instance"}) / sum(node_memory_MemTotal_bytes{instance=~"$instance"})

Node Disk Utilization

Catalog and expressions:

Detail

(sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) by (device) - sum(node_filesystem_free_bytes{device!="rootfs",instance=~"$instance"}) by (device)) / sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) by (device)

Summary

(sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"}) - sum(node_filesystem_free_bytes{device!="rootfs",instance=~"$instance"})) / sum(node_filesystem_size_bytes{device!="rootfs",instance=~"$instance"})

Node Disk I/O

Catalog and expressions:

Detail

<table><tr><td>read</td><td>sum(rate(node_disk_read_bytes_total{instance=~"$instance"}[5m]))</td></tr><tr><td>written</td><td>sum(rate(node_disk_written_bytes_total{instance=~"$instance"}[5m]))</td></tr></table>

Summary

<table><tr><td>read</td><td>sum(rate(node_disk_read_bytes_total{instance=~"$instance"}[5m]))</td></tr><tr><td>written</td><td>sum(rate(node_disk_written_bytes_total{instance=~"$instance"}[5m]))</td></tr></table>

Node Network Packets

Catalog and expressions:

Detail

<table><tr><td>receive-dropped</td><td>sum(rate(node_network_receive_drop_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m])) by (device)</td></tr><tr><td>receive-errs</td><td>sum(rate(node_network_receive_errs_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m])) by (device)</td></tr><tr><td>receive-packets</td><td>sum(rate(node_network_receive_packets_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m])) by (device)</td></tr><tr><td>transmit-dropped</td><td>sum(rate(node_network_transmit_drop_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m])) by (device)</td></tr><tr><td>transmit-errs</td><td>sum(rate(node_network_transmit_errs_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m])) by (device)</td></tr><tr><td>transmit-packets</td><td>sum(rate(node_network_transmit_packets_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m])) by (device)</td></tr></table>

Summary

<table><tr><td>receive-dropped</td><td>sum(rate(node_network_receive_drop_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m]))</td></tr><tr><td>receive-errs</td><td>sum(rate(node_network_receive_errs_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m]))</td></tr><tr><td>receive-packets</td><td>sum(rate(node_network_receive_packets_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m]))</td></tr><tr><td>transmit-dropped</td><td>sum(rate(node_network_transmit_drop_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m]))</td></tr><tr><td>transmit-errs</td><td>sum(rate(node_network_transmit_errs_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m]))</td></tr><tr><td>transmit-packets</td><td>sum(rate(node_network_transmit_packets_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m]))</td></tr></table>

Node Network I/O

Catalog and expressions:

Detail

<table><tr><td>receive</td><td>sum(rate(node_network_receive_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m])) by (device)</td></tr><tr><td>transmit</td><td>sum(rate(node_network_transmit_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m])) by (device)</td></tr></table>

Summary

<table><tr><td>receive</td><td>sum(rate(node_network_receive_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m]))</td></tr><tr><td>transmit</td><td>sum(rate(node_network_transmit_bytes_total{device!~"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*",instance=~"$instance"}[5m]))</td></tr></table>

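The node expressions contain the placeholder `$instance`, which has to be replaced with a concrete value before a query can be run. How that substitution happens in the product is not described here, so the following is only a sketch of one way to do it by hand with Python's `string.Template`. The `render` helper and the address `10.0.0.1:9100` are hypothetical.

```python
from string import Template

# Node memory utilization expression with its $instance placeholder.
NODE_MEMORY = Template(
    '1 - sum(node_memory_MemAvailable_bytes{instance=~"$instance"}) '
    '/ sum(node_memory_MemTotal_bytes{instance=~"$instance"})'
)


def render(template, **variables):
    """Fill every placeholder; raises KeyError if a variable is missing."""
    return template.substitute(**variables)


# Hypothetical node exporter target.
query = render(NODE_MEMORY, instance="10.0.0.1:9100")
print(query)
```

Because the matcher is `=~`, the substituted value is treated as a regular expression, so an alternation such as `node-a:9100|node-b:9100` selects several nodes at once, and an unescaped `.` matches any character.
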
Etcd Metrics

Etcd Has a Leader

max(etcd_server_has_leader)

Number of Leader Changes

max(etcd_server_leader_changes_seen_total)

Number of Failed Proposals

sum(etcd_server_proposals_failed_total)

GRPC Client Traffic

Catalog and expressions:

Detail

<table><tr><td>received</td><td>sum(rate(etcd_network_client_grpc_received_bytes_total[5m])) by (instance)</td></tr><tr><td>sent</td><td>sum(rate(etcd_network_client_grpc_sent_bytes_total[5m])) by (instance)</td></tr></table>

Summary

<table><tr><td>received</td><td>sum(rate(etcd_network_client_grpc_received_bytes_total[5m]))</td></tr><tr><td>sent</td><td>sum(rate(etcd_network_client_grpc_sent_bytes_total[5m]))</td></tr></table>

Peer Traffic

Catalog and expressions:

Detail

<table><tr><td>received</td><td>sum(rate(etcd_network_peer_received_bytes_total[5m])) by (instance)</td></tr><tr><td>sent</td><td>sum(rate(etcd_network_peer_sent_bytes_total[5m])) by (instance)</td></tr></table>

Summary

<table><tr><td>received</td><td>sum(rate(etcd_network_peer_received_bytes_total[5m]))</td></tr><tr><td>sent</td><td>sum(rate(etcd_network_peer_sent_bytes_total[5m]))</td></tr></table>

DB Size

Catalog and expressions:

Detail

sum(etcd_debugging_mvcc_db_total_size_in_bytes) by (instance)

Summary

sum(etcd_debugging_mvcc_db_total_size_in_bytes)

Active Streams

Catalog and expressions:

Detail

<table><tr><td>lease-watch</td><td>sum(grpc_server_started_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) by (instance) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) by (instance)</td></tr><tr><td>watch</td><td>sum(grpc_server_started_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) by (instance) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) by (instance)</td></tr></table>

Summary

<table><tr><td>lease-watch</td><td>sum(grpc_server_started_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"}) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Lease",grpc_type="bidi_stream"})</td></tr><tr><td>watch</td><td>sum(grpc_server_started_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}) - sum(grpc_server_handled_total{grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"})</td></tr></table>

Raft Proposals

Catalog and expressions:

Detail

<table><tr><td>applied</td><td>sum(increase(etcd_server_proposals_applied_total[5m])) by (instance)</td></tr><tr><td>committed</td><td>sum(increase(etcd_server_proposals_committed_total[5m])) by (instance)</td></tr><tr><td>pending</td><td>sum(increase(etcd_server_proposals_pending[5m])) by (instance)</td></tr><tr><td>failed</td><td>sum(increase(etcd_server_proposals_failed_total[5m])) by (instance)</td></tr></table>

Summary

<table><tr><td>applied</td><td>sum(increase(etcd_server_proposals_applied_total[5m]))</td></tr><tr><td>committed</td><td>sum(increase(etcd_server_proposals_committed_total[5m]))</td></tr><tr><td>pending</td><td>sum(increase(etcd_server_proposals_pending[5m]))</td></tr><tr><td>failed</td><td>sum(increase(etcd_server_proposals_failed_total[5m]))</td></tr></table>

RPC Rate

Catalog and expressions:

Detail

<table><tr><td>total</td><td>sum(rate(grpc_server_started_total{grpc_type="unary"}[5m])) by (instance)</td></tr><tr><td>failed</td><td>sum(rate(grpc_server_handled_total{grpc_type="unary",grpc_code!="OK"}[5m])) by (instance)</td></tr></table>

Summary

<table><tr><td>total</td><td>sum(rate(grpc_server_started_total{grpc_type="unary"}[5m]))</td></tr><tr><td>failed</td><td>sum(rate(grpc_server_handled_total{grpc_type="unary",grpc_code!="OK"}[5m]))</td></tr></table>

Disk Operations

Catalog and expressions:

Detail

<table><tr><td>commit called by backend</td><td>sum(rate(etcd_disk_backend_commit_duration_seconds_sum[1m])) by (instance)</td></tr><tr><td>fsync called by WAL</td><td>sum(rate(etcd_disk_wal_fsync_duration_seconds_sum[1m])) by (instance)</td></tr></table>

Summary

<table><tr><td>commit called by backend</td><td>sum(rate(etcd_disk_backend_commit_duration_seconds_sum[1m]))</td></tr><tr><td>fsync called by WAL</td><td>sum(rate(etcd_disk_wal_fsync_duration_seconds_sum[1m]))</td></tr></table>

Disk Sync Duration

Catalog and expressions:

Detail

<table><tr><td>wal</td><td>histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le))</td></tr><tr><td>db</td><td>histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (instance, le))</td></tr></table>

Summary

<table><tr><td>wal</td><td>sum(histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) by (instance, le)))</td></tr><tr><td>db</td><td>sum(histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (instance, le)))</td></tr></table>

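The disk sync expressions use `histogram_quantile(0.99, ...)` to estimate the 99th percentile from cumulative `le` buckets. The Python sketch below illustrates the underlying idea, linear interpolation inside the bucket that contains the target rank. It is a simplified illustration rather than the Prometheus implementation: it assumes `0 < q <= 1`, buckets sorted by upper bound, a final `+Inf` bucket, and a positive first upper bound, and the sample bucket counts are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets."""
    if not 0 < q <= 1:
        raise ValueError("this sketch only handles 0 < q <= 1")
    if len(buckets) < 2 or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]  # position of the quantile among all observations
    for i, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # Quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower = buckets[i - 1][0] if i > 0 else 0.0
    below = buckets[i - 1][1] if i > 0 else 0.0
    # Linear interpolation inside the selected bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Made-up cumulative counts for bucket bounds in seconds.
sample = [(0.1, 10), (0.5, 30), (1.0, 40), (math.inf, 40)]
print(histogram_quantile(0.5, sample))
```

Because the estimate is interpolated, its accuracy depends on how finely the buckets are spaced around the quantile of interest.
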
Kubernetes Components Metrics

API Server Request Latency

Catalog and expressions:

Detail

avg(apiserver_request_latencies_sum / apiserver_request_latencies_count) by (instance, verb) /1e+06

Summary

avg(apiserver_request_latencies_sum / apiserver_request_latencies_count) by (instance) /1e+06

API Server Request Rate

Catalog and expressions:

Detail

sum(rate(apiserver_request_count[5m])) by (instance, code)

Summary

sum(rate(apiserver_request_count[5m])) by (instance)

Pods That Failed to Schedule

Catalog and expressions:

Detail

sum(kube_pod_status_scheduled{condition="false"})

Summary

sum(kube_pod_status_scheduled{condition="false"})

Controller Manager Queue Depth

Catalog and expressions:

Detail

<table><tr><td>volumes</td><td>sum(volumes_depth) by (instance)</td></tr><tr><td>deployment</td><td>sum(deployment_depth) by (instance)</td></tr><tr><td>replicaset</td><td>sum(replicaset_depth) by (instance)</td></tr><tr><td>service</td><td>sum(service_depth) by (instance)</td></tr><tr><td>serviceaccount</td><td>sum(serviceaccount_depth) by (instance)</td></tr><tr><td>endpoint</td><td>sum(endpoint_depth) by (instance)</td></tr><tr><td>daemonset</td><td>sum(daemonset_depth) by (instance)</td></tr><tr><td>statefulset</td><td>sum(statefulset_depth) by (instance)</td></tr><tr><td>replicationmanager</td><td>sum(replicationmanager_depth) by (instance)</td></tr></table>

Summary

<table><tr><td>volumes</td><td>sum(volumes_depth)</td></tr><tr><td>deployment</td><td>sum(deployment_depth)</td></tr><tr><td>replicaset</td><td>sum(replicaset_depth)</td></tr><tr><td>service</td><td>sum(service_depth)</td></tr><tr><td>serviceaccount</td><td>sum(serviceaccount_depth)</td></tr><tr><td>endpoint</td><td>sum(endpoint_depth)</td></tr><tr><td>daemonset</td><td>sum(daemonset_depth)</td></tr><tr><td>statefulset</td><td>sum(statefulset_depth)</td></tr><tr><td>replicationmanager</td><td>sum(replicationmanager_depth)</td></tr></table>

Scheduler E2E Scheduling Latency

Catalog and expressions:

Detail

histogram_quantile(0.99, sum(scheduler_e2e_scheduling_latency_microseconds_bucket) by (le, instance)) / 1e+06

Summary

sum(histogram_quantile(0.99, sum(scheduler_e2e_scheduling_latency_microseconds_bucket) by (le, instance)) / 1e+06)

Scheduler Preemption Attempts

Catalog and expressions:

Detail

sum(rate(scheduler_total_preemption_attempts[5m])) by (instance)

Summary

sum(rate(scheduler_total_preemption_attempts[5m]))

Ingress Controller Connections

Catalog and expressions:

Detail

<table><tr><td>reading</td><td>sum(nginx_ingress_controller_nginx_process_connections{state="reading"}) by (instance)</td></tr><tr><td>waiting</td><td>sum(nginx_ingress_controller_nginx_process_connections{state="waiting"}) by (instance)</td></tr><tr><td>writing</td><td>sum(nginx_ingress_controller_nginx_process_connections{state="writing"}) by (instance)</td></tr><tr><td>accepted</td><td>sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="accepted"}[5m]))) by (instance)</td></tr><tr><td>active</td><td>sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="active"}[5m]))) by (instance)</td></tr><tr><td>handled</td><td>sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="handled"}[5m]))) by (instance)</td></tr></table>

Summary

<table><tr><td>reading</td><td>sum(nginx_ingress_controller_nginx_process_connections{state="reading"})</td></tr><tr><td>waiting</td><td>sum(nginx_ingress_controller_nginx_process_connections{state="waiting"})</td></tr><tr><td>writing</td><td>sum(nginx_ingress_controller_nginx_process_connections{state="writing"})</td></tr><tr><td>accepted</td><td>sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="accepted"}[5m])))</td></tr><tr><td>active</td><td>sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="active"}[5m])))</td></tr><tr><td>handled</td><td>sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state="handled"}[5m])))</td></tr></table>

Ingress Controller Request Process Time

Catalog and expressions:

Detail

topk(10, histogram_quantile(0.95,sum by (le, host, path)(rate(nginx_ingress_controller_request_duration_seconds_bucket{host!="_"}[5m]))))

Summary

topk(10, histogram_quantile(0.95,sum by (le, host)(rate(nginx_ingress_controller_request_duration_seconds_bucket{host!="_"}[5m]))))

Rancher Logging Metrics

Fluentd Buffer Queue Rate

Catalog and expressions:

Detail

sum(rate(fluentd_output_status_buffer_queue_length[5m])) by (instance)

Summary

sum(rate(fluentd_output_status_buffer_queue_length[5m]))

Fluentd Input Rate

Catalog and expressions:

Detail

sum(rate(fluentd_input_status_num_records_total[5m])) by (instance)

Summary

sum(rate(fluentd_input_status_num_records_total[5m]))

Fluentd Output Error Rate

Catalog and expressions:

Detail

sum(rate(fluentd_output_status_num_errors[5m])) by (type)

Summary

sum(rate(fluentd_output_status_num_errors[5m]))

Fluentd Output Rate

Catalog and expressions:

Detail

sum(rate(fluentd_output_status_num_records_total[5m])) by (instance)

Summary

sum(rate(fluentd_output_status_num_records_total[5m]))

Workload Metrics

Workload CPU Utilization

Catalog and expressions:

Detail

<table><tr><td>cfs throttled seconds</td><td>sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>user seconds</td><td>sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>system seconds</td><td>sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>usage seconds</td><td>sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr></table>

Summary

<table><tr><td>cfs throttled seconds</td><td>sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>user seconds</td><td>sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>system seconds</td><td>sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>usage seconds</td><td>sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr></table>

Workload Memory Utilization

Catalog and expressions:

Detail

sum(container_memory_working_set_bytes{namespace="$namespace",pod_name=~"$podName", container_name!=""}) by (pod_name)

Summary

sum(container_memory_working_set_bytes{namespace="$namespace",pod_name=~"$podName", container_name!=""})

Workload Network Packets

Catalog and expressions:

Detail

<table><tr><td>receive-packets</td><td>sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>receive-dropped</td><td>sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>receive-errors</td><td>sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>transmit-packets</td><td>sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>transmit-dropped</td><td>sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>transmit-errors</td><td>sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr></table>

Summary

<table><tr><td>receive-packets</td><td>sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>receive-dropped</td><td>sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>receive-errors</td><td>sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-packets</td><td>sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-dropped</td><td>sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-errors</td><td>sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr></table>

Workload Network I/O

Catalog and expressions:

Detail

<table><tr><td>receive</td><td>sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>transmit</td><td>sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr></table>

Summary

<table><tr><td>receive</td><td>sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit</td><td>sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr></table>

Workload Disk I/O

Catalog and expressions:

Detail

<table><tr><td>read</td><td>sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr><tr><td>write</td><td>sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m])) by (pod_name)</td></tr></table>

Summary

<table><tr><td>read</td><td>sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr><tr><td>write</td><td>sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name=~"$podName",container_name!=""}[5m]))</td></tr></table>

Pod Metrics

Pod CPU Utilization

Catalog and expressions:

Detail

<table><tr><td>cfs throttled seconds</td><td>sum(rate(container_cpu_cfs_throttled_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)</td></tr><tr><td>usage seconds</td><td>sum(rate(container_cpu_usage_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)</td></tr><tr><td>system seconds</td><td>sum(rate(container_cpu_system_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)</td></tr><tr><td>user seconds</td><td>sum(rate(container_cpu_user_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m])) by (container_name)</td></tr></table>

Summary

<table><tr><td>cfs throttled seconds</td><td>sum(rate(container_cpu_cfs_throttled_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))</td></tr><tr><td>usage seconds</td><td>sum(rate(container_cpu_usage_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))</td></tr><tr><td>system seconds</td><td>sum(rate(container_cpu_system_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))</td></tr><tr><td>user seconds</td><td>sum(rate(container_cpu_user_seconds_total{container_name!="POD",namespace="$namespace",pod_name="$podName", container_name!=""}[5m]))</td></tr></table>

Pod Memory Utilization

Catalog and expressions:

Detail

sum(container_memory_working_set_bytes{container_name!="POD",namespace="$namespace",pod_name="$podName",container_name!=""}) by (container_name)

Summary

sum(container_memory_working_set_bytes{container_name!="POD",namespace="$namespace",pod_name="$podName",container_name!=""})

Pod Network Packets

Catalog and expressions:

Detail

<table><tr><td>receive-packets</td><td>sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>receive-dropped</td><td>sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>receive-errors</td><td>sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-packets</td><td>sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-dropped</td><td>sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-errors</td><td>sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr></table>

Summary

<table><tr><td>receive-packets</td><td>sum(rate(container_network_receive_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>receive-dropped</td><td>sum(rate(container_network_receive_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>receive-errors</td><td>sum(rate(container_network_receive_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-packets</td><td>sum(rate(container_network_transmit_packets_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-dropped</td><td>sum(rate(container_network_transmit_packets_dropped_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit-errors</td><td>sum(rate(container_network_transmit_errors_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr></table>

Pod Network I/O

Catalog and expressions:

Detail

<table><tr><td>receive</td><td>sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit</td><td>sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr></table>

Summary

<table><tr><td>receive</td><td>sum(rate(container_network_receive_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>transmit</td><td>sum(rate(container_network_transmit_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr></table>

Pod Disk I/O

Catalog and expressions:

Detail

<table><tr><td>read</td><td>sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m])) by (container_name)</td></tr><tr><td>write</td><td>sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m])) by (container_name)</td></tr></table>

Summary

<table><tr><td>read</td><td>sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr><tr><td>write</td><td>sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name!=""}[5m]))</td></tr></table>

Container Metrics

Container CPU Utilization

Catalog and expressions:

cfs throttled seconds

sum(rate(container_cpu_cfs_throttled_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))

usage seconds

sum(rate(container_cpu_usage_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))

system seconds

sum(rate(container_cpu_system_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))

user seconds

sum(rate(container_cpu_user_seconds_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))

Container Memory Utilization

sum(container_memory_working_set_bytes{namespace="$namespace",pod_name="$podName",container_name="$containerName"})

Container Disk I/O

Catalog and expressions:

read

sum(rate(container_fs_reads_bytes_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))

write

sum(rate(container_fs_writes_bytes_total{namespace="$namespace",pod_name="$podName",container_name="$containerName"}[5m]))