5.1 Prometheus Configuration Example
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'openclaw'
    static_configs:
      - targets: ['openclaw-api:8080']
    metrics_path: '/actuator/prometheus'
  - job_name: 'claude-code-engine'
    static_configs:
      - targets: ['claude-engine:8081']
  - job_name: 'kubernetes-nodes'
    kubernetes_sd_configs:
      - role: node
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
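
One detail in the configuration above is easy to miss: only the openclaw job sets metrics_path, so the claude-code-engine job falls back to the Prometheus defaults (scheme http, path /metrics). The sketch below spells out the resulting scrape URLs for the two static jobs. The STATIC_JOBS list and the scrape_urls helper are hypothetical names introduced for illustration, and the jobs are transcribed by hand rather than parsed from the YAML file.

```python
from urllib.parse import urlunsplit

# Defaults Prometheus applies when a job omits these keys.
DEFAULT_SCHEME = "http"
DEFAULT_METRICS_PATH = "/metrics"

# The two static jobs from the configuration, transcribed by hand
# (hypothetical structure, not something Prometheus itself exposes).
STATIC_JOBS = [
    {
        "job_name": "openclaw",
        "targets": ["openclaw-api:8080"],
        "metrics_path": "/actuator/prometheus",
    },
    {
        "job_name": "claude-code-engine",
        "targets": ["claude-engine:8081"],
    },
]


def scrape_urls(job):
    """Return the URLs Prometheus would request for one static job."""
    scheme = job.get("scheme", DEFAULT_SCHEME)
    path = job.get("metrics_path", DEFAULT_METRICS_PATH)
    return [urlunsplit((scheme, target, path, "", "")) for target in job["targets"]]


if __name__ == "__main__":
    for job in STATIC_JOBS:
        for url in scrape_urls(job):
            print(f"{job['job_name']}: {url}")
```

If claude-engine serves its metrics on a different path, that job would need its own metrics_path entry, just as the openclaw job has.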
5.2 Custom Metrics Export Example
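from prometheus_client import start_http_server, Gauge, Counter
import time

# Per-service health score, labeled by service name.
service_health = Gauge('service_health_status',
                       'Service health status',
                       ['service_name'])
# Overall workflow progress.
workflow_progress = Gauge('workflow_total_progress',
                          'Workflow total progress')
# Number of successful tasks; call task_success.inc() when a task completes.
task_success = Counter('workflow_task_success_total',
                       'Total successful tasks')

# Expose the metrics over HTTP on port 8000.
start_http_server(8000)

while True:
    # Hard-coded sample values; replace with real measurements.
    service_health.labels(service_name='openclaw').set(95)
    workflow_progress.set(65)
    time.sleep(15)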
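
To make it clearer what Prometheus reads from port 8000, the following sketch renders the two gauges by hand in the Prometheus text exposition format. It is an illustration only: prometheus_client produces this output itself, and render_gauge is a hypothetical helper that skips details such as escaping special characters in label values.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge family in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs. Label values are
    assumed to contain no quotes, backslashes, or newlines (no escaping).
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {float(value)}")
        else:
            lines.append(f"{name} {float(value)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gauge("service_health_status", "Service health status",
                       [({"service_name": "openclaw"}, 95)]), end="")
    print(render_gauge("workflow_total_progress", "Workflow total progress",
                       [({}, 65)]), end="")
```

Each labeled series becomes one line, so adding a second service_name value to the exporter would add a second service_health_status line rather than a new metric.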
5.3 Grafana Dashboard JSON Structure
{
  "dashboard": {
    "title": "Service Status Monitoring",
    "panels": [
      {
        "id": 1,
        "title": "Service Health",
        "type": "stat",
        "targets": [{
          "expr": "avg(service_health_status)",
          "legendFormat": "Health"
        }],
        "fieldConfig": {
          "defaults": {
            "thresholds": {
              "steps": [
                {"color": "red", "value": null},
                {"color": "yellow", "value": 60},
                {"color": "green", "value": 80}
              ]
            }
          }
        }
      }
    ]
  }
}
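
The threshold steps in the panel above decide which color the stat panel shows. A minimal sketch of that mapping, assuming a reading takes the color of the highest step whose bound it meets or exceeds, and that the red step acts as the base step: threshold_color and STEPS are hypothetical names for illustration, not part of any Grafana API.

```python
def threshold_color(value, steps):
    """Pick the color of the highest step whose bound is <= value.

    A step whose value is None is treated as the base step and
    matches any reading.
    """
    def bound(step):
        return float("-inf") if step["value"] is None else step["value"]

    color = None
    # Walk the steps in ascending order; the last match wins.
    for step in sorted(steps, key=bound):
        if value >= bound(step):
            color = step["color"]
    return color


# The three colors and the 60/80 cut-offs from the panel definition.
STEPS = [
    {"color": "red", "value": None},   # base step: matches any value
    {"color": "yellow", "value": 60},
    {"color": "green", "value": 80},
]

if __name__ == "__main__":
    for reading in (95, 65, 10):
        print(reading, threshold_color(reading, STEPS))
```

With the sample value of 95 that the exporter in 5.2 sets for service_health_status, the panel would fall in the green band; a reading below 60 would fall in the red one.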