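An End-to-End R&D Automation System Based on OpenClaw + Claude Code

Environment Deployment Manual and Project Initialization Guide

📅 Version: v1.0 | Last updated: March 13, 2026 | Applies to: OpenClaw v2.5+ / Claude Code v3.7+

Chapter 1: System Overview and Architecture Design

🎯 Chapter goal: understand the overall architecture, core value, and technology-selection rationale of the end-to-end R&D automation system

1.1 Background and Value Proposition

AI Agent technology advanced rapidly through 2025-2026. Against that background, this system combines OpenClaw, a local-first AI assistant framework, with Claude Code, Anthropic's command-line tool, into an end-to-end R&D automation system. It automates the workflow from requirements analysis through production deployment, lets role-specific Agents for each R&D position work together, and keeps a human in the loop at key checkpoints.

✅ Core value:
  • Efficiency: the target is a 10x or greater speedup, with AI autonomously handling repetitive coding, testing, and deployment tasks
  • Quality assurance: standardized processes plus automated test coverage reduce human error
  • Human-AI collaboration: manual review is retained at key decision points to balance efficiency and safety
  • Traceability: the entire workflow is recorded, which simplifies auditing and troubleshooting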

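1.2 Overall System Architecture

Requirements management → PRD design → Technical design → API contract → AI coding → Unit testing → Integration testing → CI/CD → K8S deployment → UI automated acceptance

1.3 Technology Stack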

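Layer | Components | Version requirement | Role
AI Agent layer | OpenClaw + Claude Code | v2.5+ / v3.7+ | Agent orchestration, code generation, task execution
LLM layer | Claude 3.7 Sonnet / GPT-5.4 | Latest | Reasoning and decision-making, code understanding, natural language processing
Development tools layer | Git / Maven / npm | Latest | Version control, dependency management, build tooling
Testing layer | JUnit / pytest / Playwright | Latest | Unit testing, integration testing, UI automation testing
CI/CD layer | Jenkins + KubeSphere | Jenkins 2.400+ / KubeSphere 4.2+ | Continuous integration, pipeline orchestration, container management
Container orchestration layer | Docker + Kubernetes | Docker 24+ / K8s 1.28+ | Containerization, service orchestration, autoscaling
Monitoring layer | Prometheus + Grafana | Latest | System monitoring, metrics collection, visualization and alerting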

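1.4 R&D Role Agents Matrix

Product Manager Agent (PM-Agent)

Responsibilities: requirements analysis, PRD generation, user story breakdown, prioritization

Toolchain: Claude Code + OpenClaw Skills + Confluence API

Architect Agent (Architect-Agent)

Responsibilities: technical solution design, system architecture review, technology selection advice, risk assessment

Toolchain: Claude Code Thinking Mode + architecture knowledge base + Draw.io MCP

Backend Development Agent (Backend-Agent)

Responsibilities: API development, database design, business logic implementation, performance optimization

Toolchain: Claude Code + Spring Boot / FastAPI + PostgreSQL MCP

Frontend Development Agent (Frontend-Agent)

Responsibilities: UI component development, page interaction, responsive layout, performance optimization

Toolchain: Claude Code + React/Vue + TailwindCSS + Playwright MCP

Test Engineer Agent (QA-Agent)

Responsibilities: test case generation, automated test execution, defect reporting, quality assessment

Toolchain: Claude Code + JUnit/pytest + Playwright + Allure Report

DevOps Engineer Agent (DevOps-Agent)

Responsibilities: CI/CD pipeline configuration, container image builds, K8S deployment, monitoring and alerting

Toolchain: Claude Code + Jenkins Pipeline + KubeSphere API + Helm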

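Chapter 2: Base Environment Preparation

🎯 Chapter goal: install and configure all base software, dependencies, and runtime environments

2.1 Operating System Requirements

Operating system | Minimum version | Recommended | Notes
Ubuntu Linux | 22.04 LTS | 24.04 LTS | Preferred choice
macOS | 13.0 (Ventura) | 14.0+ (Sonoma) | Optimized for M-series chips
Windows | 11 | 11 + WSL2 | WSL2 must be enabled

⚠️ Hardware requirements:
  • CPU: at least 8 cores, 16 or more recommended (AI inference needs substantial compute)
  • Memory: at least 16GB, 32GB or more recommended
  • Storage: at least 100GB SSD, 500GB NVMe SSD recommended
  • Network: a stable internet connection (to reach GitHub, Docker Hub, and the LLM APIs)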
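
The minimums above can be checked before any installation starts. The following is a minimal sketch, not part of OpenClaw or Claude Code: the names `check_resources` and `detect_resources` are illustrative, sizes are measured in decimal gigabytes, disk size means total capacity of the given path, and the detection helper assumes Linux (`os.sysconf` names vary by platform).

```python
import os
import shutil

# Minimums taken from the hardware list: 8 cores, 16 GB RAM, 100 GB disk.
MIN_CPU_CORES = 8
MIN_MEMORY_GB = 16
MIN_DISK_GB = 100


def check_resources(cpu_cores: int, memory_gb: float, disk_gb: float) -> list[str]:
    """Return one human-readable problem per resource that is below its minimum."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU: {cpu_cores} cores < {MIN_CPU_CORES}")
    if memory_gb < MIN_MEMORY_GB:
        problems.append(f"Memory: {memory_gb:.1f} GB < {MIN_MEMORY_GB} GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"Disk: {disk_gb:.1f} GB < {MIN_DISK_GB} GB")
    return problems


def detect_resources(path: str = "/") -> tuple[int, float, float]:
    """Read core count, physical memory, and total disk size (assumes Linux)."""
    cpu = os.cpu_count() or 0
    mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk_bytes = shutil.disk_usage(path).total
    return cpu, mem_bytes / 1e9, disk_bytes / 1e9


if __name__ == "__main__":
    for line in check_resources(*detect_resources()) or ["All minimums met"]:
        print(line)
```

Keeping the comparison in `check_resources` separate from detection makes the thresholds easy to test and to adjust toward the recommended values.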

2.2 Installing System Dependencies

2.2.1 Ubuntu/Debian

# Update system packages
sudo apt update && sudo apt upgrade -y

# Install basic development tools
sudo apt install -y git curl wget vim build-essential software-properties-common

# Install Docker prerequisites
sudo apt install -y ca-certificates gnupg lsb-release

# Install Kubernetes tooling prerequisites
sudo apt install -y apt-transport-https conntrack socat ipset

2.2.2 macOS

# Install the Homebrew package manager
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install basic tools
brew install git curl wget vim

2.3 Python Environment Setup

# Install Python 3.12 (Ubuntu; the deadsnakes PPA is only needed on releases that do not ship 3.12, such as 22.04)
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.12 python3.12-venv python3.12-dev

# Verify the installation
python3.12 --version

# Create a virtual environment
python3.12 -m venv ~/.venvs/openclaw-env
source ~/.venvs/openclaw-env/bin/activate

# Upgrade pip
pip install --upgrade pip setuptools wheel

# Install core dependencies
pip install openclaw==2.5.0 anthropic==0.37.0 playwright==1.42.0

2.4 Node.js Environment Setup

# Install Node.js 20 LTS with nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
source ~/.bashrc
nvm install 20
nvm use 20
nvm alias default 20

# Verify the installation
node --version
npm --version

# Install global tools
npm install -g pnpm yarn typescript ts-node

2.5 Docker and Docker Compose

# Add Docker's official GPG key
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Add the Docker repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker Engine
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Start Docker and enable it at boot
sudo systemctl enable docker
sudo systemctl start docker

# Add the current user to the docker group (so sudo is not needed for every command)
sudo usermod -aG docker $USER
newgrp docker

# Verify the installation
docker --version
docker compose version

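2.6 Kubernetes Cluster Preparation

💡 Choosing a deployment option:
  • Development: use kind or minikube to quickly stand up a single-node cluster
  • Testing: use k3s or kubeadm to build a multi-node cluster
  • Production: use a managed Kubernetes service (EKS/GKE/AKS) or a self-managed high-availability cluster

2.6.1 Creating a Development Cluster with kind

# Install kind
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.22.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind

# Create a multi-node cluster configuration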
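cat > kind-config.yaml <<'EOF'
# Minimal multi-node layout (an assumption; adjust node count and port mappings to your needs)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
EOF

# Create the cluster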
kind create cluster --config kind-config.yaml --name rd-automation

# Verify cluster status (requires kubectl; see 2.6.2)
kubectl cluster-info
kubectl get nodes

2.6.2 Installing the kubectl CLI

# Download kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/

# Enable kubectl shell completion
echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc

# Verify the installation
kubectl version --client

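2.7 Installing KubeSphere

# Prerequisite check: make sure the K8s cluster is healthy
kubectl get nodes

# Download ks-installer
git clone https://github.com/kubesphere/ks-installer.git
cd ks-installer

# Edit deploy/cluster-configuration.yaml to enable the components you need
vim deploy/cluster-configuration.yaml
# Enable components such as devops, monitoring, and logging

# Run the installation: apply the installer first, then the cluster configuration
kubectl apply -f deploy/kubesphere-installer.yaml
kubectl apply -f deploy/cluster-configuration.yaml

# Follow the installation progress
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f

# After installation, open the console
# Default address: http://<NodeIP>:30880
# Default account: admin / password: P@88w0rd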

2.8 Installing and Configuring Jenkins

# Create the Jenkins namespace
kubectl create namespace jenkins

# Create the persistent storage PV
# (write the PersistentVolume manifest to jenkins-pv.yaml and apply it with kubectl apply -f)

# Deploy Jenkins
helm repo add jenkinsci https://charts.jenkins.io
helm repo update
helm install jenkins jenkinsci/jenkins \
  --namespace jenkins \
  --set persistence.enabled=true \
  --set persistence.size=50Gi \
  --set controller.serviceType=NodePort \
  --set controller.nodePort=30080

# Retrieve the initial admin password (the chart mounts it as a secret in the controller pod)
kubectl exec -it -n jenkins svc/jenkins -c jenkins -- cat /run/secrets/additional/chart-admin-password

# Open Jenkins
# Address: http://<NodeIP>:30080

2.9 Environment Variables

# Append to ~/.bashrc or ~/.zshrc
cat >> ~/.bashrc <<'EOF'

# Environment variables for the OpenClaw + Claude Code R&D automation system

# API keys (replace with your real keys)
export ANTHROPIC_API_KEY="your-anthropic-api-key"
export OPENAI_API_KEY="your-openai-api-key"

# OpenClaw settings
export OPENCLAW_HOME="$HOME/openclaw"
export OPENCLAW_CONFIG="$OPENCLAW_HOME/config"
export OPENCLAW_SKILLS="$OPENCLAW_HOME/skills"

# Claude Code settings
export CLAUDE_CODE_HOME="$HOME/.claude-code"
export CLAUDE_CODE_MODEL="claude-sonnet-4-20250514"

# Kubernetes settings
export KUBECONFIG="$HOME/.kube/config"
export DEFAULT_NAMESPACE="default"

# Docker registry settings
export DOCKER_REGISTRY="registry.example.com"
export DOCKER_USERNAME="your-username"
export DOCKER_PASSWORD="your-password"

# Jenkins settings
export JENKINS_URL="http://jenkins.example.com:30080"
export JENKINS_USER="admin"
export JENKINS_TOKEN="your-jenkins-token"

# Git settings
export GIT_AUTHOR_NAME="RD-Automation-System"
export GIT_AUTHOR_EMAIL="automation@example.com"

EOF

# Apply the configuration
source ~/.bashrc

✅ Environment verification checklist:
  • Python 3.12+ is installed and working
  • Node.js 20 LTS is installed and working
  • Docker is running and the current user can run docker commands without sudo
  • The Kubernetes cluster is healthy (kubectl get nodes shows Ready)
  • The KubeSphere console is reachable
  • Jenkins is reachable and initialized
  • All environment variables are set correctly
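
Part of this checklist can be automated. The sketch below covers only the command-line tools and the environment variables; cluster, KubeSphere, and Jenkins reachability still need the manual checks. The helper names and the selection of tools and variables are illustrative choices, and any value that still starts with `your-` is treated as an unreplaced placeholder from the template above.

```python
import os
import shutil

# Tools and variables taken from this chapter; extend the lists as needed.
REQUIRED_TOOLS = ["python3.12", "node", "docker", "kubectl"]
REQUIRED_ENV_VARS = ["ANTHROPIC_API_KEY", "OPENCLAW_HOME", "KUBECONFIG", "JENKINS_URL"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def unconfigured_env_vars(names, env):
    """Return variables that are unset, empty, or still hold a 'your-...' placeholder."""
    return [n for n in names if not env.get(n) or env[n].startswith("your-")]


if __name__ == "__main__":
    for tool in missing_tools(REQUIRED_TOOLS):
        print(f"missing tool: {tool}")
    for name in unconfigured_env_vars(REQUIRED_ENV_VARS, os.environ):
        print(f"not configured: {name}")
```

Passing `which` and `env` as parameters keeps both helpers testable without touching the real machine state.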

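Chapter 3: Installing and Configuring Core Components

🎯 Chapter goal: complete the in-depth configuration of OpenClaw, Claude Code, and the MCP servers

3.1 Installing and Configuring OpenClaw

3.1.1 Cloning the Source and Installing

# Clone the OpenClaw repository
cd ~
git clone https://github.com/OpenClawHQ/openclaw.git
cd openclaw

# Install pnpm (if it is not installed yet)
npm install -g pnpm

# Install dependencies
pnpm install

# Build the project
pnpm build

# Initialize the configuration
pnpm run init

# Verify the installation
pnpm run doctor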

3.1.2 Configuration File Reference

# ~/.openclaw/config.yaml
llm:
  provider: anthropic
  model: claude-sonnet-4-20250514
  api_key_env: ANTHROPIC_API_KEY
  max_tokens: 8192
  temperature: 0.7
  
memory:
  type: sqlite
  path: ~/.openclaw/memory.db
  vector_store: chromadb
  embedding_model: text-embedding-3-small
  
channels:
  enabled:
    - feishu      # Feishu (Lark)
    - wechat      # WeCom (WeChat Work)
    - dingtalk    # DingTalk
    - slack
    - discord
  
  feishu:
    app_id: ${FEISHU_APP_ID}
    app_secret: ${FEISHU_APP_SECRET}
    verification_token: ${FEISHU_VERIFICATION_TOKEN}
  
skills:
  auto_load: true
  custom_paths:
    - ~/.openclaw/custom-skills
  security:
    sandbox_mode: strict
    allowed_commands:
      - git
      - docker
      - kubectl
      - npm
      - pnpm
      - python3
    blocked_commands:
      - rm -rf /
      - dd
      - mkfs
      
gateway:
  host: 0.0.0.0
  port: 8080
  websocket_path: /ws
  
ui:
  control_panel: true
  web_chat: true
  port: 3000
  
logging:
  level: info
  file: ~/.openclaw/logs/openclaw.log
  max_size: 100MB
  retention_days: 30
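
The `${FEISHU_APP_ID}`-style values in this file are placeholders for environment variables. Whether OpenClaw expands them itself is not stated here, so the following is only a sketch of how such expansion is commonly done; `expand_placeholders` is a hypothetical helper that fails loudly when a referenced variable is missing instead of silently leaving the placeholder in place.

```python
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(text: str, env: dict[str, str]) -> str:
    """Replace every ${NAME} in text with env[NAME]; raise KeyError if any NAME is undefined."""
    missing = sorted({name for name in _PLACEHOLDER.findall(text) if name not in env})
    if missing:
        raise KeyError(f"undefined variables: {', '.join(missing)}")
    return _PLACEHOLDER.sub(lambda match: env[match.group(1)], text)


snippet = "app_id: ${FEISHU_APP_ID}\napp_secret: ${FEISHU_APP_SECRET}"
print(expand_placeholders(snippet, {"FEISHU_APP_ID": "cli_demo", "FEISHU_APP_SECRET": "s3cret"}))
```

In real use the `env` argument would be `os.environ`; the demo values here are made up.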

3.1.3 Starting the OpenClaw Services

# Start the Gateway service
cd ~/openclaw
pnpm run gateway:start

# Start the Control UI
pnpm run ui:start

# Run in the background with systemd (assumes an openclaw-gateway.service unit is already installed)
sudo systemctl enable openclaw-gateway
sudo systemctl start openclaw-gateway
sudo systemctl status openclaw-gateway

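3.2 Installing and Configuring Claude Code

3.2.1 Installing the Claude Code CLI

# Install globally with npm
npm install -g @anthropic-ai/claude-code

# Or run it directly with npx
npx @anthropic-ai/claude-code@latest

# Verify the installation
claude --version

# Initialize the configuration
claude auth
# Enter your ANTHROPIC_API_KEY when prompted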

3.2.2 Claude Code Configuration File

# ~/.claude-code/settings.json
{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 8192,
  "temperature": 0.7,
  "thinking_enabled": true,
  "auto_approve": ["read", "search"],
  "require_approval_for": ["write", "execute", "deploy"],
  "mcp_servers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"]
    },
    "kubernetes": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-kubernetes"]
    },
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp@latest"]
    }
  },
  "permissions": {
    "/home/user/projects": "read-write",
    "/tmp": "read-write",
    "/etc": "read-only"
  },
  "hooks": {
    "pre_execute": "~/.claude-code/hooks/pre-execute.sh",
    "post_execute": "~/.claude-code/hooks/post-execute.sh"
  }
}
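
A hand-edited JSON settings file can end up with the same key written twice inside one object. Python's `json` module accepts that silently and keeps only the last value, so an earlier `args` list, for example, would simply disappear. The sketch below is one way to catch this before a file is deployed; `load_strict` is an illustrative helper, not part of Claude Code, and it expects pure JSON (strip any leading comment line first).

```python
import json


def _reject_duplicate_keys(pairs):
    """object_pairs_hook: build the dict for one JSON object, failing on repeated keys."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key in JSON object: {key!r}")
        result[key] = value
    return result


def load_strict(text: str):
    """Parse JSON text, raising ValueError if any object repeats a key."""
    return json.loads(text, object_pairs_hook=_reject_duplicate_keys)


good = '{"command": "npx", "args": ["-y", "some-server", "/home/user/projects"]}'
print(load_strict(good)["args"])
```

The hook runs once per JSON object, so duplicates are detected at any nesting depth.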

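3.3 MCP Server Configuration

3.3.1 GitHub MCP Server

# Create a GitHub token (requires the repo and workflow scopes)
# Visit https://github.com/settings/tokens

# Configure MCP: add the github server entry to ~/.claude-code/mcp.json
# (same shape as the github entry under mcp_servers in section 3.2.2)

3.3.2 Playwright MCP Server (UI Automation)

# Install the Playwright browsers
npx playwright install chromium firefox webkit

# Configure Playwright MCP: add the playwright server entry to ~/.claude-code/mcp.json

3.3.3 Kubernetes MCP Server

# Configure K8s MCP: add the kubernetes server entry to ~/.claude-code/mcp.json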

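3.4 Developing OpenClaw Skills

3.4.1 Creating a Custom Skill

# Create the Skill directory structure (under the custom_paths entry from config.yaml)
mkdir -p ~/.openclaw/custom-skills/code-generator
cd ~/.openclaw/custom-skills/code-generator

# Create skill.yaml
# (the name and dependencies keys and the anthropic package name are assumed; adjust to the OpenClaw Skill schema)
cat > skill.yaml <<'EOF'
name: code-generator
dependencies:
  - anthropic>=0.37.0
  - jinja2>=3.1.0
  - black>=24.0.0
  - prettier>=3.0.0
EOF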

# Create the main script
cat > main.py <<'PYTHON'
#!/usr/bin/env python3
"""
Code generator Skill: generates a project skeleton from a PRD document
"""

import sys
from pathlib import Path
from jinja2 import Template

class CodeGenerator:
    def __init__(self, project_name: str, prd_content: str):
        self.project_name = project_name
        self.prd_content = prd_content
        self.base_dir = Path.cwd() / project_name
    
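    def generate_backend_scaffold(self):
        """Generate the backend project skeleton (this version produces a FastAPI layout)."""
        backend_dir = self.base_dir / "backend"
        backend_dir.mkdir(parents=True, exist_ok=True)

        # Create the FastAPI project structure
        (backend_dir / "app").mkdir(exist_ok=True)
        (backend_dir / "app" / "api").mkdir(exist_ok=True)
        (backend_dir / "app" / "models").mkdir(exist_ok=True)
        (backend_dir / "app" / "services").mkdir(exist_ok=True)

        # Make app/ and app/api/ importable; the generated main.py imports router from app.api
        (backend_dir / "app" / "__init__.py").touch()
        (backend_dir / "app" / "api" / "__init__.py").write_text(
            "from fastapi import APIRouter\n\nrouter = APIRouter()\n"
        )

        # Generate main.py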
        main_template = Template("""
from fastapi import FastAPI
from .api import router

app = FastAPI(
    title="{{ project_name }}",
    description="Auto-generated by OpenClaw Code Generator",
    version="1.0.0"
)

app.include_router(router, prefix="/api/v1")

@app.get("/health")
def health_check():
    return {"status": "healthy"}
""")
        
        with open(backend_dir / "app" / "main.py", 'w') as f:
            f.write(main_template.render(project_name=self.project_name))
        
        print(f"✓ Backend scaffold generated at {backend_dir}")
    
    def generate_frontend_scaffold(self):
        """Generate the frontend project skeleton (React + TypeScript)."""
        frontend_dir = self.base_dir / "frontend"
        frontend_dir.mkdir(parents=True, exist_ok=True)

        # Create the React project structure
        (frontend_dir / "src").mkdir(exist_ok=True)
        (frontend_dir / "src" / "components").mkdir(exist_ok=True)
        (frontend_dir / "src" / "pages").mkdir(exist_ok=True)
        (frontend_dir / "src" / "hooks").mkdir(exist_ok=True)
        
        print(f"✓ Frontend scaffold generated at {frontend_dir}")
    
    def generate_dockerfiles(self):
        """Generate the Dockerfile and docker-compose.yml."""
        # Backend Dockerfile
        backend_dockerfile = """FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
"""
        
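        with open(self.base_dir / "backend" / "Dockerfile", 'w') as f:
            f.write(backend_dockerfile)

        # The Dockerfile copies requirements.txt, so write a minimal one next to it
        (self.base_dir / "backend" / "requirements.txt").write_text("fastapi\nuvicorn\n")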
        
        # docker-compose.yml
        docker_compose = """version: '3.8'

services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/app
    depends_on:
      - db
  
  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
  
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: app
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
"""
        
        with open(self.base_dir / "docker-compose.yml", 'w') as f:
            f.write(docker_compose)
        
        print("✓ Docker configuration generated")
    
    def generate_all(self):
        """Run the complete project generation flow."""
        print(f"🚀 Starting project generation for: {self.project_name}")
        self.generate_backend_scaffold()
        self.generate_frontend_scaffold()
        self.generate_dockerfiles()
        print(f"✅ Project '{self.project_name}' generated successfully!")

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python main.py <project_name> <prd_file>")
        sys.exit(1)
    
    project_name = sys.argv[1]
    prd_file = sys.argv[2]
    
    with open(prd_file, 'r') as f:
        prd_content = f.read()
    
    generator = CodeGenerator(project_name, prd_content)
    generator.generate_all()
PYTHON

chmod +x main.py

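3.5 Security Configuration and Permission Management

⚠️ Security best practices:
  • Inject every API key through environment variables; never hard-code keys
  • Enable OpenClaw's sandbox mode (sandbox_mode: strict)
  • Configure a command allowlist that permits only the system commands you need
  • Rotate keys and access tokens regularly
  • Enable audit logging to record every operation the AI executes

# Create the security policy file
cat > ~/.openclaw/security-policy.yaml <<'EOF'
sandbox: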
  enabled: true
  mode: strict
  network_isolation: true
  filesystem_isolation: true

command_whitelist:
  allowed:
    - git clone
    - git pull
    - git push
    - git commit
    - docker build
    - docker run
    - kubectl apply
    - kubectl get
    - kubectl describe
    - npm install
    - npm run build
    - pnpm install
    - python3 -m pytest
    - java -jar
  
  denied:
    - rm -rf
    - dd
    - mkfs
    - chmod 777
    - curl.*\|.*sh
    - wget.*\|.*sh

audit:
  log_all_actions: true
  log_file: ~/.openclaw/logs/audit.log
  retention_days: 90
  alert_on_denied: true

rate_limits:
  max_api_calls_per_hour: 1000
  max_file_operations_per_minute: 100
  max_concurrent_executions: 5
EOF
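
How OpenClaw evaluates the `allowed` and `denied` lists is not described here. The sketch below shows one plausible interpretation under stated assumptions: deny rules are regular expressions searched anywhere in the command and always win, allow rules are command prefixes, and anything not explicitly allowed is rejected. `is_command_allowed` is a hypothetical helper, and the lists are abbreviated from the policy file above.

```python
import re

ALLOWED_PREFIXES = ["git clone", "git pull", "docker build", "kubectl get", "python3 -m pytest"]
DENIED_PATTERNS = [r"rm -rf", r"dd", r"mkfs", r"chmod 777", r"curl.*\|.*sh", r"wget.*\|.*sh"]


def is_command_allowed(command, allowed=ALLOWED_PREFIXES, denied=DENIED_PATTERNS):
    """Return True only if no deny pattern matches and the command starts with an allowed prefix."""
    cmd = " ".join(command.split())  # collapse repeated whitespace
    # Deny rules take precedence; word boundaries keep "dd" from matching inside "add".
    for pattern in denied:
        if re.search(rf"\b(?:{pattern})\b", cmd):
            return False
    # Default deny: the command must begin with an allowlisted prefix.
    return any(cmd == prefix or cmd.startswith(prefix + " ") for prefix in allowed)


for candidate in ["kubectl get pods", "rm -rf /tmp/build", "curl https://example.com/i.sh | sh"]:
    print(candidate, "->", is_command_allowed(candidate))
```

Because deny patterns are searched across the whole string, a chained command such as `git clone ... && rm -rf /` is rejected even though it starts with an allowed prefix.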

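Chapter 4: Configuring the R&D Role Agents

🎯 Chapter goal: configure a dedicated Agent for each R&D role and define its responsibilities, toolchain, and workflow

4.1 Common Agent Configuration Framework

# ~/.openclaw/agents/agent-template.yaml
agent:
  name: <agent-name>
  role: <role-title>
  version: 1.0.0

personality:
  tone: professional
  expertise_level: senior
  communication_style: concise-and-clear

capabilities:
  - <capability-1>
  - <capability-2>

tools:
  llm:
    provider: anthropic
    model: claude-sonnet-4-20250514
    temperature: 0.7
    max_tokens: 8192

  mcp_servers:
    - github
    - filesystem
    - <additional-mcp-server>

  skills:
    - <skill-1>
    - <skill-2>

workflow:
  trigger: <trigger-definition>
  steps:
    - <step-1>
    - <step-2>

  human_in_loop:
    enabled: true
    approval_points:
      - <approval-point-1>
      - <approval-point-2>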

output:
  format: markdown
  delivery:
    - channel: feishu
    - channel: email
    - channel: confluence

4.2 Product Manager Agent (PM-Agent)

Core responsibilities: requirements analysis, PRD generation, user story mapping, prioritization, competitive analysis

# ~/.openclaw/agents/pm-agent.yaml
agent:
  name: PM-Agent
  role: Senior Product Manager
  version: 1.0.0

personality:
  tone: professional-and-empathetic
  expertise_level: senior
  years_of_experience: 10
  specialization:
    - SaaS products
    - B2B enterprise software
    - AI-powered applications

capabilities:
  - requirement-analysis
  - prd-generation
  - user-story-mapping
  - priority-ranking
  - competitive-analysis
  - market-research
  - stakeholder-interviews

tools:
  llm:
    provider: anthropic
    model: claude-sonnet-4-20250514
    thinking_mode: enabled
    temperature: 0.8
  
  mcp_servers:
    - confluence
    - jira
    - figma
    - google-docs
  
  skills:
    - requirement-analyzer
    - prd-generator
    - user-story-creator
    - competitor-analyzer

knowledge_base:
  sources:
    - ~/knowledge/product-management/
    - ~/knowledge/industry-reports/
    - ~/knowledge/user-research/
  
  vector_store: chromadb
  embedding_model: text-embedding-3-large

workflow:
  name: PRD Generation Workflow
  trigger:
    type: keyword
    patterns:
      - "生成 PRD"
      - "create PRD"
      - "产品需求文档"
      - "product requirement"
  
  steps:
    - step: 1
      action: interview-stakeholders
      description: Gather requirements through conversation
      questions:
        - "Who are the product's target users?"
        - "What are the core pain points?"
        - "Which key features are expected?"
        - "How are the success metrics defined?"

    - step: 2
      action: analyze-requirements
      description: Analyze requirement priorities with the Kano model

    - step: 3
      action: generate-prd-draft
      description: Generate the first PRD draft
      template: ~/templates/prd-template.md

    - step: 4
      action: create-user-stories
      description: Break the PRD down into user stories and acceptance criteria
      format: As a [user], I want to [action], so that [benefit]

    - step: 5
      action: human-review
      description: Submit for human review
      channel: feishu

    - step: 6
      action: finalize-and-publish
      description: Revise based on feedback and publish to Confluence

  human_in_loop:
    enabled: true
    approval_points:
      - prd-draft-completion
      - priority-finalization
      - before-publishing

output_templates:
  prd: ~/templates/prd-output.md
  user_stories: ~/templates/user-stories.md
  competitive_analysis: ~/templates/competitor-analysis.md

metrics:
  track:
    - prd-generation-time
    - stakeholder-satisfaction
    - requirement-clarity-score
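
The `trigger` block above uses `type: keyword` with a list of patterns. The exact matching rules are not specified, so the sketch below assumes the simplest reading: a case-insensitive substring match against the incoming message. `matches_trigger` is a hypothetical helper; the patterns are copied from the PM-Agent definition.

```python
PM_AGENT_PATTERNS = ["生成 PRD", "create PRD", "产品需求文档", "product requirement"]


def matches_trigger(message: str, patterns: list[str]) -> bool:
    """Return True if any pattern occurs in the message, ignoring case."""
    normalized = message.casefold()
    return any(pattern.casefold() in normalized for pattern in patterns)


for text in ["Please create prd for the billing module", "请帮我生成 PRD", "deploy to staging"]:
    print(text, "->", matches_trigger(text, PM_AGENT_PATTERNS))
```

Substring matching is deliberately loose; a production router would probably also need to resolve messages that match the patterns of more than one Agent.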

4.3 Architect Agent (Architect-Agent)

Core responsibilities: system architecture design, technology selection, architecture review, risk assessment, performance optimization plans

# ~/.openclaw/agents/architect-agent.yaml
agent:
  name: Architect-Agent
  role: Chief Architect
  version: 1.0.0

personality:
  tone: analytical-and-pragmatic
  expertise_level: principal
  years_of_experience: 15
  specialization:
    - distributed systems
    - cloud-native architecture
    - microservices
    - event-driven architecture

capabilities:
  - system-design
  - technology-selection
  - architecture-review
  - risk-assessment
  - performance-optimization
  - scalability-planning
  - security-architecture

tools:
  llm:
    provider: anthropic
    model: claude-sonnet-4-20250514
    thinking_mode: deep
    temperature: 0.5
  
  mcp_servers:
    - drawio
    - mermaid
    - github
    - stackoverflow
  
  skills:
    - architecture-designer
    - tech-stack-advisor
    - risk-analyzer
    - diagram-generator

knowledge_base:
  sources:
    - ~/knowledge/architecture-patterns/
    - ~/knowledge/tech-radar/
    - ~/knowledge/case-studies/
    - aws-well-architected-framework
    - microsoft-azure-well-architected-framework

workflow:
  name: Technical Solution Design Workflow
  trigger:
    type: keyword
    patterns:
      - "技术方案设计"
      - "architecture design"
      - "系统架构"
      - "technical proposal"
  
  steps:
    - step: 1
      action: analyze-prd
      description: Analyze the PRD in depth and extract the technical requirements

    - step: 2
      action: define-non-functional-requirements
      description: Define non-functional requirements such as performance, availability, and security
      aspects:
        - performance: "P99 latency < 200ms"
        - availability: "99.9% SLA"
        - scalability: "Support 10x growth"
        - security: "SOC2 Type II compliant"

    - step: 3
      action: evaluate-tech-options
      description: Evaluate the technology options
      criteria:
        - maturity
        - community-support
        - team-expertise
        - cost
        - vendor-lock-in-risk

    - step: 4
      action: design-architecture
      description: Design the system architecture diagrams
      deliverables:
        - component-diagram
        - deployment-diagram
        - data-flow-diagram
        - api-contract

    - step: 5
      action: risk-assessment
      description: Identify technical risks and mitigations

    - step: 6
      action: architecture-review-meeting
      description: Organize the architecture review meeting
      participants:
        - tech-lead
        - senior-engineers
        - security-team

    - step: 7
      action: finalize-architecture-doc
      description: Produce the final technical design document

  human_in_loop:
    enabled: true
    approval_points:
      - technology-selection
      - architecture-finalization
      - before-implementation

templates:
  architecture_doc: ~/templates/architecture-design.md
  api_contract: ~/templates/api-contract-openapi.yaml
  risk_register: ~/templates/risk-register.xlsx
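
The non-functional targets in step 2 ("99.9% SLA", "P99 latency < 200ms") translate into concrete numbers. As a worked example, a 99.9% availability target over a 30-day month leaves 0.1% of 43,200 minutes, or about 43.2 minutes of downtime. The sketch below computes that budget and a P99 value using the nearest-rank method; both function names are illustrative, and the 30-day window and the percentile method are assumptions, since the source does not define how either target is measured.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Downtime budget in minutes for an availability target over a window of `days`."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


def percentile_nearest_rank(samples, percentile: int):
    """Nearest-rank percentile: the smallest sample with at least `percentile`% of samples at or below it."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("samples must not be empty")
    if not 1 <= percentile <= 100:
        raise ValueError("percentile must be between 1 and 100")
    rank = (percentile * len(ordered) + 99) // 100  # integer ceil(percentile * n / 100)
    return ordered[rank - 1]


latencies_ms = [120, 80, 200, 150, 95, 110, 130, 90, 105, 180]
print(round(allowed_downtime_minutes(99.9), 1), "minutes per 30 days")
print(percentile_nearest_rank(latencies_ms, 99), "ms at P99")
```

With only ten samples the P99 equals the maximum, which is a reminder that tail-latency targets need a large sample to be meaningful.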

4.4 Backend Development Agent (Backend-Agent)

Core responsibilities: API development, database design, business logic implementation, code review, performance optimization

# ~/.openclaw/agents/backend-agent.yaml
agent:
  name: Backend-Agent
  role: Senior Backend Engineer
  version: 1.0.0

personality:
  tone: precise-and-efficient
  expertise_level: senior
  years_of_experience: 8
  specialization:
    - RESTful API design
    - database optimization
    - microservices
    - event-driven systems

capabilities:
  - api-development
  - database-design
  - business-logic-implementation
  - code-review
  - performance-optimization
  - security-implementation
  - integration-testing

tools:
  llm:
    provider: anthropic
    model: claude-sonnet-4-20250514
    temperature: 0.3
  
  mcp_servers:
    - github
    - postgresql
    - redis
    - swagger
  
  skills:
    - code-generator
    - api-designer
    - database-modeler
    - test-writer
    - code-reviewer

frameworks:
  preferred:
    - FastAPI (Python)
    - Spring Boot (Java)
    - Express.js (Node.js)
  
  database:
    - PostgreSQL
    - MongoDB
    - Redis

workflow:
  name: Backend Development Workflow
  trigger:
    type: keyword
    patterns:
      - "开发 API"
      - "develop API"
      - "后端实现"
      - "backend implementation"
  
  steps:
    - step: 1
      action: analyze-api-contract
      description: Analyze the API contract (OpenAPI Spec)

    - step: 2
      action: design-database-schema
      description: Design the database tables and indexes
      deliverable: migration-scripts

    - step: 3
      action: implement-models
      description: Implement the data model layer (ORM)

    - step: 4
      action: implement-services
      description: Implement the business logic layer

    - step: 5
      action: implement-controllers
      description: Implement the API controller layer

    - step: 6
      action: write-unit-tests
      description: Write unit tests with coverage above 80%

    - step: 7
      action: self-code-review
      description: Perform a self code review

    - step: 8
      action: create-pull-request
      description: Create the Pull Request

  human_in_loop:
    enabled: true
    approval_points:
      - database-schema-change
      - before-merge-to-main

coding_standards:
  style_guide: PEP8 / Google Java Style
  linting: black, flake8, checkstyle
  documentation: docstrings required
  testing: TDD preferred

4.5 Frontend Development Agent (Frontend-Agent)

Core responsibilities: UI component development, page implementation, responsive layout, performance optimization, accessibility

# ~/.openclaw/agents/frontend-agent.yaml
agent:
  name: Frontend-Agent
  role: Senior Frontend Engineer
  version: 1.0.0

personality:
  tone: creative-and-detail-oriented
  expertise_level: senior
  years_of_experience: 7
  specialization:
    - React ecosystem
    - responsive design
    - web performance
    - accessibility (WCAG 2.1)

capabilities:
  - component-development
  - page-implementation
  - responsive-design
  - performance-optimization
  - accessibility-audit
  - cross-browser-testing
  - e2e-testing

tools:
  llm:
    provider: anthropic
    model: claude-sonnet-4-20250514
    temperature: 0.6
  
  mcp_servers:
    - github
    - figma
    - playwright
    - lighthouse
  
  skills:
    - component-generator
    - style-converter
    - accessibility-checker
    - performance-optimizer

frameworks:
  preferred:
    - React 18+
    - Next.js 14+
    - Vue 3+
  
  styling:
    - TailwindCSS
    - CSS Modules
    - Styled Components
  
  state_management:
    - Zustand
    - Redux Toolkit
    - React Query

workflow:
  name: Frontend Development Workflow
  trigger:
    type: keyword
    patterns:
      - "开发页面"
      - "develop page"
      - "前端实现"
      - "frontend implementation"
  
  steps:
    - step: 1
      action: analyze-design-mockup
      description: Analyze the Figma design mockups

    - step: 2
      action: setup-component-structure
      description: Plan the component hierarchy

    - step: 3
      action: implement-components
      description: Implement the UI components (Atomic Design principles)

    - step: 4
      action: implement-pages
      description: Assemble the pages and connect them to the API

    - step: 5
      action: responsive-testing
      description: Test responsive behavior across devices (Mobile/Tablet/Desktop)

    - step: 6
      action: performance-optimization
      description: Optimize performance (Lighthouse score above 90)
      metrics:
        - LCP < 2.5s
        - INP < 200ms
        - CLS < 0.1

    - step: 7
      action: accessibility-audit
      description: Run the accessibility check (WCAG 2.1 AA)

    - step: 8
      action: create-pull-request
      description: Create the Pull Request

  human_in_loop:
    enabled: true
    approval_points:
      - design-review
      - before-merge-to-main

coding_standards:
  style_guide: Airbnb React Style Guide
  linting: ESLint + Prettier
  testing: Jest + React Testing Library
  documentation: Storybook for components

4.6 Test Engineer Agent (QA-Agent)

Core responsibilities: test strategy, test case generation, automated test execution, defect management, quality reporting

# ~/.openclaw/agents/qa-agent.yaml
agent:
  name: QA-Agent
  role: Senior Test Engineer
  version: 1.0.0

personality:
  tone: meticulous-and-systematic
  expertise_level: senior
  years_of_experience: 8
  specialization:
    - test automation
    - performance testing
    - security testing
    - quality assurance

capabilities:
  - test-strategy-planning
  - test-case-generation
  - automated-test-execution
  - defect-management
  - quality-reporting
  - regression-testing
  - load-testing

tools:
  llm:
    provider: anthropic
    model: claude-sonnet-4-20250514
    temperature: 0.4
  
  mcp_servers:
    - github
    - jira
    - playwright
    - jmeter
  
  skills:
    - test-case-generator
    - test-runner
    - defect-analyzer
    - coverage-reporter

testing_frameworks:
  unit:
    - pytest (Python)
    - JUnit 5 (Java)
    - Jest (JavaScript)
  
  integration:
    - TestContainers
    - WireMock
  
  e2e:
    - Playwright
    - Cypress
  
  performance:
    - k6
    - JMeter

workflow:
  name: Quality Assurance Workflow
  trigger:
    type: keyword
    patterns:
      - "生成测试"
      - "generate tests"
      - "执行测试"
      - "run tests"
  
  steps:
    - step: 1
      action: analyze-requirements
      description: Analyze the PRD and user stories to extract test points

    - step: 2
      action: create-test-strategy
      description: Define the test strategy (test pyramid)
      distribution:
        - unit: 70%
        - integration: 20%
        - e2e: 10%

    - step: 3
      action: generate-test-cases
      description: Generate test cases (equivalence partitioning, boundary values, scenario-based)

    - step: 4
      action: implement-automated-tests
      description: Implement the automated test scripts

    - step: 5
      action: execute-test-suite
      description: Run the test suite

    - step: 6
      action: analyze-results
      description: Analyze the test results and generate defect reports

    - step: 7
      action: generate-quality-report
      description: Generate the quality report (coverage, pass rate, trends)

  human_in_loop:
    enabled: true
    approval_points:
      - test-strategy-review
      - critical-defect-confirmation

quality_gates:
  unit_test_coverage: ">80%"
  integration_test_coverage: ">60%"
  critical_bugs: "0"
  test_pass_rate: ">95%"
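
The `quality_gates` values are short rule strings such as `">80%"` or `"0"`. How the QA-Agent evaluates them is not specified; the sketch below assumes each rule is an optional comparison operator followed by a number, with a bare number meaning "must equal". `gate_passes` and `failed_gates` are hypothetical helpers, and metric values are given on the same scale as the rule (85 for 85%).

```python
import operator
import re

QUALITY_GATES = {
    "unit_test_coverage": ">80%",
    "integration_test_coverage": ">60%",
    "critical_bugs": "0",
    "test_pass_rate": ">95%",
}

_OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le, "": operator.eq}
_RULE = re.compile(r"^(>=|<=|>|<)?\s*(\d+(?:\.\d+)?)%?$")


def gate_passes(rule: str, value: float) -> bool:
    """Evaluate one rule string such as '>80%' or '0' against a measured value."""
    match = _RULE.match(rule.strip())
    if match is None:
        raise ValueError(f"unsupported gate rule: {rule!r}")
    op = match.group(1) or ""
    return _OPERATORS[op](value, float(match.group(2)))


def failed_gates(metrics: dict, gates: dict = QUALITY_GATES) -> list[str]:
    """Return the names of all gates the measured metrics do not satisfy."""
    return [name for name, rule in gates.items() if not gate_passes(rule, metrics[name])]


measured = {"unit_test_coverage": 85, "integration_test_coverage": 55, "critical_bugs": 0, "test_pass_rate": 97}
print(failed_gates(measured))
```

Note that `>80%` is strict under this reading, so a coverage of exactly 80 fails the gate.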

4.7 DevOps Engineer Agent (DevOps-Agent)

Core responsibilities: CI/CD pipeline configuration, infrastructure as code, container orchestration, monitoring and alerting, security and compliance

# ~/.openclaw/agents/devops-agent.yaml
agent:
  name: DevOps-Agent
  role: Senior DevOps Engineer
  version: 1.0.0

personality:
  tone: reliable-and-security-focused
  expertise_level: senior
  years_of_experience: 10
  specialization:
    - CI/CD pipelines
    - Kubernetes operations
    - infrastructure as code
    - observability

capabilities:
  - pipeline-configuration
  - infrastructure-provisioning
  - container-orchestration
  - monitoring-setup
  - security-hardening
  - cost-optimization
  - disaster-recovery

tools:
  llm:
    provider: anthropic
    model: claude-sonnet-4-20250514
    temperature: 0.3
  
  mcp_servers:
    - github
    - jenkins
    - kubernetes
    - terraform
    - prometheus
  
  skills:
    - pipeline-builder
    - helm-chart-creator
    - terraform-writer
    - monitoring-configurator

infrastructure_tools:
  iac:
    - Terraform
    - Pulumi
  
  container:
    - Docker
    - Buildah
  
  orchestration:
    - Kubernetes
    - Helm
    - Kustomize
  
  ci_cd:
    - Jenkins
    - GitHub Actions
    - ArgoCD
  
  monitoring:
    - Prometheus
    - Grafana
    - ELK Stack

workflow:
  name: CI/CD Pipeline Configuration Workflow
  trigger:
    type: keyword
    patterns:
      - "配置流水线"
      - "setup pipeline"
      - "部署配置"
      - "deployment config"
  
  steps:
    - step: 1
      action: analyze-project-structure
      description: Analyze the project structure and build requirements

    - step: 2
      action: design-pipeline-stages
      description: Design the pipeline stages
      stages:
        - checkout
        - build
        - test
        - security-scan
        - package
        - deploy-dev
        - integration-test
        - deploy-staging
        - uat
        - deploy-prod

    - step: 3
      action: write-jenkinsfile
      description: Write the declarative Jenkinsfile pipeline

    - step: 4
      action: create-dockerfile
      description: Write an optimized Dockerfile (multi-stage build)

    - step: 5
      action: create-helm-charts
      description: Create the Helm Chart deployment package

    - step: 6
      action: configure-monitoring
      description: Configure monitoring and alerting rules

    - step: 7
      action: setup-gitops
      description: Configure the GitOps flow (ArgoCD)

    - step: 8
      action: test-pipeline
      description: Test the pipeline end to end

  human_in_loop:
    enabled: true
    approval_points:
      - production-deployment
      - infrastructure-changes
      - security-policy-changes

security_practices:
  image_scanning: Trivy / Snyk
  secret_management: HashiCorp Vault
  network_policy: Calico Network Policies
  rbac: Least Privilege Principle

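Chapter 5: CI/CD Pipeline and Automated Deployment

🎯 Chapter goal: configure a complete CI/CD pipeline that automates the full path from code commit to production deployment

5.1 Overall Pipeline Architecture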

📦 Stage 1: Code & Build

  • Git Checkout
  • Dependency Installation
  • Code Compilation
  • Artifact Generation

🧪 Stage 2: Test

  • Unit Tests
  • Integration Tests
  • Code Coverage Check
  • Static Code Analysis

🔒 Stage 3: Security

  • SAST (SonarQube)
  • Dependency Scan (Snyk)
  • Container Image Scan (Trivy)
  • Secret Detection

📦 Stage 4: Package

  • Docker Image Build
  • Image Tagging
  • Push to Registry
  • Helm Chart Package

🚀 Stage 5: Deploy

  • Deploy to Dev
  • Smoke Tests
  • Deploy to Staging
  • Integration Tests
  • Manual Approval Gate
  • Deploy to Production
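
The Deploy stage runs its steps strictly in order, and the Manual Approval Gate decides whether production is ever reached. The sketch below models that control flow only; it does not call Jenkins or Kubernetes. `run_steps` and `demo` are illustrative names, and every step is a stand-in callable that returns True on success.

```python
from typing import Callable, Optional

Step = tuple[str, Callable[[], bool]]


def run_steps(steps: list[Step]) -> tuple[list[str], Optional[str]]:
    """Run steps in order; return (completed step names, name of the step that stopped the run)."""
    completed: list[str] = []
    for name, action in steps:
        if not action():
            return completed, name  # stop at the first failing or rejected step
        completed.append(name)
    return completed, None


def demo(approved: bool) -> tuple[list[str], Optional[str]]:
    """Deploy stage from the list above, with the approval decision passed in."""
    steps: list[Step] = [
        ("Deploy to Dev", lambda: True),
        ("Smoke Tests", lambda: True),
        ("Deploy to Staging", lambda: True),
        ("Integration Tests", lambda: True),
        ("Manual Approval Gate", lambda: approved),
        ("Deploy to Production", lambda: True),
    ]
    return run_steps(steps)


print(demo(approved=False))
```

Treating the approval gate as just another step that can return False keeps the runner simple: a rejection and a failed smoke test stop the rollout the same way.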

5.2 Jenkins Pipeline Configuration in Detail

5.2.1 Declarative Pipeline Template

// Jenkinsfile (Declarative Pipeline)
pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
metadata:
  name: maven-pod
spec:
  containers:
  - name: maven
    image: maven:3.9-eclipse-temurin-17
    command:
    - cat
    tty: true
    volumeMounts:
    - name: maven-cache
      mountPath: /root/.m2
  - name: docker
    image: docker:24-dind
    securityContext:
      privileged: true
  volumes:
  - name: maven-cache
    persistentVolumeClaim:
      claimName: maven-cache-pvc
'''
        }
    }

    environment {
        DOCKER_REGISTRY = 'registry.example.com'
        IMAGE_NAME = "${env.APP_NAME}"
        IMAGE_TAG = "${env.BUILD_NUMBER}-${GIT_COMMIT.take(7)}"
        KUBECONFIG_PATH = '/home/jenkins/.kube/config'
    }

    parameters {
        string(name: 'APP_NAME', defaultValue: 'my-app', description: 'Application name')
        choice(name: 'DEPLOY_ENV', choices: ['dev', 'staging', 'prod'], description: 'Target deployment environment')
        booleanParam(name: 'SKIP_TESTS', defaultValue: false, description: 'Skip tests')
    }

    options {
        timeout(time: 60, unit: 'MINUTES')
        disableConcurrentBuilds()
        buildDiscarder(logRotator(numToKeepStr: '10'))
        timestamps()
    }

    triggers {
        pollSCM('*/5 * * * *')
        cron('H 2 * * *')
    }

    stages {
        stage('Checkout') {
            steps {
                container('maven') {
                    checkout scm
                    script {
                        env.GIT_COMMIT_SHORT = sh(script: 'git rev-parse --short HEAD', returnStdout: true).trim()
                    }
                }
            }
        }

        stage('Install Dependencies') {
            steps {
                container('maven') {
                    sh 'mvn dependency:go-offline -B'
                }
            }
        }

        stage('Build') {
            steps {
                container('maven') {
                    sh 'mvn clean package -DskipTests -B'
                }
            }
        }

        stage('Unit Test') {
            when {
                expression { return !params.SKIP_TESTS }
            }
            steps {
                container('maven') {
                    sh 'mvn test -B'
                }
            }
            post {
                always {
                    junit allowEmptyResults: true, testResults: '**/target/surefire-reports/*.xml'
                    publishCoverage adapters: [jacocoAdapter('**/target/site/jacoco/jacoco.xml')],
                                  sourceFileEncoding: 'UTF-8'
                }
            }
        }

        stage('Code Quality') {
            steps {
                container('maven') {
                    withSonarQubeEnv('SonarQube') {
                        sh 'mvn sonar:sonar -Dsonar.projectKey=${APP_NAME}'
                    }
                }
            }
        }

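        stage('Security Scan') {
            parallel {
                // OWASP Dependency-Check scans third-party dependencies (SCA); it is not SAST
                stage('Dependency Scan') {
                    steps {
                        container('maven') {
                            sh 'mvn org.owasp:dependency-check-maven:check -B'
                        }
                    }
                }
                stage('Container Scan') {
                    steps {
                        container('docker') {
                            // The docker:24-dind image does not ship trivy, so run it from its own image
                            sh '''
                            docker build -t ${IMAGE_NAME}:${IMAGE_TAG} .
                            docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \\
                              aquasec/trivy:latest image --exit-code 1 --severity HIGH,CRITICAL ${IMAGE_NAME}:${IMAGE_TAG}
                            '''
                        }
                    }
                }
            }
        }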

        stage('Build & Push Docker Image') {
            steps {
                container('docker') {
                    script {
                        docker.withRegistry("https://${DOCKER_REGISTRY}", 'docker-registry-credentials') {
                            def image = docker.build("${IMAGE_NAME}:${IMAGE_TAG}")
                            image.push()
                            image.push('latest')
                        }
                    }
                }
            }
        }

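        // NOTE: the stock maven image does not include kubectl or helm; these stages
        // assume the 'maven' container image has been extended with both tools.
        stage('Deploy to Environment') {
            stages {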
                stage('Deploy to Dev') {
                    when {
                        expression { return params.DEPLOY_ENV == 'dev' }
                    }
                    steps {
                        container('maven') {
                            sh '''
                            kubectl config use-context dev-cluster
                            helm upgrade --install ${APP_NAME} ./helm/${APP_NAME} \\
                              --namespace dev \\
                              --set image.tag=${IMAGE_TAG} \\
                              --set replicaCount=1 \\
                              --wait --timeout 5m
                            '''
                        }
                    }
                }

                stage('Deploy to Staging') {
                    when {
                        expression { return params.DEPLOY_ENV == 'staging' }
                    }
                    steps {
                        container('maven') {
                            sh '''
                            kubectl config use-context staging-cluster
                            helm upgrade --install ${APP_NAME} ./helm/${APP_NAME} \\
                              --namespace staging \\
                              --set image.tag=${IMAGE_TAG} \\
                              --set replicaCount=2 \\
                              --wait --timeout 5m
                            '''
                        }
                    }
                }

                stage('Deploy to Production') {
                    when {
                        expression { return params.DEPLOY_ENV == 'prod' }
                    }
                    steps {
                        input message: 'Approve production deployment?', ok: 'Deploy to Prod'
                        container('maven') {
                            sh '''
                            kubectl config use-context prod-cluster
                            helm upgrade --install ${APP_NAME} ./helm/${APP_NAME} \\
                              --namespace production \\
                              --set image.tag=${IMAGE_TAG} \\
                              --set replicaCount=3 \\
                              --wait --timeout 10m
                            '''
                        }
                    }
                }
            }
        }

        stage('Smoke Test') {
            steps {
                container('maven') {
                    sh '''
                    curl -f http://${APP_NAME}.${DEPLOY_ENV}.example.com/health || exit 1
                    '''
                }
            }
        }
    }

    post {
        always {
            // Archive first: cleanWs() deletes the workspace, including target/*.jar
            archiveArtifacts artifacts: '**/target/*.jar', allowEmptyArchive: true
            cleanWs()
        }
        success {
            echo 'Pipeline completed successfully! 🎉'
            slackSend(color: 'good', message: "Build ${env.BUILD_NUMBER} succeeded!")
        }
        failure {
            echo 'Pipeline failed! ❌'
            slackSend(color: 'danger', message: "Build ${env.BUILD_NUMBER} failed!")
        }
    }
}
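
The three deploy stages in the Jenkinsfile above differ only in cluster context, namespace, replica count, and timeout. As a minimal sketch of that mapping, the Python snippet below builds the same `helm upgrade --install` argument list from a lookup table. The `ENVIRONMENTS` table and the `helm_command` helper are illustrative names, not part of Jenkins, Helm, or OpenClaw; the values are copied from the stages above.

```python
# Per-environment settings copied from the three deploy stages above.
ENVIRONMENTS = {
    "dev": {"context": "dev-cluster", "namespace": "dev", "replicas": 1, "timeout": "5m"},
    "staging": {"context": "staging-cluster", "namespace": "staging", "replicas": 2, "timeout": "5m"},
    "prod": {"context": "prod-cluster", "namespace": "production", "replicas": 3, "timeout": "10m"},
}


def helm_command(app_name: str, deploy_env: str, image_tag: str) -> list[str]:
    """Build the helm argument list for one environment (hypothetical helper)."""
    cfg = ENVIRONMENTS[deploy_env]  # raises KeyError for an unknown environment
    return [
        "helm", "upgrade", "--install", app_name, f"./helm/{app_name}",
        "--namespace", cfg["namespace"],
        "--set", f"image.tag={image_tag}",
        "--set", f"replicaCount={cfg['replicas']}",
        "--wait", "--timeout", cfg["timeout"],
    ]
```

Keeping the differences in one table makes it harder for the three stages to drift apart; the manual approval gate before production would still live in the pipeline itself.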

5.3 KubeSphere DevOps Project Configuration

5.3.1 Create the DevOps Project

# Create the DevOps project with kubectl
cat > devops-project.yaml <

# Create credentials (Docker Registry, Git, Kubeconfig)
kubectl create secret docker-registry docker-registry-secret \
  --docker-server=registry.example.com \
  --docker-username=admin \
  --docker-password=YourPassword \
  -n rd-automation-pipeline

kubectl create secret generic git-credentials \
  --from-literal=username=gituser \
  --from-literal=password=YourGitPassword \
  -n rd-automation-pipeline

5.3.2 Configure the KubeSphere Pipeline

# KubeSphere pipeline YAML configuration
cat > kubesphere-pipeline.yaml <

5.4 Helm Chart Configuration

5.4.1 Helm Chart Directory Structure

helm/
└── my-app/
    ├── Chart.yaml          # Chart metadata
    ├── values.yaml         # Default values
    ├── values-dev.yaml     # Development environment values
    ├── values-staging.yaml # Staging environment values
    ├── values-prod.yaml    # Production environment values
    ├── templates/
    │   ├── _helpers.tpl    # Template helper functions
    │   ├── deployment.yaml # Deployment template
    │   ├── service.yaml    # Service template
    │   ├── ingress.yaml    # Ingress template
    │   ├── configmap.yaml  # ConfigMap template
    │   ├── secret.yaml     # Secret template
    │   └── hpa.yaml        # HPA autoscaling template
    └── charts/             # Subchart dependencies

5.4.2 Chart.yaml Configuration

# helm/my-app/Chart.yaml
apiVersion: v2
name: my-app
description: A Helm chart for RD Automation System Application
type: application
version: 1.0.0
appVersion: "1.0.0"
keywords:
  - microservice
  - spring-boot
  - api
maintainers:
  - name: DevOps Team
    email: devops@example.com
dependencies:
  - name: postgresql
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami
    condition: postgresql.enabled
  - name: redis
    version: 17.x.x
    repository: https://charts.bitnami.com/bitnami
    condition: redis.enabled

5.4.3 values-prod.yaml Production Configuration

# helm/my-app/values-prod.yaml
replicaCount: 3

image:
  repository: registry.example.com/my-app
  tag: latest
  pullPolicy: IfNotPresent

resources:
  limits:
    cpu: 1000m
    memory: 2Gi
  requests:
    cpu: 500m
    memory: 1Gi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

ingress:
  enabled: true
  className: nginx
  annotations:
    # The deprecated kubernetes.io/ingress.class annotation is omitted; className above selects the controller
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: api.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: my-app-tls
      hosts:
        - api.example.com

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 60
  periodSeconds: 10
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 5
  failureThreshold: 3

podDisruptionBudget:
  enabled: true
  minAvailable: 2

networkPolicy:
  enabled: true
  ingressRules:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080

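5.5 GitOps Configuration (ArgoCD)

# ArgoCD Application configuration
cat > argocd-app.yaml <

Chapter 6 Automated Testing

🎯 Chapter goal: build a complete automated testing system covering unit tests, integration tests, E2E tests, and UI automation tests

6.1 The Test Pyramid Model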

🔺
E2E tests (10%)
Playwright / Cypress
Integration tests (20%)
TestContainers / WireMock
Unit tests (70%)
JUnit / pytest / Jest
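
To make the 70/20/10 split checkable, the sketch below compares actual test counts per layer against the pyramid's target shares. It is only an illustration: the function names and the 5-percentage-point tolerance are assumptions, not something the toolchain above provides.

```python
# Target share of each test layer, taken from the pyramid above.
TARGET_SHARE = {"unit": 0.70, "integration": 0.20, "e2e": 0.10}


def pyramid_deviation(counts: dict[str, int]) -> dict[str, float]:
    """Return (actual share - target share) for each layer."""
    total = sum(counts.get(layer, 0) for layer in TARGET_SHARE)
    if total == 0:
        raise ValueError("no tests counted")
    return {
        layer: counts.get(layer, 0) / total - target
        for layer, target in TARGET_SHARE.items()
    }


def is_balanced(counts: dict[str, int], tolerance: float = 0.05) -> bool:
    """True when every layer is within `tolerance` of its target share."""
    return all(abs(dev) <= tolerance for dev in pyramid_deviation(counts).values())
```

A suite with 700 unit, 200 integration, and 100 E2E tests is balanced under this check, while one dominated by E2E tests is not.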

6.2 Unit Test Configuration

6.2.1 Java/JUnit 5 Configuration

<!-- pom.xml: test dependency configuration -->
<dependencies>
    <!-- JUnit 5 -->
    <dependency>
        <groupId>org.junit.jupiter</groupId>
        <artifactId>junit-jupiter</artifactId>
        <version>5.10.2</version>
        <scope>test</scope>
    </dependency>
    
    <!-- Mockito -->
    <dependency>
        <groupId>org.mockito</groupId>
        <artifactId>mockito-junit-jupiter</artifactId>
        <version>5.10.0</version>
        <scope>test</scope>
    </dependency>
    
    <!-- AssertJ -->
    <dependency>
        <groupId>org.assertj</groupId>
        <artifactId>assertj-core</artifactId>
        <version>3.25.3</version>
        <scope>test</scope>
    </dependency>
</dependencies>

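// Example test class
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

@ExtendWith(MockitoExtension.class)
class UserServiceTest {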
    
    @Mock
    private UserRepository userRepository;
    
    @InjectMocks
    private UserService userService;
    
    @Test
    @DisplayName("Should create user successfully")
    void shouldCreateUserSuccessfully() {
        // Given
        User newUser = new User("john@example.com", "John Doe");
        when(userRepository.save(any(User.class))).thenReturn(newUser);
        
        // When
        User created = userService.createUser(newUser);
        
        // Then
        assertThat(created.getEmail()).isEqualTo("john@example.com");
        verify(userRepository, times(1)).save(newUser);
    }
}

6.2.2 Python/pytest Configuration

# requirements-test.txt
pytest==8.1.1
pytest-cov==5.0.0
pytest-asyncio==0.23.6
pytest-mock==3.14.0
factory-boy==3.3.0

# pytest.ini
[pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts = 
    -v
    --cov=app
    --cov-report=html
    --cov-report=term-missing
    --cov-fail-under=80
    -m "not slow"
markers =
    slow: marks tests as slow
    integration: marks tests as integration tests
    e2e: marks tests as end-to-end tests

# Example test file: tests/test_user_service.py
import pytest
from unittest.mock import Mock
from app.services.user_service import UserService
from app.models.user import User

class TestUserService:
    
    @pytest.fixture
    def mock_repo(self):
        return Mock()
    
    @pytest.fixture
    def service(self, mock_repo):
        return UserService(mock_repo)
    
    def test_create_user_success(self, service, mock_repo):
        # Arrange
        user_data = {"email": "test@example.com", "name": "Test User"}
        expected_user = User(**user_data)
        mock_repo.save.return_value = expected_user
        
        # Act
        result = service.create_user(**user_data)
        
        # Assert
        assert result.email == "test@example.com"
        mock_repo.save.assert_called_once()

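6.3 Integration Test Configuration

6.3.1 TestContainers Configuration

// Java integration test example
import java.util.Optional;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;

import static org.assertj.core.api.Assertions.assertThat;

@Testcontainers
@SpringBootTest
class UserIntegrationTest {
    
    @Container
    static PostgreSQLContainer<?> postgres = 
        new PostgreSQLContainer<>("postgres:15-alpine")
            .withDatabaseName("testdb")
            .withUsername("test")
            .withPassword("test");
    
    // Testcontainers core has no Redis module, so a GenericContainer is used here
    @Container
    static GenericContainer<?> redis = 
        new GenericContainer<>(DockerImageName.parse("redis:7-alpine"))
            .withExposedPorts(6379);
    
    @DynamicPropertySource
    static void configureTestProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
        // On Spring Boot 3.x these keys are spring.data.redis.host / spring.data.redis.port
        registry.add("spring.redis.host", redis::getHost);
        registry.add("spring.redis.port", redis::getFirstMappedPort);
    }
    
    @Autowired
    private UserRepository userRepository;
    
    @Test
    void shouldSaveAndRetrieveUser() {
        // Test database operations
        User user = new User("test@example.com", "Test");
        userRepository.save(user);
        
        Optional<User> found = userRepository.findById(user.getId());
        assertThat(found).isPresent();
        assertThat(found.get().getEmail()).isEqualTo("test@example.com");
    }
}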

6.4 UI Automation Testing (Playwright)

6.4.1 Playwright Configuration

// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests/e2e',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  reporter: [
    ['html', { outputFolder: 'playwright-report' }],
    ['json', { outputFile: 'test-results/results.json' }],
    ['junit', { outputFile: 'test-results/junit.xml' }]
  ],
  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
    {
      name: 'firefox',
      use: { ...devices['Desktop Firefox'] },
    },
    {
      name: 'webkit',
      use: { ...devices['Desktop Safari'] },
    },
    {
      name: 'Mobile Chrome',
      use: { ...devices['Pixel 5'] },
    },
    {
      name: 'Mobile Safari',
      use: { ...devices['iPhone 12'] },
    },
  ],
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
});

6.4.2 E2E Test Example

// tests/e2e/user-flow.spec.ts
import { test, expect } from '@playwright/test';

test.describe('User Registration Flow', () => {
  test.beforeEach(async ({ page }) => {
    await page.goto('/');
  });

  test('should complete user registration successfully', async ({ page }) => {
    // Navigate to registration page
    await page.click('[data-testid="register-button"]');
    await expect(page).toHaveURL('/register');

    // Fill registration form
    await page.fill('[name="email"]', 'test@example.com');
    await page.fill('[name="password"]', 'SecurePass123!');
    await page.fill('[name="confirmPassword"]', 'SecurePass123!');
    await page.fill('[name="name"]', 'Test User');

    // Submit form
    await page.click('[type="submit"]');

    // Verify success message
    await expect(page.locator('[data-testid="success-message"]'))
      .toBeVisible();
    
    // Verify redirect to dashboard
    await expect(page).toHaveURL('/dashboard');
  });

  test('should show validation errors for invalid input', async ({ page }) => {
    await page.click('[data-testid="register-button"]');
    
    // Submit empty form
    await page.click('[type="submit"]');
    
    // Verify validation errors
    await expect(page.locator('[data-testid="email-error"]')).toBeVisible();
    await expect(page.locator('[data-testid="password-error"]')).toBeVisible();
  });

  test('should handle password mismatch', async ({ page }) => {
    await page.click('[data-testid="register-button"]');
    
    await page.fill('[name="email"]', 'test@example.com');
    await page.fill('[name="password"]', 'Pass123!');
    await page.fill('[name="confirmPassword"]', 'DifferentPass123!');
    
    await page.click('[type="submit"]');
    
    await expect(page.locator('[data-testid="password-match-error"]'))
      .toBeVisible();
  });
});

test.describe('User Login Flow', () => {
  test('should login successfully with valid credentials', async ({ page }) => {
    await page.goto('/login');
    
    await page.fill('[name="email"]', 'test@example.com');
    await page.fill('[name="password"]', 'SecurePass123!');
    await page.click('[type="submit"]');
    
    await expect(page.locator('[data-testid="welcome-message"]')).toBeVisible();
    await expect(page).toHaveURL('/dashboard');
  });

  test('should show error for invalid credentials', async ({ page }) => {
    await page.goto('/login');
    
    await page.fill('[name="email"]', 'wrong@example.com');
    await page.fill('[name="password"]', 'WrongPassword');
    await page.click('[type="submit"]');
    
    await expect(page.locator('[data-testid="login-error"]')).toBeVisible();
  });
});

6.4.3 Visual Regression Testing

// tests/e2e/visual-regression.spec.ts
import { test, expect } from '@playwright/test';

test.describe('Visual Regression Tests', () => {
  test('homepage should match baseline screenshot', async ({ page }) => {
    await page.goto('/');
    await expect(page).toHaveScreenshot('homepage.png', {
      maxDiffPixels: 100,
      fullPage: true,
    });
  });

  test('dashboard should match baseline screenshot', async ({ page }) => {
    await page.goto('/dashboard');
    await expect(page).toHaveScreenshot('dashboard.png', {
      maxDiffPixels: 150,
      fullPage: true,
    });
  });

  test('components should render correctly', async ({ page }) => {
    await page.goto('/components');
    
    const button = page.locator('[data-testid="primary-button"]');
    await expect(button).toHaveScreenshot('primary-button.png');
    
    const card = page.locator('[data-testid="info-card"]');
    await expect(card).toHaveScreenshot('info-card.png');
  });
});

6.5 Performance Test Configuration

6.5.1 k6 Load Test Script

// tests/performance/load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

// Custom metrics
const errorRate = new Rate('errors');
const apiLatency = new Trend('api_latency');

export const options = {
  stages: [
    { duration: '2m', target: 50 },   // Ramp up to 50 users
    { duration: '5m', target: 50 },   // Stay at 50 users
    { duration: '2m', target: 100 },  // Ramp up to 100 users
    { duration: '5m', target: 100 },  // Stay at 100 users
    { duration: '2m', target: 0 },    // Ramp down to 0 users
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests should be below 500ms
    http_req_failed: ['rate<0.01'],   // Error rate should be less than 1%
    errors: ['rate<0.1'],             // Custom error rate threshold
    api_latency: ['p(90)<300'],       // API latency p90 < 300ms
  },
};

export default function () {
  // Test homepage
  let res = http.get('http://localhost:3000/');
  check(res, {
    'homepage status is 200': (r) => r.status === 200,
    'homepage has content': (r) => r.body.includes('Welcome'),
  });
  errorRate.add(res.status !== 200);
  apiLatency.add(res.timings.duration);
  
  sleep(1);

  // Test API endpoint
  res = http.get('http://localhost:8000/api/v1/users', {
    headers: { 'Authorization': 'Bearer test-token' },
  });
  check(res, {
    'api status is 200': (r) => r.status === 200,
  });
  errorRate.add(res.status !== 200);
  apiLatency.add(res.timings.duration);
  
  sleep(1);
}
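
The `thresholds` block in the script above turns raw measurements into a pass/fail result. As a rough illustration of what `p(95)<500` and `rate<0.01` mean, the Python sketch below evaluates the same two conditions over a list of request durations. It uses the simple nearest-rank percentile, which is an assumption and may differ slightly from how k6 computes percentiles; the function names are hypothetical.

```python
import math


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_thresholds(durations_ms: list[float], failed: int, total: int) -> bool:
    """Mirror http_req_duration p(95)<500 and http_req_failed rate<0.01."""
    latency_ok = percentile(durations_ms, 95) < 500
    failure_ok = (failed / total) < 0.01
    return latency_ok and failure_ok
```

A run whose 95th-percentile latency stays under 500 ms with fewer than 1% failed requests passes; either condition failing on its own fails the run.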

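6.6 Test Reports and Quality Gates

#!/bin/bash
# Generate the combined test report and enforce the quality gate
set -euo pipefail

echo "📊 Generating Comprehensive Test Report..."

# Run unit tests and generate coverage reports (the JSON report feeds the gate below)
pytest --cov=app --cov-report=html --cov-report=xml --cov-report=json

# Run E2E tests
npx playwright test --reporter=html

# Run performance tests
k6 run tests/performance/load-test.js

# Generate the Allure report
allure generate --clean -o allure-report

# Quality gate: read the total coverage percentage from coverage.json
COVERAGE=$(python3 -c "import json; print(json.load(open('coverage.json'))['totals']['percent_covered'])")
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
    echo "❌ Code coverage ($COVERAGE%) is below threshold (80%)"
    exit 1
fi

echo "✅ All quality gates passed!"
echo "📈 Test Report available at: allure-report/index.html"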
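
The coverage gate can also be expressed without `grep` and `bc`. The sketch below assumes the report is the JSON file written by coverage.py (for example via `--cov-report=json`), where the overall figure sits under `totals.percent_covered`; the function names are illustrative.

```python
import json

COVERAGE_THRESHOLD = 80.0  # percent, the same threshold as the shell gate above


def total_coverage(report_text: str) -> float:
    """Extract the overall coverage percentage from a coverage.py JSON report."""
    report = json.loads(report_text)
    return float(report["totals"]["percent_covered"])


def passes_gate(percent: float, threshold: float = COVERAGE_THRESHOLD) -> bool:
    """The gate fails only when coverage is strictly below the threshold."""
    return percent >= threshold
```

In a pipeline, `total_coverage(open("coverage.json").read())` would supply the value, and a failing gate would translate into a non-zero exit code.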

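Chapter 7 Human-AI Collaboration

🎯 Chapter goal: define the boundary between autonomous AI execution and human review, and establish an efficient human-AI collaboration workflow

7.1 Human-AI Decision Matrix

Task type | AI autonomy | Human intervention points | Approval required
Code generation | High (80% autonomous) | PR review, before merge | Tech Lead approval
Unit tests | High (90% autonomous) | Coverage threshold check | Automatic pass
Database changes | Medium (50% autonomous) | Schema review, migration script review | Dual approval by DBA and Tech Lead
Production deployment | Low (20% autonomous) | Pre-deployment approval, canary release decision | Change advisory board approval
Security fixes | Medium (60% autonomous) | Vulnerability assessment, fix plan review | Security team approval
Architecture changes | Low (30% autonomous) | Architecture review meeting, RFC document review | Architecture board approval

7.2 Human Approval Workflow Configuration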

# ~/.openclaw/human-in-loop-config.yaml
human_approval:
  enabled: true
  
  approval_channels:
    - type: feishu
      webhook: ${FEISHU_WEBHOOK_URL}
      mention_users:
        - tech_lead
        - product_owner
    
    - type: email
      smtp_server: smtp.example.com
      recipients:
        - approvals@example.com
    
    - type: slack
      webhook: ${SLACK_WEBHOOK_URL}
      channel: "#approvals"

  approval_rules:
    - trigger: production_deployment
      required_approvers: 2
      approvers_from:
        - role: tech_lead
        - role: product_owner
      timeout_minutes: 60
      escalation: notify_cto
    
    - trigger: database_schema_change
      required_approvers: 2
      approvers_from:
        - role: dba
        - role: tech_lead
      timeout_minutes: 30
    
    - trigger: security_patch
      required_approvers: 1
      approvers_from:
        - role: security_team
      timeout_minutes: 15
      priority: high
    
    - trigger: pr_merge_to_main
      required_approvers: 1
      approvers_from:
        - role: code_owner
      timeout_minutes: 120

  notification_templates:
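    approval_request: |
      🔔 Your approval is required
      
      **Task type**: {{task_type}}
      **Requester**: {{requester}}
      **Description**: {{description}}
      
      [View details]({{detail_url}})
      
      Please reply with:
      ✅ Approve
      ❌ Reject + reason
    
    approval_timeout: |
      ⚠️ Approval timeout reminder
      
      **Task**: {{task_type}}
      **Waiting time**: {{elapsed_minutes}} minutes
      **Current status**: waiting for approval from {{pending_approvers}}
      
      Please act on this request soon or escalate it.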
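
How an `approval_rules` entry might be evaluated can be sketched in a few lines of Python. This is an assumption about the semantics (count distinct approvers whose role is listed under `approvers_from`, then compare with `required_approvers` and `timeout_minutes`), not OpenClaw's actual implementation; `ApprovalRule` and `evaluate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalRule:
    trigger: str
    required_approvers: int
    roles: tuple[str, ...]      # roles listed under approvers_from
    timeout_minutes: int


def evaluate(rule: ApprovalRule, approvals: dict[str, str], elapsed_minutes: int) -> str:
    """Return 'approved', 'timed_out', or 'pending'.

    `approvals` maps approver name -> role for everyone who has approved so far.
    """
    valid = {name for name, role in approvals.items() if role in rule.roles}
    if len(valid) >= rule.required_approvers:
        return "approved"
    if elapsed_minutes >= rule.timeout_minutes:
        return "timed_out"  # the point at which escalation would start
    return "pending"


# The production_deployment rule from the configuration above.
PRODUCTION_RULE = ApprovalRule(
    trigger="production_deployment",
    required_approvers=2,
    roles=("tech_lead", "product_owner"),
    timeout_minutes=60,
)
```

Approvals from people outside the listed roles are ignored in this sketch, so a single tech lead plus an unrelated reviewer still leaves the request pending.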

7.3 Interface for AI Suggestions and Human Decisions

# Human-AI collaboration settings for the OpenClaw Web UI
cat > ~/openclaw/ui/collaboration-config.json <

7.4 Exception Handling and Escalation

# ~/.openclaw/escalation-policy.yaml
escalation_policy:
  levels:
    - level: 1
      name: Team Lead
      response_time_minutes: 30
      notification_methods:
        - instant_message
        - email
    
    - level: 2
      name: Engineering Manager
      response_time_minutes: 60
      notification_methods:
        - phone_call
        - sms
    
    - level: 3
      name: CTO / VP Engineering
      response_time_minutes: 120
      notification_methods:
        - phone_call
        - emergency_alert

  escalation_triggers:
    - condition: approval_timeout
      timeout_multiplier: 2
      max_escalation_level: 3
    
    - condition: production_incident
      immediate_escalation_to: 2
      auto_create_incident_ticket: true
    
    - condition: security_breach_detected
      immediate_escalation_to: 3
      auto_lockdown: true
      notify_security_team: true

  auto_remediation:
    enabled: true
    actions:
      - trigger: deployment_failure
        action: rollback_to_previous_version
        notify: true
      
      - trigger: health_check_failure
        action: restart_container
        max_retries: 3
      
      - trigger: high_error_rate
        action: enable_circuit_breaker
        threshold: error_rate > 5%
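
One possible reading of the escalation levels above is that each level gets its `response_time_minutes` to react before the request moves up, capped at `max_escalation_level`. The sketch below implements that reading only; the cumulative-deadline interpretation and the names `LEVELS` and `current_level` are assumptions, and the `timeout_multiplier` setting is deliberately left out because its exact meaning is not specified here.

```python
# (level, name, response_time_minutes) copied from the escalation policy above.
LEVELS = [
    (1, "Team Lead", 30),
    (2, "Engineering Manager", 60),
    (3, "CTO / VP Engineering", 120),
]


def current_level(minutes_waiting: int, max_level: int = 3) -> int:
    """Return the level responsible after `minutes_waiting` without a response."""
    deadline = 0
    for level, _name, response_time in LEVELS:
        deadline += response_time  # each level's window starts when the previous one ends
        if minutes_waiting < deadline or level >= max_level:
            return level
    return LEVELS[-1][0]
```

Under this reading, level 1 owns the first 30 minutes, level 2 the next 60, and level 3 everything after 90 minutes.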

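Chapter 8 Project Initialization Walkthrough

🎯 Chapter goal: walk through a complete project initialization case to show how the end-to-end R&D automation system is used in practice

8.1 Project Kickoff: From Requirements to PRD

# Step 1: start the project with a natural-language request
# Enter the following in Feishu or the OpenClaw Control UI:

"I want to build an online store system. Its main features are:
1. User registration and login
2. Product browsing and search
3. Shopping cart and order management
4. Payment integration
5. Back-office administration system

Please generate the PRD document and the project skeleton for me."

# The PM-Agent responds automatically and starts the requirements analysis flow

8.2 Automated Project Scaffolding

# Step 2: trigger scaffold generation
cd ~/projects
claude "create a new e-commerce project with: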
- Backend: FastAPI + PostgreSQL
- Frontend: Next.js 14 + TypeScript + TailwindCSS
- Docker containerization
- Kubernetes deployment manifests
- CI/CD pipeline configuration"

# Claude Code then generates a project structure like the following:
ecommerce-platform/
├── backend/
│   ├── app/
│   │   ├── main.py
│   │   ├── api/
│   │   ├── models/
│   │   ├── services/
│   │   └── tests/
│   ├── Dockerfile
│   ├── requirements.txt
│   └── pytest.ini
├── frontend/
│   ├── src/
│   │   ├── app/
│   │   ├── components/
│   │   ├── hooks/
│   │   └── __tests__/
│   ├── public/
│   ├── Dockerfile
│   └── next.config.js
├── infra/
│   ├── kubernetes/
│   │   ├── deployment.yaml
│   │   ├── service.yaml
│   │   └── ingress.yaml
│   └── helm/
│       └── ecommerce/
├── .github/
│   └── workflows/
│       └── ci-cd.yml
├── docker-compose.yml
├── README.md
└── docs/
    ├── architecture.md
    └── api-spec.yaml

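8.3 One-Step Initialization of the Full Project

# Step 3: run the project initialization script
cd ecommerce-platform
./scripts/init-project.sh

# Contents of scripts/init-project.sh
#!/bin/bash
set -e

echo "🚀 Initializing E-commerce Platform Project..."

# 1. Set up the backend virtual environment
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd ..

# 2. Install frontend dependencies
cd frontend
pnpm install
cd ..

# 3. Initialize the database
docker compose up -d postgres
# Wait until PostgreSQL accepts connections rather than sleeping a fixed 5 seconds
until docker compose exec -T postgres pg_isready -U postgres > /dev/null 2>&1; do
    sleep 1
done
docker compose exec -T postgres psql -U postgres -c "CREATE DATABASE ecommerce;"

# 4. Run database migrations
cd backend
alembic upgrade head
cd ..

# 5. Run tests
cd backend && pytest && cd ..
cd frontend && pnpm test && cd ..

# 6. Build Docker images
docker compose build

# 7. Start the development environment
docker compose up -d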

echo "✅ Project initialization complete!"
echo "📱 Frontend: http://localhost:3000"
echo "🔧 Backend API: http://localhost:8000"
echo "📚 API Docs: http://localhost:8000/docs"

8.4 Configure the CI/CD Pipeline

# Step 4: configure the pipeline in Jenkins/KubeSphere

# 4.1 Create the Jenkins registry credentials
kubectl create secret docker-registry docker-hub-creds \
  --docker-server=docker.io \
  --docker-username=$DOCKER_USERNAME \
  --docker-password=$DOCKER_PASSWORD \
  -n rd-automation

# 4.2 Create the GitHub credentials
kubectl create secret generic github-creds \
  --from-literal=username=$GITHUB_USER \
  --from-literal=password=$GITHUB_TOKEN \
  -n rd-automation

# 4.3 Import the Jenkinsfile
# In the KubeSphere UI:
# 1. Open the DevOps project
# 2. Create a new pipeline
# 3. Choose "Import from SCM"
# 4. Configure the Git repository URL
# 5. Set the Jenkinsfile path: Jenkinsfile

# 4.4 Configure a webhook for automatic triggering
# In the GitHub repository, under Settings -> Webhooks, add:
# Payload URL: http://jenkins.example.com/github-webhook/
# Content type: application/json
# Events: Push events, Pull Request events

8.5 First Full Pipeline Run

# Step 5: trigger the first full pipeline run

# 5.1 Push the code to the Git repository
cd ecommerce-platform
git init
git branch -M main   # make sure the branch is named main, whatever the local default is
git remote add origin https://github.com/your-org/ecommerce-platform.git
git add .
git commit -m "feat: initial project scaffold generated by AI"
git push -u origin main

# 5.2 The pipeline is triggered automatically and runs these steps:
# ✓ Checkout code from Git
# ✓ Install dependencies (backend & frontend)
# ✓ Run unit tests and collect coverage
# ✓ Static code analysis (SonarQube)
# ✓ Security scanning (SAST + container scan)
# ✓ Build Docker images
# ✓ Push images to registry
# ✓ Deploy to development environment
# ✓ Run smoke tests
# ✓ Generate deployment report

# 5.3 Check the pipeline status
# Open the KubeSphere UI: http://kubesphere.example.com:30880
# Navigate to: DevOps project -> Pipelines -> rd-automation-pipeline

# 5.4 Check the deployment result
kubectl get pods -n dev
kubectl get svc -n dev
curl http://ecommerce-api.dev.example.com/health

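8.6 Acceptance Testing and Go-Live

# Step 6: run acceptance tests

# 6.1 Run E2E tests
cd ecommerce-platform/frontend
npx playwright test --project=chromium

# 6.2 Run performance tests
k6 run ../tests/performance/scenario-checkout.js

# 6.3 Generate the acceptance report
./scripts/generate-acceptance-report.sh

# Step 7: production deployment (requires human approval)

# 7.1 Create the production deployment request
claude "create production deployment request for ecommerce-platform v1.0.0"

# 7.2 The approval flow starts automatically
# - The Tech Lead receives an approval notification in Feishu
# - The Product Owner receives an approval notification by email
# - Once both have approved, the production deployment runs automatically

# 7.3 Check deployment progress
kubectl rollout status deployment/ecommerce-api -n production
kubectl rollout status deployment/ecommerce-frontend -n production

# 7.4 Verify the production environment
curl https://api.ecommerce-example.com/health
curl https://www.ecommerce-example.com/

# 🎉 The project is live!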

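Chapter 9 Troubleshooting and Best Practices

🎯 Chapter goal: provide solutions to common problems and best practices for operating the system

9.1 Troubleshooting Common Problems

9.1.1 OpenClaw Fails to Start

# Problem: the OpenClaw Gateway fails to start

# Diagnostic steps:
# 1. Check the logs
tail -f ~/.openclaw/logs/openclaw.log

# 2. Check whether the port is in use
lsof -i :8080

# 3. Check that the API key is configured (without printing the secret itself)
[ -n "$ANTHROPIC_API_KEY" ] && echo "ANTHROPIC_API_KEY is set" || echo "ANTHROPIC_API_KEY is missing"

# Solutions:
# - Make sure the API key is valid and has not expired
# - Free the occupied port or change the configured port
# - Reinstall dependencies: pnpm install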

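9.1.2 Claude Code Authentication Failure

# Problem: Claude Code reports an authentication error

# Diagnose:
claude auth status

# Re-authenticate:
claude auth
# Enter the new API key when prompted

# Check the configuration file:
cat ~/.claude-code/settings.json | grep api_key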

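9.1.3 Kubernetes Deployment Failure

# Problem: a Pod stays in Pending or CrashLoopBackOff

# Diagnostic steps:
# 1. Check Pod status
kubectl get pods -n dev

# 2. Inspect the Pod's events
kubectl describe pod <pod-name> -n dev

# 3. View container logs
kubectl logs <pod-name> -n dev

# 4. Check resource quotas
kubectl describe quota -n dev

# Common fixes:
# - Insufficient resources: adjust requests/limits
# - Image pull failure: check imagePullSecrets
# - Misconfiguration: check the ConfigMap/Secret
# - Failing health checks: tune the probe parameters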

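9.1.4 Jenkins Pipeline Hangs

# Problem: a Jenkins Pipeline is unresponsive for a long time

# Diagnose:
# 1. View the build log
# Open: http://jenkins.example.com:30080/job/<job-name>/<build-number>/console

# 2. Check agent status
kubectl get pods -n jenkins

# 3. Check disk space
kubectl exec -it <jenkins-pod-name> -n jenkins -- df -h

# Solutions:
# - Clean the workspace: cleanWs()
# - Increase the timeout: timeout(time: 120, unit: 'MINUTES')
# - Restart the Jenkins Pod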

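9.2 Performance Optimization Best Practices

9.2.1 AI Inference Performance

# Recommendations:

# 1. Use streaming responses to reduce perceived wait time
export CLAUDE_STREAMING=true

# 2. Set temperature and max_tokens to suit the task
# - Code generation: temperature=0.3, max_tokens=4096
# - Creative writing: temperature=0.8, max_tokens=2048
# - Data analysis: temperature=0.1, max_tokens=8192

# 3. Enable the local cache
export OPENCLAW_CACHE_ENABLED=true
export OPENCLAW_CACHE_TTL=3600

# 4. Batch requests
# Avoid frequent small requests; accumulate them and send them in batches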

9.2.2 Speeding Up the CI/CD Pipeline

# Optimizations:

# 1. Run stages in parallel
parallel {
    stage('Backend Tests') { ... }
    stage('Frontend Tests') { ... }
    stage('Security Scan') { ... }
}

# 2. Use a build cache
pipeline {
    agent {
        kubernetes {
            yaml '''
            volumes:
            - name: maven-cache
              persistentVolumeClaim:
                claimName: maven-cache-pvc
            '''
        }
    }
}

# 3. Run stages conditionally
when {
    changeset "**/backend/**"
}

# 4. Optimize Docker image builds
# - Use multi-stage builds
# - Improve layer caching (COPY dependency manifests first)
# - Use smaller base images (alpine/distroless)

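9.3 Security Best Practices

# Security hardening checklist:

# 1. Secret management
# ✓ Store all sensitive values in Secrets
# ✓ Use an external secret manager (Vault/AWS Secrets Manager)
# ✓ Rotate keys regularly (every 90 days)

# 2. Network isolation
# ✓ Enable NetworkPolicy to restrict Pod-to-Pod traffic
# ✓ Deploy workloads in private subnets
# ✓ Configure a WAF to protect against web attacks

# 3. Image security
# ✓ Scan all images (Trivy/Snyk)
# ✓ Allow only trusted registries
# ✓ Use immutable image tags

# 4. Access control
# ✓ Apply the principle of least privilege (RBAC)
# ✓ Enable audit logging
# ✓ Configure multi-factor authentication (MFA)

# 5. AI safety
# ✓ Enable command sandbox mode
# ✓ Configure a command allowlist
# ✓ Log every operation executed by the AI
# ✓ Review AI behavior logs regularly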

9.4 Monitoring and Alerting Configuration

# Prometheus alerting rules
cat > prometheus-alerts.yaml < 0.05
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "High error rate detected"
      description: "Error rate is {{ \$value }}% for {{ \$labels.job }}"
  
  - alert: PodCrashLooping
    expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Pod {{ \$labels.pod }} is crash looping"
  
  - alert: HighLatency
    expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High latency detected"
      description: "P95 latency is {{ \$value }}s"
  
  - alert: LowDiskSpace
    expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) < 0.1
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: "Low disk space on {{ \$labels.instance }}"

  - alert: AIAPIRateLimitExceeded
    expr: rate(anthropic_api_calls_total{status="rate_limited"}[5m]) > 0
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "AI API rate limit exceeded"
EOF

kubectl apply -f prometheus-alerts.yaml -n monitoring
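
Since the file is applied with `kubectl`, the rules have to sit inside a Kubernetes resource. A minimal sketch follows, assuming the Prometheus Operator's `PrometheusRule` resource that kube-prometheus-stack installs. The resource name, group name, and the `http_requests_total` metric with its `status` label are hypothetical. The `release: prometheus` label assumes the Helm release is named `prometheus` and the chart's default rule selector is in use.

```bash
# Write an illustrative PrometheusRule manifest to the given path.
# The heredoc is quoted, so {{ $value }}-style templates would need no escaping.
write_prometheus_rule() {
    cat > "${1:-prometheus-rule.yaml}" <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: rd-automation-alerts        # hypothetical name
  labels:
    release: prometheus             # assumed Helm release name
spec:
  groups:
  - name: rd-automation.rules       # hypothetical group name
    rules:
    - alert: HighErrorRate          # illustrative rule; metric and labels are assumed
      expr: |
        sum(rate(http_requests_total{status=~"5.."}[5m])) by (job)
          / sum(rate(http_requests_total[5m])) by (job) > 0.05
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "High error rate detected"
EOF
}
```

After `write_prometheus_rule rule.yaml`, the file could be applied with `kubectl apply -f rule.yaml -n monitoring`.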

9.5 Backup and Disaster Recovery

# Backup strategy

# 1. Database backup (daily at 02:00)
cat > backup-cronjob.yaml <<EOF
# ... (CronJob header and the start of the dump command elided)
              ... > /backup/ecommerce-\$(date +%Y%m%d).sql.gz
              aws s3 cp /backup/ecommerce-\$(date +%Y%m%d).sql.gz s3://backup-bucket/postgres/
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
            volumeMounts:
            - name: backup-volume
              mountPath: /backup
          volumes:
          - name: backup-volume
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure
EOF

kubectl apply -f backup-cronjob.yaml

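# 2. Configuration backup
# - Store all K8s YAML files in Git
# - Keep Helm values under version control
# - Export the cluster configuration regularly

# 3. Disaster recovery drills (once per quarter)
# - Simulate a database failure and verify restore from backup
# - Simulate a regional outage and verify multi-active failover
# - Simulate an AI service outage and verify the degradation plan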
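
A restore drill starts with a backup file that is actually readable. The sketch below covers only that first check. `verify_backup` is a hypothetical helper, and a full drill should also restore the dump into a scratch database.

```bash
# Check that a gzip-compressed dump exists, is non-empty, and passes gzip's integrity test.
# Prints "ok: <file>" on success; prints a reason to stderr and returns 1 otherwise.
verify_backup() {
    file="$1"
    [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
    gzip -t "$file" 2>/dev/null || { echo "corrupt archive: $file" >&2; return 1; }
    echo "ok: $file"
}
```

Running it against the newest file in the backup directory after each job gives an early warning before a drill or a real incident.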

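Appendix: Configuration Templates and Scripts

📎 In this appendix: complete templates for all configuration files, plus utility scripts

A.1 Complete Environment Variable Template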

# .env.template

# ==================== AI Service Configuration ====================
ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxxxxxxxxxx
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxx
CLAUDE_MODEL=claude-sonnet-4-20250514
CLAUDE_MAX_TOKENS=8192
CLAUDE_TEMPERATURE=0.7

# ==================== OpenClaw Configuration ====================
OPENCLAW_HOME=/home/user/openclaw
OPENCLAW_PORT=8080
OPENCLAW_UI_PORT=3000
OPENCLAW_LOG_LEVEL=info
OPENCLAW_SANDBOX_MODE=strict

# ==================== Git Configuration ====================
GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx
GITHUB_USERNAME=your-username
GIT_AUTHOR_NAME=RD-Automation
GIT_AUTHOR_EMAIL=automation@example.com

# ==================== Docker Configuration ====================
DOCKER_REGISTRY=registry.example.com
DOCKER_USERNAME=admin
DOCKER_PASSWORD=YourPassword

# ==================== Kubernetes Configuration ====================
KUBECONFIG=/home/user/.kube/config
K8S_CONTEXT=production
DEFAULT_NAMESPACE=default

# ==================== Jenkins Configuration ====================
JENKINS_URL=http://jenkins.example.com:30080
JENKINS_USER=admin
JENKINS_TOKEN=11xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# ==================== Notification Service Configuration ====================
FEISHU_WEBHOOK_URL=https://open.feishu.cn/open-apis/bot/v2/hook/xxxxx
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxxxx
SMTP_SERVER=smtp.example.com
SMTP_PORT=587
SMTP_USERNAME=notifications@example.com
SMTP_PASSWORD=YourPassword

# ==================== Monitoring Configuration ====================
PROMETHEUS_URL=http://prometheus.example.com:9090
GRAFANA_URL=http://grafana.example.com:3000
GRAFANA_API_KEY=eyJrIjoiXXXXX

# ==================== Database Configuration ====================
DATABASE_HOST=localhost
DATABASE_PORT=5432
DATABASE_NAME=rd_automation
DATABASE_USER=postgres
DATABASE_PASSWORD=YourPassword

# ==================== Redis Configuration ====================
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=YourPassword
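
A template like this is easy to copy and then forget to fill in. The sketch below flags variables that are missing or still hold a template placeholder. `check_env_file` is a hypothetical helper, and its placeholder patterns simply mirror the sample values in the template.

```bash
# Report variables in a .env-style file that are unset or still hold a template placeholder.
# Usage: check_env_file <file> <NAME>...   Returns 1 if any listed variable fails the check.
check_env_file() {
    env_file="$1"; shift
    status=0
    for name in "$@"; do
        # Take the value of the first NAME=... line, if any
        value=$(sed -n "s/^${name}=//p" "$env_file" | head -n 1)
        case "$value" in
            ""|*xxxxx*|*XXXXX*|YourPassword|your-username)
                echo "unset or placeholder: $name" >&2
                status=1 ;;
        esac
    done
    return "$status"
}
```

For example, `check_env_file ~/.openclaw/.env ANTHROPIC_API_KEY JENKINS_TOKEN` could run before any deployment step that depends on those values.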

A.2 Quick Deployment Script

#!/bin/bash
# deploy-all.sh - one-step deployment of the full system
set -e

echo "🚀 Starting Full System Deployment..."

# Color definitions
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Prerequisite checks
check_prerequisites() {
    echo -e "${YELLOW}Checking prerequisites...${NC}"
    
    # kind and git are listed because later steps call them
    commands=("docker" "kubectl" "helm" "kind" "git" "pnpm" "python3")
    for cmd in "${commands[@]}"; do
        if ! command -v "$cmd" &> /dev/null; then
            echo -e "${RED}Error: $cmd is not installed${NC}"
            exit 1
        fi
    done
    
    echo -e "${GREEN}✓ All prerequisites met${NC}"
}

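# Create the Kubernetes cluster
deploy_k8s_cluster() {
    echo -e "${YELLOW}Creating Kubernetes cluster...${NC}"
    # Create the cluster only if it does not exist yet, so real failures are not masked
    if ! kind get clusters 2>/dev/null | grep -qx "rd-automation"; then
        kind create cluster --config kind-config.yaml --name rd-automation
    fi
    echo -e "${GREEN}✓ Kubernetes cluster ready${NC}"
}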

# Deploy KubeSphere
deploy_kubesphere() {
    echo -e "${YELLOW}Deploying KubeSphere...${NC}"
    kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v4.2.1/kubesphere-installer.yaml
    kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v4.2.1/cluster-configuration.yaml
    
    echo "Waiting for KubeSphere installation..."
    # The installer manifest names its Deployment ks-installer
    kubectl rollout status deployment ks-installer -n kubesphere-system --timeout=600s
    echo -e "${GREEN}✓ KubeSphere deployed${NC}"
}

# Deploy Jenkins
deploy_jenkins() {
    echo -e "${YELLOW}Deploying Jenkins...${NC}"
    helm repo add jenkinsci https://charts.jenkins.io
    helm repo update
    helm upgrade --install jenkins jenkinsci/jenkins \
      --namespace jenkins --create-namespace \
      -f jenkins-values.yaml \
      --wait --timeout 10m
    echo -e "${GREEN}✓ Jenkins deployed${NC}"
}

# Deploy the monitoring stack
deploy_monitoring() {
    echo -e "${YELLOW}Deploying monitoring stack...${NC}"
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update
    helm upgrade --install prometheus prometheus-community/kube-prometheus-stack \
      --namespace monitoring --create-namespace \
      -f prometheus-values.yaml \
      --wait
    echo -e "${GREEN}✓ Monitoring stack deployed${NC}"
}

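# Install OpenClaw
install_openclaw() {
    echo -e "${YELLOW}Installing OpenClaw...${NC}"
    # Run in a subshell so the script's working directory is left unchanged
    (
        cd ~
        # Clone only on the first run; reuse an existing checkout afterwards
        [ -d openclaw ] || git clone https://github.com/OpenClawHQ/openclaw.git
        cd openclaw
        pnpm install
        pnpm build
        pnpm run init
    )
    echo -e "${GREEN}✓ OpenClaw installed${NC}"
}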

# Configure environment variables
setup_environment() {
    echo -e "${YELLOW}Setting up environment variables...${NC}"
    mkdir -p ~/.openclaw
    if [ ! -f ~/.openclaw/.env ]; then
        cp .env.template ~/.openclaw/.env
        echo -e "${YELLOW}Please edit ~/.openclaw/.env with your actual values${NC}"
    fi
    # set -a exports every variable the file defines, so child processes see them
    set -a
    source ~/.openclaw/.env
    set +a
    echo -e "${GREEN}✓ Environment configured${NC}"
}

# Main flow
main() {
    check_prerequisites
    deploy_k8s_cluster
    deploy_kubesphere
    deploy_jenkins
    deploy_monitoring
    install_openclaw
    setup_environment
    
    echo ""
    echo -e "${GREEN}========================================${NC}"
    echo -e "${GREEN}🎉 Deployment Complete!${NC}"
    echo -e "${GREEN}========================================${NC}"
    echo ""
    echo "📊 KubeSphere Console: http://$(hostname -I | awk '{print $1}'):30880"
    echo "🔧 Jenkins: http://$(hostname -I | awk '{print $1}'):30080"
    echo "📈 Grafana: http://$(hostname -I | awk '{print $1}'):30900"
    echo "🤖 OpenClaw UI: http://localhost:3000"
    echo ""
    echo "Default credentials:"
    echo "  KubeSphere: admin / P@88w0rd"
    echo "  Jenkins: Check initial password with:"
    echo "    kubectl exec -it -n jenkins \$(kubectl get pod -n jenkins -l app.kubernetes.io/component=jenkins-controller -o jsonpath='{.items[0].metadata.name}') -- cat /var/jenkins_home/secrets/initialAdminPassword"
}

main "$@"

A.3 System Health Check Script

#!/bin/bash
# health-check.sh - checks the health of the deployed system

echo "🏥 Running System Health Check..."
echo ""

checks_passed=0
checks_failed=0

check_service() {
    local name=$1
    local url=$2
    
    if curl -f -s -o /dev/null "$url"; then
        echo "✅ $name: UP"
        ((checks_passed++))
    else
        echo "❌ $name: DOWN"
        ((checks_failed++))
    fi
}

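check_k8s_resources() {
    local resource=$1
    local namespace=$2
    local ns_args=()
    local scope="(cluster-scoped)"
    
    # Pass -n only when a namespace is given; nodes are cluster-scoped
    if [ -n "$namespace" ]; then
        ns_args=(-n "$namespace")
        scope="in $namespace"
    fi
    
    count=$(kubectl get "$resource" "${ns_args[@]}" --no-headers 2>/dev/null | wc -l)
    if [ "$count" -gt 0 ]; then
        echo "✅ $resource $scope: $count found"
        ((checks_passed++))
    else
        echo "❌ $resource $scope: NOT FOUND"
        ((checks_failed++))
    fi
}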

echo "=== Service Health ==="
check_service "KubeSphere" "http://localhost:30880"
check_service "Jenkins" "http://localhost:30080"
check_service "Grafana" "http://localhost:30900"
check_service "OpenClaw Gateway" "http://localhost:8080"
check_service "OpenClaw UI" "http://localhost:3000"

echo ""
echo "=== Kubernetes Resources ==="
check_k8s_resources "nodes" ""
check_k8s_resources "pods" "kubesphere-system"
check_k8s_resources "pods" "jenkins"
check_k8s_resources "pods" "monitoring"

echo ""
echo "=== AI Services ==="
if [ -n "$ANTHROPIC_API_KEY" ]; then
    echo "✅ Anthropic API Key: Configured"
    ((checks_passed++))
else
    echo "❌ Anthropic API Key: NOT CONFIGURED"
    ((checks_failed++))
fi

echo ""
echo "=== Summary ==="
echo "Passed: $checks_passed"
echo "Failed: $checks_failed"

if [ $checks_failed -eq 0 ]; then
    echo -e "\n🟢 All health checks passed!"
    exit 0
else
    echo -e "\n🔴 Some health checks failed!"
    exit 1
fi

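A.4 Command Quick Reference

Category     Command                               Description
Kubernetes   kubectl get pods -A                   List Pods in all namespaces
Kubernetes   kubectl logs -f <pod>                 Stream a Pod's logs
Kubernetes   kubectl describe pod <pod>            Show Pod details
Kubernetes   kubectl exec -it <pod> -- bash        Open a shell in a Pod container
Docker       docker ps -a                          List all containers
Docker       docker logs -f <container>            Stream a container's logs
Docker       docker exec -it <container> bash      Open a shell in a container
Docker       docker system prune -a                Remove unused resources
Helm         helm list -A                          List all releases
Helm         helm uninstall <release>              Uninstall a release
Helm         helm rollback <release> <revision>    Roll back to a given revision
Helm         helm get values <release>             Show a release's values
OpenClaw     pnpm run gateway:start                Start the Gateway service
OpenClaw     pnpm run ui:start                     Start the Control UI
OpenClaw     pnpm run doctor                       Check system status
OpenClaw     pnpm run skill:list                   List installed Skills
Claude Code  claude --help                         Show help
Claude Code  claude auth                           Re-authenticate
Claude Code  claude "your prompt"                  Run an AI task
Claude Code  claude --version                      Show the version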