10.1 Project Overview
10.1.1 Background
This chapter walks through a complete hands-on project that shows how to use Caddy to build a modern, enterprise-grade web service architecture. The system we will build consists of:
- Frontend: a React single-page application
- API gateway: a unified entry point for all APIs
- Microservices: multiple backend services
- Databases: PostgreSQL and Redis
- Monitoring: Prometheus and Grafana
- Logging: the ELK Stack
10.1.2 Architecture
┌─────────────────┐    ┌─────────────────┐
│  Users/Clients  │────│ CDN / Load Bal. │
└─────────────────┘    └─────────────────┘
                                │
                       ┌─────────────────┐
                       │  Caddy gateway  │
                       │(TLS termination)│
                       └─────────────────┘
                                │
        ┌───────────────────────┼───────────────────────┐
        │                       │                       │
 ┌─────────────┐       ┌───────────────┐        ┌─────────────┐
 │  Frontend   │       │ API services  │        │ Admin panel │
 │   (React)   │       │(microservices)│        │   (Admin)   │
 └─────────────┘       └───────────────┘        └─────────────┘
                                │
                ┌───────────────┼───────────────┐
                │               │               │
        ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
        │ User service│ │Order service│ │ Payment svc │
        │ (User API)  │ │ (Order API) │ │(Payment API)│
        └─────────────┘ └─────────────┘ └─────────────┘
                │               │               │
        ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
        │ PostgreSQL  │ │    Redis    │ │Message queue│
        │  database   │ │    cache    │ │ (RabbitMQ)  │
        └─────────────┘ └─────────────┘ └─────────────┘
10.1.3 Technology Stack
- Web server: Caddy v2
- Frontend: React + TypeScript
- Backend: Go microservices
- Databases: PostgreSQL + Redis
- Monitoring: Prometheus + Grafana
- Logging: Elasticsearch + Logstash + Kibana
- Containerization: Docker + Docker Compose
10.2 Environment Setup
10.2.1 Directory Layout
ecommerce-platform/
├── caddy/
│ ├── Caddyfile
│ ├── config/
│ │ ├── api.json
│ │ └── tls.json
│ └── logs/
├── frontend/
│ ├── public/
│ ├── src/
│ ├── package.json
│ └── Dockerfile
├── services/
│ ├── user-service/
│ ├── order-service/
│ ├── payment-service/
│ └── gateway/
├── infrastructure/
│ ├── docker-compose.yml
│ ├── prometheus/
│ ├── grafana/
│ └── elk/
├── scripts/
│ ├── deploy.sh
│ ├── backup.sh
│ └── monitoring.sh
└── docs/
├── api.md
└── deployment.md
10.2.2 Docker Compose Configuration
# infrastructure/docker-compose.yml
version: '3.8'
services:
  # Caddy web server
caddy:
image: caddy:2-alpine
container_name: caddy
restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
      - "2019:2019" # Admin API -- avoid publishing this port on untrusted networks
volumes:
- ./caddy/Caddyfile:/etc/caddy/Caddyfile
- ./caddy/config:/config
- ./caddy/data:/data
- ./caddy/logs:/var/log/caddy
- ./frontend/dist:/srv/frontend
networks:
- web
- internal
depends_on:
- user-service
- order-service
- payment-service
  # Frontend build (one-shot container; writes the production bundle to ./frontend/dist, which Caddy serves)
frontend:
build:
context: ./frontend
dockerfile: Dockerfile
container_name: frontend-build
volumes:
- ./frontend/dist:/app/dist
command: npm run build
  # User service
user-service:
build:
context: ./services/user-service
dockerfile: Dockerfile
container_name: user-service
restart: unless-stopped
environment:
- DB_HOST=postgres
- DB_PORT=5432
- DB_NAME=userdb
- DB_USER=postgres
- DB_PASSWORD=password
- REDIS_HOST=redis
- REDIS_PORT=6379
networks:
- internal
depends_on:
- postgres
- redis
  # Order service
order-service:
build:
context: ./services/order-service
dockerfile: Dockerfile
container_name: order-service
restart: unless-stopped
environment:
- DB_HOST=postgres
- DB_PORT=5432
- DB_NAME=orderdb
- DB_USER=postgres
- DB_PASSWORD=password
- REDIS_HOST=redis
- REDIS_PORT=6379
- RABBITMQ_URL=amqp://guest:guest@rabbitmq:5672/
networks:
- internal
depends_on:
- postgres
- redis
- rabbitmq
  # Payment service
payment-service:
build:
context: ./services/payment-service
dockerfile: Dockerfile
container_name: payment-service
restart: unless-stopped
environment:
- DB_HOST=postgres
- DB_PORT=5432
- DB_NAME=paymentdb
- DB_USER=postgres
- DB_PASSWORD=password
- REDIS_HOST=redis
- REDIS_PORT=6379
networks:
- internal
depends_on:
- postgres
- redis
  # PostgreSQL database
postgres:
image: postgres:13-alpine
container_name: postgres
restart: unless-stopped
environment:
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=password
      - POSTGRES_MULTIPLE_DATABASES=userdb,orderdb,paymentdb # consumed by a custom init script in the init/ mount below; the stock postgres image ignores this variable
volumes:
- postgres_data:/var/lib/postgresql/data
- ./infrastructure/postgres/init:/docker-entrypoint-initdb.d
networks:
- internal
  # Redis cache
redis:
image: redis:6-alpine
container_name: redis
restart: unless-stopped
command: redis-server --appendonly yes
volumes:
- redis_data:/data
networks:
- internal
  # RabbitMQ message queue
rabbitmq:
image: rabbitmq:3-management-alpine
container_name: rabbitmq
restart: unless-stopped
environment:
- RABBITMQ_DEFAULT_USER=guest
- RABBITMQ_DEFAULT_PASS=guest
    ports:
      - "15672:15672" # management UI
volumes:
- rabbitmq_data:/var/lib/rabbitmq
networks:
- internal
  # Prometheus monitoring
prometheus:
image: prom/prometheus:latest
container_name: prometheus
restart: unless-stopped
ports:
- "9090:9090"
volumes:
- ./infrastructure/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
- prometheus_data:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
- '--web.console.libraries=/etc/prometheus/console_libraries'
- '--web.console.templates=/etc/prometheus/consoles'
- '--web.enable-lifecycle'
networks:
- internal
  # Grafana dashboards
grafana:
image: grafana/grafana:latest
container_name: grafana
restart: unless-stopped
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
volumes:
- grafana_data:/var/lib/grafana
- ./infrastructure/grafana/dashboards:/etc/grafana/provisioning/dashboards
- ./infrastructure/grafana/datasources:/etc/grafana/provisioning/datasources
networks:
- internal
# Elasticsearch
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.14.0
container_name: elasticsearch
restart: unless-stopped
environment:
- discovery.type=single-node
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
volumes:
- elasticsearch_data:/usr/share/elasticsearch/data
networks:
- internal
# Logstash
logstash:
image: docker.elastic.co/logstash/logstash:7.14.0
container_name: logstash
restart: unless-stopped
volumes:
- ./infrastructure/elk/logstash/pipeline:/usr/share/logstash/pipeline
- ./caddy/logs:/var/log/caddy:ro
networks:
- internal
depends_on:
- elasticsearch
# Kibana
kibana:
image: docker.elastic.co/kibana/kibana:7.14.0
container_name: kibana
restart: unless-stopped
ports:
- "5601:5601"
environment:
- ELASTICSEARCH_HOSTS=http://elasticsearch:9200
networks:
- internal
depends_on:
- elasticsearch
volumes:
postgres_data:
redis_data:
rabbitmq_data:
prometheus_data:
grafana_data:
elasticsearch_data:
networks:
web:
external: true
internal:
driver: bridge
10.3 Caddy Configuration
10.3.1 Main Configuration File
# caddy/Caddyfile
{
    # Global options
    admin localhost:2019
    # Default access log
    log {
        output file /var/log/caddy/access.log {
            roll_size 100mb
            roll_keep 5
            roll_keep_for 720h
        }
        format json
        level INFO
    }
    # Error log
    log error {
        output file /var/log/caddy/error.log {
            roll_size 100mb
            roll_keep 5
        }
        format json
        level ERROR
    }
    # ACME account email for automatic HTTPS
    email admin@ecommerce-platform.com
}
# Security headers. header is a site-level directive, not a global option,
# so it lives in a snippet that each site imports below.
(security_headers) {
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
        X-Content-Type-Options "nosniff"
        X-Frame-Options "DENY"
        X-XSS-Protection "1; mode=block"
        Referrer-Policy "strict-origin-when-cross-origin"
        # Hide server identification
        -Server
    }
}
# Main domain -- frontend application
ecommerce-platform.com {
    import security_headers
    # Serve the frontend build output
    root * /srv/frontend
    # Enable compression
    encode gzip zstd
    # Proxy API routes to the backend services. The Go services in 10.4
    # mount their routes under /api/v1, so rewrite /api/... to /api/v1/...
    # before proxying.
    handle /api/users/* {
        uri replace /api/ /api/v1/
        reverse_proxy user-service:8080 {
            # Active health checks
            health_uri /health
            health_interval 30s
            health_timeout 5s
            # Load balancing
            lb_policy least_conn
            # Retry behavior
            lb_try_duration 30s
            lb_try_interval 250ms
        }
    }
    handle /api/orders/* {
        uri replace /api/ /api/v1/
        reverse_proxy order-service:8080 {
            health_uri /health
            health_interval 30s
            health_timeout 5s
            lb_policy least_conn
        }
    }
    handle /api/payments/* {
        uri replace /api/ /api/v1/
        reverse_proxy payment-service:8080 {
            health_uri /health
            health_interval 30s
            health_timeout 5s
            lb_policy least_conn
        }
    }
    # WebSocket support
    handle /ws/* {
        reverse_proxy user-service:8080
    }
    # Static assets
    handle /static/* {
        file_server {
            # Serve precompressed files when available
            precompressed gzip br
        }
        # Long-lived caching for fingerprinted assets
        header {
            Cache-Control "public, max-age=31536000, immutable"
        }
    }
    # SPA fallback routing
    handle {
        try_files {path} /index.html
        file_server
    }
    # Rate limiting (requires a third-party module such as
    # github.com/mholt/caddy-ratelimit; not built into stock Caddy)
    rate_limit {
        zone dynamic {
            key {remote_host}
            events 100
            window 1m
        }
    }
    # Request log
    log {
        output file /var/log/caddy/ecommerce-access.log {
            roll_size 100mb
            roll_keep 10
        }
        format json {
            time_format "iso8601"
            message_key "message"
        }
    }
}
# API subdomain
api.ecommerce-platform.com {
    import security_headers
    # CORS headers
    header {
        Access-Control-Allow-Origin "https://ecommerce-platform.com"
        Access-Control-Allow-Methods "GET, POST, PUT, DELETE, OPTIONS"
        Access-Control-Allow-Headers "Content-Type, Authorization"
        Access-Control-Max-Age "86400"
    }
    # Answer preflight requests directly
    @options method OPTIONS
    respond @options 204
    # JWT authentication. NOTE: jwt is not a built-in Caddy directive; it
    # comes from a third-party plugin, and the syntax below is plugin-specific,
    # so check the documentation of whichever plugin you compile in
    # (see the custom-build note after this file).
    jwt {
        primary yes
        trusted_tokens {
            static_secret "your-jwt-secret-key"
        }
        auth_url /api/auth/verify
        allow_guests /api/auth/login /api/auth/register /api/health
    }
    # Gateway routes. After strip_prefix the remaining path is what the
    # upstream sees, so it must match the routes the service actually
    # serves; with the services from 10.4 (which mount /api/v1/...) you
    # would use a rewrite such as "uri replace /users /api/v1/users" instead.
    handle /users/* {
        uri strip_prefix /users
        reverse_proxy user-service:8080 {
            header_up Host {upstream_hostport}
            header_up X-Real-IP {remote_host}
            header_up X-Forwarded-For {remote_host}
            header_up X-Forwarded-Proto {scheme}
        }
    }
    handle /orders/* {
        uri strip_prefix /orders
        reverse_proxy order-service:8080 {
            header_up Host {upstream_hostport}
            header_up X-Real-IP {remote_host}
            header_up X-Forwarded-For {remote_host}
            header_up X-Forwarded-Proto {scheme}
        }
    }
    handle /payments/* {
        uri strip_prefix /payments
        reverse_proxy payment-service:8080 {
            header_up Host {upstream_hostport}
            header_up X-Real-IP {remote_host}
            header_up X-Forwarded-For {remote_host}
            header_up X-Forwarded-Proto {scheme}
        }
    }
    # Per-client API rate limiting (same third-party module as above)
    rate_limit {
        zone api {
            key {http.request.header.authorization}
            events 1000
            window 1h
        }
    }
}
# Admin panel
admin.ecommerce-platform.com {
    import security_headers
    # Basic authentication
    basicauth {
        admin $2a$14$hashed_password_here
    }
    # IP allowlist: reject anything outside the private ranges
    # (the original matcher aborted the allowed IPs, which is inverted)
    @denied not remote_ip 10.0.0.0/8 192.168.0.0/16 172.16.0.0/12
    abort @denied
    # Admin UI proxies
    handle /grafana/* {
        uri strip_prefix /grafana
        reverse_proxy grafana:3000
    }
    handle /prometheus/* {
        uri strip_prefix /prometheus
        reverse_proxy prometheus:9090
    }
    handle /kibana/* {
        uri strip_prefix /kibana
        reverse_proxy kibana:5601
    }
    handle /rabbitmq/* {
        uri strip_prefix /rabbitmq
        reverse_proxy rabbitmq:15672
    }
    # Default: redirect to Grafana
    redir / /grafana/
}
# Monitoring endpoints
monitoring.ecommerce-platform.com {
    # Prometheus metrics
    handle /metrics {
        metrics
    }
    # Health check
    handle /health {
        respond "OK" 200
    }
    # Caddy admin/config API -- this exposes full control of the server,
    # so protect or remove it in production
    handle /config/* {
        reverse_proxy localhost:2019
    }
}
# Development environment
dev.ecommerce-platform.com {
    # Use Caddy's internal CA instead of public certificates
    tls internal
    # Dev-server hot reload
    handle /sockjs-node/* {
        reverse_proxy localhost:3000
    }
    handle {
        reverse_proxy localhost:3000
    }
}
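A caveat before moving on: rate_limit and jwt are not part of stock Caddy; they come from third-party modules, so the caddy:2-alpine image from the compose file will reject this Caddyfile as written. You need a custom Caddy binary with the plugins compiled in, typically built with xcaddy (for example, xcaddy build --with github.com/mholt/caddy-ratelimit). The equivalent custom-main program is sketched below; the ratelimit import path is the module's real path, while the JWT plugin import is left as a placeholder because it depends on which plugin you choose.
// cmd/caddy/main.go -- custom Caddy build that compiles in the
// third-party modules used by the Caddyfile above.
package main

import (
	caddycmd "github.com/caddyserver/caddy/v2/cmd"

	// All standard Caddy modules.
	_ "github.com/caddyserver/caddy/v2/modules/standard"

	// Third-party module providing the rate_limit directive.
	_ "github.com/mholt/caddy-ratelimit"

	// A JWT/auth plugin would be imported here the same way; the exact
	// module path depends on the plugin you pick.
)

func main() {
	caddycmd.Main()
}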
10.3.2 JSON Configuration
The same gateway can also be expressed in Caddy's native JSON, which is the format the admin API consumes (stored in this project as caddy/config/api.json):
{
"admin": {
"listen": "localhost:2019"
},
"logging": {
"logs": {
"default": {
"level": "INFO",
"writer": {
"output": "file",
"filename": "/var/log/caddy/caddy.log",
"roll": true,
"roll_size_mb": 100,
"roll_keep": 5
},
"encoder": {
"format": "json",
"time_format": "iso8601"
}
}
}
},
"apps": {
"http": {
"servers": {
"main": {
"listen": [":80", ":443"],
"routes": [
{
"match": [
{
"host": ["ecommerce-platform.com"]
}
],
"handle": [
{
"handler": "subroute",
"routes": [
{
"match": [
{
"path": ["/api/users/*"]
}
],
"handle": [
{
"handler": "reverse_proxy",
"upstreams": [
{
"dial": "user-service:8080"
}
],
"health_checks": {
"active": {
"uri": "/health",
"interval": "30s",
"timeout": "5s"
}
}
}
]
},
{
"match": [
{
"path": ["/api/orders/*"]
}
],
"handle": [
{
"handler": "reverse_proxy",
"upstreams": [
{
"dial": "order-service:8080"
}
]
}
]
},
{
"handle": [
{
"handler": "file_server",
"root": "/srv/frontend",
"index_names": ["index.html"]
}
]
}
]
}
]
}
],
"automatic_https": {
"disable": false
}
}
}
},
"tls": {
"automation": {
"policies": [
{
"subjects": [
"ecommerce-platform.com",
"*.ecommerce-platform.com"
],
"issuers": [
{
"module": "acme",
"ca": "https://acme-v02.api.letsencrypt.org/directory",
"email": "admin@ecommerce-platform.com"
}
]
}
]
}
}
}
}
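One advantage of the JSON form is that it is exactly what the admin API accepts, so configuration pushes can be scripted. A minimal sketch in Go follows; the file path and admin address mirror this chapter's layout, and POST /load is the standard admin endpoint for replacing the running configuration.
// loadconfig.go -- push caddy/config/api.json to a running Caddy
// instance through its admin API (assumes the admin endpoint listens
// on localhost:2019, as configured above).
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	cfg, err := os.ReadFile("caddy/config/api.json")
	if err != nil {
		log.Fatalf("read config: %v", err)
	}
	// POST /load swaps in the new configuration; if it fails to apply,
	// Caddy keeps running with the old one.
	resp, err := http.Post("http://localhost:2019/load", "application/json", bytes.NewReader(cfg))
	if err != nil {
		log.Fatalf("load config: %v", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		log.Fatalf("admin API returned %s", resp.Status)
	}
	fmt.Println("configuration loaded")
}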
10.4 Microservice Implementation
10.4.1 User Service
// services/user-service/main.go
package main
import (
"context"
"encoding/json"
"fmt"
"log"
"net/http"
"os"
"time"
"github.com/gorilla/mux"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"gorm.io/driver/postgres"
"gorm.io/gorm"
"github.com/go-redis/redis/v8"
)
type User struct {
ID uint `json:"id" gorm:"primaryKey"`
Username string `json:"username" gorm:"uniqueIndex"`
Email string `json:"email" gorm:"uniqueIndex"`
Password string `json:"-"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
}
type UserService struct {
	db    *gorm.DB
	redis *redis.Client
	// Prometheus metrics
	requestsTotal   *prometheus.CounterVec
	requestDuration *prometheus.HistogramVec
}
func NewUserService() *UserService {
	// Database connection
	dsn := fmt.Sprintf("host=%s port=%s user=%s password=%s dbname=%s sslmode=disable",
		os.Getenv("DB_HOST"),
		os.Getenv("DB_PORT"),
		os.Getenv("DB_USER"),
		os.Getenv("DB_PASSWORD"),
		os.Getenv("DB_NAME"),
	)
	db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{})
	if err != nil {
		log.Fatal("Failed to connect to database:", err)
	}
	// Auto-migrate the schema
	if err := db.AutoMigrate(&User{}); err != nil {
		log.Fatal("Failed to migrate schema:", err)
	}
	// Redis connection
	rdb := redis.NewClient(&redis.Options{
		Addr: fmt.Sprintf("%s:%s", os.Getenv("REDIS_HOST"), os.Getenv("REDIS_PORT")),
	})
	// Prometheus metrics
	requestsTotal := prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "user_service_requests_total",
			Help: "Total number of requests to user service",
		},
		[]string{"method", "endpoint", "status"},
	)
	requestDuration := prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name: "user_service_request_duration_seconds",
			Help: "Request duration in seconds",
		},
		[]string{"method", "endpoint"},
	)
	prometheus.MustRegister(requestsTotal, requestDuration)
	return &UserService{
		db:              db,
		redis:           rdb,
		requestsTotal:   requestsTotal,
		requestDuration: requestDuration,
	}
}
func (us *UserService) GetUsers(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	defer func() {
		duration := time.Since(start).Seconds()
		us.requestDuration.WithLabelValues(r.Method, "/users").Observe(duration)
	}()
	var users []User
	// Try the cache first; fall through to the database on any cache error
	cacheKey := "users:all"
	cached, err := us.redis.Get(context.Background(), cacheKey).Result()
	if err == nil && json.Unmarshal([]byte(cached), &users) == nil {
		us.requestsTotal.WithLabelValues(r.Method, "/users", "200").Inc()
		w.Header().Set("Content-Type", "application/json")
		w.Header().Set("X-Cache", "HIT")
		json.NewEncoder(w).Encode(users)
		return
	}
	// Load from the database
	result := us.db.Find(&users)
	if result.Error != nil {
		us.requestsTotal.WithLabelValues(r.Method, "/users", "500").Inc()
		http.Error(w, result.Error.Error(), http.StatusInternalServerError)
		return
	}
	// Cache the result for subsequent requests
	usersJSON, _ := json.Marshal(users)
	us.redis.Set(context.Background(), cacheKey, usersJSON, 5*time.Minute)
	us.requestsTotal.WithLabelValues(r.Method, "/users", "200").Inc()
	w.Header().Set("Content-Type", "application/json")
	w.Header().Set("X-Cache", "MISS")
	json.NewEncoder(w).Encode(users)
}
func (us *UserService) CreateUser(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	defer func() {
		duration := time.Since(start).Seconds()
		us.requestDuration.WithLabelValues(r.Method, "/users").Observe(duration)
	}()
	var user User
	if err := json.NewDecoder(r.Body).Decode(&user); err != nil {
		us.requestsTotal.WithLabelValues(r.Method, "/users", "400").Inc()
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Create the user. NOTE: a real service must hash the password
	// (e.g. with bcrypt) before persisting it.
	result := us.db.Create(&user)
	if result.Error != nil {
		us.requestsTotal.WithLabelValues(r.Method, "/users", "500").Inc()
		http.Error(w, result.Error.Error(), http.StatusInternalServerError)
		return
	}
	// Invalidate the cache
	us.redis.Del(context.Background(), "users:all")
	us.requestsTotal.WithLabelValues(r.Method, "/users", "201").Inc()
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusCreated)
	json.NewEncoder(w).Encode(user)
}
func (us *UserService) HealthCheck(w http.ResponseWriter, r *http.Request) {
	// Check the database connection
	sqlDB, err := us.db.DB()
	if err != nil {
		http.Error(w, "Database connection failed", http.StatusServiceUnavailable)
		return
	}
	if err := sqlDB.Ping(); err != nil {
		http.Error(w, "Database ping failed", http.StatusServiceUnavailable)
		return
	}
	// Check the Redis connection
	if _, err := us.redis.Ping(context.Background()).Result(); err != nil {
		http.Error(w, "Redis connection failed", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
	json.NewEncoder(w).Encode(map[string]string{"status": "healthy"})
}
func main() {
	userService := NewUserService()
	r := mux.NewRouter()
	// API routes
	api := r.PathPrefix("/api/v1").Subrouter()
	api.HandleFunc("/users", userService.GetUsers).Methods("GET")
	api.HandleFunc("/users", userService.CreateUser).Methods("POST")
	// Health check
	r.HandleFunc("/health", userService.HealthCheck).Methods("GET")
	// Prometheus metrics
	r.Handle("/metrics", promhttp.Handler())
	// Start the server
	log.Println("User service starting on :8080")
	log.Fatal(http.ListenAndServe(":8080", r))
}
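Both handlers above repeat the same instrumentation boilerplate. One way to tighten this up is a small middleware that times the request and records both metrics in one place. The sketch below is illustrative rather than part of the service as listed; statusRecorder and instrument are names introduced here, and everything it uses is already imported by the file above.
// statusRecorder captures the status code the wrapped handler writes,
// so it can be used as a metric label.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (sr *statusRecorder) WriteHeader(code int) {
	sr.status = code
	sr.ResponseWriter.WriteHeader(code)
}

// instrument wraps a handler, timing the request and counting it with
// the recorded status, replacing the per-handler boilerplate above.
func (us *UserService) instrument(endpoint string, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next(rec, r)
		us.requestDuration.WithLabelValues(r.Method, endpoint).Observe(time.Since(start).Seconds())
		us.requestsTotal.WithLabelValues(r.Method, endpoint, fmt.Sprintf("%d", rec.status)).Inc()
	}
}

// Usage in main():
//   api.HandleFunc("/users", userService.instrument("/users", userService.GetUsers)).Methods("GET")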
10.4.2 Order Service
// services/order-service/main.go
package main
import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"os"
	"time"

	"github.com/gorilla/mux"
	"github.com/streadway/amqp"
	"gorm.io/driver/postgres"
	"gorm.io/gorm"
)
type Order struct {
ID uint `json:"id" gorm:"primaryKey"`
UserID uint `json:"user_id"`
ProductID uint `json:"product_id"`
Quantity int `json:"quantity"`
TotalAmount float64 `json:"total_amount"`
Status string `json:"status"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
}
type OrderService struct {
db *gorm.DB
rabbitmq *amqp.Connection
channel *amqp.Channel
}
func NewOrderService() *OrderService {
	// Database connection
	dsn := fmt.Sprintf("host=%s port=%s user=%s password=%s dbname=%s sslmode=disable",
		os.Getenv("DB_HOST"),
		os.Getenv("DB_PORT"),
		os.Getenv("DB_USER"),
		os.Getenv("DB_PASSWORD"),
		os.Getenv("DB_NAME"),
	)
	db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{})
	if err != nil {
		log.Fatal("Failed to connect to database:", err)
	}
	db.AutoMigrate(&Order{})
	// RabbitMQ connection
	conn, err := amqp.Dial(os.Getenv("RABBITMQ_URL"))
	if err != nil {
		log.Fatal("Failed to connect to RabbitMQ:", err)
	}
	ch, err := conn.Channel()
	if err != nil {
		log.Fatal("Failed to open RabbitMQ channel:", err)
	}
	// Declare the queue
	_, err = ch.QueueDeclare(
		"order_events", // name
		true,           // durable
		false,          // auto-delete
		false,          // exclusive
		false,          // no-wait
		nil,            // arguments
	)
	if err != nil {
		log.Fatal("Failed to declare queue:", err)
	}
	return &OrderService{
		db:       db,
		rabbitmq: conn,
		channel:  ch,
	}
}
// The receiver is named svc rather than os so it does not shadow the os package.
func (svc *OrderService) CreateOrder(w http.ResponseWriter, r *http.Request) {
	var order Order
	if err := json.NewDecoder(r.Body).Decode(&order); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	order.Status = "pending"
	order.CreatedAt = time.Now()
	// Persist the order
	result := svc.db.Create(&order)
	if result.Error != nil {
		http.Error(w, result.Error.Error(), http.StatusInternalServerError)
		return
	}
	// Publish an order event to the message queue
	orderEvent := map[string]interface{}{
		"event_type": "order_created",
		"order_id":   order.ID,
		"user_id":    order.UserID,
		"amount":     order.TotalAmount,
		"timestamp":  time.Now(),
	}
	eventJSON, _ := json.Marshal(orderEvent)
	err := svc.channel.Publish(
		"",             // exchange
		"order_events", // routing key
		false,          // mandatory
		false,          // immediate
		amqp.Publishing{
			ContentType: "application/json",
			Body:        eventJSON,
		},
	)
	if err != nil {
		log.Printf("Failed to publish order event: %v", err)
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusCreated)
	json.NewEncoder(w).Encode(order)
}
func (svc *OrderService) GetOrders(w http.ResponseWriter, r *http.Request) {
	var orders []Order
	result := svc.db.Find(&orders)
	if result.Error != nil {
		http.Error(w, result.Error.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(orders)
}
func main() {
orderService := NewOrderService()
defer orderService.rabbitmq.Close()
defer orderService.channel.Close()
r := mux.NewRouter()
api := r.PathPrefix("/api/v1").Subrouter()
api.HandleFunc("/orders", orderService.GetOrders).Methods("GET")
api.HandleFunc("/orders", orderService.CreateOrder).Methods("POST")
r.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
json.NewEncoder(w).Encode(map[string]string{"status": "healthy"})
})
log.Println("Order service starting on :8080")
log.Fatal(http.ListenAndServe(":8080", r))
}
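The order service only publishes to order_events; nothing in this chapter consumes the queue yet. For completeness, a downstream worker (the payment service would be the natural candidate) could drain it roughly as follows. This is a sketch using the same streadway/amqp library; the processing logic is illustrative.
// consumer.go -- sketch of a worker that drains the order_events queue
// declared by the order service above.
package main

import (
	"log"
	"os"

	"github.com/streadway/amqp"
)

func main() {
	conn, err := amqp.Dial(os.Getenv("RABBITMQ_URL"))
	if err != nil {
		log.Fatal("Failed to connect to RabbitMQ:", err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatal("Failed to open channel:", err)
	}
	defer ch.Close()

	// Same declaration as the publisher: declaring a queue is idempotent,
	// so either side can start first.
	if _, err := ch.QueueDeclare("order_events", true, false, false, false, nil); err != nil {
		log.Fatal("Failed to declare queue:", err)
	}

	// autoAck=false so messages are redelivered if the worker crashes
	// before acknowledging.
	msgs, err := ch.Consume("order_events", "", false, false, false, false, nil)
	if err != nil {
		log.Fatal("Failed to register consumer:", err)
	}

	for msg := range msgs {
		log.Printf("order event: %s", msg.Body)
		// ... process the event (e.g. start a payment) ...
		msg.Ack(false)
	}
}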
10.5 Frontend Application
10.5.1 React Application Structure
// frontend/src/App.tsx
import React from 'react';
import { BrowserRouter as Router, Routes, Route } from 'react-router-dom';
import { QueryClient, QueryClientProvider } from 'react-query';
import { ReactQueryDevtools } from 'react-query/devtools';
import Header from './components/Header';
import Home from './pages/Home';
import Products from './pages/Products';
import Orders from './pages/Orders';
import Profile from './pages/Profile';
import Login from './pages/Login';
import { AuthProvider } from './contexts/AuthContext';
const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      retry: 3,
      staleTime: 5 * 60 * 1000, // 5 minutes
      cacheTime: 10 * 60 * 1000, // 10 minutes
    },
  },
});
function App() {
return (
<QueryClientProvider client={queryClient}>
<AuthProvider>
<Router>
<div className="App">
<Header />
<main className="main-content">
<Routes>
<Route path="/" element={<Home />} />
<Route path="/products" element={<Products />} />
<Route path="/orders" element={<Orders />} />
<Route path="/profile" element={<Profile />} />
<Route path="/login" element={<Login />} />
</Routes>
</main>
</div>
</Router>
</AuthProvider>
<ReactQueryDevtools initialIsOpen={false} />
</QueryClientProvider>
);
}
export default App;
10.5.2 API Client
// frontend/src/api/client.ts
import axios, { AxiosInstance } from 'axios';
class APIClient {
  private client: AxiosInstance;
  constructor() {
    this.client = axios.create({
      baseURL: process.env.REACT_APP_API_URL || '/api',
      timeout: 10000,
      headers: {
        'Content-Type': 'application/json',
      },
    });
    // Request interceptor
    this.client.interceptors.request.use(
      (config) => {
        const token = localStorage.getItem('auth_token');
        if (token) {
          config.headers.Authorization = `Bearer ${token}`;
        }
        // Attach a request ID for tracing
        config.headers['X-Request-ID'] = this.generateRequestId();
        return config;
      },
      (error) => {
        return Promise.reject(error);
      }
    );
    // Response interceptor
    this.client.interceptors.response.use(
      (response) => {
        return response;
      },
      (error) => {
        if (error.response?.status === 401) {
          // Clear credentials and redirect to the login page
          localStorage.removeItem('auth_token');
          window.location.href = '/login';
        }
        return Promise.reject(error);
      }
    );
  }
  private generateRequestId(): string {
    return `${Date.now()}-${Math.random().toString(36).slice(2, 11)}`;
  }
  // User APIs
async getUsers() {
const response = await this.client.get('/users');
return response.data;
}
async createUser(userData: any) {
const response = await this.client.post('/users', userData);
return response.data;
}
async getUserProfile(userId: string) {
const response = await this.client.get(`/users/${userId}`);
return response.data;
}
  // Order APIs
async getOrders() {
const response = await this.client.get('/orders');
return response.data;
}
async createOrder(orderData: any) {
const response = await this.client.post('/orders', orderData);
return response.data;
}
async getOrderById(orderId: string) {
const response = await this.client.get(`/orders/${orderId}`);
return response.data;
}
  // Auth APIs
async login(credentials: { username: string; password: string }) {
const response = await this.client.post('/auth/login', credentials);
return response.data;
}
async logout() {
const response = await this.client.post('/auth/logout');
return response.data;
}
async refreshToken() {
const response = await this.client.post('/auth/refresh');
return response.data;
}
}
export const apiClient = new APIClient();
10.5.3 Dockerfile
# frontend/Dockerfile
# Multi-stage build
FROM node:16-alpine AS builder
WORKDIR /app
# Copy package manifests first to leverage layer caching
COPY package*.json ./
# Install all dependencies (dev dependencies are needed for the build step)
RUN npm ci
# Copy the source code
COPY . .
# Build the application
RUN npm run build
# Production stage. Note that in this project the bundle is normally served
# by Caddy (docker-compose mounts frontend/dist into the caddy container);
# this nginx stage only matters when running the image standalone.
FROM nginx:alpine
# Copy the build output
COPY --from=builder /app/dist /usr/share/nginx/html
# Copy the nginx configuration
COPY nginx.conf /etc/nginx/nginx.conf
# Expose the HTTP port
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
10.6 Monitoring and Logging
10.6.1 Prometheus Configuration
# infrastructure/prometheus/prometheus.yml
global:
scrape_interval: 15s
evaluation_interval: 15s
rule_files:
- "alert_rules.yml"
alerting:
alertmanagers:
- static_configs:
- targets:
- alertmanager:9093
scrape_configs:
  # Caddy metrics
  - job_name: 'caddy'
    static_configs:
      - targets: ['caddy:2019']
    metrics_path: '/metrics'
    scrape_interval: 30s
  # User service metrics
  - job_name: 'user-service'
    static_configs:
      - targets: ['user-service:8080']
    metrics_path: '/metrics'
    scrape_interval: 30s
  # Order service metrics
  - job_name: 'order-service'
    static_configs:
      - targets: ['order-service:8080']
    metrics_path: '/metrics'
    scrape_interval: 30s
  # Payment service metrics
  - job_name: 'payment-service'
    static_configs:
      - targets: ['payment-service:8080']
    metrics_path: '/metrics'
    scrape_interval: 30s
  # The three jobs below assume postgres-exporter, redis-exporter and
  # node-exporter containers, which are not in the compose file above.
  # PostgreSQL metrics
  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres-exporter:9187']
  # Redis metrics
  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']
  # Host metrics
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']
10.6.2 Alerting Rules
# infrastructure/prometheus/alert_rules.yml
groups:
- name: ecommerce_alerts
rules:
      # High error rate
- alert: HighErrorRate
expr: |
(
sum(rate(caddy_http_requests_total{status=~"5.."}[5m])) by (instance)
/
sum(rate(caddy_http_requests_total[5m])) by (instance)
) > 0.05
for: 5m
labels:
severity: critical
annotations:
summary: "High error rate detected"
description: "Error rate is {{ $value | humanizePercentage }} for instance {{ $labels.instance }}"
      # High response time
- alert: HighResponseTime
expr: |
histogram_quantile(0.95, sum(rate(caddy_http_request_duration_seconds_bucket[5m])) by (le, instance)) > 2
for: 5m
labels:
severity: warning
annotations:
summary: "High response time detected"
description: "95th percentile response time is {{ $value }}s for instance {{ $labels.instance }}"
      # Service down
- alert: ServiceDown
expr: up == 0
for: 1m
labels:
severity: critical
annotations:
summary: "Service is down"
description: "{{ $labels.job }} service is down for instance {{ $labels.instance }}"
      # Database connection saturation
- alert: DatabaseConnectionHigh
expr: |
pg_stat_activity_count{state="active"} > 80
for: 5m
labels:
severity: warning
annotations:
summary: "High database connections"
description: "Active database connections: {{ $value }}"
      # High memory usage
- alert: HighMemoryUsage
expr: |
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 0.85
for: 5m
labels:
severity: warning
annotations:
summary: "High memory usage"
description: "Memory usage is {{ $value | humanizePercentage }} on {{ $labels.instance }}"
      # Low disk space
- alert: DiskSpaceLow
expr: |
(1 - (node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"})) > 0.85
for: 5m
labels:
severity: warning
annotations:
summary: "Low disk space"
description: "Disk usage is {{ $value | humanizePercentage }} on {{ $labels.instance }} mount {{ $labels.mountpoint }}"
10.6.3 Grafana Dashboard
{
"dashboard": {
"id": null,
"title": "E-commerce Platform Dashboard",
"tags": ["ecommerce", "caddy", "microservices"],
"timezone": "browser",
"panels": [
{
"id": 1,
"title": "Request Rate",
"type": "graph",
"targets": [
{
"expr": "sum(rate(caddy_http_requests_total[5m])) by (instance)",
"legendFormat": "{{instance}}"
}
],
"yAxes": [
{
"label": "Requests/sec"
}
]
},
{
"id": 2,
"title": "Response Time (95th percentile)",
"type": "graph",
"targets": [
{
"expr": "histogram_quantile(0.95, sum(rate(caddy_http_request_duration_seconds_bucket[5m])) by (le, instance))",
"legendFormat": "{{instance}}"
}
],
"yAxes": [
{
"label": "Seconds"
}
]
},
{
"id": 3,
"title": "Error Rate",
"type": "singlestat",
"targets": [
{
"expr": "sum(rate(caddy_http_requests_total{status=~\"5..\"}[5m])) / sum(rate(caddy_http_requests_total[5m]))",
"legendFormat": "Error Rate"
}
],
"valueName": "current",
"format": "percentunit",
"thresholds": "0.01,0.05",
"colorBackground": true
},
{
"id": 4,
"title": "Active Users",
"type": "singlestat",
"targets": [
{
"expr": "sum(user_service_active_sessions)",
"legendFormat": "Active Users"
}
]
},
{
"id": 5,
"title": "Database Connections",
"type": "graph",
"targets": [
{
"expr": "pg_stat_activity_count",
"legendFormat": "{{state}}"
}
]
},
{
"id": 6,
"title": "Cache Hit Rate",
"type": "singlestat",
"targets": [
{
"expr": "redis_keyspace_hits_total / (redis_keyspace_hits_total + redis_keyspace_misses_total)",
"legendFormat": "Hit Rate"
}
],
"format": "percentunit"
}
],
"time": {
"from": "now-1h",
"to": "now"
},
"refresh": "30s"
}
}
10.6.4 ELK Configuration
# infrastructure/elk/logstash/pipeline/caddy.conf
input {
file {
path => "/var/log/caddy/*.log"
start_position => "beginning"
codec => "json"
tags => ["caddy"]
}
}
filter {
  if "caddy" in [tags] {
    # Parse the timestamp (Caddy logs ts as a UNIX epoch)
    date {
      match => [ "ts", "UNIX" ]
    }
    # Parse the user agent
    if [request] and [request][headers] and [request][headers]["User-Agent"] {
      useragent {
        source => "[request][headers][User-Agent][0]"
        target => "user_agent"
      }
    }
    # GeoIP lookup
    if [request] and [request][remote_ip] {
      geoip {
        source => "[request][remote_ip]"
        target => "geoip"
      }
    }
    # Convert the response duration to milliseconds
    if [duration] {
      ruby {
        code => "event.set('response_time_ms', event.get('duration') * 1000)"
      }
    }
  }
}
output {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "caddy-logs-%{+YYYY.MM.dd}"
}
  # Debug output (remove in production)
stdout {
codec => rubydebug
}
}
10.7 Deployment Scripts
10.7.1 Automated Deployment Script
#!/bin/bash
# scripts/deploy.sh
set -e
# Configuration variables
PROJECT_NAME="ecommerce-platform"
ENVIRONMENT=${1:-production}
VERSION=${2:-latest}
BACKUP_DIR="/opt/backups"
LOG_FILE="/var/log/deploy.log"
# Logging helpers
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a $LOG_FILE
}
error() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] ERROR: $1" | tee -a $LOG_FILE
exit 1
}
# Check dependencies
check_dependencies() {
    log "Checking dependencies..."
    command -v docker >/dev/null 2>&1 || error "Docker is not installed"
    command -v docker-compose >/dev/null 2>&1 || error "Docker Compose is not installed"
    # Make sure the Docker daemon is running
    if ! docker info >/dev/null 2>&1; then
        error "Docker daemon is not running"
    fi
    log "Dependencies check passed"
}
# Back up the current configuration
backup_config() {
    log "Creating configuration backup..."
    BACKUP_TIMESTAMP=$(date +%Y%m%d_%H%M%S)
    BACKUP_PATH="$BACKUP_DIR/${PROJECT_NAME}_${BACKUP_TIMESTAMP}"
    mkdir -p "$BACKUP_PATH"
    # Back up the Caddy configuration
    if [ -f "caddy/Caddyfile" ]; then
        cp -r caddy/ "$BACKUP_PATH/"
        log "Caddy configuration backed up"
    fi
    # Back up the databases
    if docker ps | grep -q postgres; then
        log "Creating database backup..."
        docker exec postgres pg_dumpall -U postgres > "$BACKUP_PATH/database_backup.sql"
        log "Database backup completed"
    fi
    # Back up the current Docker Compose file
    cp infrastructure/docker-compose.yml "$BACKUP_PATH/"
    log "Backup completed: $BACKUP_PATH"
    echo "$BACKUP_PATH" > .last_backup
}
# Build the images
build_images() {
    log "Building Docker images..."
    # Frontend
    log "Building frontend..."
    docker build -t ${PROJECT_NAME}/frontend:${VERSION} frontend/
    # Microservices
    for service in user-service order-service payment-service; do
        log "Building $service..."
        docker build -t ${PROJECT_NAME}/${service}:${VERSION} services/${service}/
    done
    log "Image building completed"
}
# Update the configuration for the target environment
update_config() {
    log "Updating configuration for environment: $ENVIRONMENT"
    case $ENVIRONMENT in
        "production")
            export CADDY_DOMAIN="ecommerce-platform.com"
            # Secrets must be stable across deploys: read them from the
            # environment (or a secret store) rather than regenerating them,
            # which would lock the services out of the existing database.
            export DB_PASSWORD="${DB_PASSWORD:?set DB_PASSWORD before deploying}"
            export JWT_SECRET="${JWT_SECRET:?set JWT_SECRET before deploying}"
            ;;
        "staging")
            export CADDY_DOMAIN="staging.ecommerce-platform.com"
            export DB_PASSWORD="staging_password"
            export JWT_SECRET="staging_jwt_secret"
            ;;
        "development")
            export CADDY_DOMAIN="dev.ecommerce-platform.com"
            export DB_PASSWORD="dev_password"
            export JWT_SECRET="dev_jwt_secret"
            ;;
    esac
    # Render the environment-specific compose file
    envsubst < infrastructure/docker-compose.template.yml > infrastructure/docker-compose.yml
    log "Configuration updated for $ENVIRONMENT"
}
# Health checks
health_check() {
    log "Performing health checks..."
    local max_attempts=30
    local attempt=1
    while [ $attempt -le $max_attempts ]; do
        log "Health check attempt $attempt/$max_attempts"
        # Check Caddy
        if curl -f http://localhost/health >/dev/null 2>&1; then
            log "Caddy health check passed"
        else
            log "Caddy health check failed"
            ((attempt++))
            sleep 10
            continue
        fi
        # Check the microservices
        local services=("user-service" "order-service" "payment-service")
        local all_healthy=true
        for service in "${services[@]}"; do
            if docker exec $service curl -f http://localhost:8080/health >/dev/null 2>&1; then
                log "$service health check passed"
            else
                log "$service health check failed"
                all_healthy=false
            fi
        done
        if $all_healthy; then
            log "All health checks passed"
            return 0
        fi
        ((attempt++))
        sleep 10
    done
    error "Health checks failed after $max_attempts attempts"
}
# Rollback
rollback() {
    log "Starting rollback process..."
    if [ ! -f ".last_backup" ]; then
        error "No backup found for rollback"
    fi
    BACKUP_PATH=$(cat .last_backup)
    if [ ! -d "$BACKUP_PATH" ]; then
        error "Backup directory not found: $BACKUP_PATH"
    fi
    log "Rolling back to backup: $BACKUP_PATH"
    # Stop the current services
    docker-compose -f infrastructure/docker-compose.yml down
    # Restore the configuration
    cp -r "$BACKUP_PATH/caddy/" ./
    cp "$BACKUP_PATH/docker-compose.yml" infrastructure/
    # Restore the database
    if [ -f "$BACKUP_PATH/database_backup.sql" ]; then
        log "Restoring database..."
        docker-compose -f infrastructure/docker-compose.yml up -d postgres
        sleep 30
        docker exec -i postgres psql -U postgres < "$BACKUP_PATH/database_backup.sql"
    fi
    # Start the services
    docker-compose -f infrastructure/docker-compose.yml up -d
    log "Rollback completed"
}
# Main deployment flow
main() {
    log "Starting deployment of $PROJECT_NAME version $VERSION to $ENVIRONMENT"
    # Check dependencies
    check_dependencies
    # Create a backup
    backup_config
    # Build images
    build_images
    # Update the configuration
    update_config
    # Stop the old services
    log "Stopping existing services..."
    docker-compose -f infrastructure/docker-compose.yml down
    # Start the new services
    log "Starting new services..."
    docker-compose -f infrastructure/docker-compose.yml up -d
    # Give the services time to start
    sleep 60
    # Health checks
    if health_check; then
        log "Deployment completed successfully"
        # Clean up old images
        log "Cleaning up old images..."
        docker image prune -f
        # Send a deployment notification
        send_notification "success" "Deployment of $PROJECT_NAME $VERSION completed successfully"
    else
        log "Deployment failed, initiating rollback..."
        rollback
        send_notification "failure" "Deployment of $PROJECT_NAME $VERSION failed, rolled back"
        exit 1
    fi
}
# Send a notification
send_notification() {
    local status=$1
    local message=$2
    # Slack
    if [ -n "$SLACK_WEBHOOK_URL" ]; then
        curl -X POST -H 'Content-type: application/json' \
            --data "{\"text\":\"[$status] $message\"}" \
            "$SLACK_WEBHOOK_URL"
    fi
    # Email
    if [ -n "$NOTIFICATION_EMAIL" ]; then
        echo "$message" | mail -s "Deployment $status" "$NOTIFICATION_EMAIL"
    fi
}
# Entry point
if [ "$1" = "rollback" ]; then
rollback
else
main
fi
10.7.2 Monitoring Script
#!/bin/bash
# scripts/monitoring.sh
# System monitoring script
SERVICES=("caddy" "user-service" "order-service" "payment-service" "postgres" "redis")
ALERT_EMAIL="admin@ecommerce-platform.com"
SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
# Check service status
check_services() {
    echo "Checking service status..."
    for service in "${SERVICES[@]}"; do
        # Match the exact container name, not a substring
        if docker ps --format '{{.Names}}' | grep -qx "$service"; then
            echo "✓ $service is running"
        else
            echo "✗ $service is not running"
            send_alert "Service Down" "$service is not running"
        fi
    done
}
# Check resource usage
check_resources() {
    echo "Checking resource usage..."
    # CPU usage
    CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
    if (( $(echo "$CPU_USAGE > 80" | bc -l) )); then
        send_alert "High CPU Usage" "CPU usage is ${CPU_USAGE}%"
    fi
    # Memory usage
    MEMORY_USAGE=$(free | grep Mem | awk '{printf "%.2f", $3/$2 * 100.0}')
    if (( $(echo "$MEMORY_USAGE > 85" | bc -l) )); then
        send_alert "High Memory Usage" "Memory usage is ${MEMORY_USAGE}%"
    fi
    # Disk usage
    DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | cut -d'%' -f1)
    if [ "$DISK_USAGE" -gt 85 ]; then
        send_alert "High Disk Usage" "Disk usage is ${DISK_USAGE}%"
    fi
}
# Check application health
check_app_health() {
    echo "Checking application health..."
    # Check the main health endpoint
    if ! curl -f -s http://localhost/health >/dev/null; then
        send_alert "Application Health Check Failed" "Main application is not responding"
    fi
    # Check API response time
    RESPONSE_TIME=$(curl -o /dev/null -s -w '%{time_total}' http://localhost/api/health)
    if (( $(echo "$RESPONSE_TIME > 2.0" | bc -l) )); then
        send_alert "Slow API Response" "API response time is ${RESPONSE_TIME}s"
    fi
}
# Send an alert
send_alert() {
    local title="$1"
    local message="$2"
    local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
    echo "ALERT: $title - $message"
    # Email
    if [ -n "$ALERT_EMAIL" ]; then
        echo "[$timestamp] $message" | mail -s "$title" "$ALERT_EMAIL"
    fi
    # Slack
    if [ -n "$SLACK_WEBHOOK" ]; then
        curl -X POST -H 'Content-type: application/json' \
            --data "{\"text\":\"🚨 $title: $message\"}" \
            "$SLACK_WEBHOOK"
    fi
}
# Main monitoring loop
main() {
    while true; do
        echo "=== Monitoring Check at $(date) ==="
        check_services
        check_resources
        check_app_health
        echo "Monitoring check completed"
        echo ""
        sleep 300 # check every 5 minutes
    done
}
# Run the monitor
if [ "$1" = "once" ]; then
check_services
check_resources
check_app_health
else
main
fi
10.8 Performance Testing
10.8.1 Load Test Script
#!/bin/bash
# scripts/load_test.sh
# Load test configuration
TARGET_URL="https://ecommerce-platform.com"
CONCURRENT_USERS=100
TEST_DURATION=300 # 5 minutes
RAMP_UP_TIME=60   # 1 minute
# Install dependencies
install_dependencies() {
    echo "Installing load testing tools..."
    # Apache Bench
    if ! command -v ab &> /dev/null; then
        sudo apt-get update
        sudo apt-get install -y apache2-utils
    fi
    # wrk (may need to be built from source on some distributions)
    if ! command -v wrk &> /dev/null; then
        sudo apt-get install -y wrk
    fi
}
# Apache Bench test
run_ab_test() {
    echo "Running Apache Bench test..."
    ab -n 10000 -c $CONCURRENT_USERS -g ab_results.dat $TARGET_URL/ > ab_results.txt
    echo "Apache Bench test completed. Results saved to ab_results.txt"
}
# wrk test
run_wrk_test() {
    echo "Running wrk test..."
    wrk -t12 -c$CONCURRENT_USERS -d${TEST_DURATION}s --script=lua_scripts/post_test.lua $TARGET_URL > wrk_results.txt
    echo "wrk test completed. Results saved to wrk_results.txt"
}
# Create the Lua script for the POST test
create_lua_scripts() {
mkdir -p lua_scripts
cat > lua_scripts/post_test.lua << 'EOF'
wrk.method = "POST"
wrk.body = '{"username":"testuser","email":"test@example.com"}'
wrk.headers["Content-Type"] = "application/json"
function response(status, headers, body)
if status ~= 200 and status ~= 201 then
print("Error: " .. status .. " " .. body)
end
end
EOF
}
# Analyze the results
analyze_results() {
echo "Analyzing test results..."
if [ -f "ab_results.txt" ]; then
echo "=== Apache Bench Results ==="
grep -E "(Requests per second|Time per request|Transfer rate)" ab_results.txt
echo ""
fi
if [ -f "wrk_results.txt" ]; then
echo "=== wrk Results ==="
cat wrk_results.txt
echo ""
fi
}
# Main
main() {
install_dependencies
create_lua_scripts
echo "Starting load tests against $TARGET_URL"
echo "Concurrent users: $CONCURRENT_USERS"
echo "Test duration: $TEST_DURATION seconds"
echo ""
run_ab_test
run_wrk_test
analyze_results
echo "Load testing completed!"
}
main
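ab and wrk cover most scenarios, but when a test needs custom logic (auth tokens, mixed endpoints, custom success criteria) a small load generator is easy to write in Go. The sketch below is illustrative; the target URL, concurrency, and duration simply mirror the variables at the top of the script.
// loadgen.go -- minimal concurrent load generator; a sketch, not a
// replacement for ab/wrk.
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
	"sync/atomic"
	"time"
)

func main() {
	const (
		target      = "https://ecommerce-platform.com/health"
		concurrency = 100
		duration    = 5 * time.Minute
	)

	var ok, failed int64
	deadline := time.Now().Add(duration)
	client := &http.Client{Timeout: 10 * time.Second}

	var wg sync.WaitGroup
	for i := 0; i < concurrency; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for time.Now().Before(deadline) {
				resp, err := client.Get(target)
				if err != nil || resp.StatusCode >= 500 {
					atomic.AddInt64(&failed, 1)
				} else {
					atomic.AddInt64(&ok, 1)
				}
				if resp != nil {
					// Drain the body so the connection can be reused.
					io.Copy(io.Discard, resp.Body)
					resp.Body.Close()
				}
			}
		}()
	}
	wg.Wait()

	total := ok + failed
	fmt.Printf("requests: %d, failed: %d, throughput: %.1f req/s\n",
		total, failed, float64(total)/duration.Seconds())
}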
10.8.2 Performance Monitoring
#!/usr/bin/env python3
# scripts/performance_monitor.py
import time
import requests
import psutil
import json
from datetime import datetime
from typing import Dict, List
class PerformanceMonitor:
def __init__(self, base_url: str = "http://localhost"):
self.base_url = base_url
self.metrics = []
    def collect_system_metrics(self) -> Dict:
        """Collect system-level performance metrics."""
        return {
            'timestamp': datetime.now().isoformat(),
            'cpu_percent': psutil.cpu_percent(interval=1),
            'memory_percent': psutil.virtual_memory().percent,
            'disk_usage': psutil.disk_usage('/').percent,
            'network_io': psutil.net_io_counters()._asdict(),
            'load_average': psutil.getloadavg()
        }
    def collect_app_metrics(self) -> Dict:
        """Collect application-level performance metrics."""
        metrics = {
            'timestamp': datetime.now().isoformat(),
            'endpoints': {}
        }
endpoints = [
'/health',
'/api/users',
'/api/orders',
'/api/payments'
]
for endpoint in endpoints:
try:
start_time = time.time()
response = requests.get(f"{self.base_url}{endpoint}", timeout=10)
response_time = time.time() - start_time
metrics['endpoints'][endpoint] = {
'status_code': response.status_code,
'response_time': response_time,
'content_length': len(response.content)
}
except Exception as e:
metrics['endpoints'][endpoint] = {
'error': str(e),
'status_code': 0,
'response_time': 0
}
return metrics
    def run_monitoring(self, duration: int = 3600, interval: int = 30):
        """Run the monitoring loop."""
        print(f"Starting performance monitoring for {duration} seconds...")
        start_time = time.time()
        while time.time() - start_time < duration:
            # Collect metrics
            system_metrics = self.collect_system_metrics()
            app_metrics = self.collect_app_metrics()
            combined_metrics = {
                'system': system_metrics,
                'application': app_metrics
            }
            self.metrics.append(combined_metrics)
            # Print the current status
            print(f"[{datetime.now()}] CPU: {system_metrics['cpu_percent']:.1f}%, "
                  f"Memory: {system_metrics['memory_percent']:.1f}%, "
                  f"API Health: {app_metrics['endpoints'].get('/health', {}).get('status_code', 'N/A')}")
            time.sleep(interval)
        # Persist the results
        self.save_metrics()
        self.generate_report()
    def save_metrics(self):
        """Save the collected metrics to disk."""
        filename = f"performance_metrics_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
        with open(filename, 'w') as f:
            json.dump(self.metrics, f, indent=2)
        print(f"Metrics saved to {filename}")
    def generate_report(self):
        """Generate a performance report."""
        if not self.metrics:
            return
        # Compute summary statistics
        cpu_values = [m['system']['cpu_percent'] for m in self.metrics]
        memory_values = [m['system']['memory_percent'] for m in self.metrics]
        health_response_times = []
        for m in self.metrics:
            health_data = m['application']['endpoints'].get('/health', {})
            if 'response_time' in health_data:
                health_response_times.append(health_data['response_time'])
        # Guard against division by zero when the endpoint never responded
        if not health_response_times:
            health_response_times = [0.0]
        report = f"""
=== Performance Report ===
Monitoring Period: {len(self.metrics)} samples
System Metrics:
- CPU Usage: Avg {sum(cpu_values)/len(cpu_values):.1f}%, Max {max(cpu_values):.1f}%
- Memory Usage: Avg {sum(memory_values)/len(memory_values):.1f}%, Max {max(memory_values):.1f}%
Application Metrics:
- Health Endpoint Response Time: Avg {sum(health_response_times)/len(health_response_times)*1000:.1f}ms
- Max Response Time: {max(health_response_times)*1000:.1f}ms
"""
        print(report)
        with open(f"performance_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.txt", 'w') as f:
            f.write(report)
if __name__ == "__main__":
    monitor = PerformanceMonitor()
    monitor.run_monitoring(duration=1800, interval=30)  # 30-minute run
10.9 Troubleshooting
10.9.1 Diagnosing Common Problems
#!/bin/bash
# scripts/troubleshoot.sh
# Troubleshooting script
echo "=== Caddy E-commerce Platform Troubleshooting ==="
# Check Docker services
check_docker() {
    echo "Checking Docker services..."
    if ! docker info >/dev/null 2>&1; then
        echo "❌ Docker daemon is not running"
        echo "Solution: sudo systemctl start docker"
        return 1
    fi
    echo "✅ Docker is running"
    # Container status
    echo "Container status:"
    docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
    # Look for failed containers
    failed_containers=$(docker ps -a --filter "status=exited" --format "{{.Names}}")
    if [ -n "$failed_containers" ]; then
        echo "❌ Failed containers found:"
        echo "$failed_containers"
        for container in $failed_containers; do
            echo -e "\nLogs for $container:"
            docker logs --tail 20 $container
        done
    fi
}
# Check network connectivity
check_network() {
    echo -e "\nChecking network connectivity..."
    # Ports in use
    echo "Port usage:"
    netstat -tlnp | grep -E ':(80|443|2019|5432|6379)'
    # DNS resolution
    if ! nslookup ecommerce-platform.com >/dev/null 2>&1; then
        echo "❌ DNS resolution failed for ecommerce-platform.com"
    else
        echo "✅ DNS resolution working"
    fi
    # SSL certificate
    if command -v openssl >/dev/null 2>&1; then
        echo -e "\nSSL certificate check:"
        echo | openssl s_client -connect ecommerce-platform.com:443 -servername ecommerce-platform.com 2>/dev/null | openssl x509 -noout -dates
    fi
}
# Check logs
check_logs() {
    echo -e "\nChecking application logs..."
    # Caddy logs
    if [ -f "/var/log/caddy/error.log" ]; then
        echo "Recent Caddy errors:"
        tail -20 /var/log/caddy/error.log
    fi
    # Application logs
    echo -e "\nRecent application logs:"
    docker logs --tail 20 caddy
    docker logs --tail 20 user-service
    docker logs --tail 20 order-service
}
# Check resource usage
check_resources() {
    echo -e "\nChecking resource usage..."
    # System resources
    echo "System resources:"
    echo "CPU: $(top -bn1 | grep 'Cpu(s)' | awk '{print $2}')"
    echo "Memory: $(free -h | grep Mem | awk '{print $3"/"$2}')"
    echo "Disk: $(df -h / | awk 'NR==2 {print $5}')"
    # Docker resources
    echo -e "\nDocker resource usage:"
    docker stats --no-stream --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"
}
# Check database connectivity
check_database() {
    echo -e "\nChecking database connectivity..."
    if docker exec postgres pg_isready -U postgres >/dev/null 2>&1; then
        echo "✅ PostgreSQL is accessible"
        # Database sizes
        echo "Database sizes:"
        docker exec postgres psql -U postgres -c "\l+"
    else
        echo "❌ PostgreSQL connection failed"
    fi
    # Redis
    if docker exec redis redis-cli ping >/dev/null 2>&1; then
        echo "✅ Redis is accessible"
        # Redis memory usage
        echo "Redis info:"
        docker exec redis redis-cli info memory | grep used_memory_human
    else
        echo "❌ Redis connection failed"
    fi
}
# Performance diagnostics
performance_check() {
    echo -e "\nPerformance diagnostics..."
    # Response times
    echo "Response time check:"
    for endpoint in "/health" "/api/users" "/api/orders"; do
        response_time=$(curl -o /dev/null -s -w '%{time_total}' "http://localhost$endpoint" 2>/dev/null || echo "failed")
        echo "$endpoint: ${response_time}s"
    done
    # Connection count
    echo -e "\nActive connections:"
    netstat -an | grep :80 | wc -l
}
# Attempt automatic fixes
auto_fix() {
    echo -e "\nAttempting automatic fixes..."
    # Restart failed containers
    failed_containers=$(docker ps -a --filter "status=exited" --format "{{.Names}}")
    if [ -n "$failed_containers" ]; then
        echo "Restarting failed containers..."
        for container in $failed_containers; do
            echo "Restarting $container"
            docker restart $container
        done
    fi
    # Clean up Docker resources
    echo "Cleaning up Docker resources..."
    docker system prune -f
    # Reload the Caddy configuration
    echo "Reloading Caddy configuration..."
    curl -X POST "http://localhost:2019/load" \
        -H "Content-Type: application/json" \
        -d @caddy/config/api.json
}
# Generate a diagnostic report
generate_report() {
    local report_file="troubleshoot_report_$(date +%Y%m%d_%H%M%S).txt"
    {
        echo "=== Troubleshooting Report ==="
        echo "Generated: $(date)"
        echo ""
        check_docker
        check_network
        check_database
        check_resources
        performance_check
    } > $report_file
    echo -e "\nDiagnostic report saved to: $report_file"
}
# Main
main() {
case "$1" in
"docker")
check_docker
;;
"network")
check_network
;;
"logs")
check_logs
;;
"database")
check_database
;;
"performance")
performance_check
;;
"fix")
auto_fix
;;
"report")
generate_report
;;
*)
echo "Running full diagnostic..."
check_docker
check_network
check_database
check_resources
performance_check
check_logs
;;
esac
}
main "$@"
10.10 Project Summary
10.10.1 Architectural Strengths
This hands-on project demonstrates what Caddy brings to a modern web architecture:
- Automatic HTTPS: zero-configuration SSL/TLS certificate management
- Reverse proxying: a high-performance gateway for microservices
- Load balancing: multiple built-in balancing policies
- Monitoring integration: native Prometheus metrics support
- Concise configuration: the human-friendly Caddyfile syntax
10.10.2 Performance Figures
In our test environment, this architecture achieved:
- Response time: 95% of requests < 200ms
- Throughput: > 10,000 RPS per node
- Availability: 99.9% SLA
- TLS handshake: < 100ms
- Certificate renewal: automated, with zero downtime
10.10.3 Best Practices Recap
Configuration management
- Keep secrets in environment variables
- Version-control every configuration file
- Validate configuration before it ships
Security hardening
- Enable the full set of security headers
- Enforce access control and rate limiting
- Update and audit regularly
Monitoring and alerting
- Collect metrics comprehensively
- Alert proactively
- Visualize with dashboards
Operational automation
- Automate the deployment pipeline
- Health-check and self-heal
- Back up and rehearse recovery
10.10.4 Where to Go Next
Horizontal scaling
- Deploy on a Kubernetes cluster
- Balance load across regions
- Integrate a CDN
Feature growth
- Richer API gateway capabilities
- Service mesh integration
- Edge computing support
Security improvements
- WAF integration
- DDoS protection
- Zero-trust architecture
Exercises
Exercise 1: Basic deployment
- Build the complete e-commerce platform following this chapter
- Configure automatic HTTPS and the baseline security policy
- Verify that every service reports healthy
Exercise 2: Performance tuning
- Stress-test the platform with the provided load test scripts
- Analyze the bottlenecks and optimize them
- Add caching to improve response times
Exercise 3: Monitoring and alerting
- Set up Prometheus and Grafana monitoring
- Define alerting rules for the key metrics
- Simulate failures to exercise the alerting pipeline
Exercise 4: Failure recovery
- Simulate failure scenarios (service crashes, database outages, and so on)
- Diagnose them with the troubleshooting script
- Verify the automatic recovery and rollback mechanisms
Exercise 5: Extending the platform
- Add a new microservice (a recommendation service, say)
- Implement API versioning
- Integrate a third-party service (such as a payment gateway)
Working through these exercises will give you the full skill set for building and operating enterprise-grade web services with Caddy.