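1. Deployment Architecture Overview
1.1 Deployment Modes
Kong supports several deployment modes, each suited to different scenarios and requirements:
1.1.1 Traditional Mode (Database Mode)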
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Kong 1    │     │   Kong 2    │     │   Kong 3    │
│  (Gateway)  │     │  (Gateway)  │     │  (Gateway)  │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                    ┌─────────────┐
                    │ PostgreSQL  │
                    │  Database   │
                    └─────────────┘
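Characteristics:
- Configuration is stored in the database
- Supports dynamic configuration updates
- Suited to large-scale deployments
- Requires a highly available database
1.1.2 DB-less Mode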
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Kong 1    │     │   Kong 2    │     │   Kong 3    │
│  (DB-less)  │     │  (DB-less)  │     │  (DB-less)  │
│ config.yml  │     │ config.yml  │     │ config.yml  │
└─────────────┘     └─────────────┘     └─────────────┘
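Characteristics:
- Configuration is managed through a YAML file
- No database dependency
- Configuration is immutable: changes are made by loading a new file
- Suited to cloud-native deployments
1.1.3 Hybrid Mode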
┌─────────────┐               ┌─────────────┐
│   Control   │  config sync  │    Data     │
│    Plane    │──────────────▶│    Plane    │
│ (Admin API) │               │  (Gateway)  │
└─────────────┘               └─────────────┘
       │
┌─────────────┐
│ PostgreSQL  │
│  Database   │
└─────────────┘
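Characteristics:
- The control plane and the data plane are separated
- Only the control plane connects to the database; data planes receive their configuration from the control plane
- Improves security
- Supports multi-region deployments
- Suited to enterprise scenarios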
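The three modes differ mainly in a handful of Kong settings. The following is a minimal sketch, assuming the standard `KONG_`-prefixed environment variables; the function name `kong_mode_env`, the file paths, and the control plane hostname are hypothetical placeholders rather than defaults.

```shell
#!/bin/sh
# Print the environment settings that distinguish each deployment mode.
# Paths and hostnames are illustrative placeholders.
kong_mode_env() {
  case "$1" in
    traditional)
      echo "KONG_DATABASE=postgres"
      ;;
    dbless)
      echo "KONG_DATABASE=off"
      echo "KONG_DECLARATIVE_CONFIG=/etc/kong/kong.yaml"
      ;;
    hybrid-cp)
      echo "KONG_ROLE=control_plane"
      echo "KONG_DATABASE=postgres"
      echo "KONG_CLUSTER_CERT=/etc/kong/cluster.crt"
      echo "KONG_CLUSTER_CERT_KEY=/etc/kong/cluster.key"
      ;;
    hybrid-dp)
      echo "KONG_ROLE=data_plane"
      echo "KONG_DATABASE=off"
      echo "KONG_CLUSTER_CONTROL_PLANE=control-plane.internal:8005"
      echo "KONG_CLUSTER_CERT=/etc/kong/cluster.crt"
      echo "KONG_CLUSTER_CERT_KEY=/etc/kong/cluster.key"
      ;;
    *)
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}

kong_mode_env hybrid-dp
```

Note that in this sketch the data plane runs with `KONG_DATABASE=off`: in hybrid mode it is the control plane that talks to PostgreSQL.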
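1.2 Architecture Components
1.2.1 Core Components
- Kong Gateway: the core component that handles API requests
- Kong Admin API: REST API for managing Kong configuration
- Kong Manager: web-based management UI (Enterprise edition)
- Database: stores configuration (PostgreSQL; Cassandra is supported only by older Kong releases)
1.2.2 Optional Components
- Kong Dev Portal: developer portal (Enterprise edition)
- Kong Vitals: monitoring and analytics (Enterprise edition)
- Kong Immunity: security analysis (Enterprise edition)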
2. Environment Preparation
2.1 System Requirements
2.1.1 Hardware Requirements
# Minimum configuration
minimum:
  cpu: 1 core
  memory: 1GB RAM
  disk: 10GB
  network: 100Mbps
# Recommended configuration
recommended:
  cpu: 4 cores
  memory: 8GB RAM
  disk: 100GB SSD
  network: 1Gbps
# Production environment
production:
  cpu: 8+ cores
  memory: 16GB+ RAM
  disk: 500GB+ SSD
  network: 10Gbps
2.1.2 Supported Operating Systems
# Supported operating systems
- Ubuntu 18.04, 20.04, 22.04
- CentOS 7, 8
- RHEL 7, 8, 9
- Amazon Linux 2
- Debian 9, 10, 11
- Alpine Linux
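2.1.3 Software Dependencies
# Required dependencies
- OpenResty 1.19.9+
- LuaJIT 2.1+
- OpenSSL 1.1.1+
# Database (choose one)
- PostgreSQL 9.5+
- Cassandra 3.11+
# Optional dependencies
- Redis (used by plugins such as rate limiting)
- Prometheus (monitoring)
- Grafana (visualization)
2.2 Network Planning
2.2.1 Port Planning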
ports:
  proxy:
    http: 8000
    https: 8443
  admin:
    http: 8001
    https: 8444
  manager:
    http: 8002
    https: 8445
  portal:
    http: 8003
    https: 8446
  portal_api:
    http: 8004
    https: 8447
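2.2.2 Firewall Configuration
# Open the public proxy ports
sudo ufw allow 8000/tcp # Proxy HTTP
sudo ufw allow 8443/tcp # Proxy HTTPS
# Admin API: internal network only (a bare "ufw allow 8001/tcp" would expose it to everyone)
sudo ufw allow from 10.0.0.0/8 to any port 8001 proto tcp # Admin API HTTP
sudo ufw allow from 10.0.0.0/8 to any port 8444 proto tcp # Admin API HTTPS
# Database ports (internal network only)
sudo ufw allow from 10.0.0.0/8 to any port 5432 proto tcp # PostgreSQL
sudo ufw allow from 10.0.0.0/8 to any port 9042 proto tcp # Cassandra
# Enable the firewall (make sure SSH access is allowed first)
sudo ufw enable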
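After applying the rules it is worth confirming that the Admin API ports are not open to everyone. The sketch below reads `ufw status` output on standard input and flags Admin API ports (8001, 8444) allowed from "Anywhere". The function name `find_public_admin_rules` is hypothetical, and the parsing assumes the usual three-column `ufw status` layout.

```shell
#!/bin/sh
# Read `ufw status` output on stdin; print Admin API ports that are allowed
# from anywhere. Returns 1 if any such rule is found, 0 otherwise.
find_public_admin_rules() {
  awk '
    $1 ~ /^(8001|8444)(\/tcp)?$/ && /ALLOW/ && /Anywhere/ {
      print "publicly reachable admin port: " $1
      found = 1
    }
    END { exit (found ? 1 : 0) }
  '
}

# Sample input shaped like `ufw status` output; on a real host use:
#   sudo ufw status | find_public_admin_rules
printf '%s\n' \
  '8000/tcp                   ALLOW       Anywhere' \
  '8001/tcp                   ALLOW       10.0.0.0/8' \
  | find_public_admin_rules && echo "admin ports are restricted"
```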
3. Single-Node Deployment
3.1 Docker Deployment
3.1.1 Basic Docker Deployment
# Create the network
docker network create kong-net
# Start PostgreSQL
docker run -d --name kong-database \
--network=kong-net \
-p 5432:5432 \
-e "POSTGRES_USER=kong" \
-e "POSTGRES_PASSWORD=kong" \
-e "POSTGRES_DB=kong" \
postgres:13
# Wait for the database to start (a fixed delay; polling pg_isready is more reliable)
sleep 30
# Initialize the database
docker run --rm \
--network=kong-net \
-e "KONG_DATABASE=postgres" \
-e "KONG_PG_HOST=kong-database" \
-e "KONG_PG_USER=kong" \
-e "KONG_PG_PASSWORD=kong" \
-e "KONG_PG_DATABASE=kong" \
kong:latest kong migrations bootstrap
# Start Kong
docker run -d --name kong \
--network=kong-net \
-e "KONG_DATABASE=postgres" \
-e "KONG_PG_HOST=kong-database" \
-e "KONG_PG_USER=kong" \
-e "KONG_PG_PASSWORD=kong" \
-e "KONG_PG_DATABASE=kong" \
-e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
-e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
-e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
-e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
-e "KONG_ADMIN_LISTEN=0.0.0.0:8001, 0.0.0.0:8444 ssl" \
-p 8000:8000 \
-p 8443:8443 \
-p 8001:8001 \
-p 8444:8444 \
kong:latest
# Verify the deployment
curl -i http://localhost:8001/
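The fixed `sleep 30` above either wastes time or is too short on a slow host. A minimal sketch of a polling helper follows; the name `wait_for` is hypothetical, and the example command assumes the `kong-database` container started above.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# usage: wait_for <max_attempts> <delay_seconds> <command...>
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (requires the kong-database container from the steps above):
# wait_for 30 2 docker exec kong-database pg_isready -U kong
```

With this helper, the migration step can start as soon as PostgreSQL accepts connections, and the script fails clearly if it never does.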
3.1.2 Docker Compose Deployment
# docker-compose.yml
version: '3.8'
services:
  kong-database:
    image: postgres:13
    container_name: kong-database
    environment:
      POSTGRES_USER: kong
      POSTGRES_PASSWORD: kong
      POSTGRES_DB: kong
    volumes:
      - kong_data:/var/lib/postgresql/data
    networks:
      - kong-net
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U kong"]
      interval: 30s
      timeout: 10s
      retries: 3
  kong-migration:
    image: kong:latest
    container_name: kong-migration
    command: kong migrations bootstrap
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong
      KONG_PG_DATABASE: kong
    depends_on:
      kong-database:
        condition: service_healthy
    networks:
      - kong-net
    restart: "no"
  kong:
    image: kong:latest
    container_name: kong
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong
      KONG_PG_DATABASE: kong
      KONG_PROXY_ACCESS_LOG: /dev/stdout
      KONG_ADMIN_ACCESS_LOG: /dev/stdout
      KONG_PROXY_ERROR_LOG: /dev/stderr
      KONG_ADMIN_ERROR_LOG: /dev/stderr
      KONG_ADMIN_LISTEN: "0.0.0.0:8001, 0.0.0.0:8444 ssl"
      KONG_ADMIN_GUI_URL: http://localhost:8002
    ports:
      - "8000:8000"
      - "8443:8443"
      - "8001:8001"
      - "8444:8444"
    depends_on:
      # Start Kong only after the migrations have finished
      kong-migration:
        condition: service_completed_successfully
    networks:
      - kong-net
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "kong", "health"]
      interval: 30s
      timeout: 10s
      retries: 3
  # Konga (optional): a third-party admin UI, not the official Kong Manager
  kong-manager:
    image: pantsel/konga:latest
    container_name: kong-manager
    environment:
      NODE_ENV: production
      KONGA_HOOK_TIMEOUT: 120000
    ports:
      - "1337:1337"
    depends_on:
      - kong
    networks:
      - kong-net
    restart: unless-stopped
volumes:
  kong_data:
networks:
  kong-net:
    driver: bridge
# Start the services
docker-compose up -d
# Check service status
docker-compose ps
# View the logs
docker-compose logs kong
# Stop the services
docker-compose down
3.2 Native Installation
3.2.1 Ubuntu/Debian Installation
# Add the Kong repository
curl -fsSL https://download.konghq.com/gateway-2.x-ubuntu-$(lsb_release -cs)/gpg | sudo apt-key add -
echo "deb https://download.konghq.com/gateway-2.x-ubuntu-$(lsb_release -cs)/ default all" | sudo tee /etc/apt/sources.list.d/kong.list
# Update the package list
sudo apt update
# Install Kong
sudo apt install kong
# Install PostgreSQL
sudo apt install postgresql postgresql-contrib
# Configure PostgreSQL
sudo -u postgres createuser kong
sudo -u postgres createdb kong --owner kong
sudo -u postgres psql -c "ALTER USER kong PASSWORD 'kong';"
# Configure Kong
sudo cp /etc/kong/kong.conf.default /etc/kong/kong.conf
sudo vim /etc/kong/kong.conf
# Initialize the database
sudo kong migrations bootstrap -c /etc/kong/kong.conf
# Start Kong (option A: directly)
sudo kong start -c /etc/kong/kong.conf
# Option B: through systemd, which also starts Kong at boot.
# Use this instead of option A, not in addition to it, and only if the
# installed package provides a kong systemd unit.
# sudo systemctl enable kong
# sudo systemctl start kong
3.2.2 CentOS/RHEL Installation
# Download the Kong package
sudo yum install -y wget
wget https://download.konghq.com/gateway-2.x-centos-8/Packages/k/kong-2.8.1.el8.amd64.rpm
# Install Kong
sudo yum install -y kong-2.8.1.el8.amd64.rpm
# Install PostgreSQL
sudo yum install -y postgresql-server postgresql-contrib
# Initialize PostgreSQL (on CentOS/RHEL 7 the syntax is "postgresql-setup initdb")
sudo postgresql-setup --initdb
sudo systemctl enable postgresql
sudo systemctl start postgresql
# Configure PostgreSQL
sudo -u postgres createuser kong
sudo -u postgres createdb kong --owner kong
sudo -u postgres psql -c "ALTER USER kong PASSWORD 'kong';"
# Edit pg_hba.conf
sudo vim /var/lib/pgsql/data/pg_hba.conf
# Add the following above the default rules. Kong connects over TCP to 127.0.0.1
# (see pg_host below), so a "host" entry is needed in addition to the "local" one:
# local kong kong md5
# host kong kong 127.0.0.1/32 md5
# Restart PostgreSQL
sudo systemctl restart postgresql
# Configure Kong
sudo cp /etc/kong/kong.conf.default /etc/kong/kong.conf
# Edit the configuration file
sudo vim /etc/kong/kong.conf
3.2.3 Kong Configuration File
# /etc/kong/kong.conf
# Database settings
database = postgres
pg_host = 127.0.0.1
pg_port = 5432
pg_user = kong
pg_password = kong
pg_database = kong
# Proxy settings
proxy_listen = 0.0.0.0:8000, 0.0.0.0:8443 ssl
# Admin API settings
admin_listen = 127.0.0.1:8001, 127.0.0.1:8444 ssl
# Logging settings
proxy_access_log = /var/log/kong/access.log
proxy_error_log = /var/log/kong/error.log
admin_access_log = /var/log/kong/admin_access.log
admin_error_log = /var/log/kong/admin_error.log
# Performance settings (Nginx directives injected into the main and events blocks)
nginx_main_worker_processes = auto
nginx_events_worker_connections = 1024
# Plugin settings
plugins = bundled
# SSL settings
ssl_cert = /etc/kong/ssl/kong.crt
ssl_cert_key = /etc/kong/ssl/kong.key
# Other settings
log_level = notice
mem_cache_size = 128m
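Before starting Kong it can help to read individual properties back from the file, for example to confirm that `admin_listen` is bound to the loopback address. This is a minimal sketch of a `key = value` reader; `conf_get` is a hypothetical helper, not a Kong command, and it simply returns the last uncommented assignment it finds.

```shell
#!/bin/sh
# Print the value of a "key = value" property from a kong.conf-style file.
# Comment lines are ignored; the last matching assignment is returned.
# usage: conf_get <file> <key>
conf_get() {
  awk -v key="$2" '
    /^[ \t]*#/ { next }
    {
      pos = index($0, "=")
      if (pos == 0) next
      k = substr($0, 1, pos - 1)
      v = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (k == key) value = v
    }
    END { if (value != "") print value }
  ' "$1"
}

# Example:
# conf_get /etc/kong/kong.conf admin_listen
```

Kong's own `kong check /etc/kong/kong.conf` command validates the whole file and is the better choice for a full syntax check.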
4. Cluster Deployment
4.1 High-Availability Architecture
4.1.1 Multi-Node Cluster
                   ┌─────────────┐
                   │    Load     │
                   │  Balancer   │
                   └─────────────┘
                          │
       ┌──────────────────┼──────────────────┐
       │                  │                  │
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Kong 1    │    │   Kong 2    │    │   Kong 3    │
│  (Node 1)   │    │  (Node 2)   │    │  (Node 3)   │
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       └──────────────────┼──────────────────┘
                          │
                   ┌─────────────┐
                   │ PostgreSQL  │
                   │   Cluster   │
                   └─────────────┘
4.1.2 Database High Availability
# PostgreSQL primary/replica topology (a planning outline, not a Kong configuration file)
postgresql_cluster:
  master:
    host: pg-master.internal
    port: 5432
    user: kong
    password: kong_password
    database: kong
  slaves:
    - host: pg-slave1.internal
      port: 5432
    - host: pg-slave2.internal
      port: 5432
  connection_pool:
    max_connections: 100
    idle_timeout: 300
    max_lifetime: 3600
4.2 Kubernetes Deployment
4.2.1 Namespace and ConfigMap
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: kong
---
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kong-config
  namespace: kong
data:
  kong.conf: |
    database = postgres
    pg_host = postgres-service
    pg_port = 5432
    pg_user = kong
    pg_password = kong
    pg_database = kong
    proxy_listen = 0.0.0.0:8000, 0.0.0.0:8443 ssl
    admin_listen = 0.0.0.0:8001, 0.0.0.0:8444 ssl
    log_level = notice
    proxy_access_log = /dev/stdout
    proxy_error_log = /dev/stderr
    admin_access_log = /dev/stdout
    admin_error_log = /dev/stderr
4.2.2 PostgreSQL Deployment
# postgres-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
  namespace: kong
type: Opaque
data:
  username: a29uZw== # kong
  password: a29uZw== # kong
---
# postgres-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: kong
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: fast-ssd
---
# postgres-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: kong
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:13
          env:
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: username
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
            - name: POSTGRES_DB
              value: kong
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - kong
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - pg_isready
                - -U
                - kong
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-pvc
---
# postgres-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  namespace: kong
spec:
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432
  type: ClusterIP
4.2.3 Kong Deployment
# kong-migration-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: kong-migration
  namespace: kong
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: kong-migration
          image: kong:latest
          command: ["kong", "migrations", "bootstrap"]
          env:
            - name: KONG_DATABASE
              value: postgres
            - name: KONG_PG_HOST
              value: postgres-service
            - name: KONG_PG_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: username
            - name: KONG_PG_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
            - name: KONG_PG_DATABASE
              value: kong
---
# kong-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kong
  namespace: kong
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kong
  template:
    metadata:
      labels:
        app: kong
    spec:
      containers:
        - name: kong
          image: kong:latest
          env:
            - name: KONG_DATABASE
              value: postgres
            - name: KONG_PG_HOST
              value: postgres-service
            - name: KONG_PG_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: username
            - name: KONG_PG_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
            - name: KONG_PG_DATABASE
              value: kong
            - name: KONG_PROXY_ACCESS_LOG
              value: /dev/stdout
            - name: KONG_ADMIN_ACCESS_LOG
              value: /dev/stdout
            - name: KONG_PROXY_ERROR_LOG
              value: /dev/stderr
            - name: KONG_ADMIN_ERROR_LOG
              value: /dev/stderr
            - name: KONG_ADMIN_LISTEN
              value: 0.0.0.0:8001
          ports:
            - containerPort: 8000
              name: proxy
            - containerPort: 8443
              name: proxy-ssl
            - containerPort: 8001
              name: admin
            - containerPort: 8444
              name: admin-ssl
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /status
              port: 8001
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /status/ready
              port: 8001
            initialDelaySeconds: 5
            periodSeconds: 5
          volumeMounts:
            - name: kong-config
              mountPath: /etc/kong
      volumes:
        - name: kong-config
          configMap:
            name: kong-config
---
# kong-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kong-proxy
  namespace: kong
spec:
  type: LoadBalancer
  ports:
    - name: proxy
      port: 80
      targetPort: 8000
      protocol: TCP
    - name: proxy-ssl
      port: 443
      targetPort: 8443
      protocol: TCP
  selector:
    app: kong
---
apiVersion: v1
kind: Service
metadata:
  name: kong-admin
  namespace: kong
spec:
  type: ClusterIP
  ports:
    - name: admin
      port: 8001
      targetPort: 8001
      protocol: TCP
  selector:
    app: kong
4.2.4 HPA Autoscaling
# kong-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kong-hpa
  namespace: kong
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kong
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
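To see what the utilization targets above mean in practice, the core HPA calculation can be worked through by hand: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the configured minimum and maximum. The sketch below implements only that formula; it ignores the tolerance band, the stabilization windows, and the scaling policies, and the function name `hpa_desired` is hypothetical.

```shell
#!/bin/sh
# usage: hpa_desired <current_replicas> <current_util_pct> <target_util_pct> <min> <max>
hpa_desired() {
  # Integer ceiling of current_replicas * current_util / target_util
  desired=$(( ($1 * $2 + $3 - 1) / $3 ))
  # Clamp to the configured replica bounds
  [ "$desired" -lt "$4" ] && desired=$4
  [ "$desired" -gt "$5" ] && desired=$5
  echo "$desired"
}

# 3 replicas at 90% CPU against the 70% target: ceil(3 * 90 / 70) = 4
hpa_desired 3 90 70 3 10
```

When several metrics are configured, as with CPU and memory above, the HPA evaluates each one and uses the largest result.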
4.3 Helm Deployment
4.3.1 Helm Chart Installation
# Add the Kong Helm repository
helm repo add kong https://charts.konghq.com
helm repo update
# Create the values file
cat > kong-values.yaml << EOF
image:
  repository: kong
  tag: "latest"
env:
  database: postgres
  pg_host: postgres-service
  pg_user: kong
  pg_password: kong
  pg_database: kong
proxy:
  enabled: true
  type: LoadBalancer
  http:
    enabled: true
    servicePort: 80
    containerPort: 8000
  tls:
    enabled: true
    servicePort: 443
    containerPort: 8443
admin:
  enabled: true
  type: ClusterIP
  http:
    enabled: true
    servicePort: 8001
    containerPort: 8001
replicaCount: 3
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 250m
    memory: 256Mi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
postgresql:
  enabled: true
  auth:
    username: kong
    password: kong
    database: kong
  primary:
    persistence:
      enabled: true
      size: 20Gi
EOF
# Install Kong
helm install kong kong/kong -f kong-values.yaml -n kong --create-namespace
# Check the deployment status
helm status kong -n kong
kubectl get pods -n kong
# Upgrade Kong
helm upgrade kong kong/kong -f kong-values.yaml -n kong
# Uninstall Kong
helm uninstall kong -n kong
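5. Configuration Management
5.1 Configuration File Management
5.1.1 Separating Configuration by Environment
# Development environment configuration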
# /etc/kong/kong-dev.conf
database = postgres
pg_host = dev-postgres.internal
pg_user = kong_dev
pg_password = dev_password
pg_database = kong_dev
log_level = debug
proxy_listen = 0.0.0.0:8000
admin_listen = 0.0.0.0:8001
# Test environment configuration
# /etc/kong/kong-test.conf
database = postgres
pg_host = test-postgres.internal
pg_user = kong_test
pg_password = test_password
pg_database = kong_test
log_level = info
proxy_listen = 0.0.0.0:8000
admin_listen = 127.0.0.1:8001
# Production environment configuration
# /etc/kong/kong-prod.conf
database = postgres
pg_host = prod-postgres.internal
pg_user = kong_prod
pg_password = prod_password
pg_database = kong_prod
log_level = warn
proxy_listen = 0.0.0.0:8000, 0.0.0.0:8443 ssl
admin_listen = 127.0.0.1:8001, 127.0.0.1:8444 ssl
5.1.2 Environment Variable Configuration
# Environment variable file
# /etc/kong/kong.env
# Database settings
export KONG_DATABASE=postgres
export KONG_PG_HOST=${DB_HOST:-localhost}
export KONG_PG_PORT=${DB_PORT:-5432}
export KONG_PG_USER=${DB_USER:-kong}
export KONG_PG_PASSWORD=${DB_PASSWORD}
export KONG_PG_DATABASE=${DB_NAME:-kong}
# Proxy settings
export KONG_PROXY_LISTEN="0.0.0.0:8000, 0.0.0.0:8443 ssl"
export KONG_ADMIN_LISTEN="127.0.0.1:8001, 127.0.0.1:8444 ssl"
# Logging settings
export KONG_LOG_LEVEL=${LOG_LEVEL:-notice}
export KONG_PROXY_ACCESS_LOG=/var/log/kong/access.log
export KONG_PROXY_ERROR_LOG=/var/log/kong/error.log
# Performance settings (Nginx directives injected into the main and events blocks)
export KONG_NGINX_MAIN_WORKER_PROCESSES=${WORKER_PROCESSES:-auto}
export KONG_NGINX_EVENTS_WORKER_CONNECTIONS=${WORKER_CONNECTIONS:-1024}
# Plugin settings
export KONG_PLUGINS=${KONG_PLUGINS:-bundled}
# SSL settings
export KONG_SSL_CERT=${SSL_CERT_PATH}
export KONG_SSL_CERT_KEY=${SSL_KEY_PATH}
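The variable names above follow one rule: a `kong.conf` property becomes an environment variable by upper-casing it and adding the `KONG_` prefix. A small sketch of that mapping follows; `kong_env_name` is a hypothetical helper written for illustration.

```shell
#!/bin/sh
# Map a kong.conf property name to its environment variable name.
kong_env_name() {
  printf 'KONG_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

kong_env_name proxy_listen   # prints KONG_PROXY_LISTEN
```

Values set through the environment take precedence over the same property in the configuration file, which is what makes a per-environment `kong.env` workable.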
5.2 Declarative Configuration
5.2.1 YAML Configuration File
# kong.yaml
_format_version: "2.1"
_transform: true
services:
  - name: user-service
    url: http://user-service.internal:8080
    tags:
      - production
      - user
    routes:
      - name: user-api
        paths:
          - /api/users
        methods:
          - GET
          - POST
          - PUT
          - DELETE
        strip_path: false
        preserve_host: false
        tags:
          - api
          - user
  - name: order-service
    url: http://order-service.internal:8080
    tags:
      - production
      - order
    routes:
      - name: order-api
        paths:
          - /api/orders
        methods:
          - GET
          - POST
          - PUT
          - DELETE
        strip_path: false
        tags:
          - api
          - order
# Note: a service only uses an upstream when the service host equals the
# upstream name (here: user-service-upstream).
upstreams:
  - name: user-service-upstream
    algorithm: round-robin
    hash_on: none
    hash_fallback: none
    healthchecks:
      active:
        healthy:
          interval: 5
          successes: 3
        unhealthy:
          interval: 5
          tcp_failures: 3
          http_failures: 3
      passive:
        healthy:
          successes: 3
        unhealthy:
          tcp_failures: 3
          http_failures: 3
    targets:
      - target: user-service-1.internal:8080
        weight: 100
      - target: user-service-2.internal:8080
        weight: 100
      - target: user-service-3.internal:8080
        weight: 100
consumers:
  - username: api-client
    custom_id: client-001
    tags:
      - external
      - api-client
  - username: mobile-app
    custom_id: mobile-001
    tags:
      - mobile
      - app
plugins:
  - name: rate-limiting
    config:
      minute: 1000
      hour: 10000
      policy: local
    tags:
      - rate-limiting
      - global
  - name: prometheus
    config:
      per_consumer: true
      status_code_metrics: true
      latency_metrics: true
      bandwidth_metrics: true
    tags:
      - monitoring
      - prometheus
  - name: cors
    config:
      origins:
        - "*"
      methods:
        - GET
        - POST
        - PUT
        - DELETE
        - OPTIONS
      headers:
        - Accept
        - Accept-Version
        - Content-Length
        - Content-MD5
        - Content-Type
        - Date
        - X-Auth-Token
      exposed_headers:
        - X-Auth-Token
      credentials: true
      max_age: 3600
    tags:
      - cors
      - security
  - name: key-auth
    service: user-service
    config:
      key_names:
        - apikey
        - x-api-key
      key_in_body: false
      hide_credentials: true
    tags:
      - authentication
      - api-key
# key-auth credentials use the keyauth_credentials entity in declarative configuration
keyauth_credentials:
  - consumer: api-client
    key: client-api-key-12345
    tags:
      - api-client
      - production
  - consumer: mobile-app
    key: mobile-api-key-67890
    tags:
      - mobile-app
      - production
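5.2.2 Validating and Applying Configuration
# Validate the configuration file
kong config parse kong.yaml
# Apply the configuration (DB-less mode). The declarative file is set through the
# declarative_config property, here via its environment variable.
KONG_DATABASE=off KONG_DECLARATIVE_CONFIG=kong.yaml kong start -c kong.conf
# Reload the configuration
KONG_DATABASE=off KONG_DECLARATIVE_CONFIG=kong.yaml kong reload -c kong.conf
# Apply the configuration through the Admin API
curl -X POST http://localhost:8001/config \
  -F config=@kong.yaml
# Inspect the current configuration
curl -s http://localhost:8001/config | jq .
5.3 Configuration Version Control
5.3.1 Git Version Control
# Initialize the configuration repository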
mkdir kong-config
cd kong-config
git init
# Create the directory structure
mkdir -p environments/{dev,test,prod}
mkdir -p services
mkdir -p plugins
mkdir -p consumers
# Environment-specific configuration
cat > environments/dev/kong.yaml << EOF
_format_version: "2.1"
_transform: true
services:
  - name: user-service
    url: http://user-service-dev.internal:8080
    tags: ["dev", "user"]
EOF
cat > environments/prod/kong.yaml << EOF
_format_version: "2.1"
_transform: true
services:
  - name: user-service
    url: http://user-service-prod.internal:8080
    tags: ["prod", "user"]
EOF
# Commit the configuration
git add .
git commit -m "Initial Kong configuration"
# Create a branch
git checkout -b feature/new-service
# Modify the configuration...
git add .
git commit -m "Add new service configuration"
# Merge into the main branch
git checkout main
git merge feature/new-service
5.3.2 Configuration Deployment Script
#!/bin/bash
# deploy-config.sh
set -e
ENVIRONMENT=${1:-dev}
CONFIG_FILE="environments/${ENVIRONMENT}/kong.yaml"
KONG_ADMIN_URL=${KONG_ADMIN_URL:-http://localhost:8001}
if [ ! -f "$CONFIG_FILE" ]; then
  echo "Error: Configuration file $CONFIG_FILE not found"
  exit 1
fi
echo "Validating configuration for $ENVIRONMENT..."
kong config parse "$CONFIG_FILE"
echo "Backing up current configuration..."
# -f makes curl exit non-zero on HTTP errors, so "set -e" stops the script
curl -fsS "$KONG_ADMIN_URL/config" > "backup-$(date +%Y%m%d-%H%M%S).json"
echo "Applying configuration to $ENVIRONMENT..."
curl -fsS -X POST "$KONG_ADMIN_URL/config" \
  -F "config=@$CONFIG_FILE"
echo "Verifying deployment..."
sleep 5
curl -fsS "$KONG_ADMIN_URL/status" | jq .
echo "Configuration deployed successfully to $ENVIRONMENT"
6. Monitoring and Logging
6.1 Monitoring
6.1.1 Prometheus Monitoring
# prometheus-config.yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  - "kong-alerts.yml"
scrape_configs:
  - job_name: 'kong'
    static_configs:
      - targets: ['kong:8001']
    metrics_path: '/metrics'
    scrape_interval: 5s
    scrape_timeout: 5s
  - job_name: 'kong-cluster'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - kong
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: kong
      - source_labels: [__meta_kubernetes_pod_ip]
        target_label: __address__
        replacement: ${1}:8001
6.1.2 Grafana Dashboard
{
  "dashboard": {
    "title": "Kong Gateway Monitoring",
    "panels": [
      {
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "sum by (service) (rate(kong_http_requests_total[5m]))",
            "legendFormat": "{{service}}"
          }
        ]
      },
      {
        "title": "Response Time",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, sum by (le) (rate(kong_request_latency_ms_bucket[5m])))",
            "legendFormat": "95th percentile"
          },
          {
            "expr": "histogram_quantile(0.50, sum by (le) (rate(kong_request_latency_ms_bucket[5m])))",
            "legendFormat": "50th percentile"
          }
        ]
      },
      {
        "title": "Error Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(rate(kong_http_requests_total{code=~\"5..\"}[5m]))",
            "legendFormat": "5xx errors"
          },
          {
            "expr": "sum(rate(kong_http_requests_total{code=~\"4..\"}[5m]))",
            "legendFormat": "4xx errors"
          }
        ]
      }
    ]
  }
}
6.2 Log Management
6.2.1 Log Configuration
# Kong log configuration
# /etc/kong/kong.conf
# Error logs
proxy_error_log = /var/log/kong/error.log
admin_error_log = /var/log/kong/admin_error.log
# Log level
log_level = notice
# Custom log format: define a named format (here "kong_custom"), then reference it
# from the access log directive
nginx_http_log_format = kong_custom '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_time $upstream_response_time'
# Access logs
proxy_access_log = /var/log/kong/access.log kong_custom
admin_access_log = /var/log/kong/admin_access.log
6.2.2 Logrotate Configuration
# /etc/logrotate.d/kong
/var/log/kong/*.log {
daily
missingok
rotate 30
compress
delaycompress
notifempty
create 644 kong kong
postrotate
/usr/local/bin/kong reload > /dev/null 2>&1 || true
endscript
}
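6.2.3 ELK Integration
# filebeat.yml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/kong/access.log
    fields:
      service: kong
      log_type: access
    fields_under_root: true
  - type: log
    enabled: true
    paths:
      - /var/log/kong/error.log
    fields:
      service: kong
      log_type: error
    fields_under_root: true
    # Error log entries start with a date (YYYY/MM/DD); lines that do not are
    # continuations of the previous entry. Access log lines are single-line.
    multiline.pattern: '^\d{4}/\d{2}/\d{2}'
    multiline.negate: true
    multiline.match: after
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  index: "kong-logs-%{+yyyy.MM.dd}"
# A custom index name also requires a matching template name and pattern
setup.template.name: "kong-logs"
setup.template.pattern: "kong-logs-*"
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~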
6.3 Alerting
6.3.1 Prometheus Alert Rules
# kong-alerts.yml
# Metric names follow the Kong 3.x Prometheus plugin; adjust them for older versions.
groups:
  - name: kong.rules
    rules:
      - alert: KongHighErrorRate
        expr: |
          (
            sum(rate(kong_http_requests_total{code=~"5.."}[5m])) /
            sum(rate(kong_http_requests_total[5m]))
          ) * 100 > 5
        for: 2m
        labels:
          severity: critical
          service: kong
        annotations:
          summary: "Kong high error rate detected"
          description: "Kong error rate is {{ $value }}% for the last 5 minutes"
      - alert: KongHighLatency
        expr: |
          histogram_quantile(0.95, sum by (le) (rate(kong_request_latency_ms_bucket[5m]))) > 1000
        for: 5m
        labels:
          severity: warning
          service: kong
        annotations:
          summary: "Kong high latency detected"
          description: "Kong 95th percentile latency is {{ $value }}ms"
      - alert: KongServiceDown
        expr: up{job="kong"} == 0
        for: 1m
        labels:
          severity: critical
          service: kong
        annotations:
          summary: "Kong service is down"
          description: "Kong service has been down for more than 1 minute"
      - alert: KongDatabaseConnectionFailed
        expr: kong_datastore_reachable == 0
        for: 30s
        labels:
          severity: critical
          service: kong
        annotations:
          summary: "Kong database connection failed"
          description: "Kong cannot connect to the database"
      - alert: KongMemoryUsageHigh
        # Used versus total bytes of each Lua shared dictionary. Dividing a metric
        # by itself always yields 100% and would fire permanently.
        expr: |
          (
            kong_memory_lua_shared_dict_bytes /
            kong_memory_lua_shared_dict_total_bytes
          ) * 100 > 80
        for: 5m
        labels:
          severity: warning
          service: kong
        annotations:
          summary: "Kong memory usage is high"
          description: "Kong memory usage is {{ $value }}%"
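The error-rate rule boils down to simple arithmetic: the share of 5xx responses in all responses, as a percentage, compared with a threshold. A worked sketch with plain counters follows; `error_rate_exceeds` is a hypothetical helper for checking a threshold by hand, not part of Prometheus.

```shell
#!/bin/sh
# Succeeds when (errors / total) * 100 is strictly greater than the threshold.
# usage: error_rate_exceeds <5xx_count> <total_count> <threshold_pct>
error_rate_exceeds() {
  awk -v e="$1" -v t="$2" -v th="$3" 'BEGIN {
    if (t == 0) exit 1   # no traffic, nothing to alert on
    exit (((e / t) * 100 > th) ? 0 : 1)
  }'
}

# 6 errors in 100 requests is 6%, above the 5% threshold
if error_rate_exceeds 6 100 5; then
  echo "error rate above threshold"
fi
```

As in the alert rule, the comparison is strict: exactly 5% does not trigger.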
6.3.2 Alertmanager Configuration
# alertmanager.yml
global:
  smtp_smarthost: 'smtp.company.com:587'
  smtp_from: 'alerts@company.com'
route:
  group_by: ['alertname', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
  routes:
    - match:
        severity: critical
      receiver: 'critical-alerts'
    - match:
        severity: warning
      receiver: 'warning-alerts'
receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://webhook-service:8080/alerts'
  - name: 'critical-alerts'
    email_configs:
      # email_configs has no "subject" or "body" field: the subject is set
      # through "headers" and the plain-text body through "text".
      - to: 'ops-team@company.com'
        headers:
          Subject: 'CRITICAL: Kong Alert - {{ .GroupLabels.alertname }}'
        text: |
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          {{ end }}
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#ops-alerts'
        title: 'CRITICAL Kong Alert'
        text: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
  - name: 'warning-alerts'
    email_configs:
      - to: 'dev-team@company.com'
        headers:
          Subject: 'WARNING: Kong Alert - {{ .GroupLabels.alertname }}'
        text: |
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          {{ end }}
7. Backup and Recovery
7.1 Database Backup
7.1.1 PostgreSQL Backup
#!/bin/bash
# backup-postgres.sh
set -e
# Configuration variables
DB_HOST=${DB_HOST:-localhost}
DB_PORT=${DB_PORT:-5432}
DB_USER=${DB_USER:-kong}
DB_NAME=${DB_NAME:-kong}
BACKUP_DIR=${BACKUP_DIR:-/backup/kong}
RETENTION_DAYS=${RETENTION_DAYS:-30}
# DB_PASSWORD has no default and must be provided through the environment
: "${DB_PASSWORD:?DB_PASSWORD must be set}"
# Create the backup directory
mkdir -p "$BACKUP_DIR"
# Build the backup file name
# (the dump is a custom-format archive for pg_restore, despite the .sql extension)
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
BACKUP_FILE="$BACKUP_DIR/kong_backup_$TIMESTAMP.sql"
echo "Starting Kong database backup..."
# Run the backup. The command is tested directly in the "if" because, with
# "set -e", a separate "$?" check would never be reached after a failure.
if PGPASSWORD="$DB_PASSWORD" pg_dump \
  -h "$DB_HOST" \
  -p "$DB_PORT" \
  -U "$DB_USER" \
  -d "$DB_NAME" \
  --verbose \
  --no-password \
  --format=custom \
  --file="$BACKUP_FILE"; then
  echo "Backup completed successfully: $BACKUP_FILE"
  # Compress the backup file
  gzip "$BACKUP_FILE"
  echo "Backup compressed: $BACKUP_FILE.gz"
  # Remove old backups
  find "$BACKUP_DIR" -name "kong_backup_*.sql.gz" -mtime +$RETENTION_DAYS -delete
  echo "Old backups cleaned up (older than $RETENTION_DAYS days)"
else
  echo "Backup failed!"
  exit 1
fi
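A backup is only useful if it can be read back. The sketch below adds a cheap integrity check that can run right after the backup script: the file must exist, be non-empty, and pass gzip's built-in test. The function name `verify_backup` is hypothetical.

```shell
#!/bin/sh
# Return 0 when the file exists, is non-empty, and passes gzip's integrity test.
# usage: verify_backup <backup_file.gz>
verify_backup() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Example:
# verify_backup /backup/kong/kong_backup_20240101_020000.sql.gz || echo "backup is corrupt"
```

This only proves that the compressed file is intact. For a stronger check, decompress the archive and list its contents with `pg_restore --list`, or restore it into a scratch database.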
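7.1.2 Automated Backup
# Add to crontab
# crontab -e
# Run the backup every day at 02:00
0 2 * * * /opt/kong/scripts/backup-postgres.sh >> /var/log/kong/backup.log 2>&1
# Run a full backup every Sunday at 03:00
# (note: backup-postgres.sh as written above does not read the "full" argument)
0 3 * * 0 /opt/kong/scripts/backup-postgres.sh full >> /var/log/kong/backup.log 2>&1
7.1.3 Cloud Storage Backup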
#!/bin/bash
# backup-to-s3.sh
set -e
# AWS S3 settings
S3_BUCKET=${S3_BUCKET:-kong-backups}
S3_PREFIX=${S3_PREFIX:-database}
AWS_REGION=${AWS_REGION:-us-west-2}
# Local backup directory
BACKUP_DIR=${BACKUP_DIR:-/backup/kong}
# Upload the most recent backup to S3
LATEST_BACKUP=$(ls -t "$BACKUP_DIR"/kong_backup_*.sql.gz 2>/dev/null | head -1)
if [ -f "$LATEST_BACKUP" ]; then
  echo "Uploading backup to S3: $LATEST_BACKUP"
  # Test the command directly: with "set -e" a later "$?" check is never reached on failure
  if aws s3 cp "$LATEST_BACKUP" \
    "s3://$S3_BUCKET/$S3_PREFIX/$(basename "$LATEST_BACKUP")" \
    --region "$AWS_REGION" \
    --storage-class STANDARD_IA; then
    echo "Backup uploaded successfully to S3"
  else
    echo "Failed to upload backup to S3"
    exit 1
  fi
else
  echo "No backup file found"
  exit 1
fi
# Remove old backups from S3 (keep 30 days).
# "--output text" prints all matching keys tab-separated on one line, so split
# them into one key per line; an empty result is printed as "None".
aws s3api list-objects-v2 \
  --bucket "$S3_BUCKET" \
  --prefix "$S3_PREFIX/" \
  --query "Contents[?LastModified<='$(date -d '30 days ago' --iso-8601)'].Key" \
  --output text | tr '\t' '\n' | \
while read -r key; do
  if [ -n "$key" ] && [ "$key" != "None" ]; then
    echo "Deleting old backup: $key"
    aws s3 rm "s3://$S3_BUCKET/$key"
  fi
done
7.2 Configuration Backup
7.2.1 Declarative Configuration Backup
#!/bin/bash
# backup-config.sh
set -e
KONG_ADMIN_URL=${KONG_ADMIN_URL:-http://localhost:8001}
BACKUP_DIR=${BACKUP_DIR:-/backup/kong/config}
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
# Create the backup directory
mkdir -p "$BACKUP_DIR"
echo "Backing up Kong configuration..."
# Back up the full configuration (GET /config is only available in DB-less mode)
curl -s "$KONG_ADMIN_URL/config" > "$BACKUP_DIR/kong_config_$TIMESTAMP.json"
# Back up individual resources (list endpoints are paginated; large
# installations need to follow the "next" links to capture everything)
curl -s "$KONG_ADMIN_URL/services" > "$BACKUP_DIR/services_$TIMESTAMP.json"
curl -s "$KONG_ADMIN_URL/routes" > "$BACKUP_DIR/routes_$TIMESTAMP.json"
curl -s "$KONG_ADMIN_URL/consumers" > "$BACKUP_DIR/consumers_$TIMESTAMP.json"
curl -s "$KONG_ADMIN_URL/plugins" > "$BACKUP_DIR/plugins_$TIMESTAMP.json"
curl -s "$KONG_ADMIN_URL/upstreams" > "$BACKUP_DIR/upstreams_$TIMESTAMP.json"
curl -s "$KONG_ADMIN_URL/certificates" > "$BACKUP_DIR/certificates_$TIMESTAMP.json"
# Compress the backup
tar -czf "$BACKUP_DIR/kong_config_backup_$TIMESTAMP.tar.gz" -C "$BACKUP_DIR" \
  kong_config_$TIMESTAMP.json \
  services_$TIMESTAMP.json \
  routes_$TIMESTAMP.json \
  consumers_$TIMESTAMP.json \
  plugins_$TIMESTAMP.json \
  upstreams_$TIMESTAMP.json \
  certificates_$TIMESTAMP.json
# Remove the temporary files
rm "$BACKUP_DIR"/*_$TIMESTAMP.json
echo "Configuration backup completed: kong_config_backup_$TIMESTAMP.tar.gz"
7.3 Recovery Procedures
7.3.1 Database Restore
#!/bin/bash
# restore-postgres.sh
set -e
BACKUP_FILE=$1
DB_HOST=${DB_HOST:-localhost}
DB_PORT=${DB_PORT:-5432}
DB_USER=${DB_USER:-kong}
DB_NAME=${DB_NAME:-kong}
if [ -z "$BACKUP_FILE" ]; then
echo "Usage: $0 <backup_file>"
exit 1
fi
if [ ! -f "$BACKUP_FILE" ]; then
echo "Backup file not found: $BACKUP_FILE"
exit 1
fi
echo "WARNING: This will restore the Kong database from backup."
echo "All current data will be lost!"
read -p "Are you sure you want to continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
echo "Restore cancelled."
exit 0
fi
echo "Stopping Kong..."
sudo systemctl stop kong
echo "Dropping existing database..."
PGPASSWORD="$DB_PASSWORD" dropdb \
-h "$DB_HOST" \
-p "$DB_PORT" \
-U "$DB_USER" \
"$DB_NAME"
echo "Creating new database..."
PGPASSWORD="$DB_PASSWORD" createdb \
-h "$DB_HOST" \
-p "$DB_PORT" \
-U "$DB_USER" \
"$DB_NAME"
echo "Restoring database from backup..."
if [[ "$BACKUP_FILE" == *.gz ]]; then
gunzip -c "$BACKUP_FILE" | PGPASSWORD="$DB_PASSWORD" pg_restore \
-h "$DB_HOST" \
-p "$DB_PORT" \
-U "$DB_USER" \
-d "$DB_NAME" \
--verbose
else
PGPASSWORD="$DB_PASSWORD" pg_restore \
-h "$DB_HOST" \
-p "$DB_PORT" \
-U "$DB_USER" \
-d "$DB_NAME" \
--verbose \
"$BACKUP_FILE"
fi
echo "Starting Kong..."
sudo systemctl start kong
echo "Verifying Kong status..."
sleep 5
curl -s http://localhost:8001/status | jq .
echo "Database restore completed successfully!"
7.3.2 Configuration Restore
#!/bin/bash
# restore-config.sh
set -e
BACKUP_FILE=$1
KONG_ADMIN_URL=${KONG_ADMIN_URL:-http://localhost:8001}
if [ -z "$BACKUP_FILE" ]; then
echo "Usage: $0 <config_backup_file>"
exit 1
fi
if [ ! -f "$BACKUP_FILE" ]; then
echo "Backup file not found: $BACKUP_FILE"
exit 1
fi
echo "WARNING: This will restore Kong configuration from backup."
echo "All current configuration will be replaced!"
read -p "Are you sure you want to continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
echo "Restore cancelled."
exit 0
fi
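# Extract the backup archive
TEMP_DIR=$(mktemp -d)
tar -xzf "$BACKUP_FILE" -C "$TEMP_DIR"
echo "Restoring Kong configuration..."
# Restore the full configuration (DB-less mode).
# The glob must stay outside the quotes: a quoted "*" is never expanded, so a
# test such as [ -f "$TEMP_DIR/kong_config_*.json" ] is always false.
CONFIG_FILE=$(ls "$TEMP_DIR"/kong_config_*.json 2>/dev/null | head -1)
if [ -n "$CONFIG_FILE" ]; then
  # GET /config returns the YAML wrapped in a JSON object under "config";
  # unwrap it before posting it back.
  jq -r '.config' "$CONFIG_FILE" > "$TEMP_DIR/kong.yaml"
  curl -X POST "$KONG_ADMIN_URL/config" \
    -F "config=@$TEMP_DIR/kong.yaml"
fi
# Remove the temporary directory
rm -rf "$TEMP_DIR"
echo "Configuration restore completed successfully!"
7.3.3 Disaster Recovery Plan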
#!/bin/bash
# disaster-recovery.sh
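set -e
# Defaults; override them through the environment. DB_PASSWORD has no default.
DB_HOST=${DB_HOST:-localhost}
DB_USER=${DB_USER:-kong}
DB_NAME=${DB_NAME:-kong}
KONG_ADMIN_URL=${KONG_ADMIN_URL:-http://localhost:8001}
echo "Kong Disaster Recovery Plan"
echo "==========================="
# 1. Check system status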
echo "1. Checking system status..."
if systemctl is-active --quiet kong; then
echo " Kong is running"
else
echo " Kong is not running"
fi
if systemctl is-active --quiet postgresql; then
echo " PostgreSQL is running"
else
echo " PostgreSQL is not running - CRITICAL!"
fi
# 2. Check database connectivity
echo "2. Checking database connectivity..."
if PGPASSWORD="$DB_PASSWORD" psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1;" > /dev/null 2>&1; then
echo " Database connection: OK"
else
echo " Database connection: FAILED - CRITICAL!"
fi
# 3. Check the Kong Admin API
echo "3. Checking Kong Admin API..."
if curl -s "$KONG_ADMIN_URL/status" > /dev/null; then
echo " Admin API: OK"
else
echo " Admin API: FAILED"
fi
# 4. Check the Kong proxy
echo "4. Checking Kong Proxy..."
if curl -s "http://localhost:8000" > /dev/null; then
echo " Proxy: OK"
else
echo " Proxy: FAILED"
fi
# 5. Automatic recovery
echo "5. Starting automatic recovery..."
# Restart the services
echo " Restarting PostgreSQL..."
sudo systemctl restart postgresql
sleep 10
echo " Restarting Kong..."
sudo systemctl restart kong
sleep 15
# 6. Verify the recovery
echo "6. Verifying recovery..."
if curl -s "$KONG_ADMIN_URL/status" | jq -r '.database.reachable' | grep -q "true"; then
echo " Recovery successful!"
else
echo " Recovery failed - manual intervention required"
exit 1
fi
8. 安全运维
8.1 安全配置
8.1.1 网络安全
#!/bin/bash
# setup-firewall.sh
# Firewall configuration
# Open the INPUT policy before flushing, so a previous DROP policy
# cannot cut off the session while the rules are being rebuilt
sudo iptables -P INPUT ACCEPT
# Flush existing rules
sudo iptables -F
sudo iptables -X
sudo iptables -t nat -F
sudo iptables -t nat -X
# Allow loopback traffic
sudo iptables -A INPUT -i lo -j ACCEPT
sudo iptables -A OUTPUT -o lo -j ACCEPT
# Allow established connections
sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# Allow SSH (adjust to the port actually in use)
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT
# Allow the Kong proxy ports (public)
sudo iptables -A INPUT -p tcp --dport 8000 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 8443 -j ACCEPT
# Allow the Kong Admin API (internal networks only)
sudo iptables -A INPUT -p tcp -s 10.0.0.0/8 --dport 8001 -j ACCEPT
sudo iptables -A INPUT -p tcp -s 172.16.0.0/12 --dport 8001 -j ACCEPT
sudo iptables -A INPUT -p tcp -s 192.168.0.0/16 --dport 8001 -j ACCEPT
# Allow the database port (internal networks only)
sudo iptables -A INPUT -p tcp -s 10.0.0.0/8 --dport 5432 -j ACCEPT
sudo iptables -A INPUT -p tcp -s 172.16.0.0/12 --dport 5432 -j ACCEPT
sudo iptables -A INPUT -p tcp -s 192.168.0.0/16 --dport 5432 -j ACCEPT
# Set the default policies last, once the ACCEPT rules are in place
sudo iptables -P INPUT DROP
sudo iptables -P FORWARD DROP
sudo iptables -P OUTPUT ACCEPT
# Save the rules (a plain "sudo iptables-save > file" runs the redirect as
# the calling user and fails on a root-owned path, hence tee)
sudo iptables-save | sudo tee /etc/iptables/rules.v4 > /dev/null
echo "Firewall rules configured successfully"
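The Admin API rules accept traffic only from the three private IPv4 ranges. Before rolling the rules out, it can help to confirm which client addresses they cover. A small sketch using the standard `ipaddress` module; the function name `admin_api_allowed` is made up for this example.

```python
import ipaddress

# Source ranges the firewall rules accept for the Admin API (port 8001).
ADMIN_ALLOWED = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def admin_api_allowed(source_ip: str) -> bool:
    """Return True if source_ip falls inside one of the allowed ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ADMIN_ALLOWED)


for candidate in ("10.1.2.3", "172.32.0.1", "8.8.8.8"):
    print(candidate, admin_api_allowed(candidate))
```

Note that `172.16.0.0/12` ends at `172.31.255.255`, so `172.32.0.1` is outside the allowed set even though it looks similar.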
8.1.2 SSL/TLS Configuration
#!/bin/bash
# generate-ssl-certs.sh
# Generate SSL certificates
set -e
SSL_DIR="/etc/kong/ssl"
DOMAIN="api.company.com"
# Create the SSL directory
sudo mkdir -p "$SSL_DIR"
cd "$SSL_DIR"
# Generate the private key
sudo openssl genrsa -out kong.key 2048
# Generate the certificate signing request
sudo openssl req -new -key kong.key -out kong.csr -subj "/C=US/ST=CA/L=San Francisco/O=Company/CN=$DOMAIN"
# Generate a self-signed certificate (use a CA-signed certificate in production)
sudo openssl x509 -req -days 365 -in kong.csr -signkey kong.key -out kong.crt
# Set ownership and permissions
sudo chown kong:kong kong.key kong.crt
sudo chmod 600 kong.key
sudo chmod 644 kong.crt
# Verify the certificate
openssl x509 -in kong.crt -text -noout
echo "SSL certificates generated successfully"
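The certificate generated here is valid for 365 days, so its expiry is worth tracking. The sketch below computes the remaining days from the `notAfter` value that `openssl x509 -noout -enddate` prints. It relies on `ssl.cert_time_to_seconds` from the standard library; the function name `days_until_expiry` is an assumption of this example.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between now and a certificate's notAfter timestamp.

    not_after is the text after "notAfter=" in the openssl output,
    for example "Jan  5 09:34:43 2018 GMT".
    now must be a timezone-aware datetime.
    A negative result means the certificate has already expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    remaining = expires - now.timestamp()
    return int(remaining // 86400)


print(days_until_expiry("Jan  5 09:34:43 2030 GMT", datetime.now(timezone.utc)))
```

A cron job could run this against `kong.crt` and raise an alert when the value drops below a chosen margin, such as 30 days.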
8.1.3 Access Control
# Kong RBAC configuration (Enterprise)
# Schematic layout; in kong.conf the corresponding properties are
# enforce_rbac, admin_gui_auth, admin_gui_auth_conf and admin_gui_session_conf
rbac:
  enabled: true
  admin_gui_auth: basic-auth
  admin_gui_auth_conf: |
    {
      "hide_credentials": true
    }
  session_conf: |
    {
      "secret": "your-session-secret",
      "cookie_secure": true,
      "cookie_httponly": true,
      "cookie_samesite": "Strict"
    }
# Define the roles
roles:
  - name: admin
    permissions:
      - "*"
  - name: developer
    permissions:
      - "services:read"
      - "routes:read"
      - "plugins:read"
  - name: operator
    permissions:
      - "services:*"
      - "routes:*"
      - "upstreams:*"
      - "targets:*"
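The role list uses `resource:action` strings with `*` as a wildcard. The sketch below shows one way such patterns could be evaluated. It illustrates the wildcard semantics the listing suggests and is not Kong Enterprise's own RBAC evaluation; `ROLES` and `is_allowed` are names invented for this example.

```python
from fnmatch import fnmatchcase

# Role definitions copied from the listing.
ROLES = {
    "admin": ["*"],
    "developer": ["services:read", "routes:read", "plugins:read"],
    "operator": ["services:*", "routes:*", "upstreams:*", "targets:*"],
}


def is_allowed(role: str, resource: str, action: str) -> bool:
    """Check a "resource:action" request against the role's patterns."""
    request = f"{resource}:{action}"
    return any(fnmatchcase(request, pattern) for pattern in ROLES.get(role, []))


print(is_allowed("developer", "services", "read"))
print(is_allowed("developer", "services", "update"))
print(is_allowed("operator", "plugins", "read"))
```

Under these rules the operator role can change services, routes, upstreams, and targets but cannot read plugins, which may or may not be the intent; a quick check like this makes such gaps visible.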
8.2 Updates and Patches
8.2.1 Update Procedure
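#!/bin/bash
# update-kong.sh
set -e
# Match the version number itself rather than a fixed "Kong " prefix,
# which not every release prints; a failed match would abort under set -e
CURRENT_VERSION=$(kong version | grep -oE '[0-9]+(\.[0-9]+)+' | head -1)
# The package file name needs a concrete version, so "latest" cannot be a default
TARGET_VERSION=$1
if [ -z "$TARGET_VERSION" ]; then
    echo "Usage: $0 <target_version>"
    exit 1
fi
echo "Current Kong version: $CURRENT_VERSION"
echo "Target version: $TARGET_VERSION"
# 1. Back up the current configuration and data
echo "1. Creating backup..."
/opt/kong/scripts/backup-postgres.sh
/opt/kong/scripts/backup-config.sh
# 2. Download the new version (adjust the repository path to your Kong series and distribution)
echo "2. Downloading Kong $TARGET_VERSION..."
wget "https://download.konghq.com/gateway-2.x-ubuntu-$(lsb_release -cs)/pool/all/k/kong/kong_${TARGET_VERSION}_amd64.deb"
# 3. Stop the Kong service
echo "3. Stopping Kong..."
sudo systemctl stop kong
# 4. Install the new version
echo "4. Installing Kong $TARGET_VERSION..."
sudo dpkg -i "kong_${TARGET_VERSION}_amd64.deb"
# 5. Run the database migrations
# (if migrations remain pending, run "kong migrations finish" once every node runs the new version)
echo "5. Running database migrations..."
sudo kong migrations up -c /etc/kong/kong.conf
# 6. Start Kong
echo "6. Starting Kong..."
sudo systemctl start kong
# 7. Verify the update
echo "7. Verifying update..."
sleep 10
NEW_VERSION=$(kong version | grep -oE '[0-9]+(\.[0-9]+)+' | head -1)
echo "New Kong version: $NEW_VERSION"
if curl -s http://localhost:8001/status | jq -r '.database.reachable' | grep -q "true"; then
    echo "Update completed successfully!"
else
    echo "Update failed - rollback required"
    # Rollback logic goes here (see rollback-kong.sh in 8.2.2)
    exit 1
fi
# 8. Clean up
rm "kong_${TARGET_VERSION}_amd64.deb"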
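The update script prints the current and target versions but does not compare them. A pre-flight check could classify the jump first, since a major upgrade or an accidental downgrade deserves more caution than a patch release. A minimal sketch; `parse_version` and `classify_upgrade` are hypothetical helpers, and the version strings in the demo are only examples.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first dotted version number from text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def classify_upgrade(current: str, target: str) -> str:
    """Return 'major', 'minor', 'patch', 'same' or 'downgrade'."""
    cur, tgt = parse_version(current), parse_version(target)
    # Pad with zeros so "3.4" and "3.4.0" compare as equal.
    width = max(len(cur), len(tgt))
    cur += (0,) * (width - len(cur))
    tgt += (0,) * (width - len(tgt))
    if tgt < cur:
        return "downgrade"
    if tgt == cur:
        return "same"
    if tgt[0] != cur[0]:
        return "major"
    if tgt[1] != cur[1]:
        return "minor"
    return "patch"


print(classify_upgrade("2.8.1", "3.4.0"))
```

The script could refuse to continue on `downgrade` and require an explicit confirmation on `major`.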
8.2.2 Rollback Procedure
#!/bin/bash
# rollback-kong.sh
set -e
BACKUP_VERSION=$1
if [ -z "$BACKUP_VERSION" ]; then
    echo "Usage: $0 <backup_version>"
    echo "Available backups:"
    ls /backup/kong/kong_backup_*.sql.gz
    exit 1
fi
echo "WARNING: This will rollback Kong to a previous version."
echo "All changes since the backup will be lost!"
read -r -p "Are you sure you want to continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
    echo "Rollback cancelled."
    exit 0
fi
echo "Starting rollback process..."
# 1. Stop Kong
echo "1. Stopping Kong..."
sudo systemctl stop kong
# 2. Restore the database
echo "2. Restoring database..."
/opt/kong/scripts/restore-postgres.sh "/backup/kong/$BACKUP_VERSION"
# 3. Downgrade the Kong version (if needed)
echo "3. Downgrading Kong version..."
# Install the previous version here, as appropriate for the environment
# 4. Start Kong
echo "4. Starting Kong..."
sudo systemctl start kong
# 5. Verify the rollback
echo "5. Verifying rollback..."
sleep 10
if curl -s http://localhost:8001/status | jq -r '.database.reachable' | grep -q "true"; then
    echo "Rollback completed successfully!"
else
    echo "Rollback failed!"
    exit 1
fi
8.3 Troubleshooting
8.3.1 Diagnosing Common Problems
#!/bin/bash
# diagnose-kong.sh
# Expects DB_HOST, DB_USER, DB_NAME and DB_PASSWORD in the environment.
echo "Kong Diagnostic Tool"
echo "==================="
# 1. Check Kong processes
echo "1. Checking Kong processes..."
ps aux | grep kong | grep -v grep
# 2. Check port usage
echo "2. Checking port usage..."
netstat -tlnp | grep -E ':(8000|8001|8443|8444)'
# 3. Validate the Kong configuration file
# ("kong config" manages declarative configuration and needs a subcommand;
# "kong check" validates kong.conf)
echo "3. Checking Kong configuration..."
kong check /etc/kong/kong.conf
# 4. Check the database connection
echo "4. Checking database connection..."
if PGPASSWORD="$DB_PASSWORD" psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "SELECT version();" > /dev/null 2>&1; then
    echo " Database connection: OK"
else
    echo " Database connection: FAILED"
fi
# 5. Check Kong status
echo "5. Checking Kong status..."
curl -s http://localhost:8001/status | jq .
# 6. Show recent error log entries
echo "6. Recent error logs..."
tail -20 /var/log/kong/error.log
# 7. Check system resources
echo "7. System resources..."
echo " Memory usage:"
free -h
echo " Disk usage:"
df -h
echo " CPU load:"
uptime
# 8. Check network connectivity
# Test curl's own exit status: in "curl | head -1 || echo", the fallback
# never runs because head succeeds even when curl fails
echo "8. Network connectivity..."
echo " Testing proxy port:"
if response=$(curl -sI --max-time 5 http://localhost:8000); then
    echo "$response" | head -1
else
    echo " Proxy port not responding"
fi
echo " Testing admin port:"
if response=$(curl -sI --max-time 5 http://localhost:8001); then
    echo "$response" | head -1
else
    echo " Admin port not responding"
fi
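The port checks in steps 2 and 8 depend on `netstat` and `curl` being installed. Where they are not, a plain TCP connect gives the same reachability answer. A minimal sketch using only the standard library; `port_open` and `check_ports` are names chosen for this example, and the port numbers in the usage note are the ones the diagnostic script inspects.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host: str, ports: dict) -> dict:
    """Map each label in ports (label -> port number) to its reachability."""
    return {label: port_open(host, port) for label, port in ports.items()}
```

For example, `check_ports("localhost", {"proxy": 8000, "admin": 8001, "proxy_ssl": 8443, "admin_ssl": 8444})` returns a dictionary of booleans. A successful connect only shows that something is listening; it does not confirm that the listener is Kong or that it answers HTTP correctly.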
8.3.2 Troubleshooting Performance Problems
#!/bin/bash
# performance-check.sh
# Expects DB_HOST, DB_USER, DB_NAME and DB_PASSWORD in the environment.
echo "Kong Performance Check"
echo "====================="
# 1. Count Kong worker processes
# (the [n] bracket keeps grep from counting its own command line)
echo "1. Kong worker processes:"
ps aux | grep -c '[n]ginx: worker process'
# 2. Check memory usage
# Kong runs as nginx master and worker processes; worker command lines do not
# contain "kong", so match on the nginx process titles (this also counts any
# other nginx running on the host)
echo "2. Memory usage by Kong processes:"
ps aux | grep -E '[n]ginx: (master|worker) process' | awk '{sum+=$6} END {print "Total RSS: " sum/1024 " MB"}'
# 3. Count active connections
echo "3. Active connections:"
netstat -an | grep ':8000 ' | grep -c ESTABLISHED
# 4. Check database connections
# (max_conn must appear in GROUP BY because it is selected next to count(*))
echo "4. Database connections:"
PGPASSWORD="$DB_PASSWORD" psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -c "
SELECT count(*) AS active_connections,
       max_conn,
       max_conn - count(*) AS available_connections
FROM pg_stat_activity
CROSS JOIN (SELECT setting::int AS max_conn FROM pg_settings WHERE name = 'max_connections') mc
GROUP BY max_conn;
"
# 5. Measure response time
echo "5. Response time test:"
for i in {1..5}; do
    curl -w "Response time: %{time_total}s\n" -o /dev/null -s http://localhost:8000
done
# 6. Estimate the error rate over the last 1000 requests
echo "6. Recent error rate:"
ERRORS=$(tail -1000 /var/log/kong/access.log | grep -c ' 5[0-9][0-9] ')
TOTAL=$(tail -1000 /var/log/kong/access.log | wc -l)
if [ "$TOTAL" -gt 0 ]; then
    ERROR_RATE=$(echo "scale=2; $ERRORS * 100 / $TOTAL" | bc)
    echo " Error rate: $ERROR_RATE% ($ERRORS/$TOTAL)"
else
    echo " No recent requests found"
fi
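The error-rate step counts every log line that contains ` 5[0-9][0-9] `, which also matches a response size such as `512` on a successful request. Anchoring on the status field avoids that. The sketch below assumes the access log follows the nginx combined format, where the status code directly follows the quoted request line; `STATUS_RE` and `error_rate` are illustrative names.

```python
import re
from typing import Iterable, Tuple

# Matches the status field that follows the quoted request line in the
# combined log format: ... "GET /path HTTP/1.1" 502 157 ...
STATUS_RE = re.compile(r'" (\d{3}) ')


def error_rate(lines: Iterable[str]) -> Tuple[int, int, float]:
    """Return (5xx count, parsed request count, error percentage)."""
    errors = total = 0
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue  # not an access-log line; skip it
        total += 1
        if match.group(1).startswith("5"):
            errors += 1
    rate = 100.0 * errors / total if total else 0.0
    return errors, total, rate
```

To reproduce the script's window, pass the last 1000 lines of `/var/log/kong/access.log`, for instance with `collections.deque(open(path), maxlen=1000)`.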
9. Operations Best Practices
9.1 Deployment Strategies
9.1.1 Blue-Green Deployment
#!/bin/bash
# blue-green-deployment.sh
set -e
CURRENT_ENV=${1:-blue}
TARGET_ENV=${2:-green}
LB_CONFIG="/etc/nginx/conf.d/kong-lb.conf"
echo "Starting blue-green deployment..."
echo "Current: $CURRENT_ENV, Target: $TARGET_ENV"
# 1. Deploy to the target environment
echo "1. Deploying to $TARGET_ENV environment..."
docker-compose -f "docker-compose-$TARGET_ENV.yml" up -d
# 2. Wait for the service to start (up to 30 attempts, 10 seconds apart)
echo "2. Waiting for $TARGET_ENV to be ready..."
for i in {1..30}; do
    if curl -s "http://kong-$TARGET_ENV:8001/status" > /dev/null; then
        echo " $TARGET_ENV is ready"
        break
    fi
    sleep 10
done
# 3. Health check (also catches the case where the wait loop timed out)
echo "3. Running health checks on $TARGET_ENV..."
if ! curl -s "http://kong-$TARGET_ENV:8001/status" | jq -r '.database.reachable' | grep -q "true"; then
    echo " Health check failed for $TARGET_ENV"
    exit 1
fi
# 4. Switch the load balancer
echo "4. Switching load balancer to $TARGET_ENV..."
sed -i "s/kong-$CURRENT_ENV/kong-$TARGET_ENV/g" "$LB_CONFIG"
nginx -s reload
# 5. Verify the switch
echo "5. Verifying switch..."
sleep 5
if curl -s http://localhost/status > /dev/null; then
    echo " Switch successful"
else
    echo " Switch failed, rolling back..."
    sed -i "s/kong-$TARGET_ENV/kong-$CURRENT_ENV/g" "$LB_CONFIG"
    nginx -s reload
    exit 1
fi
# 6. Stop the old environment
echo "6. Stopping $CURRENT_ENV environment..."
docker-compose -f "docker-compose-$CURRENT_ENV.yml" down
echo "Blue-green deployment completed successfully!"
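The switch step rewrites the load balancer configuration with `sed "s/kong-$CURRENT_ENV/kong-$TARGET_ENV/g"`, which also rewrites longer names that merely start with the same text and reports nothing when no line matches. The sketch below performs the same substitution on whole names only and fails loudly on a no-op. `switch_upstream` is a hypothetical helper, and the configuration snippet in the demo is invented for illustration.

```python
import re


def switch_upstream(config_text: str, current: str, target: str) -> str:
    """Replace whole-word occurrences of kong-<current> with kong-<target>.

    Raises ValueError if the current name does not appear, so a no-op
    switch is reported instead of silently "succeeding".
    """
    pattern = re.compile(rf"\bkong-{re.escape(current)}\b")
    new_text, count = pattern.subn(lambda _match: f"kong-{target}", config_text)
    if count == 0:
        raise ValueError(f"kong-{current} not found in load balancer config")
    return new_text


demo = "upstream kong { server kong-blue:8000; }\n"
print(switch_upstream(demo, "blue", "green"))
```

Writing the result to a temporary file and validating it with `nginx -t` before the reload would reduce the chance of reloading a broken configuration.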
9.1.2 Canary Deployment
#!/bin/bash
# canary-deployment.sh
set -e
CANARY_PERCENTAGE=${1:-10}
NEW_VERSION=$2
if [ -z "$NEW_VERSION" ]; then
    echo "Usage: $0 <canary_percentage> <new_version>"
    exit 1
fi
echo "Starting canary deployment..."
echo "Canary percentage: $CANARY_PERCENTAGE%"
echo "New version: $NEW_VERSION"
# 1. Deploy the canary version
echo "1. Deploying canary version..."
docker run -d --name "kong-canary-$NEW_VERSION" \
    --network kong-net \
    -e "KONG_DATABASE=postgres" \
    -e "KONG_PG_HOST=kong-database" \
    "kong:$NEW_VERSION"
# 2. Wait for the service to start
echo "2. Waiting for canary to be ready..."
sleep 30
# 3. Configure the traffic split
echo "3. Configuring traffic split..."
curl -X POST http://localhost:8001/upstreams \
    -d "name=kong-cluster" \
    -d "algorithm=round-robin"
# Registering the same target repeatedly does not multiply its share of
# traffic, so the split is expressed through one weight per target.
# Production target (the remaining share of traffic)
curl -X POST http://localhost:8001/upstreams/kong-cluster/targets \
    -d "target=kong-prod:8000" \
    -d "weight=$((100 - CANARY_PERCENTAGE))"
# Canary target (the canary share of traffic)
curl -X POST http://localhost:8001/upstreams/kong-cluster/targets \
    -d "target=kong-canary-$NEW_VERSION:8000" \
    -d "weight=$CANARY_PERCENTAGE"
# 4. Monitor canary metrics (5xx responses per second, summed over all series)
echo "4. Monitoring canary metrics..."
QUERY="sum(rate(kong_http_status{code=~\"5..\",instance=\"kong-canary-$NEW_VERSION:8001\"}[5m]))"
for i in {1..60}; do
    # -G with --data-urlencode keeps the braces, brackets and quotes of the query intact
    ERROR_RATE=$(curl -sG http://prometheus:9090/api/v1/query --data-urlencode "query=$QUERY" | jq -r '.data.result[0].value[1] // 0')
    if (( $(echo "$ERROR_RATE > 0.05" | bc -l) )); then
        echo " High error rate detected: $ERROR_RATE"
        echo " Rolling back canary deployment..."
        # Remove the canary target first so no traffic is sent to a stopped container
        curl -X DELETE "http://localhost:8001/upstreams/kong-cluster/targets/kong-canary-$NEW_VERSION:8000"
        docker stop "kong-canary-$NEW_VERSION"
        docker rm "kong-canary-$NEW_VERSION"
        exit 1
    fi
    sleep 60
done
echo "5. Canary deployment successful, promoting to production..."
# Logic for switching all traffic to the new version can be added here
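A percentage split between production and canary comes down to one weight per upstream target. The sketch below derives the two weights from the canary percentage and shows the share of requests each target should receive under weighted round-robin. The target names mirror the script; `canary_weights` and `expected_share` are illustrative helpers, not Kong API calls.

```python
from typing import Dict


def canary_weights(canary_percentage: int, prod_target: str, canary_target: str) -> Dict[str, int]:
    """Split 100 weight units between the production and canary targets."""
    if not 0 <= canary_percentage <= 100:
        raise ValueError("canary_percentage must be between 0 and 100")
    return {
        prod_target: 100 - canary_percentage,
        canary_target: canary_percentage,
    }


def expected_share(weights: Dict[str, int]) -> Dict[str, float]:
    """Fraction of requests each target should receive under weighted round-robin."""
    total = sum(weights.values())
    return {target: weight / total for target, weight in weights.items()}


weights = canary_weights(10, "kong-prod:8000", "kong-canary:8000")
print(weights)
print(expected_share(weights))
```

Promoting the canary is then a matter of raising its weight in steps (for example 10, 25, 50, 100) while the error-rate check keeps running between steps.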
9.2 Monitoring and Alerting
9.2.1 Monitoring Metrics
# Key monitoring metrics
# (exact metric names vary with the version of the Prometheus plugin)
metrics:
  availability:
    - kong_up
    - kong_datastore_reachable
    - kong_nginx_http_current_connections
  performance:
    - kong_latency_bucket
    - kong_bandwidth_bytes
    - kong_http_requests_total
  errors:
    - kong_http_status{code=~"4.."}
    - kong_http_status{code=~"5.."}
    - kong_nginx_http_total_requests
  resources:
    - kong_memory_workers_lua_vms_bytes
    - kong_nginx_connections_active
    - kong_nginx_connections_reading
    - kong_nginx_connections_writing
    - kong_nginx_connections_waiting
9.2.2 Alert Thresholds
# Alert threshold configuration
alerts:
  critical:
    - metric: kong_up
      threshold: 0
      duration: 1m
      description: "Kong service is down"
    - metric: kong_datastore_reachable
      threshold: 0
      duration: 30s
      description: "Kong cannot connect to database"
    - metric: rate(kong_http_status{code=~"5.."}[5m])
      threshold: 0.05
      duration: 2m
      description: "High 5xx error rate"
  warning:
    - metric: histogram_quantile(0.95, rate(kong_latency_bucket[5m]))
      threshold: 1000
      duration: 5m
      description: "High response latency"
    - metric: kong_memory_workers_lua_vms_bytes
      threshold: 1073741824 # 1GB
      duration: 10m
      description: "High memory usage"
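Each alert pairs a threshold with a duration: the alert should fire only when the condition holds without interruption for that long. The sketch below shows this evaluation on a series of timestamped samples. It is a simplified model for reasoning about the thresholds and not how Prometheus or Alertmanager are implemented; `alert_fires` and its parameters are names chosen for this example.

```python
from typing import Callable, Iterable, Tuple


def alert_fires(
    samples: Iterable[Tuple[float, float]],
    threshold: float,
    duration_s: float,
    breach: Callable[[float, float], bool] = lambda value, limit: value > limit,
) -> bool:
    """Return True if the breach condition holds continuously for duration_s.

    samples is an iterable of (unix_timestamp, value) pairs in time order.
    """
    breach_start = None
    for timestamp, value in samples:
        if breach(value, threshold):
            if breach_start is None:
                breach_start = timestamp
            if timestamp - breach_start >= duration_s:
                return True
        else:
            breach_start = None  # condition cleared; restart the clock
    return False
```

For the availability alerts, where the value 0 itself signals the problem, pass `breach=lambda value, limit: value <= limit`. A short dip back under the threshold resets the clock, which is why a longer duration trades detection speed for fewer false alarms.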
9.3 Capacity Planning
9.3.1 Performance Benchmarking
#!/bin/bash
# performance-benchmark.sh
set -e
TEST_URL="http://localhost:8000/api/test"
CONCURRENCY_LEVELS=(1 10 50 100 200 500)
DURATION=60
echo "Kong Performance Benchmark"
echo "========================="
# Prepare the test environment
echo "Setting up test environment..."
curl -X POST http://localhost:8001/services \
    -d "name=test-service" \
    -d "url=http://httpbin.org"
curl -X POST http://localhost:8001/services/test-service/routes \
    -d "paths[]=/api/test"
# Run the benchmark
for concurrency in "${CONCURRENCY_LEVELS[@]}"; do
    echo "Testing with $concurrency concurrent connections..."
    # ab resets the request count to 50000 when -t follows -n, so the -n value
    # had no effect; each run stops after $DURATION seconds or 50000 requests,
    # whichever comes first
    ab -t "$DURATION" -c "$concurrency" "$TEST_URL" > "benchmark_c${concurrency}.txt"
    # Extract the key metrics
    RPS=$(grep "Requests per second" "benchmark_c${concurrency}.txt" | awk '{print $4}')
    LATENCY_MEAN=$(grep "Time per request" "benchmark_c${concurrency}.txt" | head -1 | awk '{print $4}')
    LATENCY_95=$(grep "95%" "benchmark_c${concurrency}.txt" | awk '{print $2}')
    echo " RPS: $RPS"
    echo " Mean Latency: ${LATENCY_MEAN}ms"
    echo " 95th Percentile: ${LATENCY_95}ms"
    echo ""
done
# Generate the report
echo "Generating performance report..."
cat > performance_report.md << EOF
# Kong Performance Benchmark Report
## Test Configuration
- Duration: up to ${DURATION}s per test
- Target: $TEST_URL
- Date: $(date)
## Results
| Concurrency | RPS | Mean Latency (ms) | 95th Percentile (ms) |
|-------------|-----|-------------------|----------------------|
EOF
for concurrency in "${CONCURRENCY_LEVELS[@]}"; do
    RPS=$(grep "Requests per second" "benchmark_c${concurrency}.txt" | awk '{print $4}')
    LATENCY_MEAN=$(grep "Time per request" "benchmark_c${concurrency}.txt" | head -1 | awk '{print $4}')
    LATENCY_95=$(grep "95%" "benchmark_c${concurrency}.txt" | awk '{print $2}')
    echo "| $concurrency | $RPS | $LATENCY_MEAN | $LATENCY_95 |" >> performance_report.md
done
echo "Performance benchmark completed. Report saved to performance_report.md"
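The benchmark script extracts its three figures with `grep` and `awk` column positions. The same extraction can be done with explicit patterns, which makes it easier to feed the results into other tooling. A minimal sketch, assuming the usual ApacheBench report lines ("Requests per second", "Time per request ... (mean)", and the percentile table); `PATTERNS` and `parse_ab_output` are names made up here.

```python
import re
from typing import Dict, Optional

PATTERNS = {
    "rps": re.compile(r"^Requests per second:\s+([\d.]+)", re.MULTILINE),
    # Only the per-request mean; the "across all concurrent requests" line
    # ends in "(mean, across ...)" and is deliberately not matched.
    "mean_latency_ms": re.compile(r"^Time per request:\s+([\d.]+) \[ms\] \(mean\)", re.MULTILINE),
    "p95_ms": re.compile(r"^\s*95%\s+(\d+)", re.MULTILINE),
}


def parse_ab_output(text: str) -> Dict[str, Optional[float]]:
    """Extract RPS, mean latency and 95th percentile from an ab report.

    A metric that cannot be found is returned as None.
    """
    result: Dict[str, Optional[float]] = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = float(match.group(1)) if match else None
    return result
```

Calling `parse_ab_output(open("benchmark_c100.txt").read())` on one of the files the script writes yields a dictionary that can be serialized to JSON or compared across runs.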
9.3.2 Capacity Planning Calculation
#!/usr/bin/env python3
# capacity-planning.py
import json
import math
import sys
from datetime import datetime


def ceil_instances(value):
    """Round up to whole instances, ignoring floating-point noise."""
    return math.ceil(round(value, 6))


def calculate_capacity(current_rps, target_rps, current_instances, cpu_threshold=70, memory_threshold=80):
    """
    Calculate the number of Kong instances required.
    """
    if current_rps <= 0 or current_instances <= 0:
        raise ValueError("current_rps and current_instances must be positive")
    # Estimate based on RPS (rounded up: truncating would under-provision)
    rps_ratio = target_rps / current_rps
    required_instances_rps = ceil_instances(current_instances * rps_ratio * 1.2)  # 20% buffer
    # Estimate based on resource utilization thresholds
    cpu_factor = 100 / cpu_threshold
    memory_factor = 100 / memory_threshold
    required_instances_cpu = ceil_instances(current_instances * cpu_factor)
    required_instances_memory = ceil_instances(current_instances * memory_factor)
    # Take the largest estimate
    recommended_instances = max(required_instances_rps, required_instances_cpu, required_instances_memory)
    return {
        'current_instances': current_instances,
        'target_rps': target_rps,
        'recommended_instances': recommended_instances,
        'scaling_factor': recommended_instances / current_instances,
        'calculations': {
            'rps_based': required_instances_rps,
            'cpu_based': required_instances_cpu,
            'memory_based': required_instances_memory
        }
    }


def estimate_costs(instances, instance_cost_per_hour=0.1, hours_per_month=730):
    """
    Estimate the monthly cost.
    """
    monthly_cost = instances * instance_cost_per_hour * hours_per_month
    return {
        'instances': instances,
        'cost_per_hour': instances * instance_cost_per_hour,
        'monthly_cost': monthly_cost
    }


def main():
    if len(sys.argv) != 4:
        print("Usage: python3 capacity-planning.py <current_rps> <target_rps> <current_instances>")
        sys.exit(1)
    current_rps = float(sys.argv[1])
    target_rps = float(sys.argv[2])
    current_instances = int(sys.argv[3])
    # Calculate the capacity requirement
    capacity = calculate_capacity(current_rps, target_rps, current_instances)
    # Estimate the costs
    current_cost = estimate_costs(current_instances)
    recommended_cost = estimate_costs(capacity['recommended_instances'])
    # Build the report
    report = {
        'timestamp': datetime.now().isoformat(),
        'capacity_planning': capacity,
        'cost_estimation': {
            'current': current_cost,
            'recommended': recommended_cost,
            'cost_increase': recommended_cost['monthly_cost'] - current_cost['monthly_cost']
        },
        'recommendations': [
            f"Scale from {current_instances} to {capacity['recommended_instances']} instances",
            f"Expected cost increase: ${recommended_cost['monthly_cost'] - current_cost['monthly_cost']:.2f}/month",
            f"Scaling factor: {capacity['scaling_factor']:.2f}x"
        ]
    }
    print(json.dumps(report, indent=2))


if __name__ == '__main__':
    main()
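10. Summary
Deploying and operating Kong is complex but important work, and it spans several areas:
10.1 Key Points
- Architecture: choose the deployment mode (DB mode, DB-less mode, or hybrid mode) that fits the business requirements
- High availability: use multi-node clusters, a highly available database, and load balancing to keep the service available
- Security: apply network security, access control, and SSL/TLS encryption
- Monitoring and alerting: build a thorough monitoring system so that problems are detected and handled promptly
- Backup and recovery: back up configuration and data regularly and maintain a disaster recovery plan
- Performance: use capacity planning and performance tuning to keep the system performing well
10.2 Best Practices
- Environment separation: keep development, test, and production environments strictly separate
- Configuration management: keep configuration under version control and treat it as code
- Automated deployment: deploy through a CI/CD pipeline
- Progressive delivery: use blue-green and canary deployments to reduce release risk
- Continuous monitoring: monitor business metrics, technical metrics, and user-experience metrics
10.3 Operational Recommendations
- Documentation: maintain detailed operations documentation and runbooks
- Training: run regular operations training to build the team's skills
- Drills: run regular failure drills to validate the incident response
- Optimization: keep refining configuration and processes to improve operational efficiency
- Innovation: follow new technologies and best practices and keep improving the operations setup
Following these principles and practices makes it possible to build a stable, secure, and efficient operations setup for the Kong API gateway.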