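3.1 Pod Basics
1. What a Pod Is
A Pod is the smallest deployable unit in Kubernetes. It contains one or more containers that share storage, networking, and a specification for how to run them.
Pod characteristics:
- Shared network namespace (one IP address and one port space)
- Shared storage volumes
- Containers can reach each other over localhost
- The whole Pod is scheduled as a single unit
2. Pod Lifecycle
Pod phases (status.phase):
- Pending - The Pod has been accepted by the cluster, but one or more containers are not yet ready to run. This includes time spent waiting to be scheduled and pulling images.
- Running - The Pod is bound to a node and all containers have been created; at least one container is running, starting, or restarting.
- Succeeded - All containers terminated successfully and will not be restarted.
- Failed - All containers have terminated, and at least one of them terminated in failure.
- Unknown - The Pod state could not be obtained, usually because the node cannot be reached.
Container states:
- Waiting - The container is not running yet (for example, its image is still being pulled).
- Running - The container is executing.
- Terminated - The container ran to completion or failed.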
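The phases above can be checked from a script. The following is a minimal sketch, not part of the original chapter: the helper names is_terminal_phase and describe_phase are hypothetical, and the one-line summaries are paraphrases of the list above.

```bash
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) when the phase is terminal,
# that is, when the Pod will not run any further.
is_terminal_phase() {
  case "$1" in
    Succeeded|Failed) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: one-line reminder of what each phase means.
describe_phase() {
  case "$1" in
    Pending)   echo "accepted, containers not ready yet" ;;
    Running)   echo "bound to a node, at least one container active" ;;
    Succeeded) echo "all containers exited successfully" ;;
    Failed)    echo "at least one container exited in failure" ;;
    *)         echo "state could not be obtained" ;;
  esac
}
```

With a cluster available, the phase can be read with kubectl get pod simple-pod -o jsonpath='{.status.phase}' and passed to either helper.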
3. A Simple Pod Example
simple-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: simple-pod
  labels:
    app: simple-app
    version: v1
  annotations:
    description: "A simple Pod example"
spec:
  containers:
  - name: nginx-container
    image: nginx:1.20
    ports:
    - containerPort: 80
      name: http
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  restartPolicy: Always
3.2 Pod Configuration in Detail
1. Multi-Container Pods
multi-container-pod.yaml
# Assumes ConfigMaps named nginx-config and fluent-bit-config already exist in the namespace.
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
  labels:
    app: multi-app
spec:
  containers:
  # Main application container
  - name: web-server
    image: nginx:1.20
    ports:
    - containerPort: 80
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
    # Dedicated log volume, so that the sidecar reads log files rather than web content
    - name: nginx-logs
      mountPath: /var/log/nginx
    - name: nginx-config
      mountPath: /etc/nginx/conf.d
  # Log collector container (sidecar pattern)
  - name: log-collector
    image: fluent/fluent-bit:1.9
    volumeMounts:
    - name: nginx-logs
      mountPath: /var/log/nginx
    - name: fluent-bit-config
      mountPath: /fluent-bit/etc
  # Init containers run to completion, in order, before the containers above start
  initContainers:
  - name: init-web-content
    image: busybox:1.35
    command: ['sh', '-c']
    args:
    - |
      echo "<h1>Hello from Multi-Container Pod</h1>" > /work-dir/index.html
      echo "<p>Generated at: $(date)</p>" >> /work-dir/index.html
    volumeMounts:
    - name: shared-data
      mountPath: /work-dir
  volumes:
  - name: shared-data
    emptyDir: {}
  - name: nginx-logs
    emptyDir: {}
  - name: nginx-config
    configMap:
      name: nginx-config
  - name: fluent-bit-config
    configMap:
      name: fluent-bit-config
  restartPolicy: Always
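2. Resource Management
resource-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-pod
spec:
  containers:
  - name: cpu-memory-demo
    image: nginx:1.20
    resources:
      requests:
        memory: "100Mi"            # request 100 MiB of memory
        cpu: "100m"                # request 0.1 CPU core (100 millicores)
        ephemeral-storage: "1Gi"   # request 1 GiB of ephemeral storage
      limits:
        memory: "200Mi"            # limit memory to 200 MiB
        cpu: "200m"                # limit CPU to 0.2 core (200 millicores)
        ephemeral-storage: "2Gi"   # limit ephemeral storage to 2 GiB
    env:
    # Expose the container's own requests and limits through the downward API
    - name: MEMORY_REQUEST
      valueFrom:
        resourceFieldRef:
          resource: requests.memory   # reported in bytes by default
    - name: CPU_LIMIT
      valueFrom:
        resourceFieldRef:
          resource: limits.cpu
          divisor: 1m                 # report millicores; the default divisor of 1 rounds up to whole cores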
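The quantities in this manifest use Kubernetes suffixes: on CPU, m means millicores, and on memory, Mi and Gi are powers of two while M and G are powers of ten. As a rough sketch of the arithmetic (the function names cpu_to_millicores and mem_to_bytes are hypothetical, and only a few common suffixes are handled):

```bash
#!/bin/bash
# Hypothetical helper: convert a CPU quantity to millicores.
# "250m" -> 250, "2" -> 2000, "0.5" -> 500
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
  esac
}

# Hypothetical helper: convert an integer memory quantity to bytes.
# Handles Ki/Mi/Gi (binary) and k/M/G (decimal) only.
mem_to_bytes() {
  num=${1%%[A-Za-z]*}      # leading digits
  unit=${1#"$num"}         # remaining suffix, possibly empty
  case "$unit" in
    "") mult=1 ;;
    Ki) mult=1024 ;;
    Mi) mult=$((1024 * 1024)) ;;
    Gi) mult=$((1024 * 1024 * 1024)) ;;
    k)  mult=1000 ;;
    M)  mult=1000000 ;;
    G)  mult=1000000000 ;;
    *)  echo "unsupported suffix: $unit" >&2; return 1 ;;
  esac
  echo $((num * mult))
}
```

For example, mem_to_bytes 100Mi prints 104857600, while mem_to_bytes 100M prints 100000000, which is why the two suffixes should not be treated as interchangeable.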
3. Environment Variables
env-pod.yaml
# Assumes a ConfigMap named app-config and a Secret named app-secret already exist in the namespace.
apiVersion: v1
kind: Pod
metadata:
  name: env-pod
spec:
  containers:
  - name: env-demo
    image: busybox:1.35
    command: ['sh', '-c', 'env && sleep 3600']
    env:
    # Literal values
    - name: DEMO_GREETING
      value: "Hello from the environment"
    - name: DEMO_FAREWELL
      value: "Such a sweet sorrow"
    # Values taken from Pod fields (downward API)
    - name: MY_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: MY_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: MY_POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: MY_POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    # A single key from a ConfigMap
    - name: CONFIG_USERNAME
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: username
    # A single key from a Secret
    - name: SECRET_PASSWORD
      valueFrom:
        secretKeyRef:
          name: app-secret
          key: password
    # Import every key of a ConfigMap and of a Secret as environment variables
    envFrom:
    - configMapRef:
        name: app-config
    - secretRef:
        name: app-secret
4. Volume Mounts
volume-pod.yaml
# Assumes app-config (with keys app.properties and log4j.properties), app-secret, and the PVC app-pvc already exist.
apiVersion: v1
kind: Pod
metadata:
  name: volume-pod
spec:
  containers:
  - name: volume-demo
    image: nginx:1.20
    volumeMounts:
    # emptyDir volume
    - name: cache-volume
      mountPath: /cache
    # hostPath volume
    - name: host-volume
      mountPath: /host-data
      readOnly: true
    # ConfigMap volume
    - name: config-volume
      mountPath: /etc/config
    # Secret volume
    - name: secret-volume
      mountPath: /etc/secret
      readOnly: true
    # PersistentVolumeClaim volume
    - name: persistent-volume
      mountPath: /data
  volumes:
  # Temporary volume (its data is lost when the Pod is deleted)
  - name: cache-volume
    emptyDir:
      sizeLimit: 1Gi
  # Host path volume (the directory must already exist on the node)
  - name: host-volume
    hostPath:
      path: /var/log
      type: Directory
  # ConfigMap volume: project selected keys to file names
  - name: config-volume
    configMap:
      name: app-config
      items:
      - key: app.properties
        path: application.properties
      - key: log4j.properties
        path: log4j.properties
  # Secret volume: files readable by the owner only
  - name: secret-volume
    secret:
      secretName: app-secret
      defaultMode: 0400
  # PersistentVolumeClaim
  - name: persistent-volume
    persistentVolumeClaim:
      claimName: app-pvc
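The defaultMode: 0400 value in the Secret volume is an octal file mode, which YAML accepts directly. JSON has no octal literals, so a JSON manifest would need the decimal equivalent. The following sketch shows the conversion; the function name mode_to_decimal is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: convert an octal file mode (with or without the
# leading zero) to the decimal number a JSON manifest would need.
mode_to_decimal() {
  # printf treats a numeric argument with a leading 0 as octal.
  printf '%d\n' "0${1#0}"
}
```

Here mode_to_decimal 0400 prints 256 and mode_to_decimal 0644 prints 420.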
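3.3 Pod Scheduling
1. Node Selectors
node-selector-pod.yaml
# disktype and zone are custom labels; they must be set on the target nodes beforehand.
apiVersion: v1
kind: Pod
metadata:
  name: node-selector-pod
spec:
  nodeSelector:
    disktype: ssd
    zone: us-west1
  containers:
  - name: nginx
    image: nginx:1.20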
2. Node Affinity
node-affinity-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-pod
spec:
  affinity:
    nodeAffinity:
      # Hard requirements: the Pod is scheduled only on nodes that match
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
            - amd64
            - arm64
          - key: node-type
            operator: NotIn
            values:
            - spot
      # Soft preferences: matching nodes score higher, in proportion to the weight
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - us-west1-a
      - weight: 50
        preference:
          matchExpressions:
          - key: instance-type
            operator: In
            values:
            - c5.large
            - c5.xlarge
  containers:
  - name: nginx
    image: nginx:1.20
3. Pod Affinity and Anti-Affinity
pod-affinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-affinity-demo
  labels:
    app: web-server
spec:
  affinity:
    # Pod affinity: require a node that already runs a Pod labeled app=database
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - database
        topologyKey: kubernetes.io/hostname
    # Pod anti-affinity: prefer nodes that do not already run a Pod labeled app=web-server
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - web-server
          topologyKey: kubernetes.io/hostname
  containers:
  - name: web
    image: nginx:1.20
4. Taints and Tolerations
toleration-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: toleration-pod
spec:
  tolerations:
  # Tolerate a NoSchedule taint with a specific key and value
  - key: "node-type"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  # Tolerate a NoExecute taint for a limited time (seconds) before eviction
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300
  # Tolerate every taint (no key, operator Exists). Shown for illustration; use sparingly.
  - operator: "Exists"
  containers:
  - name: nginx
    image: nginx:1.20
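3.4 Pod Security
1. Security Context
security-context-pod.yaml
# Note: the stock nginx image starts as root and listens on port 80, so it may fail to start
# under these settings; an image built to run unprivileged is a better fit.
apiVersion: v1
kind: Pod
metadata:
  name: security-context-pod
spec:
  # Pod-level security context
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    fsGroupChangePolicy: "OnRootMismatch"
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: secure-container
    image: nginx:1.20
    # Container-level security context (overrides overlapping Pod-level settings)
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 2000
      capabilities:
        # Illustrative only: add back just the capabilities the workload needs
        add:
        - NET_ADMIN
        drop:
        - ALL
    # Writable scratch directories, needed because the root filesystem is read-only
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp
    - name: var-cache-nginx
      mountPath: /var/cache/nginx
    - name: var-run
      mountPath: /var/run
  volumes:
  - name: tmp-volume
    emptyDir: {}
  - name: var-cache-nginx
    emptyDir: {}
  - name: var-run
    emptyDir: {}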
2. Pod Security Policies
pod-security-policy.yaml
# PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25.
# This manifest applies only to older clusters; newer clusters use Pod Security Admission instead.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-psp
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
  - ALL
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true
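3. Network Policies
network-policy.yaml
# Takes effect only if the cluster's network plugin enforces NetworkPolicy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-netpol
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  - Egress
  ingress:
  # Allow inbound traffic from Pods with a specific label
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80
  # Allow inbound traffic from a specific namespace
  # (the namespace must carry the label name=production)
  - from:
    - namespaceSelector:
        matchLabels:
          name: production
    ports:
    - protocol: TCP
      port: 80
  egress:
  # Allow outbound traffic to the database
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 3306
  # Allow DNS queries to any destination (DNS can use both UDP and TCP)
  - to: []
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53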
3.5 Pod Lifecycle Management
1. Lifecycle Hooks
lifecycle-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-pod
spec:
  containers:
  - name: lifecycle-demo
    image: nginx:1.20
    lifecycle:
      # Runs right after the container is created; if it fails, the container is killed
      postStart:
        exec:
          command:
          - /bin/sh
          - -c
          - |
            echo "Container started at $(date)" >> /var/log/lifecycle/lifecycle.log
            nginx -t
      # Runs before the container is terminated
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - |
            echo "Container stopping at $(date)" >> /var/log/lifecycle/lifecycle.log
            nginx -s quit
    # Liveness probe: the container is restarted when it fails
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
    # Readiness probe: the Pod is removed from Service endpoints while it fails
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 3
      successThreshold: 1
      failureThreshold: 3
    # Startup probe: the other probes are held back until it succeeds
    startupProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 30
    volumeMounts:
    # Mounted in a subdirectory so that the image's /var/log/nginx is not hidden
    - name: log-volume
      mountPath: /var/log/lifecycle
  volumes:
  - name: log-volume
    emptyDir: {}
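The probe settings above translate into a time budget. As a rough sketch, and assuming the simplification that each failed attempt costs one full period (timeoutSeconds and scheduling jitter are ignored), the budget is approximately initialDelaySeconds + periodSeconds x failureThreshold. The function name probe_budget_seconds is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: approximate number of seconds a probe allows
# before it is considered failed.
# Arguments: initialDelaySeconds periodSeconds failureThreshold
probe_budget_seconds() {
  echo $(( $1 + $2 * $3 ))
}
```

With the startupProbe values from the manifest, probe_budget_seconds 10 10 30 prints 310, so the container gets roughly five minutes to start. The livenessProbe values give probe_budget_seconds 30 10 3, which prints 60.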
2. Health Checks in Detail
health-check-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: health-check-pod
spec:
  containers:
  - name: web-server
    image: nginx:1.20
    ports:
    - containerPort: 80
    # HTTP health checks. The /health and /ready paths are assumed to be served by the
    # application; the stock nginx configuration returns 404 for them.
    livenessProbe:
      httpGet:
        path: /health
        port: 80
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
  - name: database
    image: postgres:13
    env:
    # Plain-text password for demonstration only; use a Secret in practice
    - name: POSTGRES_PASSWORD
      value: "password"
    # TCP health checks
    livenessProbe:
      tcpSocket:
        port: 5432
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      tcpSocket:
        port: 5432
      initialDelaySeconds: 5
      periodSeconds: 5
  - name: worker
    image: busybox:1.35
    # Create the marker files first, otherwise both exec probes below fail
    command: ['sh', '-c', 'touch /tmp/healthy /tmp/ready; while true; do echo working; sleep 10; done']
    # Command (exec) health checks: exit code 0 means healthy
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/ready
      initialDelaySeconds: 5
      periodSeconds: 5
3. Restart Policies
restart-policy-examples.yaml
# Always (the default): the container is restarted whatever its exit code
apiVersion: v1
kind: Pod
metadata:
  name: restart-always-pod
spec:
  restartPolicy: Always
  containers:
  - name: container
    image: busybox:1.35
    command: ['sh', '-c', 'echo Hello && sleep 10 && exit 1']
---
# OnFailure: the container is restarted only when it exits with a non-zero code
apiVersion: v1
kind: Pod
metadata:
  name: restart-onfailure-pod
spec:
  restartPolicy: OnFailure
  containers:
  - name: container
    image: busybox:1.35
    command: ['sh', '-c', 'echo Hello && sleep 10 && exit 1']
---
# Never: the container is not restarted; this Pod ends in the Failed phase
apiVersion: v1
kind: Pod
metadata:
  name: restart-never-pod
spec:
  restartPolicy: Never
  containers:
  - name: container
    image: busybox:1.35
    command: ['sh', '-c', 'echo Hello && sleep 10 && exit 1']
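The three policies above reduce to a small decision rule, and repeated restarts are delayed with an exponential back-off (10s, 20s, 40s, and so on, capped at five minutes). The following sketch models both; the function names should_restart and backoff_delay are hypothetical, and the model ignores the back-off reset that happens after a container has run cleanly for a while.

```bash
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) when a container that exited
# with the given code would be restarted under the given policy.
# Arguments: restartPolicy exitCode
should_restart() {
  case "$1" in
    Always)    return 0 ;;
    OnFailure) [ "$2" -ne 0 ] ;;
    Never)     return 1 ;;
    *)         echo "unknown restart policy: $1" >&2; return 2 ;;
  esac
}

# Hypothetical helper: back-off delay in seconds before the n-th restart
# (n starts at 1), doubling from 10 seconds and capped at 300 seconds.
backoff_delay() {
  n=$1
  delay=10
  while [ "$n" -gt 1 ] && [ "$delay" -lt 300 ]; do
    delay=$((delay * 2))
    n=$((n - 1))
  done
  if [ "$delay" -gt 300 ]; then
    delay=300
  fi
  echo "$delay"
}
```

For the example Pods, which all exit with code 1, should_restart succeeds for Always and OnFailure and fails for Never.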
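3.6 Pod Management Commands
1. Basic Operations
pod-management.sh
#!/bin/bash
echo "=== Pod management command examples ==="
# Create the Pod and wait until it is Ready
echo "1. Create the Pod:"
kubectl apply -f simple-pod.yaml
kubectl wait --for=condition=Ready pod/simple-pod --timeout=120s
# List Pods
printf '\n2. List Pods:\n'
kubectl get pods
kubectl get pods -o wide
kubectl get pods --show-labels
# Show Pod details
printf '\n3. Show Pod details:\n'
kubectl describe pod simple-pod
# View Pod logs
printf '\n4. View Pod logs:\n'
kubectl logs simple-pod
kubectl logs simple-pod -c nginx-container    # select a container in a multi-container Pod
kubectl logs simple-pod --previous || true    # logs of the previous container instance; fails if there was no restart
# kubectl logs simple-pod -f                  # follow logs; blocks until interrupted, so run it interactively
# Run commands in the Pod
printf '\n5. Run commands in the Pod:\n'
kubectl exec simple-pod -- ls -la
kubectl exec -it simple-pod -- /bin/bash      # interactive shell; needs a terminal
kubectl exec simple-pod -c nginx-container -- nginx -t    # select a container in a multi-container Pod
# Port forwarding
printf '\n6. Port forwarding:\n'
kubectl port-forward simple-pod 8080:80 &
PORT_FORWARD_PID=$!
sleep 5
curl http://localhost:8080
kill "$PORT_FORWARD_PID"
# Copy files (placeholder paths)
printf '\n7. Copy files:\n'
kubectl cp /local/file simple-pod:/remote/file
kubectl cp simple-pod:/remote/file /local/file
# Show Pod resource usage (requires metrics-server)
printf '\n8. Show resource usage:\n'
kubectl top pod simple-pod
# Delete the Pod, by name or by manifest
printf '\n9. Delete the Pod:\n'
kubectl delete pod simple-pod
kubectl delete -f simple-pod.yaml --ignore-not-found
printf '\n=== Pod management commands finished ===\n'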
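2. Debugging Commands
pod-debug.sh
#!/bin/bash
echo "=== Pod debugging commands ==="
POD_NAME="debug-pod"    # name of the Pod under investigation
# Show events for the Pod
echo "1. Show Pod events:"
kubectl get events --field-selector involvedObject.name="$POD_NAME"
# Show the Pod status
printf '\n2. Show Pod status:\n'
kubectl get pod "$POD_NAME" -o yaml
kubectl get pod "$POD_NAME" -o json | jq '.status'
# Show scheduling information
printf '\n3. Show scheduling information:\n'
kubectl get pod "$POD_NAME" -o jsonpath='{.spec.nodeName}'; echo
kubectl get pod "$POD_NAME" -o jsonpath='{.status.conditions}'; echo
# Show container statuses
printf '\n4. Show container statuses:\n'
kubectl get pod "$POD_NAME" -o jsonpath='{.status.containerStatuses}'; echo
# Temporary debugging Pod (named tmp-shell so that it does not clash with $POD_NAME)
printf '\n5. Start a temporary debugging Pod:\n'
kubectl run tmp-shell --image=busybox:1.35 --rm -it --restart=Never -- sh
# kubectl debug attaches an ephemeral container to the running Pod
# (ephemeral containers are enabled by default from Kubernetes 1.23)
printf '\n6. Use kubectl debug:\n'
kubectl debug "$POD_NAME" -it --image=busybox:1.35
printf '\n=== Pod debugging finished ===\n'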
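3. Batch Operations
batch-pod-operations.sh
#!/bin/bash
echo "=== Batch Pod operations ==="
# Select Pods by label
echo "1. Select by label:"
kubectl get pods -l app=nginx
# Select Pods by field
printf '\n2. Select by field:\n'
kubectl get pods --field-selector status.phase=Running
kubectl get pods --field-selector spec.nodeName=worker-node-1
# View logs from several Pods
printf '\n3. View logs from several Pods:\n'
kubectl logs -l app=nginx --tail=10
# Run a command in each Pod
printf '\n4. Run a command in each Pod:\n'
for pod in $(kubectl get pods -l app=nginx -o jsonpath='{.items[*].metadata.name}'); do
  echo "Executing command on $pod"
  kubectl exec "$pod" -- nginx -t
done
# Forward one local port per Pod, starting at 8080
printf '\n5. Port-forward each Pod:\n'
PORT=8080
PIDS=""
for pod in $(kubectl get pods -l app=nginx -o jsonpath='{.items[*].metadata.name}'); do
  kubectl port-forward "$pod" "$PORT:80" &
  PIDS="$PIDS $!"
  PORT=$((PORT + 1))
done
sleep 5
kill $PIDS 2>/dev/null    # stop the background port-forwards
# Delete the selected Pods last, because the steps above need them
printf '\n6. Delete by label:\n'
kubectl delete pods -l app=nginx
printf '\n=== Batch operations finished ===\n'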
3.7 Pod Troubleshooting
1. Diagnosing Common Problems
pod-troubleshooting.sh
#!/bin/bash
POD_NAME=$1
if [ -z "$POD_NAME" ]; then
  echo "Usage: $0 <pod-name>"
  exit 1
fi
echo "=== Pod troubleshooting: $POD_NAME ==="
# 1. Basic information
echo "1. Basic Pod information:"
kubectl get pod "$POD_NAME" -o wide
# 2. Pod status
printf '\n2. Detailed Pod status:\n'
kubectl describe pod "$POD_NAME"
# 3. Container states (read as JSON so that jq can parse the output)
printf '\n3. Container states:\n'
kubectl get pod "$POD_NAME" -o json | jq '.status.containerStatuses[]?.state'
# 4. Restart counts
printf '\n4. Container restart counts:\n'
kubectl get pod "$POD_NAME" -o jsonpath='{.status.containerStatuses[*].restartCount}'; echo
# 5. Recent events
printf '\n5. Related events:\n'
kubectl get events --field-selector involvedObject.name="$POD_NAME" --sort-by='.lastTimestamp'
# 6. Logs
printf '\n6. Container logs:\n'
kubectl logs "$POD_NAME" --tail=50
# 7. Logs of the previous container instance (if it restarted)
printf '\n7. Previous container logs:\n'
kubectl logs "$POD_NAME" --previous --tail=50 2>/dev/null || echo "No previous container logs"
# 8. Resource usage
printf '\n8. Resource usage:\n'
kubectl top pod "$POD_NAME" 2>/dev/null || echo "metrics-server is not installed or the Pod is not running"
# 9. Network
printf '\n9. Network information:\n'
kubectl get pod "$POD_NAME" -o jsonpath='{.status.podIP}'; echo
# 10. Storage
printf '\n10. Volumes:\n'
kubectl get pod "$POD_NAME" -o json | jq '.spec.volumes'
printf '\n=== Troubleshooting finished ===\n'
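The script above prints raw state; the reason field of a waiting or terminated container usually points to the next thing to check. The following sketch maps a few common reasons to a first step. The function name hint_for_reason and the wording of the hints are hypothetical, and the list is far from exhaustive.

```bash
#!/bin/bash
# Hypothetical helper: suggest a first check for a container status reason.
hint_for_reason() {
  case "$1" in
    ImagePullBackOff|ErrImagePull)
      echo "check the image name, tag, and registry credentials" ;;
    CrashLoopBackOff)
      echo "read the previous container logs with kubectl logs --previous" ;;
    CreateContainerConfigError)
      echo "check that the referenced ConfigMaps and Secrets exist" ;;
    OOMKilled)
      echo "compare memory usage with the memory limit" ;;
    *)
      echo "inspect the Pod events with kubectl describe pod" ;;
  esac
}
```

With a cluster available, the reason of the first container can be read with kubectl get pod "$POD_NAME" -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}' and passed to the helper.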
2. Performance Analysis
pod-performance.yaml
# Assumes the nginx configuration enables stub_status at /nginx_status; the stock image does not.
# The prometheus.io annotations are a scrape convention and work only if Prometheus is configured to honor them.
apiVersion: v1
kind: Pod
metadata:
  name: performance-test-pod
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9113"    # port of the exporter sidecar below
spec:
  containers:
  - name: app
    image: nginx:1.20
    resources:
      requests:
        memory: "100Mi"
        cpu: "100m"
      limits:
        memory: "200Mi"
        cpu: "200m"
    ports:
    - containerPort: 80
      name: http
    # Health checks
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
  # Metrics sidecar: exposes nginx metrics on port 9113
  - name: metrics-exporter
    image: nginx/nginx-prometheus-exporter:0.10.0
    args:
    - -nginx.scrape-uri=http://localhost/nginx_status
    ports:
    - containerPort: 9113
      name: metrics
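Summary
This chapter covered the core concepts of Pods and how to manage them:
Core concepts
- Pod definition - the smallest deployable unit in Kubernetes
- Lifecycle - Pod phases and container states
- Multi-container patterns - sidecars and init containers
Configuration
- Resource management - requests and limits for CPU, memory, and ephemeral storage
- Environment variables - the different ways to set them
- Volume mounts - how to use the various volume types
Scheduling
- Node selection - nodeSelector and node affinity
- Pod affinity - scheduling relationships between Pods
- Taints and tolerations - scheduling onto special-purpose nodes
Security
- Security context - user, group, and privilege controls
- Network policies - network access control
- Pod security policies - PodSecurityPolicy configuration (removed in Kubernetes 1.25)
Lifecycle management
- Lifecycle hooks - postStart and preStop
- Health checks - liveness, readiness, and startup probes
- Restart policies - Always, OnFailure, and Never
Operations
- Basic operations - create, inspect, delete, and debug
- Troubleshooting - logs, events, and status checks
- Performance monitoring - resource usage and metrics
Best practices
- Resource planning - set sensible requests and limits
- Health checks - configure appropriate probes
- Security - follow the principle of least privilege
- Monitoring and alerting - build a complete monitoring setup
The next chapter covers Deployment and ReplicaSet, and how they manage Pod replicas and update strategies.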