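3.1 Pod Basics

1. Pod Definition

A Pod is the smallest deployable unit in Kubernetes. It contains one or more containers that share storage, networking, and a specification for how they run.

Pod characteristics:
- Shared network namespace (IP address and port space)
- Shared storage volumes
- Containers in the same Pod can reach each other over localhost
- The whole Pod is scheduled as a single unit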

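2. Pod Lifecycle

Pod phases:
- Pending - the Pod has been accepted by the cluster, but one or more containers are not yet ready to run (this includes time spent waiting for scheduling and pulling images)
- Running - the Pod is bound to a node, all containers have been created, and at least one container is running, starting, or restarting
- Succeeded - all containers in the Pod terminated successfully and will not be restarted
- Failed - all containers in the Pod have terminated, and at least one terminated in failure
- Unknown - the Pod's state could not be obtained, usually because the node cannot be reached

Container states:
- Waiting - the container is waiting to start (for example, while its image is pulled)
- Running - the container is executing
- Terminated - the container ran to completion or failed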
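
The phase is stored in the Pod's status, so it can be read directly with kubectl. The sketch below wraps that lookup in two small shell helpers. `pod_phase` and `is_terminal_phase` are hypothetical names introduced for this illustration, and the Pod name is whatever you pass in.

```shell
#!/bin/sh
# Print the phase of a Pod (Pending, Running, Succeeded, Failed, or Unknown).
pod_phase() {
  kubectl get pod "$1" -o jsonpath='{.status.phase}'
}

# Succeed if the given phase is terminal, i.e. the Pod will not run again.
is_terminal_phase() {
  case "$1" in
    Succeeded|Failed) return 0 ;;
    *) return 1 ;;
  esac
}
```

A typical call would be `is_terminal_phase "$(pod_phase simple-pod)" && echo "finished"`.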

3. Simple Pod Example

simple-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: simple-pod
  labels:
    app: simple-app
    version: v1
  annotations:
    description: "A simple Pod example"
spec:
  containers:
  - name: nginx-container
    image: nginx:1.20
    ports:
    - containerPort: 80
      name: http
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  restartPolicy: Always

3.2 Pod Configuration in Detail

1. Multi-Container Pods

multi-container-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
  labels:
    app: multi-app
spec:
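  containers:
  # Main application container
  - name: web-server
    image: nginx:1.20
    ports:
    - containerPort: 80
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
    - name: nginx-logs
      mountPath: /var/log/nginx
    - name: nginx-config
      mountPath: /etc/nginx/conf.d

  # Log collector container (sidecar pattern); reads the log files nginx writes
  - name: log-collector
    image: fluent/fluent-bit:1.9
    volumeMounts:
    - name: nginx-logs
      mountPath: /var/log/nginx
      readOnly: true
    - name: fluent-bit-config
      mountPath: /fluent-bit/etc

  # Init container: runs to completion before the application containers start
  initContainers:
  - name: init-web-content
    image: busybox:1.35
    command: ['sh', '-c']
    args:
    - |
      echo "<h1>Hello from Multi-Container Pod</h1>" > /work-dir/index.html
      echo "<p>Generated at: $(date)</p>" >> /work-dir/index.html
    volumeMounts:
    - name: shared-data
      mountPath: /work-dir

  volumes:
  # Web content written by the init container and served by nginx
  - name: shared-data
    emptyDir: {}
  # Log directory shared between nginx and the log collector
  - name: nginx-logs
    emptyDir: {}
  # The two ConfigMaps below must already exist in the namespace
  - name: nginx-config
    configMap:
      name: nginx-config
  - name: fluent-bit-config
    configMap:
      name: fluent-bit-config

  restartPolicy: Always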

2. Resource Management

resource-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: resource-pod
spec:
  containers:
  - name: cpu-memory-demo
    image: nginx:1.20
    resources:
      requests:
        memory: "100Mi"    # request 100 MiB of memory
        cpu: "100m"        # request 0.1 CPU core
        ephemeral-storage: "1Gi"  # request 1 GiB of ephemeral storage
      limits:
        memory: "200Mi"    # limit memory to 200 MiB
        cpu: "200m"        # limit CPU to 0.2 core
        ephemeral-storage: "2Gi"  # limit ephemeral storage to 2 GiB
    env:
    - name: MEMORY_REQUEST
      valueFrom:
        resourceFieldRef:
          resource: requests.memory
    - name: CPU_LIMIT
      valueFrom:
        resourceFieldRef:
          resource: limits.cpu
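
The quantities in this manifest use Kubernetes suffixes: `m` means thousandths of a CPU core, while `Ki`, `Mi`, and `Gi` are powers of 1024 (as opposed to the decimal `k`, `M`, `G`). The following is a minimal sketch of how those strings map to plain numbers. The helper names are made up for this example, and the memory helper only handles whole-number quantities with the suffixes listed.

```shell
#!/bin/sh
# Convert a CPU quantity ("250m", "1", "0.5") to millicores.
cpu_to_millicores() {
  awk -v q="$1" 'BEGIN {
    if (q ~ /m$/) { sub(/m$/, "", q); printf "%d\n", q }
    else          { printf "%d\n", q * 1000 + 0.5 }
  }'
}

# Convert a whole-number memory quantity ("64Mi", "1Gi", "100M") to bytes.
mem_to_bytes() {
  q=$1
  case "$q" in
    *Ki) echo $(( ${q%Ki} * 1024 )) ;;
    *Mi) echo $(( ${q%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${q%Gi} * 1024 * 1024 * 1024 )) ;;
    *k)  echo $(( ${q%k} * 1000 )) ;;
    *M)  echo $(( ${q%M} * 1000000 )) ;;
    *G)  echo $(( ${q%G} * 1000000000 )) ;;
    *)   echo "$q" ;;   # no suffix: already in bytes
  esac
}
```

For the limits above, `cpu_to_millicores 200m` prints 200 and `mem_to_bytes 200Mi` prints 209715200.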

3. Environment Variables

env-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: env-pod
spec:
  containers:
  - name: env-demo
    image: busybox:1.35
    command: ['sh', '-c', 'env && sleep 3600']
    env:
    # Set values directly
    - name: DEMO_GREETING
      value: "Hello from the environment"
    - name: DEMO_FAREWELL
      value: "Such a sweet sorrow"
    
    # Read from Pod fields (Downward API)
    - name: MY_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: MY_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: MY_POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: MY_POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    
    # Read a single key from a ConfigMap
    - name: CONFIG_USERNAME
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: username
    
    # Read a single key from a Secret
    - name: SECRET_PASSWORD
      valueFrom:
        secretKeyRef:
          name: app-secret
          key: password
    
    # Import every key from a ConfigMap and a Secret
    envFrom:
    - configMapRef:
        name: app-config
    - secretRef:
        name: app-secret

4. Volume Mounts

volume-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: volume-pod
spec:
  containers:
  - name: volume-demo
    image: nginx:1.20
    volumeMounts:
    # emptyDir volume
    - name: cache-volume
      mountPath: /cache
    
    # hostPath volume
    - name: host-volume
      mountPath: /host-data
      readOnly: true
    
    # ConfigMap volume
    - name: config-volume
      mountPath: /etc/config
    
    # Secret volume
    - name: secret-volume
      mountPath: /etc/secret
      readOnly: true
    
    # PersistentVolumeClaim volume
    - name: persistent-volume
      mountPath: /data
  
  volumes:
  # Temporary volume (its data is lost when the Pod is deleted)
  - name: cache-volume
    emptyDir:
      sizeLimit: 1Gi
  
  # Host path volume
  - name: host-volume
    hostPath:
      path: /var/log
      type: Directory
  
  # ConfigMap volume
  - name: config-volume
    configMap:
      name: app-config
      items:
      - key: app.properties
        path: application.properties
      - key: log4j.properties
        path: log4j.properties
  
  # Secret volume
  - name: secret-volume
    secret:
      secretName: app-secret
      defaultMode: 0400
  
  # PersistentVolumeClaim
  - name: persistent-volume
    persistentVolumeClaim:
      claimName: app-pvc
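
The `defaultMode: 0400` value above is written in octal, which YAML accepts. JSON has no octal literals, so the same field in a JSON manifest has to be given in decimal. A quick conversion, relying on the shell's `printf` treating a leading `0` as octal (`mode_to_decimal` is a hypothetical helper name):

```shell
#!/bin/sh
# Convert an octal file mode (with or without the leading 0) to decimal.
mode_to_decimal() {
  printf '%d\n' "0${1#0}"
}
```

`mode_to_decimal 0400` prints 256, which is the value to use for `defaultMode` in JSON.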

3.3 Pod Scheduling

1. Node Selector

node-selector-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: node-selector-pod
spec:
  nodeSelector:
    disktype: ssd
    zone: us-west1
  containers:
  - name: nginx
    image: nginx:1.20

2. Node Affinity

node-affinity-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-pod
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: the node must satisfy these expressions
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
            - amd64
            - arm64
          - key: node-type
            operator: NotIn
            values:
            - spot
      
      # Soft preference: satisfied when possible, weighted from 1 to 100
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - us-west1-a
      - weight: 50
        preference:
          matchExpressions:
          - key: instance-type
            operator: In
            values:
            - c5.large
            - c5.xlarge
  
  containers:
  - name: nginx
    image: nginx:1.20
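
For the preferred rules above, the scheduler adds up the weights of every preference a candidate node satisfies and folds that sum into the node's overall score. The sketch below models only that summation. It ignores the required rules and every other scoring input, and `preferred_score` with its `weight:key=v1|v2` argument format is invented for this illustration.

```shell
#!/bin/sh
# Sum the weights of the preferences a node satisfies.
#   $1      node labels as "key=value,key=value"
#   $2...   preferences as "weight:key=value1|value2" (an "In" match)
preferred_score() {
  labels=$1
  shift
  score=0
  for term in "$@"; do
    weight=${term%%:*}
    rule=${term#*:}
    key=${rule%%=*}
    values=${rule#*=}
    old_ifs=$IFS
    IFS='|'
    for v in $values; do
      case ",$labels," in
        *",$key=$v,"*) score=$(( score + weight )); break ;;
      esac
    done
    IFS=$old_ifs
  done
  echo "$score"
}
```

With the two preferences from the manifest, a node labelled `zone=us-west1-a,instance-type=c5.xlarge` scores 150, while a node that matches only the instance type scores 50.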

3. Pod Affinity and Anti-Affinity

pod-affinity.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-affinity-demo
  labels:
    app: web-server
spec:
  affinity:
    # Pod affinity (required): only schedule onto a node that already runs matching Pods
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - database
        topologyKey: kubernetes.io/hostname
    
    # Pod anti-affinity (preferred): try to avoid nodes that already run matching Pods
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - web-server
          topologyKey: kubernetes.io/hostname
  
  containers:
  - name: web
    image: nginx:1.20

4. Taints and Tolerations

toleration-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: toleration-pod
spec:
  tolerations:
  # Tolerate a NoSchedule taint
  - key: "node-type"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  
  # Tolerate a NoExecute taint for a limited time (evicted after 300 seconds)
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300
  
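  # Tolerate every taint (an empty key with Exists matches all taints, so it makes the entries above redundant; use with care)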
  - operator: "Exists"
  
  containers:
  - name: nginx
    image: nginx:1.20
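
The matching rules behind these tolerations are short enough to sketch as a shell function: the effect must match (an empty toleration effect matches any), the key must match (an empty key with `Exists` matches any), and with `Equal` the values must match too. The function name and argument order are assumptions made for this example, not part of any Kubernetes tool.

```shell
#!/bin/sh
# Succeed if a toleration matches a taint.
# Args: tol_key tol_operator tol_value tol_effect taint_key taint_value taint_effect
toleration_matches() {
  tol_key=$1
  tol_op=${2:-Equal}      # Equal is the default operator
  tol_value=$3
  tol_effect=$4
  taint_key=$5
  taint_value=$6
  taint_effect=$7

  # An empty toleration effect matches every effect.
  if [ -n "$tol_effect" ] && [ "$tol_effect" != "$taint_effect" ]; then
    return 1
  fi
  # An empty key is only meaningful with Exists and then matches every key.
  if [ -z "$tol_key" ]; then
    if [ "$tol_op" = "Exists" ]; then return 0; else return 1; fi
  fi
  [ "$tol_key" = "$taint_key" ] || return 1
  case "$tol_op" in
    Exists) return 0 ;;
    Equal)  [ "$tol_value" = "$taint_value" ] ;;
    *)      return 1 ;;
  esac
}
```

For example, `toleration_matches node-type Equal gpu NoSchedule node-type gpu NoSchedule` succeeds, and so does the catch-all `toleration_matches "" Exists "" "" any-key any-value NoExecute`.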

3.4 Pod Security

1. Security Context

security-context-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: security-context-pod
spec:
  # Pod-level security context
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    fsGroupChangePolicy: "OnRootMismatch"
    seccompProfile:
      type: RuntimeDefault
  
  containers:
  - name: secure-container
    image: nginx:1.20
    # Container-level security context (overrides Pod-level settings for fields set in both)
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 2000
      capabilities:
        add:
        - NET_ADMIN
        drop:
        - ALL
    
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp
    - name: var-cache-nginx
      mountPath: /var/cache/nginx
    - name: var-run
      mountPath: /var/run
  
  volumes:
  - name: tmp-volume
    emptyDir: {}
  - name: var-cache-nginx
    emptyDir: {}
  - name: var-run
    emptyDir: {}
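
In this manifest `runAsUser` is set at both levels, and the container-level value wins. A tiny sketch of that precedence and of the `runAsNonRoot` check, with hypothetical helper names and assuming the UIDs are given explicitly as numbers:

```shell
#!/bin/sh
# Print the UID a container runs as: its own runAsUser if set, else the Pod-level one.
effective_run_as_user() {
  pod_uid=$1
  container_uid=$2
  echo "${container_uid:-$pod_uid}"
}

# Succeed if a numeric UID satisfies runAsNonRoot (any UID other than 0).
satisfies_run_as_non_root() {
  [ "$1" -ne 0 ]
}
```

With the values above, `effective_run_as_user 1000 2000` prints 2000, so `secure-container` runs as UID 2000 rather than 1000.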

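2. Pod Security Policy

Note: PodSecurityPolicy was deprecated in Kubernetes v1.21 and removed in v1.25. The manifest below applies only to older clusters; newer clusters use Pod Security Admission instead.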

pod-security-policy.yaml

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-psp
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true

3. Network Policy

network-policy.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-netpol
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  - Egress
  
  ingress:
  # Allow inbound traffic from Pods with a specific label
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80
  
  # Allow inbound traffic from a specific namespace
  - from:
    - namespaceSelector:
        matchLabels:
          name: production
    ports:
    - protocol: TCP
      port: 80
  
  egress:
  # Allow outbound traffic to the database
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 3306
  
  # Allow DNS lookups (DNS uses UDP and falls back to TCP for large responses)
  - to: []
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53

3.5 Pod Lifecycle Management

1. Lifecycle Hooks

lifecycle-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-pod
spec:
  containers:
  - name: lifecycle-demo
    image: nginx:1.20
    lifecycle:
      # Runs right after the container is created
      postStart:
        exec:
          command:
          - /bin/sh
          - -c
          - |
            echo "Container started at $(date)" >> /var/log/lifecycle/lifecycle.log
            nginx -t
      
      # Runs before the container is terminated
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - |
            echo "Container stopping at $(date)" >> /var/log/lifecycle/lifecycle.log
            nginx -s quit
    
    # Liveness probe: the container is restarted when it fails
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
    
    # Readiness probe: the Pod receives no Service traffic while it fails
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
      timeoutSeconds: 3
      successThreshold: 1
      failureThreshold: 3
    
    # Startup probe: the other probes are held off until it succeeds
    startupProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 30
    
    volumeMounts:
    - name: log-volume
      # Mounted in a subdirectory so that the image's /var/log/nginx is not hidden
      mountPath: /var/log/lifecycle
  
  volumes:
  - name: log-volume
    emptyDir: {}
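
The probe settings above translate into time budgets: a probe trips after `failureThreshold` consecutive failures spaced `periodSeconds` apart, and `initialDelaySeconds` is added before the first attempt. A rough calculation, ignoring probe timeouts (`probe_failure_window` is a name chosen for this sketch):

```shell
#!/bin/sh
# Seconds of consecutive failures a probe tolerates before it trips.
#   $1 periodSeconds   $2 failureThreshold
probe_failure_window() {
  period=$1
  failure_threshold=$2
  echo $(( period * failure_threshold ))
}
```

For the startup probe, `probe_failure_window 10 30` prints 300, so the container has about five minutes (plus the 10 second initial delay) to start. For the liveness probe, `probe_failure_window 10 3` prints 30.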

2. Health Checks in Detail

health-check-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: health-check-pod
spec:
  containers:
  - name: web-server
    image: nginx:1.20
    ports:
    - containerPort: 80
    
    # HTTP health check (assumes the application serves /health and /ready; the stock nginx image does not)
    livenessProbe:
      httpGet:
        path: /health
        port: 80
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 30
      periodSeconds: 10
    
    readinessProbe:
      httpGet:
        path: /ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
  
  - name: database
    image: postgres:13
    env:
    - name: POSTGRES_PASSWORD
      value: "password"
    
    # TCP health check
    livenessProbe:
      tcpSocket:
        port: 5432
      initialDelaySeconds: 30
      periodSeconds: 10
    
    readinessProbe:
      tcpSocket:
        port: 5432
      initialDelaySeconds: 5
      periodSeconds: 5
  
  - name: worker
    image: busybox:1.35
    command: ['sh', '-c', 'touch /tmp/healthy /tmp/ready; while true; do echo working; sleep 10; done']
    
    # Exec (command) health check; the probes pass as long as the files exist
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
    
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/ready
      initialDelaySeconds: 5
      periodSeconds: 5

3. Restart Policy

restart-policy-examples.yaml

# Always restart policy (the default)
apiVersion: v1
kind: Pod
metadata:
  name: restart-always-pod
spec:
  restartPolicy: Always
  containers:
  - name: container
    image: busybox:1.35
    command: ['sh', '-c', 'echo Hello && sleep 10 && exit 1']

---
# OnFailure restart policy
apiVersion: v1
kind: Pod
metadata:
  name: restart-onfailure-pod
spec:
  restartPolicy: OnFailure
  containers:
  - name: container
    image: busybox:1.35
    command: ['sh', '-c', 'echo Hello && sleep 10 && exit 1']

---
# Never restart policy
apiVersion: v1
kind: Pod
metadata:
  name: restart-never-pod
spec:
  restartPolicy: Never
  containers:
  - name: container
    image: busybox:1.35
    command: ['sh', '-c', 'echo Hello && sleep 10 && exit 1']
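
All three Pods above run the same command, which exits with code 1 after 10 seconds, so only the policy decides what happens next. The sketch below models that decision and the restart back-off, which Kubernetes documents as 10s, 20s, 40s and so on, capped at five minutes. It is a simplified model with made-up function names and leaves out details such as the back-off being reset after a container has run cleanly for a while.

```shell
#!/bin/sh
# Succeed if a container that exited with the given code would be restarted.
#   $1 restartPolicy   $2 exit code
should_restart() {
  policy=$1
  exit_code=$2
  case "$policy" in
    Always)    return 0 ;;
    OnFailure) [ "$exit_code" -ne 0 ] ;;
    *)         return 1 ;;   # Never
  esac
}

# Print the delay in seconds before the next restart.
#   $1 number of restarts so far (0 for the first restart)
backoff_delay() {
  restarts=$1
  delay=10
  i=0
  while [ "$i" -lt "$restarts" ] && [ "$delay" -lt 300 ]; do
    delay=$(( delay * 2 ))
    i=$(( i + 1 ))
  done
  if [ "$delay" -gt 300 ]; then
    delay=300
  fi
  echo "$delay"
}
```

Here `should_restart Always 1` and `should_restart OnFailure 1` succeed while `should_restart Never 1` fails, matching the behaviour of the three example Pods.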

3.6 Pod Management Commands

1. Basic Operations

pod-management.sh

#!/bin/bash

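echo "=== Pod management command examples ==="

# Create the Pod
echo "1. Create the Pod:"
kubectl apply -f simple-pod.yaml

# List Pods
echo -e "\n2. List Pods:"
kubectl get pods
kubectl get pods -o wide
kubectl get pods --show-labels

# Show Pod details
echo -e "\n3. Show Pod details:"
kubectl describe pod simple-pod

# View Pod logs
echo -e "\n4. View Pod logs:"
kubectl logs simple-pod
kubectl logs simple-pod -c nginx-container  # select a container in a multi-container Pod
kubectl logs simple-pod --previous          # logs of the previous container instance (errors if it never restarted)
kubectl logs simple-pod -f                  # follow the logs in real time (blocks until Ctrl+C)

# Run commands inside the Pod
echo -e "\n5. Run commands inside the Pod:"
kubectl exec simple-pod -- ls -la
kubectl exec -it simple-pod -- /bin/bash    # interactive shell (blocks until you exit)
kubectl exec simple-pod -c nginx-container -- nginx -t  # select a container in a multi-container Pod

# Port forwarding
echo -e "\n6. Port forwarding:"
kubectl port-forward simple-pod 8080:80 &
PORT_FORWARD_PID=$!
sleep 5
curl http://localhost:8080
kill $PORT_FORWARD_PID

# Copy files (the paths are placeholders)
echo -e "\n7. Copy files:"
kubectl cp /local/file simple-pod:/remote/file
kubectl cp simple-pod:/remote/file /local/file

# Show Pod resource usage (requires metrics-server)
echo -e "\n8. Show resource usage:"
kubectl top pod simple-pod

# Delete the Pod (by name or by manifest; the second form is a no-op once the Pod is gone)
echo -e "\n9. Delete the Pod:"
kubectl delete pod simple-pod
kubectl delete -f simple-pod.yaml --ignore-not-found

echo -e "\n=== Pod management commands finished ==="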

2. Debugging Commands

pod-debug.sh

#!/bin/bash

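echo "=== Pod debugging commands ==="

# Name of the Pod being debugged
POD_NAME="debug-pod"

# Show Pod events
echo "1. Show Pod events:"
kubectl get events --field-selector involvedObject.name="$POD_NAME"

# Show Pod status
echo -e "\n2. Show Pod status:"
kubectl get pod "$POD_NAME" -o yaml
kubectl get pod "$POD_NAME" -o json | jq '.status'

# Show scheduling information
echo -e "\n3. Show scheduling information:"
kubectl get pod "$POD_NAME" -o jsonpath='{.spec.nodeName}'; echo
kubectl get pod "$POD_NAME" -o jsonpath='{.status.conditions}'; echo

# Show container status
echo -e "\n4. Show container status:"
kubectl get pod "$POD_NAME" -o jsonpath='{.status.containerStatuses}'; echo

# Throwaway debugging Pod (uses its own name so it does not clash with $POD_NAME)
echo -e "\n5. Create a throwaway debugging Pod:"
kubectl run tmp-shell --image=busybox:1.35 --rm -it --restart=Never -- sh

# kubectl debug (command available from kubectl 1.20; ephemeral containers are enabled by default from Kubernetes 1.23)
echo -e "\n6. Use kubectl debug:"
kubectl debug "$POD_NAME" -it --image=busybox:1.35

echo -e "\n=== Pod debugging finished ==="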

3. Batch Operations

batch-pod-operations.sh

#!/bin/bash

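echo "=== Batch Pod operations ==="

# Select Pods by label
echo "1. Select by label:"
kubectl get pods -l app=nginx

# Select Pods by field selector
echo -e "\n2. Select by field selector:"
kubectl get pods --field-selector status.phase=Running
kubectl get pods --field-selector spec.nodeName=worker-node-1

# View logs from several Pods
echo -e "\n3. View logs in bulk:"
kubectl logs -l app=nginx --tail=10

# Run a command in several Pods
echo -e "\n4. Run a command in bulk:"
for pod in $(kubectl get pods -l app=nginx -o jsonpath='{.items[*].metadata.name}'); do
  echo "Executing command on $pod"
  kubectl exec "$pod" -- nginx -t
done

# Forward a local port to each Pod
# (the forwards keep running in the background; stop them with: pkill -f 'kubectl port-forward')
echo -e "\n5. Port forwarding in bulk:"
PORT=8080
for pod in $(kubectl get pods -l app=nginx -o jsonpath='{.items[*].metadata.name}'); do
  kubectl port-forward "$pod" "$PORT:80" &
  PORT=$((PORT + 1))
done

# Delete Pods by label (destructive, so it runs last)
echo -e "\n6. Delete by label:"
kubectl delete pods -l app=nginx

echo -e "\n=== Batch operations finished ==="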

3.7 Pod Troubleshooting

1. Diagnosing Common Problems

pod-troubleshooting.sh

#!/bin/bash

POD_NAME=$1

if [ -z "$POD_NAME" ]; then
  echo "Usage: $0 <pod-name>"
  exit 1
fi

echo "=== Pod troubleshooting: $POD_NAME ==="

# 1. Basic information
echo "1. Basic Pod information:"
kubectl get pod "$POD_NAME" -o wide

# 2. Pod status
echo -e "\n2. Detailed Pod status:"
kubectl describe pod "$POD_NAME"

# 3. Container states
echo -e "\n3. Container states:"
kubectl get pod "$POD_NAME" -o jsonpath='{.status.containerStatuses[*].state}' | jq .

# 4. Restart counts
echo -e "\n4. Container restart counts:"
kubectl get pod "$POD_NAME" -o jsonpath='{.status.containerStatuses[*].restartCount}'
echo

# 5. Recent events
echo -e "\n5. Related events:"
kubectl get events --field-selector involvedObject.name="$POD_NAME" --sort-by='.lastTimestamp'

# 6. Logs
echo -e "\n6. Container logs:"
kubectl logs "$POD_NAME" --tail=50

# 7. Logs of the previous container instance (if it restarted)
echo -e "\n7. Previous container logs:"
kubectl logs "$POD_NAME" --previous --tail=50 2>/dev/null || echo "No previous container logs"

# 8. Resource usage
echo -e "\n8. Resource usage:"
kubectl top pod "$POD_NAME" 2>/dev/null || echo "metrics-server is not installed or the Pod is not running"

# 9. Network
echo -e "\n9. Network information:"
kubectl get pod "$POD_NAME" -o jsonpath='{.status.podIP}'
echo

# 10. Storage
echo -e "\n10. Mounted volumes:"
kubectl get pod "$POD_NAME" -o jsonpath='{.spec.volumes}' | jq .

echo -e "\n=== Troubleshooting finished ==="
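
When a container is stuck in the Waiting state, its `reason` field usually points at the next thing to check. The sketch below reads those reasons and maps a few common ones to a first step. The helper names and the hint texts are suggestions made up for this example, not output of any Kubernetes tool.

```shell
#!/bin/sh
# Print the waiting reasons of all containers in a Pod, space-separated.
waiting_reasons() {
  kubectl get pod "$1" -o jsonpath='{.status.containerStatuses[*].state.waiting.reason}'
}

# Map a common waiting reason to a first thing to check.
hint_for_reason() {
  case "$1" in
    ErrImagePull|ImagePullBackOff)
      echo "check the image name, tag, and registry credentials" ;;
    CrashLoopBackOff)
      echo "check 'kubectl logs --previous' for the crash cause" ;;
    CreateContainerConfigError)
      echo "check that referenced ConfigMaps and Secrets exist" ;;
    ContainerCreating)
      echo "check the events for volume mount or image pull delays" ;;
    *)
      echo "see the events in 'kubectl describe pod'" ;;
  esac
}
```

A possible use is `for r in $(waiting_reasons "$POD_NAME"); do hint_for_reason "$r"; done`.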

2. Performance Analysis

pod-performance.yaml

apiVersion: v1
kind: Pod
metadata:
  name: performance-test-pod
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9113"   # port served by the metrics-exporter sidecar
spec:
  containers:
  - name: app
    image: nginx:1.20
    resources:
      requests:
        memory: "100Mi"
        cpu: "100m"
      limits:
        memory: "200Mi"
        cpu: "200m"
    
    # Application port (metrics are exposed by the sidecar, not by this container)
    ports:
    - containerPort: 80
      name: http
    
    # Health checks
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 30
      periodSeconds: 10
    
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
  
  # Metrics sidecar (assumes nginx is configured to serve stub_status at /nginx_status)
  - name: metrics-exporter
    image: nginx/nginx-prometheus-exporter:0.10.0
    args:
    - -nginx.scrape-uri=http://localhost/nginx_status
    ports:
    - containerPort: 9113
      name: metrics

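Summary

This chapter covered the core concepts of Pods and how to manage them:

Core concepts

  1. Pod definition - the smallest deployable unit in Kubernetes
  2. Lifecycle - Pod phases and container states
  3. Multi-container patterns - sidecars, init containers, and similar patterns

Configuration

  1. Resource management - requests and limits for CPU, memory, and storage
  2. Environment variables - the different ways to set them
  3. Volume mounts - how to use the various volume types

Scheduling

  1. Node selection - nodeSelector and node affinity
  2. Pod affinity - scheduling relationships between Pods
  3. Taints and tolerations - scheduling onto special-purpose nodes

Security

  1. Security context - user, group, and privilege control
  2. Network policy - network access control
  3. Security policy - Pod Security Policy configuration

Lifecycle management

  1. Lifecycle hooks - postStart and preStop
  2. Health checks - liveness, readiness, and startup probes
  3. Restart policy - Always, OnFailure, and Never

Operations

  1. Basic operations - create, inspect, delete, and debug
  2. Troubleshooting - log analysis, events, and status checks
  3. Performance monitoring - resource usage and performance metrics

Best practices

  1. Resource planning - set sensible requests and limits
  2. Health checks - configure appropriate probes
  3. Security - follow the principle of least privilege
  4. Monitoring and alerting - build a complete monitoring setup

The next chapter covers Deployments and ReplicaSets, and how they manage Pod replicas and update strategies.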