10.1 Kubernetes Security Overview

10.1.1 Security Model

# Kubernetes security model overview
apiVersion: v1
kind: ConfigMap
metadata:
  name: security-model
data:
  security-layers: |
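    Kubernetes security follows a defense-in-depth strategy built from several layers:

    1. Cluster security:
       - API Server authentication and authorization
       - etcd encryption at rest
       - Network policies and isolation
       - Node hardening

    2. Workload security:
       - Pod security policies
       - Container image security
       - Runtime security
       - Resource limits

    3. Data security:
       - Secret management
       - Data encryption
       - Backup security
       - Audit logging

    4. Network security:
       - Network policies
       - Service mesh
       - TLS encryption
       - Ingress control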
  
  security-principles: |
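    Security design principles:

    1. Least privilege:
       - Grant only the minimum permissions required
       - Review and revoke permissions regularly
       - Use dedicated service accounts

    2. Defense in depth:
       - Multiple layers of security controls
       - Redundant security mechanisms
       - Fail-safe design

    3. Zero-trust architecture:
       - Trust no network location
       - Verify every request
       - Monitor and validate continuously

    4. Shift-left security:
       - Integrate security during development
       - Automate security checks
       - Test security continuously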

10.1.2 Threat Model

apiVersion: v1
kind: ConfigMap
metadata:
  name: threat-model
data:
  common-threats: |
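    Common security threats:

    1. Unauthorized access:
       - Weak authentication
       - Privilege escalation
       - Lateral movement

    2. Malicious containers:
       - Malicious images
       - Container escape
       - Privilege abuse

    3. Data leakage:
       - Exposure of sensitive data
       - Configuration leaks
       - Log leaks

    4. Denial of service:
       - Resource exhaustion
       - Network attacks
       - Malicious workloads

    5. Supply-chain attacks:
       - Malicious dependencies
       - Image poisoning
       - Build-system compromise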
  
  mitigation-strategies: |
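    Mitigation strategies:

    1. Access control:
       - Strong authentication
       - Fine-grained authorization
       - Multi-factor authentication

    2. Container security:
       - Image scanning
       - Runtime protection
       - Security baselines

    3. Network isolation:
       - Network policies
       - Micro-segmentation
       - Traffic monitoring

    4. Monitoring and auditing:
       - Real-time monitoring
       - Anomaly detection
       - Audit logging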

10.2 Authentication and Authorization

10.2.1 Authentication Mechanisms

# Client certificate authentication
apiVersion: v1
kind: ConfigMap
metadata:
  name: client-cert-auth
data:
  create-user-cert.sh: |
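    #!/bin/bash
    # Create a client certificate and kubeconfig for a user.
    # Run on a control-plane node that can read the cluster CA key.

    USER_NAME=${1:-"developer"}
    GROUP_NAME=${2:-"developers"}
    # Externally reachable API server URL. The in-cluster name
    # kubernetes.default.svc.cluster.local only resolves from inside Pods,
    # so default to the server recorded in the current kubeconfig.
    API_SERVER=${3:-$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')}

    echo "Creating certificate for user $USER_NAME"

    # Generate the private key
    openssl genrsa -out "${USER_NAME}.key" 2048

    # Create the certificate signing request (CN = user name, O = group)
    openssl req -new -key "${USER_NAME}.key" -out "${USER_NAME}.csr" -subj "/CN=${USER_NAME}/O=${GROUP_NAME}"

    # Sign the certificate with the cluster CA
    openssl x509 -req -in "${USER_NAME}.csr" -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -CAcreateserial -out "${USER_NAME}.crt" -days 365

    # Build the kubeconfig; --embed-certs keeps it usable off this node
    kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true --server="${API_SERVER}" --kubeconfig="${USER_NAME}.kubeconfig"

    kubectl config set-credentials "${USER_NAME}" --client-certificate="${USER_NAME}.crt" --client-key="${USER_NAME}.key" --embed-certs=true --kubeconfig="${USER_NAME}.kubeconfig"

    kubectl config set-context "${USER_NAME}@kubernetes" --cluster=kubernetes --user="${USER_NAME}" --kubeconfig="${USER_NAME}.kubeconfig"

    kubectl config use-context "${USER_NAME}@kubernetes" --kubeconfig="${USER_NAME}.kubeconfig"

    echo "User kubeconfig created: ${USER_NAME}.kubeconfig"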
---
# ServiceAccount token authentication
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service-account
  namespace: default
---
apiVersion: v1
kind: Secret
metadata:
  name: api-service-account-token
  namespace: default
  annotations:
    kubernetes.io/service-account.name: api-service-account
type: kubernetes.io/service-account-token
---
# OIDC authentication configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: oidc-config
data:
  apiserver-config: |
    # API Server OIDC flags
    --oidc-issuer-url=https://your-oidc-provider.com
    --oidc-client-id=kubernetes
    --oidc-username-claim=email
    --oidc-groups-claim=groups
    --oidc-ca-file=/etc/kubernetes/pki/oidc-ca.crt
  
  oidc-login.sh: |
    #!/bin/bash
    # OIDC login script
    # The /auth and /token paths are provider-specific; adjust them to your issuer.

    OIDC_ISSUER="https://your-oidc-provider.com"
    CLIENT_ID="kubernetes"
    CLIENT_SECRET="your-client-secret"

    # Request an authorization code
    echo "Open the following URL to authorize:"
    echo "${OIDC_ISSUER}/auth?client_id=${CLIENT_ID}&response_type=code&scope=openid+email+groups&redirect_uri=http://localhost:8080/callback"

    # Wait for the user to paste the authorization code
    read -r -p "Enter the authorization code: " AUTH_CODE

    # Exchange the authorization code for tokens
    TOKEN_RESPONSE=$(curl -s -X POST "${OIDC_ISSUER}/token" \
      -H "Content-Type: application/x-www-form-urlencoded" \
      -d "grant_type=authorization_code&code=${AUTH_CODE}&client_id=${CLIENT_ID}&client_secret=${CLIENT_SECRET}&redirect_uri=http://localhost:8080/callback")

    ID_TOKEN=$(echo "$TOKEN_RESPONSE" | jq -r '.id_token')

    # Configure kubectl
    kubectl config set-credentials oidc-user \
      --auth-provider=oidc \
      --auth-provider-arg=idp-issuer-url="${OIDC_ISSUER}" \
      --auth-provider-arg=client-id="${CLIENT_ID}" \
      --auth-provider-arg=client-secret="${CLIENT_SECRET}" \
      --auth-provider-arg=id-token="${ID_TOKEN}"

    echo "OIDC authentication configured"
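
The API Server maps the `email` and `groups` claims of the ID token to the Kubernetes user name and groups, so it helps to look at those claims when a login does not get the expected permissions. The following is a minimal sketch for inspection only: `jwt_payload` is a hypothetical helper name, the sample token is made up, and the signature is not verified. It assumes GNU `base64`.

```shell
#!/bin/bash
# Print the payload (second dot-separated segment) of a JWT such as an OIDC ID token.
# Inspection only: the signature is NOT verified here.
jwt_payload() {
  local token="$1" payload pad
  # JWTs use base64url: translate '_' to '/' and '-' to '+'.
  payload=$(printf '%s' "$token" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops '=' padding; restore it so base64 -d accepts the input.
  pad=$(( (4 - ${#payload} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do
    payload="${payload}="
    pad=$((pad - 1))
  done
  printf '%s' "$payload" | base64 -d
}

# Build a made-up token so the example runs without an identity provider.
sample_payload='{"email":"dev@example.com","groups":["developers"]}'
sample_token="header.$(printf '%s' "$sample_payload" | base64 | tr -d '=\n' | tr '/+' '_-').signature"

jwt_payload "$sample_token"
echo
```

With a real login, `jwt_payload "$ID_TOKEN" | jq .` would show whether the `email` and `groups` values match the subjects used in the RBAC bindings.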

10.2.2 RBAC Access Control

# Basic RBAC configuration
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: deployment-manager
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["events"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: namespace-admin
rules:
- apiGroups: [""]
  resources: ["*"]
  verbs: ["*"]
- apiGroups: ["apps"]
  resources: ["*"]
  verbs: ["*"]
- apiGroups: ["extensions"]
  resources: ["*"]
  verbs: ["*"]
---
# Role bindings
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-readers
subjects:
- kind: User
  name: developer
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: developers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: development
  name: deployment-managers
subjects:
- kind: User
  name: devops-user
  apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
  name: deployment-sa
  namespace: development
roleRef:
  kind: ClusterRole
  name: deployment-manager
  apiGroup: rbac.authorization.k8s.io
---
# Namespace-scoped Role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: config-manager
rules:
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  resourceNames: ["app-config", "app-secret"]
  verbs: ["delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: production
  name: config-managers
subjects:
- kind: User
  name: config-admin
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: config-manager
  apiGroup: rbac.authorization.k8s.io
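
After applying bindings like these, it is worth confirming that a subject can do what was intended and nothing more. The sketch below only prints `kubectl auth can-i` commands so they can be reviewed before running; `rbac_checks` is a hypothetical helper name, and the expected answers depend on which bindings exist in the target cluster.

```shell
#!/bin/bash
# Print (not run) impersonation checks for a user bound to the pod-reader ClusterRole.
rbac_checks() {
  local user="$1" verb
  # Verbs that pod-reader grants on pods.
  for verb in get list watch; do
    echo "kubectl auth can-i $verb pods --as=$user"
  done
  # A verb that pod-reader does not grant; the answer should be "no".
  echo "kubectl auth can-i delete pods --as=$user"
}

rbac_checks developer
```

Piping the output to `sh` (for example `rbac_checks developer | sh`) runs the checks against the current context, provided the caller is allowed to impersonate users.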

10.2.3 ServiceAccount Management

# Application-specific ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: webapp-service-account
  namespace: production
  annotations:
    description: "Dedicated service account for the web application"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: webapp-role
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["webapp-secret"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: webapp-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: webapp-service-account
  namespace: production
roleRef:
  kind: Role
  name: webapp-role
  apiGroup: rbac.authorization.k8s.io
---
# Deployment that uses the ServiceAccount
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      serviceAccountName: webapp-service-account
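      # The token is mounted explicitly through the projected volume below,
      # so the automatic mount is turned off.
      automountServiceAccountToken: false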
      containers:
      - name: webapp
        image: nginx:1.20
        ports:
        - containerPort: 80
        env:
        - name: KUBERNETES_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: service-account-token
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          readOnly: true
      volumes:
      - name: service-account-token
        projected:
          sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 3600
          - configMap:
              name: kube-root-ca.crt
              items:
              - key: ca.crt
                path: ca.crt
          - downwardAPI:
              items:
              - path: namespace
                fieldRef:
                  fieldPath: metadata.namespace
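
With `expirationSeconds: 3600` the projected token is short-lived, so the application should re-read the token file instead of caching its content at startup. As a rough sketch, assuming the upstream-documented kubelet behaviour (the token is refreshed once it is older than 80% of its lifetime or older than 24 hours), the hypothetical helper `token_refresh_after` estimates when a new token appears in the volume.

```shell
#!/bin/bash
# Estimate the age (in seconds) at which the kubelet refreshes a projected
# ServiceAccount token: 80% of its lifetime, capped at 24 hours.
token_refresh_after() {
  local ttl="$1"
  local by_ratio=$(( ttl * 80 / 100 ))
  local one_day=86400
  if [ "$by_ratio" -lt "$one_day" ]; then
    echo "$by_ratio"
  else
    echo "$one_day"
  fi
}

# For the Deployment above (expirationSeconds: 3600):
token_refresh_after 3600
```

For the 3600-second token this gives 2880 seconds, which is why clients that read the file once and keep the value for hours can end up presenting an expired token.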

10.3 Pod Security Policies

10.3.1 Pod Security Standards

# Pod Security Standards configuration
apiVersion: v1
kind: Namespace
metadata:
  name: restricted-namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: v1
kind: Namespace
metadata:
  name: baseline-namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/warn: baseline
---
apiVersion: v1
kind: Namespace
metadata:
  name: privileged-namespace
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
---
# Example of a hardened Pod
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
  namespace: restricted-namespace
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: secure-container
    image: nginx:1.20
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 1000
      runAsGroup: 1000
      capabilities:
        drop:
        - ALL
      seccompProfile:
        type: RuntimeDefault
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
    volumeMounts:
    - name: tmp-volume
      mountPath: /tmp
    - name: cache-volume
      mountPath: /var/cache/nginx
    - name: run-volume
      mountPath: /var/run
  volumes:
  - name: tmp-volume
    emptyDir: {}
  - name: cache-volume
    emptyDir: {}
  - name: run-volume
    emptyDir: {}
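
The same labels can be applied to an existing namespace from the command line. A common rollout is to start with `warn` and `audit` and switch to `enforce` once workloads are clean. The sketch below only builds the `kubectl label` command; `pss_label_cmd` is a hypothetical helper name that also rejects levels or modes that Pod Security Admission does not define.

```shell
#!/bin/bash
# Build the kubectl command that sets one Pod Security Admission label on a namespace.
pss_label_cmd() {
  local ns="$1" level="$2" mode="${3:-enforce}"
  case "$level" in
    privileged|baseline|restricted) ;;
    *) echo "unknown level: $level" >&2; return 1 ;;
  esac
  case "$mode" in
    enforce|audit|warn) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  echo "kubectl label --overwrite namespace $ns pod-security.kubernetes.io/$mode=$level"
}

# Warn first, enforce later.
pss_label_cmd baseline-namespace restricted warn
pss_label_cmd baseline-namespace restricted enforce
```

Appending `--dry-run=server` to an `enforce` command lets the API Server report which existing Pods would violate the level without changing the namespace.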

10.3.2 SecurityContext Configuration

# Detailed SecurityContext configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: security-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: security-demo
  template:
    metadata:
      labels:
        app: security-demo
    spec:
      # Pod-level security context
      securityContext:
        # Run as a non-root user
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        # Group ID applied to mounted volumes
        fsGroup: 1000
        # Supplemental groups
        supplementalGroups: [2000, 3000]
        # seccomp profile
        seccompProfile:
          type: RuntimeDefault
        # SELinux options
        seLinuxOptions:
          level: "s0:c123,c456"
        # sysctl parameters
        # net.core.somaxconn is not in the kubelet's safe sysctl set; the node
        # must permit it with --allowed-unsafe-sysctls or the Pod is rejected.
        sysctls:
        - name: net.core.somaxconn
          value: "1024"
      containers:
      - name: app
        image: nginx:1.20
        # Container-level security context
        securityContext:
          # Disallow privilege escalation
          allowPrivilegeEscalation: false
          # Read-only root filesystem
          readOnlyRootFilesystem: true
          # Run as a non-root user
          runAsNonRoot: true
          runAsUser: 1000
          runAsGroup: 1000
          # Drop all capabilities
          capabilities:
            drop:
            - ALL
            # Add back only the capabilities that are required
            add:
            - NET_BIND_SERVICE
          # seccomp profile
          seccompProfile:
            type: RuntimeDefault
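        # The stock nginx image listens on port 80; the image or its
        # configuration must be changed to listen on 8080 for this to match.
        ports:
        - containerPort: 8080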
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
        # Mount temporary volumes for paths that must be writable
        volumeMounts:
        - name: tmp-volume
          mountPath: /tmp
        - name: var-cache
          mountPath: /var/cache/nginx
        - name: var-run
          mountPath: /var/run
        # Health checks
        livenessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: tmp-volume
        emptyDir: {}
      - name: var-cache
        emptyDir: {}
      - name: var-run
        emptyDir: {}

10.3.3 Network Policy

# Deny all traffic by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Allow traffic between specific applications
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
---
# Allow access to external services
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-access
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: webapp
  policyTypes:
  - Egress
  egress:
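  # Allow DNS lookups (DNS falls back to TCP for large responses)
  - to: []
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow HTTPS to Pods in the kube-system namespace
  # (kubernetes.io/metadata.name is set automatically on every namespace)
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: TCP
      port: 443
  # Allow access to a specific IP range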
  - to:
    - ipBlock:
        cidr: 10.0.0.0/8
        except:
        - 10.0.1.0/24
    ports:
    - protocol: TCP
      port: 5432
---
# Namespace-based network policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-monitoring
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: webapp
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
    ports:
    - protocol: TCP
      port: 8080
---
# A more complex network policy example
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: complex-network-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      tier: database
  policyTypes:
  - Ingress
  - Egress
  ingress:
  # Allow connections from the application tier
  - from:
    - podSelector:
        matchLabels:
          tier: application
    - namespaceSelector:
        matchLabels:
          environment: production
      podSelector:
        matchLabels:
          tier: application
    ports:
    - protocol: TCP
      port: 5432
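  # Allow connections from the monitoring system
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
    ports:
    - protocol: TCP
      port: 9187
  egress:
  # Allow DNS lookups (UDP and TCP)
  - to: []
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow access to the backup service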
  - to:
    - ipBlock:
        cidr: 192.168.1.0/24
    ports:
    - protocol: TCP
      port: 22
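
An `ipBlock` rule matches a destination when the address lies inside `cidr` and outside every `except` range. The sketch below reproduces that evaluation for the `10.0.0.0/8` rule with the `10.0.1.0/24` exception. It is an illustration in plain bash, not how any CNI plugin implements it; the function names are hypothetical, and only IPv4 with a single exception is handled.

```shell
#!/bin/bash
# Convert a dotted IPv4 address to an integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

# Succeed when the address ($1) is inside the CIDR ($2).
in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}" mask
  mask=$(( bits == 0 ? 0 : (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# ipBlock semantics: inside cidr ($2) and not inside except ($3).
allowed_by_ipblock() {
  in_cidr "$1" "$2" && ! in_cidr "$1" "$3"
}

for ip in 10.2.3.4 10.0.1.7 192.168.1.10; do
  if allowed_by_ipblock "$ip" 10.0.0.0/8 10.0.1.0/24; then
    echo "$ip allowed"
  else
    echo "$ip blocked"
  fi
done
```

Only `10.2.3.4` is reported as allowed: `10.0.1.7` falls in the excepted range and `192.168.1.10` is outside the CIDR altogether.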

10.4 Secret and Configuration Security

10.4.1 Secret Best Practices

# Opaque Secret (values are base64-encoded, not encrypted)
apiVersion: v1
kind: Secret
metadata:
  name: database-credentials
  namespace: production
  annotations:
    description: "Database connection credentials"
type: Opaque
data:
  username: cG9zdGdyZXM=  # postgres (base64-encoded)
  password: c3VwZXJzZWNyZXRwYXNzd29yZA==  # supersecretpassword (base64-encoded)
  host: ZGIucHJvZHVjdGlvbi5zdmMuY2x1c3Rlci5sb2NhbA==  # db.production.svc.cluster.local
  port: NTQzMg==  # 5432
---
# TLS Secret
apiVersion: v1
kind: Secret
metadata:
  name: tls-secret
  namespace: production
type: kubernetes.io/tls
data:
  tls.crt: LS0tLS1CRUdJTi... # TLS certificate
  tls.key: LS0tLS1CRUdJTi... # TLS private key
---
# Docker Registry Secret
apiVersion: v1
kind: Secret
metadata:
  name: registry-secret
  namespace: production
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: eyJhdXRocyI6eyJyZWdpc3RyeS5leGFtcGxlLmNvbSI6eyJ1c2VybmFtZSI6InVzZXIiLCJwYXNzd29yZCI6InBhc3MiLCJhdXRoIjoiZFhObGNqcHdZWE56In19fQ==
---
# Pod that consumes the Secrets
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
  namespace: production
spec:
  containers:
  - name: app
    image: myapp:latest
    env:
    # Environment variables read from the Secret
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          name: database-credentials
          key: username
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: database-credentials
          key: password
    - name: DB_HOST
      valueFrom:
        secretKeyRef:
          name: database-credentials
          key: host
    volumeMounts:
    # Mount Secrets as files
    - name: tls-certs
      mountPath: /etc/tls
      readOnly: true
    - name: config-volume
      mountPath: /etc/config
      readOnly: true
  volumes:
  - name: tls-certs
    secret:
      secretName: tls-secret
      defaultMode: 0400
  - name: config-volume
    secret:
      secretName: database-credentials
      items:
      - key: host
        path: db_host
        mode: 0400
  imagePullSecrets:
  - name: registry-secret
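
The `data` values of a Secret are only base64-encoded, so anyone who can read the manifest or the object can recover them. The short sketch below decodes two of the values from `database-credentials` to make that concrete; `decode_secret_value` is a hypothetical helper name, and GNU `base64` is assumed.

```shell
#!/bin/bash
# Decode a base64 value as stored under .data in a Secret.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

for value in cG9zdGdyZXM= NTQzMg==; do
  printf '%s -> %s\n' "$value" "$(decode_secret_value "$value")"
done
```

This prints `postgres` and `5432`. Confidentiality therefore has to come from RBAC on Secrets, encryption at rest in etcd, and keeping such manifests out of version control.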

10.4.2 External Secret Management

# External Secrets Operator configuration
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: production
spec:
  provider:
    vault:
      server: "https://vault.example.com"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "external-secrets"
          serviceAccountRef:
            name: external-secrets-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: vault-secret
  namespace: production
spec:
  refreshInterval: 15s
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: database-credentials
    creationPolicy: Owner
  data:
  - secretKey: username
    remoteRef:
      key: database
      property: username
  - secretKey: password
    remoteRef:
      key: database
      property: password
---
# AWS Secrets Manager integration
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets-manager
  namespace: production
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-west-2
      auth:
        secretRef:
          accessKeyIDSecretRef:
            name: aws-credentials
            key: access-key-id
          secretAccessKeySecretRef:
            name: aws-credentials
            key: secret-access-key
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: aws-secret
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: app-secrets
    creationPolicy: Owner
  data:
  - secretKey: api-key
    remoteRef:
      key: prod/app/api-key
  - secretKey: database-url
    remoteRef:
      key: prod/app/database-url

10.4.3 Configuration Security Scanning

# Security scan configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: security-scan-config
data:
  scan-script.sh: |
    #!/bin/bash
    # Kubernetes security scan script

    echo "=== Kubernetes security scan ==="

    # List Opaque Secrets
    echo "1. Secret configuration:"
    kubectl get secrets --all-namespaces -o json | jq -r '.items[] | select(.type=="Opaque") | "\(.metadata.namespace)/\(.metadata.name)"'

    # List ClusterRoleBindings that grant rights to ServiceAccounts
    printf '\n2. ServiceAccount permissions:\n'
    kubectl get clusterrolebindings -o json | jq -r '.items[] | select(any(.subjects[]?; .kind=="ServiceAccount")) | "\(.metadata.name): \([.subjects[] | select(.kind=="ServiceAccount") | .name] | join(","))"'

    # Flag Pods whose Pod-level security context does not set runAsNonRoot: true
    printf '\n3. Pod security context:\n'
    kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.securityContext.runAsNonRoot != true) | "\(.metadata.namespace)/\(.metadata.name): may run as root"'

    # List network policies
    printf '\n4. Network policies:\n'
    kubectl get networkpolicies --all-namespaces

    # Flag Pods with at least one container that has no resource limits
    printf '\n5. Resource limits:\n'
    kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(any(.spec.containers[]; .resources.limits == null)) | "\(.metadata.namespace)/\(.metadata.name): missing resource limits"'
  
  kube-bench.yaml: |
    # kube-bench security benchmark
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: kube-bench
    spec:
      template:
        spec:
          hostPID: true
          nodeSelector:
            kubernetes.io/os: linux
          tolerations:
          - operator: Exists
            effect: NoSchedule
          containers:
          - name: kube-bench
            image: aquasec/kube-bench:latest
            command: ["kube-bench"]
            args: ["--version", "1.23"]
            volumeMounts:
            - name: var-lib-etcd
              mountPath: /var/lib/etcd
              readOnly: true
            - name: var-lib-kubelet
              mountPath: /var/lib/kubelet
              readOnly: true
            - name: etc-systemd
              mountPath: /etc/systemd
              readOnly: true
            - name: etc-kubernetes
              mountPath: /etc/kubernetes
              readOnly: true
            - name: usr-bin
              mountPath: /usr/local/mount-from-host/bin
              readOnly: true
          restartPolicy: Never
          volumes:
          - name: var-lib-etcd
            hostPath:
              path: "/var/lib/etcd"
          - name: var-lib-kubelet
            hostPath:
              path: "/var/lib/kubelet"
          - name: etc-systemd
            hostPath:
              path: "/etc/systemd"
          - name: etc-kubernetes
            hostPath:
              path: "/etc/kubernetes"
          - name: usr-bin
            hostPath:
              path: "/usr/bin"

10.5 Image Security

10.5.1 Image Scanning and Policy

# Image security policy
apiVersion: v1
kind: ConfigMap
metadata:
  name: image-security-policy
data:
  policy.yaml: |
    # Image security policy configuration
    imagePolicy:
      # Allowed image registries
      allowedRegistries:
        - "registry.company.com"
        - "gcr.io/company-project"
        - "docker.io/library"  # official images

      # Denied image tags
      deniedTags:
        - "latest"
        - "master"
        - "main"

      # Require signed images
      requireSignature: true

      # Vulnerability scan thresholds
      vulnerabilityPolicy:
        maxCritical: 0
        maxHigh: 2
        maxMedium: 10

      # Base image requirements
      baseImagePolicy:
        allowedBaseImages:
          - "alpine:3.15"
          - "ubuntu:20.04"
          - "debian:11-slim"

        # Denied base images
        deniedBaseImages:
          - "*:latest"
          - "centos:*"
          - "centos:*"
  
  scan-script.sh: |
    #!/bin/bash
    # Image security scan script

    IMAGE_NAME=${1}

    if [ -z "$IMAGE_NAME" ]; then
        echo "Usage: $0 <image-name>"
        exit 1
    fi

    echo "=== Scanning image: $IMAGE_NAME ==="

    # Scan for vulnerabilities with Trivy
    echo "1. Vulnerability scan:"
    trivy image --severity HIGH,CRITICAL "$IMAGE_NAME"

    # Show the image configuration
    printf '\n2. Image configuration:\n'
    docker inspect "$IMAGE_NAME" | jq '.[0].Config'

    # Check whether the image runs as root
    # (Config.User is an empty string when no USER is set, which means root)
    USER_ID=$(docker inspect "$IMAGE_NAME" | jq -r '.[0].Config.User // ""')
    if [ -z "$USER_ID" ] || [ "$USER_ID" = "0" ] || [ "$USER_ID" = "root" ]; then
        echo "Warning: the image may run as root"
    fi

    # List exposed ports
    printf '\n3. Exposed ports:\n'
    docker inspect "$IMAGE_NAME" | jq -r '.[0].Config.ExposedPorts // {} | keys[]'

    printf '\n=== Scan complete ===\n'
---
# OPA Gatekeeper policy
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredimageregistry
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredImageRegistry
      validation:
        openAPIV3Schema:
          type: object
          properties:
            allowedRegistries:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredimageregistry
        
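        # Pods keep their containers under spec.containers
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not starts_with_allowed_registry(container.image)
          msg := sprintf("image '%v' comes from a registry that is not allowed", [container.image])
        }

        # Workloads such as Deployments keep them under spec.template.spec.containers
        violation[{"msg": msg}] {
          container := input.review.object.spec.template.spec.containers[_]
          not starts_with_allowed_registry(container.image)
          msg := sprintf("image '%v' comes from a registry that is not allowed", [container.image])
        }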
        
        starts_with_allowed_registry(image) {
          allowed_registry := input.parameters.allowedRegistries[_]
          startswith(image, allowed_registry)
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredImageRegistry
metadata:
  name: must-use-approved-registry
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
  parameters:
    allowedRegistries:
      - "registry.company.com/"
      - "gcr.io/company-project/"
      - "docker.io/library/"
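
The constraint compares image references by plain string prefix, which has a practical consequence: a short name such as `nginx:1.20` does not start with `docker.io/library/` and is rejected unless it is written out in full. The sketch below mirrors that prefix check in bash and adds a check for the mutable tags listed under `deniedTags`. The function names are hypothetical and the logic is a local approximation for CI use, not Gatekeeper's own evaluation.

```shell
#!/bin/bash
# Registry prefixes taken from the constraint parameters above.
ALLOWED_REGISTRIES=("registry.company.com/" "gcr.io/company-project/" "docker.io/library/")

# Succeed when the image reference starts with an allowed prefix.
image_allowed() {
  local image="$1" prefix
  for prefix in "${ALLOWED_REGISTRIES[@]}"; do
    case "$image" in
      "$prefix"*) return 0 ;;
    esac
  done
  return 1
}

# Succeed when the reference uses a denied or implicit (missing) tag.
uses_mutable_tag() {
  # Look only at the last path component so a registry port is not read as a tag.
  local last="${1##*/}"
  case "$last" in
    *@sha256:*) return 1 ;;                    # pinned by digest
    *:latest|*:master|*:main) return 0 ;;      # explicitly denied tags
    *:*) return 1 ;;                           # some other explicit tag
    *) return 0 ;;                             # no tag means :latest
  esac
}

for image in registry.company.com/app:1.4.2 nginx:1.20 docker.io/library/nginx:latest; do
  image_allowed "$image" && reg="allowed registry" || reg="registry not allowed"
  uses_mutable_tag "$image" && tag="mutable tag" || tag="fixed tag"
  echo "$image: $reg, $tag"
done
```

Running such a check in the build pipeline gives earlier feedback than waiting for the admission controller to reject the Pod.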

10.5.2 Image Signature Verification

# Image signature verification with Cosign
apiVersion: v1
kind: ConfigMap
metadata:
  name: cosign-config
data:
  verify-image.sh: |
    #!/bin/bash
    # Image signature verification script

    IMAGE=${1}
    PUBLIC_KEY=${2:-"cosign.pub"}

    if [ -z "$IMAGE" ]; then
        echo "Usage: $0 <image> [public-key]"
        exit 1
    fi

    echo "Verifying image signature: $IMAGE"

    # Verify the image signature
    if cosign verify --key "$PUBLIC_KEY" "$IMAGE"; then
        echo "✓ Image signature verified"
    else
        echo "✗ Image signature verification failed"
        exit 1
    fi

    # Verify the SBOM attestation (type must match the one used when signing)
    printf '\nVerifying SBOM:\n'
    cosign verify-attestation --key "$PUBLIC_KEY" --type spdxjson "$IMAGE"
  
  sign-image.sh: |
    #!/bin/bash
    # Image signing script

    IMAGE=${1}
    PRIVATE_KEY=${2:-"cosign.key"}

    if [ -z "$IMAGE" ]; then
        echo "Usage: $0 <image> [private-key]"
        exit 1
    fi

    echo "Signing image: $IMAGE"

    # Sign the image
    cosign sign --key "$PRIVATE_KEY" "$IMAGE"

    # Generate the SBOM and attach it as an attestation
    syft "$IMAGE" -o spdx-json > sbom.spdx.json
    cosign attest --key "$PRIVATE_KEY" --type spdxjson --predicate sbom.spdx.json "$IMAGE"

    echo "Image signing complete"
---
# Admission webhook configuration
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: image-signature-webhook
webhooks:
- name: verify-image-signature.example.com
  clientConfig:
    service:
      name: image-signature-webhook
      namespace: kube-system
      path: "/validate"
  rules:
  - operations: ["CREATE", "UPDATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  - operations: ["CREATE", "UPDATE"]
    apiGroups: ["apps"]
    apiVersions: ["v1"]
    resources: ["deployments", "replicasets", "daemonsets", "statefulsets"]
  admissionReviewVersions: ["v1", "v1beta1"]
  sideEffects: None
  failurePolicy: Fail

10.5.3 Runtime Security

# Falco runtime security monitoring
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-config
  namespace: falco-system
data:
  falco.yaml: |
    rules_file:
      - /etc/falco/falco_rules.yaml
      - /etc/falco/falco_rules.local.yaml
      - /etc/falco/k8s_audit_rules.yaml
      - /etc/falco/custom_rules.yaml
    
    time_format_iso_8601: true
    json_output: true
    json_include_output_property: true
    
    log_stderr: true
    log_syslog: false
    log_level: info
    
    priority: debug
    
    buffered_outputs: false
    
    syscall_event_drops:
      actions:
        - log
        - alert
      rate: 0.03333
      max_burst: 1000
    
    outputs:
      rate: 1
      max_burst: 1000
    
    syslog_output:
      enabled: false
    
    file_output:
      enabled: false
    
    stdout_output:
      enabled: true
    
    webserver:
      enabled: true
      listen_port: 8765
      k8s_healthz_endpoint: /healthz
      ssl_enabled: false
    
    grpc:
      enabled: false
    
    grpc_output:
      enabled: false
  
  custom_rules.yaml: |
    # Custom Falco rules
    - rule: Detect crypto miners
      desc: Detect cryptocurrency miners
      condition: >
        spawned_process and
        (proc.name in (xmrig, minergate, cpuminer, t-rex, phoenixminer) or
         proc.cmdline contains "stratum+tcp" or
         proc.cmdline contains "mining.pool")
      output: >
        Cryptocurrency miner detected (user=%user.name command=%proc.cmdline
        container=%container.name image=%container.image.repository)
      priority: CRITICAL
      tags: [cryptocurrency, miners]
    
    - rule: Detect suspicious network activity
      desc: Detect suspicious network connections
      condition: >
        inbound_outbound and
        fd.typechar=4 and
        (fd.net.cip.name contains ".onion" or
         fd.net.cip.name contains "tor" or
         fd.net.sport in (4444, 5555, 7777, 8888, 9999))
      output: >
        Suspicious network activity (user=%user.name command=%proc.cmdline
        connection=%fd.name container=%container.name)
      priority: WARNING
      tags: [network, suspicious]
    
    - rule: Detect privilege escalation
      desc: Detect attempts to escalate privileges
      condition: >
        spawned_process and
        (proc.name in (sudo, su, doas) or
         proc.cmdline contains "chmod +s" or
         proc.cmdline contains "setuid")
      output: >
        Privilege escalation attempt (user=%user.name command=%proc.cmdline
        container=%container.name image=%container.image.repository)
      priority: ERROR
      tags: [privilege_escalation]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: falco
  namespace: falco-system
spec:
  selector:
    matchLabels:
      app: falco
  template:
    metadata:
      labels:
        app: falco
    spec:
      serviceAccountName: falco
      hostNetwork: true
      hostPID: true
      containers:
      - name: falco
        image: falcosecurity/falco:0.32.0
        args:
          - /usr/bin/falco
          - --cri
          - /run/containerd/containerd.sock
          - -K
          - /var/run/secrets/kubernetes.io/serviceaccount/token
          - -k
          - https://kubernetes.default
          - -pk
        securityContext:
          privileged: true
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: boot
          mountPath: /host/boot
          readOnly: true
        - name: lib-modules
          mountPath: /host/lib/modules
          readOnly: true
        - name: usr
          mountPath: /host/usr
          readOnly: true
        - name: etc
          mountPath: /host/etc
          readOnly: true
        - name: dev
          mountPath: /host/dev
          readOnly: true
        - name: containerd-socket
          mountPath: /run/containerd/containerd.sock
        - name: falco-config
          mountPath: /etc/falco
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: boot
        hostPath:
          path: /boot
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: usr
        hostPath:
          path: /usr
      - name: etc
        hostPath:
          path: /etc
      - name: dev
        hostPath:
          path: /dev
      - name: containerd-socket
        hostPath:
          path: /run/containerd/containerd.sock
      - name: falco-config
        configMap:
          name: falco-config
      tolerations:
      - operator: Exists
        effect: NoSchedule

10.6 Security Management Scripts

10.6.1 Security Audit Script

#!/bin/bash
# security-audit.sh

echo "=== Kubernetes security audit ==="
echo "Audit time: $(date)"
echo ""

# RBAC configuration
echo "1. RBAC configuration:"
echo "ClusterRoles: $(kubectl get clusterroles --no-headers | wc -l)"
echo "ClusterRoleBindings: $(kubectl get clusterrolebindings --no-headers | wc -l)"
echo "Roles: $(kubectl get roles --all-namespaces --no-headers | wc -l)"
echo "RoleBindings: $(kubectl get rolebindings --all-namespaces --no-headers | wc -l)"
echo ""

# Privileged Pods (privileged is a container-level field)
echo "2. Privileged Pods:"
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged == true)) | "\(.metadata.namespace)/\(.metadata.name): privileged Pod"'
echo ""

# Pods using hostNetwork
echo "3. hostNetwork Pods:"
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.hostNetwork==true) | "\(.metadata.namespace)/\(.metadata.name): uses hostNetwork"'
echo ""

# Pods using hostPID
echo "4. hostPID Pods:"
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.hostPID==true) | "\(.metadata.namespace)/\(.metadata.name): uses hostPID"'
echo ""

# Pods explicitly running as root (UID 0)
echo "5. Pods running as root:"
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.securityContext.runAsUser == 0 or any(.spec.containers[]; .securityContext.runAsUser == 0)) | "\(.metadata.namespace)/\(.metadata.name): runs as root"'
echo ""

# Pods with at least one container that has no resource limits
echo "6. Resource limits:"
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(any(.spec.containers[]; .resources.limits == null)) | "\(.metadata.namespace)/\(.metadata.name): missing resource limits"'
echo ""

# Check Secret usage
echo "7. Secret usage:"
echo "Total Secrets: $(kubectl get secrets --all-namespaces --no-headers | wc -l)"
echo "Pods that use Secrets:"
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(any(.spec.volumes[]?; .secret != null) or any(.spec.containers[].env[]?; .valueFrom.secretKeyRef != null)) | "\(.metadata.namespace)/\(.metadata.name)"' | sort -u
echo ""

# Check NetworkPolicies
echo "8. NetworkPolicies:"
echo "Total NetworkPolicies: $(kubectl get networkpolicies --all-namespaces --no-headers 2>/dev/null | wc -l)"
echo "Namespaces without any NetworkPolicy:"
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
    if [ "$(kubectl get networkpolicies -n "$ns" --no-headers 2>/dev/null | wc -l)" -eq 0 ]; then
        echo "  $ns"
    fi
done
echo ""

# Check ServiceAccounts
echo "9. ServiceAccounts:"
echo "Total ServiceAccounts: $(kubectl get serviceaccounts --all-namespaces --no-headers | wc -l)"
echo "Pods that use the default ServiceAccount:"
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.serviceAccountName=="default" or .spec.serviceAccountName==null) | "\(.metadata.namespace)/\(.metadata.name)"'
echo ""

# Check image policy
echo "10. Image policy:"
echo "Images using the latest tag (explicitly, or implicitly because no tag is given):"
kubectl get pods --all-namespaces -o json | jq -r '.items[].spec.containers[].image | select((split("/") | last) as $n | ($n | contains("@") | not) and (($n | endswith(":latest")) or ($n | contains(":") | not)))' | sort -u
echo ""
echo "Images from registries other than docker.io, gcr.io, and registry.k8s.io:"
# Short names such as "nginx" resolve to Docker Hub, so only references with an explicit registry host are examined
kubectl get pods --all-namespaces -o json | jq -r '.items[].spec.containers[].image | select(contains("/") and (split("/") | .[0] | test("[.:]|^localhost$"))) | select(test("^(docker\\.io|gcr\\.io|registry\\.k8s\\.io)/") | not)' | sort -u
echo ""

echo "=== Security audit complete ==="
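
The latest-tag check in step 10 depends on a live cluster, which makes it awkward to verify. As a rough sketch, the classification logic can be pulled into a small pure-bash function and exercised on sample image references. The function name image_tag_status and the three result labels are illustrative assumptions, not part of the audit script.

```shell
#!/bin/bash
# Hypothetical helper: classify an image reference as "pinned", "latest", or "untagged".
image_tag_status() {
    local image="$1"
    local last="${image##*/}"      # last path segment, so a registry port is ignored
    if [[ "$last" == *@* ]]; then
        echo "pinned"              # digest reference such as name@sha256:...
    elif [[ "$last" != *:* ]]; then
        echo "untagged"            # no tag: the runtime pulls :latest implicitly
    elif [[ "${last##*:}" == "latest" ]]; then
        echo "latest"
    else
        echo "pinned"
    fi
}

# Sample references only; no cluster access is needed
for image in nginx nginx:latest localhost:5000/app registry.k8s.io/pause:3.9; do
    printf '%s -> %s\n' "$image" "$(image_tag_status "$image")"
done
```

Feeding the function from `kubectl get pods ... -o jsonpath` output would reproduce the audit step while keeping the decision logic testable offline.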

10.6.2 Security Hardening Script

#!/bin/bash
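# security-hardening.sh

echo "=== Kubernetes Security Hardening ==="

# Create a namespace for security tooling
echo "1. Creating the security-system namespace"
kubectl create namespace security-system --dry-run=client -o yaml | kubectl apply -f -

# Enforce the "restricted" Pod Security Standard through Pod Security Admission namespace labels
# (PodSecurityPolicy was removed in Kubernetes 1.25)
echo "2. Applying Pod Security Standards to the production namespace"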
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
EOF

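# Create default-deny NetworkPolicies
# Note: denying all egress also blocks DNS lookups; add allow rules before enforcing this on live workloads.
# Namespaces are listed one per line so that grep can exclude the system namespaces individually.
echo "3. Creating default-deny NetworkPolicies"
for ns in $(kubectl get namespaces -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | grep -Ev '^(kube-system|kube-public|kube-node-lease)$'); do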
    cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: $ns
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF
done

# Create a ServiceAccount that does not auto-mount its API token
echo "4. Creating a restricted ServiceAccount"
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: restricted-sa
  namespace: default
automountServiceAccountToken: false
EOF

# Create least-privilege RBAC
echo "5. Creating least-privilege RBAC"
cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: ServiceAccount
  name: restricted-sa
  namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF

# Set a ResourceQuota
echo "6. Setting the ResourceQuota"
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: security-quota
  namespace: default
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "10"
    secrets: "10"
    configmaps: "10"
EOF

# Create a LimitRange
echo "7. Creating the LimitRange"
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: LimitRange
metadata:
  name: security-limits
  namespace: default
spec:
  limits:
  - default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 100m
      memory: 128Mi
    type: Container
EOF

echo "\n=== 安全加固完成 ==="
echo "建议:"
echo "1. 定期审查RBAC权限"
echo "2. 启用审计日志"
echo "3. 部署安全扫描工具"
echo "4. 配置镜像签名验证"
echo "5. 启用网络策略"
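
A default-deny policy that covers Egress also cuts off DNS, so workloads usually need a companion allow rule. The following is a minimal sketch of one way to generate it; the function name render_dns_egress_policy and the policy name allow-dns-egress are assumptions made for illustration, and the rule allows port 53 to any destination rather than only to the cluster DNS service.

```shell
#!/bin/bash
# Hypothetical helper: print a NetworkPolicy that allows DNS egress for every Pod in a namespace.
render_dns_egress_policy() {
    local ns="$1"
    cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: ${ns}
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
EOF
}

# Print the manifest for review; pipe it to "kubectl apply -f -" to apply it
render_dns_egress_policy default
```

Printing the manifest first keeps the step reviewable; NetworkPolicies are additive, so this rule coexists with default-deny-all.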

10.6.3 Compliance Check Script

#!/bin/bash
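# compliance-check.sh

echo "=== Kubernetes Compliance Check ==="
echo "Check time: $(date)"
echo ""

# CIS Kubernetes Benchmark checks
echo "1. CIS Kubernetes Benchmark checks:"

# API server flags live in the kube-apiserver static Pod spec on kubeadm-style clusters.
# Managed control planes usually hide this Pod, in which case the flag checks are skipped.
APISERVER_SPEC=$(kubectl get pods -n kube-system -l component=kube-apiserver -o yaml 2>/dev/null)
has_flag() { grep -q -- "$1" <<<"$APISERVER_SPEC"; }

echo "1.1 API server configuration:"
if ! grep -q "kube-apiserver" <<<"$APISERVER_SPEC"; then
    echo "  ? kube-apiserver Pod not visible; skipping API server and etcd flag checks"
else
    echo "  Checking whether anonymous access is disabled..."
    has_flag "--anonymous-auth=false" && echo "  ✓ Anonymous access disabled" || echo "  ✗ Anonymous access not disabled"

    echo "  Checking whether basic authentication is disabled (flag removed in Kubernetes 1.19)..."
    has_flag "--basic-auth-file" && echo "  ✗ Basic authentication not disabled" || echo "  ✓ Basic authentication disabled"

    echo "  Checking whether static token authentication is disabled..."
    has_flag "--token-auth-file" && echo "  ✗ Static token file configured" || echo "  ✓ Static token authentication disabled"

    # Encryption at rest is configured through an API server flag, not through a Secret
    echo ""
    echo "1.2 etcd configuration:"
    echo "  Checking encryption at rest..."
    has_flag "--encryption-provider-config" && echo "  ✓ Encryption provider config set" || echo "  ✗ Encryption provider config not set"
fi

# kubelet: read each node's live configuration through the API server proxy
echo ""
echo "1.3 kubelet configuration:"
echo "  Checking whether the read-only port is disabled..."
for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
    port=$(kubectl get --raw "/api/v1/nodes/${node}/proxy/configz" 2>/dev/null | jq -r '.kubeletconfig.readOnlyPort // 0')
    if [ "$port" = "0" ]; then
        echo "  ✓ $node: read-only port disabled"
    else
        echo "  ✗ $node: readOnlyPort=${port:-unknown}"
    fi
done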

# Check NetworkPolicy coverage
echo ""
echo "2. Network security checks:"
echo "2.1 NetworkPolicy coverage:"
TOTAL_NS=$(kubectl get namespaces --no-headers | wc -l)
NS_WITH_NP=$(kubectl get networkpolicies --all-namespaces --no-headers 2>/dev/null | awk '{print $1}' | sort -u | wc -l)
echo "  Total namespaces: $TOTAL_NS"
echo "  Namespaces with a NetworkPolicy: $NS_WITH_NP"
echo "  NetworkPolicy coverage: $(echo "scale=2; $NS_WITH_NP * 100 / $TOTAL_NS" | bc)%"

# Check Pod security settings
echo ""
echo "3. Pod security checks:"
echo "3.1 Privileged Pods:"
# privileged is a container-level field; any() counts each Pod once
PRIVILEGED_PODS=$(kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(any(.spec.containers[], .spec.initContainers[]?; .securityContext.privileged == true)) | "\(.metadata.namespace)/\(.metadata.name)"' | wc -l)
echo "  Privileged Pods: $PRIVILEGED_PODS"

echo ""
echo "3.2 hostNetwork Pods:"
HOST_NETWORK_PODS=$(kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.hostNetwork==true) | "\(.metadata.namespace)/\(.metadata.name)"' | wc -l)
echo "  hostNetwork Pods: $HOST_NETWORK_PODS"

echo ""
echo "3.3 Pods that may run as root:"
# A container may run as root when its effective runAsUser (container level, else Pod level) is 0 or unset
ROOT_PODS=$(kubectl get pods --all-namespaces -o json | jq -r '.items[] | .spec.securityContext as $psc | select(any(.spec.containers[]; (.securityContext.runAsUser // $psc.runAsUser // 0) == 0)) | "\(.metadata.namespace)/\(.metadata.name)"' | wc -l)
echo "  Pods that may run as root: $ROOT_PODS"

# Check RBAC configuration
echo ""
echo "4. RBAC security checks:"
echo "4.1 Excessive permissions:"
echo "  cluster-admin bindings:"
# subjects can be absent, so iterate with []? and join the names onto one line per binding
kubectl get clusterrolebindings -o json | jq -r '.items[] | select(.roleRef.name=="cluster-admin") | "  \(.metadata.name): \([.subjects[]?.name] | join(", "))"'

echo ""
echo "4.2 ServiceAccount permissions:"
DEFAULT_SA_PODS=$(kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.serviceAccountName=="default" or .spec.serviceAccountName==null) | "\(.metadata.namespace)/\(.metadata.name)"' | wc -l)
echo "  Pods using the default ServiceAccount: $DEFAULT_SA_PODS"

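# Check Secret hygiene
echo ""
echo "5. Secret security checks:"
echo "5.1 Secret usage:"
TOTAL_SECRETS=$(kubectl get secrets --all-namespaces --no-headers | wc -l)
echo "  Total Secrets: $TOTAL_SECRETS"

echo ""
echo "5.2 Secrets not referenced by any Pod:"
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
    # Collect every Secret name referenced by Pods in this namespace with a single API call
    used=$(kubectl get pods -n "$ns" -o json 2>/dev/null | jq -r '.items[] | (.spec.volumes[]?.secret.secretName), (.spec.containers[].env[]?.valueFrom.secretKeyRef.name), (.spec.containers[].envFrom[]?.secretRef.name), (.spec.imagePullSecrets[]?.name) | select(. != null)' | sort -u)
    # ServiceAccount token Secrets are excluded by type rather than by name
    for secret in $(kubectl get secrets -n "$ns" --field-selector type!=kubernetes.io/service-account-token -o jsonpath='{.items[*].metadata.name}' 2>/dev/null); do
        if ! grep -qxF -- "$secret" <<<"$used"; then
            echo "  $ns/$secret"
        fi
    done
done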

# Generate the compliance report
echo ""
echo "=== Compliance Score ==="
SCORE=100

# Deductions
if [ "$PRIVILEGED_PODS" -gt 0 ]; then
    SCORE=$((SCORE - 10))
    echo "Privileged Pods present: -10"
fi

if [ "$HOST_NETWORK_PODS" -gt 0 ]; then
    SCORE=$((SCORE - 10))
    echo "hostNetwork Pods present: -10"
fi

if [ "$ROOT_PODS" -gt 5 ]; then
    SCORE=$((SCORE - 15))
    echo "Too many Pods may run as root: -15"
fi

if [ "$DEFAULT_SA_PODS" -gt 10 ]; then
    SCORE=$((SCORE - 10))
    echo "Too many Pods use the default ServiceAccount: -10"
fi

# Integer arithmetic avoids the bc dependency; the guard prevents division by zero
NP_COVERAGE=$(( TOTAL_NS > 0 ? NS_WITH_NP * 100 / TOTAL_NS : 0 ))
if [ "$NP_COVERAGE" -lt 80 ]; then
    SCORE=$((SCORE - 20))
    echo "NetworkPolicy coverage below 80%: -20"
fi

echo ""
echo "Final compliance score: $SCORE/100"

if [ "$SCORE" -ge 90 ]; then
    echo "Compliance level: excellent ✓"
elif [ "$SCORE" -ge 80 ]; then
    echo "Compliance level: good ⚠"
elif [ "$SCORE" -ge 70 ]; then
    echo "Compliance level: fair ⚠"
else
    echo "Compliance level: needs improvement ✗"
fi

echo ""
echo "=== Compliance check complete ==="
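
The scoring rules at the end of the script are easier to reason about when they are separated from the kubectl calls. Below is a minimal sketch that reuses the same thresholds and deductions; the function name compliance_score and its argument order are assumptions introduced for this example.

```shell
#!/bin/bash
# Hypothetical helper.
# Usage: compliance_score PRIVILEGED HOSTNET ROOT DEFAULT_SA NS_WITH_NP TOTAL_NS
compliance_score() {
    local privileged=$1 hostnet=$2 root=$3 default_sa=$4 ns_with_np=$5 total_ns=$6
    local score=100 coverage=0

    # Guard against division by zero when no namespaces are reported
    if (( total_ns > 0 )); then
        coverage=$(( ns_with_np * 100 / total_ns ))
    fi

    # Same deductions as the compliance script
    if (( privileged > 0 ));  then score=$(( score - 10 )); fi
    if (( hostnet > 0 ));     then score=$(( score - 10 )); fi
    if (( root > 5 ));        then score=$(( score - 15 )); fi
    if (( default_sa > 10 )); then score=$(( score - 10 )); fi
    if (( coverage < 80 ));   then score=$(( score - 20 )); fi

    echo "$score"
}

# Sample counts: 2 privileged Pods, 12 Pods on the default ServiceAccount, 4 of 5 namespaces covered
compliance_score 2 0 3 12 4 5
```

Keeping the arithmetic in one function means the thresholds can be unit-tested without a cluster.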

10.7 Security Monitoring and Alerting

10.7.1 Security Event Monitoring

# Security monitoring configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: security-monitoring
data:
  monitor-security-events.sh: |
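    #!/bin/bash
    # Security event monitoring script

    echo "=== Security Event Monitoring ==="

    # Failed authentication attempts
    echo "1. Authentication failures:"
    kubectl logs -n kube-system -l component=kube-apiserver --tail=1000 | grep -i "authentication failed" | tail -10

    # Requests rejected by authorization
    echo ""
    echo "2. Authorization denials:"
    kubectl logs -n kube-system -l component=kube-apiserver --tail=1000 | grep -i "forbidden" | tail -10

    # Mount and scheduling failures. Field selectors are ANDed, so each reason is queried separately.
    echo ""
    echo "3. Failed mounts and scheduling:"
    for reason in FailedMount FailedScheduling; do
        kubectl get events --all-namespaces --field-selector reason="$reason" | head -10
    done

    # Privileged Pods (privileged is set per container, not at Pod level)
    echo ""
    echo "4. Privileged Pods:"
    kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(any(.spec.containers[], .spec.initContainers[]?; .securityContext.privileged == true)) | "Privileged Pod: \(.metadata.namespace)/\(.metadata.name)"'

    # Kubernetes itself emits no NetworkPolicy violation events; this query returns results
    # only if the CNI plugin or a policy engine publishes events with this reason.
    echo ""
    echo "5. NetworkPolicy events:"
    kubectl get events --all-namespaces --field-selector reason=NetworkPolicyViolation | head -5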
  
  security-alerts.yaml: |
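    # Prometheus alerting rules
    # Note: kube_pod_spec_security_context_privileged and kube_pod_spec_security_context_run_as_user
    # are not stock kube-state-metrics series; the rules that use them assume a custom exporter.
    groups:
    - name: kubernetes-security
      rules:
      - alert: PrivilegedPodCreated
        # kube_pod_info is a constant gauge, so a new Pod is detected as a series that did not exist 5 minutes ago
        expr: (kube_pod_info unless on(namespace, pod) (kube_pod_info offset 5m)) and on(namespace, pod) (kube_pod_spec_security_context_privileged == 1)
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Privileged Pod created"
          description: "Privileged Pod {{ $labels.pod }} was created in namespace {{ $labels.namespace }}"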
      
      - alert: RootUserPodCreated
        # Assumes a custom exporter provides kube_pod_spec_security_context_run_as_user
        expr: (kube_pod_info unless on(namespace, pod) (kube_pod_info offset 5m)) and on(namespace, pod) (kube_pod_spec_security_context_run_as_user == 0)
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "Pod running as root created"
          description: "Pod {{ $labels.pod }} running as root was created in namespace {{ $labels.namespace }}"
      
      - alert: HostNetworkPodCreated
        # kube-state-metrics exposes hostNetwork as the host_network label on kube_pod_info
        expr: kube_pod_info{host_network="true"} unless on(namespace, pod) (kube_pod_info offset 5m)
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "hostNetwork Pod created"
          description: "Pod {{ $labels.pod }} using hostNetwork was created in namespace {{ $labels.namespace }}"
      
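      - alert: UnauthorizedAPIAccess
        # Assumes an external pipeline turns audit log entries into apiserver_audit_total with these labels;
        # the API server does not export per-user request metrics itself.
        expr: increase(apiserver_audit_total{verb="create",objectRef_resource="pods",user_username!~"system:.*"}[5m]) > 10
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Unusual API access detected"
          description: "User {{ $labels.user_username }} created a large number of Pods in a short time"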
      
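      - alert: FailedAuthentication
        # apiserver_request_total records the HTTP status code of every API request
        expr: sum(increase(apiserver_request_total{code="401"}[5m])) > 5
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Authentication failures detected"
          description: "{{ $value }} authentication failures were detected in the last 5 minutes"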

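10.7.2 Audit Log Configuration

# Audit policy
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log changes to Secrets, ConfigMaps, and RBAC objects at Metadata level, so request bodies
# (and therefore Secret values) never reach the log. No namespaces filter is set: [""] would
# match only cluster-scoped resources and skip Secrets and ConfigMaps entirely.
- level: Metadata
  verbs: ["create", "update", "patch", "delete"]
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]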

# Log write operations by the admin user, including request bodies
- level: Request
  users: ["system:admin"]
  verbs: ["create", "update", "patch", "delete"]

# Log Pod creation and deletion
- level: Metadata
  resources:
  - group: ""
    resources: ["pods"]
  verbs: ["create", "delete"]

# Log NetworkPolicy changes
- level: Request
  resources:
  - group: "networking.k8s.io"
    resources: ["networkpolicies"]
  verbs: ["create", "update", "patch", "delete"]

# Ignore read-only operations
- level: None
  verbs: ["get", "list", "watch"]
---
# Audit log analysis script
apiVersion: v1
kind: ConfigMap
metadata:
  name: audit-analysis
data:
  analyze-audit-logs.sh: |
    #!/bin/bash
    # Audit log analysis script (expects one JSON audit event per line)

    AUDIT_LOG=${1:-"/var/log/audit.log"}

    if [ ! -f "$AUDIT_LOG" ]; then
        echo "Audit log file not found: $AUDIT_LOG"
        exit 1
    fi

    echo "=== Audit Log Analysis ==="
    echo "File: $AUDIT_LOG"
    echo "Analysis time: $(date)"
    echo ""

    # Authentication failures
    echo "1. Authentication failures:"
    grep '"code":401' "$AUDIT_LOG" | jq -r '.user.username' | sort | uniq -c | sort -nr | head -10

    # Authorization denials
    echo ""
    echo "2. Authorization denials:"
    grep '"code":403' "$AUDIT_LOG" | jq -r '.user.username + " -> " + .verb + " " + .objectRef.resource' | sort | uniq -c | sort -nr | head -10

    # Secret access
    echo ""
    echo "3. Secret access:"
    grep '"resource":"secrets"' "$AUDIT_LOG" | jq -r '.user.username + " -> " + .verb + " " + .objectRef.namespace + "/" + .objectRef.name' | sort | uniq -c | sort -nr | head -10

    # Operations by the admin user
    echo ""
    echo "4. Privileged operations:"
    grep '"user":{"username":"system:admin"' "$AUDIT_LOG" | jq -r '.verb + " " + .objectRef.resource + " " + .objectRef.name' | sort | uniq -c | sort -nr | head -10

    # Access outside working hours (22:00-06:59 UTC). Audit events carry requestReceivedTimestamp, not timestamp.
    echo ""
    echo "5. Off-hours access:"
    grep -E '"requestReceivedTimestamp":"[0-9]{4}-[0-9]{2}-[0-9]{2}T(0[0-6]|2[2-3])' "$AUDIT_LOG" | jq -r '.user.username' | sort | uniq -c | sort -nr | head -5

    echo ""
    echo "=== Analysis complete ==="
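
The off-hours check relies on a regular expression over the event timestamp, which is easy to get subtly wrong. As a small sketch, the same hour window (00 to 06 and 22 to 23) can be wrapped in a function and tried on sample RFC 3339 timestamps; the function name is_off_hours is an assumption, and the window is interpreted in UTC because that is how audit timestamps are recorded.

```shell
#!/bin/bash
# Hypothetical helper: succeed when an RFC 3339 timestamp falls in the 22:00-06:59 window.
is_off_hours() {
    local ts="$1"
    local re='^[0-9]{4}-[0-9]{2}-[0-9]{2}T(0[0-6]|2[2-3]):'
    [[ "$ts" =~ $re ]]
}

# Sample timestamps in the format used by requestReceivedTimestamp
for ts in 2024-05-01T23:15:42.000000Z 2024-05-01T12:00:00.000000Z 2024-05-02T06:59:59.000000Z; do
    if is_off_hours "$ts"; then
        echo "$ts off-hours"
    else
        echo "$ts working hours"
    fi
done
```

Anchoring the pattern at the start of the value and requiring the colon after the hour avoids accidental matches elsewhere in the line.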

10.7.3 Security Alert Integration

# Alertmanager configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-config
  namespace: monitoring
data:
  alertmanager.yml: |
    global:
      smtp_smarthost: 'smtp.company.com:587'
      smtp_from: 'alerts@company.com'
    
    route:
      group_by: ['alertname', 'cluster', 'service']
      group_wait: 10s
      group_interval: 10s
      repeat_interval: 1h
      receiver: 'default'
      routes:
      - match:
          severity: critical
        receiver: 'critical-alerts'
        # Keep evaluating later routes so critical security alerts also reach security-team
        continue: true
      - match:
          alertname: 'PrivilegedPodCreated'
        receiver: 'security-team'
      - match:
          alertname: 'UnauthorizedAPIAccess'
        receiver: 'security-team'
    
    receivers:
    - name: 'default'
      email_configs:
      # email_configs has no subject or body field: the subject is a header, the plain-text body is text
      - to: 'ops@company.com'
        headers:
          Subject: 'Kubernetes Alert: {{ .GroupLabels.alertname }}'
        text: |
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          Labels: {{ .Labels }}
          {{ end }}
    
    - name: 'critical-alerts'
      email_configs:
      - to: 'ops@company.com,security@company.com'
        headers:
          Subject: 'CRITICAL Kubernetes Alert: {{ .GroupLabels.alertname }}'
        text: |
          CRITICAL ALERT!
          
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          Severity: {{ .Labels.severity }}
          Time: {{ .StartsAt }}
          Labels: {{ .Labels }}
          {{ end }}
      slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#alerts'
        title: 'CRITICAL Kubernetes Alert'
        text: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
    
    - name: 'security-team'
      email_configs:
      - to: 'security@company.com'
        headers:
          Subject: 'Security Alert: {{ .GroupLabels.alertname }}'
        text: |
          SECURITY ALERT!
          
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          Time: {{ .StartsAt }}
          Labels: {{ .Labels }}
          {{ end }}
      webhook_configs:
      - url: 'https://security-system.company.com/webhook'
        send_resolved: true
---
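# Security incident response script
apiVersion: v1
kind: ConfigMap
metadata:
  name: security-response
data:
  incident-response.sh: |
    #!/bin/bash
    # Security incident response script
    # Usage: incident-response.sh ALERT_TYPE NAMESPACE POD_NAME
    
    ALERT_TYPE=${1}
    NAMESPACE=${2}
    POD_NAME=${3}
    
    echo "=== Security Incident Response ==="
    echo "Event type: $ALERT_TYPE"
    echo "Namespace: $NAMESPACE"
    echo "Pod name: $POD_NAME"
    echo "Response time: $(date)"
    echo ""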
    
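    case "$ALERT_TYPE" in
        "PrivilegedPod")
            echo "Handling privileged Pod event..."
            # Show Pod details
            kubectl describe pod "$POD_NAME" -n "$NAMESPACE"
            
            # Inspect the container security contexts (privileged is set per container)
            kubectl get pod "$POD_NAME" -n "$NAMESPACE" -o jsonpath='{.spec.containers[*].securityContext}'
            
            # Isolate only this Pod: tag it with a quarantine label, then deny all traffic for that label.
            # The heredoc body stays indented so the enclosing YAML block scalar remains valid.
            kubectl label pod "$POD_NAME" -n "$NAMESPACE" quarantine=true --overwrite
            cat <<EOF | kubectl apply -f -
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: isolate-$POD_NAME
      namespace: $NAMESPACE
    spec:
      podSelector:
        matchLabels:
          quarantine: "true"
      policyTypes:
      - Ingress
      - Egress
    EOF
            echo "Pod has been network-isolated"
            ;;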
        
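        "UnauthorizedAccess")
            echo "Handling unauthorized access event..."
            # Review recent denied API calls
            kubectl logs -n kube-system -l component=kube-apiserver --tail=100 | grep -i "forbidden"
            
            # List the permissions of the suspect identity (for this event type the third argument is a user name)
            kubectl auth can-i --list --as="$POD_NAME"
            ;;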
        
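        "SuspiciousActivity")
            echo "Handling suspicious activity event..."
            # Collect Pod logs
            kubectl logs "$POD_NAME" -n "$NAMESPACE" --tail=100 > /tmp/suspicious-pod-logs.txt
            
            # Inspect the Pod's network connections (requires netstat in the container image)
            kubectl exec "$POD_NAME" -n "$NAMESPACE" -- netstat -tulpn
            ;;
        
        *)
            echo "Unknown event type: $ALERT_TYPE"
            ;;
    esac
    
    echo ""
    echo "=== Response complete ==="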
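
Because the response script interpolates its arguments into kubectl commands and a generated manifest, it is worth rejecting malformed names before acting on them. The sketch below checks the Kubernetes naming rules (DNS label for namespaces, DNS subdomain for Pod names); the function names is_dns_label and is_dns_subdomain are assumptions introduced for this example.

```shell
#!/bin/bash
# Hypothetical helpers for validating script arguments before they reach kubectl.

# Namespace names: lowercase alphanumerics and '-', at most 63 characters
is_dns_label() {
    local LC_ALL=C name=$1
    local re='^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
    (( ${#name} <= 63 )) && [[ $name =~ $re ]]
}

# Pod names: dot-separated labels, at most 253 characters in total
is_dns_subdomain() {
    local LC_ALL=C name=$1
    local re='^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
    (( ${#name} <= 253 )) && [[ $name =~ $re ]]
}

# Sample inputs, including one shell-injection attempt
for candidate in production Prod 'a; rm -rf /'; do
    if is_dns_label "$candidate"; then
        echo "accepted: $candidate"
    else
        echo "rejected: $candidate"
    fi
done
```

In the response script these checks would run right after the arguments are read, exiting early when either name is rejected.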

10.8 Security Best Practices

10.8.1 Secure Development Lifecycle

# Secure development workflow configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: secure-development
data:
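  security-checklist.md: |
    # Kubernetes Secure Development Checklist
    
    ## Design
    - [ ] Define security requirements and the threat model
    - [ ] Design least-privilege access policies
    - [ ] Plan network segmentation and isolation
    - [ ] Determine data classification and protection requirements
    
    ## Development
    - [ ] Use secure base images
    - [ ] Follow container security best practices
    - [ ] Configure an appropriate SecurityContext
    - [ ] Avoid hard-coding sensitive information in code
    - [ ] Implement input validation and output encoding
    
    ## Build
    - [ ] Scan container images for vulnerabilities
    - [ ] Verify image signatures
    - [ ] Run static code analysis
    - [ ] Run security tests
    
    ## Deployment
    - [ ] Apply Pod Security Standards
    - [ ] Configure NetworkPolicies
    - [ ] Set resource limits
    - [ ] Enable audit logging
    - [ ] Configure monitoring and alerting
    
    ## Operations
    - [ ] Continuously monitor security events
    - [ ] Update and patch regularly
    - [ ] Perform security assessments
    - [ ] Maintain the incident response plan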
  
  pre-commit-security-check.sh: |
    #!/bin/bash
    # Git pre-commit security check script (prints warnings only; it does not block the commit)
    
    echo "=== Pre-commit Security Check ==="
    
    # Check for possible sensitive information
    echo "1. Checking for sensitive information..."
    if git diff --cached --name-only | xargs grep -l -i "password\|secret\|key\|token" 2>/dev/null; then
        echo "Warning: possible sensitive information detected"
        git diff --cached | grep -i "password\|secret\|key\|token"
        echo "Please confirm that these are not real credentials"
    fi
    
    # Check Kubernetes manifests
    echo ""
    echo "2. Checking Kubernetes manifests..."
    for file in $(git diff --cached --name-only | grep -E "\.(yaml|yml)$"); do
        if [ -f "$file" ]; then
            # Privileged containers
            if grep -q "privileged: true" "$file"; then
                echo "Warning: $file contains a privileged setting"
            fi
            
            # hostNetwork
            if grep -q "hostNetwork: true" "$file"; then
                echo "Warning: $file uses hostNetwork"
            fi
            
            # runAsUser: 0
            if grep -q "runAsUser: 0" "$file"; then
                echo "Warning: $file runs as root"
            fi
        fi
    done
    
    echo ""
    echo "=== Security check complete ==="
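
The hook above only warns. If the team prefers to block risky manifests, the per-file checks can return a failing status that the hook turns into `exit 1`. This is a minimal sketch under that assumption; the function name scan_manifest and the BLOCK message format are hypothetical, and the patterns mirror the three checks in the hook.

```shell
#!/bin/bash
# Hypothetical helper: return non-zero when a manifest contains a risky setting.
scan_manifest() {
    local file="$1" findings=0 pattern
    for pattern in \
        'privileged:[[:space:]]*true' \
        'hostNetwork:[[:space:]]*true' \
        'runAsUser:[[:space:]]*0[[:space:]]*(#.*)?$'; do
        if grep -Eq "$pattern" "$file"; then
            echo "BLOCK: $file matches $pattern"
            findings=$((findings + 1))
        fi
    done
    [ "$findings" -eq 0 ]
}

# Demonstration on a temporary sample manifest
sample=$(mktemp)
printf 'spec:\n  hostNetwork: true\n  securityContext:\n    runAsUser: 1000\n' > "$sample"
if scan_manifest "$sample"; then
    echo "manifest passed"
else
    echo "manifest rejected"
fi
rm -f "$sample"
```

Inside a pre-commit hook, calling `scan_manifest "$file" || exit 1` for each staged YAML file would abort the commit on the first rejected manifest.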

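10.8.2 Security Configuration Templates

# Secure Pod template. Template is an OpenShift object (vanilla Kubernetes has no Template kind);
# render it with `oc process`, or port the parameters to Helm or Kustomize.
apiVersion: template.openshift.io/v1
kind: Template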
metadata:
  name: secure-pod-template
objects:
- apiVersion: v1
  kind: Pod
  metadata:
    name: ${APP_NAME}
    namespace: ${NAMESPACE}
    labels:
      app: ${APP_NAME}
      security-policy: restricted
  spec:
    serviceAccountName: ${SERVICE_ACCOUNT_NAME}
    automountServiceAccountToken: false
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000
      runAsGroup: 1000
      fsGroup: 1000
      seccompProfile:
        type: RuntimeDefault
    containers:
    - name: ${APP_NAME}
      image: ${IMAGE_NAME}:${IMAGE_TAG}
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        capabilities:
          drop:
          - ALL
        seccompProfile:
          type: RuntimeDefault
      resources:
        requests:
          memory: ${MEMORY_REQUEST}
          cpu: ${CPU_REQUEST}
        limits:
          memory: ${MEMORY_LIMIT}
          cpu: ${CPU_LIMIT}
      volumeMounts:
      - name: tmp-volume
        mountPath: /tmp
      - name: cache-volume
        mountPath: /app/cache
      livenessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 30
        periodSeconds: 10
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
    volumes:
    - name: tmp-volume
      emptyDir: {}
    - name: cache-volume
      emptyDir: {}
parameters:
- name: APP_NAME
  description: "Application name"
  required: true
- name: NAMESPACE
  description: "Namespace"
  value: "default"
- name: SERVICE_ACCOUNT_NAME
  description: "ServiceAccount name"
  value: "default"
- name: IMAGE_NAME
  description: "Image name"
  required: true
- name: IMAGE_TAG
  description: "Image tag (pin a specific version rather than latest)"
  required: true
- name: MEMORY_REQUEST
  description: "Memory request"
  value: "128Mi"
- name: MEMORY_LIMIT
  description: "Memory limit"
  value: "256Mi"
- name: CPU_REQUEST
  description: "CPU request"
  value: "100m"
- name: CPU_LIMIT
  description: "CPU limit"
  value: "500m"
---
# Secure NetworkPolicy template (OpenShift Template object)
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: secure-network-policy-template
objects:
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    name: ${APP_NAME}-network-policy
    namespace: ${NAMESPACE}
  spec:
    podSelector:
      matchLabels:
        app: ${APP_NAME}
    policyTypes:
    - Ingress
    - Egress
    ingress:
    - from:
      - podSelector:
          matchLabels:
            app: ${ALLOWED_SOURCE_APP}
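      ports:
      - protocol: TCP
        port: ${{INGRESS_PORT}}  # ${{...}} inserts the value unquoted, so the port stays numeric
    egress:
    - to: []
      ports:
      - protocol: UDP
        port: 53  # DNS
      - protocol: TCP
        port: 53  # DNS over TCP
    - to:
      - podSelector:
          matchLabels:
            app: ${ALLOWED_TARGET_APP}
      ports:
      - protocol: TCP
        port: ${{EGRESS_PORT}}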
parameters:
- name: APP_NAME
  description: "Application name"
  required: true
- name: NAMESPACE
  description: "Namespace"
  value: "default"
- name: ALLOWED_SOURCE_APP
  description: "Source application allowed to connect"
  required: true
- name: ALLOWED_TARGET_APP
  description: "Target application this app may connect to"
  required: true
- name: INGRESS_PORT
  description: "Ingress port"
  value: "8080"
- name: EGRESS_PORT
  description: "Egress port"
  value: "8080"
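
The parameter placeholders in these templates are normally resolved by OpenShift. On a cluster without Template support, the substitution step can be approximated in plain bash, as in the rough sketch below. The function name render_template and the NAME=VALUE argument convention are assumptions; the sketch handles both the `${NAME}` and `${{NAME}}` placeholder forms and does not apply parameter defaults or required checks.

```shell
#!/bin/bash
# Hypothetical helper: render_template FILE NAME=VALUE...
# Values containing '&' may need escaping on bash 5.2 and later.
render_template() {
    local content pair key value needle
    content=$(cat "$1")
    shift
    for pair in "$@"; do
        key=${pair%%=*}
        value=${pair#*=}
        # Replace the unquoted form ${{NAME}} first, then the plain ${NAME} form
        needle='${{'"$key"'}}'
        content=${content//"$needle"/$value}
        needle='${'"$key"'}'
        content=${content//"$needle"/$value}
    done
    printf '%s\n' "$content"
}

# Demonstration on a small sample file
template=$(mktemp)
cat > "$template" <<'EOF'
metadata:
  name: ${APP_NAME}-network-policy
spec:
  port: ${{INGRESS_PORT}}
EOF
render_template "$template" APP_NAME=web INGRESS_PORT=8080
rm -f "$template"
```

For anything beyond a handful of parameters, Helm or Kustomize is likely a better fit than string substitution.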

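10.9 Summary

This chapter covered Kubernetes security management in detail, including:

Core security concepts

  • Security model: defense in depth, least privilege, zero-trust architecture
  • Threat model: common security threats and mitigation strategies
  • Security principles: shift-left security, continuous monitoring, security automation

Authentication and authorization

  • Authentication mechanisms: client certificates, ServiceAccount tokens, OIDC
  • RBAC access control: role definitions, role bindings, the least-privilege principle
  • ServiceAccount management: dedicated accounts, permission isolation, token management

Pod security

  • Pod Security Standards: Privileged, Baseline, Restricted
  • SecurityContext configuration: user permissions, filesystem permissions, capability control
  • Network Policy: network isolation, traffic control, micro-segmentation

Secret and configuration security

  • Secret best practices: encrypted storage, access control, regular rotation
  • External Secret management: Vault integration, AWS Secrets Manager
  • Configuration security scanning: automated checks, compliance verification

Image security

  • Image scanning and policy: vulnerability scanning, base image management, tagging policy
  • Image signature verification: Cosign signing, SBOM management, admission control
  • Runtime security: Falco monitoring, anomaly detection, incident response

Security monitoring and alerting

  • Security event monitoring: real-time monitoring, log analysis, anomaly detection
  • Audit log configuration: audit policy, log analysis, compliance reporting
  • Security alert integration: Alertmanager configuration, incident response, automated handling

Security best practices

  • Secure development lifecycle: secure design, secure development, secure deployment
  • Security configuration templates: standardized configuration, automated deployment, consistency

After working through this chapter, you should be able to:

  1. Understand the Kubernetes security model and its threats
  2. Configure authentication, authorization, and access control
  3. Apply Pod security controls and network isolation
  4. Manage Secrets and sensitive configuration
  5. Keep container images secure
  6. Build a security monitoring and alerting system
  7. Follow secure development best practices

The next chapter covers troubleshooting and performance tuning in Kubernetes.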