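Chapter Overview

Helm Hooks let you run custom operations at specific points in a chart's lifecycle, such as install, upgrade, and delete. With hooks you can implement deployment logic such as database migrations, configuration initialization, and cleanup tasks. This chapter covers how to use Helm Hooks and the practices that make them reliable.

Learning Objectives

  • Understand what Helm Hooks are and what they are for
  • Know the hook types and when each one fires
  • Create and configure hook resources
  • Understand hook execution order and weights
  • Handle hook errors and retries
  • Apply hooks to real-world scenarios
  • Debug and troubleshoot hooks

5.1 Hooks Overview

5.1.1 What Are Helm Hooks

Helm Hooks are Kubernetes resources that Helm creates at specific points in a chart's lifecycle. They let you run custom logic, such as database migrations, configuration validation, or cleanup tasks, before or after operations like install, upgrade, and delete.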

graph TB
    A[helm install] --> B[pre-install hooks]
    B --> C[install resources]
    C --> D[post-install hooks]
    
    E[helm upgrade] --> F[pre-upgrade hooks]
    F --> G[upgrade resources]
    G --> H[post-upgrade hooks]
    
    I[helm delete] --> J[pre-delete hooks]
    J --> K[delete resources]
    K --> L[post-delete hooks]
    
    subgraph "Hook types"
        M[pre-install]
        N[post-install]
        O[pre-upgrade]
        P[post-upgrade]
        Q[pre-delete]
        R[post-delete]
        S[pre-rollback]
        T[post-rollback]
        U[test]
    end

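5.1.2 Benefits of Hooks

  1. Lifecycle control: run operations at key points of the deployment process
  2. Data management: run database migrations and initialization
  3. Configuration validation: verify the environment and configuration before deploying
  4. Cleanup: run cleanup operations on deletion
  5. Test integration: build automated tests into the deployment flow

5.2 Hook Types and Trigger Points

5.2.1 Install Hooks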

# pre-install Hook
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-pre-install"
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: pre-install
        image: alpine:3.16
        command:
        - /bin/sh
        - -c
        - |
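          set -e  # abort on the first failing command so the hook fails
          echo "Executing pre-install tasks..."
          # Check that the database port is reachable
          nc -z {{ .Values.database.host }} {{ .Values.database.port }}
          echo "Database is accessible"
          # Create required directories
          mkdir -p /shared/logs
          echo "Pre-install completed successfully"
---
# post-install Hook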
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-post-install"
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: post-install
        image: curlimages/curl:7.85.0
        command:
        - /bin/sh
        - -c
        - |
          echo "Executing post-install tasks..."
          # Wait for the service to become ready
          until curl -f http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/health; do
            echo "Waiting for service to be ready..."
            sleep 5
          done
          echo "Service is ready"
          # Send a notification
          curl -X POST {{ .Values.notifications.webhook }} \
            -H "Content-Type: application/json" \
            -d '{"message": "Application {{ include "myapp.fullname" . }} installed successfully"}'
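
The `until curl ...` loop above never gives up, so a service that never becomes healthy keeps the hook running until Helm or the Job times out. A minimal sketch of a bounded wait follows; the helper name `wait_for` and its arguments are assumptions made for illustration and are not part of Helm or Kubernetes.

```shell
#!/bin/sh
# wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds or TIMEOUT_SECONDS have elapsed.
# Returns 0 on success and 1 on timeout.
wait_for() {
  timeout_s=$1
  interval_s=$2
  shift 2
  deadline=$(( $(date +%s) + timeout_s ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "Timed out after ${timeout_s}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval_s"
  done
}
```

Inside a hook script this could be called as `wait_for 300 5 curl -fsS http://myapp:8080/health || exit 1`, where the URL is a placeholder for the rendered service address.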

5.2.2 Upgrade Hooks

# pre-upgrade Hook - database backup
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-pre-upgrade-backup"
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-weight": "-10"
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: backup
        image: postgres:13
        env:
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: {{ .Values.database.secretName }}
              key: password
        command:
        - /bin/bash
        - -c
        - |
          set -e  # fail the hook, and with it the upgrade, if pg_dump fails
          echo "Creating database backup before upgrade..."
          BACKUP_FILE="/backup/backup-$(date +%Y%m%d-%H%M%S).sql"
          pg_dump -h {{ .Values.database.host }} \
                  -U {{ .Values.database.username }} \
                  -d {{ .Values.database.name }} \
                  > $BACKUP_FILE
          echo "Backup created: $BACKUP_FILE"
        volumeMounts:
        - name: backup-storage
          mountPath: /backup
      volumes:
      - name: backup-storage
        persistentVolumeClaim:
          claimName: {{ include "myapp.fullname" . }}-backup-pvc
---
# post-upgrade Hook - database migration
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-post-upgrade-migration"
  annotations:
    "helm.sh/hook": post-upgrade
    "helm.sh/hook-weight": "1"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migration
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        env:
        # DB_PASSWORD is declared first: $(VAR) references expand only
        # variables defined earlier in the env list
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: {{ .Values.database.secretName }}
              key: password
        - name: DATABASE_URL
          value: "postgresql://{{ .Values.database.username }}:$(DB_PASSWORD)@{{ .Values.database.host }}:{{ .Values.database.port }}/{{ .Values.database.name }}"
        command:
        - /bin/sh
        - -c
        - |
          set -e  # a failed migration must fail the hook
          echo "Running database migrations..."
          ./migrate up
          echo "Migrations completed successfully"
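
A backup hook is only useful if a failed dump cannot be mistaken for a good one. Redirecting `pg_dump` straight into the target file, as in the pre-upgrade Job above, leaves a partial or empty file behind when the command fails. The sketch below writes to a temporary file and renames it only on success; `safe_dump` is a hypothetical helper name, and the `.partial` suffix is an arbitrary choice.

```shell
#!/bin/sh
# safe_dump TARGET_FILE COMMAND [ARGS...]
# Writes COMMAND's stdout to a temporary file next to TARGET_FILE and renames
# it into place only if COMMAND succeeds and produced non-empty output.
safe_dump() {
  target=$1
  shift
  tmp="${target}.partial"
  if "$@" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$target"
    echo "Backup created: $target"
  else
    rm -f "$tmp"
    echo "Backup failed, nothing written to $target" >&2
    return 1
  fi
}
```

In the backup Job this might be invoked as `safe_dump "$BACKUP_FILE" pg_dump -h ... -U ... -d ... || exit 1`, so that a failed dump stops the upgrade.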

5.2.3 Delete Hooks

# pre-delete Hook - data backup
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-pre-delete-backup"
  annotations:
    "helm.sh/hook": pre-delete
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: final-backup
        image: postgres:13
        env:
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: {{ .Values.database.secretName }}
              key: password
        command:
        - /bin/bash
        - -c
        - |
          set -e  # fail the hook if the final backup cannot be written
          echo "Creating final backup before deletion..."
          BACKUP_FILE="/backup/final-backup-$(date +%Y%m%d-%H%M%S).sql"
          pg_dump -h {{ .Values.database.host }} \
                  -U {{ .Values.database.username }} \
                  -d {{ .Values.database.name }} \
                  > $BACKUP_FILE
          echo "Final backup created: $BACKUP_FILE"
        volumeMounts:
        - name: backup-storage
          mountPath: /backup
      volumes:
      - name: backup-storage
        persistentVolumeClaim:
          claimName: {{ include "myapp.fullname" . }}-backup-pvc
---
# post-delete Hook - cleanup tasks
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-post-delete-cleanup"
  annotations:
    "helm.sh/hook": post-delete
    "helm.sh/hook-weight": "5"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: cleanup
        image: alpine:3.16
        command:
        - /bin/sh
        - -c
        - |
          echo "Performing cleanup tasks..."
          # Clean up external resources
          # Send a deletion notification
          echo "Cleanup completed"

5.2.4 Rollback Hooks

# pre-rollback Hook
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-pre-rollback"
  annotations:
    "helm.sh/hook": pre-rollback
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: pre-rollback
        image: alpine:3.16
        command:
        - /bin/sh
        - -c
        - |
          echo "Preparing for rollback..."
          # Verify the rollback preconditions
          # Prepare the rollback environment
          echo "Pre-rollback tasks completed"
---
# post-rollback Hook
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-post-rollback"
  annotations:
    "helm.sh/hook": post-rollback
    "helm.sh/hook-weight": "5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: post-rollback
        image: curlimages/curl:7.85.0
        command:
        - /bin/sh
        - -c
        - |
          echo "Verifying rollback..."
          # Verify service status
          until curl -f http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/health; do
            echo "Waiting for service after rollback..."
            sleep 5
          done
          echo "Rollback verification completed"

5.2.5 Test Hooks

# test Hook
apiVersion: v1
kind: Pod
metadata:
  name: "{{ include "myapp.fullname" . }}-test"
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  restartPolicy: Never
  containers:
  - name: test
    image: curlimages/curl:7.85.0
    command:
    - /bin/sh
    - -c
    - |
      set -e  # without this, only the final echo decides the Pod's exit status
      echo "Running application tests..."
      
      # Health check test
      echo "Testing health endpoint..."
      curl -f http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/health
      
      # API functional test
      echo "Testing API endpoints..."
      curl -f http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/api/users
      
      # Database connectivity test
      echo "Testing database connectivity..."
      curl -f http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/api/health/db
      
      echo "All tests passed successfully"

5.3 Hook Annotations in Detail

5.3.1 Basic Annotations

metadata:
  annotations:
    # Hook type (required)
    "helm.sh/hook": "pre-install,post-install"
    
    # Hook weight (optional, defaults to 0)
    "helm.sh/hook-weight": "-5"
    
    # Hook deletion policy (optional)
    "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded"

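5.3.2 Managing Hook Weights

Weights determine the order in which hooks for the same event run. Helm sorts them in ascending order, so lower values (including negative ones) run first:

# Execution order example
# 1. Weight -10: database backup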
apiVersion: batch/v1
kind: Job
metadata:
  name: backup-job
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-weight": "-10"

---
# 2. Weight -5: environment preparation
apiVersion: batch/v1
kind: Job
metadata:
  name: prepare-job
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-weight": "-5"

---
# 3. Weight 0: default operation
apiVersion: batch/v1
kind: Job
metadata:
  name: default-job
  annotations:
    "helm.sh/hook": pre-upgrade
    # The default weight is 0

---
# 4. Weight 5: follow-up processing
apiVersion: batch/v1
kind: Job
metadata:
  name: post-process-job
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-weight": "5"

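The ordering rule can be checked offline with a small sketch. It sorts "WEIGHT NAME" pairs numerically by weight and, as a simplifying assumption about tie-breaking, by name when weights are equal. The function name `order_hooks` is made up for this example.

```shell
#!/bin/sh
# order_hooks reads "WEIGHT NAME" lines on stdin and prints the names in
# ascending weight order; equal weights fall back to name order
# (a simplification of how Helm breaks ties).
order_hooks() {
  LC_ALL=C sort -k1,1n -k2,2 | awk '{ print $2 }'
}

# The four pre-upgrade Jobs from the example above, listed out of order.
order_hooks <<'EOF'
5 post-process-job
0 default-job
-10 backup-job
-5 prepare-job
EOF
```

The heredoc prints `backup-job`, `prepare-job`, `default-job`, `post-process-job`, one per line, which matches the numbered comments in the manifests.
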
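5.3.3 Hook Deletion Policies

# Deletion policy options (if the annotation is omitted, Helm defaults to before-hook-creation)
metadata:
  annotations:
    "helm.sh/hook-delete-policy": "before-hook-creation"  # delete the previous resource before a new hook is launched
    # "helm.sh/hook-delete-policy": "hook-succeeded"      # delete after the hook succeeds
    # "helm.sh/hook-delete-policy": "hook-failed"         # delete after the hook fails
    # "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded"  # combined policies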
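
The practical consequence of each policy is whether the hook resource, and therefore its logs, is still in the cluster after a run. The sketch below tabulates that for the combinations listed above; `removed_after_run` is a hypothetical helper written for this illustration, not a Helm command.

```shell
#!/bin/sh
# removed_after_run POLICY OUTCOME
# POLICY  : comma-separated value of helm.sh/hook-delete-policy
# OUTCOME : "succeeded" or "failed"
# Exit status 0 means the hook resource is deleted right after this run;
# 1 means it stays (before-hook-creation removes it only when the hook is
# created again).
removed_after_run() {
  case ",$1," in
    *",hook-$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

for policy in before-hook-creation hook-succeeded hook-failed \
              before-hook-creation,hook-succeeded; do
  for outcome in succeeded failed; do
    if removed_after_run "$policy" "$outcome"; then result=deleted; else result=kept; fi
    printf '%-40s %-10s %s\n' "$policy" "$outcome" "$result"
  done
done
```

With `before-hook-creation,hook-succeeded`, a failed Job is kept, so `kubectl logs` still works when it is needed most.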

5.4 Practical Scenarios

5.4.1 Database Migration System

# templates/hooks/db-migration.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-db-migration"
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-weight": "1"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    metadata:
      labels:
        app: {{ include "myapp.name" . }}
        component: db-migration
    spec:
      restartPolicy: Never
      initContainers:
      - name: wait-for-db
        image: postgres:13
        env:
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: {{ .Values.database.secretName }}
              key: password
        command:
        - /bin/bash
        - -c
        - |
          echo "Waiting for database to be ready..."
          until pg_isready -h {{ .Values.database.host }} -p {{ .Values.database.port }} -U {{ .Values.database.username }}; do
            echo "Database not ready, waiting..."
            sleep 2
          done
          echo "Database is ready"
      containers:
      - name: migrate
        image: "{{ .Values.migration.image.repository }}:{{ .Values.migration.image.tag }}"
        env:
        # DB_PASSWORD is declared first: $(VAR) references expand only
        # variables defined earlier in the env list
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: {{ .Values.database.secretName }}
              key: password
        - name: DATABASE_URL
          value: "postgresql://{{ .Values.database.username }}:$(DB_PASSWORD)@{{ .Values.database.host }}:{{ .Values.database.port }}/{{ .Values.database.name }}?sslmode={{ .Values.database.sslmode }}"
        - name: MIGRATION_DIR
          value: {{ .Values.migration.directory }}
        command:
        - /bin/sh
        - -c
        - |
          echo "Starting database migration..."
          echo "Current database version:"
          # "version" may report an error on an empty database, so do not fail on it
          migrate -path "$MIGRATION_DIR" -database "$DATABASE_URL" version || true
          
          echo "Running migrations..."
          # A failed migration must fail the hook
          migrate -path "$MIGRATION_DIR" -database "$DATABASE_URL" up || exit 1
          
          echo "Migration completed. New database version:"
          migrate -path "$MIGRATION_DIR" -database "$DATABASE_URL" version
        volumeMounts:
        - name: migration-scripts
          mountPath: {{ .Values.migration.directory }}
      volumes:
      - name: migration-scripts
        configMap:
          name: {{ include "myapp.fullname" . }}-migration-scripts

5.4.2 Configuration Validation System

# templates/hooks/config-validation.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-config-validation"
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-10"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: validator
        image: alpine:3.16
        command:
        - /bin/sh
        - -c
        - |
          echo "Validating configuration..."
          
          # Validate required settings
          if [ -z "{{ .Values.app.name }}" ]; then
            echo "ERROR: app.name is required"
            exit 1
          fi
          
          if [ -z "{{ .Values.database.host }}" ]; then
            echo "ERROR: database.host is required"
            exit 1
          fi
          
          # Validate resource settings
          CPU_LIMIT="{{ .Values.resources.limits.cpu }}"
          MEMORY_LIMIT="{{ .Values.resources.limits.memory }}"
          
          if [ "$CPU_LIMIT" = "" ] || [ "$MEMORY_LIMIT" = "" ]; then
            echo "ERROR: Resource limits must be specified"
            exit 1
          fi
          
          # Validate network connectivity
          echo "Testing database connectivity..."
          nc -z {{ .Values.database.host }} {{ .Values.database.port }}
          if [ $? -ne 0 ]; then
            echo "ERROR: Cannot connect to database"
            exit 1
          fi
          
          {{- if .Values.redis.enabled }}
          echo "Testing Redis connectivity..."
          nc -z {{ .Values.redis.host }} {{ .Values.redis.port }}
          if [ $? -ne 0 ]; then
            echo "ERROR: Cannot connect to Redis"
            exit 1
          fi
          {{- end }}
          
          echo "Configuration validation passed"
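
The validator above exits at the first missing value, so fixing a broken values file can take several install attempts. A variation, sketched below, records every problem and fails once at the end. The helper names `require` and `validate_all` are assumptions for this example; in the hook, the arguments would be the rendered `{{ .Values... }}` expressions.

```shell
#!/bin/sh
# Collect every validation problem instead of exiting on the first one.
errors=0

# require NAME VALUE: records an error when VALUE is empty.
require() {
  if [ -z "$2" ]; then
    echo "ERROR: $1 is required" >&2
    errors=$((errors + 1))
  fi
}

# validate_all APP_NAME DB_HOST CPU_LIMIT MEMORY_LIMIT
# Returns 0 when every value is set, 1 otherwise.
validate_all() {
  errors=0
  require app.name "$1"
  require database.host "$2"
  require resources.limits.cpu "$3"
  require resources.limits.memory "$4"
  [ "$errors" -eq 0 ]
}
```

A hook script would end with something like `validate_all ... || exit 1`, after which the Job log lists all missing settings together.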

5.4.3 Data Initialization System

# templates/hooks/data-initialization.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-data-init"
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "10"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: data-init
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        env:
        # DB_PASSWORD is declared first: $(VAR) references expand only
        # variables defined earlier in the env list
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: {{ .Values.database.secretName }}
              key: password
        - name: DATABASE_URL
          value: "postgresql://{{ .Values.database.username }}:$(DB_PASSWORD)@{{ .Values.database.host }}:{{ .Values.database.port }}/{{ .Values.database.name }}"
        - name: ADMIN_EMAIL
          value: {{ .Values.admin.email }}
        - name: ADMIN_PASSWORD
          valueFrom:
            secretKeyRef:
              name: {{ include "myapp.fullname" . }}-admin-secret
              key: password
        command:
        - /bin/sh
        - -c
        - |
          echo "Initializing application data..."
          
          # Wait for the application service to become ready
          until curl -f http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/health; do
            echo "Waiting for application to be ready..."
            sleep 5
          done
          
          # Create the admin user
          echo "Creating admin user..."
          curl -X POST http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/api/admin/users \
            -H "Content-Type: application/json" \
            -d '{
              "email": "'$ADMIN_EMAIL'",
              "password": "'$ADMIN_PASSWORD'",
              "role": "admin"
            }'
          
          # Load default data
          {{- if .Values.dataInit.enabled }}
          echo "Loading initial data..."
          curl -X POST http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/api/admin/data/init
          {{- end }}
          
          echo "Data initialization completed"
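
The admin-user request above splices `$ADMIN_EMAIL` and `$ADMIN_PASSWORD` directly into a JSON string, which produces invalid JSON as soon as a value contains a double quote or a backslash. A minimal sketch of escaping the values first follows; `json_escape` and `build_admin_payload` are hypothetical helper names, and control characters such as newlines are deliberately left unhandled.

```shell
#!/bin/sh
# json_escape STRING
# Escapes backslashes and double quotes so STRING can sit inside a JSON string.
# Control characters such as newlines are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_admin_payload EMAIL PASSWORD
# Prints the JSON body for the admin-user request.
build_admin_payload() {
  printf '{"email": "%s", "password": "%s", "role": "admin"}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}
```

The request could then pass `-d "$(build_admin_payload "$ADMIN_EMAIL" "$ADMIN_PASSWORD")"` to curl instead of the hand-assembled string.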

5.4.4 Cleanup and Backup System

# templates/hooks/cleanup.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-cleanup"
  annotations:
    "helm.sh/hook": pre-delete
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      serviceAccountName: {{ include "myapp.fullname" . }}-cleanup
      containers:
      - name: cleanup
        image: bitnami/kubectl:latest
        command:
        - /bin/bash
        - -c
        - |
          echo "Starting cleanup process..."
          
          # Back up important data
          echo "Creating final backup..."
          kubectl create job {{ include "myapp.fullname" . }}-final-backup \
            --from=cronjob/{{ include "myapp.fullname" . }}-backup \
            --namespace={{ .Release.Namespace }}
          
          # Wait for the backup to finish
          kubectl wait --for=condition=complete job/{{ include "myapp.fullname" . }}-final-backup \
            --timeout=300s --namespace={{ .Release.Namespace }}
          
          # Clean up external resources
          # NOTE: the aws CLI must exist in the container image; this sketch assumes
          # an image that ships both kubectl and aws
          {{- if .Values.cleanup.s3.enabled }}
          echo "Cleaning up S3 resources..."
          aws s3 rm s3://{{ .Values.cleanup.s3.bucket }}/{{ include "myapp.fullname" . }}/ --recursive
          {{- end }}
          
          {{- if .Values.cleanup.dns.enabled }}
          echo "Cleaning up DNS records..."
          # DNS record cleanup logic goes here
          {{- end }}
          
          echo "Cleanup completed"
        env:
        {{- if .Values.cleanup.s3.enabled }}
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: {{ .Values.cleanup.s3.secretName }}
              key: access-key-id
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: {{ .Values.cleanup.s3.secretName }}
              key: secret-access-key
        {{- end }}

5.5 Hook Error Handling and Retries

5.5.1 Retry Mechanism

# templates/hooks/retry-example.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-retry-hook"
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  # Retry configuration
  backoffLimit: 3  # maximum number of retries
  activeDeadlineSeconds: 300  # overall deadline in seconds
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: retry-task
        image: alpine:3.16
        command:
        - /bin/sh
        - -c
        - |
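          # Kubernetes does not expose a retry counter to the Pod
          # (JOB_COMPLETION_INDEX is set only for Indexed Jobs), so log the Pod name instead
          echo "Attempting task in pod $(hostname)..."
          
          # Simulate a task that may fail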
          if [ "$(shuf -i 1-10 -n 1)" -le 3 ]; then
            echo "Task failed, will retry..."
            exit 1
          fi
          
          echo "Task completed successfully"
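
`backoffLimit` retries by creating a new Pod for each attempt. For short transient failures it can be cheaper to retry inside the container and let the Job-level retry act as a second line of defence. The sketch below shows one way to do that with exponential backoff; `retry` is a hypothetical helper, not a shell builtin.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, doubling the delay after each failure.
# Returns 0 on success, 1 once MAX_ATTEMPTS attempts have failed.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while true; do
    echo "Attempt ${attempt}/${max_attempts}: $*"
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after ${attempt} attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

Because the counter lives in the script, the log shows the real attempt number, which the Job itself does not provide.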

5.5.2 Timeout Handling

# templates/hooks/timeout-example.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-timeout-hook"
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  activeDeadlineSeconds: 600  # 10-minute deadline
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: long-running-task
        image: alpine:3.16
        command:
        - /bin/sh
        - -c
        - |
          echo "Starting long-running task..."
          
          # Apply an inner timeout
          timeout 300 /bin/sh -c '
            while true; do
              echo "Processing..."
              sleep 10
              # Check the completion condition
              if [ -f /tmp/task-complete ]; then
                break
              fi
            done
          ' || {
            echo "Task timed out after 5 minutes"
            exit 1
          }
          
          echo "Task completed within timeout"

5.5.3 Error Recovery

# templates/hooks/error-recovery.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-error-recovery"
  annotations:
    "helm.sh/hook": post-upgrade
    "helm.sh/hook-weight": "10"
    "helm.sh/hook-delete-policy": hook-failed
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: recovery-task
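        # NOTE: this script needs both curl and kubectl, which alpine:3.16 does not
        # ship; substitute an image that provides them, and run the Job under a
        # ServiceAccount allowed to restart the Deployment
        image: alpine:3.16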
        command:
        - /bin/sh
        - -c
        - |
          echo "Starting error recovery process..."
          
          # Check application status
          if ! curl -f http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/health; then
            echo "Application health check failed, attempting recovery..."
            
            # Try restarting the application
            kubectl rollout restart deployment/{{ include "myapp.fullname" . }} \
              --namespace={{ .Release.Namespace }}
            
            # Wait for the restart to finish
            kubectl rollout status deployment/{{ include "myapp.fullname" . }} \
              --namespace={{ .Release.Namespace }} --timeout=300s
            
            # Check health again
            sleep 30
            if ! curl -f http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/health; then
              echo "Recovery failed, manual intervention required"
              exit 1
            fi
          fi
          
          echo "Recovery completed successfully"

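5.6 Hook Debugging and Monitoring

5.6.1 Inspecting Hook Status

# List Jobs (the hook examples in this chapter set no labels, so no label selector is used)
kubectl get jobs

# Show the hook manifests Helm recorded for a release
helm get hooks myapp

# View hook logs
kubectl logs job/myapp-pre-install

# View hook events
kubectl describe job myapp-pre-install

# List the Pods of a hook Job ("helm.sh/hook" is an annotation, so it cannot be used as a label selector)
kubectl get pods -l job-name=myapp-pre-install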
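
Since `helm.sh/hook` is an annotation rather than a label, `kubectl -l` cannot select on it. One workaround is to print each Job's name together with the annotation and filter the rows client-side. The sketch below does this with `awk`; the function names `hooks_only` and `list_hook_jobs` are assumptions for this example.

```shell
#!/bin/sh
# hooks_only reads tab-separated "NAME<TAB>HOOK" lines and prints only rows
# whose HOOK column is non-empty, i.e. resources that carry the annotation.
hooks_only() {
  awk -F '\t' '$2 != "" { printf "%s\t%s\n", $1, $2 }'
}

# list_hook_jobs NAMESPACE
# Prints the name and helm.sh/hook annotation of every hook Job in NAMESPACE.
list_hook_jobs() {
  kubectl get jobs --namespace "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.helm\.sh/hook}{"\n"}{end}' \
    | hooks_only
}
```

Calling `list_hook_jobs default` would then list only the Jobs that Helm created as hooks in the `default` namespace.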

5.6.2 Hook Debugging Template

# templates/hooks/debug-hook.yaml
{{- if .Values.debug.hooks.enabled }}
apiVersion: v1
kind: Pod
metadata:
  name: "{{ include "myapp.fullname" . }}-debug-hook"
  annotations:
    "helm.sh/hook": {{ .Values.debug.hooks.type | default "post-install" }}
    "helm.sh/hook-weight": "100"
    "helm.sh/hook-delete-policy": "hook-succeeded"
spec:
  restartPolicy: Never
  containers:
  - name: debug
    # kubectl is required below: use an image that ships it and a
    # ServiceAccount allowed to list these resources
    image: bitnami/kubectl:latest
    command:
    - /bin/sh
    - -c
    - |
      echo "=== Hook Debug Information ==="
      echo "Release Name: {{ .Release.Name }}"
      echo "Release Namespace: {{ .Release.Namespace }}"
      echo "Release Revision: {{ .Release.Revision }}"
      echo "Chart Name: {{ .Chart.Name }}"
      echo "Chart Version: {{ .Chart.Version }}"
      
      echo ""
      echo "=== Environment Variables ==="
      env | sort
      
      echo ""
      echo "=== Kubernetes Resources ==="
      kubectl get all -l "app.kubernetes.io/instance={{ .Release.Name }}" \
        --namespace={{ .Release.Namespace }}
      
      echo ""
      echo "=== Hook Resources ==="
      # "helm.sh/hook" is an annotation, not a label, so list all Jobs and Pods
      kubectl get jobs,pods \
        --namespace={{ .Release.Namespace }}
      
      echo ""
      echo "=== Values Debug ==="
      echo "App Name: {{ .Values.app.name }}"
      echo "Image: {{ .Values.image.repository }}:{{ .Values.image.tag }}"
      echo "Replica Count: {{ .Values.replicaCount }}"
      
      echo ""
      echo "=== Debug completed ==="
{{- end }}

5.6.3 Hook Monitoring Configuration

# templates/hooks/monitoring-hook.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ include "myapp.fullname" . }}-monitoring-setup"
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-weight": "15"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
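      - name: monitoring-setup
        # kubectl is required below, so use an image that ships it
        image: bitnami/kubectl:latest
        command:
        - /bin/sh
        - -c
        - |
          echo "Setting up monitoring for hooks..."
          
          # Create a Prometheus alerting rule; the quoted 'EOF' stops the shell
          # from expanding $labels inside the here-document
          cat <<'EOF' | kubectl apply -f -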
          apiVersion: monitoring.coreos.com/v1
          kind: PrometheusRule
          metadata:
            name: {{ include "myapp.fullname" . }}-hook-monitoring
            namespace: {{ .Release.Namespace }}
          spec:
            groups:
            - name: helm-hooks
              rules:
              - alert: HelmHookFailed
                expr: kube_job_status_failed{job_name=~".*-hook.*"} > 0
                for: 0m
                labels:
                  severity: warning
                annotations:
                  summary: "Helm hook job failed"
                  description: "Hook job {{ "{{" }} $labels.job_name {{ "}}" }} has failed"
          EOF
          
          echo "Monitoring setup completed"

5.7 Hands-On Exercises

5.7.1 Exercise 1: Database Migration Hook

Build a complete database migration system:

# 1. Create the chart
helm create webapp-with-migration
cd webapp-with-migration

# 2. Create the migration hook
mkdir -p templates/hooks
vim templates/hooks/db-migration.yaml

# 3. Create the ConfigMap that holds the migration scripts
vim templates/migration-configmap.yaml

# 4. Configure values.yaml
vim values.yaml

# 5. Render and dry-run the chart to check the hook
helm template . --debug
helm install webapp . --dry-run

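5.7.2 Exercise 2: Multi-Stage Deployment Hooks

Implement a complete flow covering validation, backup, deployment, and testing. Weights only order hooks that fire on the same event, so stages 1-2 are pre-install/pre-upgrade hooks, stage 3 is Helm's own resource rollout, and stages 4-6 are post-install/post-upgrade hooks:

# Stage 1: pre-checks (weight -10)
# Stage 2: backup (weight -5)
# Stage 3: deployment (Helm applies the chart resources; not a hook)
# Stage 4: migration (weight 5)
# Stage 5: tests (weight 10)
# Stage 6: notification (weight 15)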

5.7.3 Exercise 3: Error Recovery System

Build a hook system that can recover from errors automatically:

# 1. Create the monitoring hook
vim templates/hooks/health-monitor.yaml

# 2. Create the recovery hook
vim templates/hooks/auto-recovery.yaml

# 3. Create the notification hook
vim templates/hooks/notification.yaml

# 4. Test an error scenario
helm install webapp . --set app.simulateError=true

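5.8 Troubleshooting

5.8.1 Common Hook Problems

  1. Hook execution failure

# Check hook status ("helm.sh/hook" is an annotation, so it cannot be used as a label selector)
kubectl get jobs

# Find the cause of the failure
kubectl describe job myapp-pre-install
kubectl logs job/myapp-pre-install

# Rerun the hook manually
# (helm upgrade fires only the upgrade hooks; install hooks run again only on a fresh helm install)
kubectl delete job myapp-pre-install
helm upgrade myapp . --force

  2. Hook timeout

# Inspect the hook configuration
kubectl get job myapp-pre-install -o yaml

# Increase the timeout
# Add to the Job spec:
# activeDeadlineSeconds: 1800  # 30 minutes
# Helm itself also stops waiting after --timeout (default 5m0s)

  3. Hook weight problems

# Check the order in which hooks ran
kubectl get jobs --sort-by=.metadata.creationTimestamp

# Check the weight configuration
kubectl get jobs -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.helm\.sh/hook-weight}{"\n"}{end}'

5.8.2 Debugging Tips
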
# Enable hook debugging
helm install myapp . --debug --dry-run

# Render a single hook template (--show-only takes a template file path)
helm template myapp . --show-only templates/hooks/db-migration.yaml

# Create a hook manually for testing
helm template myapp . --show-only templates/hooks/pre-install.yaml | kubectl apply -f -

# Watch hook execution
watch kubectl get jobs,pods

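5.8.3 Performance Tuning

# Tune hook performance
spec:
  parallelism: 1          # number of Pods running in parallel
  completions: 1          # required successful completions
  backoffLimit: 3         # retry limit
  activeDeadlineSeconds: 300  # overall deadline in seconds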
  
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: hook-task
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi

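5.9 Chapter Summary

Key Concepts

  1. Helm Hooks: Kubernetes resources executed at specific points in a chart's lifecycle
  2. Hook types: pre-install, post-install, pre-upgrade, post-upgrade, pre-delete, post-delete, pre-rollback, post-rollback, test
  3. Hook weights: control the execution order of hooks of the same type
  4. Deletion policies: control when hook resources are cleaned up
  5. Error handling: retries, timeouts, and error recovery

Technical Takeaways

  • Hooks are identified and configured through specific annotations
  • Lower weights run first
  • Several deletion policies are available to manage the resource lifecycle
  • Hooks can drive complex deployment flows
  • Hooks need sound error handling and monitoring

Best Practices

  1. Plan weights carefully: make sure hooks run in the right order
  2. Error handling: implement appropriate retry and recovery mechanisms
  3. Resource cleanup: choose a deletion policy that avoids leaking resources
  4. Monitoring and alerting: track hook execution status and performance
  5. Testing: test hook behavior thoroughly across scenarios

Next Chapter

The next chapter, "Testing and Validation", looks at how to verify a chart's correctness with Helm's testing features, covering unit tests, integration tests, and end-to-end tests, and how to build a complete chart testing workflow.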