学习目标

通过本章学习,您将能够:

  • 理解 Docker Swarm 存储架构和类型
  • 掌握卷(Volume)和绑定挂载的使用
  • 学会配置和管理存储驱动
  • 了解数据持久化和备份策略
  • 掌握存储性能优化和故障排除

1. 存储架构概述

1.1 存储类型

Docker 存储类型

# Docker 支持的存储类型:

# 1. 卷(Volumes)
# - Docker 管理的存储
# - 独立于容器生命周期
# - 支持驱动插件
# - 可在容器间共享

# 2. 绑定挂载(Bind Mounts)
# - 主机文件系统路径
# - 直接映射到容器
# - 依赖主机文件系统
# - 读写性能与主机文件系统一致

# 3. tmpfs 挂载
# - 内存中的临时文件系统
# - 容器停止时数据丢失
# - 适用于临时数据
# - 高性能读写

# 4. 命名管道(Windows)
# - Windows 容器专用
# - 进程间通信
# - 类似 Unix 套接字

存储驱动架构

# 存储驱动层次结构:

┌─────────────────────────────────────┐
│           Container Layer           │  ← 可写层
├─────────────────────────────────────┤
│           Image Layer N             │  ← 只读层
├─────────────────────────────────────┤
│           Image Layer N-1           │  ← 只读层
├─────────────────────────────────────┤
│              ...                    │
├─────────────────────────────────────┤
│           Image Layer 1             │  ← 只读层
├─────────────────────────────────────┤
│           Base Layer                │  ← 只读层
└─────────────────────────────────────┘

# 常用存储驱动:
# - overlay2:当前默认且推荐的存储驱动
# - aufs:较老的联合文件系统(新版本 Docker 已移除)
# - devicemapper:基于设备映射(已弃用)
# - btrfs:B-tree 文件系统
# - zfs:ZFS 文件系统

1.2 Swarm 存储特性

集群存储挑战

# Swarm 集群存储面临的挑战:

# 1. 节点间数据共享
# - 服务可能在不同节点运行
# - 需要共享存储解决方案
# - 数据一致性保证

# 2. 数据持久化
# - 容器重启数据保持
# - 节点故障数据恢复
# - 服务迁移数据跟随

# 3. 性能和可扩展性
# - 存储 I/O 性能
# - 并发访问控制
# - 存储容量扩展

# 4. 数据安全
# - 数据加密
# - 访问权限控制
# - 备份和恢复

存储解决方案

# Swarm 存储解决方案:

# 1. 本地存储
# - 节点本地卷
# - 绑定挂载
# - 适用于单节点服务

# 2. 共享存储
# - NFS 网络文件系统
# - GlusterFS 分布式文件系统
# - Ceph 分布式存储
# - 云存储服务(EBS、EFS 等)

# 3. 存储插件
# - Docker 卷插件
# - CSI(Container Storage Interface)
# - 第三方存储解决方案

2. 卷管理

2.1 基本卷操作

创建和管理卷

# 创建卷
docker volume create my-volume

# 创建带驱动的卷
docker volume create --driver local my-local-volume

# 创建带选项的卷
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.100,rw \
  --opt device=:/path/to/dir \
  nfs-volume

# 查看卷列表
docker volume ls

# 查看卷详情
docker volume inspect my-volume

# 删除卷
docker volume rm my-volume

# 清理未使用的卷
docker volume prune
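在脚本里批量创建卷之前,可以先本地校验卷名。Docker 对本地名称的限制大致是:以字母或数字开头,其余字符允许字母、数字、下划线、点和连字符(具体以所用 Docker 版本的报错为准)。下面是一个示意的前置检查函数:

```shell
#!/bin/sh
# is_valid_volume_name:在调用 docker volume create 之前做卷名预检查
# 校验规则按 Docker 的常见限制书写,仅做前置拦截,最终以 Docker 的实际校验为准
is_valid_volume_name() {
    printf '%s' "$1" | grep -Eq '^[a-zA-Z0-9][a-zA-Z0-9_.-]*$'
}

is_valid_volume_name "my-volume"  && echo "ok: my-volume"
is_valid_volume_name "-bad-name"  || echo "rejected: -bad-name"
```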

卷配置选项

# 本地卷配置
docker volume create \
  --driver local \
  --opt type=ext4 \
  --opt device=/dev/sdb1 \
  block-volume

# tmpfs 卷配置(local 驱动需要 device 和 o 选项,大小通过 o=size 指定)
docker volume create \
  --driver local \
  --opt type=tmpfs \
  --opt device=tmpfs \
  --opt o=size=100m \
  temp-volume

# 绑定挂载卷
docker volume create \
  --driver local \
  --opt type=none \
  --opt o=bind \
  --opt device=/host/path \
  bind-volume

# 查看卷挂载点
docker volume inspect my-volume --format '{{.Mountpoint}}'
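这类多行 --opt 命令手敲时容易出错,可以先拼出完整命令字符串、检查无误后再执行。下面是一个只打印不执行的示意函数(nfs_volume_cmd 为本文自拟,其中的地址和路径均为示例值):

```shell
#!/bin/sh
# nfs_volume_cmd:根据变量拼出 NFS 卷的创建命令(只打印不执行,便于先检查再运行)
nfs_volume_cmd() {
    server=$1; export_path=$2; name=$3
    printf 'docker volume create --driver local --opt type=nfs --opt o=addr=%s,rw --opt device=:%s %s\n' \
        "$server" "$export_path" "$name"
}

# 输出完整的 docker volume create 命令,确认后可用 eval "$(nfs_volume_cmd ...)" 执行
nfs_volume_cmd 192.168.1.100 /exports/data nfs-data
```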

2.2 服务卷挂载

基本卷挂载

# 创建带卷的服务
docker service create --name web \
  --mount type=volume,source=web-data,target=/var/www/html \
  nginx

# 多个卷挂载
docker service create --name app \
  --mount type=volume,source=app-data,target=/app/data \
  --mount type=volume,source=app-logs,target=/app/logs \
  my-app:latest

# 只读卷挂载
docker service create --name readonly-app \
  --mount type=volume,source=config-data,target=/app/config,readonly \
  my-app:latest

# 绑定挂载
docker service create --name bind-app \
  --mount type=bind,source=/host/data,target=/app/data \
  my-app:latest

高级挂载选项

# 带标签的卷挂载
docker service create --name labeled-app \
  --mount type=volume,source=app-data,target=/app/data,volume-label=app=myapp \
  my-app:latest

# 卷驱动选项
docker service create --name nfs-app \
  --mount type=volume,source=nfs-data,target=/app/data,volume-driver=local,volume-opt=type=nfs,volume-opt=o=addr=192.168.1.100,volume-opt=device=:/exports/data \
  my-app:latest

# tmpfs 挂载
docker service create --name temp-app \
  --mount type=tmpfs,target=/tmp,tmpfs-size=100m \
  my-app:latest

# 多种挂载类型组合
docker service create --name complex-app \
  --mount type=volume,source=persistent-data,target=/app/data \
  --mount type=bind,source=/host/config,target=/app/config,readonly \
  --mount type=tmpfs,target=/tmp,tmpfs-size=50m \
  my-app:latest
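--mount 的取值是逗号分隔的 key=value 串,手写长串时容易漏逗号或写错键名。下面用一个纯字符串拼接的示意函数(build_mount 为本文自拟)生成这种串,便于在部署脚本中复用;注意 tmpfs 挂载没有 source,不适用此函数:

```shell
#!/bin/sh
# build_mount:拼接 --mount 的参数串(纯字符串处理,不调用 docker)
# 第 4 个参数可选,例如传 readonly 追加只读标记,传 volume-label=k=v 追加卷标签
build_mount() {
    mtype=$1; src=$2; target=$3; extra=$4
    flag="type=${mtype},source=${src},target=${target}"
    [ -n "$extra" ] && flag="${flag},${extra}"
    printf '%s\n' "$flag"
}

build_mount volume web-data /var/www/html
# → type=volume,source=web-data,target=/var/www/html
build_mount bind /host/config /app/config readonly
# → type=bind,source=/host/config,target=/app/config,readonly
```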

2.3 卷管理脚本

卷管理工具

#!/bin/bash
# volume-manager.sh

# 卷管理配置
VOLUME_PREFIX="myapp"
BACKUP_DIR="/backup/volumes"
NFS_SERVER="192.168.1.100"
NFS_PATH="/exports"

# 创建应用卷集
create_app_volumes() {
    local app_name=$1
    
    echo "Creating volumes for application: $app_name"
    
    # 数据卷
    docker volume create ${VOLUME_PREFIX}-${app_name}-data
    
    # 日志卷
    docker volume create ${VOLUME_PREFIX}-${app_name}-logs
    
    # 配置卷(NFS)
    docker volume create \
        --driver local \
        --opt type=nfs \
        --opt o=addr=${NFS_SERVER},rw \
        --opt device=:${NFS_PATH}/${app_name}/config \
        ${VOLUME_PREFIX}-${app_name}-config
    
    # 缓存卷(tmpfs,local 驱动通过 o=size 指定大小)
    docker volume create \
        --driver local \
        --opt type=tmpfs \
        --opt device=tmpfs \
        --opt o=size=256m \
        ${VOLUME_PREFIX}-${app_name}-cache
    
    echo "Volumes created for $app_name:"
    docker volume ls --filter name=${VOLUME_PREFIX}-${app_name}
}

# 备份卷数据
backup_volume() {
    local volume_name=$1
    local backup_name="${volume_name}_$(date +%Y%m%d_%H%M%S).tar.gz"
    
    echo "Backing up volume: $volume_name"
    
    # 创建备份目录
    mkdir -p $BACKUP_DIR
    
    # 使用临时容器备份卷数据
    docker run --rm \
        --volume $volume_name:/data \
        --volume $BACKUP_DIR:/backup \
        alpine tar czf /backup/$backup_name -C /data .
    
    if [ $? -eq 0 ]; then
        echo "✓ Backup completed: $BACKUP_DIR/$backup_name"
    else
        echo "✗ Backup failed for volume: $volume_name"
    fi
}

# 恢复卷数据
restore_volume() {
    local volume_name=$1
    local backup_file=$2
    
    if [ ! -f "$backup_file" ]; then
        echo "✗ Backup file not found: $backup_file"
        return 1
    fi
    
    echo "Restoring volume: $volume_name from $backup_file"
    
    # 创建卷(如果不存在)
    docker volume create $volume_name
    
    # 使用临时容器恢复数据
    docker run --rm \
        --volume $volume_name:/data \
        --volume $(dirname $backup_file):/backup \
        alpine tar xzf /backup/$(basename $backup_file) -C /data
    
    if [ $? -eq 0 ]; then
        echo "✓ Restore completed for volume: $volume_name"
    else
        echo "✗ Restore failed for volume: $volume_name"
    fi
}

# 清理未使用的卷
cleanup_volumes() {
    echo "Cleaning up unused volumes..."
    
    # 显示将要删除的卷
    echo "Unused volumes:"
    docker volume ls --filter dangling=true
    
    # 确认删除
    read -p "Do you want to remove these volumes? (y/N): " confirm
    if [ "$confirm" = "y" ] || [ "$confirm" = "Y" ]; then
        docker volume prune -f
        echo "✓ Unused volumes removed"
    else
        echo "Volume cleanup cancelled"
    fi
}

# 卷使用情况报告
volume_usage_report() {
    echo "=== Volume Usage Report ==="
    
    # 卷列表和大小
    echo "\nVolume List:"
    printf "%-30s %-15s %-20s\n" "NAME" "DRIVER" "SIZE"
    printf "%-30s %-15s %-20s\n" "----" "------" "----"
    
    for volume in $(docker volume ls --format '{{.Name}}'); do
        driver=$(docker volume inspect $volume --format '{{.Driver}}')
        mountpoint=$(docker volume inspect $volume --format '{{.Mountpoint}}')
        
        if [ -d "$mountpoint" ]; then
            size=$(du -sh "$mountpoint" 2>/dev/null | cut -f1)
        else
            size="N/A"
        fi
        
        printf "%-30s %-15s %-20s\n" "$volume" "$driver" "$size"
    done
    
    # 卷使用统计
    echo -e "\nVolume Statistics:"
    total_volumes=$(docker volume ls --format '{{.Name}}' | wc -l)
    unused_volumes=$(docker volume ls --filter dangling=true --format '{{.Name}}' | wc -l)
    used_volumes=$((total_volumes - unused_volumes))  # --format 输出不含标题行,无需再减
    
    echo "  Total volumes: $total_volumes"
    echo "  Used volumes: $used_volumes"
    echo "  Unused volumes: $unused_volumes"
}

# 主菜单
case "$1" in
    "create")
        if [ -n "$2" ]; then
            create_app_volumes $2
        else
            echo "Usage: $0 create <app-name>"
        fi
        ;;
    "backup")
        if [ -n "$2" ]; then
            backup_volume $2
        else
            echo "Usage: $0 backup <volume-name>"
        fi
        ;;
    "restore")
        if [ -n "$2" ] && [ -n "$3" ]; then
            restore_volume $2 $3
        else
            echo "Usage: $0 restore <volume-name> <backup-file>"
        fi
        ;;
    "cleanup")
        cleanup_volumes
        ;;
    "report")
        volume_usage_report
        ;;
    *)
        echo "Usage: $0 {create|backup|restore|cleanup|report}"
        echo "  create <app-name>           - Create volume set for application"
        echo "  backup <volume-name>        - Backup volume data"
        echo "  restore <volume-name> <file> - Restore volume from backup"
        echo "  cleanup                     - Remove unused volumes"
        echo "  report                      - Show volume usage report"
        ;;
esac

3. 共享存储

3.1 NFS 存储

NFS 服务器配置

# 在 NFS 服务器上配置
# /etc/exports
/exports/data    *(rw,sync,no_subtree_check,no_root_squash)
/exports/logs    *(rw,sync,no_subtree_check,no_root_squash)
/exports/config  *(ro,sync,no_subtree_check,no_root_squash)

# 重启 NFS 服务
sudo systemctl restart nfs-server
sudo exportfs -ra

# 验证 NFS 导出
sudo exportfs -v

NFS 卷创建

# 创建 NFS 卷
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.100,rw,nfsvers=4 \
  --opt device=:/exports/data \
  nfs-data

# 创建只读 NFS 卷
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.100,ro,nfsvers=4 \
  --opt device=:/exports/config \
  nfs-config

# 使用 NFS 卷的服务
docker service create --name web \
  --mount type=volume,source=nfs-data,target=/var/www/html \
  --mount type=volume,source=nfs-config,target=/etc/nginx,readonly \
  --replicas 3 \
  nginx

# 验证 NFS 挂载
docker exec $(docker ps --filter name=web -q | head -1) mount | grep nfs

NFS 性能优化

# 高性能 NFS 配置
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.100,rw,nfsvers=4.1,proto=tcp,port=2049,timeo=14,intr,rsize=32768,wsize=32768 \
  --opt device=:/exports/data \
  nfs-optimized

# NFS 挂载选项说明:
# nfsvers=4.1    - 使用 NFSv4.1 协议
# proto=tcp      - 使用 TCP 协议
# timeo=14       - 超时时间(1.4秒)
# intr           - 允许中断(2.6.25 之后的内核已忽略此选项)
# rsize=32768    - 读取块大小
# wsize=32768    - 写入块大小
# hard           - 硬挂载(默认)
# soft           - 软挂载(可选)

3.2 GlusterFS 存储

GlusterFS 集群配置

# 在每个 GlusterFS 节点上
# 1. 安装 GlusterFS
sudo apt-get install glusterfs-server

# 2. 启动服务
sudo systemctl start glusterd
sudo systemctl enable glusterd

# 3. 配置集群(在主节点执行)
sudo gluster peer probe gluster-node2
sudo gluster peer probe gluster-node3

# 4. 查看集群状态
sudo gluster peer status

# 5. 创建卷
sudo gluster volume create gv0 replica 3 \
  gluster-node1:/data/brick1/gv0 \
  gluster-node2:/data/brick1/gv0 \
  gluster-node3:/data/brick1/gv0

# 6. 启动卷
sudo gluster volume start gv0

# 7. 查看卷信息
sudo gluster volume info gv0

GlusterFS Docker 集成

# 安装 GlusterFS 卷插件
docker plugin install --grant-all-permissions trajano/glusterfs-volume-plugin

# 创建 GlusterFS 卷
docker volume create \
  --driver trajano/glusterfs-volume-plugin \
  --opt glusterfs-server=gluster-node1 \
  --opt glusterfs-volname=gv0 \
  gluster-volume

# 或使用本地驱动(要求每个节点安装 GlusterFS 客户端,local 驱动实际执行 mount -t glusterfs)
docker volume create \
  --driver local \
  --opt type=glusterfs \
  --opt o=addr=gluster-node1 \
  --opt device=:/gv0 \
  gluster-local

# 使用 GlusterFS 卷的服务
docker service create --name distributed-app \
  --mount type=volume,source=gluster-volume,target=/app/data \
  --replicas 5 \
  my-app:latest
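无论走插件还是 local 驱动,挂载动作都发生在运行任务的那个节点上,因此每个可能被调度到的节点都要装好对应的挂载辅助程序(mount.glusterfs 来自 GlusterFS 客户端包,mount.nfs 来自 nfs-common / nfs-utils)。下面是一个简单的节点预检示意:

```shell
#!/bin/sh
# require_cmd:检查挂载辅助程序是否存在
# 在每个 Swarm 节点上运行,避免任务调度到缺少客户端的节点后挂载失败
require_cmd() {
    if command -v "$1" > /dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

require_cmd mount.glusterfs || true   # GlusterFS 客户端(缺失时仅提示)
require_cmd mount.nfs || true         # NFS 客户端(缺失时仅提示)
```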

3.3 云存储集成

AWS EFS 集成

# 安装 EFS 工具
sudo apt-get install amazon-efs-utils

# 创建 EFS 卷
docker volume create \
  --driver local \
  --opt type=nfs4 \
  --opt o=addr=fs-12345678.efs.us-west-2.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,intr,timeo=600 \
  --opt device=:/ \
  efs-volume

# 使用 EFS 卷
docker service create --name efs-app \
  --mount type=volume,source=efs-volume,target=/app/data \
  --constraint 'node.platform.os == linux' \
  my-app:latest

Azure Files 集成

# 创建 Azure Files 卷
docker volume create \
  --driver local \
  --opt type=cifs \
  --opt o=addr=mystorageaccount.file.core.windows.net,username=mystorageaccount,password=mykey,uid=1000,gid=1000,iocharset=utf8,file_mode=0777,dir_mode=0777 \
  --opt device=//mystorageaccount.file.core.windows.net/myshare \
  azure-files

# 使用 Azure Files 卷
docker service create --name azure-app \
  --mount type=volume,source=azure-files,target=/app/data \
  my-app:latest

4. 数据持久化策略

4.1 数据分类和策略

数据分类

# 数据分类策略:

# 1. 持久化数据(Persistent Data)
# - 数据库文件
# - 用户上传文件
# - 应用状态数据
# - 需要跨容器生命周期保持

# 2. 配置数据(Configuration Data)
# - 应用配置文件
# - 证书和密钥
# - 环境配置
# - 通常只读,偶尔更新

# 3. 缓存数据(Cache Data)
# - 临时计算结果
# - 会话数据
# - 可重新生成的数据
# - 性能优化用途

# 4. 日志数据(Log Data)
# - 应用日志
# - 访问日志
# - 审计日志
# - 需要长期保存和分析

存储策略设计

#!/bin/bash
# storage-strategy.sh

# 存储策略配置
declare -A STORAGE_STRATEGY=(
    ["database"]="persistent,replicated,backup"
    ["uploads"]="persistent,shared,backup"
    ["config"]="persistent,readonly,versioned"
    ["cache"]="temporary,local,fast"
    ["logs"]="persistent,centralized,rotated"
    ["temp"]="temporary,memory,fast"
)

# 根据策略创建存储
create_storage_by_strategy() {
    local data_type=$1
    local app_name=$2
    local strategy=${STORAGE_STRATEGY[$data_type]}
    
    echo "Creating storage for $data_type with strategy: $strategy"
    
    case $data_type in
        "database")
            # 持久化、复制、备份
            docker volume create \
                --driver local \
                --label backup=daily \
                --label replicated=true \
                ${app_name}-database
            ;;
        "uploads")
            # 持久化、共享、备份
            docker volume create \
                --driver local \
                --opt type=nfs \
                --opt o=addr=nfs-server,rw \
                --opt device=:/exports/${app_name}/uploads \
                --label backup=hourly \
                ${app_name}-uploads
            ;;
        "config")
            # 持久化、只读、版本控制
            docker volume create \
                --driver local \
                --opt type=nfs \
                --opt o=addr=nfs-server,ro \
                --opt device=:/exports/${app_name}/config \
                --label versioned=true \
                ${app_name}-config
            ;;
        "cache")
            # 临时、本地、快速
            docker volume create \
                --driver local \
                --opt type=tmpfs \
                --opt tmpfs-size=512m \
                --label temporary=true \
                ${app_name}-cache
            ;;
        "logs")
            # 持久化、集中化、轮转
            docker volume create \
                --driver local \
                --opt type=nfs \
                --opt o=addr=log-server,rw \
                --opt device=:/exports/logs/${app_name} \
                --label centralized=true \
                --label rotation=daily \
                ${app_name}-logs
            ;;
        "temp")
            # 临时、内存、快速
            # 使用 tmpfs 挂载,不创建卷
            echo "Temp storage will use tmpfs mount"
            ;;
        *)
            echo "Unknown data type: $data_type"
            return 1
            ;;
    esac
}

# 部署带存储策略的应用
deploy_app_with_storage() {
    local app_name=$1
    
    echo "Deploying $app_name with storage strategy..."
    
    # 创建各类存储
    for data_type in "${!STORAGE_STRATEGY[@]}"; do
        create_storage_by_strategy $data_type $app_name
    done
    
    # 部署服务
    docker service create --name $app_name \
        --mount type=volume,source=${app_name}-database,target=/app/data \
        --mount type=volume,source=${app_name}-uploads,target=/app/uploads \
        --mount type=volume,source=${app_name}-config,target=/app/config,readonly \
        --mount type=volume,source=${app_name}-cache,target=/app/cache \
        --mount type=volume,source=${app_name}-logs,target=/app/logs \
        --mount type=tmpfs,target=/tmp,tmpfs-size=100m \
        my-app:latest
    
    echo "Application $app_name deployed with storage strategy"
}

# 使用示例
if [ -n "$1" ]; then
    deploy_app_with_storage $1
else
    echo "Usage: $0 <app-name>"
fi

4.2 数据备份和恢复

自动备份系统

#!/bin/bash
# backup-system.sh

BACKUP_ROOT="/backup"
RETENTION_DAYS=30
LOG_FILE="/var/log/docker-backup.log"

# 日志函数
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a $LOG_FILE
}

# 备份单个卷
backup_volume() {
    local volume_name=$1
    local backup_type=$2  # full, incremental
    local backup_date=$(date +%Y%m%d_%H%M%S)
    local backup_dir="$BACKUP_ROOT/$volume_name"
    local backup_file="${backup_dir}/${volume_name}_${backup_type}_${backup_date}.tar.gz"
    
    log "Starting $backup_type backup for volume: $volume_name"
    
    # 创建备份目录
    mkdir -p $backup_dir
    
    # 检查卷是否存在
    if ! docker volume inspect $volume_name > /dev/null 2>&1; then
        log "ERROR: Volume $volume_name does not exist"
        return 1
    fi
    
    # 执行备份
    if [ "$backup_type" = "full" ]; then
        # 全量备份
        docker run --rm \
            --volume $volume_name:/data:ro \
            --volume $backup_dir:/backup \
            alpine tar czf /backup/$(basename $backup_file) -C /data .
    else
        # 增量备份(基于最新的全量备份)
        local last_full=$(ls -t ${backup_dir}/*_full_*.tar.gz 2>/dev/null | head -1)
        if [ -z "$last_full" ]; then
            log "No full backup found, performing full backup instead"
            backup_volume $volume_name "full"
            return $?
        fi
        
        # 创建增量备份(简化实现:仅打包比上次备份时间戳新的文件)
        # 注意:find 输出绝对路径,tar 归档内的路径会带 data/ 前缀,
        # 恢复此类增量包时应解压到 / 而非卷根目录
        docker run --rm \
            --volume $volume_name:/data:ro \
            --volume $backup_dir:/backup \
            alpine sh -c "find /data -type f -newer /backup/.last_backup 2>/dev/null | tar czf /backup/$(basename $backup_file) -T - 2>/dev/null || tar czf /backup/$(basename $backup_file) -C /data ."
    fi
    
    if [ $? -eq 0 ]; then
        log "SUCCESS: Backup completed - $backup_file"
        # 更新备份时间戳
        touch "${backup_dir}/.last_backup"
        # 计算备份大小
        local size=$(du -h $backup_file | cut -f1)
        log "Backup size: $size"
        return 0
    else
        log "ERROR: Backup failed for volume: $volume_name"
        return 1
    fi
}

# 恢复卷
restore_volume() {
    local volume_name=$1
    local backup_file=$2
    local restore_mode=${3:-replace}  # replace, merge
    
    log "Starting restore for volume: $volume_name from $backup_file"
    
    if [ ! -f "$backup_file" ]; then
        log "ERROR: Backup file not found: $backup_file"
        return 1
    fi
    
    # 创建卷(如果不存在)
    docker volume create $volume_name > /dev/null 2>&1
    
    # 执行恢复
    if [ "$restore_mode" = "replace" ]; then
        # 替换模式:清空卷后恢复
        docker run --rm \
            --volume $volume_name:/data \
            --volume $(dirname $backup_file):/backup \
            alpine sh -c "rm -rf /data/* /data/.[^.]* 2>/dev/null; tar xzf /backup/$(basename $backup_file) -C /data"
    else
        # 合并模式:直接解压到卷中
        docker run --rm \
            --volume $volume_name:/data \
            --volume $(dirname $backup_file):/backup \
            alpine tar xzf /backup/$(basename $backup_file) -C /data
    fi
    
    if [ $? -eq 0 ]; then
        log "SUCCESS: Restore completed for volume: $volume_name"
        return 0
    else
        log "ERROR: Restore failed for volume: $volume_name"
        return 1
    fi
}

# 清理过期备份
cleanup_old_backups() {
    log "Starting cleanup of backups older than $RETENTION_DAYS days"
    
    find $BACKUP_ROOT -name "*.tar.gz" -mtime +$RETENTION_DAYS -type f | while read backup_file; do
        log "Removing old backup: $backup_file"
        rm -f "$backup_file"
    done
    
    # 清理空目录
    find $BACKUP_ROOT -type d -empty -delete
    
    log "Backup cleanup completed"
}

# 备份所有标记的卷
backup_all_volumes() {
    local backup_type=${1:-full}
    
    log "Starting $backup_type backup for all marked volumes"
    
    # 查找需要备份的卷(带有 backup 标签)
    for volume in $(docker volume ls --filter label=backup --format '{{.Name}}'); do
        backup_volume $volume $backup_type
    done
    
    # 清理过期备份
    cleanup_old_backups
    
    log "All volume backups completed"
}

# 备份状态报告
backup_status_report() {
    echo "=== Backup Status Report ==="
    echo "Backup Root: $BACKUP_ROOT"
    echo "Retention: $RETENTION_DAYS days"
    echo ""
    
    # 统计备份信息
    local total_backups=$(find $BACKUP_ROOT -name "*.tar.gz" | wc -l)
    local total_size=$(du -sh $BACKUP_ROOT 2>/dev/null | cut -f1)
    
    echo "Total Backups: $total_backups"
    echo "Total Size: $total_size"
    echo ""
    
    # 按卷显示备份信息
    echo "Backup Details:"
    printf "%-20s %-10s %-15s %-10s\n" "VOLUME" "COUNT" "LATEST" "SIZE"
    printf "%-20s %-10s %-15s %-10s\n" "------" "-----" "------" "----"
    
    for volume_dir in $(find $BACKUP_ROOT -maxdepth 1 -type d | tail -n +2); do
        local volume_name=$(basename $volume_dir)
        local backup_count=$(ls $volume_dir/*.tar.gz 2>/dev/null | wc -l)
        local latest_backup=$(ls -t $volume_dir/*.tar.gz 2>/dev/null | head -1)
        local latest_date="N/A"
        local volume_size="N/A"
        
        if [ -n "$latest_backup" ]; then
            latest_date=$(date -r "$latest_backup" +%m/%d)
            volume_size=$(du -sh $volume_dir | cut -f1)
        fi
        
        printf "%-20s %-10s %-15s %-10s\n" "$volume_name" "$backup_count" "$latest_date" "$volume_size"
    done
}

# 主菜单
case "$1" in
    "backup")
        if [ -n "$2" ]; then
            backup_volume $2 ${3:-full}
        else
            backup_all_volumes ${2:-full}
        fi
        ;;
    "restore")
        if [ -n "$2" ] && [ -n "$3" ]; then
            restore_volume $2 $3 $4
        else
            echo "Usage: $0 restore <volume-name> <backup-file> [replace|merge]"
        fi
        ;;
    "cleanup")
        cleanup_old_backups
        ;;
    "status")
        backup_status_report
        ;;
    "schedule")
        # 设置定时备份(需要 cron)
        echo "Setting up scheduled backups..."
        echo "0 2 * * * $0 backup full" | crontab -
        echo "0 */6 * * * $0 backup incremental" | crontab -
        echo "0 3 * * 0 $0 cleanup" | crontab -
        echo "Scheduled backups configured"
        ;;
    *)
        echo "Usage: $0 {backup|restore|cleanup|status|schedule}"
        echo "  backup [volume-name] [full|incremental] - Backup volume(s)"
        echo "  restore <volume> <backup-file> [mode]   - Restore volume"
        echo "  cleanup                                 - Remove old backups"
        echo "  status                                  - Show backup status"
        echo "  schedule                                - Setup scheduled backups"
        ;;
esac
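上面脚本生成的备份文件名遵循 <卷名>_<类型>_<YYYYmmdd_HHMMSS>.tar.gz 的约定。由于卷名本身可能含下划线,解析时应从右往左拆。下面是与该约定配套的示意解析函数,可用于在报告或清理逻辑中还原出卷名和备份类型:

```shell
#!/bin/sh
# parse_backup_name:从备份文件名中提取卷名、备份类型和时间戳
# 文件名约定:<volume>_<full|incremental>_<YYYYmmdd_HHMMSS>.tar.gz(与上面脚本一致)
parse_backup_name() {
    base=${1%.tar.gz}                     # 去掉扩展名
    stamp=$(printf '%s' "$base" | grep -oE '[0-9]{8}_[0-9]{6}$')
    rest=${base%_"$stamp"}                # 去掉 "_<时间戳>"
    btype=${rest##*_}                     # 最后一个下划线之后是备份类型
    volume=${rest%_"$btype"}              # 剩余部分即卷名(允许含下划线)
    echo "volume=$volume type=$btype stamp=$stamp"
}

parse_backup_name my_app-data_full_20240115_020000.tar.gz
# → volume=my_app-data type=full stamp=20240115_020000
```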

5. 存储性能优化

5.1 性能监控

存储性能监控脚本

#!/bin/bash
# storage-performance-monitor.sh

# 监控配置
MONITOR_INTERVAL=5
LOG_FILE="/var/log/storage-performance.log"
ALERT_THRESHOLD_IOPS=1000
ALERT_THRESHOLD_LATENCY=100  # ms

# 日志函数
log_metric() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> $LOG_FILE
}

# 获取存储设备信息
get_storage_devices() {
    # 获取 Docker 存储设备
    docker system df --format "table {{.Type}}\t{{.TotalCount}}\t{{.Active}}\t{{.Size}}\t{{.Reclaimable}}"
    
    # 获取系统存储设备
    echo "\nSystem Storage Devices:"
    lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,FSTYPE
}

# 监控 I/O 性能
monitor_io_performance() {
    echo "=== Storage I/O Performance Monitor ==="
    echo "Press Ctrl+C to stop monitoring"
    echo ""
    
    # 表头
    printf "%-10s %-15s %-10s %-10s %-10s %-10s\n" "TIME" "DEVICE" "IOPS" "READ_MB/s" "WRITE_MB/s" "UTIL%"
    printf "%-10s %-15s %-10s %-10s %-10s %-10s\n" "----" "------" "----" "---------" "----------" "-----"
    
    while true; do
        # 使用 iostat 监控 I/O
        # 注意:iostat -x 的列顺序随 sysstat 版本不同而变化,
        # 下面 awk 中的字段序号请先用 iostat -x 的表头核对后再使用
        iostat -x 1 1 | awk '
        /^[a-z]/ && !/^avg-cpu/ && !/^Device/ {
            time = strftime("%H:%M:%S")
            device = $1
            iops = $4 + $5
            read_mb = $6 / 1024
            write_mb = $7 / 1024
            util = $10
            printf "%-10s %-15s %-10.0f %-10.2f %-10.2f %-10.1f\n", time, device, iops, read_mb, write_mb, util
        }'
        
        sleep $MONITOR_INTERVAL
    done
}

# 测试存储性能
test_storage_performance() {
    local test_path=${1:-/tmp}
    local test_size=${2:-1G}
    
    echo "=== Storage Performance Test ==="
    echo "Test Path: $test_path"
    echo "Test Size: $test_size"
    echo ""
    
    # 顺序写测试(将 test_size 换算为 1M 块数,支持 M/G 后缀)
    case $test_size in
        *G) count=$(( ${test_size%G} * 1024 )) ;;
        *M) count=${test_size%M} ;;
        *)  count=1024 ;;
    esac
    echo "Sequential Write Test:"
    dd if=/dev/zero of=$test_path/test_write bs=1M count=$count oflag=direct 2>&1 | grep -E "copied|MB/s"
    
    # 顺序读测试
    echo "\nSequential Read Test:"
    dd if=$test_path/test_write of=/dev/null bs=1M iflag=direct 2>&1 | grep -E "copied|MB/s"
    
    # 随机写测试(使用 fio 如果可用)
    if command -v fio > /dev/null; then
        echo "\nRandom Write Test (4K):"
        fio --name=random-write --ioengine=libaio --rw=randwrite --bs=4k --size=100M --numjobs=1 --iodepth=1 --runtime=10 --time_based --filename=$test_path/test_random
        
        echo "\nRandom Read Test (4K):"
        fio --name=random-read --ioengine=libaio --rw=randread --bs=4k --size=100M --numjobs=1 --iodepth=1 --runtime=10 --time_based --filename=$test_path/test_random
    else
        echo "\nfio not available, skipping random I/O tests"
    fi
    
    # 清理测试文件
    rm -f $test_path/test_write $test_path/test_random
    
    echo "\nPerformance test completed"
}

# 卷性能分析
analyze_volume_performance() {
    local volume_name=$1
    
    if [ -z "$volume_name" ]; then
        echo "Usage: analyze_volume_performance <volume-name>"
        return 1
    fi
    
    echo "=== Volume Performance Analysis: $volume_name ==="
    
    # 获取卷信息
    local mountpoint=$(docker volume inspect $volume_name --format '{{.Mountpoint}}')
    local driver=$(docker volume inspect $volume_name --format '{{.Driver}}')
    
    echo "Volume: $volume_name"
    echo "Driver: $driver"
    echo "Mountpoint: $mountpoint"
    echo ""
    
    # 检查挂载点存在性
    if [ ! -d "$mountpoint" ]; then
        echo "ERROR: Mountpoint does not exist or is not accessible"
        return 1
    fi
    
    # 文件系统信息
    echo "Filesystem Information:"
    df -h $mountpoint
    echo ""
    
    # 测试卷性能
    test_storage_performance $mountpoint 100M
    
    # 检查使用该卷的容器
    echo "\nContainers using this volume:"
    docker ps --filter volume=$volume_name --format "table {{.Names}}\t{{.Image}}\t{{.Status}}"
}

# 存储健康检查
storage_health_check() {
    echo "=== Storage Health Check ==="
    
    # 检查 Docker 存储驱动
    echo "Docker Storage Driver:"
    docker info --format '{{.Driver}}'
    
    # 检查存储空间
    echo "\nDocker Storage Usage:"
    docker system df
    
    # 检查卷状态
    echo "\nVolume Status:"
    docker volume ls --format "table {{.Name}}\t{{.Driver}}\t{{.Scope}}"
    
    # 检查存储设备健康
    echo "\nStorage Device Health:"
    if command -v smartctl > /dev/null; then
        for device in $(lsblk -d -o NAME --noheadings); do
            echo "Device: /dev/$device"
            smartctl -H /dev/$device 2>/dev/null | grep -E "SMART overall-health|PASSED|FAILED" || echo "  SMART not available"
        done
    else
        echo "smartctl not available, skipping device health check"
    fi
    
    # 检查文件系统错误
    echo "\nFilesystem Check:"
    dmesg | grep -i "error\|fail" | grep -i "ext4\|xfs\|btrfs" | tail -5
}

# 主菜单
case "$1" in
    "monitor")
        monitor_io_performance
        ;;
    "test")
        test_storage_performance $2 $3
        ;;
    "analyze")
        analyze_volume_performance $2
        ;;
    "health")
        storage_health_check
        ;;
    "devices")
        get_storage_devices
        ;;
    *)
        echo "Usage: $0 {monitor|test|analyze|health|devices}"
        echo "  monitor                    - Monitor I/O performance"
        echo "  test [path] [size]         - Test storage performance"
        echo "  analyze <volume-name>      - Analyze volume performance"
        echo "  health                     - Storage health check"
        echo "  devices                    - Show storage devices"
        ;;
esac
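脚本里用 grep 粗筛 dd 的统计输出;若想在报告中只保留吞吐量数字,可以再加一层解析。下面的示意函数取统计行的最后两个字段(假定英文语言环境下 GNU coreutils dd 的输出格式,不同版本或语言环境的格式会有差异):

```shell
#!/bin/sh
# dd_speed:从 GNU dd 的统计行中提取吞吐量(即行尾的 "数值 单位" 两个字段)
dd_speed() {
    printf '%s\n' "$1" | awk '{print $(NF-1), $NF}'
}

line='1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.0 s, 537 MB/s'
dd_speed "$line"
# → 537 MB/s
```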

5.2 性能优化配置

存储驱动优化

# Docker 存储驱动优化配置
# /etc/docker/daemon.json
# 注意:overlay2.size 需要 backing 文件系统为 xfs 且启用 pquota 挂载选项;
# overlay2.override_kernel_check 仅旧内核需要,新版本 Docker 已移除该选项
{
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.size=20G"
  ],
  "data-root": "/var/lib/docker",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}

# 重启 Docker 服务
sudo systemctl restart docker
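daemon.json 写坏会导致 Docker 直接无法启动,重启服务前最好先做一次 JSON 语法校验。下面的示意函数用 python3 标准库完成校验(假定节点上装有 python3,也可以换成 jq 等工具):

```shell
#!/bin/sh
# validate_daemon_json:修改 daemon.json 后、重启 Docker 前做 JSON 语法校验
validate_daemon_json() {
    if python3 -m json.tool "$1" > /dev/null 2>&1; then
        echo "valid: $1"
    else
        echo "invalid: $1"
        return 1
    fi
}

# 用法示意:
# validate_daemon_json /etc/docker/daemon.json && sudo systemctl restart docker
```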

文件系统优化

# XFS 文件系统优化
# 挂载选项
/dev/sdb1 /var/lib/docker xfs defaults,noatime,largeio,inode64,swalloc 0 0

# EXT4 文件系统优化
# 挂载选项(data=writeback 和 barrier=0 以牺牲崩溃一致性换取性能,生产环境慎用;nobh 已被内核移除)
/dev/sdb1 /var/lib/docker ext4 defaults,noatime,data=writeback,barrier=0 0 0

# 创建优化的文件系统
# XFS
sudo mkfs.xfs -f -i size=512 -d agcount=4 /dev/sdb1

# EXT4
sudo mkfs.ext4 -F -E stride=32,stripe-width=128 /dev/sdb1
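上面 mkfs.ext4 示例中的 stride=32、stripe-width=128 不是固定值,而是由 RAID 参数算出来的:stride = chunk 大小 ÷ 文件系统块大小,stripe-width = stride × 数据盘数量。按 4KiB 块、128KiB chunk、4 块数据盘计算,正好得到示例中的取值:

```shell
#!/bin/sh
# ext4_stripe:根据 RAID 参数计算 mkfs.ext4 的 stride/stripe-width
# 入参:chunk 大小(KiB)、数据盘数量;块大小按 ext4 常见的 4KiB 假定
ext4_stripe() {
    chunk_kb=$1; data_disks=$2; block_kb=4
    stride=$((chunk_kb / block_kb))            # 每个 chunk 包含的文件系统块数
    stripe_width=$((stride * data_disks))      # 一个完整条带跨越的块数
    echo "stride=$stride,stripe-width=$stripe_width"
}

ext4_stripe 128 4
# → stride=32,stripe-width=128(即上面 mkfs.ext4 示例使用的值)
```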

6. 实践练习

练习 1:多层存储架构

目标:为 Web 应用构建多层存储架构

# 1. 创建不同类型的存储
# 数据库存储(本地高性能)
docker volume create --driver local db-data

# 文件存储(NFS 共享)
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=nfs-server,rw \
  --opt device=:/exports/files \
  file-storage

# 缓存存储(内存,local 驱动通过 o=size 指定大小)
docker volume create \
  --driver local \
  --opt type=tmpfs \
  --opt device=tmpfs \
  --opt o=size=512m \
  cache-storage

# 2. 部署应用栈
docker service create --name database \
  --mount type=volume,source=db-data,target=/var/lib/mysql \
  --constraint 'node.labels.storage == ssd' \
  mysql:8.0

docker service create --name app \
  --mount type=volume,source=file-storage,target=/app/files \
  --mount type=volume,source=cache-storage,target=/app/cache \
  --replicas 3 \
  my-app:latest

docker service create --name web \
  --mount type=volume,source=file-storage,target=/var/www/html,readonly \
  --publish 80:80 \
  --replicas 2 \
  nginx

练习 2:备份和恢复测试

目标:实现自动化备份和恢复流程

# 1. 创建测试数据
docker volume create test-data
docker run --rm -v test-data:/data alpine sh -c 'echo "Test data $(date)" > /data/test.txt'

# 2. 执行备份
./backup-system.sh backup test-data full

# 3. 模拟数据丢失
docker run --rm -v test-data:/data alpine rm -f /data/test.txt

# 4. 恢复数据
BACKUP_FILE=$(ls -t /backup/test-data/*_full_*.tar.gz | head -1)
./backup-system.sh restore test-data $BACKUP_FILE

# 5. 验证恢复
docker run --rm -v test-data:/data alpine cat /data/test.txt

练习 3:性能基准测试

目标:比较不同存储配置的性能

# 1. 创建不同类型的卷
docker volume create local-volume
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=nfs-server \
  --opt device=:/exports/test \
  nfs-volume

# 2. 性能测试
echo "Testing local volume:"
./storage-performance-monitor.sh test /var/lib/docker/volumes/local-volume/_data

echo "Testing NFS volume:"
./storage-performance-monitor.sh test /var/lib/docker/volumes/nfs-volume/_data

# 3. 分析结果
./storage-performance-monitor.sh analyze local-volume
./storage-performance-monitor.sh analyze nfs-volume

7. 本章总结

关键要点

  1. 存储架构

    • 理解不同存储类型的特性和适用场景
    • 掌握 Docker 存储驱动和卷管理
    • 学会设计多层存储架构
  2. 卷管理

    • 创建和配置各种类型的卷
    • 服务卷挂载和管理
    • 卷生命周期管理
  3. 共享存储

    • NFS、GlusterFS 等共享存储集成
    • 云存储服务集成
    • 存储插件使用
  4. 数据持久化

    • 数据分类和存储策略
    • 备份和恢复机制
    • 数据安全和一致性
  5. 性能优化

    • 存储性能监控和分析
    • 存储驱动和文件系统优化
    • I/O 性能调优

最佳实践

  1. 存储规划:根据数据特性选择合适的存储类型
  2. 性能优化:合理配置存储驱动和文件系统
  3. 数据保护:建立完善的备份和恢复机制
  4. 监控告警:实施存储性能和健康监控
  5. 容量管理:定期清理和优化存储使用

下一步学习

在下一章中,我们将学习安全管理,包括:

  • 集群安全配置
  • 密钥和证书管理
  • 访问控制和权限管理
  • 安全扫描和合规性

检查清单:

  - [ ] 理解 Docker Swarm 存储架构
  - [ ] 掌握卷创建和管理
  - [ ] 学会配置共享存储
  - [ ] 实现数据备份和恢复
  - [ ] 建立存储性能监控