Learning Objectives
After completing this chapter, you will be able to:
- Understand the Docker Swarm storage architecture and storage types
- Use volumes and bind mounts effectively
- Configure and manage storage drivers
- Understand data persistence and backup strategies
- Optimize storage performance and troubleshoot storage issues
1. Storage Architecture Overview
1.1 Storage Types
Docker storage types
# Storage types supported by Docker:
# 1. Volumes
# - Managed by Docker
# - Independent of the container lifecycle
# - Support driver plugins
# - Can be shared between containers
# 2. Bind mounts
# - Host filesystem paths
# - Mapped directly into the container
# - Depend on the host filesystem layout
# - Near-native performance (no copy-on-write overhead)
# 3. tmpfs mounts
# - Temporary in-memory filesystem
# - Data is lost when the container stops
# - Suitable for transient data
# - Very fast reads and writes
# 4. Named pipes (Windows)
# - Windows containers only
# - Inter-process communication
# - Similar to Unix sockets
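For Linux services, the first three mount types can also be declared in a Compose/stack file. A minimal sketch follows; the service name, image, volume name, and paths are illustrative, not taken from this chapter's later examples:

```yaml
version: "3.8"
services:
  web:
    image: nginx
    volumes:
      - type: volume          # managed by Docker, outlives the container
        source: web-data
        target: /var/www/html
      - type: bind            # host path mapped straight into the container
        source: /host/config
        target: /etc/nginx/conf.d
        read_only: true
      - type: tmpfs           # in-memory, discarded when the container stops
        target: /tmp
        tmpfs:
          size: 104857600     # 100 MiB, expressed in bytes
volumes:
  web-data:
```

The long-form `volumes:` syntax shown here maps one-to-one onto the `--mount` flags used later in this chapter.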
Storage driver architecture
# Storage driver layering:
┌─────────────────────────────────────┐
│ Container Layer │ ← writable layer
├─────────────────────────────────────┤
│ Image Layer N │ ← read-only layer
├─────────────────────────────────────┤
│ Image Layer N-1 │ ← read-only layer
├─────────────────────────────────────┤
│ ... │
├─────────────────────────────────────┤
│ Image Layer 1 │ ← read-only layer
├─────────────────────────────────────┤
│ Base Layer │ ← read-only layer
└─────────────────────────────────────┘
# Common storage drivers:
# - overlay2: the recommended driver on modern Linux
# - aufs: an older union filesystem (deprecated)
# - devicemapper: based on device-mapper thin provisioning (deprecated)
# - btrfs: B-tree filesystem
# - zfs: ZFS filesystem
1.2 Swarm Storage Characteristics
Cluster storage challenges
# Challenges for storage in a Swarm cluster:
# 1. Sharing data across nodes
# - Service tasks may run on different nodes
# - A shared storage solution is needed
# - Data consistency must be guaranteed
# 2. Data persistence
# - Data must survive container restarts
# - Data must be recoverable after node failures
# - Data should follow a service when it is rescheduled
# 3. Performance and scalability
# - Storage I/O performance
# - Concurrent access control
# - Capacity expansion
# 4. Data security
# - Encryption at rest
# - Access control
# - Backup and recovery
Storage solutions
# Storage options for Swarm:
# 1. Local storage
# - Node-local volumes
# - Bind mounts
# - Suitable for services pinned to a single node
# 2. Shared storage
# - NFS network filesystem
# - GlusterFS distributed filesystem
# - Ceph distributed storage
# - Cloud storage services (EBS, EFS, etc.)
# 3. Storage plugins
# - Docker volume plugins
# - CSI (Container Storage Interface)
# - Third-party storage solutions
2. Volume Management
2.1 Basic Volume Operations
Creating and managing volumes
# Create a volume
docker volume create my-volume
# Create a volume with an explicit driver
docker volume create --driver local my-local-volume
# Create a volume with driver options (NFS example)
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=192.168.1.100,rw \
--opt device=:/path/to/dir \
nfs-volume
# List volumes
docker volume ls
# Inspect a volume
docker volume inspect my-volume
# Remove a volume
docker volume rm my-volume
# Remove all unused volumes
docker volume prune
Volume configuration options
# Local volume backed by a block device
docker volume create \
--driver local \
--opt type=ext4 \
--opt device=/dev/sdb1 \
block-volume
# tmpfs-backed volume (the size is passed via the o= mount options string)
docker volume create \
--driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=100m \
temp-volume
# Bind-mount-style volume
docker volume create \
--driver local \
--opt type=none \
--opt o=bind \
--opt device=/host/path \
bind-volume
# Show a volume's mountpoint
docker volume inspect my-volume --format '{{.Mountpoint}}'
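The CLI examples above have direct stack-file equivalents: the same `type`, `o`, and `device` options go under `driver_opts` in a top-level `volumes:` section. A sketch using the same illustrative addresses and paths:

```yaml
volumes:
  nfs-data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.1.100,rw"
      device: ":/path/to/dir"
  temp-data:
    driver: local
    driver_opts:
      type: tmpfs
      device: tmpfs
      o: "size=100m"
  bind-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /host/path
```

Declaring volumes this way keeps the driver configuration versioned alongside the services that use it.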
2.2 Service Volume Mounts
Basic volume mounts
# Create a service with a volume
docker service create --name web \
--mount type=volume,source=web-data,target=/var/www/html \
nginx
# Mount multiple volumes
docker service create --name app \
--mount type=volume,source=app-data,target=/app/data \
--mount type=volume,source=app-logs,target=/app/logs \
my-app:latest
# Read-only volume mount
docker service create --name readonly-app \
--mount type=volume,source=config-data,target=/app/config,readonly \
my-app:latest
# Bind mount
docker service create --name bind-app \
--mount type=bind,source=/host/data,target=/app/data \
my-app:latest
Advanced mount options
# Volume mount with labels
docker service create --name labeled-app \
--mount type=volume,source=app-data,target=/app/data,volume-label=app=myapp \
my-app:latest
# Volume driver options (NFS)
docker service create --name nfs-app \
--mount type=volume,source=nfs-data,target=/app/data,volume-driver=local,volume-opt=type=nfs,volume-opt=o=addr=192.168.1.100,volume-opt=device=:/exports/data \
my-app:latest
# tmpfs mount
docker service create --name temp-app \
--mount type=tmpfs,target=/tmp,tmpfs-size=100m \
my-app:latest
# Combining mount types
docker service create --name complex-app \
--mount type=volume,source=persistent-data,target=/app/data \
--mount type=bind,source=/host/config,target=/app/config,readonly \
--mount type=tmpfs,target=/tmp,tmpfs-size=50m \
my-app:latest
2.3 Volume Management Scripts
A volume management tool
#!/bin/bash
# volume-manager.sh
# Volume management configuration
VOLUME_PREFIX="myapp"
BACKUP_DIR="/backup/volumes"
NFS_SERVER="192.168.1.100"
NFS_PATH="/exports"
# Create the volume set for an application
create_app_volumes() {
local app_name=$1
echo "Creating volumes for application: $app_name"
# Data volume
docker volume create ${VOLUME_PREFIX}-${app_name}-data
# Log volume
docker volume create ${VOLUME_PREFIX}-${app_name}-logs
# Config volume (NFS)
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=${NFS_SERVER},rw \
--opt device=:${NFS_PATH}/${app_name}/config \
${VOLUME_PREFIX}-${app_name}-config
# Cache volume (tmpfs; the size goes in the o= mount options string)
docker volume create \
--driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=256m \
${VOLUME_PREFIX}-${app_name}-cache
echo "Volumes created for $app_name:"
docker volume ls --filter name=${VOLUME_PREFIX}-${app_name}
}
# Back up volume data
backup_volume() {
local volume_name=$1
local backup_name="${volume_name}_$(date +%Y%m%d_%H%M%S).tar.gz"
echo "Backing up volume: $volume_name"
# Create the backup directory
mkdir -p $BACKUP_DIR
# Use a throwaway container to archive the volume contents (read-only)
docker run --rm \
--volume $volume_name:/data:ro \
--volume $BACKUP_DIR:/backup \
alpine tar czf /backup/$backup_name -C /data .
if [ $? -eq 0 ]; then
echo "✓ Backup completed: $BACKUP_DIR/$backup_name"
else
echo "✗ Backup failed for volume: $volume_name"
fi
}
# Restore volume data
restore_volume() {
local volume_name=$1
local backup_file=$2
if [ ! -f "$backup_file" ]; then
echo "✗ Backup file not found: $backup_file"
return 1
fi
echo "Restoring volume: $volume_name from $backup_file"
# Create the volume if it does not exist
docker volume create $volume_name
# Use a throwaway container to unpack the archive into the volume
docker run --rm \
--volume $volume_name:/data \
--volume $(dirname $backup_file):/backup \
alpine tar xzf /backup/$(basename $backup_file) -C /data
if [ $? -eq 0 ]; then
echo "✓ Restore completed for volume: $volume_name"
else
echo "✗ Restore failed for volume: $volume_name"
fi
}
# Remove unused volumes
cleanup_volumes() {
echo "Cleaning up unused volumes..."
# Show the volumes that would be removed
echo "Unused volumes:"
docker volume ls --filter dangling=true
# Ask for confirmation before deleting
read -p "Do you want to remove these volumes? (y/N): " confirm
if [ "$confirm" = "y" ] || [ "$confirm" = "Y" ]; then
docker volume prune -f
echo "✓ Unused volumes removed"
else
echo "Volume cleanup cancelled"
fi
}
# Volume usage report
volume_usage_report() {
echo "=== Volume Usage Report ==="
# Volume list with sizes
echo ""
echo "Volume List:"
printf "%-30s %-15s %-20s\n" "NAME" "DRIVER" "SIZE"
printf "%-30s %-15s %-20s\n" "----" "------" "----"
for volume in $(docker volume ls --format '{{.Name}}'); do
driver=$(docker volume inspect $volume --format '{{.Driver}}')
mountpoint=$(docker volume inspect $volume --format '{{.Mountpoint}}')
if [ -d "$mountpoint" ]; then
size=$(du -sh "$mountpoint" 2>/dev/null | cut -f1)
else
size="N/A"
fi
printf "%-30s %-15s %-20s\n" "$volume" "$driver" "$size"
done
# Volume statistics (use --format to avoid counting the header lines)
echo ""
echo "Volume Statistics:"
total_volumes=$(docker volume ls --format '{{.Name}}' | wc -l)
unused_volumes=$(docker volume ls --filter dangling=true --format '{{.Name}}' | wc -l)
used_volumes=$((total_volumes - unused_volumes))
echo " Total volumes: $total_volumes"
echo " Used volumes: $used_volumes"
echo " Unused volumes: $unused_volumes"
}
# Main entry point
case "$1" in
"create")
if [ -n "$2" ]; then
create_app_volumes $2
else
echo "Usage: $0 create <app-name>"
fi
;;
"backup")
if [ -n "$2" ]; then
backup_volume $2
else
echo "Usage: $0 backup <volume-name>"
fi
;;
"restore")
if [ -n "$2" ] && [ -n "$3" ]; then
restore_volume $2 $3
else
echo "Usage: $0 restore <volume-name> <backup-file>"
fi
;;
"cleanup")
cleanup_volumes
;;
"report")
volume_usage_report
;;
*)
echo "Usage: $0 {create|backup|restore|cleanup|report}"
echo " create <app-name> - Create volume set for application"
echo " backup <volume-name> - Backup volume data"
echo " restore <volume-name> <file> - Restore volume from backup"
echo " cleanup - Remove unused volumes"
echo " report - Show volume usage report"
;;
esac
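Scripts like the one above hand whatever name they are given straight to `docker volume create`. A small guard that checks candidate names against the pattern Docker accepts (`[a-zA-Z0-9][a-zA-Z0-9_.-]*`) fails fast on typos instead of failing mid-deployment; a sketch:

```shell
# Validate a candidate volume name against the pattern Docker accepts
# before passing it to `docker volume create`.
is_valid_volume_name() {
  printf '%s' "$1" | grep -Eq '^[a-zA-Z0-9][a-zA-Z0-9_.-]*$'
}

is_valid_volume_name "myapp-web-data" && echo "valid"
is_valid_volume_name "-bad-name" || echo "invalid: leading dash"
is_valid_volume_name "has space" || echo "invalid: whitespace"
```

Calling this at the top of `create_app_volumes` turns a confusing daemon error into an immediate, readable failure.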
3. Shared Storage
3.1 NFS Storage
NFS server configuration
# On the NFS server
# /etc/exports
# (no_root_squash simplifies container access but weakens security)
/exports/data *(rw,sync,no_subtree_check,no_root_squash)
/exports/logs *(rw,sync,no_subtree_check,no_root_squash)
/exports/config *(ro,sync,no_subtree_check,no_root_squash)
# Reload the NFS exports
sudo systemctl restart nfs-server
sudo exportfs -ra
# Verify the NFS exports
sudo exportfs -v
Creating NFS volumes
# Create an NFS volume
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=192.168.1.100,rw,nfsvers=4 \
--opt device=:/exports/data \
nfs-data
# Create a read-only NFS volume
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=192.168.1.100,ro,nfsvers=4 \
--opt device=:/exports/config \
nfs-config
# A service using the NFS volumes
docker service create --name web \
--mount type=volume,source=nfs-data,target=/var/www/html \
--mount type=volume,source=nfs-config,target=/etc/nginx,readonly \
--replicas 3 \
nginx
# Verify the NFS mount inside a task container
docker exec $(docker ps --filter name=web -q | head -1) mount | grep nfs
Tuning NFS performance
# NFS volume with performance-oriented mount options
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=192.168.1.100,rw,nfsvers=4.1,proto=tcp,port=2049,timeo=14,rsize=32768,wsize=32768 \
--opt device=:/exports/data \
nfs-optimized
# NFS mount options explained:
# nfsvers=4.1 - use the NFSv4.1 protocol
# proto=tcp - transport over TCP
# timeo=14 - retransmission timeout in deciseconds (1.4 s)
# rsize=32768 - read block size
# wsize=32768 - write block size
# hard - hard mount (default): retry requests indefinitely
# soft - soft mount (optional): give up after retries, at the risk of data corruption
# intr - a no-op since Linux 2.6.25, so it is omitted above
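When the NFS volume is declared inside a stack file instead of being created by hand, each node creates it locally on first use, so the definition travels with the stack. A sketch using the same illustrative server address and export path:

```yaml
version: "3.8"
services:
  web:
    image: nginx
    deploy:
      replicas: 3
    volumes:
      - nfs-data:/var/www/html
volumes:
  nfs-data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.1.100,rw,nfsvers=4.1"
      device: ":/exports/data"
```

This avoids having to pre-create the volume on every node a replica might land on.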
3.2 GlusterFS Storage
GlusterFS cluster configuration
# On every GlusterFS node:
# 1. Install GlusterFS
sudo apt-get install glusterfs-server
# 2. Start the service
sudo systemctl start glusterd
sudo systemctl enable glusterd
# 3. Form the cluster (run on the primary node)
sudo gluster peer probe gluster-node2
sudo gluster peer probe gluster-node3
# 4. Check cluster status
sudo gluster peer status
# 5. Create a replicated volume
sudo gluster volume create gv0 replica 3 \
gluster-node1:/data/brick1/gv0 \
gluster-node2:/data/brick1/gv0 \
gluster-node3:/data/brick1/gv0
# 6. Start the volume
sudo gluster volume start gv0
# 7. Show volume information
sudo gluster volume info gv0
Integrating GlusterFS with Docker
# Install a third-party GlusterFS volume plugin
docker plugin install --grant-all-permissions trajano/glusterfs-volume-plugin
# Create a GlusterFS volume through the plugin
docker volume create \
--driver trajano/glusterfs-volume-plugin \
--opt glusterfs-server=gluster-node1 \
--opt glusterfs-volname=gv0 \
gluster-volume
# Or use the local driver (requires glusterfs-client on every node)
docker volume create \
--driver local \
--opt type=glusterfs \
--opt o=addr=gluster-node1 \
--opt device=:/gv0 \
gluster-local
# A service using the GlusterFS volume
docker service create --name distributed-app \
--mount type=volume,source=gluster-volume,target=/app/data \
--replicas 5 \
my-app:latest
3.3 Cloud Storage Integration
AWS EFS integration
# Install the EFS mount helper (the package is provided by AWS and may
# need to be built from the amazon-efs-utils repository on Debian/Ubuntu)
sudo apt-get install amazon-efs-utils
# Create an EFS-backed volume over NFSv4.1
docker volume create \
--driver local \
--opt type=nfs4 \
--opt o=addr=fs-12345678.efs.us-west-2.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600 \
--opt device=:/ \
efs-volume
# Use the EFS volume
docker service create --name efs-app \
--mount type=volume,source=efs-volume,target=/app/data \
--constraint 'node.platform.os == linux' \
my-app:latest
Azure Files integration
# Create an Azure Files (CIFS/SMB) volume; vers=3.0 selects SMB 3,
# which Azure Files requires for access outside the storage account's region
docker volume create \
--driver local \
--opt type=cifs \
--opt o=addr=mystorageaccount.file.core.windows.net,username=mystorageaccount,password=mykey,vers=3.0,uid=1000,gid=1000,iocharset=utf8,file_mode=0777,dir_mode=0777 \
--opt device=//mystorageaccount.file.core.windows.net/myshare \
azure-files
# Use the Azure Files volume
docker service create --name azure-app \
--mount type=volume,source=azure-files,target=/app/data \
my-app:latest
4. Data Persistence Strategies
4.1 Data Classification and Strategy
Data classification
# Data classification strategy:
# 1. Persistent data
# - Database files
# - User uploads
# - Application state
# - Must survive across container lifecycles
# 2. Configuration data
# - Application config files
# - Certificates and keys
# - Environment configuration
# - Mostly read-only, updated occasionally
# 3. Cache data
# - Temporary computation results
# - Session data
# - Regenerable data
# - Exists purely for performance
# 4. Log data
# - Application logs
# - Access logs
# - Audit logs
# - Needs long-term retention and analysis
Designing a storage strategy
#!/bin/bash
# storage-strategy.sh
# Storage strategy per data type
declare -A STORAGE_STRATEGY=(
["database"]="persistent,replicated,backup"
["uploads"]="persistent,shared,backup"
["config"]="persistent,readonly,versioned"
["cache"]="temporary,local,fast"
["logs"]="persistent,centralized,rotated"
["temp"]="temporary,memory,fast"
)
# Create storage according to the strategy
create_storage_by_strategy() {
local data_type=$1
local app_name=$2
local strategy=${STORAGE_STRATEGY[$data_type]}
echo "Creating storage for $data_type with strategy: $strategy"
case $data_type in
"database")
# Persistent, replicated, backed up
docker volume create \
--driver local \
--label backup=daily \
--label replicated=true \
${app_name}-database
;;
"uploads")
# Persistent, shared, backed up
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=nfs-server,rw \
--opt device=:/exports/${app_name}/uploads \
--label backup=hourly \
${app_name}-uploads
;;
"config")
# Persistent, read-only, versioned
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=nfs-server,ro \
--opt device=:/exports/${app_name}/config \
--label versioned=true \
${app_name}-config
;;
"cache")
# Temporary, local, fast (the tmpfs size goes in the o= options)
docker volume create \
--driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=512m \
--label temporary=true \
${app_name}-cache
;;
"logs")
# Persistent, centralized, rotated
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=log-server,rw \
--opt device=:/exports/logs/${app_name} \
--label centralized=true \
--label rotation=daily \
${app_name}-logs
;;
"temp")
# Temporary, in-memory, fast
# Handled as a tmpfs mount at deploy time; no volume is created
echo "Temp storage will use tmpfs mount"
;;
*)
echo "Unknown data type: $data_type"
return 1
;;
esac
}
# Deploy an application with the storage strategy applied
deploy_app_with_storage() {
local app_name=$1
echo "Deploying $app_name with storage strategy..."
# Create each class of storage
for data_type in "${!STORAGE_STRATEGY[@]}"; do
create_storage_by_strategy $data_type $app_name
done
# Deploy the service
docker service create --name $app_name \
--mount type=volume,source=${app_name}-database,target=/app/data \
--mount type=volume,source=${app_name}-uploads,target=/app/uploads \
--mount type=volume,source=${app_name}-config,target=/app/config,readonly \
--mount type=volume,source=${app_name}-cache,target=/app/cache \
--mount type=volume,source=${app_name}-logs,target=/app/logs \
--mount type=tmpfs,target=/tmp,tmpfs-size=100m \
my-app:latest
echo "Application $app_name deployed with storage strategy"
}
# Usage
if [ -n "$1" ]; then
deploy_app_with_storage $1
else
echo "Usage: $0 <app-name>"
fi
4.2 Data Backup and Recovery
An automated backup system
#!/bin/bash
# backup-system.sh
BACKUP_ROOT="/backup"
RETENTION_DAYS=30
LOG_FILE="/var/log/docker-backup.log"
# Logging helper
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a $LOG_FILE
}
# Back up a single volume
backup_volume() {
local volume_name=$1
local backup_type=$2 # full or incremental
local backup_date=$(date +%Y%m%d_%H%M%S)
local backup_dir="$BACKUP_ROOT/$volume_name"
local backup_file="${backup_dir}/${volume_name}_${backup_type}_${backup_date}.tar.gz"
log "Starting $backup_type backup for volume: $volume_name"
# Create the backup directory
mkdir -p $backup_dir
# Verify the volume exists
if ! docker volume inspect $volume_name > /dev/null 2>&1; then
log "ERROR: Volume $volume_name does not exist"
return 1
fi
# Run the backup
if [ "$backup_type" = "full" ]; then
# Full backup
docker run --rm \
--volume $volume_name:/data:ro \
--volume $backup_dir:/backup \
alpine tar czf /backup/$(basename $backup_file) -C /data .
else
# Incremental backup (relative to the latest full backup)
local last_full=$(ls -t ${backup_dir}/*_full_*.tar.gz 2>/dev/null | head -1)
if [ -z "$last_full" ]; then
log "No full backup found, performing full backup instead"
backup_volume $volume_name "full"
return $?
fi
# Simplified differential backup: archive only files newer than the
# last backup timestamp, falling back to a full archive
docker run --rm \
--volume $volume_name:/data:ro \
--volume $backup_dir:/backup \
alpine sh -c "find /data -newer /backup/.last_backup 2>/dev/null | tar czf /backup/$(basename $backup_file) -T - 2>/dev/null || tar czf /backup/$(basename $backup_file) -C /data ."
fi
if [ $? -eq 0 ]; then
log "SUCCESS: Backup completed - $backup_file"
# Update the backup timestamp marker
touch "${backup_dir}/.last_backup"
# Report the backup size
local size=$(du -h $backup_file | cut -f1)
log "Backup size: $size"
return 0
else
log "ERROR: Backup failed for volume: $volume_name"
return 1
fi
}
# Restore a volume
restore_volume() {
local volume_name=$1
local backup_file=$2
local restore_mode=${3:-replace} # replace or merge
log "Starting restore for volume: $volume_name from $backup_file"
if [ ! -f "$backup_file" ]; then
log "ERROR: Backup file not found: $backup_file"
return 1
fi
# Create the volume if it does not exist
docker volume create $volume_name > /dev/null 2>&1
# Run the restore
if [ "$restore_mode" = "replace" ]; then
# Replace mode: wipe the volume before unpacking
docker run --rm \
--volume $volume_name:/data \
--volume $(dirname $backup_file):/backup \
alpine sh -c "rm -rf /data/* /data/.[^.]* 2>/dev/null; tar xzf /backup/$(basename $backup_file) -C /data"
else
# Merge mode: unpack over the existing contents
docker run --rm \
--volume $volume_name:/data \
--volume $(dirname $backup_file):/backup \
alpine tar xzf /backup/$(basename $backup_file) -C /data
fi
fi
if [ $? -eq 0 ]; then
log "SUCCESS: Restore completed for volume: $volume_name"
return 0
else
log "ERROR: Restore failed for volume: $volume_name"
return 1
fi
}
# Remove expired backups
cleanup_old_backups() {
log "Starting cleanup of backups older than $RETENTION_DAYS days"
find $BACKUP_ROOT -name "*.tar.gz" -mtime +$RETENTION_DAYS -type f | while read backup_file; do
log "Removing old backup: $backup_file"
rm -f "$backup_file"
done
# Remove empty directories
find $BACKUP_ROOT -type d -empty -delete
log "Backup cleanup completed"
}
# Back up every volume marked for backup
backup_all_volumes() {
local backup_type=${1:-full}
log "Starting $backup_type backup for all marked volumes"
# Find volumes carrying a backup label
for volume in $(docker volume ls --filter label=backup --format '{{.Name}}'); do
backup_volume $volume $backup_type
done
# Remove expired backups
cleanup_old_backups
log "All volume backups completed"
}
# Backup status report
backup_status_report() {
echo "=== Backup Status Report ==="
echo "Backup Root: $BACKUP_ROOT"
echo "Retention: $RETENTION_DAYS days"
echo ""
# Aggregate backup statistics
local total_backups=$(find $BACKUP_ROOT -name "*.tar.gz" | wc -l)
local total_size=$(du -sh $BACKUP_ROOT 2>/dev/null | cut -f1)
echo "Total Backups: $total_backups"
echo "Total Size: $total_size"
echo ""
# Per-volume backup details
echo "Backup Details:"
printf "%-20s %-10s %-15s %-10s\n" "VOLUME" "COUNT" "LATEST" "SIZE"
printf "%-20s %-10s %-15s %-10s\n" "------" "-----" "------" "----"
for volume_dir in $(find $BACKUP_ROOT -maxdepth 1 -type d | tail -n +2); do
local volume_name=$(basename $volume_dir)
local backup_count=$(ls $volume_dir/*.tar.gz 2>/dev/null | wc -l)
local latest_backup=$(ls -t $volume_dir/*.tar.gz 2>/dev/null | head -1)
local latest_date="N/A"
local volume_size="N/A"
if [ -n "$latest_backup" ]; then
latest_date=$(date -r "$latest_backup" +%m/%d)
volume_size=$(du -sh $volume_dir | cut -f1)
fi
printf "%-20s %-10s %-15s %-10s\n" "$volume_name" "$backup_count" "$latest_date" "$volume_size"
done
}
# Main entry point
case "$1" in
"backup")
# Treat "full"/"incremental" (or nothing) as an all-volume backup;
# anything else is taken to be a volume name
if [ -z "$2" ] || [ "$2" = "full" ] || [ "$2" = "incremental" ]; then
backup_all_volumes ${2:-full}
else
backup_volume $2 ${3:-full}
fi
;;
"restore")
if [ -n "$2" ] && [ -n "$3" ]; then
restore_volume $2 $3 $4
else
echo "Usage: $0 restore <volume-name> <backup-file> [replace|merge]"
fi
;;
"cleanup")
cleanup_old_backups
;;
"status")
backup_status_report
;;
"schedule")
# Install the cron schedule; each `crontab -` call replaces the whole
# crontab, so all entries must be written in a single invocation
echo "Setting up scheduled backups..."
(crontab -l 2>/dev/null; \
echo "0 2 * * * $0 backup full"; \
echo "0 */6 * * * $0 backup incremental"; \
echo "0 3 * * 0 $0 cleanup") | crontab -
echo "Scheduled backups configured"
;;
*)
echo "Usage: $0 {backup|restore|cleanup|status|schedule}"
echo " backup [volume-name] [full|incremental] - Backup volume(s)"
echo " restore <volume> <backup-file> [mode] - Restore volume"
echo " cleanup - Remove old backups"
echo " status - Show backup status"
echo " schedule - Setup scheduled backups"
;;
esac
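As an alternative to the cron-based `schedule` command, the same jobs can be driven by a systemd timer, which catches up on missed runs via `Persistent=true`. A sketch with illustrative unit names and an assumed install path for the script:

```ini
# /etc/systemd/system/docker-backup.service (illustrative unit)
[Unit]
Description=Docker volume full backup

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup-system.sh backup full

# /etc/systemd/system/docker-backup.timer (illustrative unit)
[Unit]
Description=Run the Docker volume backup daily at 02:00

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with `sudo systemctl enable --now docker-backup.timer`; `systemctl list-timers` then shows the next scheduled run.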
5. Storage Performance Optimization
5.1 Performance Monitoring
A storage performance monitoring script
#!/bin/bash
# storage-performance-monitor.sh
# Monitoring configuration
MONITOR_INTERVAL=5
LOG_FILE="/var/log/storage-performance.log"
ALERT_THRESHOLD_IOPS=1000
ALERT_THRESHOLD_LATENCY=100 # ms
# Logging helper
log_metric() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> $LOG_FILE
}
# Show storage device information
get_storage_devices() {
# Docker-managed storage usage
docker system df --format "table {{.Type}}\t{{.Total}}\t{{.Active}}\t{{.Size}}\t{{.Reclaimable}}"
# System block devices
echo ""
echo "System Storage Devices:"
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,FSTYPE
}
# Monitor I/O performance
monitor_io_performance() {
echo "=== Storage I/O Performance Monitor ==="
echo "Press Ctrl+C to stop monitoring"
echo ""
# Table header
printf "%-10s %-15s %-10s %-10s %-10s %-10s\n" "TIME" "DEVICE" "IOPS" "READ_MB/s" "WRITE_MB/s" "UTIL%"
printf "%-10s %-15s %-10s %-10s %-10s %-10s\n" "----" "------" "----" "---------" "----------" "-----"
while true; do
# Sample I/O with iostat; note that iostat -x column positions vary
# between sysstat versions, so adjust the field numbers to match yours
iostat -x 1 1 | awk '
/^[a-z]/ && !/^avg-cpu/ && !/^Device/ {
time = strftime("%H:%M:%S")
device = $1
iops = $4 + $5
read_mb = $6 / 1024
write_mb = $7 / 1024
util = $NF
printf "%-10s %-15s %-10.0f %-10.2f %-10.2f %-10.1f\n", time, device, iops, read_mb, write_mb, util
}'
sleep $MONITOR_INTERVAL
done
}
# Test storage performance
test_storage_performance() {
local test_path=${1:-/tmp}
local test_size=${2:-1G}
echo "=== Storage Performance Test ==="
echo "Test Path: $test_path"
echo "Test Size: $test_size"
echo ""
# Derive the dd block count (1 MiB blocks) from the requested size
local count=$(( $(numfmt --from=iec $test_size) / 1048576 ))
# Sequential write test
echo "Sequential Write Test:"
dd if=/dev/zero of=$test_path/test_write bs=1M count=$count oflag=direct 2>&1 | grep -E "copied|MB/s"
# Sequential read test
echo ""
echo "Sequential Read Test:"
dd if=$test_path/test_write of=/dev/null bs=1M iflag=direct 2>&1 | grep -E "copied|MB/s"
# Random I/O tests (only when fio is available)
if command -v fio > /dev/null; then
echo ""
echo "Random Write Test (4K):"
fio --name=random-write --ioengine=libaio --rw=randwrite --bs=4k --size=100M --numjobs=1 --iodepth=1 --runtime=10 --time_based --filename=$test_path/test_random
echo ""
echo "Random Read Test (4K):"
fio --name=random-read --ioengine=libaio --rw=randread --bs=4k --size=100M --numjobs=1 --iodepth=1 --runtime=10 --time_based --filename=$test_path/test_random
else
echo ""
echo "fio not available, skipping random I/O tests"
fi
# Remove the test files
rm -f $test_path/test_write $test_path/test_random
echo ""
echo "Performance test completed"
}
# Analyze a volume's performance
analyze_volume_performance() {
local volume_name=$1
if [ -z "$volume_name" ]; then
echo "Usage: analyze_volume_performance <volume-name>"
return 1
fi
echo "=== Volume Performance Analysis: $volume_name ==="
# Fetch volume metadata
local mountpoint=$(docker volume inspect $volume_name --format '{{.Mountpoint}}')
local driver=$(docker volume inspect $volume_name --format '{{.Driver}}')
echo "Volume: $volume_name"
echo "Driver: $driver"
echo "Mountpoint: $mountpoint"
echo ""
# Make sure the mountpoint is reachable
if [ ! -d "$mountpoint" ]; then
echo "ERROR: Mountpoint does not exist or is not accessible"
return 1
fi
# Filesystem information
echo "Filesystem Information:"
df -h $mountpoint
echo ""
# Benchmark the volume
test_storage_performance $mountpoint 100M
# List containers using this volume
echo ""
echo "Containers using this volume:"
docker ps --filter volume=$volume_name --format "table {{.Names}}\t{{.Image}}\t{{.Status}}"
}
# Storage health check
storage_health_check() {
echo "=== Storage Health Check ==="
# Docker storage driver in use
echo "Docker Storage Driver:"
docker info --format '{{.Driver}}'
# Docker disk usage
echo ""
echo "Docker Storage Usage:"
docker system df
# Volume status
echo ""
echo "Volume Status:"
docker volume ls --format "table {{.Name}}\t{{.Driver}}\t{{.Scope}}"
# Physical device health via SMART
echo ""
echo "Storage Device Health:"
if command -v smartctl > /dev/null; then
for device in $(lsblk -d -o NAME --noheadings); do
echo "Device: /dev/$device"
smartctl -H /dev/$device 2>/dev/null | grep -E "SMART overall-health|PASSED|FAILED" || echo " SMART not available"
done
else
echo "smartctl not available, skipping device health check"
fi
# Recent filesystem errors from the kernel log
echo ""
echo "Filesystem Check:"
dmesg | grep -i "error\|fail" | grep -i "ext4\|xfs\|btrfs" | tail -5
}
# Main entry point
case "$1" in
"monitor")
monitor_io_performance
;;
"test")
test_storage_performance $2 $3
;;
"analyze")
analyze_volume_performance $2
;;
"health")
storage_health_check
;;
"devices")
get_storage_devices
;;
*)
echo "Usage: $0 {monitor|test|analyze|health|devices}"
echo " monitor - Monitor I/O performance"
echo " test [path] [size] - Test storage performance"
echo " analyze <volume-name> - Analyze volume performance"
echo " health - Storage health check"
echo " devices - Show storage devices"
;;
esac
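The fio invocations inside `test_storage_performance` can also live in a reusable job file instead of long command lines; `stonewall` makes the read job wait until the write job finishes. A sketch with the same illustrative sizes and target directory:

```ini
; storage-bench.fio — job file mirroring the inline fio flags above
[global]
ioengine=libaio
direct=1
size=100M
runtime=10
time_based
directory=/tmp

[rand-write-4k]
rw=randwrite
bs=4k
iodepth=1

[rand-read-4k]
stonewall
rw=randread
bs=4k
iodepth=1
```

Run it with `fio storage-bench.fio`, changing `directory` to point at the volume under test.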
5.2 Performance Tuning
Storage driver tuning
# Docker storage driver configuration
# /etc/docker/daemon.json
{
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.size=20G"
],
"data-root": "/var/lib/docker",
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
# Note: overlay2.size requires the backing filesystem to be xfs mounted
# with pquota; overlay2.override_kernel_check is obsolete on current
# Docker releases and has been dropped here
# Restart the Docker daemon
sudo systemctl restart docker
Filesystem tuning
# XFS mount options for /var/lib/docker (fstab entry)
/dev/sdb1 /var/lib/docker xfs defaults,noatime,largeio,inode64,swalloc 0 0
# EXT4 mount options (fstab entry)
# data=writeback and barrier=0 trade crash safety for speed; use them
# only on battery-backed or otherwise protected storage
/dev/sdb1 /var/lib/docker ext4 defaults,noatime,data=writeback,barrier=0 0 0
# Create tuned filesystems
# XFS
sudo mkfs.xfs -f -i size=512 -d agcount=4 /dev/sdb1
# EXT4
sudo mkfs.ext4 -F -E stride=32,stripe-width=128 /dev/sdb1
6. Hands-on Exercises
Exercise 1: A tiered storage architecture
Goal: build a tiered storage architecture for a web application
# 1. Create the different storage tiers
# Database storage (fast local disk)
docker volume create --driver local db-data
# File storage (shared over NFS)
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=nfs-server,rw \
--opt device=:/exports/files \
file-storage
# Cache storage (in-memory tmpfs)
docker volume create \
--driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=512m \
cache-storage
# 2. Deploy the application stack
# (MySQL requires a root password; use a Docker secret in production)
docker service create --name database \
--mount type=volume,source=db-data,target=/var/lib/mysql \
--constraint 'node.labels.storage == ssd' \
--env MYSQL_ROOT_PASSWORD=changeme \
mysql:8.0
docker service create --name app \
--mount type=volume,source=file-storage,target=/app/files \
--mount type=volume,source=cache-storage,target=/app/cache \
--replicas 3 \
my-app:latest
docker service create --name web \
--mount type=volume,source=file-storage,target=/var/www/html,readonly \
--publish 80:80 \
--replicas 2 \
nginx
Exercise 2: Backup and restore test
Goal: implement an automated backup and restore workflow
# 1. Create test data
docker volume create test-data
docker run --rm -v test-data:/data alpine sh -c 'echo "Test data $(date)" > /data/test.txt'
# 2. Run a backup
./backup-system.sh backup test-data full
# 3. Simulate data loss
docker run --rm -v test-data:/data alpine rm -f /data/test.txt
# 4. Restore the data
BACKUP_FILE=$(ls -t /backup/test-data/*_full_*.tar.gz | head -1)
./backup-system.sh restore test-data $BACKUP_FILE
# 5. Verify the restore
docker run --rm -v test-data:/data alpine cat /data/test.txt
Exercise 3: Performance benchmarking
Goal: compare the performance of different storage configurations
# 1. Create volumes of different types
docker volume create local-volume
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=nfs-server,rw \
--opt device=:/exports/test \
nfs-volume
# 2. Run the benchmarks
echo "Testing local volume:"
./storage-performance-monitor.sh test /var/lib/docker/volumes/local-volume/_data
echo "Testing NFS volume:"
./storage-performance-monitor.sh test /var/lib/docker/volumes/nfs-volume/_data
# 3. Analyze the results
./storage-performance-monitor.sh analyze local-volume
./storage-performance-monitor.sh analyze nfs-volume
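Comparing benchmark runs is easier when the throughput figure is pulled out of dd's summary line programmatically rather than read by eye. A sketch (the sample line mirrors GNU dd's output format):

```shell
# Extract the throughput field from a GNU dd transfer summary, e.g.
# "1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.5 s, 429 MB/s".
# The figure is the last comma-separated field of the "copied" line.
parse_dd_throughput() {
  awk -F', ' '/copied/ {print $NF}'
}

echo "1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.5 s, 429 MB/s" | parse_dd_throughput
# → 429 MB/s
```

Piping dd's stderr through this (`dd ... 2>&1 | parse_dd_throughput`) yields values that are easy to tabulate per volume.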
7. Chapter Summary
Key Takeaways
Storage architecture
- Understand the characteristics and use cases of each storage type
- Master Docker storage drivers and volume management
- Design tiered storage architectures
Volume management
- Create and configure volumes of every type
- Mount and manage volumes in services
- Manage the volume lifecycle
Shared storage
- Integrate shared storage such as NFS and GlusterFS
- Integrate cloud storage services
- Use storage plugins
Data persistence
- Classify data and choose storage strategies accordingly
- Build backup and restore mechanisms
- Guard data safety and consistency
Performance optimization
- Monitor and analyze storage performance
- Tune storage drivers and filesystems
- Optimize I/O performance
Best Practices
- Storage planning: pick the storage type that matches each data class
- Performance tuning: configure storage drivers and filesystems deliberately
- Data protection: maintain solid backup and recovery procedures
- Monitoring and alerting: watch storage performance and health continuously
- Capacity management: clean up and optimize storage usage regularly
What's Next
The next chapter covers security management, including:
- Cluster security configuration
- Secrets and certificate management
- Access control and permissions
- Security scanning and compliance
Checklist:
- [ ] Understand the Docker Swarm storage architecture
- [ ] Create and manage volumes
- [ ] Configure shared storage
- [ ] Implement data backup and recovery
- [ ] Set up storage performance monitoring