13.1 性能优化概述

性能问题识别

常见性能瓶颈:

Jenkins性能问题分类:

1. 系统资源瓶颈
   - CPU使用率过高
   - 内存不足或泄漏
   - 磁盘I/O瓶颈
   - 网络带宽限制

2. 应用层面问题
   - JVM配置不当
   - 垃圾回收频繁
   - 线程池配置问题
   - 数据库连接池不足

3. 架构设计问题
   - 单点瓶颈
   - 负载分布不均
   - 缓存策略不当
   - 同步操作过多

4. 配置和使用问题
   - 插件冲突或性能差
   - 构建配置不合理
   - 并发设置不当
   - 日志级别过于详细

性能监控指标:

关键性能指标(KPI):

1. 响应时间指标
   - 页面加载时间
   - API响应时间
   - 构建启动延迟
   - 队列等待时间

2. 吞吐量指标
   - 并发构建数量
   - 每分钟构建数
   - 用户并发数
   - 请求处理速率

3. 资源利用率
   - CPU使用率
   - 内存使用率
   - 磁盘使用率
   - 网络使用率

4. 错误率指标
   - 构建失败率
   - 系统错误率
   - 超时错误率
   - 连接失败率
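其中队列等待和并发构建数可以直接从Jenkins REST API(/queue/api/json 与 /computer/api/json)读取。下面是一个示意性的Python汇总函数(字段名以API实际返回为准,示例数据为虚构):

```python
import json

def summarize_load(computer_json, queue_json):
    """根据 /computer/api/json 与 /queue/api/json 的解析结果汇总负载指标。"""
    total = computer_json.get('totalExecutors', 0)
    busy = computer_json.get('busyExecutors', 0)
    return {
        'total_executors': total,
        'busy_executors': busy,
        'utilization_percent': busy * 100 / total if total else 0.0,
        'queue_length': len(queue_json.get('items', [])),
    }

if __name__ == '__main__':
    # 实际使用时用urllib/requests请求:
    #   GET http://<jenkins>/computer/api/json
    #   GET http://<jenkins>/queue/api/json
    computer = {'totalExecutors': 10, 'busyExecutors': 7}   # 虚构示例数据
    queue = {'items': [{'id': 1}, {'id': 2}]}
    print(json.dumps(summarize_load(computer, queue), ensure_ascii=False))
```

把该脚本挂到定时任务里持续采样,即可得到执行器利用率与队列长度的时间序列。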

性能测试方法:

性能测试策略:

1. 基准测试
   - 建立性能基线
   - 定期性能回归测试
   - 版本间性能对比
   - 配置变更影响评估

2. 负载测试
   - 模拟正常负载
   - 测试系统稳定性
   - 验证性能指标
   - 识别性能拐点

3. 压力测试
   - 测试系统极限
   - 识别瓶颈点
   - 验证故障恢复
   - 评估扩展需求

4. 容量规划
   - 预测增长需求
   - 评估硬件需求
   - 规划扩展策略
   - 成本效益分析
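以基准测试为例,版本间性能对比可以简化为两组构建耗时样本的统计比较。下面是一个示意性的Python函数(10%回归阈值为假设值,应按团队自身的性能目标调整):

```python
import statistics

def detect_regression(baseline, current, threshold_percent=10.0):
    """对比两组构建耗时(秒)的中位数,变化超过阈值即判定为性能回归。"""
    base_med = statistics.median(baseline)
    cur_med = statistics.median(current)
    change = (cur_med - base_med) / base_med * 100
    return {
        'baseline_median': base_med,
        'current_median': cur_med,
        'change_percent': round(change, 2),
        'regression': change > threshold_percent,
    }

if __name__ == '__main__':
    # 虚构样本:变更前后各三次构建的耗时(秒)
    print(detect_regression([100, 110, 105], [130, 125, 135]))
```

使用中位数而非平均值,可以降低个别异常慢构建对判断的干扰。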

性能优化策略

分层优化方法:

优化层次结构:

┌─────────────────────────────────────┐
│           应用层优化                  │
│  - 代码优化                          │
│  - 算法优化                          │
│  - 缓存策略                          │
│  - 异步处理                          │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│           中间件优化                  │
│  - JVM调优                          │
│  - 数据库优化                        │
│  - 网络优化                          │
│  - 负载均衡                          │
└─────────────────────────────────────┘
┌─────────────────────────────────────┐
│           系统层优化                  │
│  - 操作系统调优                      │
│  - 硬件配置                          │
│  - 存储优化                          │
│  - 网络配置                          │
└─────────────────────────────────────┘

优化原则:
1. 先测量,后优化
2. 优化最大瓶颈
3. 平衡各项指标
4. 持续监控验证

13.2 JVM调优

内存配置优化

堆内存配置:

# Jenkins启动脚本优化
#!/bin/bash

# 基础内存配置(适用于中等规模Jenkins)
# 注意:引号内不能写行内"#"注释,否则注释文字会被当作参数原样传给java
#   -Xms / -Xmx           初始 / 最大堆大小
#   -XX:NewRatio          新生代与老年代比例(会固定新生代大小,与G1的暂停目标机制二者择一)
#   -XX:SurvivorRatio     Eden与Survivor比例
#   -XX:MaxMetaspaceSize  元空间最大大小
JAVA_OPTS="
  -Xms4g
  -Xmx8g
  -XX:NewRatio=1
  -XX:SurvivorRatio=8
  -XX:MaxMetaspaceSize=512m
  -XX:CompressedClassSpaceSize=128m
"

# 垃圾回收器配置(推荐G1GC)
#   -XX:MaxGCPauseMillis                最大GC暂停时间目标
#   -XX:G1HeapRegionSize                G1堆区域大小
#   -XX:G1NewSizePercent / G1MaxNewSizePercent  新生代初始/最大占比(实验性参数,需先解锁)
#   -XX:InitiatingHeapOccupancyPercent  并发标记触发阈值
GC_OPTS="
  -XX:+UseG1GC
  -XX:MaxGCPauseMillis=200
  -XX:G1HeapRegionSize=16m
  -XX:+UnlockExperimentalVMOptions
  -XX:G1NewSizePercent=30
  -XX:G1MaxNewSizePercent=40
  -XX:G1MixedGCCountTarget=8
  -XX:InitiatingHeapOccupancyPercent=45
"

# 大规模Jenkins配置(高并发场景)
LARGE_SCALE_OPTS="
  -Xms16g
  -Xmx32g
  -XX:NewRatio=1
  -XX:SurvivorRatio=6
  -XX:MaxMetaspaceSize=1g
  -XX:+UseG1GC
  -XX:MaxGCPauseMillis=100
  -XX:G1HeapRegionSize=32m
  -XX:ParallelGCThreads=16
  -XX:ConcGCThreads=4
"

# 性能监控和调试选项
# 以下GC日志参数为JDK 8写法;JDK 9+已移除,请改用统一日志,例如:
#   -Xlog:gc*:file=/var/log/jenkins/gc.log:time:filecount=10,filesize=100m
MONITORING_OPTS="
  -XX:+PrintGCDetails
  -XX:+PrintGCTimeStamps
  -XX:+PrintGCApplicationStoppedTime
  -Xloggc:/var/log/jenkins/gc.log
  -XX:+UseGCLogFileRotation
  -XX:NumberOfGCLogFiles=10
  -XX:GCLogFileSize=100M
  -XX:+HeapDumpOnOutOfMemoryError
  -XX:HeapDumpPath=/var/log/jenkins/heapdump.hprof
"

# JIT编译器优化
#   -XX:ReservedCodeCacheSize  代码缓存大小(Jenkins插件较多时建议适当调大)
JIT_OPTS="
  -XX:+TieredCompilation
  -XX:TieredStopAtLevel=4
  -XX:CompileThreshold=10000
  -XX:+UseCodeCacheFlushing
  -XX:ReservedCodeCacheSize=256m
"

# 网络和I/O优化
# 注意:清空CSP会放宽Jenkins的内容安全策略,存在安全风险,仅限受信环境使用
NETWORK_OPTS="
  -Djava.net.preferIPv4Stack=true
  -Djava.awt.headless=true
  -Dfile.encoding=UTF-8
  -Dsun.jnu.encoding=UTF-8
  -Dhudson.model.DirectoryBrowserSupport.CSP=
  -Djenkins.install.runSetupWizard=false
"

# 组合所有选项
export JAVA_OPTS="$JAVA_OPTS $GC_OPTS $MONITORING_OPTS $JIT_OPTS $NETWORK_OPTS"

# 启动Jenkins
java $JAVA_OPTS -jar jenkins.war --httpPort=8080

内存分析脚本:

#!/bin/bash
# jenkins_memory_analysis.sh

JENKINS_PID=$(pgrep -f jenkins.war)

if [ -z "$JENKINS_PID" ]; then
    echo "Jenkins进程未找到"
    exit 1
fi

echo "=== Jenkins内存分析报告 ==="
echo "时间: $(date)"
echo "PID: $JENKINS_PID"
echo

# 基本内存信息
echo "=== JVM内存相关启动参数 ==="
jcmd $JENKINS_PID VM.flags | tr ' ' '\n' | grep -E "(HeapSize|Metaspace|CodeCache)"
echo

# 堆内存使用情况(GC.heap_info需JDK 9+;JDK 8可退回jmap -heap)
echo "=== 堆内存使用情况 ==="
jcmd $JENKINS_PID GC.heap_info 2>/dev/null || jmap -heap $JENKINS_PID
echo

# GC统计信息
echo "=== GC统计信息 ==="
jstat -gc $JENKINS_PID
echo

# 类加载统计
echo "=== 类加载统计 ==="
jstat -class $JENKINS_PID
echo

# 编译统计
echo "=== JIT编译统计 ==="
jstat -compiler $JENKINS_PID
echo

# 生成堆转储(可选)
read -p "是否生成堆转储文件?(y/N): " -n 1 -r
echo
if [[ $REPLY =~ ^[Yy]$ ]]; then
    DUMP_FILE="/tmp/jenkins_heapdump_$(date +%Y%m%d_%H%M%S).hprof"
    echo "生成堆转储文件: $DUMP_FILE"
    # 默认只转储存活对象,会先触发一次Full GC,生产环境请在低峰期执行
    jcmd $JENKINS_PID GC.heap_dump "$DUMP_FILE"
fi

# 内存使用趋势分析
echo "=== 内存使用趋势(最近10次采样) ==="
for i in {1..10}; do
    echo "采样 $i:"
    jstat -gc $JENKINS_PID | tail -1
    sleep 5
done
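上述采样得到的jstat -gc输出也可以离线解析。下面的Python函数按JDK 8的列顺序计算堆已用/已提交容量,与监控脚本中的awk算式一致(列顺序随JDK版本变化,使用前请核对表头):

```python
def parse_jstat_gc(line):
    """解析一行 jstat -gc 数值输出(JDK 8列顺序),返回堆已用/已提交容量(MB)。
    列顺序: S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
    """
    f = [float(x) for x in line.split()]
    used_kb = f[2] + f[3] + f[5] + f[7]       # S0U + S1U + EU + OU
    committed_kb = f[0] + f[1] + f[4] + f[6]  # S0C + S1C + EC + OC
    return {'used_mb': used_kb / 1024, 'committed_mb': committed_kb / 1024}

if __name__ == '__main__':
    # 虚构的一行jstat -gc输出(容量单位KB)
    sample = ("1024.0 1024.0 0.0 512.0 8192.0 4096.0 20480.0 10240.0 "
              "51200.0 49152.0 6144.0 5632.0 120 1.5 4 0.8 2.3")
    print(parse_jstat_gc(sample))
```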

垃圾回收优化

G1GC调优配置:

# G1GC详细配置
# 注意:引号内不能写行内注释,否则会被当作参数传给java;
# G1NewSizePercent、G1MixedGCLiveThresholdPercent、G1HeapWastePercent、
# G1OldCSetRegionThreshold均为实验性参数,需先用UnlockExperimentalVMOptions解锁;
# UseStringDeduplication需Java 8u20+
G1_TUNING_OPTS="
  -XX:+UseG1GC
  -XX:MaxGCPauseMillis=200
  -XX:+UnlockExperimentalVMOptions
  -XX:G1HeapRegionSize=16m
  -XX:G1NewSizePercent=20
  -XX:G1MaxNewSizePercent=30
  -XX:InitiatingHeapOccupancyPercent=45
  -XX:G1MixedGCLiveThresholdPercent=85
  -XX:G1HeapWastePercent=5
  -XX:G1MixedGCCountTarget=8
  -XX:G1OldCSetRegionThreshold=10
  -XX:ConcGCThreads=4
  -XX:ParallelGCThreads=16
  -XX:+UseStringDeduplication
  -XX:G1ReservePercent=10
"

# GC日志详细配置(JDK 8写法;JDK 9+请改用 -Xlog:gc* 统一日志)
GC_LOGGING_OPTS="
  -Xloggc:/var/log/jenkins/gc-%t.log
  -XX:+UseGCLogFileRotation
  -XX:NumberOfGCLogFiles=10
  -XX:GCLogFileSize=100M
  -XX:+PrintGC
  -XX:+PrintGCDetails
  -XX:+PrintGCTimeStamps
  -XX:+PrintGCDateStamps
  -XX:+PrintGCApplicationStoppedTime
  -XX:+PrintGCApplicationConcurrentTime
  -XX:+PrintStringDeduplicationStatistics
"

GC分析脚本:

#!/usr/bin/env python3
# gc_analysis.py

import re
import sys
from datetime import datetime
from collections import defaultdict

class GCAnalyzer:
    def __init__(self, log_file):
        self.log_file = log_file
        self.gc_events = []
        self.pause_times = []
        self.heap_usage = []
        
    def parse_gc_log(self):
        """解析GC日志文件"""
        with open(self.log_file, 'r') as f:
            for line in f:
                self._parse_line(line.strip())
    
    def _parse_line(self, line):
        """解析单行GC日志"""
        # 解析G1GC暂停时间
        pause_pattern = r'\[GC pause.*?([0-9.]+) secs\]'
        pause_match = re.search(pause_pattern, line)
        if pause_match:
            pause_time = float(pause_match.group(1)) * 1000  # 转换为毫秒
            self.pause_times.append(pause_time)
        
        # 解析堆使用情况
        heap_pattern = r'(\d+)M->(\d+)M\((\d+)M\)'
        heap_match = re.search(heap_pattern, line)
        if heap_match:
            before = int(heap_match.group(1))
            after = int(heap_match.group(2))
            total = int(heap_match.group(3))
            self.heap_usage.append({
                'before': before,
                'after': after,
                'total': total,
                'utilization': (after / total) * 100
            })
    
    def analyze(self):
        """分析GC性能"""
        if not self.pause_times:
            print("未找到GC暂停时间数据")
            return
        
        # 暂停时间统计
        avg_pause = sum(self.pause_times) / len(self.pause_times)
        max_pause = max(self.pause_times)
        min_pause = min(self.pause_times)
        
        # 计算百分位数
        sorted_pauses = sorted(self.pause_times)
        p95_pause = sorted_pauses[int(len(sorted_pauses) * 0.95)]
        p99_pause = sorted_pauses[int(len(sorted_pauses) * 0.99)]
        
        print("=== GC性能分析报告 ===")
        print(f"总GC次数: {len(self.pause_times)}")
        print(f"平均暂停时间: {avg_pause:.2f}ms")
        print(f"最大暂停时间: {max_pause:.2f}ms")
        print(f"最小暂停时间: {min_pause:.2f}ms")
        print(f"95%暂停时间: {p95_pause:.2f}ms")
        print(f"99%暂停时间: {p99_pause:.2f}ms")
        
        # 堆使用情况分析
        if self.heap_usage:
            avg_utilization = sum(h['utilization'] for h in self.heap_usage) / len(self.heap_usage)
            max_utilization = max(h['utilization'] for h in self.heap_usage)
            
            print(f"\n=== 堆使用情况 ===")
            print(f"平均堆使用率: {avg_utilization:.2f}%")
            print(f"最大堆使用率: {max_utilization:.2f}%")
        
        # 性能建议
        self._provide_recommendations(avg_pause, max_pause, p95_pause)
    
    def _provide_recommendations(self, avg_pause, max_pause, p95_pause):
        """提供优化建议"""
        print("\n=== 优化建议 ===")
        
        if avg_pause > 200:
            print("- 平均暂停时间超过200ms,建议检查MaxGCPauseMillis目标与堆大小是否匹配")
        
        if max_pause > 1000:
            print("- 最大暂停时间过长,建议增加堆大小或调整G1参数")
        
        if p95_pause > 500:
            print("- 95%暂停时间过长,建议优化应用代码减少对象分配")
        
        if len(self.pause_times) > 1000:
            print("- GC频率过高,建议增加堆大小")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("用法: python3 gc_analysis.py <gc_log_file>")
        sys.exit(1)
    
    analyzer = GCAnalyzer(sys.argv[1])
    analyzer.parse_gc_log()
    analyzer.analyze()

13.3 系统级优化

操作系统调优

Linux系统优化:

#!/bin/bash
# jenkins_system_tuning.sh

echo "=== Jenkins系统优化脚本 ==="

# 1. 内核参数优化
echo "配置内核参数..."
cat >> /etc/sysctl.conf << EOF
# Jenkins系统优化参数

# 网络优化
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_rmem = 4096 65536 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216

# 内存管理
vm.swappiness = 1
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
vm.vfs_cache_pressure = 50
vm.min_free_kbytes = 65536

# 文件系统
fs.file-max = 2097152
fs.nr_open = 2097152

# 进程限制
kernel.pid_max = 4194304
kernel.threads-max = 4194304
EOF

# 应用内核参数
sysctl -p

# 2. 文件描述符限制
echo "配置文件描述符限制..."
cat >> /etc/security/limits.conf << EOF
# Jenkins用户限制
jenkins soft nofile 65535
jenkins hard nofile 65535
jenkins soft nproc 32768
jenkins hard nproc 32768
jenkins soft memlock unlimited
jenkins hard memlock unlimited

# 所有用户默认限制
* soft nofile 65535
* hard nofile 65535
EOF

# 3. systemd服务限制
echo "配置systemd服务限制..."
mkdir -p /etc/systemd/system/jenkins.service.d
cat > /etc/systemd/system/jenkins.service.d/limits.conf << EOF
[Service]
LimitNOFILE=65535
LimitNPROC=32768
LimitMEMLOCK=infinity
EOF

# 4. 磁盘I/O优化
echo "优化磁盘I/O..."
# 设置I/O调度器(旧内核为deadline;启用blk-mq的新内核为mq-deadline,NVMe SSD也可用none)
echo deadline > /sys/block/sda/queue/scheduler 2>/dev/null || \
    echo mq-deadline > /sys/block/sda/queue/scheduler

# 禁用透明大页
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# 5. CPU优化
echo "优化CPU设置..."
# 设置CPU调度器
echo performance > /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# 6. 创建优化的挂载选项
echo "优化文件系统挂载选项..."
# 注意:barrier=0会禁用写屏障,断电时有数据丢失风险,仅建议配合带电池保护的RAID卡使用;
# noatime已隐含nodiratime
cat >> /etc/fstab << EOF
# Jenkins工作目录优化挂载
/dev/sdb1 /var/lib/jenkins ext4 defaults,noatime,barrier=0 0 2
EOF

echo "系统优化完成,建议重启系统使所有设置生效"

性能监控脚本:

#!/bin/bash
# jenkins_performance_monitor.sh

LOG_FILE="/var/log/jenkins/performance.log"
INTERVAL=60  # 监控间隔(秒)

# 创建日志目录
mkdir -p $(dirname $LOG_FILE)

echo "Jenkins性能监控启动,日志文件: $LOG_FILE"
echo "监控间隔: ${INTERVAL}秒"

while true; do
    TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
    
    # 获取Jenkins进程信息
    JENKINS_PID=$(pgrep -f jenkins.war)
    
    if [ -n "$JENKINS_PID" ]; then
        # CPU使用率
        CPU_USAGE=$(ps -p $JENKINS_PID -o %cpu --no-headers)
        
        # 内存使用情况
        MEMORY_INFO=$(ps -p $JENKINS_PID -o %mem,vsz,rss --no-headers)
        MEM_PERCENT=$(echo $MEMORY_INFO | awk '{print $1}')
        VSZ=$(echo $MEMORY_INFO | awk '{print $2}')
        RSS=$(echo $MEMORY_INFO | awk '{print $3}')
        
        # 文件描述符使用情况
        FD_COUNT=$(lsof -p $JENKINS_PID 2>/dev/null | wc -l)
        
        # 线程数量
        THREAD_COUNT=$(ps -p $JENKINS_PID -o nlwp --no-headers)
        
        # 系统负载
        LOAD_AVG=$(uptime | awk -F'load average:' '{print $2}' | sed 's/^[ \t]*//')
        
        # 磁盘使用情况
        DISK_USAGE=$(df -h /var/lib/jenkins | tail -1 | awk '{print $5}' | sed 's/%//')
        
        # JVM堆内存使用情况(如果jstat可用)
        if command -v jstat >/dev/null 2>&1; then
            HEAP_INFO=$(jstat -gc $JENKINS_PID | tail -1)
            HEAP_USED=$(echo $HEAP_INFO | awk '{print ($3+$4+$6+$8)/1024}' | bc -l 2>/dev/null || echo "N/A")
            HEAP_TOTAL=$(echo $HEAP_INFO | awk '{print ($1+$2+$5+$7)/1024}' | bc -l 2>/dev/null || echo "N/A")
        else
            HEAP_USED="N/A"
            HEAP_TOTAL="N/A"
        fi
        
        # 记录性能数据
        echo "$TIMESTAMP,CPU:${CPU_USAGE}%,MEM:${MEM_PERCENT}%,VSZ:${VSZ}KB,RSS:${RSS}KB,FD:${FD_COUNT},THREADS:${THREAD_COUNT},LOAD:${LOAD_AVG},DISK:${DISK_USAGE}%,HEAP_USED:${HEAP_USED}MB,HEAP_TOTAL:${HEAP_TOTAL}MB" >> $LOG_FILE
        
        # 检查性能阈值并告警
        if (( $(echo "$CPU_USAGE > 80" | bc -l) )); then
            echo "[$TIMESTAMP] 警告: CPU使用率过高 ${CPU_USAGE}%" | tee -a $LOG_FILE
        fi
        
        if (( $(echo "$MEM_PERCENT > 85" | bc -l) )); then
            echo "[$TIMESTAMP] 警告: 内存使用率过高 ${MEM_PERCENT}%" | tee -a $LOG_FILE
        fi
        
        if [ "$FD_COUNT" -gt 50000 ]; then
            echo "[$TIMESTAMP] 警告: 文件描述符使用过多 $FD_COUNT" | tee -a $LOG_FILE
        fi
        
        if [ "$DISK_USAGE" -gt 85 ]; then
            echo "[$TIMESTAMP] 警告: 磁盘使用率过高 ${DISK_USAGE}%" | tee -a $LOG_FILE
        fi
        
    else
        echo "[$TIMESTAMP] Jenkins进程未运行" >> $LOG_FILE
    fi
    
    sleep $INTERVAL
done
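监控脚本产出的性能日志可以用下面的Python脚本做离线汇总(示意实现;注意LOAD字段本身含逗号,因此用正则提取而非按逗号切分):

```python
import re

def summarize_perf_log(lines):
    """从性能日志行中提取CPU与内存占用并计算均值/峰值。"""
    cpu, mem = [], []
    for line in lines:
        m = re.search(r'CPU:([\d.]+)%,MEM:([\d.]+)%', line)
        if m:
            cpu.append(float(m.group(1)))
            mem.append(float(m.group(2)))
    if not cpu:
        return None
    return {
        'samples': len(cpu),
        'cpu_avg': sum(cpu) / len(cpu),
        'cpu_max': max(cpu),
        'mem_avg': sum(mem) / len(mem),
        'mem_max': max(mem),
    }

if __name__ == '__main__':
    with open('/var/log/jenkins/performance.log') as f:
        print(summarize_perf_log(f.readlines()))
```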

存储优化

磁盘配置优化:

#!/bin/bash
# jenkins_storage_optimization.sh

echo "=== Jenkins存储优化 ==="

# 1. 创建优化的文件系统结构
echo "创建优化的目录结构..."

# Jenkins主目录
JENKINS_HOME="/var/lib/jenkins"

# 分离不同类型的数据
mkdir -p $JENKINS_HOME/{jobs,workspace,logs,plugins,tools,secrets}
mkdir -p /var/cache/jenkins/{builds,artifacts}
mkdir -p /tmp/jenkins/{workspace,builds}

# 2. 配置tmpfs用于临时文件
echo "配置tmpfs..."
cat >> /etc/fstab << EOF
# Jenkins临时文件系统
tmpfs /tmp/jenkins tmpfs defaults,size=4G,mode=1777 0 0
EOF

# 3. 设置合适的文件权限
echo "设置文件权限..."
chown -R jenkins:jenkins $JENKINS_HOME
chown -R jenkins:jenkins /var/cache/jenkins
chown -R jenkins:jenkins /tmp/jenkins

# 4. 配置日志轮转
echo "配置日志轮转..."
cat > /etc/logrotate.d/jenkins << EOF
/var/lib/jenkins/logs/*.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    copytruncate
    su jenkins jenkins
}

/var/log/jenkins/*.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    copytruncate
    su jenkins jenkins
}
EOF

# 5. 创建清理脚本
cat > /usr/local/bin/jenkins_cleanup.sh << 'EOF'
#!/bin/bash
# Jenkins存储清理脚本

JENKINS_HOME="/var/lib/jenkins"
RETENTION_DAYS=30
WORKSPACE_RETENTION_DAYS=7

echo "开始Jenkins存储清理..."

# 清理旧的构建日志
echo "清理构建日志..."
find $JENKINS_HOME/jobs/*/builds/*/log -type f -mtime +$RETENTION_DAYS -delete

# 清理旧的工作空间
echo "清理工作空间..."
find $JENKINS_HOME/workspace/* -type d -mtime +$WORKSPACE_RETENTION_DAYS -exec rm -rf {} + 2>/dev/null

# 清理临时文件
echo "清理临时文件..."
find /tmp/jenkins -type f -mtime +1 -delete
find /var/cache/jenkins -type f -mtime +$RETENTION_DAYS -delete

# 清理旧的插件缓存
echo "清理插件缓存..."
find $JENKINS_HOME/plugins -name "*.tmp" -delete
find $JENKINS_HOME/plugins -name "*.bak" -mtime +7 -delete

# 压缩旧的构建产物
echo "压缩构建产物..."
find $JENKINS_HOME/jobs/*/builds/*/archive -type f -name "*.jar" -mtime +7 ! -name "*.gz" -exec gzip {} \;

# 统计清理结果
echo "清理完成,当前磁盘使用情况:"
df -h $JENKINS_HOME

echo "Jenkins目录大小:"
du -sh $JENKINS_HOME
EOF

chmod +x /usr/local/bin/jenkins_cleanup.sh

# 6. 设置定时清理任务
echo "设置定时清理任务..."
cat > /etc/cron.d/jenkins-cleanup << EOF
# Jenkins存储清理任务
0 2 * * * jenkins /usr/local/bin/jenkins_cleanup.sh >> /var/log/jenkins/cleanup.log 2>&1
EOF

echo "存储优化配置完成"

存储监控脚本:

#!/usr/bin/env python3
# jenkins_storage_monitor.py

import os
import sys
import json
import time
from datetime import datetime
from pathlib import Path

class StorageMonitor:
    def __init__(self, jenkins_home='/var/lib/jenkins'):
        self.jenkins_home = Path(jenkins_home)
        self.report_file = '/var/log/jenkins/storage_report.json'
        
    def get_directory_size(self, path):
        """获取目录大小"""
        total_size = 0
        try:
            for dirpath, dirnames, filenames in os.walk(path):
                for filename in filenames:
                    filepath = os.path.join(dirpath, filename)
                    try:
                        total_size += os.path.getsize(filepath)
                    except (OSError, IOError):
                        continue
        except (OSError, IOError):
            pass
        return total_size
    
    def get_disk_usage(self, path):
        """获取磁盘使用情况"""
        try:
            statvfs = os.statvfs(path)
            total = statvfs.f_frsize * statvfs.f_blocks
            free = statvfs.f_frsize * statvfs.f_bavail  # os.statvfs无f_available属性,非特权用户可用块数为f_bavail
            used = total - free
            return {
                'total': total,
                'used': used,
                'free': free,
                'usage_percent': (used / total) * 100 if total > 0 else 0
            }
        except OSError:
            return None
    
    def analyze_jenkins_storage(self):
        """分析Jenkins存储使用情况"""
        report = {
            'timestamp': datetime.now().isoformat(),
            'jenkins_home': str(self.jenkins_home),
            'directories': {},
            'disk_usage': {},
            'recommendations': []
        }
        
        # 分析各个目录的大小
        directories_to_check = [
            'jobs',
            'workspace',
            'plugins',
            'logs',
            'tools',
            'secrets',
            'userContent',
            'war'
        ]
        
        total_jenkins_size = 0
        for dir_name in directories_to_check:
            dir_path = self.jenkins_home / dir_name
            if dir_path.exists():
                size = self.get_directory_size(dir_path)
                total_jenkins_size += size
                report['directories'][dir_name] = {
                    'size_bytes': size,
                    'size_mb': size / (1024 * 1024),
                    'size_gb': size / (1024 * 1024 * 1024)
                }
        
        report['total_jenkins_size'] = {
            'size_bytes': total_jenkins_size,
            'size_mb': total_jenkins_size / (1024 * 1024),
            'size_gb': total_jenkins_size / (1024 * 1024 * 1024)
        }
        
        # 获取磁盘使用情况
        disk_usage = self.get_disk_usage(self.jenkins_home)
        if disk_usage:
            report['disk_usage'] = {
                'total_gb': disk_usage['total'] / (1024 * 1024 * 1024),
                'used_gb': disk_usage['used'] / (1024 * 1024 * 1024),
                'free_gb': disk_usage['free'] / (1024 * 1024 * 1024),
                'usage_percent': disk_usage['usage_percent']
            }
        
        # 生成建议
        self._generate_recommendations(report)
        
        return report
    
    def _generate_recommendations(self, report):
        """生成优化建议"""
        recommendations = []
        
        # 检查磁盘使用率
        if 'disk_usage' in report and report['disk_usage']['usage_percent'] > 85:
            recommendations.append({
                'type': 'critical',
                'message': f"磁盘使用率过高 ({report['disk_usage']['usage_percent']:.1f}%),需要立即清理"
            })
        
        # 检查各目录大小
        if 'directories' in report:
            # 检查workspace目录
            if 'workspace' in report['directories']:
                workspace_size_gb = report['directories']['workspace']['size_gb']
                if workspace_size_gb > 10:
                    recommendations.append({
                        'type': 'warning',
                        'message': f"workspace目录过大 ({workspace_size_gb:.1f}GB),建议清理旧的工作空间"
                    })
            
            # 检查jobs目录
            if 'jobs' in report['directories']:
                jobs_size_gb = report['directories']['jobs']['size_gb']
                if jobs_size_gb > 20:
                    recommendations.append({
                        'type': 'warning',
                        'message': f"jobs目录过大 ({jobs_size_gb:.1f}GB),建议清理旧的构建记录"
                    })
            
            # 检查logs目录
            if 'logs' in report['directories']:
                logs_size_gb = report['directories']['logs']['size_gb']
                if logs_size_gb > 5:
                    recommendations.append({
                        'type': 'info',
                        'message': f"logs目录较大 ({logs_size_gb:.1f}GB),建议配置日志轮转"
                    })
        
        report['recommendations'] = recommendations
    
    def save_report(self, report):
        """保存报告到文件"""
        os.makedirs(os.path.dirname(self.report_file), exist_ok=True)
        with open(self.report_file, 'w') as f:
            json.dump(report, f, indent=2)
    
    def print_report(self, report):
        """打印报告"""
        print("=== Jenkins存储分析报告 ===")
        print(f"时间: {report['timestamp']}")
        print(f"Jenkins主目录: {report['jenkins_home']}")
        print()
        
        # 总体使用情况
        if 'total_jenkins_size' in report:
            total_size = report['total_jenkins_size']
            print(f"Jenkins总大小: {total_size['size_gb']:.2f} GB")
        
        if 'disk_usage' in report:
            disk = report['disk_usage']
            print(f"磁盘使用情况: {disk['used_gb']:.1f}GB / {disk['total_gb']:.1f}GB ({disk['usage_percent']:.1f}%)")
        
        print()
        
        # 目录详情
        print("=== 目录大小详情 ===")
        if 'directories' in report:
            for dir_name, info in sorted(report['directories'].items(), 
                                       key=lambda x: x[1]['size_gb'], reverse=True):
                print(f"{dir_name:15}: {info['size_gb']:8.2f} GB")
        
        print()
        
        # 建议
        if 'recommendations' in report and report['recommendations']:
            print("=== 优化建议 ===")
            for rec in report['recommendations']:
                icon = {'critical': '🚨', 'warning': '⚠️', 'info': 'ℹ️'}.get(rec['type'], '')
                print(f"{icon} {rec['message']}")
        else:
            print("✅ 存储使用情况良好,无需特别优化")
    
    def run(self):
        """运行存储监控"""
        report = self.analyze_jenkins_storage()
        self.save_report(report)
        self.print_report(report)
        return report

if __name__ == '__main__':
    jenkins_home = sys.argv[1] if len(sys.argv) > 1 else '/var/lib/jenkins'
    monitor = StorageMonitor(jenkins_home)
    monitor.run()

13.4 构建优化

Pipeline性能优化

并行化优化策略:

// 高性能Pipeline示例
pipeline {
    agent none
    
    options {
        // 构建保留策略
        buildDiscarder(logRotator(
            numToKeepStr: '10',
            daysToKeepStr: '30',
            artifactNumToKeepStr: '5'
        ))
        
        // 超时设置
        timeout(time: 30, unit: 'MINUTES')
        
        // 禁用并发构建
        disableConcurrentBuilds()
        
        // 跳过默认检出
        skipDefaultCheckout()
    }
    
    environment {
        // 优化环境变量
        MAVEN_OPTS = '-Xmx2g -XX:+UseG1GC -Dmaven.repo.local=/var/cache/maven'
        GRADLE_OPTS = '-Xmx2g -XX:+UseG1GC -Dorg.gradle.daemon=false'
        DOCKER_BUILDKIT = '1'
    }
    
    stages {
        stage('Checkout & Build') {
            agent { label 'fast-ssd' }
            steps {
                // 设置了skipDefaultCheckout,需显式检出代码
                checkout scm
                // Maven多线程并行构建
                sh 'mvn clean compile -T 4'
            }
        }
    }
}
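上面的示例主要展示了options与环境变量层面的优化;本节标题所说的并行化可以用声明式Pipeline的parallel块实现。下面是一个示意片段(节点标签test、build与spotbugs:check命令均为假设,请替换为实际环境中的标签与检查命令):

```groovy
pipeline {
    agent none
    stages {
        stage('Parallel QA') {
            // 单元测试与静态分析互不依赖,并行执行以缩短总时长
            parallel {
                stage('单元测试') {
                    agent { label 'test' }
                    steps {
                        sh 'mvn test -T 2'
                    }
                }
                stage('静态分析') {
                    agent { label 'build' }
                    steps {
                        sh 'mvn spotbugs:check'
                    }
                }
            }
        }
    }
}
```

并行分支各自占用一个执行器,并行度应与节点池容量匹配,否则只是把等待从串行执行转移到了队列。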

资源池管理:

// 资源池管理脚本(示意实现:适合放在共享库中使用;
// 沙箱化Pipeline中Thread.start与Jenkins.instance会受脚本安全限制)
class ResourcePoolManager {
    def jenkins = Jenkins.instance
    def pools = [:]
    
    def initializePools() {
        pools['build'] = [
            maxConcurrent: 10,
            current: 0,
            queue: [],
            nodes: ['build-1', 'build-2', 'build-3']
        ]
        
        pools['test'] = [
            maxConcurrent: 5,
            current: 0,
            queue: [],
            nodes: ['test-1', 'test-2']
        ]
        
        pools['deploy'] = [
            maxConcurrent: 2,
            current: 0,
            queue: [],
            nodes: ['deploy-1']
        ]
    }
    
    def requestResource(String poolName, Closure task) {
        def pool = pools[poolName]
        
        if (pool.current < pool.maxConcurrent) {
            pool.current++
            try {
                task()
            } finally {
                pool.current--
                processQueue(poolName)
            }
        } else {
            pool.queue.add(task)
            println "任务已加入${poolName}队列,当前队列长度: ${pool.queue.size()}"  // 类方法中不能直接调用echo步骤
        }
    }
    
    def processQueue(String poolName) {
        def pool = pools[poolName]
        
        if (pool.queue.size() > 0 && pool.current < pool.maxConcurrent) {
            def nextTask = pool.queue.remove(0)
            pool.current++
            
            // 异步执行下一个任务
            Thread.start {
                try {
                    nextTask()
                } finally {
                    pool.current--
                    processQueue(poolName)
                }
            }
        }
    }
}

// 使用示例
def resourceManager = new ResourcePoolManager()
resourceManager.initializePools()

pipeline {
    agent none
    
    stages {
        stage('Build') {
            steps {
                script {
                    resourceManager.requestResource('build') {
                        node('build') {
                            sh 'mvn clean package'
                        }
                    }
                }
            }
        }
    }
}

13.5 网络优化

带宽优化

网络配置优化:

#!/bin/bash
# jenkins_network_optimization.sh

echo "=== Jenkins网络优化配置 ==="

# 1. TCP优化
echo "配置TCP参数..."
cat >> /etc/sysctl.conf << EOF
# Jenkins网络优化
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_moderate_rcvbuf = 1
# BBR拥塞控制需要Linux 4.9+内核
net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq

# 连接跟踪优化(需已加载nf_conntrack模块,否则sysctl -p时此两项会报错)
net.netfilter.nf_conntrack_max = 1048576
net.netfilter.nf_conntrack_tcp_timeout_established = 7200
EOF

sysctl -p

# 2. 配置Jenkins反向代理
echo "配置Nginx反向代理..."
cat > /etc/nginx/sites-available/jenkins << 'EOF'
upstream jenkins {
    server 127.0.0.1:8080 fail_timeout=0;
}

server {
    listen 80;
    server_name jenkins.company.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name jenkins.company.com;
    
    # SSL配置
    ssl_certificate /etc/ssl/certs/jenkins.crt;
    ssl_certificate_key /etc/ssl/private/jenkins.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512;
    ssl_prefer_server_ciphers off;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;
    
    # 性能优化
    client_max_body_size 100M;
    client_body_timeout 60s;
    client_header_timeout 60s;
    keepalive_timeout 65s;
    send_timeout 60s;
    
    # 压缩配置
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/json
        application/javascript
        application/xml+rss
        application/atom+xml
        image/svg+xml;
    
    # 缓存配置
    location ~* \.(css|js|png|jpg|jpeg|gif|ico|svg)$ {
        expires 1y;
        add_header Cache-Control "public, immutable";
        access_log off;
    }
    
    # Jenkins代理配置
    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        
        proxy_pass http://jenkins;
        proxy_read_timeout 90s;
        proxy_redirect http://jenkins https://jenkins.company.com;
        
        # WebSocket支持
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        
        # 缓冲优化
        proxy_buffering on;
        proxy_buffer_size 128k;
        proxy_buffers 4 256k;
        proxy_busy_buffers_size 256k;
    }
    
    # 健康检查
    location /health {
        access_log off;
        return 200 "healthy\n";
        add_header Content-Type text/plain;
    }
}
EOF

# 启用站点
ln -sf /etc/nginx/sites-available/jenkins /etc/nginx/sites-enabled/
nginx -t && systemctl reload nginx

echo "网络优化配置完成"
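上面Nginx配置中gzip_comp_level设为6,对文本类响应(HTML/CSS/JS/JSON)通常有明显收益。可以用下面的Python小脚本(压缩级别与Nginx保持一致)粗略估算某类响应体的压缩率:

```python
import gzip

def gzip_ratio(data: bytes, level: int = 6) -> dict:
    """按指定级别gzip压缩,返回压缩前后大小与压缩率(压缩后/压缩前,百分比)。"""
    compressed = gzip.compress(data, compresslevel=level)
    return {
        'original': len(data),
        'compressed': len(compressed),
        'ratio_percent': round(len(compressed) / len(data) * 100, 1),
    }

if __name__ == '__main__':
    # 用一段重复的JSON模拟典型API响应体(虚构数据)
    sample = b'{"job": "build", "status": "SUCCESS"}' * 100
    print(gzip_ratio(sample))
```

图片等已压缩的二进制资源收益很小,这也是配置中gzip_types只列出文本类型的原因。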

CDN配置:

# cloudfront-jenkins.yml
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Jenkins CDN配置'

Parameters:
  JenkinsOrigin:
    Type: String
    Default: 'jenkins.company.com'
    Description: 'Jenkins服务器域名'

Resources:
  JenkinsCDN:
    Type: AWS::CloudFront::Distribution
    Properties:
      DistributionConfig:
        Enabled: true
        Comment: 'Jenkins CDN Distribution'
        
        Origins:
          - Id: jenkins-origin
            DomainName: !Ref JenkinsOrigin
            CustomOriginConfig:
              HTTPPort: 443
              HTTPSPort: 443
              OriginProtocolPolicy: https-only
              OriginSSLProtocols:
                - TLSv1.2
        
        DefaultCacheBehavior:
          TargetOriginId: jenkins-origin
          ViewerProtocolPolicy: redirect-to-https
          AllowedMethods:
            - GET
            - HEAD
            - OPTIONS
            - PUT
            - POST
            - PATCH
            - DELETE
          CachedMethods:
            - GET
            - HEAD
            - OPTIONS
          Compress: true
          ForwardedValues:
            QueryString: true
            Headers:
              - Authorization
              - Host
              - X-Forwarded-For
              - X-Forwarded-Proto
            Cookies:
              Forward: all
          DefaultTTL: 0
          MaxTTL: 31536000
          MinTTL: 0
        
        CacheBehaviors:
          # 静态资源缓存
          - PathPattern: '*.css'
            TargetOriginId: jenkins-origin
            ViewerProtocolPolicy: redirect-to-https
            AllowedMethods: [GET, HEAD]
            CachedMethods: [GET, HEAD]
            Compress: true
            ForwardedValues:
              QueryString: false
              Headers: []
            DefaultTTL: 86400
            MaxTTL: 31536000
            MinTTL: 0
          
          - PathPattern: '*.js'
            TargetOriginId: jenkins-origin
            ViewerProtocolPolicy: redirect-to-https
            AllowedMethods: [GET, HEAD]
            CachedMethods: [GET, HEAD]
            Compress: true
            ForwardedValues:
              QueryString: false
              Headers: []
            DefaultTTL: 86400
            MaxTTL: 31536000
            MinTTL: 0
          
          - PathPattern: '*.png'
            TargetOriginId: jenkins-origin
            ViewerProtocolPolicy: redirect-to-https
            AllowedMethods: [GET, HEAD]
            CachedMethods: [GET, HEAD]
            Compress: false
            ForwardedValues:
              QueryString: false
              Headers: []
            DefaultTTL: 2592000
            MaxTTL: 31536000
            MinTTL: 0
        
        PriceClass: PriceClass_100
        ViewerCertificate:
          AcmCertificateArn: !Ref SSLCertificate
          SslSupportMethod: sni-only
          MinimumProtocolVersion: TLSv1.2_2021
  
  SSLCertificate:
    Type: AWS::CertificateManager::Certificate
    Properties:
      DomainName: !Sub 'cdn.${JenkinsOrigin}'
      ValidationMethod: DNS

Outputs:
  CDNDomainName:
    Description: 'CloudFront域名'
    Value: !GetAtt JenkinsCDN.DomainName
    Export:
      Name: !Sub '${AWS::StackName}-CDN-Domain'

连接优化

连接池配置:

// Jenkins系统配置脚本
import jenkins.model.Jenkins
import org.jenkinsci.plugins.workflow.libs.GlobalLibraries
import org.jenkinsci.plugins.workflow.libs.LibraryConfiguration
import org.jenkinsci.plugins.workflow.libs.SCMSourceRetriever
import jenkins.plugins.git.GitSCMSource

// HTTP连接池优化
System.setProperty('hudson.model.ParametersAction.keepUndefinedParameters', 'true')
System.setProperty('hudson.model.DirectoryBrowserSupport.CSP', '')
System.setProperty('jenkins.model.Jenkins.slaveAgentPort', '50000')
System.setProperty('jenkins.model.Jenkins.slaveAgentPortEnforce', 'true')

// 网络超时设置
System.setProperty('hudson.remoting.Launcher.pingIntervalSec', '300')
System.setProperty('hudson.remoting.Launcher.pingTimeoutSec', '60')
System.setProperty('hudson.slaves.ChannelPinger.pingInterval', '5')
System.setProperty('hudson.slaves.ChannelPinger.pingTimeout', '10')

// Git连接优化
System.setProperty('org.jenkinsci.plugins.gitclient.Git.timeOut', '30')
System.setProperty('hudson.plugins.git.GitSCM.ALLOW_LOCAL_CHECKOUT', 'true')

// HTTP客户端优化
System.setProperty('hudson.ProxyConfiguration.DEFAULT_CONNECT_TIMEOUT_MILLIS', '20000')
System.setProperty('hudson.ProxyConfiguration.DEFAULT_READ_TIMEOUT_MILLIS', '60000')

println "网络连接优化配置完成"
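需要注意:其中不少属性(如Agent端口、ChannelPinger间隔)是在Jenkins启动时读取的,在Script Console中动态设置未必生效;更可靠的做法是以 `-D` 参数在启动时传入。下面是一个示意配置(文件路径随发行版与部署方式不同而变化,属于假设):

```shell
# /etc/default/jenkins 或 systemd drop-in 中追加JVM参数(示例)
JAVA_ARGS="$JAVA_ARGS \
  -Dhudson.slaves.ChannelPinger.pingInterval=5 \
  -Dhudson.slaves.ChannelPinger.pingTimeout=10 \
  -Dhudson.remoting.Launcher.pingIntervalSec=300 \
  -Dhudson.ProxyConfiguration.DEFAULT_CONNECT_TIMEOUT_MILLIS=20000"
```

修改后需重启Jenkins服务才会生效。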

13.6 监控和调优工具

性能分析工具

JProfiler集成脚本:

#!/bin/bash
# jenkins_profiling.sh

JPROFILER_HOME="/opt/jprofiler"
JENKINS_PID=$(pgrep -f jenkins.war)

if [ -z "$JENKINS_PID" ]; then
    echo "Jenkins进程未找到"
    exit 1
fi

echo "=== Jenkins性能分析 ==="
echo "Jenkins PID: $JENKINS_PID"

# 1. 启动JProfiler代理
echo "启动JProfiler代理..."
$JPROFILER_HOME/bin/jpenable --pid=$JENKINS_PID --port=8849

# 2. 生成堆转储
echo "生成堆转储..."
HEAP_DUMP_FILE="/tmp/jenkins_heap_$(date +%Y%m%d_%H%M%S).hprof"
jcmd $JENKINS_PID GC.heap_dump "$HEAP_DUMP_FILE"
jhsdb jmap --heap --pid $JENKINS_PID > "${HEAP_DUMP_FILE}.txt"

# 3. 线程转储
echo "生成线程转储..."
THREAD_DUMP_FILE="/tmp/jenkins_threads_$(date +%Y%m%d_%H%M%S).txt"
jstack $JENKINS_PID > $THREAD_DUMP_FILE

# 4. GC分析
echo "分析GC日志..."
GC_LOG_FILE="/var/log/jenkins/gc.log"
if [ -f "$GC_LOG_FILE" ]; then
    # 使用GCViewer分析GC日志(GCViewer为独立工具,需另行安装,路径为示例)
    java -jar /opt/gcviewer/gcviewer.jar "$GC_LOG_FILE"
fi

# 5. 性能报告
echo "生成性能报告..."
cat > "/tmp/jenkins_performance_report_$(date +%Y%m%d_%H%M%S).txt" << EOF
Jenkins性能分析报告
生成时间: $(date)
Jenkins PID: $JENKINS_PID

=== 系统信息 ===
$(uname -a)

=== CPU信息 ===
$(lscpu)

=== 内存信息 ===
$(free -h)

=== 磁盘信息 ===
$(df -h)

=== 网络连接 ===
$(netstat -an | grep :8080)

=== Java进程信息 ===
$(ps -p $JENKINS_PID -o pid,ppid,cmd,%mem,%cpu,etime)

=== JVM信息 ===
$(jcmd $JENKINS_PID VM.info)

=== 类加载统计 ===
$(jstat -class $JENKINS_PID)

=== 编译统计 ===
$(jstat -compiler $JENKINS_PID)

=== GC统计 ===
$(jstat -gc $JENKINS_PID)
EOF

echo "性能分析完成,文件保存在 /tmp/ 目录"
ls -la /tmp/jenkins_*
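上述脚本依赖GCViewer做图形化分析;当只需快速汇总停顿时间时,一段简单的shell管道就够用。下面是一个示意脚本,假设GC日志采用JDK 9+的统一日志格式(`-Xlog:gc*`),停顿记录行以毫秒耗时结尾,日志路径仅为示例:

```shell
#!/bin/sh
# 汇总GC停顿时间:次数、平均值、最大值
# 统一日志中的停顿行形如:
# [1.2s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 24M->12M(256M) 3.500ms
LOG="${1:-/var/log/jenkins/gc.log}"

grep -oE '[0-9]+\.[0-9]+ms$' "$LOG" | sed 's/ms$//' |
awk '{ sum += $1; if ($1 > max) max = $1; n++ }
     END { if (n) printf "pauses=%d avg=%.2fms max=%.2fms\n", n, sum/n, max
           else print "no pause records found" }'
```

若平均停顿时间持续上升,可结合报告中的jstat GC统计进一步定位原因。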

自动化调优脚本:

#!/usr/bin/env python3
# jenkins_auto_tuning.py

import subprocess
from datetime import datetime

class JenkinsAutoTuner:
    def __init__(self):
        self.jenkins_home = '/var/lib/jenkins'
        self.config_file = '/etc/jenkins/tuning.json'
        self.metrics = {}
        
    def collect_metrics(self):
        """收集性能指标"""
        # 获取Jenkins进程信息
        jenkins_pid = self._get_jenkins_pid()
        if not jenkins_pid:
            return False
            
        # CPU使用率
        cpu_usage = self._get_cpu_usage(jenkins_pid)
        
        # 内存使用情况
        memory_info = self._get_memory_info(jenkins_pid)
        
        # GC信息
        gc_info = self._get_gc_info(jenkins_pid)
        
        # 响应时间
        response_time = self._get_response_time()
        
        # 构建队列长度
        queue_length = self._get_queue_length()
        
        self.metrics = {
            'timestamp': datetime.now().isoformat(),
            'cpu_usage': cpu_usage,
            'memory': memory_info,
            'gc': gc_info,
            'response_time': response_time,
            'queue_length': queue_length
        }
        
        return True
    
    def analyze_performance(self):
        """分析性能并生成调优建议"""
        recommendations = []
        
        # CPU分析
        if self.metrics['cpu_usage'] > 80:
            recommendations.append({
                'type': 'cpu',
                'severity': 'high',
                'message': 'CPU使用率过高,建议增加执行器或优化构建脚本',
                'action': 'increase_executors'
            })
        
        # 内存分析
        heap_usage = self.metrics['memory'].get('heap_usage_percent', 0)
        if heap_usage > 85:
            recommendations.append({
                'type': 'memory',
                'severity': 'high',
                'message': '堆内存使用率过高,建议增加堆大小',
                'action': 'increase_heap_size'
            })
        
        # GC分析
        gc_time_percent = self.metrics['gc'].get('time_percent', 0)
        if gc_time_percent > 5:
            recommendations.append({
                'type': 'gc',
                'severity': 'medium',
                'message': 'GC时间占比过高,建议调整GC参数',
                'action': 'tune_gc_parameters'
            })
        
        # 响应时间分析
        if self.metrics['response_time'] > 5000:  # 5秒
            recommendations.append({
                'type': 'response',
                'severity': 'medium',
                'message': '响应时间过长,建议优化插件或增加资源',
                'action': 'optimize_plugins'
            })
        
        # 队列分析
        if self.metrics['queue_length'] > 20:
            recommendations.append({
                'type': 'queue',
                'severity': 'medium',
                'message': '构建队列过长,建议增加构建节点',
                'action': 'add_build_nodes'
            })
        
        return recommendations
    
    def apply_tuning(self, recommendations):
        """应用调优建议"""
        applied_changes = []
        
        for rec in recommendations:
            if rec['action'] == 'increase_heap_size':
                if self._increase_heap_size():
                    applied_changes.append('增加堆内存大小')
            
            elif rec['action'] == 'tune_gc_parameters':
                if self._tune_gc_parameters():
                    applied_changes.append('优化GC参数')
            
            elif rec['action'] == 'increase_executors':
                if self._increase_executors():
                    applied_changes.append('增加执行器数量')
        
        return applied_changes
    
    def _get_jenkins_pid(self):
        """获取Jenkins进程ID"""
        try:
            result = subprocess.run(['pgrep', '-f', 'jenkins.war'], 
                                  capture_output=True, text=True)
            return result.stdout.strip() if result.returncode == 0 else None
        except Exception:
            return None
    
    def _get_cpu_usage(self, pid):
        """获取CPU使用率"""
        try:
            result = subprocess.run(['ps', '-p', pid, '-o', '%cpu', '--no-headers'],
                                  capture_output=True, text=True)
            return float(result.stdout.strip()) if result.returncode == 0 else 0
        except Exception:
            return 0
    
    def _get_memory_info(self, pid):
        """获取内存信息"""
        try:
            # 获取进程内存使用
            ps_result = subprocess.run(['ps', '-p', pid, '-o', '%mem,vsz,rss', '--no-headers'],
                                     capture_output=True, text=True)
            mem_percent, vsz, rss = ps_result.stdout.strip().split()
            
            # 获取JVM堆信息
            jstat_result = subprocess.run(['jstat', '-gc', pid],
                                        capture_output=True, text=True)
            gc_data = jstat_result.stdout.strip().split('\n')[-1].split()
            
            heap_used = float(gc_data[2]) + float(gc_data[3]) + float(gc_data[5]) + float(gc_data[7])
            heap_total = float(gc_data[0]) + float(gc_data[1]) + float(gc_data[4]) + float(gc_data[6])
            
            return {
                'mem_percent': float(mem_percent),
                'vsz_kb': int(vsz),
                'rss_kb': int(rss),
                'heap_used_kb': heap_used,
                'heap_total_kb': heap_total,
                'heap_usage_percent': (heap_used / heap_total) * 100 if heap_total > 0 else 0
            }
        except Exception:
            return {}
    
    def _get_gc_info(self, pid):
        """获取GC信息"""
        try:
            result = subprocess.run(['jstat', '-gc', pid],
                                  capture_output=True, text=True)
            lines = result.stdout.strip().split('\n')
            if len(lines) >= 2:
                headers = lines[0].split()
                values = lines[1].split()
                gc_data = dict(zip(headers, values))
                
                # 计算GC时间占比
                gc_time = float(gc_data.get('GCT', 0))
                uptime_result = subprocess.run(['ps', '-p', pid, '-o', 'etime', '--no-headers'],
                                             capture_output=True, text=True)
                uptime_str = uptime_result.stdout.strip()
                uptime_seconds = self._parse_uptime(uptime_str)
                
                time_percent = (gc_time / uptime_seconds) * 100 if uptime_seconds > 0 else 0
                
                return {
                    'total_time': gc_time,
                    'time_percent': time_percent,
                    'young_gc_count': int(gc_data.get('YGC', 0)),
                    'full_gc_count': int(gc_data.get('FGC', 0))
                }
        except Exception:
            pass
        return {}
    
    def _parse_uptime(self, uptime_str):
        """解析进程运行时间,ps etime格式: [[DD-]HH:]MM:SS"""
        days = 0
        # 先处理天数部分,否则 DD-HH:MM:SS 会在按':'拆分时解析失败
        if '-' in uptime_str:
            day_part, uptime_str = uptime_str.split('-', 1)
            days = int(day_part)

        parts = [int(p) for p in uptime_str.split(':')]
        seconds = 0
        if len(parts) == 2:    # MM:SS
            seconds = parts[0] * 60 + parts[1]
        elif len(parts) == 3:  # HH:MM:SS
            seconds = parts[0] * 3600 + parts[1] * 60 + parts[2]

        return days * 86400 + seconds
    
    def _get_response_time(self):
        """获取响应时间"""
        try:
            import time
            import urllib.request
            
            start_time = time.time()
            urllib.request.urlopen('http://localhost:8080/api/json', timeout=10)
            end_time = time.time()
            
            return (end_time - start_time) * 1000  # 转换为毫秒
        except Exception:
            return 0
    
    def _get_queue_length(self):
        """获取构建队列长度"""
        try:
            import urllib.request
            import json
            
            response = urllib.request.urlopen('http://localhost:8080/queue/api/json', timeout=5)
            data = json.loads(response.read().decode())
            return len(data.get('items', []))
        except Exception:
            return 0
    
    def _increase_heap_size(self):
        """增加堆内存大小"""
        # 这里应该修改Jenkins启动脚本
        # 实际实现需要根据具体的部署方式
        print("建议增加堆内存大小到当前的1.5倍")
        return True
    
    def _tune_gc_parameters(self):
        """调整GC参数"""
        print("建议调整GC参数以减少GC时间")
        return True
    
    def _increase_executors(self):
        """增加执行器数量"""
        print("建议增加执行器数量以提高并发处理能力")
        return True
    
    def run(self):
        """运行自动调优"""
        print("=== Jenkins自动调优开始 ===")
        
        # 收集指标
        if not self.collect_metrics():
            print("无法收集性能指标")
            return
        
        print(f"当前性能指标:")
        print(f"  CPU使用率: {self.metrics['cpu_usage']:.1f}%")
        print(f"  内存使用率: {self.metrics['memory'].get('heap_usage_percent', 0):.1f}%")
        print(f"  GC时间占比: {self.metrics['gc'].get('time_percent', 0):.1f}%")
        print(f"  响应时间: {self.metrics['response_time']:.0f}ms")
        print(f"  队列长度: {self.metrics['queue_length']}")
        
        # 分析性能
        recommendations = self.analyze_performance()
        
        if not recommendations:
            print("✅ 系统性能良好,无需调优")
            return
        
        print(f"\n发现 {len(recommendations)} 个优化建议:")
        for i, rec in enumerate(recommendations, 1):
            print(f"  {i}. [{rec['severity'].upper()}] {rec['message']}")
        
        # 应用调优
        applied_changes = self.apply_tuning(recommendations)
        
        if applied_changes:
            print(f"\n已应用以下优化:")
            for change in applied_changes:
                print(f"  ✓ {change}")
        
        print("\n=== 自动调优完成 ===")

if __name__ == '__main__':
    tuner = JenkinsAutoTuner()
    tuner.run()
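该脚本可配合cron定期运行,持续收集指标并输出调优建议(脚本安装路径为假设):

```shell
# crontab条目:每15分钟运行一次自动调优检查,输出追加到日志
*/15 * * * * /usr/bin/python3 /opt/jenkins-tools/jenkins_auto_tuning.py >> /var/log/jenkins/auto_tuning.log 2>&1
```

生产环境中建议先只收集指标与建议、人工确认后再执行变更,避免自动调整引入不稳定因素。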

本章小结

本章详细介绍了Jenkins的性能优化:

  1. 性能优化概述:了解性能问题识别和优化策略
  2. JVM调优:掌握内存配置和垃圾回收优化
  3. 系统级优化:学习操作系统和存储优化
  4. 构建优化:实现Pipeline和资源管理优化
  5. 网络优化:配置带宽和连接优化
  6. 监控调优工具:使用性能分析和自动调优工具

通过系统性的性能优化,可以显著提升Jenkins的运行效率和用户体验。

下一章预告

下一章我们将学习Jenkins的故障排除,包括常见问题诊断、日志分析和恢复策略。

练习与思考

理论练习

  1. 性能分析

    • 分析不同类型的性能瓶颈
    • 设计性能监控方案
    • 制定性能优化计划
  2. 调优策略

    • 比较不同JVM垃圾回收器的特点
    • 设计资源分配策略
    • 规划网络优化方案

实践练习

  1. JVM调优

    • 配置G1GC参数
    • 分析GC日志
    • 优化内存配置
  2. 系统优化

    • 实施操作系统调优
    • 配置存储优化
    • 部署监控工具

思考题

  1. 优化平衡

    • 如何在性能和稳定性之间找到平衡?
    • 如何评估优化效果?
    • 如何避免过度优化?
  2. 持续改进

    • 如何建立性能优化的持续改进机制?

    • 如何处理性能回归问题?

    • 如何在团队中推广性能优化最佳实践?

**Pipeline优化示例:**

```groovy
pipeline {
    agent any

    stages {
        stage('Checkout') {
            steps {
                script {
                    // 优化的检出策略:浅克隆、跳过tag、限制超时
                    checkout([
                        $class: 'GitSCM',
                        branches: [[name: env.BRANCH_NAME]],
                        doGenerateSubmoduleConfigurations: false,
                        extensions: [
                            [$class: 'CloneOption', depth: 1, noTags: true, shallow: true],
                            [$class: 'CheckoutOption', timeout: 10]
                        ],
                        userRemoteConfigs: [[url: env.GIT_URL]]
                    ])
                }

                // 缓存依赖
                stash includes: '**', name: 'source-code'
            }
        }

        stage('Parallel Build & Test') {
            parallel {
                stage('Unit Tests') {
                    agent { label 'test-runner' }
                    steps {
                        unstash 'source-code'

                        // 使用缓存的依赖
                        script {
                            if (fileExists('pom.xml')) {
                                sh '''
                                    # Maven并行构建
                                    mvn clean test \
                                        -T 4 \
                                        -Dmaven.test.failure.ignore=true \
                                        -Dmaven.repo.local=/var/cache/maven \
                                        -Dparallel=methods \
                                        -DthreadCount=4
                                '''
                            } else if (fileExists('build.gradle')) {
                                sh '''
                                    # Gradle并行构建
                                    ./gradlew test \
                                        --parallel \
                                        --max-workers=4 \
                                        --build-cache \
                                        --gradle-user-home=/var/cache/gradle
                                '''
                            }
                        }
                    }
                    post {
                        always {
                            junit(
                                testResults: '**/target/surefire-reports/*.xml,**/build/test-results/**/*.xml',
                                allowEmptyResults: true
                            )
                        }
                    }
                }

                stage('Code Quality') {
                    agent { label 'sonar-scanner' }
                    steps {
                        unstash 'source-code'

                        script {
                            // 并行代码质量检查
                            parallel([
                                'SonarQube': {
                                    sh '''
                                        sonar-scanner \
                                            -Dsonar.projectKey=${JOB_NAME} \
                                            -Dsonar.sources=src \
                                            -Dsonar.host.url=${SONAR_URL} \
                                            -Dsonar.login=${SONAR_TOKEN}
                                    '''
                                },
                                'Security Scan': {
                                    sh '''
                                        # OWASP依赖检查
                                        dependency-check.sh \
                                            --project ${JOB_NAME} \
                                            --scan . \
                                            --format XML \
                                            --out dependency-check-report.xml
                                    '''
                                }
                            ])
                        }
                    }
                }

                stage('Build Artifacts') {
                    agent { label 'build-server' }
                    steps {
                        unstash 'source-code'

                        script {
                            if (fileExists('pom.xml')) {
                                sh '''
                                    # Maven优化构建
                                    mvn clean package \
                                        -T 4 \
                                        -DskipTests \
                                        -Dmaven.repo.local=/var/cache/maven \
                                        -Dmaven.compile.fork=true \
                                        -Dmaven.compiler.maxmem=1024m
                                '''
                            } else if (fileExists('Dockerfile')) {
                                sh '''
                                    # Docker多阶段构建
                                    docker build \
                                        --build-arg BUILDKIT_INLINE_CACHE=1 \
                                        --cache-from ${IMAGE_NAME}:cache \
                                        -t ${IMAGE_NAME}:${BUILD_NUMBER} \
                                        -t ${IMAGE_NAME}:latest .
                                '''
                            }
                        }

                        // 存储构建产物
                        stash includes: '**/target/*.jar,**/build/libs/*.jar', name: 'artifacts'
                    }
                }
            }
        }

        stage('Integration Tests') {
            agent { label 'integration-test' }
            when {
                anyOf {
                    branch 'main'
                    branch 'develop'
                    changeRequest()
                }
            }
            steps {
                unstash 'source-code'
                unstash 'artifacts'

                script {
                    // 并行集成测试
                    def testStages = [:]

                    ['api-tests', 'ui-tests', 'performance-tests'].each { testType ->
                        testStages[testType] = {
                            sh "./run-${testType}.sh"
                        }
                    }

                    parallel testStages
                }
            }
        }

        stage('Deploy') {
            agent { label 'deployment' }
            when { branch 'main' }
            steps {
                unstash 'artifacts'

                script {
                    // 蓝绿部署
                    sh '''
                        # 部署到蓝绿环境
                        ./deploy.sh --strategy=blue-green --timeout=300
                    '''
                }
            }
        }
    }

    post {
        always {
            script {
                // 清理工作空间
                cleanWs(
                    cleanWhenAborted: true,
                    cleanWhenFailure: true,
                    cleanWhenNotBuilt: true,
                    cleanWhenSuccess: true,
                    cleanWhenUnstable: true,
                    deleteDirs: true
                )
            }
        }

        success {
            script {
                // 成功通知
                if (env.BRANCH_NAME == 'main') {
                    slackSend(
                        channel: '#deployments',
                        color: 'good',
                        message: "✅ 部署成功: ${env.JOB_NAME} #${env.BUILD_NUMBER}"
                    )
                }
            }
        }

        failure {
            script {
                // 失败通知和分析
                emailext(
                    subject: "构建失败: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
                    body: """
                        构建失败详情:

                        项目: ${env.JOB_NAME}
                        构建号: ${env.BUILD_NUMBER}
                        分支: ${env.BRANCH_NAME}
                        提交: ${env.GIT_COMMIT}

                        查看详情: ${env.BUILD_URL}
                    """,
                    to: '${DEFAULT_RECIPIENTS}'
                )
            }
        }
    }
}
```

**构建缓存优化:**

```groovy
// 共享库中的缓存管理
@Library('jenkins-shared-library') _

def buildWithCache(Map config) {
    def cacheKey = generateCacheKey(config)
    def cacheHit = false

    stage('Cache Check') {
        script {
            // 检查缓存是否存在
            cacheHit = checkCache(cacheKey)
            if (cacheHit) {
                echo "缓存命中: ${cacheKey}"
                restoreCache(cacheKey)
            } else {
                echo "缓存未命中,开始构建"
            }
        }
    }

    if (!cacheHit) {
        stage('Build') {
            script {
                // 执行构建
                config.buildSteps()
                // 保存缓存
                saveCache(cacheKey, config.cachePatterns)
            }
        }
    }
}

def generateCacheKey(Map config) {
    // 基于文件内容生成缓存键
    def checksums = []
    config.cacheFiles.each { file ->
        if (fileExists(file)) {
            def checksum = sh(
                script: "sha256sum ${file} | cut -d' ' -f1",
                returnStdout: true
            ).trim()
            checksums.add(checksum)
        }
    }

    def combinedChecksum = sh(
        script: "echo '${checksums.join(',')}' | sha256sum | cut -d' ' -f1",
        returnStdout: true
    ).trim()

    return "${config.projectName}-${combinedChecksum}"
}

def checkCache(String cacheKey) {
    // 检查S3或其他缓存存储
    def exitCode = sh(
        script: "aws s3 ls s3://jenkins-cache/${cacheKey}.tar.gz",
        returnStatus: true
    )
    return exitCode == 0
}

def restoreCache(String cacheKey) {
    sh """
        aws s3 cp s3://jenkins-cache/${cacheKey}.tar.gz cache.tar.gz
        tar -xzf cache.tar.gz
        rm cache.tar.gz
    """
}

def saveCache(String cacheKey, List patterns) {
    def files = patterns.join(' ')
    sh """
        tar -czf cache.tar.gz ${files}
        aws s3 cp cache.tar.gz s3://jenkins-cache/${cacheKey}.tar.gz
        rm cache.tar.gz
    """
}

// 使用示例
pipeline {
    agent any

    stages {
        stage('Build with Cache') {
            steps {
                script {
                    buildWithCache([
                        projectName: 'my-app',
                        cacheFiles: ['pom.xml', 'package.json', 'requirements.txt'],
                        cachePatterns: ['~/.m2/repository', 'node_modules', '.venv'],
                        buildSteps: {
                            sh 'mvn clean package'
                            sh 'npm install'
                            sh 'pip install -r requirements.txt'
                        }
                    ])
                }
            }
        }
    }
}
```

资源管理优化

动态节点管理:

```groovy
// 智能节点分配脚本
@Library('jenkins-shared-library') _

def allocateOptimalNode(Map requirements) {
    def availableNodes = getAvailableNodes()
    def optimalNode = selectOptimalNode(availableNodes, requirements)

    if (optimalNode) {
        return optimalNode
    } else {
        // 动态创建节点
        return createDynamicNode(requirements)
    }
}

def getAvailableNodes() {
    def nodes = []

    Jenkins.instance.computers.each { computer ->
        if (computer.isOnline() && !computer.isTemporarilyOffline()) {
            def node = computer.getNode()
            def executor = computer.getExecutors().find { !it.isBusy() }

            if (executor) {
                nodes.add([
                    name: node.getNodeName(),
                    labels: node.getLabelString().split(' '),
                    cpu: getNodeCpuUsage(computer),
                    memory: getNodeMemoryUsage(computer),
                    disk: getNodeDiskUsage(computer),
                    load: getNodeLoad(computer)
                ])
            }
        }
    }

    return nodes
}

def selectOptimalNode(List nodes, Map requirements) {
    // 过滤满足标签要求的节点
    def candidateNodes = nodes.findAll { node ->
        requirements.labels.every { label ->
            node.labels.contains(label)
        }
    }

    if (candidateNodes.isEmpty()) {
        return null
    }

    // 计算节点得分
    def scoredNodes = candidateNodes.collect { node ->
        def score = calculateNodeScore(node, requirements)
        [node: node, score: score]
    }

    // 选择得分最高的节点
    def bestNode = scoredNodes.max { it.score }
    return bestNode.node
}

def calculateNodeScore(Map node, Map requirements) {
    def score = 0

    // CPU得分(使用率越低得分越高)
    score += (100 - node.cpu) * 0.3

    // 内存得分
    score += (100 - node.memory) * 0.3

    // 磁盘得分
    score += (100 - node.disk) * 0.2

    // 负载得分
    score += Math.max(0, 100 - node.load * 20) * 0.2

    // 特殊要求加分
    if (requirements.preferSSD && node.labels.contains('ssd')) {
        score += 10
    }

    if (requirements.preferHighCpu && node.labels.contains('high-cpu')) {
        score += 10
    }

    return score
}

def createDynamicNode(Map requirements) {
    // 基于需求创建云节点
    def nodeTemplate = selectNodeTemplate(requirements)
    def cloudName = nodeTemplate.cloud

    // 触发节点创建
    def cloud = Jenkins.instance.getCloud(cloudName)
    def provisionedNode = cloud.provision(nodeTemplate, 1)

    // 等待节点上线
    waitForNodeOnline(provisionedNode.name, 300) // 5分钟超时

    return provisionedNode
}

def selectNodeTemplate(Map requirements) {
    def templates = [
        [
            name: 'small-node',
            cloud: 'aws-ec2',
            instanceType: 't3.medium',
            labels: ['linux', 'docker'],
            cpu: 2,
            memory: 4
        ],
        [
            name: 'medium-node',
            cloud: 'aws-ec2',
            instanceType: 't3.large',
            labels: ['linux', 'docker', 'maven'],
            cpu: 2,
            memory: 8
        ],
        [
            name: 'large-node',
            cloud: 'aws-ec2',
            instanceType: 't3.xlarge',
            labels: ['linux', 'docker', 'high-cpu'],
            cpu: 4,
            memory: 16
        ]
    ]

    // 选择满足需求的最小模板
    def suitableTemplates = templates.findAll { template ->
        template.cpu >= requirements.minCpu &&
        template.memory >= requirements.minMemory &&
        requirements.labels.every { label ->
            template.labels.contains(label)
        }
    }

    return suitableTemplates.min { it.cpu + it.memory }
}

// 使用示例
pipeline {
    agent none

    stages {
        stage('Lightweight Tasks') {
            agent {
                label allocateOptimalNode([
                    labels: ['linux', 'docker'],
                    minCpu: 1,
                    minMemory: 2,
                    preferSSD: false
                ]).name
            }
            steps {
                sh 'echo "Running on optimized node"'
            }
        }

        stage('Heavy Compilation') {
            agent {
                label allocateOptimalNode([
                    labels: ['linux', 'maven', 'high-cpu'],
                    minCpu: 4,
                    minMemory: 8,
                    preferSSD: true,
                    preferHighCpu: true
                ]).name
            }
            steps