第6章：线性代数与随机数 - 在线学习与练习平台

📚 本章概述

本章将深入探讨NumPy在线性代数和随机数生成方面的强大功能。线性代数是科学计算的基础，而随机数生成在模拟、统计分析和机器学习中都有重要应用。您将学习矩阵运算、特征值分解、随机数生成等核心技能。

🎯 学习目标

掌握NumPy的线性代数运算
学会矩阵分解和特征值计算
理解随机数生成的原理和应用
掌握概率分布的采样方法
学会在实际项目中应用线性代数和随机数

1. 线性代数基础

1.1 矩阵基本运算

import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg
import time

# 设置中文字体
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

print("🔢 线性代数基础运算")
print("=" * 50)

# 创建矩阵
A = np.array([[1, 2, 3], 
              [4, 5, 6], 
              [7, 8, 9]])

B = np.array([[9, 8, 7], 
              [6, 5, 4], 
              [3, 2, 1]])

print("矩阵A:")
print(A)
print("\n矩阵B:")
print(B)

# 基本运算
print(f"\n矩阵加法 A + B:")
print(A + B)

print(f"\n矩阵减法 A - B:")
print(A - B)

print(f"\n元素乘法 A * B:")
print(A * B)

print(f"\n矩阵乘法 A @ B:")
print(A @ B)

# 等价的矩阵乘法
print(f"\nnp.dot(A, B):")
print(np.dot(A, B))

print(f"\nnp.matmul(A, B):")
print(np.matmul(A, B))

1.2 矩阵属性和特殊矩阵

# 2. 矩阵属性
print("\n📐 矩阵属性:")

# 矩阵的基本属性
print(f"矩阵A的形状: {A.shape}")
print(f"矩阵A的维度: {A.ndim}")
print(f"矩阵A的大小: {A.size}")
print(f"矩阵A的数据类型: {A.dtype}")

# 矩阵转置
print(f"\n矩阵A的转置:")
print(A.T)

# 矩阵的迹（对角线元素之和）
print(f"\n矩阵A的迹: {np.trace(A)}")

# 创建特殊矩阵
print(f"\n特殊矩阵:")

# 单位矩阵
I = np.eye(3)
print(f"3x3单位矩阵:")
print(I)

# 零矩阵
zeros = np.zeros((3, 3))
print(f"\n3x3零矩阵:")
print(zeros)

# 全1矩阵
ones = np.ones((3, 3))
print(f"\n3x3全1矩阵:")
print(ones)

# 对角矩阵
diag_matrix = np.diag([1, 2, 3])
print(f"\n对角矩阵:")
print(diag_matrix)

# 随机矩阵
np.random.seed(42)
random_matrix = np.random.random((3, 3))
print(f"\n随机矩阵:")
print(random_matrix)

1.3 矩阵的行列式和逆矩阵

# 3. 行列式和逆矩阵
print("\n🔍 行列式和逆矩阵:")

# 创建可逆矩阵
C = np.array([[2, 1, 1], 
              [1, 3, 2], 
              [1, 0, 0]])

print("矩阵C:")
print(C)

# 计算行列式
det_C = np.linalg.det(C)
print(f"\n矩阵C的行列式: {det_C:.6f}")

# 计算逆矩阵
if det_C != 0:
    inv_C = np.linalg.inv(C)
    print(f"\n矩阵C的逆矩阵:")
    print(inv_C)
    
    # 验证逆矩阵
    identity_check = C @ inv_C
    print(f"\nC @ inv(C) (应该是单位矩阵):")
    print(identity_check)
    
    # 检查是否接近单位矩阵
    is_identity = np.allclose(identity_check, np.eye(3))
    print(f"是否为单位矩阵: {is_identity}")
else:
    print("矩阵不可逆（行列式为0）")

# 伪逆矩阵（对于不可逆矩阵）
print(f"\n伪逆矩阵示例:")
singular_matrix = np.array([[1, 2], [2, 4], [3, 6]])
print(f"奇异矩阵:")
print(singular_matrix)

pinv_matrix = np.linalg.pinv(singular_matrix)
print(f"\n伪逆矩阵:")
print(pinv_matrix)

2. 高级线性代数运算

2.1 特征值和特征向量

# 4. 特征值和特征向量
print("\n🎯 特征值和特征向量:")

# 创建对称矩阵（保证实特征值）
symmetric_matrix = np.array([[4, 2, 1], 
                            [2, 3, 0], 
                            [1, 0, 2]])

print("对称矩阵:")
print(symmetric_matrix)

# 计算特征值和特征向量
eigenvalues, eigenvectors = np.linalg.eig(symmetric_matrix)

print(f"\n特征值:")
for i, val in enumerate(eigenvalues):
    print(f"λ{i+1} = {val:.6f}")

print(f"\n特征向量:")
for i, vec in enumerate(eigenvectors.T):
    print(f"v{i+1} = {vec}")

# 验证特征值和特征向量
print(f"\n验证 Av = λv:")
for i in range(len(eigenvalues)):
    Av = symmetric_matrix @ eigenvectors[:, i]
    lambda_v = eigenvalues[i] * eigenvectors[:, i]
    print(f"特征值{i+1}: Av ≈ λv? {np.allclose(Av, lambda_v)}")

# 特征值分解重构矩阵
reconstructed = eigenvectors @ np.diag(eigenvalues) @ eigenvectors.T
print(f"\n重构矩阵是否等于原矩阵: {np.allclose(symmetric_matrix, reconstructed)}")

2.2 矩阵分解

# 5. 矩阵分解
print("\n🔧 矩阵分解:")

# 创建测试矩阵
test_matrix = np.random.random((4, 3))
print(f"测试矩阵 {test_matrix.shape}:")
print(test_matrix)

# 1. QR分解
Q, R = np.linalg.qr(test_matrix)
print(f"\nQR分解:")
print(f"Q矩阵 {Q.shape} (正交矩阵):")
print(Q)
print(f"\nR矩阵 {R.shape} (上三角矩阵):")
print(R)

# 验证QR分解
reconstructed_qr = Q @ R
print(f"\nQR重构是否正确: {np.allclose(test_matrix, reconstructed_qr)}")

# 验证Q的正交性
print(f"Q是否正交: {np.allclose(Q.T @ Q, np.eye(Q.shape[1]))}")

# 2. SVD分解（奇异值分解）
U, s, Vt = np.linalg.svd(test_matrix, full_matrices=False)
print(f"\nSVD分解:")
print(f"U矩阵 {U.shape}:")
print(U)
print(f"\n奇异值 {s.shape}: {s}")
print(f"\nVt矩阵 {Vt.shape}:")
print(Vt)

# 验证SVD分解
S = np.diag(s)
reconstructed_svd = U @ S @ Vt
print(f"\nSVD重构是否正确: {np.allclose(test_matrix, reconstructed_svd)}")

# 3. Cholesky分解（对于正定矩阵）
print(f"\nCholesky分解:")
# 创建正定矩阵
A_pos_def = np.array([[4, 2], [2, 3]])
print(f"正定矩阵:")
print(A_pos_def)

L = np.linalg.cholesky(A_pos_def)
print(f"\nCholesky分解结果 L:")
print(L)

# 验证 A = L @ L.T
reconstructed_chol = L @ L.T
print(f"Cholesky重构是否正确: {np.allclose(A_pos_def, reconstructed_chol)}")

2.3 线性方程组求解

# 6. 线性方程组求解
print("\n📊 线性方程组求解:")

# 求解 Ax = b
A_eq = np.array([[3, 2, -1], 
                 [2, -2, 4], 
                 [-1, 0.5, -1]])

b_eq = np.array([1, -2, 0])

print("系数矩阵A:")
print(A_eq)
print(f"\n常数向量b: {b_eq}")

# 方法1: 直接求解
x_solve = np.linalg.solve(A_eq, b_eq)
print(f"\n解向量x: {x_solve}")

# 验证解
verification = A_eq @ x_solve
print(f"验证 Ax: {verification}")
print(f"是否等于b: {np.allclose(verification, b_eq)}")

# 方法2: 使用逆矩阵
x_inv = np.linalg.inv(A_eq) @ b_eq
print(f"\n使用逆矩阵求解: {x_inv}")
print(f"两种方法结果一致: {np.allclose(x_solve, x_inv)}")

# 方法3: 最小二乘解（对于超定系统）
print(f"\n最小二乘解示例:")
A_over = np.random.random((5, 3))  # 5个方程，3个未知数
b_over = np.random.random(5)

x_lstsq, residuals, rank, s = np.linalg.lstsq(A_over, b_over, rcond=None)
print(f"最小二乘解: {x_lstsq}")
print(f"残差: {residuals}")
print(f"矩阵秩: {rank}")

3. 随机数生成基础

3.1 随机数生成器

# 7. 随机数生成基础
print("\n🎲 随机数生成基础:")

# 设置随机种子（确保结果可重现）
np.random.seed(42)
print("设置随机种子为42")

# 基本随机数生成
print(f"\n基本随机数生成:")
print(f"单个随机数: {np.random.random()}")
print(f"随机数数组: {np.random.random(5)}")
print(f"随机矩阵:\n{np.random.random((3, 3))}")

# 不同范围的随机数
print(f"\n不同范围的随机数:")
print(f"[0, 1)范围: {np.random.random(5)}")
print(f"[0, 10)范围: {np.random.random(5) * 10}")
print(f"[5, 15)范围: {np.random.random(5) * 10 + 5}")

# 使用uniform函数
print(f"uniform[2, 8): {np.random.uniform(2, 8, 5)}")

# 整数随机数
print(f"\n整数随机数:")
print(f"randint[0, 10): {np.random.randint(0, 10, 5)}")
print(f"randint[1, 7): {np.random.randint(1, 7, 10)}")  # 模拟骰子

# 随机选择
print(f"\n随机选择:")
choices = ['apple', 'banana', 'orange', 'grape']
print(f"随机选择: {np.random.choice(choices, 3)}")
print(f"不重复选择: {np.random.choice(choices, 3, replace=False)}")

# 带权重的随机选择
weights = [0.1, 0.3, 0.4, 0.2]
print(f"带权重选择: {np.random.choice(choices, 5, p=weights)}")

3.2 新式随机数生成器

# 8. 新式随机数生成器
print("\n🆕 新式随机数生成器:")

# 创建新式生成器
rng = np.random.default_rng(42)
print("创建新式随机数生成器")

# 基本用法
print(f"\n新式生成器用法:")
print(f"random(): {rng.random(5)}")
print(f"integers(): {rng.integers(1, 10, 5)}")
print(f"choice(): {rng.choice(['A', 'B', 'C'], 5)}")

# 多个独立的生成器
rng1 = np.random.default_rng(123)
rng2 = np.random.default_rng(456)

print(f"\n独立生成器:")
print(f"生成器1: {rng1.random(3)}")
print(f"生成器2: {rng2.random(3)}")

# 重置生成器1
rng1 = np.random.default_rng(123)
print(f"重置后生成器1: {rng1.random(3)}")

# 生成器的优势：线程安全和更好的统计性质
print(f"\n生成器优势演示:")
# 创建多个独立流
streams = [np.random.default_rng(i) for i in range(5)]
results = [stream.random(3) for stream in streams]
for i, result in enumerate(results):
    print(f"流{i}: {result}")

4. 概率分布

4.1 常见连续分布

# 9. 常见连续分布
print("\n📈 常见连续分布:")

# 正态分布
print("正态分布:")
normal_samples = np.random.normal(0, 1, 1000)  # 均值0，标准差1
print(f"样本均值: {np.mean(normal_samples):.3f}")
print(f"样本标准差: {np.std(normal_samples):.3f}")

# 不同参数的正态分布
normal_custom = np.random.normal(10, 2, 1000)  # 均值10，标准差2
print(f"自定义正态分布均值: {np.mean(normal_custom):.3f}")

# 均匀分布
uniform_samples = np.random.uniform(-1, 1, 1000)
print(f"\n均匀分布[-1, 1]:")
print(f"最小值: {np.min(uniform_samples):.3f}")
print(f"最大值: {np.max(uniform_samples):.3f}")
print(f"均值: {np.mean(uniform_samples):.3f}")

# 指数分布
exponential_samples = np.random.exponential(2, 1000)  # 尺度参数2
print(f"\n指数分布:")
print(f"均值: {np.mean(exponential_samples):.3f}")
print(f"理论均值: 2.0")

# 伽马分布
gamma_samples = np.random.gamma(2, 2, 1000)  # 形状参数2，尺度参数2
print(f"\n伽马分布:")
print(f"均值: {np.mean(gamma_samples):.3f}")
print(f"理论均值: {2 * 2}")

# 贝塔分布
beta_samples = np.random.beta(2, 5, 1000)
print(f"\n贝塔分布Beta(2,5):")
print(f"均值: {np.mean(beta_samples):.3f}")
print(f"理论均值: {2/(2+5):.3f}")

# 卡方分布
chi2_samples = np.random.chisquare(3, 1000)  # 自由度3
print(f"\n卡方分布:")
print(f"均值: {np.mean(chi2_samples):.3f}")
print(f"理论均值: 3.0")

4.2 常见离散分布

# 10. 常见离散分布
print("\n🎯 常见离散分布:")

# 二项分布
binomial_samples = np.random.binomial(10, 0.3, 1000)  # n=10, p=0.3
print(f"二项分布B(10, 0.3):")
print(f"均值: {np.mean(binomial_samples):.3f}")
print(f"理论均值: {10 * 0.3}")

# 泊松分布
poisson_samples = np.random.poisson(3, 1000)  # λ=3
print(f"\n泊松分布Poisson(3):")
print(f"均值: {np.mean(poisson_samples):.3f}")
print(f"理论均值: 3.0")

# 几何分布
geometric_samples = np.random.geometric(0.2, 1000)  # p=0.2
print(f"\n几何分布Geometric(0.2):")
print(f"均值: {np.mean(geometric_samples):.3f}")
print(f"理论均值: {1/0.2}")

# 负二项分布
negative_binomial = np.random.negative_binomial(5, 0.3, 1000)
print(f"\n负二项分布:")
print(f"均值: {np.mean(negative_binomial):.3f}")

# 多项分布
multinomial_sample = np.random.multinomial(100, [0.2, 0.3, 0.5])
print(f"\n多项分布样本: {multinomial_sample}")
print(f"总和: {np.sum(multinomial_sample)}")

4.3 分布的可视化

# 11. 分布可视化
print("\n📊 分布可视化:")

# 创建图形
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
fig.suptitle('常见概率分布', fontsize=16)

# 正态分布
ax = axes[0, 0]
normal_data = np.random.normal(0, 1, 1000)
ax.hist(normal_data, bins=30, density=True, alpha=0.7, color='blue')
ax.set_title('正态分布 N(0,1)')
ax.set_xlabel('值')
ax.set_ylabel('密度')

# 指数分布
ax = axes[0, 1]
exp_data = np.random.exponential(2, 1000)
ax.hist(exp_data, bins=30, density=True, alpha=0.7, color='red')
ax.set_title('指数分布 Exp(2)')
ax.set_xlabel('值')
ax.set_ylabel('密度')

# 均匀分布
ax = axes[0, 2]
uniform_data = np.random.uniform(0, 1, 1000)
ax.hist(uniform_data, bins=30, density=True, alpha=0.7, color='green')
ax.set_title('均匀分布 U(0,1)')
ax.set_xlabel('值')
ax.set_ylabel('密度')

# 二项分布
ax = axes[1, 0]
binomial_data = np.random.binomial(20, 0.3, 1000)
ax.hist(binomial_data, bins=range(21), density=True, alpha=0.7, color='orange')
ax.set_title('二项分布 B(20,0.3)')
ax.set_xlabel('值')
ax.set_ylabel('概率')

# 泊松分布
ax = axes[1, 1]
poisson_data = np.random.poisson(5, 1000)
ax.hist(poisson_data, bins=range(20), density=True, alpha=0.7, color='purple')
ax.set_title('泊松分布 Poisson(5)')
ax.set_xlabel('值')
ax.set_ylabel('概率')

# 伽马分布
ax = axes[1, 2]
gamma_data = np.random.gamma(2, 2, 1000)
ax.hist(gamma_data, bins=30, density=True, alpha=0.7, color='brown')
ax.set_title('伽马分布 Gamma(2,2)')
ax.set_xlabel('值')
ax.set_ylabel('密度')

plt.tight_layout()
plt.show()

print("分布图已生成")

5. 实际应用案例

5.1 主成分分析(PCA)

# 12. 主成分分析应用
print("\n🔍 主成分分析(PCA)应用:")

# 生成示例数据
np.random.seed(42)
n_samples = 200
n_features = 4

# 创建相关的数据
data = np.random.randn(n_samples, n_features)
# 添加相关性
data[:, 1] = data[:, 0] + 0.5 * np.random.randn(n_samples)
data[:, 2] = data[:, 0] - data[:, 1] + 0.3 * np.random.randn(n_samples)
data[:, 3] = 0.5 * data[:, 1] + 0.2 * np.random.randn(n_samples)

print(f"原始数据形状: {data.shape}")
print(f"原始数据前5行:")
print(data[:5])

# 数据标准化
data_centered = data - np.mean(data, axis=0)
data_std = data_centered / np.std(data_centered, axis=0)

print(f"\n标准化后数据统计:")
print(f"均值: {np.mean(data_std, axis=0)}")
print(f"标准差: {np.std(data_std, axis=0)}")

# 计算协方差矩阵
cov_matrix = np.cov(data_std.T)
print(f"\n协方差矩阵:")
print(cov_matrix)

# 特征值分解
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)

# 按特征值大小排序
idx = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]

print(f"\n特征值: {eigenvalues}")
print(f"解释方差比: {eigenvalues / np.sum(eigenvalues)}")
print(f"累积解释方差比: {np.cumsum(eigenvalues) / np.sum(eigenvalues)}")

# PCA变换
n_components = 2
pca_data = data_std @ eigenvectors[:, :n_components]

print(f"\nPCA降维后数据形状: {pca_data.shape}")
print(f"前5个样本的主成分:")
print(pca_data[:5])

# 可视化PCA结果
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.scatter(data_std[:, 0], data_std[:, 1], alpha=0.6)
plt.xlabel('特征1')
plt.ylabel('特征2')
plt.title('原始数据（前两个特征）')
plt.grid(True)

plt.subplot(1, 2, 2)
plt.scatter(pca_data[:, 0], pca_data[:, 1], alpha=0.6, color='red')
plt.xlabel('第一主成分')
plt.ylabel('第二主成分')
plt.title('PCA降维后数据')
plt.grid(True)

plt.tight_layout()
plt.show()

5.2 蒙特卡罗模拟

# 13. 蒙特卡罗模拟
print("\n🎰 蒙特卡罗模拟:")

# 应用1: 估算π值
print("应用1: 估算π值")
n_points = 100000

# 生成随机点
x = np.random.uniform(-1, 1, n_points)
y = np.random.uniform(-1, 1, n_points)

# 计算距离原点的距离
distances = np.sqrt(x**2 + y**2)

# 统计在单位圆内的点
inside_circle = np.sum(distances <= 1)

# 估算π
pi_estimate = 4 * inside_circle / n_points
print(f"使用{n_points}个点估算π: {pi_estimate:.6f}")
print(f"真实π值: {np.pi:.6f}")
print(f"误差: {abs(pi_estimate - np.pi):.6f}")

# 应用2: 期权定价（Black-Scholes模型）
print(f"\n应用2: 期权定价模拟")
# 参数设置
S0 = 100    # 初始股价
K = 105     # 执行价格
T = 1       # 到期时间（年）
r = 0.05    # 无风险利率
sigma = 0.2 # 波动率
n_simulations = 100000

# 生成随机数
np.random.seed(42)
Z = np.random.standard_normal(n_simulations)

# 计算到期股价
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

# 计算期权收益
call_payoffs = np.maximum(ST - K, 0)
put_payoffs = np.maximum(K - ST, 0)

# 折现到现值
call_price = np.exp(-r * T) * np.mean(call_payoffs)
put_price = np.exp(-r * T) * np.mean(put_payoffs)

print(f"看涨期权价格: {call_price:.4f}")
print(f"看跌期权价格: {put_price:.4f}")

# Black-Scholes理论价格（用于比较）
from scipy.stats import norm
d1 = (np.log(S0/K) + (r + 0.5*sigma**2)*T) / (sigma*np.sqrt(T))
d2 = d1 - sigma*np.sqrt(T)

bs_call = S0*norm.cdf(d1) - K*np.exp(-r*T)*norm.cdf(d2)
bs_put = K*np.exp(-r*T)*norm.cdf(-d2) - S0*norm.cdf(-d1)

print(f"Black-Scholes看涨期权价格: {bs_call:.4f}")
print(f"Black-Scholes看跌期权价格: {bs_put:.4f}")
print(f"看涨期权误差: {abs(call_price - bs_call):.4f}")
print(f"看跌期权误差: {abs(put_price - bs_put):.4f}")

5.3 线性回归

# 14. 线性回归应用
print("\n📈 线性回归应用:")

# 生成示例数据
np.random.seed(42)
n_samples = 100
n_features = 3

# 真实参数
true_coefficients = np.array([2, -1, 0.5])
true_intercept = 1

# 生成特征矩阵
X = np.random.randn(n_samples, n_features)

# 生成目标变量（带噪声）
y = X @ true_coefficients + true_intercept + 0.1 * np.random.randn(n_samples)

print(f"数据形状: X{X.shape}, y{y.shape}")
print(f"真实系数: {true_coefficients}")
print(f"真实截距: {true_intercept}")

# 方法1: 正规方程求解
# 添加截距项
X_with_intercept = np.column_stack([np.ones(n_samples), X])

# 正规方程: θ = (X^T X)^(-1) X^T y
theta_normal = np.linalg.inv(X_with_intercept.T @ X_with_intercept) @ X_with_intercept.T @ y

print(f"\n正规方程解:")
print(f"截距: {theta_normal[0]:.4f}")
print(f"系数: {theta_normal[1:]}")

# 方法2: 使用伪逆
theta_pinv = np.linalg.pinv(X_with_intercept) @ y
print(f"\n伪逆解:")
print(f"截距: {theta_pinv[0]:.4f}")
print(f"系数: {theta_pinv[1:]}")

# 方法3: SVD求解
U, s, Vt = np.linalg.svd(X_with_intercept, full_matrices=False)
theta_svd = Vt.T @ np.diag(1/s) @ U.T @ y
print(f"\nSVD解:")
print(f"截距: {theta_svd[0]:.4f}")
print(f"系数: {theta_svd[1:]}")

# 预测和评估
y_pred = X_with_intercept @ theta_normal

# 计算R²
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - np.mean(y)) ** 2)
r_squared = 1 - (ss_res / ss_tot)

print(f"\n模型评估:")
print(f"R²: {r_squared:.4f}")
print(f"均方误差: {np.mean((y - y_pred)**2):.4f}")
print(f"均方根误差: {np.sqrt(np.mean((y - y_pred)**2)):.4f}")

# 可视化结果（选择一个特征）
plt.figure(figsize=(10, 6))

plt.subplot(1, 2, 1)
plt.scatter(X[:, 0], y, alpha=0.6, label='真实数据')
# 计算预测线
x_line = np.linspace(X[:, 0].min(), X[:, 0].max(), 100)
y_line = theta_normal[0] + theta_normal[1] * x_line
plt.plot(x_line, y_line, 'r-', label='拟合线')
plt.xlabel('特征1')
plt.ylabel('目标变量')
plt.title('线性回归拟合')
plt.legend()
plt.grid(True)

plt.subplot(1, 2, 2)
plt.scatter(y_pred, y, alpha=0.6)
plt.plot([y.min(), y.max()], [y.min(), y.max()], 'r--', label='完美预测')
plt.xlabel('预测值')
plt.ylabel('真实值')
plt.title('预测 vs 真实')
plt.legend()
plt.grid(True)

plt.tight_layout()
plt.show()

6. 性能优化

6.1 线性代数性能优化

# 15. 线性代数性能优化
print("\n⚡ 线性代数性能优化:")

import time

# 创建大矩阵
sizes = [100, 500, 1000]
methods = ['@', 'np.dot', 'np.matmul']

print("矩阵乘法性能比较:")
print("大小\t@运算符\tnp.dot\tnp.matmul")

for size in sizes:
    A = np.random.random((size, size))
    B = np.random.random((size, size))
    
    times = []
    
    # @ 运算符
    start = time.time()
    C1 = A @ B
    times.append(time.time() - start)
    
    # np.dot
    start = time.time()
    C2 = np.dot(A, B)
    times.append(time.time() - start)
    
    # np.matmul
    start = time.time()
    C3 = np.matmul(A, B)
    times.append(time.time() - start)
    
    print(f"{size}\t{times[0]:.4f}s\t{times[1]:.4f}s\t{times[2]:.4f}s")
    
    # 验证结果一致性
    assert np.allclose(C1, C2) and np.allclose(C2, C3)

# 内存布局对性能的影响
print(f"\n内存布局对性能的影响:")
size = 1000
A = np.random.random((size, size))

# C顺序（行主序）
A_c = np.ascontiguousarray(A)
# Fortran顺序（列主序）
A_f = np.asfortranarray(A)

print(f"C顺序数组: {A_c.flags['C_CONTIGUOUS']}")
print(f"Fortran顺序数组: {A_f.flags['F_CONTIGUOUS']}")

# 测试转置性能
start = time.time()
A_c_T = A_c.T
time_c_transpose = time.time() - start

start = time.time()
A_f_T = A_f.T
time_f_transpose = time.time() - start

print(f"C顺序转置时间: {time_c_transpose:.6f}s")
print(f"Fortran顺序转置时间: {time_f_transpose:.6f}s")

6.2 随机数生成性能

# 16. 随机数生成性能
print("\n🎲 随机数生成性能:")

# 比较不同随机数生成方法
n_samples = 1000000

# 旧式生成器
start = time.time()
old_random = np.random.random(n_samples)
old_time = time.time() - start

# 新式生成器
rng = np.random.default_rng()
start = time.time()
new_random = rng.random(n_samples)
new_time = time.time() - start

print(f"旧式生成器时间: {old_time:.4f}s")
print(f"新式生成器时间: {new_time:.4f}s")
print(f"性能提升: {old_time/new_time:.2f}x")

# 批量生成 vs 循环生成
print(f"\n批量生成 vs 循环生成:")

# 批量生成
start = time.time()
batch_samples = np.random.normal(0, 1, n_samples)
batch_time = time.time() - start

# 循环生成
start = time.time()
loop_samples = np.array([np.random.normal(0, 1) for _ in range(n_samples)])
loop_time = time.time() - start

print(f"批量生成时间: {batch_time:.4f}s")
print(f"循环生成时间: {loop_time:.4f}s")
print(f"批量生成速度提升: {loop_time/batch_time:.1f}x")

# 预分配数组 vs 动态增长
print(f"\n预分配 vs 动态增长:")
n_iterations = 10000

# 预分配
start = time.time()
pre_allocated = np.zeros(n_iterations)
for i in range(n_iterations):
    pre_allocated[i] = np.random.random()
pre_time = time.time() - start

# 动态增长
start = time.time()
dynamic_list = []
for i in range(n_iterations):
    dynamic_list.append(np.random.random())
dynamic_array = np.array(dynamic_list)
dynamic_time = time.time() - start

print(f"预分配时间: {pre_time:.4f}s")
print(f"动态增长时间: {dynamic_time:.4f}s")
print(f"预分配速度提升: {dynamic_time/pre_time:.1f}x")

7. 本章小结

7.1 核心知识点

线性代数运算
- 矩阵基本运算：加减乘除、转置
- 特殊矩阵：单位矩阵、零矩阵、对角矩阵
- 行列式和逆矩阵计算
高级线性代数
- 特征值和特征向量分解
- 矩阵分解：QR、SVD、Cholesky
- 线性方程组求解
随机数生成
- 基本随机数生成器
- 新式随机数生成器
- 随机种子和可重现性
概率分布
- 连续分布：正态、均匀、指数、伽马等
- 离散分布：二项、泊松、几何等
- 分布参数和统计性质
实际应用
- 主成分分析(PCA)
- 蒙特卡罗模拟
- 线性回归

7.2 最佳实践

🚀 选择合适的算法: 根据矩阵性质选择最优算法
📊 理解数值稳定性: 注意条件数和数值精度
🎲 正确使用随机数: 设置种子确保可重现性
💾 优化内存使用: 注意数组布局和内存连续性

7.3 常见陷阱

❌ 奇异矩阵: 检查矩阵是否可逆
❌ 数值精度: 使用适当的容差进行比较
❌ 随机种子: 忘记设置种子导致结果不可重现
❌ 分布参数: 搞错分布参数的含义

7.4 下一步学习

📁 学习文件输入输出
⚡ 掌握性能优化技巧
🔧 了解与其他库的集成
📚 深入学习数值计算理论

8. 练习题

8.1 基础练习

矩阵运算
- 实现矩阵的基本运算
- 计算矩阵的行列式和逆矩阵
- 验证矩阵运算的性质
特征值分解
- 计算对称矩阵的特征值和特征向量
- 验证特征值分解的正确性
- 实现矩阵的对角化
随机数生成
- 生成不同分布的随机数
- 验证生成数据的统计性质
- 比较不同生成器的性能

8.2 进阶练习

PCA实现
- 从头实现PCA算法
- 应用PCA进行数据降维
- 分析主成分的含义
线性回归
- 实现多种线性回归求解方法
- 比较不同方法的数值稳定性
- 处理病态矩阵问题
蒙特卡罗方法
- 实现各种蒙特卡罗算法
- 分析收敛性和精度
- 优化计算效率

8.3 挑战练习

数值优化
- 实现梯度下降算法
- 处理大规模矩阵运算
- 优化内存使用
统计模拟
- 实现复杂的统计模型
- 进行假设检验
- 计算置信区间
机器学习算法
- 实现基础的机器学习算法
- 使用线性代数优化计算
- 处理高维数据

恭喜您完成第6章的学习！ 🎉

您已经掌握了NumPy的线性代数和随机数生成功能，这些是科学计算和数据分析的重要基础。在下一章中，我们将学习文件输入输出和数据持久化。

👉 继续学习第7章：文件IO与数据持久化

📂 分类导航

▶ 学与练
- ▶ 软件技术基础
  - ▶ 操作系统技术
    - Linux实战
    - ▶ Linux技巧
      - debug-remote-api.md
  - ▶ 容器化与编排
    - Docker实战
    - ▶ Docker高级
- ▶ 前端开发技术
  - ▶ 框架与库
    - js
    - vue
  - ▶ 前端生态
    - bootstrap
    - vue-ssr
- ▶ 后端开发技术
  - ▶ 编程语言
    - ▶ Java
    - ▶ Go
      - go-server.md
      - mini.md
    - Rust
    - Python
    - csharp
  - ▶ 中间件
    - redis
    - ▶ minio
      - minio.md
    - elasticsearch
    - kafka
    - elk
    - caddy
  - ▶ 数据库
    - MySQL
    - SQLServer
    - ▶ Dameng
      - sql.md
    - clickhouse
- ▶ 数据开发与运维
  - ▶ 数据开发
    - hadoop
  - ▶ 运维开发
    - ▶ CI/CD
      - jenkins
    - ▶ 自动化
      - allinssl.md
    - ▶ 日志处理
      - elk
    - ▶ 监控
- 软件入门教程
▶ 软件园
- AI智能体与应用
- 开发工具与环境
- AI 开发和编排
- 业务与生产力应用
- 数据和中间件
▶ 工具箱
- 内容管理
- 编码解码
- ▶ 系统监控
  - miaotixing.md
- ▶ 日常工具
- 工具命令
- 使用教程

📚 第6章：线性代数与随机数

📚 本章概述

🎯 学习目标

1. 线性代数基础

1.1 矩阵基本运算

1.2 矩阵属性和特殊矩阵

1.3 矩阵的行列式和逆矩阵

2. 高级线性代数运算

2.1 特征值和特征向量

2.2 矩阵分解

2.3 线性方程组求解

3. 随机数生成基础

3.1 随机数生成器

3.2 新式随机数生成器

4. 概率分布

4.1 常见连续分布

4.2 常见离散分布

4.3 分布的可视化

5. 实际应用案例

5.1 主成分分析(PCA)

5.2 蒙特卡罗模拟

5.3 线性回归

6. 性能优化

6.1 线性代数性能优化

6.2 随机数生成性能

7. 本章小结

7.1 核心知识点

7.2 最佳实践

7.3 常见陷阱

7.4 下一步学习

8. 练习题

8.1 基础练习

8.2 进阶练习

8.3 挑战练习

📂 分类导航

📰 最新文章