Grey Wolf Optimizer (GWO): From Theory to Practical Applications in Deep Learning
2025-01-07
The Grey Wolf Optimizer (GWO) is an optimization algorithm that simulates the cooperative hunting behaviour of a grey wolf pack. It searches for an optimum by modelling the roles of the leading wolf (alpha), the subordinate wolves (beta and delta), and the ordinary wolves (omega).
1. The GWO Procedure
1.1 Initialization
Randomly initialize the positions of the wolf pack within the search bounds, and initialize the alpha, beta, and delta wolves with their fitness scores set to infinity.
1.2 Fitness Evaluation
Evaluate the objective function for every wolf and keep the three best wolves found so far as alpha, beta, and delta.
1.3 Position Update
The remaining wolves update their positions according to the positions of the alpha, beta, and delta wolves; the update formulas are as follows:
1.3.1 Distance vectors
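For each wolf with position X, the distances to the three leaders are computed first. These are the standard GWO equations, written here with the same symbols as the code in Section 3:

$$D_\alpha = |C_1 X_\alpha - X|, \quad D_\beta = |C_2 X_\beta - X|, \quad D_\delta = |C_3 X_\delta - X|, \qquad C_k = 2 r_2$$

where $r_2$ is a uniform random number in $[0, 1]$, drawn independently for each leader and each dimension.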
1.3.2 New position vectors
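Each leader then proposes a candidate position for the wolf:

$$X_1 = X_\alpha - A_1 D_\alpha, \quad X_2 = X_\beta - A_2 D_\beta, \quad X_3 = X_\delta - A_3 D_\delta, \qquad A_k = 2 a r_1 - a$$

where $r_1$ is a uniform random number in $[0, 1]$ and the control parameter $a$ decreases linearly from 2 to 0 over the iterations.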
1.3.3 Updating the wolf position
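The wolf finally moves to the average of the three candidates:

$$X(t + 1) = \frac{X_1 + X_2 + X_3}{3}$$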
1.4 Flowchart
The two figures show the GWO flowchart and the grey wolf dominance hierarchy (dominance decreasing from top to bottom). GWO imitates the behaviour of the leader wolves and the rest of the pack during a hunt: the leaders' positions guide the other wolves step by step towards the prey (the global optimum), thereby achieving global optimization. Its core is the position-update formula above, which drives the pack ever closer to the optimal solution.
2. Strengths and Weaknesses of GWO
Strengths: compared with many other optimization algorithms, GWO converges quickly, because it first generates candidate solutions, then compares and ranks them, and outputs the best solution found.
Weaknesses: GWO is a heuristic algorithm, so the solution it produces is only an approximation of the true optimum; it is not guaranteed to be the exact optimum of the problem.
3. GWO Implementation
3.1 Minimizing a simple function with GWO
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = 'SimHei'
plt.rcParams['axes.unicode_minus'] = False

# Objective function of the optimization problem
def objective_function(x):
    return np.sum(x ** 2 + x)

# GWO parameters
dim = 2           # dimensionality of the solution
num_wolves = 300  # population size
max_iter = 100    # maximum number of iterations

# Initialize the positions of the wolf pack
wolves = np.random.uniform(-10, 10, (num_wolves, dim))

# Initialize the alpha, beta and delta wolves
alpha_pos = np.zeros(dim)
beta_pos = np.zeros(dim)
delta_pos = np.zeros(dim)
alpha_score = float("inf")
beta_score = float("inf")
delta_score = float("inf")

# Main loop
a = 2  # control parameter, decreases over the iterations
convergence_curve = []
for t in range(max_iter):
    # Evaluate fitness and update the three leader wolves
    for i in range(num_wolves):
        fitness = objective_function(wolves[i])
        if fitness < alpha_score:
            alpha_score = fitness
            alpha_pos = wolves[i].copy()
        elif fitness < beta_score:
            beta_score = fitness
            beta_pos = wolves[i].copy()
        elif fitness < delta_score:
            delta_score = fitness
            delta_pos = wolves[i].copy()

    a = 2 - t * (2 / max_iter)  # decrease a linearly from 2 to 0

    # Update every wolf, dimension by dimension
    for i in range(num_wolves):
        for j in range(dim):
            r1, r2 = np.random.rand(), np.random.rand()
            A1 = 2 * a * r1 - a
            C1 = 2 * r2
            D_alpha = abs(C1 * alpha_pos[j] - wolves[i][j])
            X1 = alpha_pos[j] - A1 * D_alpha

            r1, r2 = np.random.rand(), np.random.rand()
            A2 = 2 * a * r1 - a
            C2 = 2 * r2
            D_beta = abs(C2 * beta_pos[j] - wolves[i][j])
            X2 = beta_pos[j] - A2 * D_beta

            r1, r2 = np.random.rand(), np.random.rand()
            A3 = 2 * a * r1 - a
            C3 = 2 * r2
            D_delta = abs(C3 * delta_pos[j] - wolves[i][j])
            X3 = delta_pos[j] - A3 * D_delta

            wolves[i][j] = (X1 + X2 + X3) / 3

    convergence_curve.append(alpha_score)

print(f"Best solution: {alpha_pos}")
print(f"Best value: {alpha_score}")

# Plot the convergence curve
plt.figure(figsize=(15, 5))
plt.plot(convergence_curve)
plt.title("Convergence curve")
plt.xlabel("Iteration")
plt.ylabel("Fitness")
plt.show()

# Visualize the final wolf positions
if dim == 2:
    plt.figure(figsize=(15, 5))
    plt.scatter(wolves[:, 0], wolves[:, 1], c='blue', label='Wolves')
    plt.scatter(alpha_pos[0], alpha_pos[1], c='red', label='Alpha', marker='x')
    plt.scatter(beta_pos[0], beta_pos[1], c='green', label='Beta', marker='x')
    plt.scatter(delta_pos[0], delta_pos[1], c='purple', label='Delta', marker='x')
    plt.title("Wolf pack positions")
    plt.legend()
    plt.show()
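As a quick sanity check on the values printed above (this derivation is not part of the original code), the minimum of this objective can be computed analytically. Each term $x_i^2 + x_i$ is minimized where its derivative vanishes:

$$\frac{d}{dx}\left(x^2 + x\right) = 2x + 1 = 0 \;\Rightarrow\; x^* = -\tfrac{1}{2}, \qquad f(x^*) = 2\left(\tfrac{1}{4} - \tfrac{1}{2}\right) = -0.5 \ \text{(for dim = 2)}$$

so alpha_pos should converge towards (-0.5, -0.5) and alpha_score towards -0.5.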
3.2 Minimizing the Rastrigin benchmark function with GWO
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

plt.rcParams['font.sans-serif'] = 'SimHei'
plt.rcParams['axes.unicode_minus'] = False

# Objective function (Rastrigin function)
def objective_function(x):
    return 10 * len(x) + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))

# GWO parameters
dim = 2           # dimensionality of the solution
num_wolves = 300  # population size
max_iter = 100    # maximum number of iterations

# Initialize the positions of the wolf pack within the Rastrigin domain
wolves = np.random.uniform(-5.12, 5.12, (num_wolves, dim))

# Initialize the alpha, beta and delta wolves
alpha_pos = np.zeros(dim)
beta_pos = np.zeros(dim)
delta_pos = np.zeros(dim)
alpha_score = float("inf")
beta_score = float("inf")
delta_score = float("inf")

# Main loop
a = 2  # control parameter, decreases over the iterations
convergence_curve = []
for t in range(max_iter):
    for i in range(num_wolves):
        fitness = objective_function(wolves[i])
        if fitness < alpha_score:
            alpha_score = fitness
            alpha_pos = wolves[i].copy()
        elif fitness < beta_score:
            beta_score = fitness
            beta_pos = wolves[i].copy()
        elif fitness < delta_score:
            delta_score = fitness
            delta_pos = wolves[i].copy()

    a = 2 - t * (2 / max_iter)  # decrease a linearly from 2 to 0

    for i in range(num_wolves):
        for j in range(dim):
            r1, r2 = np.random.rand(), np.random.rand()
            A1 = 2 * a * r1 - a
            C1 = 2 * r2
            D_alpha = abs(C1 * alpha_pos[j] - wolves[i][j])
            X1 = alpha_pos[j] - A1 * D_alpha

            r1, r2 = np.random.rand(), np.random.rand()
            A2 = 2 * a * r1 - a
            C2 = 2 * r2
            D_beta = abs(C2 * beta_pos[j] - wolves[i][j])
            X2 = beta_pos[j] - A2 * D_beta

            r1, r2 = np.random.rand(), np.random.rand()
            A3 = 2 * a * r1 - a
            C3 = 2 * r2
            D_delta = abs(C3 * delta_pos[j] - wolves[i][j])
            X3 = delta_pos[j] - A3 * D_delta

            wolves[i][j] = (X1 + X2 + X3) / 3

    convergence_curve.append(alpha_score)

print(f"Best solution: {alpha_pos}")
print(f"Best value: {alpha_score}")

# Plot the convergence curve
plt.figure(figsize=(15, 5))
plt.plot(convergence_curve)
plt.title("Convergence curve")
plt.xlabel("Iteration")
plt.ylabel("Fitness")
plt.show()

# Plot the Rastrigin surface and the three leader wolves
if dim == 2:
    x = np.linspace(-5.12, 5.12, 400)
    y = np.linspace(-5.12, 5.12, 400)
    X, Y = np.meshgrid(x, y)
    Z = 10 * 2 + (X ** 2 - 10 * np.cos(2 * np.pi * X)) + (Y ** 2 - 10 * np.cos(2 * np.pi * Y))
    fig = plt.figure(figsize=(8, 8))
    ax = fig.add_subplot(111, projection='3d')
    ax.plot_surface(X, Y, Z, cmap='viridis', alpha=0.2)
    ax.scatter(alpha_pos[0], alpha_pos[1], alpha_score, c='red', label='Alpha', marker='x')
    ax.scatter(beta_pos[0], beta_pos[1], beta_score, c='green', label='Beta', marker='x')
    ax.scatter(delta_pos[0], delta_pos[1], delta_score, c='purple', label='Delta', marker='x')
    ax.set_title("Rastrigin optimization result")
    ax.legend()
    plt.show()
The Rastrigin function is a multimodal function commonly used to benchmark the performance of optimization algorithms. Its mathematical expression is:
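For an n-dimensional input (this matches the objective_function used in the code above):

$$f(x) = 10n + \sum_{i=1}^{n}\left(x_i^2 - 10\cos(2\pi x_i)\right), \qquad x_i \in [-5.12, 5.12]$$

Its global minimum is $f(\mathbf{0}) = 0$, which is the value the alpha wolf's score should approach in the convergence curve.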
4. Applying GWO to Model Optimization
4.1 Simple Data Preprocessing
import pandas as pd
import numpy as np

df = pd.read_excel('数据.xlsx', index_col=0, parse_dates=['数据时间'])

# Data preprocessing: min-max normalization of the active power column
df_max = np.max(df['总有功功率(kw)'])
df_min = np.min(df['总有功功率(kw)'])
df_bz = (df['总有功功率(kw)'] - df_min) / (df_max - df_min)

# Build sliding-window samples: win_size past points predict the next point
def prepare_data(data, win_size):
    X = []
    y = []
    for i in range(len(data) - win_size):
        temp_x = data[i:i + win_size]
        temp_y = data[i + win_size]
        X.append(temp_x)
        y.append(temp_y)
    X = np.asarray(X)
    y = np.asarray(y)
    X = np.expand_dims(X, axis=-1)
    return X, y

win_size = 12
X, y = prepare_data(df_bz.values, win_size)

# 70/30 train/test split in time order
train_size = int(len(X) * 0.7)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]
4.2 Searching for the Best Hyperparameters with GWO
import tensorflow.compat.v1 as tf
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Sequential
import matplotlib.pyplot as plt
from tcn import TCN

# Disable eager execution of TensorFlow 2.x
tf.disable_eager_execution()

# Objective function: build and train a TCN model with the given hyperparameters
# and return the minimum validation loss
def objective_function(params):
    dense_1, dense_2, filters1 = params
    dense_1, dense_2, filters1 = int(dense_1), int(dense_2), int(filters1)
    model = Sequential()
    model.add(TCN(nb_filters=filters1, kernel_size=6, activation='relu',
                  input_shape=(win_size, 1), dilations=[1, 2, 4, 8, 16]))
    model.add(Flatten())
    model.add(Dense(dense_1, activation='relu'))
    model.add(Dense(dense_2, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='mse')
    history = model.fit(X_train, y_train, epochs=10, batch_size=32,
                        validation_data=(X_test, y_test), verbose=0)
    val_loss = min(history.history['val_loss'])
    return val_loss

# GWO parameters
dim = 3                        # number of hyperparameters
num_wolves = 5                 # population size
max_iter = 20                  # maximum number of iterations
lower_bound = [32, 64, 32]     # lower bounds of the hyperparameters
upper_bound = [128, 256, 128]  # upper bounds of the hyperparameters

# Initialize the positions of the wolf pack within the bounds
wolves = np.random.uniform(lower_bound, upper_bound, (num_wolves, dim))

# Initialize the alpha, beta and delta wolves
alpha_pos = np.zeros(dim)
beta_pos = np.zeros(dim)
delta_pos = np.zeros(dim)
alpha_score = float("inf")
beta_score = float("inf")
delta_score = float("inf")

# Main loop
a = 2  # control parameter, decreases over the iterations
convergence_curve = []
for t in range(max_iter):
    for i in range(num_wolves):
        fitness = objective_function(wolves[i])
        if fitness < alpha_score:
            alpha_score = fitness
            alpha_pos = wolves[i].copy()
        elif fitness < beta_score:
            beta_score = fitness
            beta_pos = wolves[i].copy()
        elif fitness < delta_score:
            delta_score = fitness
            delta_pos = wolves[i].copy()

    a = 2 - t * (2 / max_iter)  # decrease a linearly from 2 to 0

    for i in range(num_wolves):
        for j in range(dim):
            r1, r2 = np.random.rand(), np.random.rand()
            A1 = 2 * a * r1 - a
            C1 = 2 * r2
            D_alpha = abs(C1 * alpha_pos[j] - wolves[i][j])
            X1 = alpha_pos[j] - A1 * D_alpha

            r1, r2 = np.random.rand(), np.random.rand()
            A2 = 2 * a * r1 - a
            C2 = 2 * r2
            D_beta = abs(C2 * beta_pos[j] - wolves[i][j])
            X2 = beta_pos[j] - A2 * D_beta

            r1, r2 = np.random.rand(), np.random.rand()
            A3 = 2 * a * r1 - a
            C3 = 2 * r2
            D_delta = abs(C3 * delta_pos[j] - wolves[i][j])
            X3 = delta_pos[j] - A3 * D_delta

            # Clip the new position back into the search bounds
            wolves[i][j] = np.clip((X1 + X2 + X3) / 3, lower_bound[j], upper_bound[j])

    convergence_curve.append(alpha_score)

print(f"Best solution: {alpha_pos}")
print(f"Best value: {alpha_score}")

plt.plot(convergence_curve)
plt.title("Convergence curve")
plt.xlabel("Iteration")
plt.ylabel("Validation loss")
plt.show()
Here the objective function objective_function(params) builds and trains the model and returns the smallest validation loss observed during training. Its input params is a list of hyperparameters (the two dense-layer sizes and the number of TCN filters). The GWO search is initialized as follows:
dim = 3                        # number of hyperparameters
num_wolves = 5                 # population size
max_iter = 20                  # maximum number of iterations
lower_bound = [32, 64, 32]     # lower bounds of the hyperparameters
upper_bound = [128, 256, 128]  # upper bounds of the hyperparameters
These values are deliberately small; in practice they should be adjusted to the complexity of the task. A larger population and a larger iteration budget usually yield better solutions, at the cost of more computation; the settings here only demonstrate how GWO can be used for hyperparameter search. The algorithm returns the best hyperparameters within the given search range and iteration budget (an illustrative larger configuration is sketched below), and these are used next to train the model.
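For instance, a more thorough search on a harder task might widen both the bounds and the budget. The values below are hypothetical and purely for illustration:

# Hypothetical search settings for a larger task (illustrative only)
dim = 3                        # still three hyperparameters
num_wolves = 20                # larger population
max_iter = 50                  # more iterations
lower_bound = [16, 16, 16]     # wider lower bounds
upper_bound = [256, 512, 256]  # wider upper bounds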
4.3 Training the Model with the Best Hyperparameters
# Retrain the model with the best hyperparameters found by GWO
dense_1, dense_2, filters1 = map(int, alpha_pos)

model = Sequential()
model.add(TCN(nb_filters=filters1, kernel_size=6, activation='relu',
              input_shape=(win_size, 1), dilations=[1, 2, 4, 8, 16]))
model.add(Flatten())
model.add(Dense(dense_1, activation='relu'))
model.add(Dense(dense_2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='mse')
history = model.fit(X_train, y_train, epochs=100, batch_size=32,
                    validation_data=(X_test, y_test), verbose=0)

# Plot training and validation loss
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()