Implementing a VMD-LSTM Time-Series Forecasting Model Based on VMD Decomposition to Substantially Improve Prediction Accuracy
VMD (Variational Mode Decomposition) is a signal-processing technique that decomposes a complex time series into several band-limited modes, each reflecting a different frequency component and amplitude variation of the signal. The core idea of VMD is to solve an optimization problem that splits the signal into a set of intrinsic mode functions (IMFs) with distinct frequency and amplitude characteristics that are, to a large extent, mutually orthogonal. Compared with traditional decomposition methods, VMD adapts better to nonlinear and non-stationary signals and extracts local features more effectively. It is widely applied in signal processing, biomedical engineering, seismology, and other fields.
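Formally, following the standard formulation by Dragomiretskiy and Zosso (which the vmdpy package used below implements), VMD looks for K modes u_k with center frequencies omega_k that minimize the sum of the estimated mode bandwidths while still reconstructing the signal f:

\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K} \left\| \partial_t\!\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k = f

In practice the constraint is relaxed into an augmented Lagrangian in which the bandwidth term is weighted by a penalty parameter; that weight appears as the `alpha` argument in the code later in this post.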
1. VMD-LSTM
1.1 How VMD-LSTM Works
VMD-LSTM uses VMD to decompose the original time series into several intrinsic mode functions. Each IMF captures a component of the original signal at a different frequency and amplitude and therefore gives a better local representation. Each IMF is then treated as its own input sequence, a separate LSTM forecasting model is trained on it, and the per-mode predictions are combined, which improves the overall predictive performance; a condensed sketch of this pipeline is shown below.
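Before the full walkthrough, here is a condensed, self-contained sketch of that pipeline on a synthetic toy signal (the layer size, epoch count, and toy data are illustrative only; the complete, step-by-step version on the real power dataset follows in Section 2):
import numpy as np
from vmdpy import VMD
from keras.models import Sequential
from keras.layers import LSTM, Dense

def make_windows(x, w=12):
    # sliding windows: each sample is the previous w values, the target is the next value
    X = np.array([x[i:i + w] for i in range(len(x) - w)])
    y = np.array([x[i + w] for i in range(len(x) - w)])
    return X[:, None, :], y   # shape (samples, 1, w), the layout used throughout this post

signal = np.sin(np.linspace(0, 20, 500)) + 0.1 * np.random.randn(500)    # toy series
u, _, _ = VMD(signal, alpha=1300, tau=0., K=5, DC=0, init=1, tol=1e-7)   # decompose into 5 modes

recon = 0
for imf in u:                                      # one LSTM per mode
    lo, hi = imf.min(), imf.max()
    X, y = make_windows((imf - lo) / (hi - lo))    # min-max normalize, then window
    m = Sequential([LSTM(64, input_shape=(1, 12)), Dense(1, activation='sigmoid')])
    m.compile(loss='mse', optimizer='adam')
    m.fit(X, y, epochs=2, batch_size=64, verbose=0)
    recon = recon + (m.predict(X, verbose=0).ravel() * (hi - lo) + lo)   # denormalize, then sum
# `recon` is the VMD-LSTM reconstruction of the series (offset by the window length)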
1.2 VMD-LSTM Flowchart
1.3 References
2. Code Implementation
2.1 Understanding the Data
import pandas as pd
df = pd.read_excel('数据.xlsx',index_col=0, parse_dates=['数据时间'])
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = 'SimHei' # enable Chinese glyph display (the column names are Chinese)
plt.rcParams['axes.unicode_minus'] = False
plt.figure(figsize=(15, 5))
plt.plot(df['总有功功率(kw)'], label='raw data', color='r', alpha=0.3)
plt.title('Time series plot')
plt.grid(True)
plt.legend()
plt.show()
In the power industry, accurately forecasting daily power fluctuations is essential for grid dispatching and energy planning. Deep-learning models can capture the complex relationships in the data and thereby improve forecasting accuracy. First, load the data: this series is the daily average total active power (kW), and outliers and missing values have already been handled to some extent.
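The cleaning itself is outside the scope of this post; as a rough illustration of the kind of preprocessing mentioned above (hypothetical thresholds, not the procedure actually applied to this dataset), missing values could be interpolated and extreme spikes clipped like this:
# Hypothetical cleaning sketch only -- the data loaded above has already been preprocessed
ser = df['总有功功率(kw)']
ser_filled = ser.interpolate(limit_direction='both')                 # fill gaps by linear interpolation
q1, q3 = ser_filled.quantile([0.25, 0.75])                           # quartiles for the IQR rule
iqr = q3 - q1
ser_clean = ser_filled.clip(lower=q1 - 3 * iqr, upper=q3 + 3 * iqr)  # clip extreme outliers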
2.2 Building the LSTM Model
import numpy as np
# min-max normalization to [0, 1]
arr_max = np.max(np.array(df))
arr_min = np.min(np.array(df))
data_bz = (np.array(df)-arr_min)/(arr_max-arr_min)
data_bz = data_bz.ravel() # flatten to a 1-D array
def dataset(data, win_size=12):
    X = []  # list of input windows
    Y = []  # list of target values
    # slide a window over the series to build samples
    for i in range(len(data) - win_size):
        temp_x = data[i:i + win_size]  # the win_size past values as features
        temp_y = data[i + win_size]    # the next value as the target
        X.append(temp_x)
        Y.append(temp_y)
    # convert the lists to NumPy arrays
    X = np.asarray(X)
    Y = np.asarray(Y)
    return X, Y
data_x, data_y = dataset(data_bz, 12)
data_x = np.expand_dims(data_x, axis=1)  # reshape to (samples, 1, win_size): one time step with 12 lagged features
from sklearn.model_selection import train_test_split
from keras.layers import LSTM, Dense
from keras.models import Sequential
train_x, test_x, train_y, test_y = train_test_split(data_x, data_y, test_size = 0.2, shuffle = False)
# build the model with the Sequential API
model = Sequential()
# the LSTM layer has 256 hidden units; input_shape specifies the shape of each sample,
# here (1, window size): one time step whose 12 lagged values act as features
model.add(LSTM(256, input_shape=(train_x.shape[1], train_x.shape[2])))
# sigmoid output activation (valid here because the targets were normalized to [0, 1] above)
model.add(Dense(1, activation='sigmoid'))
# compile the model
# loss: mean squared error (MSE), suitable for regression
# optimizer: Adam, a commonly used optimizer
model.compile(loss='mse', optimizer='adam')
# train the model
# epochs: number of passes over the training data
# batch_size: number of samples per training batch
# validation_split: fraction of the training data held out as a validation set (20% here)
# shuffle: whether to shuffle the training data before each epoch (kept False for time series)
history = model.fit(train_x, train_y, epochs=18, batch_size=64, validation_split=0.2, shuffle=False)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'], c='r')
plt.legend(['loss', 'val_loss'])
plt.show()
y_pred = model.predict(test_x)  # predictions on the test set
from sklearn import metrics
# mean squared error (MSE)
mse = metrics.mean_squared_error(test_y, np.array([i for arr in y_pred for i in arr]))
# root mean squared error (RMSE)
rmse = np.sqrt(mse)
# mean absolute error (MAE)
mae = metrics.mean_absolute_error(test_y, np.array([i for arr in y_pred for i in arr]))
from sklearn.metrics import r2_score  # goodness of fit
r2 = r2_score(test_y, np.array([i for arr in y_pred for i in arr]))
print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("R-squared:", r2)
def predict_all(model, last_x, num=24):
    # roll the forecast forward: feed each new prediction back in as the latest input
    pred_y = []
    for i in range(num):
        temp_y = model.predict(last_x)
        pred_y.append(temp_y[0, 0])
        temp_y = np.expand_dims(temp_y, 0)
        last_x = np.concatenate([last_x[:, :, 1:], temp_y], axis=2)  # drop the oldest step, append the prediction
    return np.asarray(pred_y)
last_x = test_x[-1]
last_x = np.expand_dims(last_x, 0)
series_pre = predict_all(model, last_x, num=30)
series = series_pre*(arr_max-arr_min)+arr_min # undo the normalization for the 30-day ahead forecast
plt.figure(figsize=(15,4), dpi =300)
plt.plot(test_y*(arr_max-arr_min)+arr_min, color = 'c', label = 'actual daily power fluctuation')
plt.plot(y_pred*(arr_max-arr_min)+arr_min, color = 'r', label = 'predicted daily power fluctuation')
plt.plot(range(len(y_pred), len(y_pred)+len(series)), series, color = 'b', label = '30-day ahead forecast')
plt.title('Actual vs. predicted daily power fluctuation')
plt.grid(True)
plt.xlabel('Time')
plt.ylabel('Total active power (kW)')
plt.legend()
plt.show()
Line-by-line commentary on this code is not repeated here; see the earlier posts (links are given at the end of this article). Building a plain LSTM model shows that its performance on this forecasting task is not satisfactory: the gap between predictions and actual values is large and the R-squared is low, meaning the model does not capture the variability in the data well. Further tuning of the architecture or hyperparameters, or a different model, would be needed. To improve accuracy, we next build the VMD-LSTM model.
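As one concrete example of the tuning directions mentioned above (illustrative settings only, not used for the results reported in this post), dropout and early stopping could be added to the plain LSTM like this:
from keras.callbacks import EarlyStopping
from keras.layers import Dropout

model_tuned = Sequential()
model_tuned.add(LSTM(128, input_shape=(train_x.shape[1], train_x.shape[2])))
model_tuned.add(Dropout(0.2))                     # regularization against overfitting
model_tuned.add(Dense(1, activation='sigmoid'))
model_tuned.compile(loss='mse', optimizer='adam')
# stop once the validation loss has not improved for 5 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
model_tuned.fit(train_x, train_y, epochs=100, batch_size=64,
                validation_split=0.2, shuffle=False, callbacks=[early_stop])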
2.3 Decomposing the Time Series with VMD
from vmdpy import VMD
# VMD parameters
alpha = 1300   # moderate bandwidth constraint
tau = 0.       # noise tolerance (no strict fidelity enforcement)
K = 5          # decompose into 5 modes
DC = 0         # do not impose a DC component
init = 1       # initialize the center frequencies uniformly
tol = 1e-7     # convergence tolerance
# run VMD
u, u_hat, omega = VMD(df['总有功功率(kw)'], alpha, tau, K, DC, init, tol)
# visualize the 5 decomposed IMFs
plt.figure(figsize=(15, 8))
for i in range(K):
    plt.subplot(K, 1, i+1)
    plt.plot(u[i, :])
    plt.title(f'IMF {i+1}')
    plt.xlabel('Time')
    plt.ylabel('Amplitude')
plt.tight_layout()
plt.show()
# store each IMF in its own DataFrame
imf_dataframes = {}
for i in range(len(u)):
    imf_name = 'imf_{}'.format(i+1)
    imf_dataframes[imf_name] = pd.DataFrame({'Value': u[i]})
This code clearly displays each intrinsic mode function obtained from the VMD decomposition and stores the decomposed series. Next, an LSTM model is built for each of them.
2.3.1 IMF1—LSTM
arr_max_1 = np.max(np.array(imf_dataframes['imf_1']))
arr_min_1 = np.min(np.array(imf_dataframes['imf_1']))
data_bz_1 = (np.array(imf_dataframes['imf_1'])-arr_min_1)/(arr_max_1-arr_min_1)
data_bz_1 = data_bz_1.ravel() # flatten to a 1-D array
data_x_1, data_y_1 = dataset(data_bz_1, 12)
data_x_1 = np.expand_dims(data_x_1, axis=1)
train_x_1, test_x_1, train_y_1, test_y_1 = train_test_split(data_x_1, data_y_1, test_size = 0.2, shuffle = False)
model_1 =Sequential()
model_1.add(LSTM(256, input_shape=(train_x_1.shape[1], train_x_1.shape[2])))
model_1.add(Dense(1, activation='sigmoid'))
model_1.compile(loss='mse', optimizer='adam')
history_1 = model_1.fit(train_x_1, train_y_1, epochs=18, batch_size=64, validation_split=0.2, shuffle=False)
plt.plot(history_1.history['loss'])
plt.plot(history_1.history['val_loss'], c='r')
plt.legend(['loss', 'val_loss'])
plt.show()
y_pred_1 = model_1.predict(test_x_1)
# mean squared error (MSE)
mse = metrics.mean_squared_error(test_y_1, np.array([i for arr in y_pred_1 for i in arr]))
# root mean squared error (RMSE)
rmse = np.sqrt(mse)
# mean absolute error (MAE)
mae = metrics.mean_absolute_error(test_y_1, np.array([i for arr in y_pred_1 for i in arr]))
from sklearn.metrics import r2_score  # goodness of fit
r2 = r2_score(test_y_1, np.array([i for arr in y_pred_1 for i in arr]))
print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("R-squared:", r2)
last_x_1 = test_x_1[-1]
last_x_1 = np.expand_dims(last_x_1, 0)
series_pre_1 = predict_all(model_1, last_x_1, num=30)
series_1 = series_pre_1*(arr_max_1-arr_min_1)+arr_min_1 # undo the normalization for the 30-day ahead forecast
plt.figure(figsize=(15,4), dpi =300)
plt.plot(test_y_1*(arr_max_1-arr_min_1)+arr_min_1, color = 'c', label = 'imf_1 actual (test set)')
plt.plot(y_pred_1*(arr_max_1-arr_min_1)+arr_min_1, color = 'r', label = 'imf_1 predicted (test set)')
plt.plot(range(len(y_pred_1), len(y_pred_1)+len(series_1)), series_1, color = 'b', label = 'imf_1 30-day ahead forecast')
plt.title('imf_1')
plt.grid(True)
plt.xlabel('Time')
plt.ylabel('Total active power (kW)')
plt.legend()
plt.show()
2.3.2 IMF2—LSTM
arr_max_2 = np.max(np.array(imf_dataframes['imf_2']))
arr_min_2 = np.min(np.array(imf_dataframes['imf_2']))
data_bz_2 = (np.array(imf_dataframes['imf_2'])-arr_min_2)/(arr_max_2-arr_min_2)
data_bz_2 = data_bz_2.ravel() # flatten to a 1-D array
data_x_2, data_y_2 = dataset(data_bz_2, 12)
data_x_2 = np.expand_dims(data_x_2, axis=1)
train_x_2, test_x_2, train_y_2, test_y_2 = train_test_split(data_x_2, data_y_2, test_size = 0.2, shuffle = False)
model_2 =Sequential()
model_2.add(LSTM(256, input_shape=(train_x_2.shape[1], train_x_2.shape[2])))
model_2.add(Dense(1, activation='sigmoid'))
model_2.compile(loss='mse', optimizer='adam')
history_2 = model_2.fit(train_x_2, train_y_2, epochs=18, batch_size=64, validation_split=0.2, shuffle=False)
plt.plot(history_2.history['loss'])
plt.plot(history_2.history['val_loss'], c='r')
plt.legend(['loss', 'val_loss'])
plt.show()
y_pred_2 = model_2.predict(test_x_2)
# mean squared error (MSE)
mse = metrics.mean_squared_error(test_y_2, np.array([i for arr in y_pred_2 for i in arr]))
# root mean squared error (RMSE)
rmse = np.sqrt(mse)
# mean absolute error (MAE)
mae = metrics.mean_absolute_error(test_y_2, np.array([i for arr in y_pred_2 for i in arr]))
from sklearn.metrics import r2_score  # goodness of fit
r2 = r2_score(test_y_2, np.array([i for arr in y_pred_2 for i in arr]))
print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("R-squared:", r2)
last_x_2 = test_x_2[-1]
last_x_2 = np.expand_dims(last_x_2, 0)
series_pre_2 = predict_all(model_2, last_x_2, num=30)
series_2 = series_pre_2*(arr_max_2-arr_min_2)+arr_min_2 # undo the normalization for the 30-day ahead forecast
plt.figure(figsize=(15,4), dpi =300)
plt.plot(test_y_2*(arr_max_2-arr_min_2)+arr_min_2, color = 'c', label = 'imf_2 actual (test set)')
plt.plot(y_pred_2*(arr_max_2-arr_min_2)+arr_min_2, color = 'r', label = 'imf_2 predicted (test set)')
plt.plot(range(len(y_pred_2), len(y_pred_2)+len(series_2)), series_2, color = 'b', label = 'imf_2 30-day ahead forecast')
plt.title('imf_2')
plt.grid(True)
plt.xlabel('Time')
plt.ylabel('Total active power (kW)')
plt.legend()
plt.show()
2.3.3 IMF3—LSTM
arr_max_3 = np.max(np.array(imf_dataframes['imf_3']))
arr_min_3 = np.min(np.array(imf_dataframes['imf_3']))
data_bz_3 = (np.array(imf_dataframes['imf_3'])-arr_min_3)/(arr_max_3-arr_min_3)
data_bz_3 = data_bz_3.ravel() # flatten to a 1-D array
data_x_3, data_y_3 = dataset(data_bz_3, 12)
data_x_3 = np.expand_dims(data_x_3, axis=1)
train_x_3, test_x_3, train_y_3, test_y_3 = train_test_split(data_x_3, data_y_3, test_size = 0.2, shuffle = False)
model_3 =Sequential()
model_3.add(LSTM(256, input_shape=(train_x_3.shape[1], train_x_3.shape[2])))
model_3.add(Dense(1, activation='sigmoid'))
model_3.compile(loss='mse', optimizer='adam')
history_3 = model_3.fit(train_x_3, train_y_3, epochs=18, batch_size=64, validation_split=0.2, shuffle=False)
plt.plot(history_3.history['loss'])
plt.plot(history_3.history['val_loss'], c='r')
plt.legend(['loss', 'val_loss'])
plt.show()
y_pred_3 = model_3.predict(test_x_3)
# mean squared error (MSE)
mse = metrics.mean_squared_error(test_y_3, np.array([i for arr in y_pred_3 for i in arr]))
# root mean squared error (RMSE)
rmse = np.sqrt(mse)
# mean absolute error (MAE)
mae = metrics.mean_absolute_error(test_y_3, np.array([i for arr in y_pred_3 for i in arr]))
from sklearn.metrics import r2_score  # goodness of fit
r2 = r2_score(test_y_3, np.array([i for arr in y_pred_3 for i in arr]))
print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("R-squared:", r2)
last_x_3 = test_x_3[-1]
last_x_3 = np.expand_dims(last_x_3, 0)
series_pre_3 = predict_all(model_3, last_x_3, num=30)
series_3 = series_pre_3*(arr_max_3-arr_min_3)+arr_min_3 # undo the normalization for the 30-day ahead forecast
plt.figure(figsize=(15,4), dpi =300)
plt.plot(test_y_3*(arr_max_3-arr_min_3)+arr_min_3, color = 'c', label = 'imf_3 actual (test set)')
plt.plot(y_pred_3*(arr_max_3-arr_min_3)+arr_min_3, color = 'r', label = 'imf_3 predicted (test set)')
plt.plot(range(len(y_pred_3), len(y_pred_3)+len(series_3)), series_3, color = 'b', label = 'imf_3 30-day ahead forecast')
plt.title('imf_3')
plt.grid(True)
plt.xlabel('Time')
plt.ylabel('Total active power (kW)')
plt.legend()
plt.show()
2.3.4 IMF4—LSTM
arr_max_4 = np.max(np.array(imf_dataframes['imf_4']))
arr_min_4 = np.min(np.array(imf_dataframes['imf_4']))
data_bz_4 = (np.array(imf_dataframes['imf_4'])-arr_min_4)/(arr_max_4-arr_min_4)
data_bz_4 = data_bz_4.ravel() # flatten to a 1-D array
data_x_4, data_y_4 = dataset(data_bz_4, 12)
data_x_4 = np.expand_dims(data_x_4, axis=1)
train_x_4, test_x_4, train_y_4, test_y_4 = train_test_split(data_x_4, data_y_4, test_size = 0.2, shuffle = False)
model_4 =Sequential()
model_4.add(LSTM(256, input_shape=(train_x_4.shape[1], train_x_4.shape[2])))
model_4.add(Dense(1, activation='sigmoid'))
model_4.compile(loss='mse', optimizer='adam')
history_4 = model_4.fit(train_x_4, train_y_4, epochs=18, batch_size=64, validation_split=0.2, shuffle=False)
plt.plot(history_4.history['loss'])
plt.plot(history_4.history['val_loss'], c='r')
plt.legend(['loss', 'val_loss'])
plt.show()
y_pred_4 = model_4.predict(test_x_4)
# mean squared error (MSE)
mse = metrics.mean_squared_error(test_y_4, np.array([i for arr in y_pred_4 for i in arr]))
# root mean squared error (RMSE)
rmse = np.sqrt(mse)
# mean absolute error (MAE)
mae = metrics.mean_absolute_error(test_y_4, np.array([i for arr in y_pred_4 for i in arr]))
from sklearn.metrics import r2_score  # goodness of fit
r2 = r2_score(test_y_4, np.array([i for arr in y_pred_4 for i in arr]))
print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("R-squared:", r2)
last_x_4 = test_x_4[-1]
last_x_4 = np.expand_dims(last_x_4, 0)
series_pre_4 = predict_all(model_4, last_x_4, num=30)
series_4 = series_pre_4*(arr_max_4-arr_min_4)+arr_min_4 # undo the normalization for the 30-day ahead forecast
plt.figure(figsize=(15,4), dpi =300)
plt.plot(test_y_4*(arr_max_4-arr_min_4)+arr_min_4, color = 'c', label = 'imf_4 actual (test set)')
plt.plot(y_pred_4*(arr_max_4-arr_min_4)+arr_min_4, color = 'r', label = 'imf_4 predicted (test set)')
plt.plot(range(len(y_pred_4), len(y_pred_4)+len(series_4)), series_4, color = 'b', label = 'imf_4 30-day ahead forecast')
plt.title('imf_4')
plt.grid(True)
plt.xlabel('Time')
plt.ylabel('Total active power (kW)')
plt.legend()
plt.show()
2.3.5 IMF5—LSTM
arr_max_5 = np.max(np.array(imf_dataframes['imf_5']))
arr_min_5 = np.min(np.array(imf_dataframes['imf_5']))
data_bz_5 = (np.array(imf_dataframes['imf_5'])-arr_min_5)/(arr_max_5-arr_min_5)
data_bz_5 = data_bz_5.ravel() # flatten to a 1-D array
data_x_5, data_y_5 = dataset(data_bz_5, 12)
data_x_5 = np.expand_dims(data_x_5, axis=1)
train_x_5, test_x_5, train_y_5, test_y_5 = train_test_split(data_x_5, data_y_5, test_size = 0.2, shuffle = False)
model_5 =Sequential()
model_5.add(LSTM(256, input_shape=(train_x_5.shape[1], train_x_5.shape[2])))
model_5.add(Dense(1, activation='sigmoid'))
model_5.compile(loss='mse', optimizer='adam')
history_5 = model_5.fit(train_x_5, train_y_5, epochs=18, batch_size=64, validation_split=0.2, shuffle=False)
plt.plot(history_5.history['loss'])
plt.plot(history_5.history['val_loss'], c='r')
plt.legend(['loss', 'val_loss'])
plt.show()
y_pred_5 = model_5.predict(test_x_5)
# mean squared error (MSE)
mse = metrics.mean_squared_error(test_y_5, np.array([i for arr in y_pred_5 for i in arr]))
# root mean squared error (RMSE)
rmse = np.sqrt(mse)
# mean absolute error (MAE)
mae = metrics.mean_absolute_error(test_y_5, np.array([i for arr in y_pred_5 for i in arr]))
from sklearn.metrics import r2_score  # goodness of fit
r2 = r2_score(test_y_5, np.array([i for arr in y_pred_5 for i in arr]))
print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("R-squared:", r2)
last_x_5 = test_x_5[-1]
last_x_5 = np.expand_dims(last_x_5, 0)
series_pre_5 = predict_all(model_5, last_x_5, num=30)
series_5 = series_pre_5*(arr_max_5-arr_min_5)+arr_min_5 # undo the normalization for the 30-day ahead forecast
plt.figure(figsize=(15,4), dpi =300)
plt.plot(test_y_5*(arr_max_5-arr_min_5)+arr_min_5, color = 'c', label = 'imf_5 actual (test set)')
plt.plot(y_pred_5*(arr_max_5-arr_min_5)+arr_min_5, color = 'r', label = 'imf_5 predicted (test set)')
plt.plot(range(len(y_pred_5), len(y_pred_5)+len(series_5)), series_5, color = 'b', label = 'imf_5 30-day ahead forecast')
plt.title('imf_5')
plt.grid(True)
plt.xlabel('Time')
plt.ylabel('Total active power (kW)')
plt.legend()
plt.show()
Although these five models are identical except for their input data, they are built one by one rather than in a loop to emphasize that each per-mode LSTM can have its parameters tuned independently; this does not affect the final VMD-LSTM construction.
If no per-model tuning is needed, the loop-based version below builds the models more concisely.
from vmdpy import VMD
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
plt.rcParams['font.sans-serif'] = 'SimHei' # enable Chinese glyph display (the column names are Chinese)
plt.rcParams['axes.unicode_minus'] = False
# VMD parameters
alpha = 1300   # moderate bandwidth constraint
tau = 0.       # noise tolerance (no strict fidelity enforcement)
K = 5          # decompose into 5 modes
DC = 0         # do not impose a DC component
init = 1       # initialize the center frequencies uniformly
tol = 1e-7     # convergence tolerance
# run VMD
u, _, _ = VMD(df['总有功功率(kw)'], alpha, tau, K, DC, init, tol)
# build the LSTM model
def create_lstm_model(input_shape):
    model = Sequential()
    model.add(LSTM(256, input_shape=input_shape))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
# prepare the training data for one IMF
def prepare_data(imf_data):
    arr_max = np.max(np.array(imf_data))
    arr_min = np.min(np.array(imf_data))
    data_bz = (np.array(imf_data) - arr_min) / (arr_max - arr_min)
    data_bz = data_bz.ravel()  # flatten to a 1-D array
    data_x, data_y = dataset(data_bz, 12)
    data_x = np.expand_dims(data_x, axis=1)
    train_x, test_x, train_y, test_y = train_test_split(data_x, data_y, test_size=0.2, shuffle=False)
    return train_x, test_x, train_y, test_y, arr_max, arr_min
# train the model and produce forecasts
def train_and_predict(model, train_x, train_y, test_x, test_y, arr_max, arr_min):
    history = model.fit(train_x, train_y, epochs=18, batch_size=64, validation_split=0.2, shuffle=False)
    y_pred = model.predict(test_x)
    mse = mean_squared_error(test_y, y_pred)
    rmse = np.sqrt(mse)
    mae = mean_absolute_error(test_y, y_pred)
    r2 = r2_score(test_y, y_pred)
    print("Mean squared error (MSE):", mse)
    print("Root mean squared error (RMSE):", rmse)
    print("Mean absolute error (MAE):", mae)
    print("R-squared:", r2)
    last_x = test_x[-1]
    last_x = np.expand_dims(last_x, 0)
    series_pre = predict_all(model, last_x, num=30)
    series = series_pre * (arr_max - arr_min) + arr_min
    plt.figure(figsize=(15, 4), dpi=300)
    plt.plot(test_y * (arr_max - arr_min) + arr_min, color='c', label='actual (test set)')
    plt.plot(y_pred * (arr_max - arr_min) + arr_min, color='r', label='predicted (test set)')
    plt.plot(range(len(y_pred), len(y_pred) + len(series)), series, color='b', label='30-day ahead forecast')
    plt.title('IMF')
    plt.grid(True)
    plt.xlabel('Time')
    plt.ylabel('Total active power (kW)')
    plt.legend()
    plt.show()
# loop over each IMF
for i, imf in enumerate(u):
    imf_name = f'imf_{i+1}'  # label for the current mode
    train_x, test_x, train_y, test_y, arr_max, arr_min = prepare_data(imf)
    model = create_lstm_model((train_x.shape[1], train_x.shape[2]))
    train_and_predict(model, train_x, train_y, test_x, test_y, arr_max, arr_min)
2.4 VMD—LSTM
# reconstruct the test-set prediction: denormalize each per-IMF prediction and sum them
pred = (y_pred_1*(arr_max_1-arr_min_1)+arr_min_1
        + y_pred_2*(arr_max_2-arr_min_2)+arr_min_2
        + y_pred_3*(arr_max_3-arr_min_3)+arr_min_3
        + y_pred_4*(arr_max_4-arr_min_4)+arr_min_4
        + y_pred_5*(arr_max_5-arr_min_5)+arr_min_5)
# reconstruct the 30-day ahead forecast by summing the per-IMF forecasts (already denormalized)
future_pred = series_1+series_2+series_3+series_4+series_5
test = test_y*(arr_max-arr_min)+arr_min
plt.figure(figsize=(15,4), dpi =300)
plt.plot(test, color = 'c', label = 'actual power (test set)')
plt.plot(pred, color = 'r', label = 'VMD-LSTM predicted power (test set)')
plt.plot(range(len(pred), len(pred)+len(future_pred)), future_pred, color = 'b', label = 'VMD-LSTM 30-day ahead forecast')
plt.title('VMD-LSTM')
plt.grid(True)
plt.xlabel('Time')
plt.ylabel('Total active power (kW)')
plt.legend()
plt.show()
2.5 Comparing the Evaluation Metrics of VMD-LSTM and LSTM
# mean squared error (MSE)
mse = metrics.mean_squared_error(test, np.array([i for arr in pred for i in arr]))
# root mean squared error (RMSE)
rmse = np.sqrt(mse)
# mean absolute error (MAE)
mae = metrics.mean_absolute_error(test, np.array([i for arr in pred for i in arr]))
from sklearn.metrics import r2_score  # goodness of fit
r2 = r2_score(test, np.array([i for arr in pred for i in arr]))
print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("R-squared:", r2)
The MSE, RMSE, and MAE above are computed on denormalized (restored) data, whereas the LSTM metrics reported earlier were computed on normalized data. For a fair comparison between the two models, the LSTM metrics are recomputed on denormalized data as well.
# recompute the LSTM metrics on denormalized data for a fair comparison
test = test_y*(arr_max-arr_min)+arr_min
pred_1 = y_pred*(arr_max-arr_min)+arr_min
# mean squared error (MSE)
mse = metrics.mean_squared_error(test, np.array([i for arr in pred_1 for i in arr]))
# root mean squared error (RMSE)
rmse = np.sqrt(mse)
# mean absolute error (MAE)
mae = metrics.mean_absolute_error(test, np.array([i for arr in pred_1 for i in arr]))
from sklearn.metrics import r2_score  # goodness of fit
r2 = r2_score(test, np.array([i for arr in pred_1 for i in arr]))
print("Mean squared error (MSE):", mse)
print("Root mean squared error (RMSE):", rmse)
print("Mean absolute error (MAE):", mae)
print("R-squared:", r2)
# models and evaluation metrics to compare
models = ['LSTM', 'VMD-LSTM']
metric_names = ['MSE', 'RMSE', 'MAE', 'R2']  # a distinct name avoids shadowing the sklearn metrics module
values = {
    'LSTM': [433308080.1553673, 20816.053424109174, 14396.407640033123, 0.490216331578196],
    'VMD-LSTM': [169117880.61997274, 13004.53307966006, 9261.236531374, 0.801034096693379]
}
# bar charts comparing the two models on each metric
fig, axs = plt.subplots(2, 2, figsize=(12, 10))
for i, metric in enumerate(metric_names):
    ax = axs[i//2, i%2]
    ax.bar(models, [values[model][i] for model in models], color=['skyblue', 'lightgreen'])
    ax.set_title(metric)
    ax.set_xlabel('Model')
    ax.set_ylabel(metric)
plt.tight_layout()
plt.show()
Overall, the VMD-LSTM model outperforms the plain LSTM model on every evaluation metric, with lower prediction error, higher prediction accuracy, and a better fit to the data.
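As a quick follow-up, the relative improvement can be computed directly from the `values` dictionary defined above; for example, the MSE drops by roughly 61% (R-squared is reported as an absolute gain rather than a percentage):
# quantify how much VMD-LSTM improves on LSTM, using the metric values above
for name, lstm_v, vmd_v in zip(['MSE', 'RMSE', 'MAE', 'R2'], values['LSTM'], values['VMD-LSTM']):
    if name == 'R2':
        print(f"{name}: +{vmd_v - lstm_v:.3f} (absolute increase)")
    else:
        print(f"{name}: {100 * (lstm_v - vmd_v) / lstm_v:.1f}% lower")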