Published 2022-09-20 13:46
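For the next few months I can try GCE for free, so I wanted to train my first DL model on it. I chose the e2-highmem-8 machine type, i.e. 8 vCPUs and 64 GB of memory.
Now, when I run my NN code on this instance, one epoch takes 12 seconds to complete, whereas on my own PC it takes only 3-4 seconds. So I have to wait roughly 3-4 times longer for a single training run.
I have tried using:
tf.compat.v1.ConfigProto(device_count={"CPU": 8},
                         inter_op_parallelism_threads=1,
                         intra_op_parallelism_threads=16,
                         )
and
sess = tf.compat.v1.Session(tf.compat.v1.ConfigProto(
    inter_op_parallelism_threads=1))
but neither of them helped.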
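A note on the two attempts above: a `ConfigProto` only takes effect when it is handed to a session through the `config=` keyword (the first positional parameter of `tf.compat.v1.Session` is `target`), and under TensorFlow 2 with eager execution Keras does not, as far as I know, route `model.fit` through a manually created v1 session. The following is a minimal sketch of the TF2-style way to set the thread pools, assuming TensorFlow 2.x is installed. `configure_tf_threads` is a hypothetical helper name, and the thread counts are illustrative rather than tuned values.

```python
import os


def configure_tf_threads(intra_op, inter_op):
    """Set TensorFlow's CPU thread pools (hypothetical helper, TF 2.x assumed).

    Must run before TensorFlow executes its first op; afterwards these
    setters raise RuntimeError.
    """
    import tensorflow as tf  # imported here so the sketch loads without TF

    tf.config.threading.set_intra_op_parallelism_threads(intra_op)
    tf.config.threading.set_inter_op_parallelism_threads(inter_op)
    return (
        tf.config.threading.get_intra_op_parallelism_threads(),
        tf.config.threading.get_inter_op_parallelism_threads(),
    )


# Logical CPUs visible to this process (8 on the VM described above).
n_cores = os.cpu_count() or 1

# Intended call, once at the very top of the training script:
# configure_tf_threads(intra_op=n_cores, inter_op=2)
```

Whether this changes the epoch time is a separate question; the sketch only shows where such settings would go so that TensorFlow actually sees them.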
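Since I am a beginner in both DL programming and GCE, I am not sure whether I configured the VM correctly so that it uses 100% of the CPU.
Since I don't know whether I am using the tf settings incorrectly or my code is simply bad, I am posting my code below. But first, let me explain why the NN code is wrapped in loops. Because I am a beginner and have trouble tuning hyperparameters, I put the NN code inside three loops, each of which changes one of: the number of neurons, the lr, and the number of epochs. That is why the computation time matters.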
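To put a number on why the time per epoch matters here, the search space can be enumerated up front. This is a rough sketch, assuming the loops in the script nest as 12 trials (neurons 16, 32, ...), then 50 epoch steps (50, 100, ...), then 10 learning rates (0.0005 to 0.005); the nesting is an assumption because the posted code lost its indentation. The 12 s and 4 s figures are the per-epoch timings quoted in this post.

```python
from itertools import product

# Values taken from the posted script; the nesting order is assumed.
neuron_options = range(16, 16 * 12 + 1, 16)   # 16, 32, ..., 192
epoch_options = range(50, 50 * 50 + 1, 50)    # 50, 100, ..., 2500
lr_options = [round(0.0005 * k, 4) for k in range(1, 11)]  # 0.0005 ... 0.005

grid = list(product(neuron_options, epoch_options, lr_options))
total_fits = len(grid)
total_epochs = sum(epochs for _, epochs, _ in grid)


def estimated_days(seconds_per_epoch):
    """Wall-clock days for the whole grid at a fixed time per epoch."""
    return total_epochs * seconds_per_epoch / 86_400


print(total_fits, total_epochs)
print(f"{estimated_days(12):.1f} days at 12 s/epoch")
print(f"{estimated_days(4):.1f} days at 4 s/epoch")
```

Under these assumptions the grid contains 6,000 fits and 7,650,000 epochs, so even a small change in seconds per epoch shifts the total run time by months.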
I am adding my code below because I don't know whether something in it makes the computation so slow.
Code:
import sys
import os
import sklearn
import math
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
from datetime import datetime
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from tensorflow import keras
from contextlib import contextmanager
np.set_printoptions(linewidth=3000)
import joblib
@contextmanager
def show_complete_array():
    oldoptions = np.get_printoptions()
    np.set_printoptions(threshold=np.inf)
    try:
        yield
    finally:
        np.set_printoptions(**oldoptions)
def split_dataset(data, output_len):
    input_len = len(data)
    samples = []
    for i in range(0, input_len, output_len):
        # each sample is one chunk of output_len rows
        sample = data[i : i + output_len, :]
        samples.append(sample)
    return samples
n_steps = 60
def plot_series(series, y=None, y_pred=None, x_label="$t$", y_label="$x(t)$"):
    plt.plot(series, ".-")
    if y is not None:
        plt.plot(n_steps, y, "bx", markersize=10)
    if y_pred is not None:
        plt.plot(n_steps, y_pred, "ro")
    plt.grid(True)
    if x_label:
        plt.xlabel(x_label, fontsize=16)
    if y_label:
        plt.ylabel(y_label, fontsize=16, rotation=0)
    plt.hlines(0, 0, 100, linewidth=1)
    mini = min(series) - 0.1*min(series)
    maxi = max(series) + 0.1*max(series)
    plt.axis([0, n_steps + 1, mini, maxi])
FILE = 'TVCSILVER60.csv'
FOLDER = 'Data SLV'
PROJECT_ROOT_DIR = '.'
csv_path = os.path.join(PROJECT_ROOT_DIR, FOLDER, FILE)
print(csv_path)
def save_fig(fig_id, tight_layout=True, fig_extension="png", resolution=300):
    path = os.path.join(PROJECT_ROOT_DIR, fig_id + "." + fig_extension)
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format=fig_extension, dpi=resolution)
Silver_CFD = pd.read_csv(csv_path, delimiter=',')
Silver_CFD.rename(columns={'Date' : 'Date', 'O' : 'Open', 'H' : 'High', 'L' : 'Low',
                           'C' : 'Close', 'Volume' : 'Volume', 'Volume MA' : 'Volume MA'}, inplace=True)
# Preparing the data for the random forest
Silver_CFD_prepared = Silver_CFD.drop('Date', axis=1)
Silver_CFD_prepared = Silver_CFD_prepared.drop('High', axis=1)
Silver_CFD_prepared = Silver_CFD_prepared.drop('Low', axis=1)
Silver_CFD_prepared = Silver_CFD_prepared.drop('Open', axis=1)
Silver_CFD_prepared = Silver_CFD_prepared.drop('Volume', axis=1)
Silver_CFD_prepared = Silver_CFD_prepared.drop('Volume MA', axis=1)
#Silver_CFD_prepared_High = Silver_CFD_prepared['High']
#Silver_CFD_prepared_Low = Silver_CFD_prepared['Low']
#Silver_CFD_prepared_Open = Silver_CFD_prepared['Open']
#Silver_CFD_prepared_Close = Silver_CFD_prepared['Close']
scaler = MinMaxScaler(feature_range=(0.1,0.9), copy=False)
Silver_CFD_prepared = Silver_CFD_prepared.to_numpy(dtype='float32')
data_to_plot = Silver_CFD_prepared
Silver_CFD_prepared = scaler.fit_transform(Silver_CFD_prepared)
joblib.dump(scaler, 'scaler.joblib')
data_train = Silver_CFD_prepared[:12530]
#data_train = data_train.to_numpy(dtype='float32')
data_valid = Silver_CFD_prepared[:12530]
#data_valid = data_valid.to_numpy(dtype='float32')
data_test = Silver_CFD_prepared[:12530]
#data_test = data_test.to_numpy(dtype='float32')
data_to_operate = Silver_CFD_prepared
to_y = Silver_CFD_prepared.copy()
print(data_train.shape)
x_train = np.empty((12470, 60, 1))
y_train = np.empty((12470, 20))
index = 0
to_file = np.empty((12470))
for i in range(12470):
    for j in range(60):
        x_train[i, j, 0] = data_train[index]
        index += 1
    for k in range(20):
        y_train[i, k] = to_y[i + 60 + k]
    index = 1
    index += i
print(x_train.shape)
print(y_train.shape)
x_valid = np.empty((12470, 60, 1))
y_valid = np.empty((12470, 20))
index = 0
for i in range(12470):
    for j in range(60):
        x_valid[i, j, 0] = data_valid[index]
        index += 1
    for k in range(20):
        y_valid[i, k] = to_y[i + 60 + k]
    index = 1
    index += i
x_test = np.empty((12470, 60, 1))
y_test = np.empty((12470, 20))
index = 0
for i in range(12470):
    for j in range(60):
        x_test[i, j, 0] = data_test[index]
        index += 1
    for k in range(20):
        y_test[i, k] = to_y[i + 60 + k]
    index = 1
    index += i
print('x_train.shape', x_train.shape, 'x_valid.shape', x_valid.shape, 'x_test.shape', x_test.shape)
print('y_train.shape', y_train.shape, 'y_valid.shape', y_valid.shape, 'y_test.shape', y_test.shape)
#x_train = x_train[:, :-1, :]
#x_valid = x_valid[:, :-1, :]
#x_test = x_test[:, :-1, :]
#print('x_train.shape', x_train.shape, 'x_valid.shape', x_valid.shape, 'x_test.shape', x_test.shape)
'''
# naive forecasting
y_naive_pred = x_valid[:, -1]
naive_mse = np.mean(keras.losses.mean_squared_error(y_valid, y_naive_pred))
print(naive_mse)
plot_series(x_valid[0, :, 0], y_valid[0, 0], y_naive_pred[0, 0])
plt.show()
'''
n_neurons = 16
epochs = 50
lr = 0.0000
for trial in range(12):
    model = keras.models.Sequential([
        keras.layers.SimpleRNN(n_neurons, activation='relu', return_sequences=True, input_shape=[None, 1]),
        keras.layers.Dropout(0.2),
        keras.layers.SimpleRNN(n_neurons),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(20, activation='linear')
    ])
    for iters in range(50):
        lr = 0
        for rates in range(10):
            lr = lr + 0.0005
            optimizer = keras.optimizers.Adam(learning_rate=lr)
            model.compile(loss='mse', optimizer=optimizer)
            history = model.fit(x_train, y_train, epochs=epochs,
                                validation_data=(x_valid, y_valid))
            train_predict = model.predict(x_train)
            valid_predict = model.predict(x_valid)
            test_predict = model.predict(x_test)
            print(test_predict.shape)
            w1, w2 = y_test.shape
            for i in range(w2-1):
                train_score = math.sqrt(mean_squared_error(y_train[:, i], train_predict[:, i]))
                valid_score = math.sqrt(mean_squared_error(y_valid[:, i], valid_predict[:, i]))
                test_score = math.sqrt(mean_squared_error(y_test[:, i], test_predict[:, i]))
                print('Train Score: %.3f RMSE' % (train_score))
                print('Valid Score: %.3f RMSE' % (valid_score))
                print('Test Score: %.3f RMSE' % (test_score))
            sc = joblib.load('scaler.joblib')
            print(x_test.shape)
            print(y_test.shape)
            print(test_predict.shape)
            y_test_temp = y_test.copy()
            x_test_temp = x_test.copy()
            x_test_to_plot = x_test_temp.reshape(12470, 60) #[-1, :, :]
            y_test_to_plot = y_test_temp # [-1, :]
            test_predict_to_plot = test_predict #[-1, :]
            print(x_test_to_plot.shape)
            print(y_test_to_plot.shape)
            print(test_predict_to_plot.shape)
            x_test_inv = sc.inverse_transform(x_test_to_plot)
            y_test_inv = sc.inverse_transform(y_test_to_plot)
            test_predict_inv = sc.inverse_transform(test_predict_to_plot)
            print(x_test_inv.shape)
            print(y_test_inv.shape)
            print(test_predict_inv.shape)
            n_steps_to_plot = np.empty((20))
            for i in range(20):
                n_steps_to_plot[i] = n_steps + i
            plt.subplot(1, 2, 1)
            plt.plot(history.history['loss'], label='loss')
            plt.plot(history.history['val_loss'], label='val_loss')
            plt.legend()
            plt.xlabel('Epochs')
            plt.ylabel('loss function value')
            plt.grid()
            plt.subplot(1, 2, 2)
            plt.plot(x_test_inv[-1, :], label='input prices')
            plt.plot(n_steps_to_plot, y_test_inv[-1, :], 'b', label='real price')
            plt.plot(n_steps_to_plot, test_predict_inv[-1, :], 'r', label='predicted price')
            plt.grid()
            plt.xlabel('time')
            plt.ylabel('Price')
            plt.legend()
            save_fig(f'loss_and_prediction_n_neurons_{n_neurons}_epochs_{epochs}_lr_{lr}')
            plt.close()
            # Nine sample windows in a 3x3 grid; each panel uses the same
            # sample index for the real and the predicted curve.
            sample_indices = [35, 87, 457, 990, 3524, 7896, 12422, -1, 9544]
            for panel, idx in enumerate(sample_indices, start=1):
                plt.subplot(3, 3, panel)
                plt.plot(n_steps_to_plot, y_test_inv[idx, :], 'b', label='real price')
                plt.plot(n_steps_to_plot, test_predict_inv[idx, :], 'r', label='predicted price')
                plt.grid()
                plt.xlabel('time')
                plt.ylabel('Price')
                plt.legend()
            save_fig(f'predictions_neurons_{n_neurons}_epochs_{epochs}_lr_{lr}')
            plt.close()
        epochs = epochs + 50
    n_neurons = n_neurons + 16
    epochs = 50
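The three element-by-element loops that fill `x_train`/`y_train`, `x_valid`/`y_valid` and `x_test`/`y_test` in the script above build overlapping windows: 60 input steps starting at position `i`, followed by the next 20 steps as targets. A compact sketch of the same windowing with NumPy, assuming NumPy 1.20 or newer for `sliding_window_view`; `make_windows` is a hypothetical helper name, not part of the original script:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(series, n_in=60, n_out=20, n_samples=None):
    """Return (x, y) with x[i, j, 0] = series[i + j] and y[i, k] = series[i + n_in + k]."""
    series = np.asarray(series, dtype="float32").ravel()
    # Each row is one window of n_in inputs followed by n_out targets.
    windows = sliding_window_view(series, n_in + n_out)
    if n_samples is not None:
        windows = windows[:n_samples]
    x = windows[:, :n_in, np.newaxis].copy()   # shape (samples, n_in, 1)
    y = windows[:, n_in:].copy()               # shape (samples, n_out)
    return x, y


# Small synthetic series standing in for the scaled closing prices.
demo = np.arange(100, dtype="float32")
x_demo, y_demo = make_windows(demo, n_in=60, n_out=20, n_samples=5)
print(x_demo.shape, y_demo.shape)
```

In the script this would be called as `make_windows(Silver_CFD_prepared, n_samples=12470)`. It only shortens the one-off data preparation and is not expected to change the time per epoch.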
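Edit: I created a second VM and ran the same script on it, except without the loops, but the utilization did not change; it is still capped at about 30%. On both VMs I checked for zombie processes and found none. However, both machines have many sleeping processes. For example: VM2 (the instance running the script without loops) has 120 processes in total, 1 running and 119 sleeping;
VM3 (the instance running the script with loops) has 117 processes in total, 1 running and 116 sleeping.
However, on each instance the idle CPU is still around 70% (the id value), and about 7% shows up under sy (system/kernel time), as reported by the top command.
EDIT2: I switched VM2 to the c2d-highmem-8 machine type with an AMD Milan CPU. One epoch now takes 6 seconds. Utilization went up to 45% and stays there.
(I don't know why the utilization went up ¯_( ཀ ʖ̯ ཀ)_/¯)
I also tried adding this code: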
from keras import backend as K
import tensorflow as tf
config = tf.ConfigProto(intra_op_parallelism_threads=8, inter_op_parallelism_threads=2, allow_soft_placement=True, device_count = {'CPU': 8 })
session = tf.Session(config=config)
K.set_session(session)
os.environ["OMP_NUM_THREADS"] = "8"
os.environ["KMP_BLOCKTIME"] = "30"
os.environ["KMP_SETTINGS"] = "1"
os.environ["KMP_AFFINITY"]= "granularity=fine,verbose,compact,1,0"
But the time per epoch did not change, and nothing changed either when I replaced the number "8" (the number of cores) with "4".
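One detail about the snippet above: the `OMP_*` and `KMP_*` variables are normally read by the OpenMP runtime when the native library initializes, so assigning them after `import tensorflow` may come too late to have any effect. Below is a minimal sketch of the ordering, reusing the values from the snippet. Whether these variables matter at all depends on how the installed TensorFlow build was compiled (the `KMP_*` settings belong to Intel's OpenMP runtime), which is an assumption here.

```python
import os

# Logical CPUs visible to this process.
n_threads = os.cpu_count() or 1

# Set the threading variables first, before any numeric library is imported.
os.environ["OMP_NUM_THREADS"] = str(n_threads)
os.environ["KMP_BLOCKTIME"] = "30"
os.environ["KMP_SETTINGS"] = "1"
os.environ["KMP_AFFINITY"] = "granularity=fine,verbose,compact,1,0"

# Only now import the framework, so its native runtime sees the values:
# import tensorflow as tf
```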
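I went through every site on the first page of Google results for several keywords. I don't know what else I can do.
Maybe someone knows how to deal with this...
Honestly, since one epoch takes about 4 seconds on average on my PC, I expected one epoch on the VM instance to take 1 second or less.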
Author: 黑洞官方问答小能手
Link: https://www.pythonheidong.com/blog/article/1758876/8a77ea64153901dd2ff0/
Source: python黑洞网