I am trying to classify images with an LSTM in TensorFlow, using one-hot encoded labels and a softmax classifier on the final LSTM output. My dataset is in CSV format, and I have dug through a lot of NumPy and TensorFlow material on how to make the necessary modifications. I am still getting this error:
AttributeError: 'numpy.ndarray' object has no attribute 'next_batch'
As you can see, I cannot call next_batch(batch_size) on my dataset, and the tf.reshape that follows needs to be replaced with its NumPy equivalent.
My question is: how should I correct these two issues?
'''TensorFlow LSTM classification of 16x30 images.'''
from __future__ import print_function

import tensorflow as tf
from tensorflow.python.ops import rnn, rnn_cell
import numpy as np
from numpy import genfromtxt
from sklearn.cross_validation import train_test_split
import pandas as pd

'''
A TensorFlow LSTM that will sequentially input several rows of each single image,
i.e. the TensorFlow graph will take a flat (1,480) feature image, as was done in
the multi-layer perceptron MNIST TensorFlow tutorial, but then reshape it in a
sequential manner: 16 features per step over 30 time steps.
'''

blaine = genfromtxt('./Desktop/Blaine_CSV_lstm.csv', delimiter=',')  # CSV to array
target = [row[0] for row in blaine]   # first column of the CSV is the target
data = blaine[:, 1:480]               # flat feature vectors

X_train, X_test, y_train, y_test = train_test_split(data, target,
                                                    test_size=0.05,
                                                    random_state=42)

f = open('cs-training.csv', 'w')      # first split, for training
for i, j in enumerate(X_train):
    k = np.append(np.array(y_train[i]), j)
    f.write(",".join([str(s) for s in k]) + '\n')
f.close()

f = open('cs-testing.csv', 'w')       # second split, for testing
for i, j in enumerate(X_test):
    k = np.append(np.array(y_test[i]), j)
    f.write(",".join([str(s) for s in k]) + '\n')
f.close()

ss = pd.Series(y_train)   # indexed Series needed for the pandas get_dummies one-hot vectors below
gg = pd.Series(y_test)

new_data = genfromtxt('cs-training.csv', delimiter=',')      # training data
new_test_data = genfromtxt('cs-testing.csv', delimiter=',')  # test data

x_train = np.array([i[1::] for i in new_data])
y_train_onehot = pd.get_dummies(ss)
x_test = np.array([i[1::] for i in new_test_data])
y_test_onehot = pd.get_dummies(gg)

# General parameters
learning_rate = 0.001
training_iters = 100000
batch_size = 128
display_step = 10

# TensorFlow LSTM network parameters
n_input = 16     # features fed per time step (one 16-pixel row of the image)
n_steps = 30     # time steps (30 rows per image)
n_hidden = 128   # number of hidden layer features
n_classes = 20   # total number of classes

# tf graph input
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])

# Define weights
weights = {
    'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))
}
biases = {
    'out': tf.Variable(tf.random_normal([n_classes]))
}

def RNN(x, weights, biases):

    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, n_steps, n_input)
    # Required shape: list of 'n_steps' tensors of shape (batch_size, n_input)

    # Permute batch_size and n_steps
    x = tf.transpose(x, [1, 0, 2])
    # Reshape to (n_steps*batch_size, n_input)
    x = tf.reshape(x, [-1, n_input])
    # Split to get a list of 'n_steps' tensors of shape (batch_size, n_input)
    x = tf.split(0, n_steps, x)

    # Define an LSTM cell with TensorFlow
    lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)

    # Get the LSTM cell output
    outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)

    # Linear activation, using the last output of the RNN inner loop
    return tf.matmul(outputs[-1], weights['out']) + biases['out']

pred = RNN(x, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initialize the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until max iterations are reached
    while step * batch_size < training_iters:
        x_train, y_train = new_data.next_batch(batch_size)
        # Reshape data to get 30 sequences of 16 elements
        x_train = x_train.reshape((batch_size, n_steps, n_input))
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: x_train, y: y_train})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: x_train, y: y_train})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: x_train, y: y_train})
            print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + ", Training Accuracy= " + \
                  "{:.5f}".format(acc))
        step += 1
    print("Optimization Finished!")
Answer:
You can create a custom function called nextbatch that, given a NumPy array and a pair of indices, returns the corresponding slice of that array.
def nextbatch(x, i, j):
    return x[i:j, ...]
You could also pass in the current training step and perhaps apply a modulo operation to wrap around the dataset, but this is the basic approach that will get it working.
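For instance, here is a minimal sketch of that idea. The batch_for_step helper is hypothetical (not part of the question's code), and features is just synthetic stand-in data shaped like the question's flat feature rows:

import numpy as np

def nextbatch(x, i, j):
    # Return rows i..j-1 of a NumPy array (the ellipsis keeps any trailing dims intact)
    return x[i:j, ...]

def batch_for_step(x, step, batch_size):
    # Hypothetical helper: turn the current training step into slice bounds,
    # wrapping around with a modulo so the loop can outlast one pass over the data.
    n_batches = x.shape[0] // batch_size   # ignore the final partial batch
    i = (step % n_batches) * batch_size
    return nextbatch(x, i, i + batch_size)

# Quick check with synthetic data
features = np.random.rand(1000, 480)
print(batch_for_step(features, step=9, batch_size=128).shape)  # (128, 480); 9 % 7 wraps to batch 2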
As for the reshape operation, use:
x_train = np.reshape(x_train, (batch_size, n_steps, n_input))
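Putting the two fixes together, the training loop from the question might look like the sketch below. It assumes the nextbatch helper from above plus x_train, y_train_onehot, and the graph tensors (x, y, optimizer) from the question's code:

# A sketch, not a definitive fix: feed explicit NumPy slices instead of next_batch
y_train_np = np.asarray(y_train_onehot, dtype=np.float32)  # DataFrame -> array for feed_dict
n_batches = x_train.shape[0] // batch_size                 # drop the final partial batch

step = 1
while step * batch_size < training_iters:
    # Replacement for new_data.next_batch(batch_size): slice by step, wrapping with modulo
    i = (step % n_batches) * batch_size
    batch_x = nextbatch(x_train, i, i + batch_size)
    batch_y = nextbatch(y_train_np, i, i + batch_size)
    # NumPy equivalent of tf.reshape: 30 time steps of 16 features each
    batch_x = np.reshape(batch_x, (batch_size, n_steps, n_input))
    sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
    step += 1

One more thing to check: blaine[:, 1:480] keeps only 479 columns, so the slice would need to be blaine[:, 1:481] for each row to actually hold the 16*30 = 480 features that this reshape expects.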