MXNet classification with Python always gives the same prediction

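I am trying to build a neural network model that predicts match outcomes (1 or 0), using the tennis match statistics dataset provided here as input.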

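I wrote the program below by following the official MXNet documentation. I have tried various settings for batch_size, unit_size, act_type, and learning_rate, but whatever I change, the accuracy stays at about 0.5 and the model predicts the same class (all 1s or all 0s) for every sample.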

    import numpy as np
    from sklearn.preprocessing import normalize
    import mxnet as mx
    import logging
    import warnings

    warnings.filterwarnings("ignore", category=DeprecationWarning)
    logging.basicConfig(level=logging.DEBUG, format='%(asctime)-15s %(message)s')

    batch_size = 100

    train_data = np.loadtxt("dm.csv", delimiter=",")
    train_data = normalize(train_data, norm='l1', axis=0)
    train_lbl = np.loadtxt("dm_lbl.csv", delimiter=",")
    eval_data = np.loadtxt("dw.csv", delimiter=",")
    eval_data = normalize(eval_data, norm='l1', axis=0)
    eval_lbl = np.loadtxt("dw_lbl.csv", delimiter=",")

    train_iter = mx.io.NDArrayIter(train_data, train_lbl, batch_size=batch_size, shuffle=True)
    val_iter = mx.io.NDArrayIter(eval_data, eval_lbl, batch_size=batch_size)

    data = mx.sym.var('data')

    # The first fully-connected layer and the corresponding activation function
    fc1 = mx.sym.FullyConnected(data=data, num_hidden=220)
    #bn1 = mx.sym.BatchNorm(data = fc1, name="bn1")
    act1 = mx.sym.Activation(data=fc1, act_type="sigmoid")

    # The second fully-connected layer and the corresponding activation function
    fc2 = mx.sym.FullyConnected(data=act1, num_hidden=220)
    #bn2 = mx.sym.BatchNorm(data = fc2, name="bn2")
    act2 = mx.sym.Activation(data=fc2, act_type="sigmoid")

    # The third fully-connected layer and the corresponding activation function
    fc3 = mx.sym.FullyConnected(data=act2, num_hidden=110)
    #bn3 = mx.sym.BatchNorm(data = fc3, name="bn3")
    act3 = mx.sym.Activation(data=fc3, act_type="sigmoid")

    # output class(es)
    fc4 = mx.sym.FullyConnected(data=act3, num_hidden=2)

    # Softmax with cross entropy loss
    mlp = mx.sym.SoftmaxOutput(data=fc4, name='softmax')

    mod = mx.mod.Module(symbol=mlp,
                        context=mx.cpu(),
                        data_names=['data'],
                        label_names=['softmax_label'])

    mod.fit(train_iter,
            eval_data=val_iter,
            optimizer='sgd',
            optimizer_params={'learning_rate': 0.03},
            eval_metric='rmse',
            num_epoch=10,
            batch_end_callback=mx.callback.Speedometer(batch_size, 100))  # output progress for each 200 data batches

    prob = mod.predict(val_iter).asnumpy()
    #print(prob)
    for unit in prob:
        print 'Classified as %d with probability %f' % (unit.argmax(), max(unit))

Here is the log output:

    2017-06-19 17:18:34,961 Epoch[0] Train-rmse=0.500574
    2017-06-19 17:18:34,961 Epoch[0] Time cost=0.007
    2017-06-19 17:18:34,968 Epoch[0] Validation-rmse=0.500284
    2017-06-19 17:18:34,975 Epoch[1] Train-rmse=0.500703
    2017-06-19 17:18:34,975 Epoch[1] Time cost=0.007
    2017-06-19 17:18:34,982 Epoch[1] Validation-rmse=0.500301
    2017-06-19 17:18:34,990 Epoch[2] Train-rmse=0.500713
    2017-06-19 17:18:34,990 Epoch[2] Time cost=0.008
    2017-06-19 17:18:34,998 Epoch[2] Validation-rmse=0.500302
    2017-06-19 17:18:35,005 Epoch[3] Train-rmse=0.500713
    2017-06-19 17:18:35,005 Epoch[3] Time cost=0.007
    2017-06-19 17:18:35,012 Epoch[3] Validation-rmse=0.500302
    2017-06-19 17:18:35,019 Epoch[4] Train-rmse=0.500713
    2017-06-19 17:18:35,019 Epoch[4] Time cost=0.007
    2017-06-19 17:18:35,027 Epoch[4] Validation-rmse=0.500302
    2017-06-19 17:18:35,035 Epoch[5] Train-rmse=0.500713
    2017-06-19 17:18:35,035 Epoch[5] Time cost=0.008
    2017-06-19 17:18:35,042 Epoch[5] Validation-rmse=0.500302
    2017-06-19 17:18:35,049 Epoch[6] Train-rmse=0.500713
    2017-06-19 17:18:35,049 Epoch[6] Time cost=0.007
    2017-06-19 17:18:35,056 Epoch[6] Validation-rmse=0.500302
    2017-06-19 17:18:35,064 Epoch[7] Train-rmse=0.500712
    2017-06-19 17:18:35,064 Epoch[7] Time cost=0.008
    2017-06-19 17:18:35,071 Epoch[7] Validation-rmse=0.500302
    2017-06-19 17:18:35,079 Epoch[8] Train-rmse=0.500712
    2017-06-19 17:18:35,079 Epoch[8] Time cost=0.007
    2017-06-19 17:18:35,085 Epoch[8] Validation-rmse=0.500301
    2017-06-19 17:18:35,093 Epoch[9] Train-rmse=0.500712
    2017-06-19 17:18:35,093 Epoch[9] Time cost=0.007
    2017-06-19 17:18:35,099 Epoch[9] Validation-rmse=0.500301
    Classified as 0 with probability 0.530638
    Classified as 0 with probability 0.530638
    Classified as 0 with probability 0.530638
    ...
    Classified as 0 with probability 0.530638
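One way to quantify the symptom, rather than reading the printed lines one by one, is to count how often each class is predicted and compute the accuracy directly. The sketch below is only an illustration: `summarize_predictions` is a hypothetical helper, the `prob` array is a synthetic stand-in for the result of `mod.predict(val_iter).asnumpy()` built from the probability shown in the output, and the alternating labels are made up, since the real `dw_lbl.csv` is not shown.

```python
import numpy as np


def summarize_predictions(prob, labels):
    """Return per-class prediction counts and the overall accuracy.

    prob   : array of shape (n_samples, n_classes) with class probabilities
    labels : array of shape (n_samples,) with the true class indices
    """
    predicted = prob.argmax(axis=1)
    counts = np.bincount(predicted, minlength=prob.shape[1])
    accuracy = float((predicted == labels).mean())
    return counts, accuracy


# Synthetic stand-in for mod.predict(val_iter).asnumpy(): every row carries
# the same probabilities as the printed output (0.530638 for class 0).
prob = np.tile([0.530638, 0.469362], (452, 1))

# Hypothetical labels, alternating 0 and 1; the real labels are not shown.
labels = np.arange(452) % 2

counts, accuracy = summarize_predictions(prob, labels)
print("predictions per class:", counts)
print("accuracy:", accuracy)
```

If one entry of `counts` is zero on the real data as well, the network is producing a constant output, and the accuracy simply reflects the share of that class among the labels.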

Can anyone tell me what I am doing wrong?

    python version == 2.7.10
    mxnet == 0.10.0
    numpy == 1.12.0

I removed some uninformative columns and the header row from the dataset, then converted it to CSV.

    train_data.shape == (491, 22)
    train_lbl.shape == (491,)
    eval_data.shape == (452, 22)
    eval_lbl.shape == (452,)

Answer:

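The network definition looks correct. Could you print train_iter and val_iter to check whether the normalized data still looks the way you expect? Also, which columns did you remove from the original data?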
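As a rough sketch of the check suggested here, the snippet below prints summary statistics of a feature matrix before and after a column-wise L1 normalization. Several things are assumptions: the data is random and only matches the shape of train_data (491 x 22), `l1_normalize_columns` is a hypothetical NumPy helper intended to mirror `normalize(x, norm='l1', axis=0)` for dense input, and `summarize` is likewise made up for illustration.

```python
import numpy as np


def l1_normalize_columns(x):
    """Divide each column by the sum of its absolute values.

    Intended to mirror normalize(x, norm='l1', axis=0) for dense input;
    all-zero columns are left unchanged.
    """
    norms = np.abs(x).sum(axis=0)
    norms[norms == 0] = 1.0
    return x / norms


def summarize(name, x):
    """Print the shape and basic statistics of a 2-D array."""
    print("%s: shape=%s min=%.6f max=%.6f mean=%.6f"
          % (name, x.shape, x.min(), x.max(), x.mean()))


# Synthetic stand-in for the contents of dm.csv (same shape, made-up values).
rng = np.random.RandomState(0)
raw = rng.uniform(0.0, 100.0, size=(491, 22))

scaled = l1_normalize_columns(raw)
summarize("raw", raw)
summarize("l1-normalized", scaled)

# After this normalization each column's absolute values sum to 1, so with
# 491 rows a typical entry is on the order of 1/491.
column_sums = np.abs(scaled).sum(axis=0)
print("column sums:", column_sums[:3])
```

With axis=0 every feature column is scaled so that its absolute values sum to 1, which makes individual entries very small when there are hundreds of rows. Inputs that small may look nearly identical to the sigmoid layers, so it seems worth confirming on the real data whether that is what the iterators are feeding the network.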
