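I tried building a neural network model to predict match outcomes (1 or 0), using the tennis match statistics dataset provided here as input.
Following the official MXNet documentation, I developed the program below. I tried various configuration parameters, such as batch_size, unit_size, act_type, and learning_rate, but whatever I changed, the accuracy always came out at about 0.5, and the model always predicts either all 1s or all 0s.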
import numpy as np
from sklearn.preprocessing import normalize
import mxnet as mx
import logging
import warnings

warnings.filterwarnings("ignore", category=DeprecationWarning)
logging.basicConfig(level=logging.DEBUG, format='%(asctime)-15s %(message)s')

batch_size = 100

train_data = np.loadtxt("dm.csv",delimiter=",")
train_data = normalize(train_data, norm='l1', axis=0)
train_lbl = np.loadtxt("dm_lbl.csv",delimiter=",")
eval_data = np.loadtxt("dw.csv",delimiter=",")
eval_data = normalize(eval_data, norm='l1', axis=0)
eval_lbl = np.loadtxt("dw_lbl.csv",delimiter=",")

train_iter = mx.io.NDArrayIter(train_data, train_lbl, batch_size=batch_size, shuffle=True)
val_iter = mx.io.NDArrayIter(eval_data, eval_lbl, batch_size=batch_size)

data = mx.sym.var('data')
# The first fully-connected layer and the corresponding activation function
fc1 = mx.sym.FullyConnected(data=data, num_hidden=220)
#bn1 = mx.sym.BatchNorm(data = fc1, name="bn1")
act1 = mx.sym.Activation(data=fc1, act_type="sigmoid")

# The second fully-connected layer and the corresponding activation function
fc2 = mx.sym.FullyConnected(data=act1, num_hidden = 220)
#bn2 = mx.sym.BatchNorm(data = fc2, name="bn2")
act2 = mx.sym.Activation(data=fc2, act_type="sigmoid")

# The third fully-connected layer and the corresponding activation function
fc3 = mx.sym.FullyConnected(data=act2, num_hidden = 110)
#bn3 = mx.sym.BatchNorm(data = fc3, name="bn3")
act3 = mx.sym.Activation(data=fc3, act_type="sigmoid")

# output class(es)
fc4 = mx.sym.FullyConnected(data=act3, num_hidden=2)
# Softmax with cross entropy loss
mlp = mx.sym.SoftmaxOutput(data=fc4, name='softmax')

mod = mx.mod.Module(symbol=mlp,
                    context=mx.cpu(),
                    data_names=['data'],
                    label_names=['softmax_label'])
mod.fit(train_iter,
        eval_data=val_iter,
        optimizer='sgd',
        optimizer_params={'learning_rate':0.03},
        eval_metric='rmse',
        num_epoch=10,
        batch_end_callback = mx.callback.Speedometer(batch_size, 100)) # output progress for each 200 data batches)

prob = mod.predict(val_iter).asnumpy()
#print(prob)
for unit in prob:
    print 'Classified as %d with probability %f' % (unit.argmax(), max(unit))
Here is the log output:
2017-06-19 17:18:34,961 Epoch[0] Train-rmse=0.500574
2017-06-19 17:18:34,961 Epoch[0] Time cost=0.007
2017-06-19 17:18:34,968 Epoch[0] Validation-rmse=0.500284
2017-06-19 17:18:34,975 Epoch[1] Train-rmse=0.500703
2017-06-19 17:18:34,975 Epoch[1] Time cost=0.007
2017-06-19 17:18:34,982 Epoch[1] Validation-rmse=0.500301
2017-06-19 17:18:34,990 Epoch[2] Train-rmse=0.500713
2017-06-19 17:18:34,990 Epoch[2] Time cost=0.008
2017-06-19 17:18:34,998 Epoch[2] Validation-rmse=0.500302
2017-06-19 17:18:35,005 Epoch[3] Train-rmse=0.500713
2017-06-19 17:18:35,005 Epoch[3] Time cost=0.007
2017-06-19 17:18:35,012 Epoch[3] Validation-rmse=0.500302
2017-06-19 17:18:35,019 Epoch[4] Train-rmse=0.500713
2017-06-19 17:18:35,019 Epoch[4] Time cost=0.007
2017-06-19 17:18:35,027 Epoch[4] Validation-rmse=0.500302
2017-06-19 17:18:35,035 Epoch[5] Train-rmse=0.500713
2017-06-19 17:18:35,035 Epoch[5] Time cost=0.008
2017-06-19 17:18:35,042 Epoch[5] Validation-rmse=0.500302
2017-06-19 17:18:35,049 Epoch[6] Train-rmse=0.500713
2017-06-19 17:18:35,049 Epoch[6] Time cost=0.007
2017-06-19 17:18:35,056 Epoch[6] Validation-rmse=0.500302
2017-06-19 17:18:35,064 Epoch[7] Train-rmse=0.500712
2017-06-19 17:18:35,064 Epoch[7] Time cost=0.008
2017-06-19 17:18:35,071 Epoch[7] Validation-rmse=0.500302
2017-06-19 17:18:35,079 Epoch[8] Train-rmse=0.500712
2017-06-19 17:18:35,079 Epoch[8] Time cost=0.007
2017-06-19 17:18:35,085 Epoch[8] Validation-rmse=0.500301
2017-06-19 17:18:35,093 Epoch[9] Train-rmse=0.500712
2017-06-19 17:18:35,093 Epoch[9] Time cost=0.007
2017-06-19 17:18:35,099 Epoch[9] Validation-rmse=0.500301
Classified as 0 with probability 0.530638
Classified as 0 with probability 0.530638
Classified as 0 with probability 0.530638
...
Classified as 0 with probability 0.530638
Can anyone tell me what I'm doing wrong?
python version == 2.7.10
mxnet == 0.10.0
numpy == 1.12.0
I removed some non-informative columns and the header row from the dataset, then converted it to CSV format.
train_data.shape == (491, 22)
train_lbl.shape == (491,)
eval_data.shape == (452, 22)
eval_lbl.shape == (452,)
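Answer:
The network definition looks correct. Could you print train_iter and val_iter to check whether the normalized data is still what you expect? Also, which columns did you remove from the original data?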
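One way to check what the normalization step does, without the real CSV files, is the minimal sketch below. It uses synthetic data of the same shape as train_data (an assumption: the real columns are taken to be non-negative statistics) and plain NumPy to reproduce what normalize(..., norm='l1', axis=0) computes, namely dividing each column by the sum of its absolute values. Per-column standardization is shown only for comparison; whether the small input scale is actually what prevents the network from learning is a guess that would need to be confirmed on the real data.

```python
import numpy as np

# Synthetic stand-in for the 491 x 22 training matrix.
# Assumption: the real features are non-negative match statistics.
rng = np.random.RandomState(0)
X = rng.uniform(0, 100, size=(491, 22))

# Same computation as sklearn's normalize(X, norm='l1', axis=0):
# every column is divided by the sum of its absolute values,
# so each column sums to 1 across all 491 rows.
col_l1 = np.abs(X).sum(axis=0)
X_l1 = X / col_l1

# Per-column standardization (zero mean, unit variance), for comparison only.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print("L1 by column: max=%.5f mean=%.5f" % (X_l1.max(), X_l1.mean()))
print("standardized: min=%.2f max=%.2f" % (X_std.min(), X_std.max()))
```

With 491 rows, every L1-normalized entry averages 1/491, so all inputs sit very close to zero, while the standardized values are spread over a range of a few units. Printing the same summaries for the real train_data and eval_data would show whether the data reaching the iterators still looks as expected.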