'bad input shape' error when using scikit-learn SVM with optunity

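I am trying to tune my SVM model with the optunity package. I copied and pasted its latest example code directly, changing it only to load my own feature and data arrays: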

import optunity
import optunity.metrics
import sklearn.svm
import numpy as np

data_path = '/python/Feature'
files = ['A.npy', 'B.npy', 'C.npy']
array = []
labels = []
for i,name in enumerate(files):
    data = np.load('{}/{}'.format(data_path, name))
    for j in range(0,len(data)):
        labels.append(data[j])
        array.append(data)
print len(array)   #=> 1247
print len(labels)  #=> 1247

# score function: twice iterated 10-fold cross-validated accuracy
@optunity.cross_validated(x=data, y=labels, num_folds=10, num_iter=2)
def svm_auc(x_train, y_train, x_test, y_test, C, gamma):
    model = sklearn.svm.SVC(C=C, gamma=gamma).fit(x_train, y_train)
    decision_values = model.decision_function(x_test)
    return optunity.metrics.roc_auc(y_test, decision_values)

# perform tuning
optimal_pars, _, _ = optunity.maximize(svm_auc, num_evals=200, C=[0, 10], gamma=[0, 1])

# train model on the full training set with tuned hyperparameters
optimal_model = sklearn.svm.SVC(**optimal_pars).fit(data, labels)

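However, the interpreter is clearly unhappy with it. I went back to the SVM class documentation to double-check the input format, but I cannot make sense of optunity's code syntax. Can anyone help me find the problem? Many thanks. (I am using the 'rbf' kernel. I tried to add it but got a syntax error, and oddly the optunity example has no kernel selection at all.)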

Traceback (most recent call last):
  File "python/SVM_turning.py", line 26, in <module>
    optimal_pars, _, _ = optunity.maximize(svm_auc, num_evals=200, C=[0, 10], gamma=[0, 1])
  File "/lib/python2.7/site-packages/optunity/api.py", line 181, in maximize
    pmap=pmap)
  File "/lib/python2.7/site-packages/optunity/api.py", line 245, in optimize
    solution, report = solver.optimize(f, maximize, pmap=pmap)
  File "/lib/python2.7/site-packages/optunity/solvers/ParticleSwarm.py", line 257, in optimize
    fitnesses = pmap(evaluate, list(map(self.particle2dict, pop)))
  File "/lib/python2.7/site-packages/optunity/solvers/ParticleSwarm.py", line 246, in evaluate
    return f(**d)
  File "/lib/python2.7/site-packages/optunity/functions.py", line 286, in wrapped_f
    value = f(*args, **kwargs)
  File "/lib/python2.7/site-packages/optunity/functions.py", line 341, in wrapped_f
    return f(*args, **kwargs)
  File "/lib/python2.7/site-packages/optunity/constraints.py", line 150, in wrapped_f
    return f(*args, **kwargs)
  File "/lib/python2.7/site-packages/optunity/constraints.py", line 128, in wrapped_f
    return f(*args, **kwargs)
  File "/lib/python2.7/site-packages/optunity/constraints.py", line 265, in func
    return f(*args, **kwargs)
  File "/lib/python2.7/site-packages/optunity/cross_validation.py", line 386, in __call__
    scores.append(self.f(**kwargs))
  File "/python/SVM_turning.py", line 21, in svm_auc
    model = sklearn.svm.SVC(C=C, gamma=gamma).fit(x_train, y_train)
  File "/lib/python2.7/site-packages/sklearn/svm/base.py", line 138, in fit
    y = self._validate_targets(y)
  File "/lib/python2.7/site-packages/sklearn/svm/base.py", line 441, in _validate_targets
    y_ = column_or_1d(y, warn=True)
  File "/lib/python2.7/site-packages/sklearn/utils/validation.py", line 319, in column_or_1d
    raise ValueError("bad input shape {0}".format(shape))
ValueError: bad input shape (428, 600)

Answer:

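I think I found the problem. While reading the files you build two lists, array and labels, and array is filled sequentially with data. Afterwards, however, you do this: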

@optunity.cross_validated(x=data, y=labels, num_folds=10, num_iter=2)

optimal_model = sklearn.svm.SVC(**optimal_pars).fit(data, labels)

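So you are using data as your dataset (that is, whatever was loaded from the last file) instead of the array you prepared. I don't know the format of the data you read from the files, so I can't say for certain what is happening. However, the dimensions of data and labels will almost certainly not match. The traceback points the same way: the ValueError is raised while SVC.fit validates y, and the reported shape (428, 600) is two-dimensional, whereas fit expects a one-dimensional vector of labels. That suggests each entry of labels is itself a row of data (from labels.append(data[j])) rather than a single class value.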
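A quick way to catch this kind of mismatch before tuning is to build the feature matrix and the label vector explicitly and print their shapes. The sketch below is only an illustration under assumptions that are not in the question: that each .npy file holds the feature rows of a single class, so the label is simply the file's index, and that small random arrays can stand in for np.load. The helper name build_dataset is hypothetical.

```python
import numpy as np


def build_dataset(per_file_arrays):
    """Stack per-file feature arrays into X and build a 1-D label vector y.

    Assumption (not from the question): every file holds samples of a
    single class, and the class label is the file's position in the list.
    """
    features = []
    labels = []
    for class_index, data in enumerate(per_file_arrays):
        features.append(data)                      # all rows of this file
        labels.extend([class_index] * len(data))   # one scalar label per row
    X = np.vstack(features)   # shape: (total_rows, n_features)
    y = np.asarray(labels)    # shape: (total_rows,)
    return X, y


# Stand-ins for np.load('{}/{}'.format(data_path, name)); shapes are made up.
rng = np.random.RandomState(0)
fake_files = [rng.rand(5, 3), rng.rand(4, 3), rng.rand(6, 3)]

X, y = build_dataset(fake_files)
print(X.shape)  # (15, 3)
print(y.shape)  # (15,)
```

If X has one row per sample and y is one-dimensional with the same length, the same pair can be passed both to optunity.cross_validated(x=..., y=...) and to the final fit call.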

Here is a toy example that uses array and labels and runs without errors:

import optunity
import optunity.metrics
import sklearn.svm
import numpy as np

#print len(array)   #=> 1247
#print len(labels)  #=> 1247

# make dummy data
array = np.array([[i] for i in range(1247)])
labels = [True] * 100 + [False] * 1147

# score function: twice iterated 10-fold cross-validated accuracy
@optunity.cross_validated(x=array, y=labels, num_folds=10, num_iter=2)
def svm_auc(x_train, y_train, x_test, y_test, C, gamma):
    model = sklearn.svm.SVC(C=C, gamma=gamma).fit(x_train, y_train)
    decision_values = model.decision_function(x_test)
    return optunity.metrics.roc_auc(y_test, decision_values)

# perform tuning
optimal_pars, _, _ = optunity.maximize(svm_auc, num_evals=200, C=[0, 10], gamma=[0, 1])

# train model on the full training set with tuned hyperparameters
optimal_model = sklearn.svm.SVC(**optimal_pars).fit(array, labels)
print(optimal_pars)

Example output:

{'C': 8.0126953125, 'gamma': 0.35791015625}
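As for the kernel: as far as I know, sklearn.svm.SVC uses kernel='rbf' by default, which would explain why the optunity example never selects one. The small check below, on made-up toy data, prints the default and shows the kernel being passed explicitly as a keyword argument to SVC. The C and gamma values are arbitrary placeholders, not tuned results.

```python
import numpy as np
import sklearn.svm

# Made-up toy data: 20 one-dimensional samples, two classes.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = np.array([0] * 10 + [1] * 10)

# An SVC created without a kernel argument already uses the RBF kernel.
default_model = sklearn.svm.SVC()
print(default_model.kernel)  # rbf

# The kernel can also be named explicitly, next to C and gamma.
explicit_model = sklearn.svm.SVC(kernel='rbf', C=1.0, gamma=0.1).fit(X, y)
decision_values = explicit_model.decision_function(X)
print(len(decision_values))  # 20
```

Inside svm_auc the same keyword would go into the existing SVC(C=C, gamma=gamma) call, so nothing needs to change on the optunity side.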

Sorry for the late reply.
