K-fold cross-validation: KeyError: '[] not in index'

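I ran into a problem while applying K-fold cross-validation and would appreciate some help. Everything works when I use train_test_split, but K-fold cross-validation fails at the indexing step.

How do I apply K-fold cross-validation to my dataset?

My code is as follows: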

    from sklearn.model_selection import KFold

    df = pd.read_csv('CD.TXT', delimiter=',')
    df.head()

    X = df[['A', 'B', 'C', 'D']].values
    Y = df['Label'].values
    X = pd.DataFrame(X)
    Y = pd.DataFrame(Y)

    cv = KFold(n_splits=10, random_state=42, shuffle=False)

    for train_index, test_index in cv.split(X):
        print("Train Index: ", train_index, "\n")
        print("Test Index: ", test_index)

    X_train, X_test, Y_train, Y_test = X[train_index], X[test_index], Y[train_index], Y[test_index]
    print(X_train)
    print(Y_train)

My dataset looks like this:

    A,B,C,D,Label
    10,20,30,40,1
    20,20,15,60,0
    10,20,30,40,1
    10,20,30,40,1
    10,20,39,40,1
    10,20,30,40,1
    10,20,30,40,1
    10,20,32,40,1
    10,20,30,40,1
    10,20,30,40,1
    10,20,3,40,1
    20,20,15,60,0
    20,20,15,60,0
    20,20,12,60,0
    20,20,15,60,0
    20,20,15,60,0
    20,20,12,60,0
    20,20,15,60,0

This is the error I get:

    Test Index:  [18]
    Traceback (most recent call last):
      File "<ipython-input-11-10016b897261>", line 1, in <module>
        runfile('D:/experiments/untitled0.py', wdir='D:/experiments')
      File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
        execfile(filename, namespace)
      File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
        exec(compile(f.read(), filename, 'exec'), namespace)
      File "D:/experiments/untitled0.py", line 61, in <module>
        X_train, X_test, Y_train, Y_test = X[train_index], X[test_index], Y[train_index], Y[test_index]
      File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2934, in __getitem__
        raise_missing=True)
      File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1354, in _convert_to_indexer
        return self._get_listlike_indexer(obj, axis, **kwargs)[1]
      File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1161, in _get_listlike_indexer
        raise_missing=raise_missing)
      File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1252, in _validate_read_indexer
        raise KeyError("{} not in index".format(not_found))
    KeyError: '[4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17] not in index'

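Answer:

The error occurs because you are indexing a DataFrame as if it were a NumPy array. On a DataFrame, X[train_index] selects columns by label rather than rows by position. X only has the columns 0 to 3, so the labels 4 through 17 are reported as not in the index.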
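The difference can be reproduced in isolation. The following is a minimal sketch on a hypothetical 6-row frame (the name `rows` and the data are made up for illustration, not taken from the question), showing that bracket indexing on a DataFrame looks up column labels while `.iloc` and the underlying array select rows by position:

```python
import numpy as np
import pandas as pd

# Hypothetical frame built from a bare array, so the column labels
# default to 0, 1, 2, 3 (the same situation as X = pd.DataFrame(values)).
X = pd.DataFrame(np.arange(24).reshape(6, 4))

# Positions of the kind KFold.split() returns.
rows = np.array([4, 5])

# X[rows] looks for *columns* labelled 4 and 5, which do not exist.
try:
    X[rows]
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# .iloc selects *rows* by integer position.
by_position = X.iloc[rows]
print(by_position)

# The underlying NumPy array also indexes rows by position.
from_array = X.values[rows]
```

The exact wording of the KeyError message depends on the pandas version, so it may not match the traceback above character for character.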

Try commenting out X=pd.DataFrame(X) and Y=pd.DataFrame(Y), so that X and Y stay NumPy arrays:

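    import pandas as pd
    from sklearn.model_selection import KFold

    df = pd.read_csv('CD.TXT', delimiter=',')
    df.head()

    # Keep X and Y as NumPy arrays so that X[train_index] selects rows by position
    X = df[['A', 'B', 'C', 'D']].values
    Y = df['Label'].values
    # X = pd.DataFrame(X)
    # Y = pd.DataFrame(Y)

    # random_state is omitted: it has no effect when shuffle=False,
    # and recent scikit-learn versions raise an error if both are given
    cv = KFold(n_splits=10, shuffle=False)

    for train_index, test_index in cv.split(X):
        print("Train Index: ", train_index, "\n")
        print("Test Index: ", test_index)
        # Split inside the loop so every fold is used, not only the last one
        X_train, X_test, Y_train, Y_test = X[train_index], X[test_index], Y[train_index], Y[test_index]
        print(X_train)
        print(Y_train)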

Alternatively, keep the DataFrames and select rows by position with .iloc:

    import pandas as pd
    from sklearn.model_selection import KFold

    df = pd.read_csv('CD.TXT', delimiter=',')
    df.head()

    X = df[['A', 'B', 'C', 'D']].values
    Y = df['Label'].values
    X = pd.DataFrame(X)
    Y = pd.DataFrame(Y)

    # random_state is omitted: it has no effect when shuffle=False,
    # and recent scikit-learn versions raise an error if both are given
    cv = KFold(n_splits=10, shuffle=False)

    for train_index, test_index in cv.split(X):
        print("Train Index: ", train_index, "\n")
        print("Test Index: ", test_index)
        # .iloc selects rows by integer position, which is what KFold returns
        X_train, X_test = X.iloc[train_index, :], X.iloc[test_index, :]
        Y_train, Y_test = Y.iloc[train_index], Y.iloc[test_index]
        print(X_train)
        print(Y_train)
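For readers without CD.TXT at hand, here is a self-contained sketch of the same idea. It inlines the first six rows of the question's dataset and uses `n_splits=3` (an assumption made to fit the smaller sample, not the question's setting of 10). The variable `seen_test_rows` is a hypothetical helper used only to check that each row lands in exactly one test fold:

```python
from io import StringIO

import pandas as pd
from sklearn.model_selection import KFold

# First six rows of the question's data, inlined so no file is needed.
csv_text = """A,B,C,D,Label
10,20,30,40,1
20,20,15,60,0
10,20,30,40,1
10,20,30,40,1
10,20,39,40,1
10,20,30,40,1
"""

df = pd.read_csv(StringIO(csv_text))
X = df[["A", "B", "C", "D"]]  # stays a DataFrame
y = df["Label"]               # stays a Series

# Assumption: 3 folds for this 6-row sample; shuffle defaults to False.
cv = KFold(n_splits=3)

seen_test_rows = []
for fold, (train_index, test_index) in enumerate(cv.split(X)):
    # Positional selection, done once per fold inside the loop.
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    print(f"fold {fold}: train rows {train_index}, test rows {test_index}")
    seen_test_rows.extend(test_index.tolist())
```

Keeping the split inside the loop is what makes each fold available for fitting and scoring; code placed after the loop only sees the indices of the final fold.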
