import numpy as np
import pandas as pd

df = pd.read_csv('concrete_data.csv', delimiter=',', sep=r', ')
X_raw = df.drop(['concrete_compressive_strength'], axis=1)
y_raw = df['concrete_compressive_strength']

# Isolate examples for the labeled dataset
n_labeled_examples = X_raw.shape[0]
training_indices = np.random.randint(low=0, high=len(X_raw)+1, size=3)

# Define the training data
X_training = X_raw.iloc[training_indices]
y_training = y_raw.iloc[training_indices]
The shapes of these variables are as follows:
X_training.shape
(3, 8)
y_training.shape
(3,)
X_raw.shape
(1030, 8)
y_raw.shape
(1030,)
Now I want to isolate the non-training examples:
X_pool = np.delete(X_raw, training_indices, axis=0)
y_pool = np.delete(y_raw, training_indices, axis=0)
This produces the following error:
ValueError: Shape of passed values is (1027, 8), indices imply (1030, 8)
I tried reshaping training_indices, but I still get the same error:
r = np.reshape(training_indices, (3,1), order='C')
What is causing the problem, and how can I change the shape of training_indices to fix it?
Answer:
You can use the following lines:
X_pool = X_raw.drop(training_indices.tolist())
y_pool = y_raw.drop(training_indices.tolist())
instead of these lines:
X_pool = np.delete(X_raw, training_indices, axis=0)
y_pool = np.delete(y_raw, training_indices, axis=0)
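
Reshaping training_indices does not help, because its shape is not what fails. np.delete is a NumPy array function, and the message suggests that its 1027-row result is being matched against the original 1030-row index. The sketch below runs the drop-based fix on a small synthetic DataFrame; the data, the column names, and the names ending in _by_position are made up for illustration and are not part of the original question. It also assumes two changes that the question did not ask about: np.random.randint treats high as exclusive, so high=len(X_raw)+1 can return an out-of-range position and can repeat values, which is why the sketch samples with rng.choice(..., replace=False) instead.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the concrete dataset (values are made up).
rng = np.random.default_rng(0)
X_raw = pd.DataFrame(rng.random((10, 3)), columns=["a", "b", "c"])
y_raw = pd.Series(rng.random(10), name="target")

# Draw 3 distinct row positions. The upper bound len(X_raw) is exclusive,
# so every position is valid, and replace=False rules out duplicates.
training_indices = rng.choice(len(X_raw), size=3, replace=False)

X_training = X_raw.iloc[training_indices]
y_training = y_raw.iloc[training_indices]

# Option 1: drop by label. Labels coincide with positions only because
# the index is the default 0..n-1 RangeIndex.
X_pool = X_raw.drop(training_indices.tolist())
y_pool = y_raw.drop(training_indices.tolist())

# Option 2: boolean mask over positions, independent of the index labels.
mask = np.ones(len(X_raw), dtype=bool)
mask[training_indices] = False
X_pool_by_position = X_raw.iloc[mask]
y_pool_by_position = y_raw.iloc[mask]

print(X_pool.shape, y_pool.shape)  # (7, 3) (7,)
```

Dropping by label is the shorter of the two, but it relies on the index labels matching row positions. If the DataFrame has been filtered, shuffled, or given a custom index, the positional mask is the safer choice.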