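I am working on a link prediction problem. My dataset is a numpy array, and I need to parse it and store the result in another numpy array. When I try to do this, line 9 raises IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices. I even tried casting the indices to int, but that does not seem to help. What am I missing here?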
1.  train_edges, test_edges, = train_test_split(edgeL,test_size=0.3,random_state=16)
2.  out_dim = int(W_out.shape[1])
3.  in_dim = int(W_in.shape[1])
4.  train_x = np.zeros((len(train_edges), (out_dim + in_dim) * 2))
5.  train_y = np.zeros((len(train_edges), 1))
6.  for i, edge in enumerate(train_edges):
7.      u = edge[0]
8.      v = edge[1]
9.      train_x[int(i), : int(out_dim)] = W_out[u]
10.     train_x[int(i), int(out_dim): int(out_dim + in_dim)] = W_in[u]
11.     train_x[i, out_dim + in_dim: out_dim * 2 + in_dim] = W_out[v]
12.     train_x[i, out_dim * 2 + in_dim:] = W_in[v]
13.     if edge[2] > 0:
14.         train_y[i] = 1
15.     else:
16.         train_y[i] = -1
Edit:
For reference, W_out is a 64-dimensional tuple that looks like this:
print(W_out[0])
type(W_out.shape[1])

Output:
[[0.10160154 0. 0.70414263 0.6772633 0.07685234 0.75205046 0.421092 0.1776721
  0.8622188 0.15669271 0. 0.40653425 0.5768579 0.75861764 0.6745151 0.37883565
  0.18074909 0.73928916 0.6289512 0. 0.33160248 0.7441727 0. 0.8810399
  0.1110919 0.53732747 0. 0.33330196 0.36220717 0.298112 0.10643011 0.8997948
  0.53510064 0.6845873 0.03440218 0.23005858 0.8097505 0.7108275 0.38826624 0.28532124
  0.37821335 0.3566149 0.42527163 0.71940386 0.8075657 0.5775364 0.01444144 0.21734199
  0.47439903 0.21176265 0.32279345 0.00187511 0.43511534 0.4302601 0.39407462 0.20941389
  0.199842 0.8710182 0.2160332 0.30246672 0.27159846 0.19009161 0.32349357 0.08938174]]
int
And edge is a tuple taken from the training dataset, containing the source, the target, and the sign. It looks like this:
train_edges, test_edges, = train_test_split(edgeL,test_size=0.3,random_state=16)
for i, edge in enumerate(train_edges):
    print(edge)
    print(i)
    type(i)
    type(edge)

Output:
Streaming output truncated to the last 5000 lines.
2936
['16936', '17031', '1']
2937
['15307', '14904', '1']
2938
['22852', '13045', '1']
2939
['14291', '96703', '1']
2940
Any help or suggestions would be greatly appreciated.
Answer:
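The error comes from the value you use to index W_out and W_in, not from i.
Your printed output shows each edge as a list of strings, for example ['16936', '17031', '1'], so u = edge[0] and v = edge[1] are str values. NumPy does not accept a string as an index, and that is what the IndexError on line 9 reports for W_out[u]. Check edge with type() and len(), and check type(edge[0]), to confirm this.
Casting int(i) explicitly is not needed, because enumerate already yields an integer. Cast the node IDs instead: u = int(edge[0]) and v = int(edge[1]). The comparison edge[2] > 0 on line 13 needs the same cast, since comparing a str with an int raises a TypeError in Python 3.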
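To make the point about the edge values concrete, here is a minimal sketch of the loop with the node IDs and the sign cast to int. The small W_out, W_in, and train_edges arrays and the build_features helper are made-up stand-ins for illustration, not your real data or code. The sketch also assumes that W_out and W_in are plain 2-D arrays with one row per node ID; your print of W_out[0] shows double brackets, so your real arrays may have an extra dimension that needs squeezing first.

```python
import numpy as np

# Hypothetical stand-in data: 5 nodes with 4-dimensional embeddings.
rng = np.random.default_rng(0)
W_out = rng.random((5, 4))
W_in = rng.random((5, 4))

# Edges arrive as lists of strings, as in the printed output of the question.
train_edges = [['0', '3', '1'], ['2', '4', '-1'], ['1', '0', '1']]

# Reproduce the original failure: a str is not a valid NumPy index.
try:
    W_out[train_edges[0][0]]
except IndexError as err:
    print("IndexError:", err)


def build_features(edges, W_out, W_in):
    """Concatenate [W_out[u], W_in[u], W_out[v], W_in[v]] for each edge."""
    out_dim = W_out.shape[1]
    in_dim = W_in.shape[1]
    x = np.zeros((len(edges), (out_dim + in_dim) * 2))
    y = np.zeros((len(edges), 1))
    for i, edge in enumerate(edges):
        # Cast the string fields to int before using them as indices.
        u, v, sign = int(edge[0]), int(edge[1]), int(edge[2])
        x[i, :out_dim] = W_out[u]
        x[i, out_dim:out_dim + in_dim] = W_in[u]
        x[i, out_dim + in_dim:out_dim * 2 + in_dim] = W_out[v]
        x[i, out_dim * 2 + in_dim:] = W_in[v]
        y[i] = 1 if sign > 0 else -1
    return x, y


train_x, train_y = build_features(train_edges, W_out, W_in)
print(train_x.shape)
print(train_y.ravel())
```

If the edge list is large, converting it once up front with np.asarray(train_edges, dtype=int) should also work and avoids the per-element casts, provided every field parses as an integer.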