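I am trying to build a confusion matrix without using the sklearn library, but I am having trouble getting the matrix to come out correctly. Here is my code: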
    import numpy as np

    def comp_confmat():
        currentDataClass = [1,3,3,2,5,5,3,2,1,4,3,2,1,1,2]
        predictedClass = [1,2,3,4,2,3,3,2,1,2,3,1,5,1,1]
        cm = []
        classes = int(max(currentDataClass) - min(currentDataClass)) + 1 #find number of classes
        for c1 in range(1,classes+1):#for every true class
            counts = []
            for c2 in range(1,classes+1):#for every predicted class
                count = 0
                for p in range(len(currentDataClass)):
                    if currentDataClass[p] == predictedClass[p]:
                        count += 1
                counts.append(count)
            cm.append(counts)
        print(np.reshape(cm,(classes,classes)))
However, this returns:

    [[7 7 7 7 7]
     [7 7 7 7 7]
     [7 7 7 7 7]
     [7 7 7 7 7]
     [7 7 7 7 7]]
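I don't understand why every iteration yields 7, since I reset count each time and the loops run over different values.

This is the result I should be getting (using sklearn's confusion_matrix function):

    [[3 0 0 0 1]
     [2 1 0 1 0]
     [0 1 3 0 0]
     [0 1 0 0 0]
     [0 1 1 0 0]]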
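Answer:

Your innermost loop needs a case distinction. Its condition never refers to c1 or c2, so every cell ends up with the same number: 7, the count of positions where the two lists agree. Counting agreement is only what you want when c1 == c2. In general, the cell for (c1, c2) should count the positions where the true class is c1 and the predicted class is c2.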
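A minimal sketch of that fix, keeping the loop structure from the question. Taking the two lists as parameters and returning the matrix instead of printing it are my own changes for illustration, and the code still assumes the labels are the integers 1 through the number of classes, as in the question.

```python
def comp_confmat(actual, predicted):
    """Return a confusion matrix: rows are true classes, columns are predicted classes."""
    # Assumption carried over from the question: labels are the integers 1..classes
    classes = max(actual) - min(actual) + 1
    cm = []
    for c1 in range(1, classes + 1):      # for every true class
        counts = []
        for c2 in range(1, classes + 1):  # for every predicted class
            count = 0
            for p in range(len(actual)):
                # count only the samples that belong to this (true, predicted) cell
                if actual[p] == c1 and predicted[p] == c2:
                    count += 1
            counts.append(count)
        cm.append(counts)
    return cm


currentDataClass = [1,3,3,2,5,5,3,2,1,4,3,2,1,1,2]
predictedClass = [1,2,3,4,2,3,3,2,1,2,3,1,5,1,1]
cm = comp_confmat(currentDataClass, predictedClass)
for row in cm:
    print(row)
```

With the data from the question, the rows printed match the sklearn result quoted there.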
Here is another approach, using nested list comprehensions:
    currentDataClass = [1,3,3,2,5,5,3,2,1,4,3,2,1,1,2]
    predictedClass = [1,2,3,4,2,3,3,2,1,2,3,1,5,1,1]

    classes = int(max(currentDataClass) - min(currentDataClass)) + 1 #find number of classes

    counts = [[sum([(currentDataClass[i] == true_class) and (predictedClass[i] == pred_class)
                    for i in range(len(currentDataClass))])
               for pred_class in range(1, classes + 1)]
              for true_class in range(1, classes + 1)]
    counts

    [[3, 0, 0, 0, 1],
     [2, 1, 0, 1, 0],
     [0, 1, 3, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 1, 1, 0, 0]]
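Both versions above rescan the whole dataset for every cell. If that becomes slow, one possible alternative is a single pass that increments the matching cell for each sample. This is only a sketch: the function name confusion_matrix_single_pass and the label-to-index mapping are my own illustration, not part of the original answer. It also drops the assumption that labels run from 1 upward, since it takes the labels from the data itself.

```python
def confusion_matrix_single_pass(actual, predicted):
    """Build the confusion matrix in one pass over the (true, predicted) pairs."""
    # Hypothetical helper: collect every label seen in either list, in sorted order
    labels = sorted(set(actual) | set(predicted))
    index = {label: i for i, label in enumerate(labels)}
    # Start from an all-zero square matrix (a fresh row list per label)
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1  # row = true class, column = predicted class
    return labels, matrix


currentDataClass = [1,3,3,2,5,5,3,2,1,4,3,2,1,1,2]
predictedClass = [1,2,3,4,2,3,3,2,1,2,3,1,5,1,1]
labels, matrix = confusion_matrix_single_pass(currentDataClass, predictedClass)
print(labels)
for row in matrix:
    print(row)
```

Returning the labels alongside the matrix makes it clear which class each row and column refers to when the labels are not consecutive integers.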