Plotting sklearn clusters with Python

I obtained the following sklearn clustering result using the affinity propagation algorithm.

    import sklearn.cluster
    import numpy as np

    sims = np.array([[0, 17, 10, 32, 32],
                     [18, 0, 6, 20, 15],
                     [10, 8, 0, 20, 21],
                     [30, 16, 20, 0, 17],
                     [30, 15, 21, 17, 0]])

    affprop = sklearn.cluster.AffinityPropagation(affinity="precomputed", damping=0.5)
    affprop.fit(sims)

    cluster_centers_indices = affprop.cluster_centers_indices_
    labels = affprop.labels_

    # number of clusters
    n_clusters_ = len(cluster_centers_indices)

Now I want to plot these clustering results. I am new to sklearn. Please suggest a suitable way to plot the clusters in Python. Can this be done with a pandas DataFrame?

Edit:

I used the code from the sklearn example directly, as pointed out by @MohammedKashif:

    import sklearn.cluster
    import numpy as np

    sims = np.array([[0, 17, 10, 32, 32],
                     [18, 0, 6, 20, 15],
                     [10, 8, 0, 20, 21],
                     [30, 16, 20, 0, 17],
                     [30, 15, 21, 17, 0]])

    affprop = sklearn.cluster.AffinityPropagation(affinity="precomputed", damping=0.5)
    affprop.fit(sims)

    cluster_centers_indices = affprop.cluster_centers_indices_
    print(cluster_centers_indices)
    labels = affprop.labels_
    n_clusters_ = len(cluster_centers_indices)
    print(n_clusters_)

    import matplotlib.pyplot as plt
    from itertools import cycle

    plt.close('all')
    plt.figure(1)
    plt.clf()

    colors = cycle('bgrcmykbgrcmykbgrcmykbgrcmyk')
    for k, col in zip(range(n_clusters_), colors):
        class_members = labels == k
        cluster_center = sims[cluster_centers_indices[k]]
        plt.plot(sims[class_members, 0], sims[class_members, 1], col + '.')
        plt.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=14)
        for x in sims[class_members]:
            plt.plot([cluster_center[0], x[0]], [cluster_center[1], x[1]], col)

    plt.title('Estimated number of clusters: %d' % n_clusters_)
    plt.show()

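However, the output I get looks a bit odd: the point of the second cluster (green) lies on the blue line, so I think it should not be clustered separately and should belong to the blue cluster as well. Please let me know if there is a mistake in the code.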

Edit 2

As σηγ pointed out, I added the following code:

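    from sklearn.manifold import SpectralEmbedding

    # Project the precomputed matrix down to 2 dimensions for plotting
    se = SpectralEmbedding(n_components=2, affinity='precomputed')
    X = se.fit_transform(sims)
    print(X)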

However, for the array np.array([[0, 17, 10, 32, 32], [0, 17, 10, 32, 32], [0, 17, 10, 32, 33], [0, 17, 10, 32, 32], [0, 17, 10, 32, 32]]), the resulting plot shows 3 points. This confuses me, because these 5 arrays represent one point.


Please help me.


Answer:

Following the earlier example, I would try something like this:

    import sklearn.cluster
    from sklearn.manifold import SpectralEmbedding
    import numpy as np
    import matplotlib.pyplot as plt
    from itertools import cycle

    sims = np.array([[0, 17, 10, 32, 32],
                     [18, 0, 6, 20, 15],
                     [10, 8, 0, 20, 21],
                     [30, 16, 20, 0, 17],
                     [30, 15, 21, 17, 0]])

    affprop = sklearn.cluster.AffinityPropagation(affinity="precomputed", damping=0.5)
    affprop.fit(sims)

    cluster_centers_indices = affprop.cluster_centers_indices_
    print(cluster_centers_indices)
    labels = affprop.labels_
    n_clusters_ = len(cluster_centers_indices)
    print(n_clusters_)

    # Embed the precomputed matrix in 2-D and plot the embedded coordinates
    se = SpectralEmbedding(n_components=2, affinity='precomputed')
    X = se.fit_transform(sims)

    plt.close('all')
    plt.figure(1)
    plt.clf()

    colors = cycle('bgrcmykbgrcmykbgrcmykbgrcmyk')
    for k, col in zip(range(n_clusters_), colors):
        class_members = labels == k
        cluster_center = X[cluster_centers_indices[k]]
        plt.plot(X[class_members, 0], X[class_members, 1], col + '.')
        plt.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=14)
        for x in X[class_members]:
            plt.plot([cluster_center[0], x[0]], [cluster_center[1], x[1]], col)

    plt.title('Estimated number of clusters: %d' % n_clusters_)
    plt.show()
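On the pandas part of the question: a DataFrame is not required for plotting, but it can be a convenient way to keep the 2-D coordinates, the cluster labels, and the exemplar flags together. The following is only a minimal sketch, assuming X, labels, and cluster_centers_indices come from the code above. The helper names clusters_to_frame and plot_cluster_frame are hypothetical and are not part of sklearn or pandas.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def clusters_to_frame(coords, labels, center_indices):
    """Collect 2-D coordinates, cluster labels and exemplar flags in one DataFrame.

    Hypothetical helper for illustration; not an sklearn or pandas function.
    """
    coords = np.asarray(coords, dtype=float)
    df = pd.DataFrame(coords[:, :2], columns=["x", "y"])
    df["cluster"] = np.asarray(labels)
    # Flag the rows whose positional index was chosen as a cluster center.
    df["is_exemplar"] = df.index.isin(center_indices)
    return df


def plot_cluster_frame(df, ax=None):
    """Draw one scatter series per cluster and circle the exemplars."""
    if ax is None:
        _, ax = plt.subplots()
    for cluster_id, group in df.groupby("cluster"):
        ax.scatter(group["x"], group["y"], label=f"cluster {cluster_id}")
    exemplars = df[df["is_exemplar"]]
    ax.scatter(exemplars["x"], exemplars["y"], s=200,
               facecolors="none", edgecolors="k", label="exemplar")
    ax.set_title(f"Estimated number of clusters: {df['cluster'].nunique()}")
    ax.legend()
    return ax


# Assumed usage with the variables from the code above:
# df = clusters_to_frame(X, labels, cluster_centers_indices)
# plot_cluster_frame(df)
# plt.show()
```

Grouping by the cluster column gives each cluster its own color from matplotlib's default cycle, so the explicit color string is no longer needed.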

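Regarding Edit 2: with affinity='precomputed', a row of the input is not a feature vector. Entry (i, j) is read as the affinity between sample i and sample j, so identical rows do not imply identical points. As far as I know, sklearn also symmetrizes a non-symmetric precomputed matrix by averaging it with its transpose before embedding; treat that as an assumption to verify against your installed version. The sketch below reproduces that averaging with plain NumPy on the Edit 2 matrix; symmetrize is a hypothetical helper name.

```python
import numpy as np


def symmetrize(affinity):
    """Average a square matrix with its transpose so that A[i, j] == A[j, i]."""
    affinity = np.asarray(affinity, dtype=float)
    if affinity.ndim != 2 or affinity.shape[0] != affinity.shape[1]:
        raise ValueError("expected a square (n_samples, n_samples) matrix")
    return (affinity + affinity.T) / 2.0


# The matrix from Edit 2: four identical rows and one that differs in the last entry.
edit2 = np.array([[0, 17, 10, 32, 32],
                  [0, 17, 10, 32, 32],
                  [0, 17, 10, 32, 33],
                  [0, 17, 10, 32, 32],
                  [0, 17, 10, 32, 32]])

# A pairwise matrix should equal its transpose; this one does not.
print("symmetric input:", np.array_equal(edit2, edit2.T))

sym = symmetrize(edit2)
# Count how many distinct affinity profiles the samples have after averaging.
print("distinct rows after symmetrizing:", np.unique(sym, axis=0).shape[0])
```

After averaging, every sample has a different affinity profile, which is why the embedding does not collapse to a single point. If the rows are meant to be feature vectors, one option is to compute a proper pairwise similarity matrix from them first and pass that instead.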
