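I am working on a clustering task and used the elbow method to determine the optimal number of clusters (k), but the plot I get is close to a straight line, so I cannot read a value of k from it.
Thanks.
Answer: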
There are several ways to do this. One option is Yellowbrick, which provides an elbow visualizer:

    import pandas as pd
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older Matplotlib
    from sklearn import datasets
    from sklearn.cluster import KMeans
    from yellowbrick.cluster import KElbowVisualizer

    mpl.rcParams["figure.figsize"] = (9, 6)

    # Load the iris flower dataset
    iris = datasets.load_iris()
    # Clustering is unsupervised, so we load only X (iris.data) and not y (iris.target)
    X = iris.data

    # Convert the data into a DataFrame for inspection
    feature_names = iris.feature_names
    iris_dataframe = pd.DataFrame(X, columns=feature_names)
    print(iris_dataframe.head(10))

    # Fit a reference model with 3 clusters (we already know Iris has 3 classes)
    k_means = KMeans(n_clusters=3)
    k_means.fit(X)

    # 3D scatter plot of three of the features, colored by cluster label
    fig = plt.figure(figsize=(7, 7))
    ax = fig.add_subplot(111, projection="3d")
    colors = ["red" if label == 0 else "purple" if label == 1 else "green"
              for label in k_means.labels_]
    ax.scatter(X[:, 3], X[:, 0], X[:, 2], c=colors)
    plt.show()  # show this figure first so the elbow plot gets its own axes

    # Instantiate the clustering model and the elbow visualizer
    model = KMeans()
    visualizer = KElbowVisualizer(model, k=(2, 11))  # tries k = 2 through 10
    visualizer.fit(X)   # Fit the data to the visualizer
    visualizer.show()   # Finalize and render the elbow plot
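If the elbow curve stays close to a straight line, a second criterion can help. The sketch below is one possible alternative, not part of the Yellowbrick example above: it computes the mean silhouette score for each candidate k with scikit-learn and reports the k with the highest score. The helper name `silhouette_by_k` and the choices `n_init=10` and `random_state=0` are assumptions made for illustration, and Iris is used only as stand-in data.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score


def silhouette_by_k(X, k_values, random_state=0):
    """Return {k: mean silhouette score} for each candidate k (hypothetical helper)."""
    scores = {}
    for k in k_values:
        # Fit k-means and get one cluster label per sample
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        # Mean silhouette over all samples; values lie in [-1, 1], higher is better
        scores[k] = silhouette_score(X, labels)
    return scores


# Stand-in data; replace with your own feature matrix
X = load_iris().data

scores = silhouette_by_k(X, range(2, 11))
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("k with the highest silhouette score:", best_k)
```

The silhouette score needs at least two clusters, so the candidate range starts at k = 2. If the scores are low and nearly flat as well, that may simply mean the data has no strong cluster structure at any k.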
See the link below for more information.