'KMeans' object has no attribute 'labels_'

I'm using sklearn's KMeans algorithm in my code. When I run it, I get the error "'KMeans' object has no attribute 'labels_'":

    Traceback (most recent call last):
      File ".\kmeans.py", line 56, in <module>
        np.unique(km.labels_, return_counts=True)
    AttributeError: 'KMeans' object has no attribute 'labels_'

Here is my code:

    import pandas as pds
    import nltk,re,string
    from nltk.probability import FreqDist
    from collections import defaultdict
    from nltk.tokenize import sent_tokenize, word_tokenize, RegexpTokenizer
    from nltk.tokenize.punkt import PunktSentenceTokenizer
    from nltk.corpus import stopwords
    from string import punctuation
    from heapq import nlargest

    # Import and instantiate CountVectorizer
    from sklearn.feature_extraction.text import CountVectorizer
    vect = CountVectorizer()

    from sklearn.feature_extraction.text import TfidfVectorizer
    vectorizer = TfidfVectorizer(ngram_range=(1,2),max_df=0.5, min_df=2,stop_words='english')
    train_X = vectorizer.fit_transform(x)

    from sklearn.cluster import KMeans
    import sklearn.cluster.k_means_
    km = KMeans(n_clusters=3, init='k-means++', max_iter=100, n_init=1, verbose=True)

    import numpy as np
    np.unique(km.labels_, return_counts=True)

    text = {}
    for i,cluster in enumerate(km.labels_):
        oneDocument = X[i]
        if cluster not in text.keys():
            text[cluster] = oneDocument
        else:
            text[cluster] += oneDocument

    _stopwords = set(stopwords.words('english')+ list(punctuation))
    keywords = {}
    counts = {}
    for cluster in range(3):
        word_sent = word_tokenize(text[cluster].lower())
        word_sent = [word for word in word_sent if word not in _stopwords]
        freq = FreqDist(word_sent)
        keywords[cluster] = nlargest(100, freq, key=freq.get)
        counts[cluster] = freq

    unique_keys={}
    for cluster in range(3):
        other_clusters = list(set(range(3))-set([cluster]))
        keys_other_clusters = set(keywords[other_clusters[0]]).union(set(keywords[other_clusters[1]]))
        unique=set(keywords[cluster])-keys_other_clusters
        unique_keys[cluster]= nlargest(100, unique, key=counts[cluster].get)

    #print(unique_keys)
    print(keywords)

The goal is to get the keywords for each cluster. I have tried to fix this, but I can't tell where the problem is.


Answer:

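You need to fit the KMeans object first; only then does it have the labels_ attribute. In the question's code, km is constructed but km.fit(train_X) is never called before km.labels_ is read.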

Without fitting, it raises an error:

    from sklearn.cluster import KMeans

    km = KMeans()
    print(km.labels_)
    # AttributeError: 'KMeans' object has no attribute 'labels_'
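If it is unclear whether an estimator has been fitted yet, it can be checked before reading labels_. The following is a minimal sketch, assuming a reasonably recent scikit-learn in which check_is_fitted can be called with only the estimator; the variable names has_labels and is_fitted are just for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted

km = KMeans(n_clusters=3)  # constructed, but fit() has not been called

# Attributes with a trailing underscore are only created by fit(),
# so a plain hasattr check is enough to see whether labels_ exists.
has_labels = hasattr(km, "labels_")
print(has_labels)

# scikit-learn's own helper raises NotFittedError for an unfitted estimator.
try:
    check_is_fitted(km)
    is_fitted = True
except NotFittedError:
    is_fitted = False
print(is_fitted)
```

Both checks report that the estimator is not fitted here; after km.fit(...) they would both pass.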

After fitting:

    from sklearn.cluster import KMeans
    import numpy as np

    km = KMeans()
    X = np.random.rand(100, 2)
    km.fit(X)
    print(km.labels_)

Output (the exact labels vary from run to run, since X is random):

    [1 6 7 4 6 6 7 5 6 0 0 7 3 4 5 7 5 0 3 4 0 6 1 6 7
     5 4 3 4 2 1 2 1 4 6 3 6 1 7 6 6 7 4 1 1 0 4 2 5 0
     6 3 1 0 7 6 2 7 7 5 2 7 7 3 2 1 2 2 4 7 5 3 2 6 5
     1 6 2 4 2 3 2 2 2 1 2 0 5 7 2 4 4 5 4 4 1 1 4 5 0]
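Applied to a text pipeline like the one in the question, the fix is a single km.fit(train_X) call between constructing KMeans and reading km.labels_. The sketch below is one possible arrangement, not the asker's exact script: the docs list is a hypothetical toy corpus standing in for the undefined x and X, random_state is added only to make runs repeatable, and the max_df, min_df and ngram_range settings are left out because the corpus is so small.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for the asker's documents.
docs = [
    "the cat sat on the mat",
    "a cat chased the mouse",
    "stocks fell as markets closed",
    "markets rallied and stocks rose",
    "the team won the football match",
    "a late goal decided the football match",
]

vectorizer = TfidfVectorizer(stop_words="english")
train_X = vectorizer.fit_transform(docs)

km = KMeans(n_clusters=3, init="k-means++", max_iter=100, n_init=1, random_state=0)
km.fit(train_X)  # the step missing from the question's code

# labels_ now exists: one cluster index per document.
cluster_ids, cluster_sizes = np.unique(km.labels_, return_counts=True)
print(cluster_ids, cluster_sizes)

# Concatenate the raw text of every document that landed in the same cluster.
text = {}
for doc, cluster in zip(docs, km.labels_):
    key = int(cluster)
    text[key] = (text.get(key, "") + " " + doc).strip()
print(text)
```

From here, the per-cluster keyword counting in the question can run on text as before. Note that the loop indexes the original documents, not the TF-IDF matrix, since the later tokenizing step needs strings.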
