
Retrieving cluster members with HDBSCAN

IT Tech · xiaolong · May 26, 2025 · 0 Comments

I have some string data and, after some preprocessing, I clustered it with HDBSCAN:

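import hdbscan

# Convert each hash value to a string
textData = train['eudexHash'].apply(lambda x: str(x))

# prediction_data=True is required to call approximate_predict later
clusterer = hdbscan.HDBSCAN(min_cluster_size=5,
                            gen_min_span_tree=True,
                            prediction_data=True).fit(textData.values.reshape(-1, 1))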

Now, when I call approximate_predict on the clusterer to make a prediction, I get this result:

>>> hdbscan.approximate_predict(clusterer, testCase)
(array([113]), array([1.]))

Great, it appears to be predicting new cases: it assigns the new string value to label [113]. So how do I find the other members of that label/bucket/cluster?

Thanks!

Answer:

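If you want to find which of your training data points belong to label 113, note that clusterer.labels_ holds one cluster label per training row, in the same order as the data passed to fit. A boolean mask on it therefore selects the members: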

textdata_with_label_113 = textData[clusterer.labels_ == 113]
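If you expect to look up members for many predicted labels, it may be convenient to group the training values by label once rather than building a mask each time. The following is a minimal sketch of that idea using only the standard library. The helper name members_by_label and the sample values are hypothetical stand-ins for textData and clusterer.labels_; they are not part of the hdbscan API.

```python
from collections import defaultdict


def members_by_label(values, labels):
    """Group training values by cluster label (hypothetical helper)."""
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        # int() normalizes NumPy integer labels to plain Python ints
        groups[int(label)].append(value)
    return dict(groups)


# Made-up stand-ins for textData and clusterer.labels_.
# HDBSCAN marks noise points with the label -1.
train_values = ["A1", "A2", "B1", "A3", "X9"]
train_labels = [113, 113, 7, 113, -1]

groups = members_by_label(train_values, train_labels)

# In practice this would come from the label array returned by
# approximate_predict, e.g. its first element.
predicted_label = 113
print(groups.get(predicted_label, []))
```

Using groups.get with an empty-list default avoids a KeyError for labels that have no training members. If approximate_predict returns -1, the new point was treated as noise, and the -1 group contains the training points that were also left unclustered rather than a real cluster.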

cluster-analysis hdbscan k-means machine-learning python
