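I am using the code below to run the SOM (Self-Organizing Map, also known as a Kohonen network) machine learning algorithm to visualize some data. I then applied a clustering algorithm to the visualization (I chose 8 clusters):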
# load libraries
library(tidyverse)
library(kohonen)
library(GGally)
library(purrr)
library(tidyr)
library(dplyr)
library(mlr)

# load data
data(flea)
fleaTib <- as_tibble(flea)

# define SOM grid
somGrid <- somgrid(xdim = 5, ydim = 5, topo = "hexagonal",
                   neighbourhood.fct = "bubble", toroidal = FALSE)

# format data
fleaScaled <- fleaTib %>%
  select(-species) %>%
  scale()

# perform som
fleaSom <- som(fleaScaled, grid = somGrid, rlen = 5000,
               alpha = c(0.05, 0.01))

par(mfrow = c(2, 3))
plotTypes <- c("codes", "changes", "counts", "quality",
               "dist.neighbours", "mapping")
walk(plotTypes, ~plot(fleaSom, type = ., shape = "straight"))

getCodes(fleaSom) %>%
  as_tibble() %>%
  iwalk(~plot(fleaSom, type = "property", property = .,
              main = .y, shape = "straight"))

# listing flea species on SOM
par(mfrow = c(1, 2))
nodeCols <- c("cyan3", "yellow", "purple", "red", "blue", "green", "white", "pink")
plot(fleaSom, type = "mapping", pch = 21,
     bg = nodeCols[as.numeric(fleaTib$species)],
     shape = "straight", bgcol = "lightgrey")

# CLUSTER AND ADD TO SOM MAP ---- (8 clusters)
clusters <- cutree(hclust(dist(fleaSom$codes[[1]], method = "manhattan")), 8)
somClusters <- map_dbl(clusters, ~{
  if (. == 1) 3 else if (. == 2) 2 else 1
})
plot(fleaSom, type = "mapping", pch = 21,
     bg = nodeCols[as.numeric(fleaTib$species)],
     shape = "straight", bgcol = nodeCols[as.integer(somClusters)])
add.cluster.boundaries(fleaSom, somClusters)
However, the final plot shows only 3 background colours instead of 8.
Can anyone tell me what I am doing wrong?
Answer:
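When you define the background colours in the last plot, replace somClusters with clusters. The main problem is that the somClusters vector you defined takes only three distinct values, not 8: the if/else maps cluster 1 to 3, cluster 2 to 2, and every other cluster to 1. If you use it to index the colour vector, only three colours can ever be shown.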
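To see the collapse without fitting a SOM, the remapping from the question can be run on its own. This is a minimal base-R sketch: the `clusters` vector here is a made-up stand-in for the output of `cutree(..., 8)` (one label per node of a 5 x 5 grid), and `vapply()` is used in place of `map_dbl()` so that no packages are needed.

```r
# Hypothetical stand-in for cutree(..., 8): one cluster label per SOM node.
clusters <- rep(1:8, length.out = 25)

# Same remapping as in the question, written with base R instead of purrr.
somClusters <- vapply(clusters, function(k) {
  if (k == 1) 3 else if (k == 2) 2 else 1
}, numeric(1))

nodeCols <- c("cyan3", "yellow", "purple", "red", "blue", "green", "white", "pink")

# Number of distinct labels before and after the remapping.
length(unique(clusters))
length(unique(somClusters))

# Colours that can actually be reached through each index vector.
unique(nodeCols[clusters])
unique(nodeCols[as.integer(somClusters)])
```

Because clusters 3 to 8 are all mapped to 1, they share a single entry of `nodeCols`, which is why the plot in the question cannot show more than three background colours.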
plot(fleaSom, type = "mapping", pch = 21,
     bg = nodeCols[as.numeric(fleaTib$species)],
     shape = "straight", bgcol = nodeCols[as.integer(clusters)])
add.cluster.boundaries(fleaSom, somClusters)
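The indexing step behind this fix can be checked in isolation. The sketch below is only an illustration under stated assumptions: `codes` is a random matrix standing in for `fleaSom$codes[[1]]` (25 nodes, 6 variables), so the cluster assignments are not those of the real flea SOM. It confirms that the palette is long enough for the requested number of clusters and that indexing it with `clusters` reaches all eight colours.

```r
set.seed(1)

# Assumption: random codebook vectors in place of fleaSom$codes[[1]].
codes <- matrix(rnorm(25 * 6), nrow = 25)

# Same clustering call as in the question, with k named explicitly.
clusters <- cutree(hclust(dist(codes, method = "manhattan")), k = 8)

nodeCols <- c("cyan3", "yellow", "purple", "red", "blue", "green", "white", "pink")

# Guard: a label larger than the palette length would index to NA.
stopifnot(max(clusters) <= length(nodeCols))

# One background colour per node, taken directly from the cluster labels.
bgcol <- nodeCols[clusters]
table(bgcol)

# With the real SOM this vector would be used as:
#   plot(fleaSom, type = "mapping", ..., bgcol = bgcol)
#   add.cluster.boundaries(fleaSom, clusters)
```

Passing `clusters` rather than `somClusters` to `add.cluster.boundaries()` as well, as in the final comment, is a suggestion beyond the original answer; it would make the drawn boundaries follow the same eight groups as the background colours.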