I have two vectors of length 6, and I would like to get a number between 0 and 1 that measures how similar they are.
a = c("HDa", "2Pb", "2", "BxU", "BuQ", "Bve")
b = c("HCK", "2Pb", "2", "09", "F", "G")
Can anyone explain how I should do this?
Answer:
Use the lsa package and its manual:
# create some files
library('lsa')
td = tempfile()
dir.create(td)
write(c("HDa", "2Pb", "2", "BxU", "BuQ", "Bve"), file = paste(td, "D1", sep = "/"))
write(c("HCK", "2Pb", "2", "09", "F", "G"), file = paste(td, "D2", sep = "/"))

# read the files into a document-term matrix
myMatrix = textmatrix(td, minWordLength = 1)
Edit: showing the structure of the myMatrix object
myMatrix
# myMatrix
#        docs
# terms  D1 D2
#   2     1  1
#   2pb   1  1
#   buq   1  0
#   bve   1  0
#   bxu   1  0
#   hda   1  0
#   09    0  1
#   f     0  1
#   g     0  1
#   hck   0  1

# compute the cosine similarity between the two document columns
res <- lsa::cosine(myMatrix[,1], myMatrix[,2])
res
# 0.3333
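As a cross-check, the same number can be reproduced in base R without writing files or loading lsa. This is only a minimal sketch, assuming each string counts as one term and that terms are lower-cased as in the matrix shown above; the names `terms`, `va`, `vb` and `cosine_sim` are made up for illustration.

```r
a <- c("HDa", "2Pb", "2", "BxU", "BuQ", "Bve")
b <- c("HCK", "2Pb", "2", "09", "F", "G")

# Shared vocabulary: every distinct term from both vectors, lower-cased
# (assumption: this mirrors the lower-cased terms in the matrix above).
terms <- union(tolower(a), tolower(b))

# Count how often each vocabulary term occurs in each vector.
va <- as.numeric(table(factor(tolower(a), levels = terms)))
vb <- as.numeric(table(factor(tolower(b), levels = terms)))

# Cosine similarity: dot product divided by the product of the vector norms.
cosine_sim <- sum(va * vb) / (sqrt(sum(va^2)) * sqrt(sum(vb^2)))
cosine_sim
```

Two of the ten distinct terms ("2" and "2pb") are shared, and each count vector has norm sqrt(6), so the result is 2 / 6, which matches the 0.3333 above. Because the counts are never negative, the value always falls between 0 and 1.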