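I am currently learning about Bayesian classification, and when I tried to follow the example in my book I got some strange results that do not match the book's.
I don't think my code is at fault (I essentially typed it in by hand from the book), but I still get impossible results in the REPL, for example: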
> (+ (evidence-of-sea-bass) (evidence-of-salmon)) ==> 2.8139728009700775
It should return 1.000…, with only a small floating-point error.
Here is the code:
(defn make-sea-bass []
  #{:sea-bass
    (if (< (rand) 0.2) :fat :thin)
    (if (< (rand) 0.7) :long :short)
    (if (< (rand) 0.8) :light :dark)})

(defn make-salmon []
  #{:salmon
    (if (< (rand) 0.8) :fat :thin)
    (if (< (rand) 0.5) :long :short)
    (if (< (rand) 0.3) :light :dark)})

(defn make-sample-fish []
  (if (< (rand) 0.3) (make-sea-bass) (make-salmon)))

(def fish-training-data
  (for [i (range 10000)] (make-sample-fish)))

(defn probability [attribute & {:keys [category prior-positive prior-negative data]
                                :or {category nil data fish-training-data}}]
  (let [by-category (if category (filter category data) data)
        positive (count (filter attribute by-category))
        negative (- (count by-category) positive)
        total (+ positive negative)]
    (/ positive negative)))

(defn evidence-of-salmon [& attrs]
  (let [attr-prob (map #(probability % :category :salmon) attrs)
        class-and-attr-prob (conj attr-prob (probability :salmon))]
    (float (apply * class-and-attr-prob))))

(defn evidence-of-sea-bass [& attrs]
  (let [attr-prob (map #(probability % :category :sea-bass) attrs)
        class-and-attr-prob (conj attr-prob (probability :sea-bass))]
    (float (apply * class-and-attr-prob))))
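Answer:
If you expect the result to be 1.0, your probability function should return (/ positive total). As written it returns (/ positive negative), which is the odds rather than the probability. With roughly 30% sea bass and 70% salmon in the training data, the odds come out at about 0.43 and 2.33, which add up to about 2.76, consistent with the value you are seeing.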
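A minimal sketch of the corrected function, to make the change concrete. It keeps the shape of the original probability but drops the unused prior-positive and prior-negative keys, and it defaults to a tiny hand-written data set, sample-fish, which is hypothetical and exists only so the example runs on its own (the original defaults to fish-training-data).

```clojure
;; Hypothetical four-fish data set, used only to make this sketch self-contained.
(def sample-fish
  [#{:salmon :fat :long :dark}
   #{:salmon :fat :short :dark}
   #{:salmon :thin :long :light}
   #{:sea-bass :thin :long :light}])

(defn probability
  "Fraction of fish (optionally restricted to one category) that have attribute."
  [attribute & {:keys [category data] :or {data sample-fish}}]
  (let [by-category (if category (filter category data) data)
        positive    (count (filter attribute by-category))
        total       (count by-category)]
    ;; Divide by the total, not by the negative count, so the result is a
    ;; proportion between 0 and 1 rather than odds.
    (/ positive total)))

;; The two class priors now cover every fish exactly once.
(def prior-sum (+ (probability :salmon) (probability :sea-bass)))
```

Because the counts are integers, (/ positive total) yields an exact ratio, so prior-sum is numerically equal to 1 here; the float call in the evidence functions is what would introduce the small rounding error mentioned in the question. Note that this sketch does not guard against an empty category, where the division would throw.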