How do I train an LSTM in tensorflow.js to classify spam?

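I'm training an LSTM to handle some spam. I have two classes: "spam" and "ham". I preprocess the data by splitting each message into characters and then one-hot encoding the characters. I then pair each message with its label vector: [0] for "ham" and [1] for "spam". The following code preprocesses the data: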

    const fs = require("fs");
    const R = require("ramda");

    const txt = fs.readFileSync("spam.txt").toString();

    const encodeChars = string => {
        const vecLength = 127;
        const genVec = (char) => R.update(char.charCodeAt(0), 1, Array(vecLength).fill(0));
        return string.split('').map(char => char.charCodeAt(0) < vecLength ? genVec(char) : "invalid");
    }

    const data = R.pipe(
        R.split(",,,"),
        R.map(
            R.pipe(
                x => [(x.split(",").slice(1).concat("")).reduce((t, v) => t.concat(v)), x.split(",")[0]],
                R.adjust(1, R.replace(/\r|\n/g, "")),
                R.adjust(0, encodeChars),
                R.adjust(1, x => x === "ham" ? [0] : [1])
            )
        ),
        R.filter(R.pipe(
            R.prop(0),
            x => !R.contains("invalid", x)
        ))
    )(txt);

    fs.writeFileSync("data.json", JSON.stringify(data))

Then, using the encoded vectors from data.json, I import the data into tensorflow:

    const fs = require("fs");
    const data = JSON.parse(fs.readFileSync("data.json").toString()).sort(() => Math.random() - 0.5)
    const train = data.slice(0, Math.floor(data.length * 0.8));
    const test = data.slice(Math.floor(data.length * 0.8));

    const tf = require("@tensorflow/tfjs-node");

    const model = tf.sequential({
        layers: [
            tf.layers.lstm({ inputShape: [null, 127], units: 16, activation: "relu", returnSequences: true }),
            tf.layers.lstm({ units: 16, activation: "relu", returnSequences: true }),
            tf.layers.lstm({ units: 16, activation: "relu", returnSequences: true }),
            tf.layers.dense({ units: 1, activation: "softmax" }),
        ]
    })

    const tdata = tf.tensor3d(train.map(x => x[0]));
    const tlabels = tf.tensor2d(train.map(x => x[1]));

    model.compile({
        optimizer: "adam",
        loss: "categoricalCrossentropy",
        metrics: ["accuracy"]
    })

    model.fit(tdata, tlabels, {
        epochs: 1,
        batchSize: 32,
        callbacks: {
            onBatchEnd(batch, logs) {
                console.log(logs.acc)
            }
        }
    })

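tdata is 3-dimensional and tlabels is 2-dimensional, so everything should work. However, when I run the code I get the following error:

    Error when checking target: expected dense_Dense1 to have 3 dimension(s). but got array with shape 4032,1

Does anyone know what is going wrong here? I can't work it out. Thanks!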

Note: I have already tried normalizing the length of the vectors by adding "null" to the end of the message vectors, but I still get the same error.

Answer:

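The last LSTM layer should set returnSequences: false. With returnSequences: true, the layer emits one output per time step, so the dense layer after it produces a 3D tensor of shape [batch, timesteps, 1], while the labels are 2D with shape [4032, 1]. With returnSequences: false, the layer returns only the output of the final time step, a 2D tensor of shape [batch, units], which plays the same role as a flatten layer in front of the dense layer. This fixes the error in the question: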

Error when checking target: expected dense_Dense1 to have 3 dimension(s). but got array with shape 4032,1

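To expand on the answer, there is more to this than character encoding. The dataset should really be tokenized. A simple word tokenizer can be used, as explained here, or the tokenizer that ships with the Universal Sentence Encoder. The LSTM sequence can then be made up of the unique identifier of each token.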
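
As a rough illustration of the word-level approach, here is a minimal sketch in plain JavaScript. The helpers tokenize, buildVocab and encode are hypothetical names invented for this example and are not part of any library. Reserving id 0 for padding and id 1 for unknown tokens is also an assumption, as are the sample messages.

```javascript
// Hypothetical helpers for illustration only; none of these come from a library.

// Lowercase the text and split on anything that is not a letter, digit, or apostrophe.
const tokenize = text => text.toLowerCase().split(/[^a-z0-9']+/).filter(Boolean);

// Give each distinct token an integer id. 0 is reserved for padding, 1 for unknown tokens.
const buildVocab = messages => {
    const vocab = new Map();
    for (const message of messages) {
        for (const token of tokenize(message)) {
            if (!vocab.has(token)) vocab.set(token, vocab.size + 2);
        }
    }
    return vocab;
};

// Turn a message into a fixed-length id sequence: truncate, then right-pad with 0.
const encode = (text, vocab, maxLen) => {
    const ids = tokenize(text)
        .map(token => (vocab.has(token) ? vocab.get(token) : 1))
        .slice(0, maxLen);
    return ids.concat(Array(maxLen - ids.length).fill(0));
};

// Made-up sample messages, used only to build a tiny vocabulary.
const messages = ["Win a free prize now", "Are we still on for lunch"];
const vocab = buildVocab(messages);

const encoded = encode("free lunch now please", vocab, 6);
console.log(encoded); // [4, 12, 6, 1, 0, 0]
```

Because every encoded message has the same length, the sequences stack into a rectangular tensor. Integer ids like these are typically passed through a tf.layers.embedding layer before the LSTM instead of being one-hot encoded by hand.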

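In addition, a single unit in the last layer does not reflect a classification approach. It reads more like predicting a value than a class, and a softmax over a single unit always outputs 1, so the model cannot learn anything from it. Two units should be used (one for spam, one for ham), with the labels one-hot encoded to match.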
