This is a simple model architecture based on this tutorial. The dataset looks like this, although in 10 dimensions:
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers, optimizers
from sklearn.datasets import make_blobs

def pre_processing(inputs, targets):
    inputs = tf.cast(inputs, tf.float32)
    targets = tf.cast(targets, tf.int64)
    return inputs, targets

def get_data():
    inputs, targets = make_blobs(n_samples=1000, n_features=10, centers=7, cluster_std=1)
    data = tf.data.Dataset.from_tensor_slices((inputs, targets))
    data = data.map(pre_processing)
    data = data.take(count=1000).shuffle(buffer_size=1000).batch(batch_size=256)
    return data

model = Sequential([
    layers.Dense(8, input_shape=(10,), activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(7)
])

@tf.function
def compute_loss(logits, labels):
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits, labels=labels))

@tf.function
def compute_accuracy(logits, labels):
    predictions = tf.argmax(logits, axis=1)
    return tf.reduce_mean(tf.cast(tf.equal(predictions, labels), tf.float32))

@tf.function
def train_step(model, optim, x, y):
    with tf.GradientTape() as tape:
        logits = model(x)
        loss = compute_loss(logits, y)
    grads = tape.gradient(loss, model.trainable_variables)
    optim.apply_gradients(zip(grads, model.trainable_variables))
    accuracy = compute_accuracy(logits, y)
    return loss, accuracy

def train(epochs, model, optim):
    train_ds = get_data()
    loss = 0.
    acc = 0.
    for step, (x, y) in enumerate(train_ds):
        loss, acc = train_step(model, optim, x, y)
        if step % 500 == 0:
            print(f'Epoch {epochs} loss {loss.numpy()} acc {acc.numpy()}')
    return loss, acc

optim = optimizers.Adam(learning_rate=1e-6)
for epoch in range(100):
    loss, accuracy = train(epoch, model, optim)
```
```
Epoch 85 loss 2.530677080154419 acc 0.140625
Epoch 86 loss 3.3184046745300293 acc 0.0
Epoch 87 loss 3.138179063796997 acc 0.30078125
Epoch 88 loss 3.7781732082366943 acc 0.0
Epoch 89 loss 3.4101686477661133 acc 0.14453125
Epoch 90 loss 2.2888522148132324 acc 0.13671875
Epoch 91 loss 5.993691444396973 acc 0.16015625
```
What am I doing wrong?
Answer:
There are two problems in your code:
- The first problem is that you generate a new training dataset on every epoch (see the first line of the `train` function, where `get_data` is called once per epoch). Because you use `sklearn.datasets.make_blobs` to generate the clusters, there is no guarantee that the clusters produced by different calls follow the same distribution and/or label mapping. So the best the model can do on a completely different dataset each epoch is to guess randomly, which is why you see an average accuracy of roughly 1/7 ≈ 0.14 in your results. To fix this, move the data generation out of the `train` function (i.e. generate the data once at the global level by calling `get_data`), and then pass the generated data to `train` as an argument on every epoch (see the sketch after this list).
- The second problem is that you set a very low learning rate for the optimizer, namely 1e-6; as a result, the model gets stuck and effectively does not train at all. Instead, use the default learning rate of the Adam optimizer, i.e. 1e-3, and adjust it as needed (e.g. based on the results of your experiments).
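For reference, here is a minimal sketch of how the end of the script could look with both fixes applied, assuming the `model`, `get_data`, `compute_loss`, `compute_accuracy`, and `train_step` definitions from the question stay as they are; the extra `train_ds` parameter of `train` is the only new name introduced here:

```python
# Generate the dataset once, at the global level, so every epoch
# trains on the same clusters and the same label mapping.
train_ds = get_data()

def train(epoch, model, optim, train_ds):
    loss, acc = 0., 0.
    for step, (x, y) in enumerate(train_ds):
        loss, acc = train_step(model, optim, x, y)
        if step % 500 == 0:
            print(f'Epoch {epoch} loss {loss.numpy()} acc {acc.numpy()}')
    return loss, acc

# Use Adam's default learning rate (1e-3) instead of 1e-6.
optim = optimizers.Adam(learning_rate=1e-3)

for epoch in range(100):
    loss, accuracy = train(epoch, model, optim, train_ds)
```

With the dataset held fixed across epochs and a sensible learning rate, the loss should decrease steadily instead of oscillating around random-guess accuracy.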