My CSV data is here: https://storage.googleapis.com/download.tensorflow.org/data/abalone_train.csv

I want to predict "Age" from the other columns. My training code is as follows:
    import pandas as pd
    import numpy as np

    # Make numpy values easier to read.
    np.set_printoptions(precision=3, suppress=True)

    import tensorflow as tf
    from tensorflow.keras import layers
    from tensorflow.keras.layers.experimental import preprocessing

    abalone_train = pd.read_csv(
        "https://storage.googleapis.com/download.tensorflow.org/data/abalone_train.csv",
        header=None,
        names=["Length", "Diameter", "Height", "Whole weight", "Shucked weight",
               "Viscera weight", "Shell weight", "Age"])

    abalone_train.head()

    abalone_features = abalone_train.copy()
    abalone_labels = abalone_features.pop('Age')

    abalone_features = np.array(abalone_features)
    abalone_features

    abalone_model = tf.keras.Sequential([
        layers.Dense(64),
        layers.Dense(1)
    ])

    abalone_model.compile(loss = tf.losses.MeanSquaredError(),
                          optimizer = tf.optimizers.Adam())

    abalone_model.fit(abalone_features, abalone_labels, epochs=10)
Output:
    Epoch 1/10
    104/104 [==============================] - 0s 1ms/step - loss: 63.1474
    Epoch 2/10
    104/104 [==============================] - 0s 924us/step - loss: 11.8933
    Epoch 3/10
    104/104 [==============================] - 0s 920us/step - loss: 8.4037
    Epoch 4/10
    104/104 [==============================] - 0s 885us/step - loss: 7.9656
    Epoch 5/10
    104/104 [==============================] - 0s 900us/step - loss: 7.5481
    Epoch 6/10
    104/104 [==============================] - 0s 908us/step - loss: 7.2339
    Epoch 7/10
    104/104 [==============================] - 0s 926us/step - loss: 6.9871
    Epoch 8/10
    104/104 [==============================] - 0s 919us/step - loss: 6.7886
    Epoch 9/10
    104/104 [==============================] - 0s 956us/step - loss: 6.6482
    Epoch 10/10
    104/104 [==============================] - 0s 953us/step - loss: 6.5404
    <tensorflow.python.keras.callbacks.History at 0x7f20abb1a518>
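Now I want to load another CSV file whose "Age" column is blank and see the predicted values, but I am stuck at this step. I have taken a few courses, but they all stop at the "epoch" stage. Once training has finished, how do I import my CSV with the blank "Age" column and see the predicted ages?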
Answer:
According to the documentation (https://www.tensorflow.org/api_docs/python/tf/keras/Sequential#predict), a Sequential object has a predict method. The input data can be:
- a NumPy array
- a TensorFlow tensor, or a list of tensors
- a tf.data dataset
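For the CSV in the question, the NumPy array option is probably the most direct. Here is a minimal sketch, assuming the new file is headerless like abalone_train.csv and has the same eight columns with the last one ("Age") left empty. The helper names load_unlabeled_features and add_predictions, and the file name in the usage comment, are made up for illustration; model can be any object with a Keras-style predict method, such as abalone_model.

```python
import numpy as np
import pandas as pd

# Column order used in the question's training code.
COLUMNS = ["Length", "Diameter", "Height", "Whole weight", "Shucked weight",
           "Viscera weight", "Shell weight", "Age"]


def load_unlabeled_features(csv_source):
    """Read a headerless abalone CSV and return (DataFrame, feature array).

    The empty "Age" column is dropped so the array holds the same seven
    feature columns, in the same order, that the model was trained on.
    """
    frame = pd.read_csv(csv_source, header=None, names=COLUMNS)
    features = frame.drop(columns=["Age"]).to_numpy(dtype="float32")
    return frame, features


def add_predictions(model, csv_source):
    """Fill the "Age" column with the output of model.predict()."""
    frame, features = load_unlabeled_features(csv_source)
    # A Dense(1) output layer yields shape (n_rows, 1); flatten it to 1-D.
    frame["Age"] = np.asarray(model.predict(features)).reshape(-1)
    return frame


# Intended use with the trained model from the question (not run here;
# "abalone_unlabeled.csv" is a hypothetical file name):
# filled = add_predictions(abalone_model, "abalone_unlabeled.csv")
# print(filled.head())
```

The important part is that the feature columns match the training data in number and order; the model has no notion of column names once the data is an array.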
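You can call abalone_model.predict(YourData), where YourData is one of the data types listed above. You can of course also call predict() on your own training data, but the model has already seen those rows, so the results will look better than they really are and tell you little about how well it generalizes. If a separate validation or test set is provided, predict on that instead, or split your existing dataset. Here is a good regression example that is similar to your problem: https://www.tensorflow.org/tutorials/keras/regression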
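Splitting the existing data could be sketched roughly as follows. This is only one way to do it, using plain NumPy; holdout_split is a hypothetical helper, and the 80/20 ratio and the seed are arbitrary choices rather than anything the question or the tutorial prescribes.

```python
import numpy as np


def holdout_split(features, labels, test_fraction=0.2, seed=0):
    """Shuffle the rows once and split them into train and test parts."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(round(len(features) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (features[train_idx], labels[train_idx],
            features[test_idx], labels[test_idx])


# Intended use with the variables from the question (not run here):
# x_train, y_train, x_test, y_test = holdout_split(abalone_features, abalone_labels)
# abalone_model.fit(x_train, y_train, epochs=10)
# test_loss = abalone_model.evaluate(x_test, y_test)  # MSE on rows not used for training
# test_pred = abalone_model.predict(x_test)
```

Comparing test_loss with the training loss gives a rough idea of whether the model is overfitting.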