First, I followed the example in the tutorial and wrote the following code:
import numpy as np
import pandas as pd
import tensorflow as tf

from tensorflow import feature_column
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split

URL = 'https://storage.googleapis.com/applied-dl/heart.csv'
dataframe = pd.read_csv(URL)
dataframe.head()

train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)

def df_to_dataset(dataframe, shuffle=True, batch_size=32):
    dataframe = dataframe.copy()
    labels = dataframe.pop('target')
    ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
    if shuffle:
        ds = ds.shuffle(buffer_size=len(dataframe))
    ds = ds.batch(batch_size)
    return ds

batch_size = 32
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)

feature_columns = []
age = feature_column.numeric_column("age")

# numeric cols
for header in ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', 'slope', 'ca']:
    feature_columns.append(feature_column.numeric_column(header))

# bucketized cols
age_buckets = feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
feature_columns.append(age_buckets)

# indicator cols
thal = feature_column.categorical_column_with_vocabulary_list(
    'thal', ['fixed', 'normal', 'reversible'])
thal_one_hot = feature_column.indicator_column(thal)
feature_columns.append(thal_one_hot)

# embedding cols
thal_embedding = feature_column.embedding_column(thal, dimension=8)
feature_columns.append(thal_embedding)

# crossed cols
crossed_feature = feature_column.crossed_column([age_buckets, thal], hash_bucket_size=1000)
crossed_feature = feature_column.indicator_column(crossed_feature)
feature_columns.append(crossed_feature)

feature_layer = tf.keras.layers.DenseFeatures(feature_columns)

model = tf.keras.Sequential([
    feature_layer,
    layers.Dense(128, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dense(1)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(train_ds, validation_data=val_ds, epochs=5)

loss, accuracy = model.evaluate(test_ds)
print("Accuracy", accuracy)

# Try to use predict to get the same accuracy
predictions = model.predict(test_ds)
for i, p in enumerate(predictions):
    print(p, test.iloc[i,-1])
After running it, I got an accuracy of 0.6885246.
I then tried to use the predict method to get the predictions for the evaluation dataset, but print(p, test.iloc[i,-1]) gave me:
[-1.7059733] 0
[-0.914219] 0
[2.6422875] 1
[-0.50430596] 1
[-1.2348572] 0
[-0.57301724] 0
[-2.1014583] 0
[-4.370711] 0
[0.21761642] 0
[-2.8065221] 0
[-3.2469923] 0
[-0.25715744] 1
[0.05394493] 1
[1.2391514] 0
[-3.7102253] 1
[-4.0611124] 0
[1.36385] 0
[-1.1096503] 0
[3.4140522] 1
[0.6951326] 0
[-3.232728] 0
[0.98346126] 0
[0.04960524] 0
[-0.90004027] 0
[1.918218] 0
[-0.02936329] 0
[-0.55671084] 1
[-2.1650188] 1
[-4.8975983] 0
[-1.5514184] 1
[-2.1743653] 0
[0.56928] 0
[-2.8607953] 0
[2.4095147] 0
[0.5155109] 1
[0.7517127] 0
[-1.6738821] 0
[-3.733505] 0
[2.2426589] 1
[-2.6165645] 0
[-2.1079547] 0
[-1.8746301] 0
[-4.116344] 0
[0.33854234] 1
[-2.3230617] 0
[-0.02075209] 1
[-0.33064234] 0
[1.6755556] 1
[1.1898655] 1
[0.40846193] 0
[-0.33131325] 0
[-0.63726294] 0
[-2.7144134] 0
[-0.48318636] 0
[1.516653] 1
[2.5299337] 1
[-2.1182806] 0
[-2.5583768] 1
[-0.65298045] 1
[-1.4936553] 0
[-0.7257029] 0
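My question is: how should I convert these floating-point results to binary values (0 or 1) so that I can compare them with the target? My end goal is to reproduce the accuracy of 0.6885246 reported by the evaluate method.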
Edit after getting the solution:
- Changed to from_logits=False
- Changed the output layer to layers.Dense(1, activation='sigmoid')
- Added the following code after model.predict
final_preds = [1 if x > 0.5 else 0 for x in predictions]
m = 0
for i, p in enumerate(final_preds):
    if p == test.iloc[i, -1]:
        m += 1
print(m / len(final_preds))
After running it, I got:
Accuracy 0.6885246
0.6885245901639344
Answer:
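I am quite surprised by the recent trend in the TensorFlow tutorials of using a linear activation in the final layer of the model (Dense(1)) for classification problems, and then asking for from_logits=True in the loss function. I guess the reason is that it may lead to better numerical stability, as the docs claim: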
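from_logits: Whether to interpret y_pred as a tensor of logit values. By default, we assume that y_pred contains probabilities (i.e., values in [0, 1]). Note: Using from_logits=True may be more numerically stable.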
Here "by default" means that the default value of the loss function argument is from_logits=False.
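To make the numerical stability remark concrete, here is a minimal NumPy sketch comparing a naive cross-entropy computed on probabilities with an algebraically equivalent form computed directly on logits. The function names bce_from_probs and bce_from_logits are made up for this illustration, and the code is not meant to mirror exactly what Keras does internally.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def bce_from_probs(y_true, probs):
    # Naive binary cross-entropy on probabilities.
    return -(y_true * np.log(probs) + (1 - y_true) * np.log(1 - probs))

def bce_from_logits(y_true, logits):
    # Equivalent form on logits x with labels y:
    # max(x, 0) - x * y + log(1 + exp(-|x|))
    return (np.maximum(logits, 0) - logits * y_true
            + np.log1p(np.exp(-np.abs(logits))))

y = np.array([0.0, 1.0])
moderate = np.array([2.0, 2.0])
extreme = np.array([40.0, -40.0])  # confidently wrong predictions

# For moderate logits both versions agree.
print(bce_from_probs(y, sigmoid(moderate)), bce_from_logits(y, moderate))

# For extreme logits sigmoid(40.0) rounds to exactly 1.0 in float64,
# so the naive version takes log(0) for the first sample.
with np.errstate(divide="ignore"):
    print(bce_from_probs(y, sigmoid(extreme)))
print(bce_from_logits(y, extreme))
```

The logit-based form stays finite because it never forms 1 - p explicitly, which is the kind of saturation that from_logits=True is presumably meant to avoid.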
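In any case, you end up with predictions that are logits, not probabilities as was usually the case in similar earlier tutorials (and in practice). The problem with logits is precisely that, unlike probability predictions, they lack an intuitive interpretation.
What you should do is pass your logits through a sigmoid function to convert them to probabilities: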
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
Taking your first four predictions as an example:
preds = np.array([-1.7059733, -0.914219, 2.6422875, -0.50430596])
sigmoid(preds)
# array([0.15368673, 0.28613728, 0.93353404, 0.37652929])
Then convert the probabilities to "hard" predictions using a threshold of 0.5:
final_preds = [1 if x > 0.5 else 0 for x in sigmoid(preds)]
final_preds
# [0, 0, 1, 0]
In this form, you can compare them with the ground truth.
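Putting the two steps together, the comparison with the ground truth can be sketched as a small helper. The name accuracy_from_logits is hypothetical (it is not a Keras or NumPy function), and the sample values are the first four logits and labels printed in the question.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def accuracy_from_logits(logits, labels, threshold=0.5):
    # model.predict returns shape (N, 1) for a Dense(1) output, so flatten first.
    probs = sigmoid(np.asarray(logits, dtype=float).ravel())
    hard_preds = (probs > threshold).astype(int)
    # Fraction of hard predictions that match the true labels.
    return float(np.mean(hard_preds == np.asarray(labels).ravel()))

# First four logits and labels from the output printed in the question.
logits = np.array([[-1.7059733], [-0.914219], [2.6422875], [-0.50430596]])
labels = np.array([0, 0, 1, 1])
print(accuracy_from_logits(logits, labels))  # 0.75
```

With the variables from the question, something like accuracy_from_logits(predictions, test['target'].values) should apply, assuming test_ds was built with shuffle=False so that the row order matches.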
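However, to avoid the manual sigmoid step altogether, I suggest you consider changing your last layer to Dense(1, activation='sigmoid') and removing the (from_logits=True) argument from the loss definition. That way, model.predict should return probabilities in [0, 1] instead of logits (not tested); you still need to threshold them at 0.5 to get hard predictions.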