I'm learning recommender systems, and I used TensorFlow's random forest (the contrib tensor_forest module). Something is wrong with my loss values. How can I fix my code? Please help.
Here is x_data:
shape = (6000, 116)
values are 0 or 1
array([[1, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 1, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 1, 1, 0], [0, 0, 0, ..., 0, 0, 1], [0, 0, 0, ..., 0, 0, 1]])
Here is y_data:
shape = (6000, 1)
values are 0 or 1
array([[0], [0], [1], ..., [0], [0], [0]])
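In case anyone wants to reproduce this without my dataset, a random stand-in with the same shapes and dtypes works (hypothetical data, not my real values):

import numpy as np

# Hypothetical stand-in for the real data: random 0/1 features and
# labels with the same shapes and dtypes as described above.
rng = np.random.RandomState(0)
x_data = rng.randint(0, 2, size=(6000, 116)).astype(np.float32)
y_data = rng.randint(0, 2, size=(6000, 1)).astype(np.int64)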
Here is my code:
import numpy as np
import tensorflow as tf
from tensorflow.contrib.tensor_forest.python import tensor_forest
from tensorflow.python.ops import resources

def next_batch(x_data, y_data, batch_size):
    if len(x_data) != len(y_data):
        return None, None
    batch_mask = np.random.choice(len(x_data), batch_size)
    x_batch = x_data[batch_mask]
    y_batch = y_data[batch_mask]
    return x_batch, y_batch

x_train = train.iloc[:, 3:].values
y_train = train.iloc[:, 2:3].values
x_test = test.iloc[:, 2:].values
x_data = np.array(x_train, dtype=np.float32)
y_data = np.array(y_train, dtype=np.int64)
test_data = np.array(x_test, dtype=np.float32)

# Parameters
num_steps = 500
batch_size = 1024
num_classes = 2
num_features = 116
num_trees = 10
max_nodes = 1000

tf.reset_default_graph()

# Input and target placeholders
X = tf.placeholder(tf.float32, shape=[None, num_features])
Y = tf.placeholder(tf.int64, shape=[None, 1])

# Random forest hyperparameters
hparams = tensor_forest.ForestHParams(num_classes=num_classes,
                                      num_features=num_features,
                                      num_trees=num_trees,
                                      max_nodes=max_nodes).fill()

# Build the random forest
forest_graph = tensor_forest.RandomForestGraphs(hparams)

# Get training graph and loss
train_op = forest_graph.training_graph(X, Y)
loss_op = forest_graph.training_loss(X, Y)

# Measure the accuracy
infer_op, _, _ = forest_graph.inference_graph(X)
correct_prediction = tf.equal(tf.argmax(infer_op, 1), tf.cast(Y, tf.int64))
accuracy_op = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init_vars = tf.group(tf.global_variables_initializer(),
                     resources.initialize_resources(resources.shared_resources()))

sess = tf.Session()
sess.run(init_vars)

# Training
for i in range(1, num_steps + 1):
    # Get the next random batch of training data
    batch_x, batch_y = next_batch(x_data, y_data, batch_size)
    _, l = sess.run([train_op, loss_op], feed_dict={X: batch_x, Y: batch_y})
    if i % 50 == 0 or i == 1:
        acc = sess.run(accuracy_op, feed_dict={X: batch_x, Y: batch_y})
        print('Step %i, Loss: %f, Acc: %f' % (i, l, acc))
Why does my loss function return negative values?
Results:
INFO:tensorflow:Constructing forest with params =
INFO:tensorflow:{'num_trees': 10, 'max_nodes': 1000, 'bagging_fraction': 1.0, 'feature_bagging_fraction': 1.0, 'num_splits_to_consider': 10, 'max_fertile_nodes': 0, 'split_after_samples': 250, 'valid_leaf_threshold': 1, 'dominate_method': 'bootstrap', 'dominate_fraction': 0.99, 'model_name': 'all_dense', 'split_finish_name': 'basic', 'split_pruning_name': 'none', 'collate_examples': False, 'checkpoint_stats': False, 'use_running_stats_method': False, 'initialize_average_splits': False, 'inference_tree_paths': False, 'param_file': None, 'split_name': 'less_or_equal', 'early_finish_check_every_samples': 0, 'prune_every_samples': 0, 'num_classes': 2, 'num_features': 116, 'bagged_num_features': 116, 'bagged_features': None, 'regression': False, 'num_outputs': 1, 'num_output_columns': 3, 'base_random_seed': 0, 'leaf_model_type': 0, 'stats_model_type': 0, 'finish_type': 0, 'pruning_type': 0, 'split_type': 0}
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/contrib/tensor_forest/python/tensor_forest.py:529: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
Step 1, Loss: -1.000000, Acc: 0.873047
Step 50, Loss: -250.399994, Acc: 0.833313
Step 100, Loss: -537.200012, Acc: 0.856388
Step 150, Loss: -822.799988, Acc: 0.841568
Step 200, Loss: -1001.000000, Acc: 0.835522
Step 250, Loss: -1001.000000, Acc: 0.839737
Step 300, Loss: -1001.000000, Acc: 0.817566
Step 350, Loss: -1001.000000, Acc: 0.816372
Step 400, Loss: -1001.000000, Acc: 0.843414
Step 450, Loss: -1001.000000, Acc: 0.829651
Step 500, Loss: -1001.000000, Acc: 0.839970
Answer:
The loss is just the scalar quantity you are trying to minimize. Nothing requires it to be positive.
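As a toy illustration (my own example, not from the forest code): the standard way to maximize a score with a minimizer is to negate it, and the resulting "loss" is then negative whenever the score is positive:

import tensorflow as tf

# Maximize `score` by minimizing its negation; the printed "loss" is
# negative and keeps falling, yet training is behaving exactly as intended.
score = tf.Variable(0.5)
loss = tf.negative(score)
train = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        sess.run(train)
        print(sess.run(loss))  # -0.6, -0.7, -0.8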
One reason you are getting negative values is how training_loss is implemented in RandomForestGraphs: rather than a conventional cross-entropy, it returns the negative of the forest's average tree size, using the same negate-what-you-maximize trick as negative log-likelihood; see the reference code here. This also explains why your loss bottoms out around -1001 and stays there: each tree stops growing once it reaches max_nodes = 1000.
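For reference, the definition in tensorflow/contrib/tensor_forest/python/tensor_forest.py boils down to this (paraphrased from the TF 1.x contrib source, comments mine):

# RandomForestGraphs.training_loss, paraphrased: the labels are not used,
# and the "loss" is just the negative of the average node count per tree,
# so it falls as the trees grow and flattens once they fill up to
# max_nodes (hence the -1001 plateau in your log).
def training_loss(self, features, labels, name='training_loss'):
    return math_ops.negative(self.average_size(), name=name)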
Also, as you can see, the loss stays flat in the later iterations, so I think some hyperparameter tuning would make the trees more robust to variation in the data; you can get some ideas here, and there is a starting-point sketch below.
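Here is a minimal grid-search sketch over num_trees and max_nodes (my own sketch, not a canonical recipe; it assumes x_data and y_data from the question and uses an 80/20 hold-out split). Note the tf.squeeze on Y: with Y of shape [None, 1], the question's tf.equal(tf.argmax(infer_op, 1), tf.cast(Y, tf.int64)) broadcasts to a [batch, batch] matrix, so the printed accuracy is not measuring what it appears to.

import numpy as np
import tensorflow as tf
from tensorflow.contrib.tensor_forest.python import tensor_forest
from tensorflow.python.ops import resources

def fit_and_score(x_tr, y_tr, x_va, y_va, num_trees, max_nodes, steps=200):
    # Build a fresh forest for one hyperparameter combination and
    # return its accuracy on the held-out split.
    tf.reset_default_graph()
    X = tf.placeholder(tf.float32, shape=[None, x_tr.shape[1]])
    Y = tf.placeholder(tf.int64, shape=[None, 1])
    hparams = tensor_forest.ForestHParams(
        num_classes=2, num_features=x_tr.shape[1],
        num_trees=num_trees, max_nodes=max_nodes).fill()
    graph = tensor_forest.RandomForestGraphs(hparams)
    train_op = graph.training_graph(X, Y)
    infer_op, _, _ = graph.inference_graph(X)
    # tf.squeeze makes the shapes match: argmax is [batch], Y is [batch, 1].
    correct = tf.equal(tf.argmax(infer_op, 1), tf.squeeze(Y, axis=1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    init = tf.group(tf.global_variables_initializer(),
                    resources.initialize_resources(resources.shared_resources()))
    with tf.Session() as sess:
        sess.run(init)
        for _ in range(steps):
            idx = np.random.choice(len(x_tr), 1024)
            sess.run(train_op, feed_dict={X: x_tr[idx], Y: y_tr[idx]})
        return sess.run(accuracy, feed_dict={X: x_va, Y: y_va})

# 80/20 hold-out split; the grid values are illustrative, not tuned.
split = int(0.8 * len(x_data))
x_tr, x_va = x_data[:split], x_data[split:]
y_tr, y_va = y_data[:split], y_data[split:]
for nt in (10, 50, 100):
    for mn in (100, 1000, 10000):
        print('num_trees=%d max_nodes=%d acc=%.4f'
              % (nt, mn, fit_and_score(x_tr, y_tr, x_va, y_va, nt, mn)))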