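I have completed the TensorFlow getting-started tutorial (https://www.tensorflow.org/get_started/get_started_for_beginners) and made a few small changes to the code to adapt it to my application. The feature columns in my case are as follows: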
    import tensorflow as tf

    # Categorical inputs with a fixed vocabulary.
    transaction_column = tf.feature_column.categorical_column_with_vocabulary_list(
        key='Transaction', vocabulary_list=["buy", "rent"])
    localization_column = tf.feature_column.categorical_column_with_vocabulary_list(
        key='Localization', vocabulary_list=["barcelona", "girona"])

    # Numeric inputs that are bucketized below.
    dimensions_feature_column = tf.feature_column.numeric_column("Dimensions")
    buy_price_feature_column = tf.feature_column.numeric_column("BuyPrice")
    rent_price_feature_column = tf.feature_column.numeric_column("RentPrice")

    my_feature_columns = [
        tf.feature_column.indicator_column(transaction_column),
        tf.feature_column.indicator_column(localization_column),
        tf.feature_column.bucketized_column(
            source_column=dimensions_feature_column,
            boundaries=[50, 75, 100]),
        tf.feature_column.numeric_column(key='Rooms'),
        tf.feature_column.numeric_column(key='Toilets'),
        tf.feature_column.bucketized_column(
            source_column=buy_price_feature_column,
            boundaries=[1, 180000, 200000, 225000, 250000, 275000, 300000]),
        tf.feature_column.bucketized_column(
            source_column=rent_price_feature_column,
            boundaries=[1, 700, 1000, 1300]),
    ]
After that, I saved the model so that I could use it for predictions in Cloud ML Engine. To export the model, I added the following code after evaluating it:
    # Build a serving input function that parses serialized tf.Example protos.
    feature_spec = tf.feature_column.make_parse_example_spec(my_feature_columns)
    export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
    servable_model_dir = "modeloutput"
    servable_model_path = classifier.export_savedmodel(servable_model_dir, export_input_fn)
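After running the code, I get the expected model files in the "modeloutput" directory, and I created the model in the cloud following the instructions at https://cloud.google.com/ml-engine/docs/tensorflow/getting-started-training-prediction#deploy_a_model_to_support_prediction ("Deploy a model to support prediction").
Once the model version was created, I tried to launch an online prediction from Cloud Shell with the following command: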
gcloud ml-engine predict --model $MODEL_NAME --version v1 --json-instances ../prediction.json
where $MODEL_NAME is the name of my model and prediction.json is a JSON file with the following content:
    {"inputs": [
      {
        "Transaction": "rent",
        "Localization": "girona",
        "Dimensions": 90,
        "Rooms": 4,
        "Toilets": 2,
        "BuyPrice": 0,
        "RentPrice": 1100
      }
    ]}
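However, the prediction fails and I receive the following error message:
"error": "Prediction failed: Error processing input: Expected string, got {u'BuyPrice': 0, u'Transaction': u'rent', u'Rooms': 4, u'Localization': u'girona', u'Toilets': 2, u'RentPrice': 1100, u'Dimensions': 90} of type 'dict' instead."
The error is clear: a string was expected, not a dictionary. When I inspect the SignatureDef of my SavedModel, I get the following: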
    The given SavedModel SignatureDef contains the following input(s):
      inputs['inputs'] tensor_info:
          dtype: DT_STRING
          shape: (-1)
          name: input_example_tensor:0
    The given SavedModel SignatureDef contains the following output(s):
      outputs['classes'] tensor_info:
          dtype: DT_STRING
          shape: (-1, 12)
          name: dnn/head/Tile:0
      outputs['scores'] tensor_info:
          dtype: DT_FLOAT
          shape: (-1, 12)
          name: dnn/head/predictions/probabilities:0
    Method name is: tensorflow/serving/classify
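The dtype expected for the input is clearly string (DT_STRING), but I do not know how to format my input data so that the prediction succeeds. I have tried writing the input JSON in many different ways, and I always get an error. Looking at how prediction is done in the tutorial (https://www.tensorflow.org/get_started/get_started_for_beginners), the prediction input there is passed as a dictionary (predict_x in the tutorial code).
So, where am I going wrong? How can I make a prediction with this input data?
Thanks for your time.
Edit based on the answer
Following @Lak's second suggestion, I updated the code that exports the model, so it now looks like this: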
    export_input_fn = serving_input_fn
    servable_model_dir = "savedmodeloutput"
    servable_model_path = classifier.export_savedmodel(servable_model_dir, export_input_fn)
    ...
    def serving_input_fn():
        # One placeholder per raw feature, each a 1-D batch.
        feature_placeholders = {
            'Transaction': tf.placeholder(tf.string, [None]),
            'Localization': tf.placeholder(tf.string, [None]),
            'Dimensions': tf.placeholder(tf.float32, [None]),
            'Rooms': tf.placeholder(tf.int32, [None]),
            'Toilets': tf.placeholder(tf.int32, [None]),
            'BuyPrice': tf.placeholder(tf.float32, [None]),
            'RentPrice': tf.placeholder(tf.float32, [None])
        }
        # Add a trailing dimension so each feature has shape [batch, 1].
        features = {
            key: tf.expand_dims(tensor, -1)
            for key, tensor in feature_placeholders.items()
        }
        return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)
After that, I created a new model and fed it the following JSON to get a prediction:
    {
      "Transaction": "rent",
      "Localization": "girona",
      "Dimensions": 90.0,
      "Rooms": 4,
      "Toilets": 2,
      "BuyPrice": 0.0,
      "RentPrice": 1100.0
    }
Note that I removed "inputs" from the JSON structure because I got an "Unexpected tensor name: inputs" error when making the prediction. However, now I get a new, uglier error:
"error": "Prediction failed: Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: dnn/input_from_feature_columns/input_layer/Transaction_indicator/to_sparse_input/indices = Where[T=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.)\n\t [[Node: dnn/input_from_feature_columns/input_layer/Transaction_indicator/to_sparse_input/indices = Where[T=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]]]\")"
I checked the SignatureDef again and got the following:
    The given SavedModel SignatureDef contains the following input(s):
      inputs['Toilets'] tensor_info:
          dtype: DT_INT32
          shape: (-1)
          name: Placeholder_4:0
      inputs['Rooms'] tensor_info:
          dtype: DT_INT32
          shape: (-1)
          name: Placeholder_3:0
      inputs['Localization'] tensor_info:
          dtype: DT_STRING
          shape: (-1)
          name: Placeholder_1:0
      inputs['RentPrice'] tensor_info:
          dtype: DT_FLOAT
          shape: (-1)
          name: Placeholder_6:0
      inputs['BuyPrice'] tensor_info:
          dtype: DT_FLOAT
          shape: (-1)
          name: Placeholder_5:0
      inputs['Dimensions'] tensor_info:
          dtype: DT_FLOAT
          shape: (-1)
          name: Placeholder_2:0
      inputs['Transaction'] tensor_info:
          dtype: DT_STRING
          shape: (-1)
          name: Placeholder:0
    The given SavedModel SignatureDef contains the following output(s):
      outputs['class_ids'] tensor_info:
          dtype: DT_INT64
          shape: (-1, 1)
          name: dnn/head/predictions/ExpandDims:0
      outputs['classes'] tensor_info:
          dtype: DT_STRING
          shape: (-1, 1)
          name: dnn/head/predictions/str_classes:0
      outputs['logits'] tensor_info:
          dtype: DT_FLOAT
          shape: (-1, 12)
          name: dnn/logits/BiasAdd:0
      outputs['probabilities'] tensor_info:
          dtype: DT_FLOAT
          shape: (-1, 12)
          name: dnn/head/predictions/probabilities:0
    Method name is: tensorflow/serving/predict
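Did I make a mistake in one of the steps? Thanks!
New update
I have run a local prediction, and it executed successfully and returned the expected prediction results. The command used was: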
gcloud ml-engine local predict --model-dir $MODEL_DIR --json-instances=../prediction.json
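where MODEL_DIR is the directory containing the files generated by model training. So the problem seems to lie in the exported model: somehow it does not work correctly when it is used for prediction afterwards. I have read that the TensorFlow version could be the cause of the problem, but I do not really understand why. Isn't all of my code executed under the same TF version? Any ideas on this point?
Thanks!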
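A note on the version question: the graph is written by the local TensorFlow binary at export time but executed by whichever TensorFlow the deployed model version runs on, and the error text itself asks whether those two binaries are in step. The following is a minimal sketch of that comparison, assuming the export version is read from tf.__version__ and the serving version is the runtime version chosen when the model version was created. The helper name and the sample version strings are hypothetical.

```python
def same_major_minor(export_version, serving_version):
    """Return True when both version strings share the same major.minor."""
    def major_minor(version):
        parts = version.split(".")
        return int(parts[0]), int(parts[1])
    return major_minor(export_version) == major_minor(serving_version)


# Hypothetical values: the first would come from tf.__version__ on the
# machine that exported the model, the second from the deployed version.
print(same_major_minor("1.4.1", "1.4"))
print(same_major_minor("1.4.1", "1.0"))
```

If the two differ, recreating the model version with a matching runtime (the --runtime-version flag of gcloud ml-engine versions create) is one thing to try.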
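Answer:
The problem is with your serving input function. You are using build_parsing_serving_input_receiver_fn, which is meant for the case where you send serialized tf.Example strings.
There are two ways to fix this: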
- Send a tf.Example:
    # Keys must match the feature column keys; only two features are shown here.
    example = tf.train.Example(features=tf.train.Features(feature={
        'Transaction': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[b'rent'])),
        'RentPrice': tf.train.Feature(
            float_list=tf.train.FloatList(value=[1000.0])),
    }))
    string_to_send = example.SerializeToString()
- Change the serving input function so that you can send JSON:
    def serving_input_fn():
        # One placeholder per raw feature; keys must match the feature column keys.
        feature_placeholders = {
            'Transaction': tf.placeholder(tf.string, [None]),
            ...
            'RentPrice': tf.placeholder(tf.float32, [None]),
        }
        # Add a trailing dimension so each feature has shape [batch, 1].
        features = {
            key: tf.expand_dims(tensor, -1)
            for key, tensor in feature_placeholders.items()
        }
        return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)

    export_input_fn = serving_input_fn
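With the second option, the file passed to --json-instances holds one JSON object per line, with keys that match the placeholder names. Here is a minimal sketch of building that payload, assuming the seven features from the question. The helper name to_json_lines and the sample values are illustrative and not part of any library.

```python
import json

# Hypothetical instance: keys mirror the placeholder names, and the value
# types follow their dtypes (strings, floats for float32, ints for int32).
instance = {
    "Transaction": "rent",
    "Localization": "girona",
    "Dimensions": 90.0,
    "Rooms": 4,
    "Toilets": 2,
    "BuyPrice": 0.0,
    "RentPrice": 1100.0,
}


def to_json_lines(instances):
    """Serialize each instance as one JSON object on its own line."""
    return "".join(json.dumps(inst) + "\n" for inst in instances)


payload = to_json_lines([instance])
print(payload)
```

Saving payload as prediction.json gives a file in the one-instance-per-line layout, without an outer "inputs" wrapper.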