Resizing the input dimensions of a model created with make_image_classifier using the TensorFlow Lite C API

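Apologies if this question looks familiar. I previously posted a broader description of the problem, but I have since deleted that post because I made some progress in my investigation and can narrow it down to a more specific question.

Background:

I am using make_image_classifier to create an image classification model.
I want to load the generated model with the C API and label images. This is where I run into a data input problem.
I can label images with the label_image.py example, so the model itself is fine and the problem lies in how I use the C API.
If I understand make_image_classifier correctly, the model it generates expects a 4-dimensional input. We are dealing with images, so beyond width, height, and channels I do not know what this fourth dimension is. This gap in my understanding may be the root of my problem.
I added some error handling to my code, and the error I get occurs when copying from the input buffer after resizing.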

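Questions:

Q1: Why does the model generated by make_image_classifier expect a 4-dimensional input? There are height, width, and channels, but what is the fourth?

When I run the model on my image input with the C API and do the following: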

    int inputDims[3] = {224, 224, 3};
    tflStatus = TfLiteInterpreterResizeInputTensor(interpreter, 0, inputDims, 3);

I get the following error:

    ERROR: tensorflow/lite/kernels/conv.cc:329 input->dims->size != 4 (3 != 4)
    ERROR: Node number 2 (CONV_2D) failed to prepare.

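So I ended up doing the following:

    int inputDims[4] = {1, 224, 224, 3};
    tflStatus = TfLiteInterpreterResizeInputTensor(interpreter, 0, inputDims, 4);

As far as I can tell, the first dimension is the batch size, in case I want to process several images at once. Is that correct?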
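For reference, the shape {1, 224, 224, 3} matches the NHWC convention (batch, height, width, channels) commonly used by TensorFlow Lite image models. The following is a minimal sketch of how a 4-D index maps to an offset in a flat buffer under that assumption. `nhwc_offset` is a hypothetical helper written for illustration; it is not part of the C API.

```c
#include <stddef.h>

/* Hypothetical helper (not a TensorFlow Lite function): offset of element
 * (n, h, w, c) in a flat, row-major buffer laid out as
 * [batch][height][width][channels]. */
size_t nhwc_offset(size_t n, size_t h, size_t w, size_t c,
                   size_t height, size_t width, size_t channels)
{
    return ((n * height + h) * width + w) * channels + c;
}
```

With a batch size of 1, `n` is always 0, so the offset reduces to `(h * width + w) * channels + c`, which is the order in which interleaved RGB pixels usually come out of a decoder. A second image in the batch would simply start at offset 224 * 224 * 3.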

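Q2: Should I structure my input data with the same dimensions I pass to TfLiteInterpreterResizeInputTensor? I get the error described in this question when using the following RGB image input buffer:

    // RGB range is 0-255. Scale it to 0-1.
    for(int i = 0; i < imageSize; i++){
        imageDataBuffer[i] = (float)pImage[i] / 255.0;
    }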

I also get the error when I build an input that mimics the dimensions given to TfLiteInterpreterResizeInputTensor, but this seems silly:

    float imageData[1][224][224][3];
    int j = 0;
    for(int h = 0; h < 224; h++){
      for(int w = 0; w < 224; w++){
        imageData[0][h][w][0] = (float)pImage[j] * (1.0 / 255.0);
        imageData[0][h][w][1] = (float)pImage[j+1] * (1.0 / 255.0);
        imageData[0][h][w][2] = (float)pImage[j+2] * (1.0 / 255.0);
        j = j + 3;
      }
    }

This last input structure resembles the one used in the Python label_image.py example, which does the following:

    input_data = np.expand_dims(img, axis=0)
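In C, a nested array such as `float[1][H][W][C]` is stored contiguously in row-major order, so it should occupy exactly the same bytes as a flat buffer of H * W * C floats filled in the same pixel order. The sketch below checks this on a tiny 2 x 2 x 3 stand-in for the 224 x 224 x 3 image. The function name `layouts_match` and the reduced dimensions are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

enum { H = 2, W = 2, C = 3 };  /* tiny stand-in for 224 x 224 x 3 */

/* Hypothetical check: fill a flat buffer and a [1][H][W][C] array from the
 * same interleaved RGB bytes, then compare them byte for byte.
 * Returns 1 if the two layouts are identical, 0 otherwise. */
int layouts_match(const uint8_t *rgb)
{
    float flat[H * W * C];
    float nested[1][H][W][C];
    int j = 0;

    for (int i = 0; i < H * W * C; i++) {
        flat[i] = (float)rgb[i] / 255.0f;
    }
    for (int h = 0; h < H; h++) {
        for (int w = 0; w < W; w++) {
            for (int c = 0; c < C; c++) {
                nested[0][h][w][c] = (float)rgb[j++] / 255.0f;
            }
        }
    }
    return sizeof(flat) == sizeof(nested) &&
           memcmp(flat, nested, sizeof(flat)) == 0;
}
```

If the two layouts match, the choice between a flat buffer and a nested array should not by itself change the outcome of the copy. What has to agree with the tensor is the byte count passed alongside the buffer.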

Q3: What is wrong with my input buffer that makes TfLiteTensorCopyFromBuffer return an error code?

Thanks!

Full code:

    #include "tensorflow/lite/c/c_api.h"
    #include "tensorflow/lite/c/c_api_experimental.h"
    #include "tensorflow/lite/c/common.h"
    #include "tensorflow/lite/c/builtin_op_data.h"
    #include "tensorflow/lite/c/ujpeg.h"

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    // Dispose of the model and interpreter objects.
    void disposeTfLiteObjects(TfLiteModel* pModel, TfLiteInterpreter* pInterpreter)
    {
        if(pModel != NULL)
        {
          TfLiteModelDelete(pModel);
        }

        if(pInterpreter)
        {
          TfLiteInterpreterDelete(pInterpreter);
        }
    }

    // The main function.
    int main(void)
    {
        TfLiteStatus tflStatus;

        // Create the JPEG image object.
        ujImage img = ujCreate();

        // Decode the JPEG file.
        ujDecodeFile(img, "image_224x224.jpeg");

        // Check whether decoding was successful.
        if(ujIsValid(img) == 0){
            return 1;
        }

        // There will always be 3 channels.
        int channel = 3;

        // Height will always be 224, no need for resizing.
        int height = ujGetHeight(img);

        // Width will always be 224, no need for resizing.
        int width = ujGetWidth(img);

        // The image size is channel * height * width.
        int imageSize = ujGetImageSize(img);

        // Fetch the RGB data from the decoded JPEG image input file.
        uint8_t* pImage = (uint8_t*)ujGetImage(img, NULL);

        // The array that will collect the JPEG RGB values.
        float imageDataBuffer[imageSize];

        // RGB range is 0-255. Scale it to 0-1.
        for(int i = 0; i < imageSize; i++){
            imageDataBuffer[i] = (float)pImage[i] / 255.0;
        }

        // Load the model.
        TfLiteModel* model = TfLiteModelCreateFromFile("model.tflite");

        // Create the interpreter.
        TfLiteInterpreter* interpreter = TfLiteInterpreterCreate(model, NULL);

        // Allocate tensors.
        tflStatus = TfLiteInterpreterAllocateTensors(interpreter);

        // Log and exit in case of error.
        if(tflStatus != kTfLiteOk)
        {
          printf("Error allocating tensors.\n");
          disposeTfLiteObjects(model, interpreter);
          return 1;
        }

        int inputDims[4] = {1, 224, 224, 3};
        tflStatus = TfLiteInterpreterResizeInputTensor(interpreter, 0, inputDims, 4);

        // Log and exit in case of error.
        if(tflStatus != kTfLiteOk)
        {
          printf("Error resizing tensor.\n");
          disposeTfLiteObjects(model, interpreter);
          return 1;
        }

        tflStatus = TfLiteInterpreterAllocateTensors(interpreter);

        // Log and exit in case of error.
        if(tflStatus != kTfLiteOk)
        {
          printf("Error allocating tensors after resize.\n");
          disposeTfLiteObjects(model, interpreter);
          return 1;
        }

        // The input tensor.
        TfLiteTensor* inputTensor = TfLiteInterpreterGetInputTensor(interpreter, 0);

        // Copy the JPEG image data into the input tensor.
        tflStatus = TfLiteTensorCopyFromBuffer(inputTensor, imageDataBuffer, imageSize * sizeof(float));

        // Log and exit in case of error.
        // FIXME: the error occurs here.
        if(tflStatus != kTfLiteOk)
        {
          printf("Error copying input from buffer.\n");
          disposeTfLiteObjects(model, interpreter);
          return 1;
        }

        // Invoke the interpreter.
        tflStatus = TfLiteInterpreterInvoke(interpreter);

        // Log and exit in case of error.
        if(tflStatus != kTfLiteOk)
        {
          printf("Error invoking interpreter.\n");
          disposeTfLiteObjects(model, interpreter);
          return 1;
        }

        // Extract the output tensor data.
        const TfLiteTensor* outputTensor = TfLiteInterpreterGetOutputTensor(interpreter, 0);

        // There are three possible labels. Size the output accordingly.
        float output[3];

        tflStatus = TfLiteTensorCopyToBuffer(outputTensor, output, 3 * sizeof(float));

        // Log and exit in case of error.
        if(tflStatus != kTfLiteOk)
        {
          printf("Error copying output to buffer.\n");
          disposeTfLiteObjects(model, interpreter);
          return 1;
        }

        // Print out the classification result.
        printf("Confidences: %f, %f, %f.\n", output[0], output[1], output[2]);

        // Dispose of the TensorFlow objects.
        disposeTfLiteObjects(model, interpreter);

        // Dispose of the image object.
        ujFree(img);

        return 0;
    }

Edit #1: OK, so inside TfLiteTensorCopyFromBuffer:

    TfLiteStatus TfLiteTensorCopyFromBuffer(TfLiteTensor* tensor,
                                            const void* input_data,
                                            size_t input_data_size) {
        if (tensor->bytes != input_data_size) {
            return kTfLiteError;
        }
        memcpy(tensor->data.raw, input_data, input_data_size);
        return kTfLiteOk;
    }

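My input_data_size value is 150,528 (3 channels x 224 pixels high x 224 pixels wide), but tensor->bytes is 602,112 (3 channels x 448 pixels high x 448 pixels wide, I assume?). I cannot make sense of this difference, especially since I call TfLiteInterpreterResizeInputTensor with {1, 224, 224, 3}.

Edit #2: I believe I found the answer here. I will resolve this post once I have confirmed it.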

Answer:

The solution I linked in Edit #2 is the answer. In the end, I only had to replace the following:

    TfLiteTensorCopyFromBuffer(inputTensor, imageDataBuffer, imageSize);

with:

    TfLiteTensorCopyFromBuffer(inputTensor, imageDataBuffer, imageSize * sizeof(float));

Thanks!
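One way to read the numbers from Edit #1, assuming the input tensor holds 32-bit floats: 150,528 is the element count (1 x 224 x 224 x 3), while tensor->bytes counts bytes, and 150,528 x 4 = 602,112. The sketch below computes a byte size from a shape. `tensor_byte_size` is a hypothetical helper for illustration; the C API also provides TfLiteTensorByteSize, which should report the same figure for the allocated input tensor.

```c
#include <stddef.h>

/* Hypothetical helper (not a TensorFlow Lite function): number of bytes a
 * dense tensor occupies, given its shape and the size of one element. */
size_t tensor_byte_size(const int *dims, int num_dims, size_t elem_size)
{
    size_t count = 1;
    for (int i = 0; i < num_dims; i++) {
        count *= (size_t)dims[i];
    }
    return count * elem_size;
}
```

For the shape {1, 224, 224, 3}, an element size of 1 gives 150,528 and an element size of 4 (a typical `sizeof(float)`) gives 602,112, which is why passing `imageSize * sizeof(float)` satisfies the size check inside TfLiteTensorCopyFromBuffer.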
