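Error when trying to upload an image prompt to the Gemini API

I am using Postman to get a response from the Gemini API for a prompt that includes an image. I send the request to: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent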

{
  "contents": [
    {
      "parts": [
        {"text": "What is this picture?"},
        {
          "inline_data": {
            "mime_type": "image/jpeg",
            "data": "https://i.ibb.co/3mX1qcB/Document-sans-titre-page-0001.jpg"
          }
        }
      ]
    }
  ]
}

The response I receive is:

{
  "error": {
    "code": 400,
    "message": "Invalid value at 'contents[0].parts[1].inline_data.data' (TYPE_BYTES), Base64 decoding failed for \"https://pastebin.com/raw/kL4WEnnn\"",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "contents[0].parts[1].inline_data.data",
            "description": "Invalid value at 'contents[0].parts[1].inline_data.data' (TYPE_BYTES), Base64 decoding failed for \"https://pastebin.com/raw/kL4WEnnn\""
          }
        ]
      }
    ]
  }
}

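I also tried converting the image to raw base64 text, uploading that text to pastebin, and using the pastebin URL in the request, but I still get the same error. Can anyone help?

Answer:

How about the following patterns?

Pattern 1:

This pattern uses inlineData (written as inline_data in the request body below).

In this case, the image data (https://i.ibb.co/3mX1qcB/Document-sans-titre-page-0001.jpg) must be converted to base64, and the base64 string itself, not a URL pointing to it, goes in the data field. First, I created a text file containing the base64 data obtained from the URL, as follows. The file name is sampleRequestBody.txt.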

{"contents":[{"parts":[{"text":"What is this picture?"},{"inline_data":{"mime_type":"image/jpeg","data":"{base64 data converted from image data}"}}]}]}
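As a rough illustration of the conversion step, the sketch below builds the same request body in Python and shows why a URL in the data field is rejected. The helper names is_base64 and build_inline_request are hypothetical, and the placeholder bytes stand in for the real JPEG contents; this is one possible way to produce the body, not necessarily how the file above was generated.

```python
import base64
import json


def is_base64(text: str) -> bool:
    """Return True if text decodes as strict base64 (no stray characters)."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False


def build_inline_request(image_bytes: bytes, mime_type: str, prompt: str) -> dict:
    """Build a generateContent body that embeds the image as base64 text."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }


# A URL contains characters such as ':' and '.', so it is not valid base64.
print(is_base64("https://pastebin.com/raw/kL4WEnnn"))

# Placeholder bytes; in practice, read the downloaded JPEG file in binary mode.
body = build_inline_request(b"placeholder image bytes", "image/jpeg", "What is this picture?")
print(json.dumps(body)[:80])
```

Writing json.dumps(body) to sampleRequestBody.txt would give a file usable with the curl command that follows.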

With curl, the command is as follows.

curl -s -X POST \
  -H "Content-Type: application/json" \
  -d @sampleRequestBody.txt \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key={your API key}"

Running this curl command returns the result shown in the "Testing" section.
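For readers who prefer a script over curl, the same POST can be prepared with Python's standard library. This is only a sketch: make_generate_request is a hypothetical helper name, "YOUR_API_KEY" is a placeholder, and the stand-in body omits the image part for brevity. The request is built but not sent, so the snippet runs without network access.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def make_generate_request(body: dict, api_key: str,
                          model: str = "gemini-1.5-flash-latest") -> urllib.request.Request:
    """Prepare (but do not send) the POST request that the curl command issues."""
    url = f"{API_ROOT}/{model}:generateContent?key={api_key}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Stand-in body; in practice, load the JSON saved in sampleRequestBody.txt.
sample_body = {"contents": [{"parts": [{"text": "What is this picture?"}]}]}
request = make_generate_request(sample_body, api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```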

Pattern 2:

This pattern uses fileData. In this case, follow the steps below.

1. Upload the image data to Gemini

curl "https://i.ibb.co/3mX1qcB/Document-sans-titre-page-0001.jpg" | curl --data-binary @- -X POST -H "Content-Type: image/jpeg" "https://generativelanguage.googleapis.com/upload/v1beta/files?uploadType=media&key={your API key}"

This returns the following result.

{
  "file": {
    "name": "files/###s",
    "mimeType": "image/jpeg",
    "sizeBytes": "1271543",
    "createTime": "2024-07-17T00:00:00.000000Z",
    "updateTime": "2024-07-17T00:00:00.000000Z",
    "expirationTime": "2024-07-19T00:00:00.000000Z",
    "sha256Hash": "###",
    "uri": "https://generativelanguage.googleapis.com/v1beta/files/###",
    "state": "ACTIVE"
  }
}

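Please copy the value of uri from the returned value.

2. Generate content

Using the value of uri, generate content as follows. The fileData property is used here.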

curl -s -X POST \
  -d '{"contents":[{"parts":[{"text":"What is this picture?"},{"fileData":{"mimeType":"image/jpeg","fileUri":"https://generativelanguage.googleapis.com/v1beta/files/###"}}]}]}' \
  -H "Content-Type: application/json" \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key={your API key}"
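Rather than copying the uri by hand, the two steps can be connected in code. The sketch below assumes the upload response has the shape shown in step 1; build_file_request is a hypothetical helper name, and the sample response reuses the masked "###" values from above rather than real identifiers.

```python
import json


def build_file_request(upload_response: dict, prompt: str) -> dict:
    """Turn the upload response from step 1 into a fileData request body."""
    file_info = upload_response["file"]
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {
                        "fileData": {
                            "mimeType": file_info["mimeType"],
                            "fileUri": file_info["uri"],
                        }
                    },
                ]
            }
        ]
    }


# Masked sample mirroring the upload response shown in step 1.
sample_upload = {
    "file": {
        "mimeType": "image/jpeg",
        "uri": "https://generativelanguage.googleapis.com/v1beta/files/###",
        "state": "ACTIVE",
    }
}
print(json.dumps(build_file_request(sample_upload, "What is this picture?")))
```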

Testing:

Both patterns return a result like the following. The generated text will vary between runs.

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "This image shows the Swagger documentation for a petstore API.  Swagger is a specification and toolset for describing, documenting, and consuming RESTful web services.  This particular documentation defines the endpoints and data structures for a petstore API.  It outlines how users can interact with the API to create, read, update, and delete pets, as well as manage their inventory."
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HATE_SPEECH",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        },
        {
          "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 263,
    "candidatesTokenCount": 76,
    "totalTokenCount": 339
  }
}
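To pull just the generated text out of a response shaped like the one above, something along these lines should work. extract_text is a hypothetical helper name, and the sketch assumes at least one candidate is present; error responses (such as the 400 in the question) have a different shape and are not handled here.

```python
def extract_text(response: dict) -> str:
    """Concatenate the text parts of the first candidate in a response."""
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


# Shortened sample with the same structure as the response above.
sample_response = {
    "candidates": [
        {
            "content": {
                "parts": [{"text": "This image shows the Swagger documentation for a petstore API."}],
                "role": "model",
            },
            "finishReason": "STOP",
        }
    ]
}
print(extract_text(sample_response))
```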
