I'm using Postman to get a response from the Gemini API with a prompt that includes an image. I'm sending the request to https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent with this body:
{
  "contents": [
    {
      "parts": [
        {"text": "What is this picture?"},
        {
          "inline_data": {
            "mime_type": "image/jpeg",
            "data": "https://i.ibb.co/3mX1qcB/Document-sans-titre-page-0001.jpg"
          }
        }
      ]
    }
  ]
}
The response I get is:
{
  "error": {
    "code": 400,
    "message": "Invalid value at 'contents[0].parts[1].inline_data.data' (TYPE_BYTES), Base64 decoding failed for \"https://pastebin.com/raw/kL4WEnnn\"",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "field": "contents[0].parts[1].inline_data.data",
            "description": "Invalid value at 'contents[0].parts[1].inline_data.data' (TYPE_BYTES), Base64 decoding failed for \"https://pastebin.com/raw/kL4WEnnn\""
          }
        ]
      }
    ]
  }
}
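I also tried converting the image to raw base64 text, uploading it to pastebin, and using the pastebin link in the request, but I still get the same error. Can anyone help?
Answer:
How about the following patterns?
Pattern 1: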
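This pattern uses inlineData. The data field is of type bytes, so it must hold the base64 string itself; a URL, even one that points to base64 text, cannot be decoded, which is what the error reports. In this case, the image data (https://i.ibb.co/3mX1qcB/Document-sans-titre-page-0001.jpg) has to be converted to base64 data. First, I created a text file containing the request body with the base64 data from the URL, as follows. The file name is sampleRequestBody.txt.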
{"contents":[{"parts":[{"text":"What is this picture?"},{"inline_data":{"mime_type":"image/jpeg","data":"{base64 data converted from image data}"}}]}]}
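The conversion into a request body can be sketched in Python. This is only an illustration under stated assumptions, not the method used in this answer: the function names build_inline_request and write_request_file are hypothetical, and the image is assumed to have been downloaded to a local file first.

```python
import base64
import json


def build_inline_request(image_bytes: bytes, prompt: str,
                         mime_type: str = "image/jpeg") -> str:
    """Return a generateContent request body with the image embedded as base64."""
    # The data field carries the base64 text itself, never a URL.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    body = {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {"mime_type": mime_type, "data": encoded}},
            ]
        }]
    }
    return json.dumps(body)


def write_request_file(image_path: str, out_path: str, prompt: str) -> None:
    """Read a local image (hypothetical path) and write the request body to out_path."""
    with open(image_path, "rb") as image_file:
        request_body = build_inline_request(image_file.read(), prompt)
    with open(out_path, "w", encoding="utf-8") as out_file:
        out_file.write(request_body)
```

Calling write_request_file("image.jpg", "sampleRequestBody.txt", "What is this picture?") with a local copy of the image would produce a file in the shape shown above.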
With curl, the command looks like this.
curl -s -X POST \
  -H "Content-Type: application/json" \
  -d @sampleRequestBody.txt \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key={your API key}"
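Running this curl command returns the result shown in the "Testing" section.
Pattern 2:
This pattern uses fileData. In this case, the flow is as follows.
1. Upload the image data to Gemini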
curl "https://i.ibb.co/3mX1qcB/Document-sans-titre-page-0001.jpg" | curl --data-binary @- -X POST -H "Content-Type: image/jpeg" "https://generativelanguage.googleapis.com/upload/v1beta/files?uploadType=media&key={your API key}"
This returns the following result.
{
  "file": {
    "name": "files/###s",
    "mimeType": "image/jpeg",
    "sizeBytes": "1271543",
    "createTime": "2024-07-17T00:00:00.000000Z",
    "updateTime": "2024-07-17T00:00:00.000000Z",
    "expirationTime": "2024-07-19T00:00:00.000000Z",
    "sha256Hash": "###",
    "uri": "https://generativelanguage.googleapis.com/v1beta/files/###",
    "state": "ACTIVE"
  }
}
Copy the value of uri from the returned value.
2. Generate content
Using the uri value, generate content as follows. Here, the fileData property is used.
curl -s -X POST \
  -d '{"contents":[{"parts":[{"text":"What is this picture?"},{"fileData":{"mimeType":"image/jpeg","fileUri":"https://generativelanguage.googleapis.com/v1beta/files/###"}}]}]}' \
  -H "Content-Type: application/json" \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key={your API key}"
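Instead of copying the uri by hand, the request body for step 2 could be derived from the upload response of step 1. The following is a minimal sketch, assuming the upload response has the shape shown in step 1; the function name build_file_request is hypothetical.

```python
import json


def build_file_request(upload_response: str, prompt: str) -> str:
    """Build a generateContent body that references an uploaded file by its URI."""
    # The upload response wraps the file metadata in a top-level "file" object.
    file_info = json.loads(upload_response)["file"]
    body = {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"fileData": {
                    "mimeType": file_info["mimeType"],
                    "fileUri": file_info["uri"],
                }},
            ]
        }]
    }
    return json.dumps(body)
```

The returned string can be passed as the -d payload of the generateContent request.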
Testing:
Both patterns return a result like the following. The generated text will vary from run to run.
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "This image shows the Swagger documentation for a petstore API. Swagger is a specification and toolset for describing, documenting, and consuming RESTful web services. This particular documentation defines the endpoints and data structures for a petstore API. It outlines how users can interact with the API to create, read, update, and delete pets, as well as manage their inventory."
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": [
        {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "probability": "NEGLIGIBLE"},
        {"category": "HARM_CATEGORY_HATE_SPEECH", "probability": "NEGLIGIBLE"},
        {"category": "HARM_CATEGORY_HARASSMENT", "probability": "NEGLIGIBLE"},
        {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "probability": "NEGLIGIBLE"}
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 263,
    "candidatesTokenCount": 76,
    "totalTokenCount": 339
  }
}
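To pull the generated text out of a response like the one above, something along these lines could work. It is a sketch that assumes the response follows the candidates / content / parts layout shown here; the function name extract_text is hypothetical, and it returns an empty string when no candidate is present (for example, for an error response).

```python
import json


def extract_text(response_json: str) -> str:
    """Concatenate the text parts of the first candidate in a generateContent response."""
    response = json.loads(response_json)
    candidates = response.get("candidates", [])
    if not candidates:
        # Error responses carry an "error" object instead of candidates.
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Non-text parts have no "text" key and contribute nothing.
    return "".join(part.get("text", "") for part in parts)
```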