The latest Gemini models, like Gemini 3.6 Flash, are available to use with Firebase AI Logic! Learn more.

All Imagen models will shut down as early as June 30, 2026. Learn about migrating your apps to use Nano Banana.

Gemini 2.5 models will shut down in October 2026. To avoid service disruptions, update to a newer model (like gemini-3.6-flash or gemini-3.1-flash-image). Any stable Gemini Live API 2.5 models are not impacted. Learn more.

Google 会使用 AI 技术将内容翻译成您偏好的语言。AI 翻译可能包含错误。

使用 Gemini API 分析图片文件

您可以要求 Gemini 模型分析您以内嵌方式（base64 编码）或通过网址提供的图片文件。使用 Firebase AI Logic, 时，您可以直接从应用发出此请求。

借助此功能，您可以执行以下操作：

创建图片说明或回答有关图片的问题
撰写有关图片的短篇故事或诗歌
检测图片中的对象并返回其边界框坐标
根据情感、风格或其他特征为一组图片添加标签或分类

本指南介绍如何根据图片输入生成文本，但您也可以根据图片输入生成图片。

跳转到代码示例

 跳转到流式响应的代码

如需了解使用图片的其他选项，请参阅其他指南
生成结构化输出多轮对话分析设备上的图片生成图片

准备工作

点击您的 Gemini API 提供商，以查看此页面上特定于提供商的内容和代码。

如果尚未完成，请完成入门指南，其中介绍了如何设置 Firebase 项目、将应用连接到 Firebase、添加 SDK、为所选的 Gemini API 提供方初始化后端服务，以及创建 GenerativeModel 实例。

如需测试和迭代提示，我们建议使用 Google AI Studio。

需要示例图片文件吗？

您可以使用此公开提供的文件，其 MIME 类型为 image/jpeg (查看或下载文件)。 https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg

支持此功能的模型

本指南介绍如何根据图片输入生成文本，适用于以下 Gemini模型：

gemini-3.1-pro-preview
gemini-3.6-flash（以及较旧的 gemini-3.5-flash）
gemini-3.5-flash-lite（以及较旧的 gemini-3.1-flash-lite）

通用 Gemini 2.5 模型支持此功能，但它们都已废弃。

注意 Firebase AI Logic 尚不支持配置输入媒体分辨率，但此功能即将推出！

根据图片文件（base64 编码）生成文本

在尝试此示例之前，请先完成本指南的准备工作部分，以设置您的项目和应用。
在该部分中，您还需要点击所选 Gemini API提供商的按钮，以便在此页面上看到特定于提供商的内容。

您可以要求 Gemini 模型通过使用文本和图片提示来生成文本，并提供每个输入文件的 mimeType 和文件本身。如需了解输入文件的要求和建议，请参阅本页面的后续内容。

Swift

您可以调用 generateContent() ，根据文本和图片的多模态输入生成文本。

单个文件输入


import FirebaseAILogic

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case.
let model = ai.generativeModel(modelName: "gemini-3.6-flash")


guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image, prompt)
print(response.text ?? "No text in response.")

多个文件输入


import FirebaseAILogic

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case.
let model = ai.generativeModel(modelName: "gemini-3.6-flash")


guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image1, image2, prompt)
print(response.text ?? "No text in response.")

Kotlin

您可以调用 generateContent() ，根据文本和图片的多模态输入生成文本。

^{对于 Kotlin，此 SDK 中的方法是挂起函数，需要从协程范围调用。}

单个文件输入


// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
                        .generativeModel("gemini-3.6-flash")


// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = model.generateContent(prompt)
print(response.text)

多个文件输入

^{对于 Kotlin，此 SDK 中的方法是挂起函数，需要从协程范围调用。}


// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
                        .generativeModel("gemini-3.6-flash")


// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
  image(bitmap1)
  image(bitmap2)
  text("What is different between these pictures?")
}

// To generate text output, call generateContent with the prompt
val response = model.generateContent(prompt)
print(response.text)

Java

您可以调用 generateContent() ，根据文本和图片的多模态输入生成文本。

^{对于 Java，此 SDK 中的方法会返回
ListenableFuture。}

单个文件输入


// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-3.6-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);


Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

多个文件输入


// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-3.6-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);


Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

您可以调用 generateContent() ，根据文本和图片的多模态输入生成文本。

单个文件输入


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer): Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service.
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case.
const model = getGenerativeModel(ai, { model: "gemini-3.6-flash" });


// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and image
  const result = await model.generateContent([prompt, imagePart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

多个文件输入


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer): Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service.
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case.
const model = getGenerativeModel(ai, { model: "gemini-3.6-flash" });


// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  // Prepare images for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To generate text output, call generateContent with the text and images
  const result = await model.generateContent([prompt, ...imageParts]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

您可以调用 generateContent() ，根据文本和图片的多模态输入生成文本。

单个文件输入


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
final model =
      FirebaseAI.googleAI().generativeModel(model: 'gemini-3.6-flash');


// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To generate text output, call generateContent with the text and image
final response = await model.generateContent([
  Content.multi([prompt,imagePart])
]);
print(response.text);

多个文件输入


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
final model =
      FirebaseAI.googleAI().generativeModel(model: 'gemini-3.6-flash');


final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...imageParts])
]);
print(response.text);

Unity

您可以调用 GenerateContentAsync() ，根据文本和图片的多模态输入生成文本。

单个文件输入


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service.
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case.
var model = ai.GetGenerativeModel(modelName: "gemini-3.6-flash");


// Convert a Texture2D into InlineDataParts
var grayImage = ModelContent.InlineData("image/png",
      UnityEngine.ImageConversion.EncodeToPNG(UnityEngine.Texture2D.grayTexture));

// Provide a text prompt to include with the image
var prompt = ModelContent.Text("What's in this picture?");

// To generate text output, call GenerateContentAsync and pass in the prompt
var response = await model.GenerateContentAsync(new [] { grayImage, prompt });
UnityEngine.Debug.Log(response.Text ?? "No text in response.");

多个文件输入


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service.
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case.
var model = ai.GetGenerativeModel(modelName: "gemini-3.6-flash");


// Convert Texture2Ds into InlineDataParts
var blackImage = ModelContent.InlineData("image/png",
      UnityEngine.ImageConversion.EncodeToPNG(UnityEngine.Texture2D.blackTexture));
var whiteImage = ModelContent.InlineData("image/png",
      UnityEngine.ImageConversion.EncodeToPNG(UnityEngine.Texture2D.whiteTexture));

// Provide a text prompt to include with the images
var prompt = ModelContent.Text("What's different between these pictures?");

// To generate text output, call GenerateContentAsync and pass in the prompt
var response = await model.GenerateContentAsync(new [] { blackImage, whiteImage, prompt });
UnityEngine.Debug.Log(response.Text ?? "No text in response.");

了解如何选择适合您的使用场景和应用的模型。

流式传输响应

您可以不等待模型生成完整结果，而是使用流式传输来处理部分结果，从而实现更快的互动。如需流式传输响应，请调用 generateContentStream。

查看示例：流式传输根据图片文件生成的文本

Swift

您可以调用 generateContentStream() ，根据文本和图片的多模态输入流式传输生成的文本。

单个文件输入


import FirebaseAILogic

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case.
let model = ai.generativeModel(modelName: "gemini-3.6-flash")


guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

多个文件输入


import FirebaseAILogic

// Initialize the Gemini Developer API backend service.
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case.
let model = ai.generativeModel(modelName: "gemini-3.6-flash")


guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image1, image2, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

您可以调用 generateContentStream() ，根据文本和图片的多模态输入流式传输生成的文本。

^{对于 Kotlin，此 SDK 中的方法是挂起函数，需要从协程范围调用。}

单个文件输入


// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
                        .generativeModel("gemini-3.6-flash")


// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
model.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

多个文件输入


// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
                        .generativeModel("gemini-3.6-flash")


// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
    image(bitmap1)
    image(bitmap2)
    text("What's different between these pictures?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
model.generateContentStream(prompt).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

Java

您可以调用 generateContentStream() ，根据文本和图片的多模态输入流式传输生成的文本。

^{对于 Java，此 SDK 中的流式传输方法会返回 Reactive Streams 库中的
Publisher类型。}

单个文件输入


// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-3.6-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);


Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content prompt = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

多个文件输入


// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-3.6-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);


Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        fullResponse[0] += chunk;
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onSubscribe(Subscription s) {
    }
});

Web

您可以调用 generateContentStream() ，根据文本和图片的多模态输入流式传输生成的文本。

单个文件输入


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer): Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service.
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case.
const model = getGenerativeModel(ai, { model: "gemini-3.6-flash" });


// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What do you see?";

  // Prepare image for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and image
  const result = await model.generateContentStream([prompt, imagePart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

多个文件输入


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer): Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service.
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case.
const model = getGenerativeModel(ai, { model: "gemini-3.6-flash" });


// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To stream generated text output, call generateContentStream with the text and images
  const result = await model.generateContentStream([prompt, ...imageParts]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

您可以调用 generateContentStream() ，根据文本和图片的多模态输入流式传输生成的文本。

单个文件输入


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
final model =
      FirebaseAI.googleAI().generativeModel(model: 'gemini-3.6-flash');


// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To stream generated text output, call generateContentStream with the text and image
final response = await model.generateContentStream([
  Content.multi([prompt,imagePart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

多个文件输入


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service.
// Create a `GenerativeModel` instance with a model that supports your use case.
final model =
      FirebaseAI.googleAI().generativeModel(model: 'gemini-3.6-flash');


final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To stream generated text output, call generateContentStream with the text and images
final response = await model.generateContentStream([
  Content.multi([prompt, ...imageParts])
]);
await for (final chunk in response) {
  print(chunk.text);
}

Unity

您可以调用 GenerateContentStreamAsync() ，根据文本和图片的多模态输入流式传输生成的文本。

单个文件输入


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service.
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case.
var model = ai.GetGenerativeModel(modelName: "gemini-3.6-flash");


// Convert a Texture2D into InlineDataParts
var gray = ModelContent.InlineData("image/png",
      UnityEngine.ImageConversion.EncodeToPNG(UnityEngine.Texture2D.grayTexture));

// Provide a text prompt to include with the image
var prompt = ModelContent.Text("What's in this picture?");

// To stream generated text output, call GenerateContentStreamAsync and pass in the prompt
var responseStream = model.GenerateContentStreamAsync(new [] { gray, prompt });
await foreach (var response in responseStream) {
  if (!string.IsNullOrWhiteSpace(response.Text)) {
    UnityEngine.Debug.Log(response.Text);
  }
}

多个文件输入


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service.
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case.
var model = ai.GetGenerativeModel(modelName: "gemini-3.6-flash");


// Convert Texture2Ds into InlineDataParts
var black = ModelContent.InlineData("image/png",
      UnityEngine.ImageConversion.EncodeToPNG(UnityEngine.Texture2D.blackTexture));
var white = ModelContent.InlineData("image/png",
      UnityEngine.ImageConversion.EncodeToPNG(UnityEngine.Texture2D.whiteTexture));

// Provide a text prompt to include with the images
var prompt = ModelContent.Text("What's different between these pictures?");

// To stream generated text output, call GenerateContentStreamAsync and pass in the prompt
var responseStream = model.GenerateContentStreamAsync(new [] { black, white, prompt });
await foreach (var response in responseStream) {
  if (!string.IsNullOrWhiteSpace(response.Text)) {
    UnityEngine.Debug.Log(response.Text);
  }
}

了解如何选择适合您的使用场景和应用的模型。

输入图片文件的要求和建议

请注意，以内嵌数据形式提供的文件在传输过程中会编码为 base64，这会增加请求的大小。如果请求过大，您会收到 HTTP 413 错误。

如需详细了解以下内容，请参阅“支持的输入文件和要求”页面：

在请求中提供文件的不同选项（以内嵌方式或使用文件的网址）
图片文件的要求和最佳实践

支持的图片 MIME 类型

Gemini 多模态模型支持以下图片 MIME 类型：

PNG - image/png
JPEG - image/jpeg
WebP - image/webp

每个请求的限制

对图片中的像素数量没有具体限制。不过，较大的图片会被缩小和填充，以适应最大分辨率 (3072 x 3072)，同时保留其原始宽高比。

每个请求的文件数上限：3,000 个图片文件

您还可以做什么？

了解如何在计算 token 数向模型发送长提示之前。
设置 Cloud Storage for Firebase ，以便您可以在多模态请求中包含大型文件，并获得更易于管理的方式来在提示中提供文件。文件可以包括图片、PDF、视频和音频。
开始考虑为正式版做准备（请参阅正式版核对清单）：
- 尽早设置 Firebase App Check ，以帮助保护 Gemini API 免遭未经授权的客户端滥用。
- 集成 Firebase Remote Config 以更新应用中的值（例如模型名称），而无需发布新应用版本。

试用其他功能

构建多轮对话（聊天）。
根据纯文本提示生成文本。
根据文本和多模态提示生成结构化输出（例如 JSON）。
根据文本和多模态提示生成和修改图片。
使用工具（例如函数调用和依托 Google 搜索进行接地）将 Gemini 模型连接到应用的其他部分以及外部系统和信息。

了解如何控制内容生成

了解提示设计，包括最佳实践、策略和示例提示。
配置模型参数，例如输出 token 数上限、重复输出 token 的概率等。
使用安全设置来调整获得可能被视为有害的响应的可能性。

您还可以使用 Google AI Studio 尝试提示和模型配置，甚至获取 Google AI Studio生成的代码段。

详细了解支持的模型

了解适用于各种使用场景的模型及其配额和定价。

提供反馈有关您的使用体验Firebase AI Logic

使用 Gemini API 分析图片文件 使用集合让一切井井有条 根据您的偏好保存内容并对其进行分类。

准备工作

支持此功能的模型

根据图片文件（base64 编码）生成文本

Swift

单个文件输入

多个文件输入

Kotlin

单个文件输入

多个文件输入

Java

单个文件输入

多个文件输入

Web

单个文件输入

多个文件输入

Dart

单个文件输入

多个文件输入

Unity

单个文件输入

多个文件输入

流式传输响应

查看示例：流式传输根据图片文件生成的文本

Swift

单个文件输入

多个文件输入

Kotlin

单个文件输入

多个文件输入

Java

单个文件输入

多个文件输入

Web

单个文件输入

多个文件输入

Dart

单个文件输入

多个文件输入

Unity

单个文件输入

多个文件输入

输入图片文件的要求和建议

支持的图片 MIME 类型

每个请求的限制

您还可以做什么？

试用其他功能

了解如何控制内容生成

详细了解支持的模型

使用 Gemini API 分析图片文件