

Generate text using the Gemini API

You can ask a Gemini model to generate text from a text-only prompt or a multimodal prompt. When you use Firebase AI Logic, you can make this request directly from your app.

Multimodal prompts can include multiple types of input (such as text along with images, PDFs, plain-text files, audio, and video).

This guide shows how to generate text from a text-only prompt and from a basic multimodal prompt that includes a file.


See other guides for additional options for working with text: Generate structured output, Multi-turn chat, Bidirectional streaming, Generate text on-device, Generate images from text.

Before you begin

Click your Gemini API provider to view provider-specific content and code on this page.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.

For testing and iterating on your prompts, we recommend using Google AI Studio.

Generate text from text-only input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask a Gemini model to generate text by prompting with text-only input.

Swift

You can call generateContent() to generate text from text-only input.


import FirebaseAILogic

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case
let model = ai.generativeModel(modelName: "gemini-2.5-flash")


// Provide a prompt that contains text
let prompt = "Write a story about a magic backpack."

// To generate text output, call generateContent with the text input
let response = try await model.generateContent(prompt)
print(response.text ?? "No text in response.")

Kotlin

You can call generateContent() to generate text from text-only input.

For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope.


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
                        .generativeModel("gemini-2.5-flash")


// Provide a prompt that contains text
val prompt = "Write a story about a magic backpack."

// To generate text output, call generateContent with the text input
val response = model.generateContent(prompt)
print(response.text)

Java

You can call generateContent() to generate text from text-only input.

For Java, the methods in this SDK return a ListenableFuture.


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);


// Provide a prompt that contains text
Content prompt = new Content.Builder()
    .addText("Write a story about a magic backpack.")
    .build();

// To generate text output, call generateContent with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

You can call generateContent() to generate text from text-only input.


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, { model: "gemini-2.5-flash" });


// Wrap in an async function so you can use await
async function run() {
  // Provide a prompt that contains text
  const prompt = "Write a story about a magic backpack.";

  // To generate text output, call generateContent with the text input
  const result = await model.generateContent(prompt);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

You can call generateContent() to generate text from text-only input.


import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
final model =
      FirebaseAI.googleAI().generativeModel(model: 'gemini-2.5-flash');


// Provide a prompt that contains text
final prompt = [Content.text('Write a story about a magic backpack.')];

// To generate text output, call generateContent with the text input
final response = await model.generateContent(prompt);
print(response.text);

Unity

You can call GenerateContentAsync() to generate text from text-only input.


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case
var model = ai.GetGenerativeModel(modelName: "gemini-2.5-flash");


// Provide a prompt that contains text
var prompt = "Write a story about a magic backpack.";

// To generate text output, call GenerateContentAsync with the text input
var response = await model.GenerateContentAsync(prompt);
UnityEngine.Debug.Log(response.Text ?? "No text in response.");

Learn how to choose a model appropriate for your use case and app.

Generate text from text-and-file (multimodal) input

You can ask a Gemini model to generate text by prompting with text and a file, providing each input file's mimeType and the file itself. Find requirements and recommendations for input files later on this page.

The following example shows the basics of how to generate text from a file input by analyzing a single video file provided as inline data (a base64-encoded file).

Note that this example shows providing the file inline, but the SDKs also support providing a YouTube URL.
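
As a rough illustration of what "inline data" means, the sketch below base64-encodes raw bytes and pairs them with a MIME type, using the same `{ inlineData: { data, mimeType } }` shape that the Web sample on this page builds. It assumes a Node.js environment (for `Buffer`); `toInlineDataPart` is a hypothetical helper name, not part of the SDK, and the five placeholder bytes stand in for real video data.

```javascript
// Build an inline-data part from raw bytes (Node.js sketch).
// `toInlineDataPart` is a hypothetical helper, not an SDK function.
function toInlineDataPart(bytes, mimeType) {
  if (!mimeType) {
    throw new Error("A MIME type is required for every input file.");
  }
  // Buffer.from accepts a Uint8Array, an ArrayBuffer, or another Buffer.
  const data = Buffer.from(bytes).toString("base64");
  return { inlineData: { data, mimeType } };
}

// A few placeholder bytes standing in for real video data.
const part = toInlineDataPart(
  new Uint8Array([104, 101, 108, 108, 111]),
  "video/mp4"
);
console.log(part.inlineData.mimeType); // video/mp4
console.log(part.inlineData.data);     // aGVsbG8=
```

In a browser you would typically obtain the bytes from a File object instead, as the Web sample does with FileReader.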

Need a sample video file?

You can use this publicly available file with a MIME type of video/mp4 (view or download the file): https://storage.googleapis.com/cloud-samples-data/video/animals.mp4

Swift

You can call generateContent() to generate text from multimodal input of text and video files.


import FirebaseAILogic

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())

// Create a `GenerativeModel` instance with a model that supports your use case
let model = ai.generativeModel(modelName: "gemini-2.5-flash")


// Provide the video as `Data` with the appropriate MIME type.
let video = InlineDataPart(data: try Data(contentsOf: videoURL), mimeType: "video/mp4")

// Provide a text prompt to include with the video
let prompt = "What is in the video?"

// To generate text output, call generateContent with the text and video
let response = try await model.generateContent(video, prompt)
print(response.text ?? "No text in response.")

Kotlin

You can call generateContent() to generate text from multimodal input of text and video files.

For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope.


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
val model = Firebase.ai(backend = GenerativeBackend.googleAI())
                        .generativeModel("gemini-2.5-flash")


val contentResolver = applicationContext.contentResolver
contentResolver.openInputStream(videoUri).use { stream ->
  stream?.let {
    val bytes = stream.readBytes()

    // Provide a prompt that includes the video specified above and text
    val prompt = content {
        inlineData(bytes, "video/mp4")
        text("What is in the video?")
    }

    // To generate text output, call generateContent with the prompt
    val response = model.generateContent(prompt)
    Log.d(TAG, response.text ?: "")
  }
}

Java

You can call generateContent() to generate text from multimodal input of text and video files.

For Java, the methods in this SDK return a ListenableFuture.


// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash");

// Use the GenerativeModelFutures Java compatibility layer which offers
// support for ListenableFuture and Publisher APIs
GenerativeModelFutures model = GenerativeModelFutures.from(ai);


ContentResolver resolver = getApplicationContext().getContentResolver();
try (InputStream stream = resolver.openInputStream(videoUri)) {
    if (stream != null) {
        // Read the entire stream. A single read() call may return fewer bytes
        // than requested, and a content URI can't be opened as a File.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int bytesRead;
        while ((bytesRead = stream.read(chunk)) != -1) {
            buffer.write(chunk, 0, bytesRead);
        }
        byte[] videoBytes = buffer.toByteArray();

        // Provide a prompt that includes the video specified above and text
        Content prompt = new Content.Builder()
                .addInlineData(videoBytes, "video/mp4")
                .addText("What is in the video?")
                .build();

        // To generate text output, call generateContent with the prompt
        ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
        Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        }, executor);
    }
} catch (IOException e) {
    e.printStackTrace();
}

Web

You can call generateContent() to generate text from multimodal input of text and video files.


import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, { model: "gemini-2.5-flash" });


// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the video
  const prompt = "What do you see?";

  const fileInputEl = document.querySelector("input[type=file]");
  const videoPart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and video
  const result = await model.generateContent([prompt, videoPart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

You can call generateContent() to generate text from multimodal input of text and video files.


import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a model that supports your use case
final model =
      FirebaseAI.googleAI().generativeModel(model: 'gemini-2.5-flash');


// Provide a text prompt to include with the video
final prompt = TextPart("What's in the video?");

// Prepare video for input
final video = await File('video0.mp4').readAsBytes();

// Provide the video as bytes with the appropriate MIME type
final videoPart = InlineDataPart('video/mp4', video);

// To generate text output, call generateContent with the text and video
final response = await model.generateContent([
  Content.multi([prompt, videoPart])
]);
print(response.text);

Unity

You can call GenerateContentAsync() to generate text from multimodal input of text and video files.


using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
var ai = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI());

// Create a `GenerativeModel` instance with a model that supports your use case
var model = ai.GetGenerativeModel(modelName: "gemini-2.5-flash");


// Provide the video as `data` with the appropriate MIME type.
var video = ModelContent.InlineData("video/mp4",
      System.IO.File.ReadAllBytes(System.IO.Path.Combine(
          UnityEngine.Application.streamingAssetsPath, "yourVideo.mp4")));

// Provide a text prompt to include with the video
var prompt = ModelContent.Text("What is in the video?");

// To generate text output, call GenerateContentAsync with the text and video
var response = await model.GenerateContentAsync(new [] { video, prompt });
UnityEngine.Debug.Log(response.Text ?? "No text in response.");

Learn how to choose a model appropriate for your use case and app.

Stream the response

You can achieve faster interactions by not waiting for the entire result from the model generation, and instead using streaming to handle partial results. To stream the response, call generateContentStream.
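
The sketch below shows one way to consume partial results as they arrive. It assumes, as is the case in the Web SDK to the best of our knowledge, that `generateContentStream` resolves to an object whose `stream` property is an async iterable of chunks, each exposing a `text()` method. `collectStreamedText` and `fakeStream` are hypothetical names used only for illustration; the fake stream stands in for a real streamed response so that the snippet runs without a Firebase project.

```javascript
// Collect streamed chunks into one string, reporting each partial result.
// `collectStreamedText` is a hypothetical helper, not an SDK function.
async function collectStreamedText(stream, onChunk = () => {}) {
  let fullText = "";
  for await (const chunk of stream) {
    const piece = chunk.text();
    fullText += piece;
    onChunk(piece); // e.g. append the partial text to the UI
  }
  return fullText;
}

// Stand-in for a streamed response: yields chunk-like objects with text().
async function* fakeStream() {
  for (const piece of ["Once ", "upon ", "a time."]) {
    yield { text: () => piece };
  }
}

collectStreamedText(fakeStream(), (piece) => console.log("partial:", piece))
  .then((text) => console.log("full:", text));
```

With a real model, under the assumption above, you would pass `(await model.generateContentStream(prompt)).stream` in place of `fakeStream()`.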