For mobile and web apps, the Vertex AI in Firebase SDKs let you interact with the supported Gemini models directly from your app.
Gemini models are considered multimodal because they're capable of processing and even generating multiple modalities, including text, code, PDFs, images, video, and audio.
Here's a brief overview of supported models for Vertex AI in Firebase and their latest stable versions. The sections later on this page provide more detailed comparisons and information.
Model | Input | Output | Optimized for |
---|---|---|---|
Gemini models | |||
Gemini 2.0 Flash (gemini-2.0-flash-001) | text, code, PDFs, images, video, audio | text, code, JSON (images & audio coming soon!) | Next generation features, speed, and multimodal generation for a diverse variety of tasks |
Gemini 1.5 Pro (gemini-1.5-pro-002) | text, code, PDFs, images, video, audio | text, code, JSON | Complex reasoning tasks requiring more intelligence |
Gemini 1.5 Flash (gemini-1.5-flash-002) | text, code, PDFs, images, video, audio | text, code, JSON | Fast and versatile performance across a diverse variety of tasks |
Vertex AI in Firebase supports all Gemini models, including these older models:
Model | Input | Output | Optimized for |
---|---|---|---|
Gemini 1.0 Pro Vision (gemini-1.0-pro-vision-001) | text, code, PDFs, images, video (frames only) | text, code | Handles text, images, and video for text or code responses. Cannot be used for chat. |
Gemini 1.0 Pro (gemini-1.0-pro-002) | text, code | text, code | Natural language tasks, multi-turn text and code chat, and code generation |
At the bottom of this page, you can view detailed information about older models. Review our FAQ about all the models that Vertex AI in Firebase supports and does not support.
The remainder of this page provides detailed information about the models supported by Vertex AI in Firebase:
- Comparison of the models:
  - Supported input and output
  - High-level comparison of the supported capabilities
  - Specifications and limitations, for example max input tokens or max length of input video
- Description of how models are versioned, specifically their stable, auto-updated, and preview versions
- Lists of available model names to include in your code during initialization
- Lists of supported languages for the models
Compare models
Each model has different capabilities to support various use cases. Note that each table in this section describes the models as used with Vertex AI in Firebase. Each model might have additional capabilities that aren't available when using our SDKs.
You can learn more about each of the Gemini models in the Google Cloud documentation.
Supported input and output
These are the supported input and output types when using each model with Vertex AI in Firebase:
Gemini 2.0 Flash | Gemini 1.5 Pro | Gemini 1.5 Flash | ||
---|---|---|---|---|
Input types | ||||
Text | ||||
Code | ||||
Documents (PDFs or plain-text) | ||||
Images, Video, and Audio | ||||
Audio (streaming) | coming soon! | |||
Output types | ||||
Text | ||||
Structured output (like JSON) | ||||
Code | ||||
Images | coming soon! | |||
Audio | coming soon! | |||
Audio (streaming) | coming soon! |
To learn about supported file types, see Supported input files and requirements for the Vertex AI Gemini API.
Supported capabilities and features
These are the supported capabilities and features when using each model with Vertex AI in Firebase:
Gemini 2.0 Flash | Gemini 1.5 Pro | Gemini 1.5 Flash | ||
---|---|---|---|---|
Generate text from text or multimodal inputs | ||||
Generate images | coming soon! | |||
Generate audio | coming soon! | |||
Generate structured output (like JSON) | ||||
Analyze images and video (vision) | ||||
Analyze audio | ||||
Analyze documents (PDFs or plain-text) | ||||
Multi-turn chat | ||||
Function calling (tools) | ||||
Basic function calling | ||||
Parallel function calling | ||||
Function calling mode | ||||
Count tokens and billable characters | ||||
System instructions | ||||
Multimodal Live API (bidirectional streaming) | coming soon! |
Specifications and limitations
These are the specifications and limitations when using each model with Vertex AI in Firebase:
Property | Gemini 2.0 Flash | Gemini 1.5 Pro | Gemini 1.5 Flash |
---|---|---|---|
Context window * Total token limit (combined input+output) | 1,048,576 tokens | 2,097,152 tokens | 1,048,576 tokens |
Output token limit * | 8,192 tokens | 8,192 tokens | 8,192 tokens |
Knowledge cutoff date | June 2024 | May 2024 | May 2024 |
Images (per request) | |||
Max number of input images | 3,000 images | 3,000 images | 3,000 images |
Max number of output images | coming soon! | --- | --- |
Max size per input base64-encoded image | 7 MB | 7 MB | 7 MB |
PDFs (per request) | |||
Max number of input PDF files ** | 3,000 files | 3,000 files | 3,000 files |
Max number of pages per input PDF file ** | 1,000 pages | 1,000 pages | 1,000 pages |
Max size per input PDF file | 50 MB | 50 MB | 50 MB |
Video (per request) | |||
Max number of input video files | 10 files | 10 files | 10 files |
Max length of all input video (frames only) | ~60 minutes | ~60 minutes | ~60 minutes |
Max length of all input video (frames+audio) | ~45 minutes | ~45 minutes | ~45 minutes |
Audio (per request) | |||
Max number of input audio files | 1 file | 1 file | 1 file |
Max number of output audio files | coming soon! | --- | --- |
Max length of all input audio | ~8.4 hours | ~8.4 hours | ~8.4 hours |
Max length of all output audio | coming soon! | --- | --- |
* For all models, a token is equivalent to about 4 characters, so 100 tokens are about 60-80 English words. For Gemini models, you can determine the total count of tokens in your requests using countTokens.
** PDFs are treated as images, so a single page of a PDF is treated as one image. The number of pages allowed in a request is limited to the number of images the model can support.
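The 4-characters-per-token rule of thumb can be turned into a quick pre-flight estimate. The following is a minimal sketch in plain Python, independent of the SDKs; the helper names `estimate_tokens` and `fits_context_window` are hypothetical, and the result is only an approximation. Use countTokens when you need the actual count.

```python
import math

# Context window sizes (combined input + output tokens) from the table above,
# keyed by auto-updated model name.
CONTEXT_WINDOW_TOKENS = {
    "gemini-2.0-flash": 1_048_576,
    "gemini-1.5-pro": 2_097_152,
    "gemini-1.5-flash": 1_048_576,
}
OUTPUT_TOKEN_LIMIT = 8_192  # same for all three models in the table
CHARS_PER_TOKEN = 4  # rough heuristic from the footnote, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count at about 4 characters per token."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context_window(text: str, model: str,
                        reserved_output: int = OUTPUT_TOKEN_LIMIT) -> bool:
    """Check that the estimated input plus reserved output fits the window."""
    return estimate_tokens(text) + reserved_output <= CONTEXT_WINDOW_TOKENS[model]
```

Reserving the full output limit by default is a conservative choice for this sketch; pass a smaller `reserved_output` if your responses are capped lower.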
Find additional detailed information
Quotas and pricing are different for each model. Pricing also depends on input and output.
Learn about supported input file types, how to specify MIME type, and how to make sure that your input files and multimodal requests meet the requirements and follow best practices in Supported input files and requirements for the Vertex AI Gemini API.
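As an illustration of checking a request against the per-request limits in the Specifications and limitations table, here is a minimal sketch in plain Python. The `RequestMedia` type and `find_limit_violations` function are hypothetical and not part of any SDK. The sketch checks images and PDF pages against the image limit separately, because this page doesn't say whether they share one budget.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Per-request limits copied from the specifications table
# (identical for Gemini 2.0 Flash, 1.5 Pro, and 1.5 Flash).
MAX_INPUT_IMAGES = 3_000
MAX_PDF_FILES = 3_000
MAX_PAGES_PER_PDF = 1_000
MAX_VIDEO_FILES = 10
MAX_AUDIO_FILES = 1


@dataclass
class RequestMedia:
    """Hypothetical summary of the media attached to one request."""
    image_count: int = 0
    pdf_page_counts: Tuple[int, ...] = ()  # one entry per PDF file
    video_file_count: int = 0
    audio_file_count: int = 0


def find_limit_violations(media: RequestMedia) -> List[str]:
    """Return a message for every per-request limit the media exceeds."""
    problems = []
    if media.image_count > MAX_INPUT_IMAGES:
        problems.append("too many input images")
    if len(media.pdf_page_counts) > MAX_PDF_FILES:
        problems.append("too many PDF files")
    if any(pages > MAX_PAGES_PER_PDF for pages in media.pdf_page_counts):
        problems.append("a PDF file has too many pages")
    # Each PDF page counts as one image, so total pages are capped by the image limit.
    if sum(media.pdf_page_counts) > MAX_INPUT_IMAGES:
        problems.append("too many PDF pages in total")
    if media.video_file_count > MAX_VIDEO_FILES:
        problems.append("too many video files")
    if media.audio_file_count > MAX_AUDIO_FILES:
        problems.append("too many audio files")
    return problems
```

An empty list means none of these count limits is exceeded. File size and media duration limits are not covered by this sketch.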
Model versioning and naming patterns
Models are offered in stable, auto-updated, and preview versions.
Stable versions are considered Generally Available.
- Stable versions have model names appended with a specific three-digit version number, for example gemini-2.0-flash-001.
Auto-updated versions always point to the latest stable version of that model; if a new stable version is released, the auto-updated version automatically starts pointing to that new stable version.
- Auto-updated versions have model names with no appendage, for example gemini-2.0-flash.
Preview versions have new capabilities and are considered not stable. Note that preview versions always point to the latest preview version of that model; if a new preview version is released, any existing preview version automatically starts pointing to that new preview version.
- Preview versions have model names appended with -preview along with the model's initial release date (-MMDD), for example gemini-1.5-pro-preview-0409 (released on April 9, 2024).
Learn more about the available model versions and their lifecycle (Gemini) in the Google Cloud documentation.
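The naming patterns above are regular enough to detect mechanically. The following plain-Python sketch classifies a model name by its suffix. The function names `classify_model_name` and `base_model_name` are hypothetical, and the regular expressions assume only the three patterns described in this section.

```python
import re

# Suffix patterns taken from the naming rules described above.
_PREVIEW_SUFFIX = re.compile(r"-preview-\d{4}$")  # e.g. -preview-0409 (MMDD)
_STABLE_SUFFIX = re.compile(r"-\d{3}$")           # e.g. -001


def classify_model_name(name: str) -> str:
    """Return "preview", "stable", or "auto-updated" based on the name's suffix."""
    if _PREVIEW_SUFFIX.search(name):
        return "preview"
    if _STABLE_SUFFIX.search(name):
        return "stable"
    # No version or preview suffix: the name is an auto-updated alias.
    return "auto-updated"


def base_model_name(name: str) -> str:
    """Strip a stable or preview suffix, leaving the auto-updated name."""
    return re.sub(r"-(?:\d{3}|preview-\d{4})$", "", name)
```

For example, `classify_model_name("gemini-2.0-flash-001")` returns `"stable"`, and `base_model_name("gemini-1.5-pro-preview-0409")` returns `"gemini-1.5-pro"`.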
Available model names
Model names are the explicit values that you include in your code during initialization of the generative model (which is a required step to call the Gemini API).
You can use the publishers.models.list endpoint to list all available model names. Note that this returned list will include all models that Vertex AI supports, but Vertex AI in Firebase only supports the Gemini models described on this page.
Also note that auto-updated versions (for example, gemini-2.0-flash) aren't listed because they're a convenience alias for the base stable model.
Gemini model names
For initialization examples for your language, see the getting started guide.
Gemini 2.0 Flash model names
Model name | Description | Release stage | Initial release date | Discontinuation date |
---|---|---|---|---|
Stable versions | ||||
gemini-2.0-flash-001 | Latest stable version of Gemini 2.0 Flash | General Availability | 2025-02-05 | To be determined |
Auto-updated version | ||||
gemini-2.0-flash | Points to the latest stable version of 2.0 Flash (currently gemini-2.0-flash-001) | General Availability | 2025-02-10 | --- |
Gemini 1.5 Pro model names
Model name | Description | Release stage | Initial release date | Discontinuation date |
---|---|---|---|---|
Stable versions | ||||
gemini-1.5-pro-002 | Latest stable version of Gemini 1.5 Pro | General Availability | 2024-09-24 | No earlier than 2025-09-24 |
gemini-1.5-pro-001 | Initial stable version of Gemini 1.5 Pro | General Availability | 2024-05-24 | No earlier than 2025-05-24 |
Auto-updated version | ||||
gemini-1.5-pro | Points to the latest stable version of 1.5 Pro (currently gemini-1.5-pro-002) | General Availability | 2024-09-24 | --- |
Gemini 1.5 Flash model names
Model name | Description | Release stage | Initial release date | Discontinuation date |
---|---|---|---|---|
Stable versions | ||||
gemini-1.5-flash-002 | Latest stable version of Gemini 1.5 Flash | General Availability | 2024-09-24 | No earlier than 2025-09-24 |
gemini-1.5-flash-001 | Initial stable version of Gemini 1.5 Flash | General Availability | 2024-05-24 | No earlier than 2025-05-24 |
Auto-updated version | ||||
gemini-1.5-flash | Points to the latest stable version of 1.5 Flash (currently gemini-1.5-flash-002) | General Availability | 2024-09-24 | --- |
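The model name tables above can be captured as a small lookup. This plain-Python sketch resolves an auto-updated alias to the stable version it currently points to and checks whether a given date has reached a version's earliest possible discontinuation date. The dictionaries are a snapshot of the tables on this page and will go stale as new versions are released. The function names are hypothetical.

```python
from datetime import date
from typing import Dict, Optional

# Snapshot of "currently points to" from the auto-updated rows above.
AUTO_UPDATED_ALIASES: Dict[str, str] = {
    "gemini-2.0-flash": "gemini-2.0-flash-001",
    "gemini-1.5-pro": "gemini-1.5-pro-002",
    "gemini-1.5-flash": "gemini-1.5-flash-002",
}

# "No earlier than" discontinuation dates; None means "To be determined".
EARLIEST_DISCONTINUATION: Dict[str, Optional[date]] = {
    "gemini-2.0-flash-001": None,
    "gemini-1.5-pro-002": date(2025, 9, 24),
    "gemini-1.5-pro-001": date(2025, 5, 24),
    "gemini-1.5-flash-002": date(2025, 9, 24),
    "gemini-1.5-flash-001": date(2025, 5, 24),
}


def resolve_model_name(name: str) -> str:
    """Map an auto-updated alias to its stable version; return other names as-is."""
    return AUTO_UPDATED_ALIASES.get(name, name)


def may_be_discontinued(name: str, on: date) -> bool:
    """True if `on` is at or past the earliest discontinuation date.

    Raises KeyError for model names that aren't in the snapshot.
    """
    earliest = EARLIEST_DISCONTINUATION[resolve_model_name(name)]
    return earliest is not None and on >= earliest
```

A `True` result only means that discontinuation is possible on that date, since the tables list the dates as "no earlier than".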
Supported languages
Gemini
All the Gemini models can understand and respond in the following languages:
Arabic (ar), Bengali (bn), Bulgarian (bg), Chinese simplified and traditional (zh), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hebrew (iw), Hindi (hi), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Latvian (lv), Lithuanian (lt), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Serbian (sr), Slovak (sk), Slovenian (sl), Spanish (es), Swahili (sw), Swedish (sv), Thai (th), Turkish (tr), Ukrainian (uk), Vietnamese (vi)
Gemini 1.5 Pro and Gemini 1.5 Flash models can understand and respond in the following additional languages:
Afrikaans (af), Amharic (am), Assamese (as), Azerbaijani (az), Belarusian (be), Bosnian (bs), Catalan (ca), Cebuano (ceb), Corsican (co), Welsh (cy), Dhivehi (dv), Esperanto (eo), Basque (eu), Persian (fa), Filipino (Tagalog) (fil), Frisian (fy), Irish (ga), Scots Gaelic (gd), Galician (gl), Gujarati (gu), Hausa (ha), Hawaiian (haw), Hmong (hmn), Haitian Creole (ht), Armenian (hy), Igbo (ig), Icelandic (is), Javanese (jv), Georgian (ka), Kazakh (kk), Khmer (km), Kannada (kn), Krio (kri), Kurdish (ku), Kyrgyz (ky), Latin (la), Luxembourgish (lb), Lao (lo), Malagasy (mg), Maori (mi), Macedonian (mk), Malayalam (ml), Mongolian (mn), Meiteilon (Manipuri) (mni-Mtei), Marathi (mr), Malay (ms), Maltese (mt), Myanmar (Burmese) (my), Nepali (ne), Nyanja (Chichewa) (ny), Odia (Oriya) (or), Punjabi (pa), Pashto (ps), Sindhi (sd), Sinhala (Sinhalese) (si), Samoan (sm), Shona (sn), Somali (so), Albanian (sq), Sesotho (st), Sundanese (su), Tamil (ta), Telugu (te), Tajik (tg), Uyghur (ug), Urdu (ur), Uzbek (uz), Xhosa (xh), Yiddish (yi), Yoruba (yo), Zulu (zu)
Information about older models
Vertex AI in Firebase supports all Gemini models, including older models like Gemini 1.0 Pro and Gemini 1.0 Pro Vision. However, we strongly recommend using a newer model with our SDKs. These older Gemini models are approaching their discontinuation date and don't offer all the capabilities of the newer models.
These are the input and output types when using each model with Vertex AI in Firebase:
Gemini 1.0 Pro Vision | Gemini 1.0 Pro | |||
---|---|---|---|---|
Input types | ||||
Text | ||||
Code | ||||
Image | ||||
Documents (PDFs or plain text) | ||||
Video (frames only) | ||||
Video (frames+audio) | ||||
Audio | ||||
Output types | ||||
Text | ||||
Code | ||||
Images, Video, and Audio |
These are the capabilities and features when using each model with Vertex AI in Firebase:
Gemini 1.0 Pro Vision | Gemini 1.0 Pro | ||
---|---|---|---|
Generate text from text-only input | |||
Generate text from multimodal input | |||
Generate images, video, or audio | |||
Generate structured output (like JSON) using response schema | |||
Multi-turn chat | |||
Function calling (tools) | |||
Basic function calling | |||
Parallel function calling | |||
Function calling mode | |||
Count tokens and billable characters | |||
System instructions |
These are the specifications and limitations when using each model with Vertex AI in Firebase:
Property | Gemini 1.0 Pro Vision | Gemini 1.0 Pro |
---|---|---|
Context window * Total token limit (combined input+output) | 16,384 tokens | 32,760 tokens |
Output token limit * | 2,048 tokens | 8,192 tokens |
Knowledge cutoff date | February 2023 | February 2023 |
Images (per request) | ||
Max number of input images | 16 images | --- |
Max size per base64-encoded input image | 7 MB | --- |
PDFs (per request) | ||
Max number of input PDF files ** | 16 files | --- |
Max number of pages per input PDF file ** | 16 pages | --- |
Max size per input PDF file | 50 MB | --- |
Video (per request) | ||
Max number of input video files | 1 file | --- |
Max length of all input video (frames only) | 2 minutes | --- |
Max length of all input video (frames+audio) | --- | --- |
Audio (per request) | ||
Max number of input audio files | --- | --- |
Max length of all input audio | --- | --- |
* For all models, a token is equivalent to about 4 characters, so 100 tokens are about 60-80 English words. For Gemini models, you can determine the total count of tokens in your requests using countTokens.
** PDFs are treated as images, so a single page of a PDF is treated as one image. The number of pages allowed in a request is limited to the number of images the model can support.
Gemini 1.0 Pro Vision model names
Model name | Description | Release stage | Initial release date | Discontinuation date |
---|---|---|---|---|
Stable versions | ||||
gemini-1.0-pro-vision-001 | Latest stable version of Gemini 1.0 Pro Vision | General Availability | 2024-02-15 | No earlier than 2025-02-15 |
Auto-updated version | ||||
gemini-1.0-pro-vision | Points to the latest stable version of 1.0 Pro Vision (currently gemini-1.0-pro-vision-001) | General Availability | 2024-01-04 | --- |
Gemini 1.0 Pro model names
Model name | Description | Release stage | Initial release date | Discontinuation date |
---|---|---|---|---|
Stable versions | ||||
gemini-1.0-pro-002 | Latest stable version of Gemini 1.0 Pro | General Availability | 2024-04-09 | No earlier than 2025-04-09 |
gemini-1.0-pro-001 | Stable version of Gemini 1.0 Pro | General Availability | 2024-02-15 | No earlier than 2025-02-15 |
Auto-updated version | ||||
gemini-1.0-pro | Points to the latest stable version of 1.0 Pro (currently gemini-1.0-pro-002) | General Availability | 2024-02-15 | --- |
Next steps
Try out the capabilities of the Gemini API
- Build multi-turn conversations (chat).
- Generate text from text-only prompts.
- Generate text from multimodal prompts (including text, images, PDFs, video, and audio).
- Generate structured output (like JSON) from both text and multimodal prompts.
- Use function calling to connect generative models to external systems and information.