Last updated (UTC): 2025-08-19.

Gemini models process input and output in units called *tokens*.

Tokens can be single characters like `z` or whole words like `cat`. Long words
are broken up into several tokens. The set of all tokens used by the model is
called the vocabulary, and the process of splitting text into tokens is called
*tokenization*.

For Gemini models, a token is equivalent to about 4 characters.
100 tokens is equal to about 60-80 English words.

Each model has a
[maximum number of tokens](/docs/ai-logic/models#specs-and-limitations-comparison)
that it can handle in a prompt and response. Knowing the token count of your
prompt lets you check whether you've exceeded this limit. Additionally, the cost
of a request is determined in part by the number of input and output tokens, so
knowing how to count tokens can be helpful.

| **Tip:** To control the number of tokens used for generating a response (and thus control costs), you can set the [thinking budget](/docs/ai-logic/thinking) (2.5 models only) and `maxOutputTokens` (all Gemini models) in the [model's configuration](/docs/ai-logic/model-parameters#gemini).

Note that Gemini 1.0 and 1.5 models also supported a
"billable characters" count and pricing, but because those models are all either
retired or soon to be retired, this page doesn't describe billable characters.

## Supported models

- `gemini-2.5-pro`
- `gemini-2.5-flash`
- `gemini-2.5-flash-lite`
- `gemini-2.0-flash-001` (and its auto-updated alias `gemini-2.0-flash`)
- `gemini-2.0-flash-lite-001` (and its auto-updated alias `gemini-2.0-flash-lite`)
- `gemini-2.0-flash-preview-image-generation`

| **Note**: Although all generative models process input and output as tokens, this page and its token counting options apply only to the *Gemini* models listed above. Specifically, note that the Gemini 2.0 Flash Live model is not supported.
|
| For Imagen models, pricing and limits aren't based on tokens.

## Options for counting tokens

All input and output for the Gemini API is tokenized, including text, image
files, and other non-text modalities. Here are the options for counting tokens:

Check the token count for your *requests only* (before sending them to the model)
: Call `countTokens` with the input of the request
  *before* sending it to the model. This returns:

  - `total_tokens`: token count of the *input only*

Check the token count for *both your requests and responses*
: Use the `usageMetadata` attribute on the response object.
  This includes:

  - `prompt_token_count`: token count of the input only
  - `candidates_token_count`: token count of the output only (does not include thinking tokens)
  - `thoughts_token_count`: token count of any thinking tokens used to generate the response
  - `total_token_count`: total count of tokens for *both* the input and the output (includes any thinking tokens)

  When streaming output, the `usageMetadata` attribute only
  appears on the last chunk of the stream. It's `nil` for
  intermediate chunks.

Note the following points about the options above:

- They will *not* count the number of input images or the number of seconds in video or audio input files. However, the token count for each of these modalities *correlates* with these values.
- The input token count includes the prompt (text and any input files) as well as any system instructions and tools.
- The output token count does not include any thinking tokens; those are provided in a separate field.
- Review the [additional information specific to each type of request](#additional-information) later on this page.

### Pricing for these options

- Calling `countTokens`: There's no charge for calling `countTokens`
  (the Count Tokens API). The maximum quota for the Count Tokens API is 3000
  requests per minute (RPM).

- Using the `usageMetadata` attribute: This attribute is always returned as
  part of the response and doesn't incur any tokens or charges itself.

## Additional information

Here's some additional information for working with specific types of requests.

### Count text input tokens

No additional information.

### Count multi-turn (chat) tokens

Note the following when calling `countTokens` for chat:

- If you call `countTokens` with the chat history, it returns the total token count from both roles in the chat (`total_tokens`).
- To estimate how big your next conversational turn will be, append it to the history when you call `countTokens`.

### Count multimodal input tokens

Note the following points about counting tokens with multimodal input:

- You can optionally call `countTokens` on the text and the file separately.
- For both token counting options, you'll get the same token count whether you provide the file as inline data or by its URL.

#### Image input files

Image input files are converted to tokens based on their dimensions:

- Image inputs with *both* dimensions less than or equal to 384 pixels: each image is counted as 258 tokens.
- Image inputs that are larger in one or both dimensions: each image is cropped and scaled as needed into tiles of 768x768 pixels, and then each tile is counted as 258 tokens.

#### Video and audio input files

Video and audio input files are converted to tokens at the following fixed
rates:

- Video: 263 tokens per second
- Audio: 32 tokens per second

#### Document (like PDF) input files

PDF input files are treated as images, so each page of a PDF is tokenized in the
same way as an image.
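The fixed per-modality rates above can be turned into a rough back-of-the-envelope estimator for multimodal inputs. The Python sketch below is only an approximation: the ceil-based tile count for large images is an assumption (the model's exact crop-and-scale step isn't specified here), so treat `countTokens` and `usageMetadata` as the source of truth for real counts.

```python
import math

# Fixed per-modality rates from this page.
IMAGE_TOKENS_PER_TILE = 258    # per small image, or per 768x768 tile
VIDEO_TOKENS_PER_SECOND = 263
AUDIO_TOKENS_PER_SECOND = 32


def estimate_image_tokens(width_px: int, height_px: int) -> int:
    """Estimate tokens for one image input.

    Images with both dimensions <= 384 px count as a single 258-token unit.
    Larger images are tiled into 768x768 tiles; the ceil-based tile count
    below is an assumption, since the exact crop/scale step is model-internal.
    """
    if width_px <= 384 and height_px <= 384:
        return IMAGE_TOKENS_PER_TILE
    tiles = math.ceil(width_px / 768) * math.ceil(height_px / 768)
    return tiles * IMAGE_TOKENS_PER_TILE


def estimate_video_tokens(seconds: float) -> int:
    """Estimate tokens for video input at the fixed 263 tokens/second rate."""
    return math.ceil(seconds * VIDEO_TOKENS_PER_SECOND)


def estimate_audio_tokens(seconds: float) -> int:
    """Estimate tokens for audio input at the fixed 32 tokens/second rate."""
    return math.ceil(seconds * AUDIO_TOKENS_PER_SECOND)


# A 300x200 thumbnail fits within 384x384, so it costs one 258-token unit:
print(estimate_image_tokens(300, 200))   # 258
# A 1024x768 photo spans 2x1 tiles under this approximation:
print(estimate_image_tokens(1024, 768))  # 516
# 10 seconds of video and 10 seconds of audio:
print(estimate_video_tokens(10), estimate_audio_tokens(10))  # 2630 320
```

For a PDF, per the section above, you'd apply the image estimate once per page.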