Google Debuts Gemini 3.1 Flash-Lite Model for Low-Cost AI Workloads
In Focus
- The new Flash-Lite AI model is the fastest in the Gemini 3 series
- The AI model can handle light agentic tasks
- Gemini 3.1 Flash-Lite is available in preview mode in Google AI Studio
Google has launched the Gemini 3.1 Flash-Lite AI model. According to Gadgets360, the tech giant has described the model as the fastest and most cost-efficient in the Gemini 3 series. Gemini 3.1 Flash-Lite is also designed to handle high-volume developer workloads, and Google says it offers higher output speeds than the 2.5 series.
Gemini 3.1 Flash-Lite Comes With Unique Capabilities
Gemini 3.1 Flash-Lite comes with multiple capabilities. The AI model is designed to support multimodal inputs and complete speech-to-text tasks quickly and at scale. With these capabilities, users can transcribe audio files such as memos, recordings, and voice inputs into text, and can use prompts to generate transcriptions in specific formats.
Gemini 3.1 Flash-Lite can also handle light agentic tasks for developers. Since the new AI model supports structured JSON output, it is well suited to data classification, extraction, and lightweight processing, and users can shape the outputs they need with detailed prompts.
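As a rough illustration of how an application might consume that kind of structured JSON output for classification, here is a minimal sketch; the schema and field names (tickets, category, priority) are illustrative assumptions, not a documented Gemini format:

```python
import json

# Hypothetical JSON a classification prompt might request from the model,
# e.g. "Classify each ticket and reply only with JSON matching this schema."
sample_model_output = """
{
  "tickets": [
    {"id": 101, "category": "billing", "priority": "high"},
    {"id": 102, "category": "bug_report", "priority": "low"}
  ]
}
"""

def extract_high_priority(raw: str) -> list[int]:
    """Parse the model's structured JSON and pull out high-priority ticket IDs."""
    data = json.loads(raw)
    return [t["id"] for t in data["tickets"] if t["priority"] == "high"]

print(extract_high_priority(sample_model_output))  # [101]
```

Because the output is machine-parseable JSON rather than free text, downstream code like this can route or filter items without any further model calls.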
Google’s Gemini 3.1 AI model can also manage high-volume document workloads. It is capable of analyzing PDFs to generate clear summaries, comparing information across multiple sources, and processing workflows that require fast review.
This makes it easier to categorize incoming files, run simple pass-fail checks, and extract standard data quickly. Google released the Flash-Lite model days after it launched Nano Banana 2, its latest AI image generation model, which offers lightning-fast reasoning speeds.
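The simple pass-fail checks described above can be sketched in a few lines of glue code; the required fields here are hypothetical examples of data a document model might extract, not fields Google specifies:

```python
# Hypothetical pass-fail review over fields extracted from an incoming document.
REQUIRED_FIELDS = ("invoice_number", "total", "due_date")

def passes_review(extracted: dict) -> bool:
    """Pass only if every required field was extracted and is non-empty."""
    return all(extracted.get(field) for field in REQUIRED_FIELDS)

print(passes_review({"invoice_number": "INV-7", "total": "120.00",
                     "due_date": "2026-01-31"}))   # True
print(passes_review({"invoice_number": "INV-8", "total": ""}))  # False
```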
Flash-Lite Offers Adjustable Thinking Capabilities
Adjustable thinking levels are a key feature of Google’s 2026 update to the Gemini AI model. With this feature, developers can determine how much reasoning the model applies to each task, a control that is critical for managing high-frequency workloads efficiently. Flash-Lite can also scale to handle tasks such as large-volume translation and content moderation.
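A minimal sketch of how a developer might route different tasks to different reasoning levels follows; the `thinking_level` key and the task-to-level mapping are assumptions made for illustration, not Google’s documented parameter names:

```python
# Hypothetical routing table: high-frequency tasks get a low thinking level,
# reasoning-heavy tasks get a high one.
TASK_LEVELS = {
    "translation": "low",            # large-volume, latency-sensitive
    "content_moderation": "low",
    "dashboard_generation": "high",  # deeper reasoning required
    "simulation": "high",
}

def request_config(task: str) -> dict:
    """Build a generation-config dict with a thinking level for the task."""
    return {"thinking_level": TASK_LEVELS.get(task, "low")}

print(request_config("translation"))  # {'thinking_level': 'low'}
```

The design choice here is simply to default unknown tasks to the cheap low setting, so bulk workloads never accidentally pay for extra reasoning.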
“3.1 Flash-Lite is a remarkably competent model. It is lightning fast, but still somehow finds a way to follow all instructions. It is great at tool calling and can rapidly explore codebases in a fraction of the time of bigger models. The intelligence to speed ratio is unparalleled in any other model,” said Andrew Carr, Chief Scientist at Cartwheel.
The AI model also supports tasks that require deeper reasoning. These include user interface and dashboard development, simulations, and execution of detailed instructions.
How Much Will Gemini 3.1 Flash-Lite Cost?
When releasing Gemini 3.1 Flash-Lite, Google highlighted the AI model’s cost-effectiveness. The tech giant priced one million input tokens at $0.25 and the same volume of output tokens at $1.50. This is slightly lower than Gemini 2.5 Flash, which costs $0.30 and $2.50 per million input and output tokens respectively.
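Using the per-million-token prices quoted above, the savings for a given workload can be estimated directly; the model labels in the dictionary below are informal names for this sketch, not official API identifiers:

```python
# Per-million-token prices in USD, as quoted in the article.
PRICES = {
    "gemini-3.1-flash-lite": {"input": 0.25, "output": 1.50},
    "gemini-2.5-flash":      {"input": 0.30, "output": 2.50},
}

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost for a job; tokens are billed per million."""
    p = PRICES[model]
    return (input_tokens / 1e6) * p["input"] + (output_tokens / 1e6) * p["output"]

# Example workload: 10M input tokens and 2M output tokens.
for model in PRICES:
    print(model, round(job_cost(model, 10_000_000, 2_000_000), 2))
# gemini-3.1-flash-lite 5.5
# gemini-2.5-flash 8.0
```

For this example workload, Flash-Lite comes out at $5.50 against $8.00 for Gemini 2.5 Flash, with most of the gap coming from the cheaper output tokens.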
The AI model is currently available in preview through the Gemini application programming interface (API), Vertex AI, and Google AI Studio.
