Google released Gemini 3.1 Pro on February 19, 2026 — a reasoning-optimized upgrade to the Gemini 3 Pro series that tops 13 of 16 industry benchmarks and introduces a dedicated API endpoint for agentic, custom-tool workflows. This guide covers official benchmark results, full API specifications, pricing, capability comparisons, and access instructions for developers and enterprises.
What Is Gemini 3.1 Pro? (Direct Answer)
Gemini 3.1 Pro is Google's most advanced general-purpose AI model as of February 2026, built on the Gemini 3 Pro architecture and optimized for three core improvements:
- Better thinking: Upgraded chain-of-thought reasoning that scores 77.1% on ARC-AGI-2 — more than double Gemini 3 Pro's prior score
- Improved token efficiency: Reduced verbosity with higher-quality outputs per token
- More grounded, factually consistent responses: Specifically tuned for software engineering behavior, agentic multi-step task execution, and precise custom tool use
According to Google DeepMind's official model card, Gemini 3.1 Pro is “Google's most advanced model for complex tasks” as of its publication date and is natively multimodal, processing text, images, audio, video, and entire code repositories.
Official API Specifications (Google AI for Developers)
The following technical specifications are sourced directly from the official Gemini API documentation at ai.google.dev:
| Property | Details |
|---|---|
| Model code | gemini-3.1-pro-preview |
| Custom tools endpoint | gemini-3.1-pro-preview-customtools |
| Input modalities | Text, Image, Video, Audio, PDF |
| Output modality | Text |
| Input token limit | 1,048,576 (~1,500 A4 pages) |
| Output token limit | 65,536 tokens |
| Knowledge cutoff | January 2025 |
| Release date | February 2026 (Preview) |
| Batch API | Supported |
| Context caching | Supported |
| Code execution | Supported |
| Function calling | Supported |
| Search grounding | Supported |
| Structured outputs | Supported |
| Thinking / reasoning | Supported |
| URL context | Supported |
| Live API | Not supported |
| Image generation | Not supported |
| Audio generation | Not supported |
إن gemini-3.1-pro-preview-customtools Endpoint
For developers building agentic workflows with custom tools (such as view_file, search_code, or bash), Google provides a dedicated endpoint that prioritizes custom tool calls over general-purpose behavior. This endpoint is distinct from the standard gemini-3.1-pro-preview and is specifically optimized for:
- Multi-step agentic pipelines
- Bash and terminal tool use
- Code-centric reasoning workflows
Note: إن
customtoolsendpoint may show quality fluctuations in use cases that do not involve custom tool calling.
Gemini 3.1 Pro Benchmark Results
Gemini 3.1 Pro was evaluated on 16 industry-standard benchmarks. It achieved first place on 13 of 16, with results verified as of February 2026.
Head-to-Head Benchmark Comparison
| Benchmark | Gemini 3.1 Pro | Gemini 3 Pro | Claude Opus 4.6 | GPT-5.2 |
|---|---|---|---|---|
| ARC-AGI-2 (abstract reasoning) | 77.1% | ~31.1% | 68.8% | — |
| GPQA Diamond (expert science) | 94.3% | — | 91.3% | 92.4% |
| Humanity's Last Exam (no tools) | 44.4% | 37.5% | — | 34.5% |
| Humanity's Last Exam (with tools) | 51.4% | — | 53.1% | — |
| SWE-Bench Verified (agentic coding) | 80.6% | — | — | — |
| Terminal-Bench 2.0 | 68.5% | — | — | 77.3%* |
| APEX-Agents (long-horizon tasks) | 33.5% | 18.4% | 29.8% | 23.0% |
| SWE-Bench Pro (Public) | 54.2% | — | — | 56.8%* |
| GDPval-AA Elo (expert tasks) | 1317 | — | — | — |
*GPT-5.3-Codex scores reported on limited benchmarks only. Claude Sonnet 4.6 leads GDPval-AA Elo with 1633.
Key takeaways from benchmarks:
- Gemini 3.1 Pro's biggest leap is on ARC-AGI-2, where it more than doubled its predecessor's score and leads all known competitors.
- It leads on APEX-Agents, nearly doubling Gemini 3 Pro's score — the clearest signal of improved agentic workflow performance.
- Claude Opus 4.6 leads on Humanity's Last Exam (with tools) and GDPval-AA expert tasks. GPT-5.3-Codex leads on terminal and SWE-Bench Pro coding tasks.
- The competitive AI landscape remains multi-winner: no single model dominates every benchmark.
Gemini 3.1 Pro API Pricing
Google maintained the same pricing as Gemini 3 Pro, making Gemini 3.1 Pro a significant value upgrade:
| Token Range | Input Price | Output Price |
|---|---|---|
| Up to 200,000 tokens | $2.00 / 1M tokens | $12.00 / 1M tokens |
| 200,000 – 1,000,000 tokens | $4.00 / 1M tokens | $18.00 / 1M tokens |
Context: At $2/1M input tokens, Gemini 3.1 Pro is priced at less than half the cost of Claude Opus 4.6, while achieving broadly comparable benchmark scores across most evaluations. This price-performance ratio makes it particularly compelling for enterprise teams with high-volume API usage.
Gemini 3.1 Pro vs. Competitors: Decision Table
| Criteria | Gemini 3.1 Pro | Claude Opus 4.6 | GPT-5.2 | DeepSeek V3.2 |
|---|---|---|---|---|
| Best abstract reasoning | ✅ #1 ARC-AGI-2 (77.1%) | ❌ | ❌ | ❌ |
| Best agentic workflows | ✅ #1 APEX-Agents | ❌ | ❌ | ❌ |
| Best expert task ELO | ❌ | ❌ | ❌ | ❌ |
| Best science knowledge | ✅ #1 GPQA Diamond | ❌ | ❌ | ❌ |
| Best long-context tool use | ❌ (Claude wins HLE+tools) | ✅ | ❌ | ❌ |
| Lowest price (comparable tier) | ✅ ~$2/1M input | ❌ ($5+) | ❌ | ✅ (cheaper) |
| 1M token context window | ✅ | ✅ | ✅ | ❌ |
| Custom tools API endpoint | ✅ | ❌ | ❌ | ❌ |
| Multimodal input | ✅ Text, Image, Audio, Video, PDF | ✅ | ✅ | ❌ |
Choose Gemini 3.1 Pro if: You need top-tier reasoning, agentic coding, Google Cloud integration, or the best price-performance ratio among frontier reasoning models.
Consider alternatives if: You need the absolute best performance on long-horizon expert tasks with tool use (Claude Opus 4.6) or terminal-specific coding benchmarks (GPT-5.3-Codex).
Where to Access Gemini 3.1 Pro
For Developers
- Google AI Studio — Free to try at
aistudio.google.com; use model stringgemini-3.1-pro-preview - Gemini API — Full programmatic access; supports Batch API, caching, function calling, structured outputs
- Gemini CLI — Command-line access at
geminicli.com - Google Antigravity — Google's agentic development platform
- Android Studio — Native integration for Android app development
For Enterprise
- Vertex AI — Full Google Cloud integration with enterprise SLAs
- Gemini Enterprise — Workspace-integrated access for business teams
For Consumers
- Gemini App — Available now with higher limits for Google AI Pro and Ultra subscribers
- NotebookLM — Exclusively available to Pro and Ultra plan users
Key Capabilities: What Gemini 3.1 Pro Can Do
Gemini 3.1 Pro is engineered for tasks where a simple answer is not enough. Documented real-world applications include:
- Animated SVG generation from text prompts — website-ready, code-based, infinitely scalable, small file size
- Live data dashboard creation — building a real-time ISS orbital telemetry dashboard from a public API
- 3D interactive design — generating a starling murmuration simulation with hand-tracking and generative audio
- Literary-to-UI translation — converting the themes of Wuthering Heights into a fully functional web portfolio
- Long-document synthesis — processing contracts, research papers, and codebases up to 1M tokens in a single request
- Multi-step agentic tasks — reliable execution of complex, real-world professional workflows via the APEX-Agents benchmark
Safety and Trust
According to Google DeepMind's model card, Gemini 3.1 Pro was evaluated under the Frontier Safety Framework across five risk areas:
- CBRN (chemical, biological, radiological, and nuclear) information risks
- Cybersecurity
- Harmful manipulation
- Machine learning research risks
- Misalignment
The model remained below critical thresholds in all five categories. Automated content safety evaluations also showed marginal improvements over Gemini 3 Pro: +0.10% in text-to-text safety and +0.11% in multilingual safety.
FAQ: Gemini 3.1 Pro Common Questions
Q: What is the model code for Gemini 3.1 Pro in the API? The standard model code is gemini-3.1-pro-preview. For agentic workflows with custom tools, use gemini-3.1-pro-preview-customtools.
Q: What is Gemini 3.1 Pro's context window? The input token limit is 1,048,576 tokens (approximately 1,500 A4 pages). The output token limit is 65,536 tokens.
Q: What benchmarks does Gemini 3.1 Pro lead? It leads 13 of 16 evaluated benchmarks, including ARC-AGI-2 (77.1%), GPQA Diamond (94.3%), Humanity's Last Exam without tools (44.4%), SWE-Bench Verified (80.6%), and APEX-Agents (33.5%).
Q: Does Gemini 3.1 Pro beat Claude Opus 4.6 and GPT-5.2 on all benchmarks? No. While Gemini 3.1 Pro leads most evaluations, Claude Opus 4.6 outperforms it on Humanity's Last Exam with tools (53.1% vs 51.4%) and GDPval-AA expert tasks. GPT-5.3-Codex leads on terminal-specific coding benchmarks.
Q: What does Gemini 3.1 Pro cost? $2.00 per 1M input tokens and $12.00 per 1M output tokens for prompts under 200,000 tokens. Long-context pricing (200K–1M tokens) is $4.00 input / $18.00 output per 1M tokens.
Q: Does Gemini 3.1 Pro support image and video input? Yes. The model is natively multimodal and accepts text, image, video, audio, and PDF inputs, outputting text.
Q: Is Gemini 3.1 Pro available for free? Developers can try it for free in Google AI Studio. Consumer access via the Gemini app and NotebookLM requires a Google AI Pro or Ultra subscription for higher usage limits.
Q: What is the knowledge cutoff for Gemini 3.1 Pro? The knowledge cutoff is January 2025, per the official Google AI developer documentation.
Q: Is Gemini 3.1 Pro generally available? Not yet. As of February 2026, it is in preview. Google has stated general availability is coming soon, pending validation of agentic workflow improvements.
Q: What file types can Gemini 3.1 Pro process via the API? Through the Gemini API, it supports text, images, video, audio files, and PDFs as inputs, all within the 1M token context window limit.



