VERTU® Official Site

MiniMax M2.5 Free in Kilo Code: State-of-the-Art Model

Complete Guide to MiniMax's SOTA Breakthrough: Free for One Week in Kilo's CLI, VS Code Extension, and More, Rivaling Claude Opus 4.6 and Beating Gemini 3 Pro at a Fraction of the Cost

MiniMax M2.5 is a major step up for the Chinese AI lab. It scores 80.2% on SWE-Bench Verified (human-validated real-world GitHub issues), matching Claude Opus 4.6's standard performance, and outperforms Gemini 3 Pro on SWE-Bench Pro (55.4% vs 43.3%) and Multi-SWE-Bench (51.3% vs 50.3%). It delivers 100 tokens per second of throughput (3× faster than Opus) with only 10B activated parameters, the smallest of any Tier-1 model.

The Free Access: M2.5 is available completely free for one week in Kilo Code (CLI, VS Code extension, IDE integrations) with no credits required, making SOTA performance accessible without a “frontier tax.”

The Architecture: A “total overhaul of M2.1 architecture” engineered specifically for the Agent-Verse and agentic workflows, and optimized for “thinking efficiency” through planning. Pricing is $0.3/M input tokens and $0.06/M blended cost with cache, the best price for a SOTA model.

The Competitive Position: MiniMax moves “into big leagues as truly SOTA lab,” blurring the “OSS vs Proprietary distinction” with 51.3% on Multi-SWE-Bench and 76.3% on BrowseComp, rivaling the best frontier models while staying cost efficient.

The Popularity: M2.1 is already the “most popular open-weight model on Kilo to date,” and M2.5 is poised to “rule every category” on Kilo's leaderboard.

The Self-Hosting Advantage: 10B-parameter efficiency enables self-hosting without “massive clusters,” offering an “unparalleled advantage for developers.”

Access Points: Kilo Code via kilo.ai/landing/minimax-m25, CLI installation, the VS Code marketplace extension, and all IDE integrations, alongside the 1.5M+ developers using the Kilo platform.

Part I: The Breakthrough Performance

SWE-Bench Verified: 80.2% Matching Opus 4.6

The Benchmark: Human-validated subset of real-world GitHub issues testing production-level bug solving

MiniMax M2.5: 80.2% accuracy

Claude Opus 4.6: “Just below 80%” on standard trials

The Significance: “M2.5 sits comfortably at 80.2% out-of-the-box”

Anthropic's Prompt Modification: Opus 4.6 reaches 81.42% with specific prompt adjustment

Real-World Testing: “Fits with what we're seeing with Opus in the wild”

Competitive Assessment: “Formidable powerhouse rivaling best in industry”

SWE-Bench Pro: Dominating Gemini 3 Pro

MiniMax M2.5: 55.4%

Gemini 3 Pro: 43.3%

Performance Gap: +12.1 percentage points

What It Tests: “Increased difficulty and realism of advanced software engineering tasks”

Proof: M2.5 “can handle” the most rigorous engineering challenges

Multi-SWE-Bench: Complex Multi-Step Superiority

MiniMax M2.5: 51.3%

Gemini 3 Pro: 50.3%

What It Measures: “Complex, multi-step software suites”

Capability Demonstrated: “Superior autonomous execution in long-horizon tasks”

Implication: Better at sustained reasoning and multi-phase problem-solving

BrowseComp: Agentic Search Excellence

MiniMax M2.5: 76.3%

What It Tests: Information retrieval and research capabilities

Agentic Strength: “Catching up to and often surpassing major models like GPT-5.2 and Gemini 3 Pro in coding and research”

The Complete Picture

Benchmark Summary:

  • SWE-Bench Verified: 80.2% (matches Opus 4.6)
  • SWE-Bench Pro: 55.4% (beats Gemini 3 Pro by 12.1 points)
  • Multi-SWE-Bench: 51.3% (edges Gemini 3 Pro by 1.0 point)
  • BrowseComp: 76.3% (strong research capability)
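
The gaps in the summary above are simple differences in percentage points. As a minimal sketch, the snippet below recomputes them from the scores quoted in this article. The dictionary layout and the function name `gap_in_points` are illustrative choices, not part of any benchmark tooling, and only benchmarks where the article gives a Gemini 3 Pro score are included.

```python
# Scores quoted in this article, in percent. The structure and names here are
# illustrative assumptions; only benchmarks with a quoted rival score appear.
SCORES = {
    "SWE-Bench Pro": {"MiniMax M2.5": 55.4, "Gemini 3 Pro": 43.3},
    "Multi-SWE-Bench": {"MiniMax M2.5": 51.3, "Gemini 3 Pro": 50.3},
}


def gap_in_points(benchmark, model="MiniMax M2.5", rival="Gemini 3 Pro"):
    """Return the model's score minus the rival's, in percentage points."""
    scores = SCORES[benchmark]
    # Round to one decimal to avoid floating-point noise in the subtraction.
    return round(scores[model] - scores[rival], 1)


for name in SCORES:
    print(f"{name}: {gap_in_points(name):+.1f} points")
```

Note that these are absolute differences in percentage points, not relative improvements.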

Overall Assessment: “Not just incremental update; total overhaul of M2.1 architecture”

Part II: Speed and Efficiency Revolution

Lightning Fast Throughput: 100 TPS

Speed: 100 tokens per second

Versus Opus: 3× faster in early testing

What It Means: Dramatically faster response times

User Experience: Near-instant feedback for complex queries

Agentic Advantage: Rapid iteration in autonomous workflows
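
To make the throughput figures concrete, here is a rough sketch that converts tokens per second into wait time for a response. The 100 TPS figure comes from this article; the Opus rate is only what the "3× faster" claim implies, not a measured number, and the 2,000-token response length is a hypothetical example.

```python
M25_TPS = 100.0         # tokens per second, as quoted in this article
OPUS_TPS = M25_TPS / 3  # implied by the "3x faster" claim; an assumption, not a measurement


def seconds_to_generate(tokens, tokens_per_second):
    """Return the time needed to stream `tokens` at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


response_tokens = 2_000  # hypothetical response length
print(f"M2.5: {seconds_to_generate(response_tokens, M25_TPS):.0f} s")
print(f"Opus (implied): {seconds_to_generate(response_tokens, OPUS_TPS):.0f} s")
```

This ignores time to first token and network latency, so real response times will be somewhat longer for both models.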

Thinking Efficiency Optimization

Training Focus: “Trained to optimize actions and output through planning”

Token Efficiency: More efficient than previous generations

Cost Impact: Lower token consumption for same quality

Planning Integration: Built-in planning reduces wasted tokens

Result: Better performance with fewer resources

The 10B Parameter Advantage

Activated Parameters: Only 10 billion

Significance: “Smallest Tier-1 model in existence”

Comparison: Other Tier-1 models require “massive clusters”

Self-Hosting Benefit: “Unparalleled advantage for developers who want to self-host”

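Deployment Flexibility: Lower per-token compute than models with more activated parameters; memory requirements depend on the total parameter count, which this article does not state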

Cost Savings: Lower infrastructure requirements

Always-On Efficiency

Input Pricing: $0.3/M tokens

Blended Cost with Cache: $0.06/M tokens

Competitive Assessment: “Best price of any SOTA model for always-on agents”

Use Case: Continuous monitoring, real-time assistance, persistent agents

Economic Impact: Enables 24/7 agent deployment affordably
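
As a back-of-the-envelope sketch, the two prices above can be turned into a monthly bill for an always-on agent. The prices are the ones quoted in this article; the daily token volume is a made-up workload, and the article gives no output-token price, so the sketch covers only the quoted rates.

```python
INPUT_PRICE_PER_M = 0.30    # USD per million input tokens, from this article
BLENDED_PRICE_PER_M = 0.06  # USD per million tokens with cache, from this article


def monthly_cost(tokens_per_day, price_per_million, days=30):
    """Return the cost in USD of processing `tokens_per_day` for `days` days."""
    return tokens_per_day / 1_000_000 * price_per_million * days


tokens_per_day = 50_000_000  # hypothetical always-on agent workload
print(f"Uncached input rate: ${monthly_cost(tokens_per_day, INPUT_PRICE_PER_M):.2f}/month")
print(f"Blended with cache:  ${monthly_cost(tokens_per_day, BLENDED_PRICE_PER_M):.2f}/month")
```

Under these assumptions, caching cuts the monthly cost by a factor of five, which is the ratio between the two quoted prices.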

Part III: Free Access Through Kilo Code

The One-Week Free Promotion

Duration: One week from launch (Feb 12, 2026)

Scope: “Completely free for all Kilo users”

No Restrictions: “No credits required—just pure, unadulterated SOTA power”

Access Method: Select MiniMax M2.5 from model dropdown

Philosophy: “Give every developer world's most powerful tools without ‘frontier tax’”

Scale: “Biggest leap yet” for Kilo Code

Platform Integration

Kilo CLI: Command-line interface access

VS Code Extension: marketplace.visualstudio.com/items?itemName=kilocode.Kilo-Code

IDE Support: All major IDEs integrated

Cloud Access: Kilo Cloud platform

Slack Integration: Kilo for Slack (previously made M2.1 free)

User Base: Join 1.5M+ developers

Installation Process

Step 1: Visit kilo.ai/landing/minimax-m25

Step 2: Click “Install Kilo Code”

Step 3: Choose installation method (CLI, VS Code, etc.)

Step 4: Select MiniMax M2.5 from dropdown

Step 5: Start building immediately

Simplicity: No complex setup, instant access
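
For the VS Code route, the extension can also be installed from a terminal. The sketch below derives the extension ID from the Marketplace link given in this article and prints the matching command for VS Code's `code --install-extension` flag. The `https://` scheme is added here as an assumption, the function names are illustrative, and the snippet only prints the command rather than running it.

```python
import shlex
from urllib.parse import parse_qs, urlparse

# Marketplace link from this article; the https:// prefix is an assumption.
MARKETPLACE_URL = "https://marketplace.visualstudio.com/items?itemName=kilocode.Kilo-Code"


def extension_id(url):
    """Extract the extension ID from the Marketplace URL's itemName parameter."""
    return parse_qs(urlparse(url).query)["itemName"][0]


def install_command(url):
    """Build the argument list for VS Code's command-line installer."""
    return ["code", "--install-extension", extension_id(url)]


# Print the command so it can be reviewed and pasted into a terminal.
print(shlex.join(install_command(MARKETPLACE_URL)))
```

After installation, MiniMax M2.5 still has to be selected from the model dropdown as described in Step 4.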

Part IV: The MiniMax Model Family

MiniMax M2.5 (New Release)

Status: Free in Kilo (one week promotion)

Performance: SOTA on multiple benchmarks

Capabilities:

  • Exceptional reasoning
  • Superior coding
  • Fast inference (100 TPS)
  • Open weights coming

Best For: Production coding, complex problem-solving, agentic workflows

MiniMax M2.1 (High Performer)

Pricing: $0.27/M input tokens

Status: “Most popular open-weight model on Kilo to date”

Performance: “Competitive performance on practical coding benchmarks”

Reliability: “Reliable for production use cases”

Cost Position: “Fraction of frontier model costs”

Best For: Cost-effective production deployment

MiniMax M1-80k (Long Context)

Pricing: $0.80/M input tokens

Context Window: 80,000 tokens

Reasoning: “Advanced chain-of-thought reasoning”

Specialty: “Excellent for complex multi-step tasks”

Capabilities:

  • Deep reasoning
  • Complex task handling
  • Extended context understanding

Best For: Multi-step analysis, large codebases, comprehensive reasoning
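
To compare the three models on price alone, the following sketch puts the input prices quoted in this article into a lookup table. The M2.5 figure is the $0.3/M input price from Part II; the function names and the 200,000-token job size are hypothetical, and output-token prices are not given in the article, so they are left out.

```python
# Input prices in USD per million tokens, as quoted in this article.
INPUT_PRICE_PER_M = {
    "MiniMax M2.5": 0.30,
    "MiniMax M2.1": 0.27,
    "MiniMax M1-80k": 0.80,
}


def input_cost(model, input_tokens):
    """Return the input-token cost in USD for one job on the given model."""
    return input_tokens / 1_000_000 * INPUT_PRICE_PER_M[model]


def cheapest_model():
    """Return the model with the lowest quoted input price."""
    return min(INPUT_PRICE_PER_M, key=INPUT_PRICE_PER_M.get)


job_tokens = 200_000  # hypothetical job size
for model in INPUT_PRICE_PER_M:
    print(f"{model}: ${input_cost(model, job_tokens):.3f}")
print("Lowest input price:", cheapest_model())
```

Price is only one axis; the "Best For" lines above describe where each model is meant to be used.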

Part V: MiniMax Company Context

The Organization

Founded: 2022

Location: China (leading Chinese AI company)

Backing: Major investors including Alibaba, Tencent, HongShan

User Base: Over 200 million users globally

Specialization: Large language models and multi-modal AI technology

Open-Weight Philosophy

Commitment: “Making advanced AI accessible to developers worldwide”

Previous Releases: M2 series, M1 models

Recognition: “Competitive performance on coding and reasoning benchmarks”

Future Plans: “MiniMax typically releases open weights for their models”

M2.5 Timeline: “Expect M2.5 weights on HuggingFace soon”

Community Benefit: Open weights enable research, fine-tuning, self-hosting

Part VI: Agentic Engineering Capabilities

Designed for the Agent-Verse

Core Design: “Engineered from ground up for Agent-Verse”

Primary Role: “Primary workhorse for future workspace”

Kilo's Excitement: “Particularly excited about agentic capabilities for planning and executing large-scale dev projects”

Optimization: “Specifically designed for agentic workflows”

Planning and Execution

Planning Capability: Built-in task decomposition and strategy

Execution Quality: “Superior autonomous execution in long-horizon tasks”

Multi-Step Mastery: Handles complex sequential workflows

Error Recovery: Robust handling of obstacles in multi-phase tasks

Adaptability: Adjusts approach based on intermediate results

Kilo Code Modes for Every Workflow Step

Ask Mode: “Knowledgeable technical assistant focused on answering questions without changing codebase”

Architect Mode: System design and architecture planning

Code Mode: Active code generation and modification

Debug Mode: Issue identification and resolution

Orchestrator Mode: Multi-agent coordination and workflow management

Custom Mode: User-defined specialized behaviors

M2.5 Performance: Expected to “rule every category” on Kilo leaderboard

Part VII: Competitive Landscape

The “OSS vs Proprietary” Blur

Previous Distinction: Clear gap between open-source and proprietary performance

M2.5 Impact: “OSS vs Proprietary distinction is blurring”

SOTA Lab Status: MiniMax now “truly SOTA lab with truly SOTA model”

Market Position: Competing directly with GPT-5.2, Claude Opus 4.6, Gemini 3 Pro

Kilo Code vs. Other Tools

GitHub Copilot: Different pricing model, limited model selection

Cursor: Proprietary editor lock-in

Windsurf: Alternative agentic coding tool

Cline: Autonomous coding agent

Amp Code (Sourcegraph/Cody): Shutting down VS Code extension

Kilo Advantage: Model flexibility, platform agnostic, competitive pricing

The Leaderboard Dominance

Current Status: “M2.1 has been top model on Kilo's leaderboard for every mode except Architect and Orchestrator”

M2.5 Potential: “Will M2.5 push MiniMax over edge to rule every category?”

User Validation: Popularity driven by actual developer preference

Performance Proof: Real-world results on Kilo platform

Part VIII: Practical Applications

Production Bug Solving

Use Case: Real-world GitHub issue resolution

Evidence: 80.2% SWE-Bench Verified

Workflow:

  1. Analyze bug report
  2. Navigate codebase
  3. Identify root cause
  4. Implement fix
  5. Validate solution

Speed: 100 TPS enables rapid iteration
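
The workflow above ends with validation, and fast generation matters most when a failed validation sends the agent back for another attempt. The following is a minimal sketch of that retry loop, not Kilo Code's or MiniMax's actual implementation. `resolve_issue`, `propose_fix`, and `validate` are hypothetical names, and the two toy functions stand in for a model call and a test run.

```python
from typing import Callable, List, Optional


def resolve_issue(
    report: str,
    propose_fix: Callable[[str, List[str]], str],
    validate: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Propose fixes until one validates; return it, or None if none pass."""
    failed: List[str] = []
    for _ in range(max_attempts):
        # Steps 1-4: analyze, navigate, find the root cause, implement a fix.
        patch = propose_fix(report, failed)
        # Step 5: validate the solution, e.g. by running the test suite.
        if validate(patch):
            return patch
        failed.append(patch)  # feed failures back into the next attempt
    return None


# Toy stand-ins for a model call and a test run (hypothetical).
def toy_propose(report, failed):
    return f"patch-{len(failed) + 1}"


def toy_validate(patch):
    return patch == "patch-2"


print(resolve_issue("crash on empty input", toy_propose, toy_validate))
```

Each loop iteration costs one generation plus one validation run, so higher throughput shortens every retry.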

Large-Scale Development Projects

Planning: Task decomposition and sequencing

Execution: Multi-file code generation

Integration: Component coordination

Testing: Validation and debugging

Documentation: Comprehensive commenting

Complex Multi-Step Tasks

Research: Information gathering via BrowseComp capability

Analysis: Deep reasoning through chain-of-thought

Synthesis: Combining multiple sources

Implementation: Translation to working code

Optimization: Iterative refinement

Always-On Agent Deployment

Cost-Effective: $0.06/M blended cost with cache

Use Cases:

  • Code review automation
  • Continuous testing
  • Real-time documentation
  • Issue triage
  • Pull request assistance

Economic Viability: “Best price of any SOTA model for always-on agents”

Part IX: Why This Matters

Democratizing SOTA Performance

Philosophy: “Give every developer world's most powerful tools without ‘frontier tax’”

Free Access: No cost barrier for one week

Post-Promotion: Still most affordable SOTA option

Impact: Levels playing field for individual developers, startups, students

The Self-Hosting Future

10B Parameter Efficiency: Makes self-hosting practical

Infrastructure Savings: No massive cluster requirements

Data Sovereignty: Keep models on own infrastructure

Customization: Fine-tune for specific domains

Independence: Not dependent on API availability

Moving the Industry Forward

Quote: “Best way to move industry forward is to put best models in hands of every developer”

Competition Effect: Pressure on proprietary labs to improve

Innovation Acceleration: More developers with SOTA tools create faster progress

Knowledge Sharing: Open weights enable community research

Conclusion: The New SOTA Standard

What M2.5 Achieves

Performance: 80.2% SWE-Bench Verified matching Opus 4.6

Speed: 100 TPS, 3× faster than Opus

Efficiency: 10B parameters, smallest Tier-1 model

Cost: $0.06/M blended, best SOTA pricing

Versatility: Excellence across coding, reasoning, research

Accessibility: Free in Kilo for one week, open weights coming

Why It's Revolutionary

Blurs OSS/Proprietary Gap: Open-weight model matching closed-source performance

Enables Self-Hosting: First Tier-1 model practical for individual deployment

Affordable Always-On Agents: Economic viability of continuous AI assistance

Democratized SOTA: No frontier tax for world-class performance

How to Get Started

Immediate Action: Visit kilo.ai/landing/minimax-m25

Install: Choose CLI, VS Code extension, or IDE integration

Select: Pick MiniMax M2.5 from dropdown

Build: Start coding with SOTA assistance immediately

Free Window: One week, no credits required

Post-Promotion: Still most affordable SOTA option available


Try M2.5 Free:

  • Landing Page: kilo.ai/landing/minimax-m25
  • CLI: Install via Kilo Code CLI
  • VS Code: marketplace.visualstudio.com (search “Kilo Code”)
  • Docs: kilo.ai/docs
  • Blog: blog.kilo.ai

Join: 1.5M+ developers using Kilo Code


The Bottom Line: MiniMax M2.5 achieves breakthrough SOTA performance: 80.2% on SWE-Bench Verified (matching Claude Opus 4.6), 55.4% on SWE-Bench Pro (beating Gemini 3 Pro by 12.1 points), 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp. It runs at 100 tokens per second (3× faster than Opus) with only 10B activated parameters (the smallest Tier-1 model, enabling self-hosting) and a $0.06/M blended cost (best SOTA pricing). It is available completely free for one week in Kilo Code (CLI, VS Code, all IDEs) with no credits required, blurring the “OSS vs Proprietary distinction” as MiniMax moves “into big leagues as truly SOTA lab.” Engineered for the Agent-Verse with planning optimization, it is expected to “rule every category” on the Kilo leaderboard, succeeding M2.1, the “most popular open-weight model” on Kilo to date, while providing the “world's most powerful tools without frontier tax.” Open weights are expected on HuggingFace soon. Access it at kilo.ai/landing/minimax-m25 and join the 1.5M+ developers using Kilo Code.

The future of coding assistance just got faster, cheaper, and accessible to everyone. One week free. No frontier tax. Pure SOTA power.
