Xiaomi has entered the frontier AI race with the launch of MiMo-V2-Pro, a one-trillion-parameter foundation model whose benchmark performance is said to approach that of models from OpenAI and Anthropic at roughly one-sixth to one-seventh of the cost when accessed through its API. The project was led by Fuli Luo, who previously worked on the DeepSeek R1 project, the Chinese AI model that shook the industry and wiped billions of dollars off Nvidia’s market cap in a single day. That model delivered frontier-level performance at a fraction of the usual cost.
Luo did not mince words about what Xiaomi was attempting with this release, describing it in a post on X as “a quiet ambush” on the global AI frontier. Luo also confirmed that the company plans to open-source a variant of MiMo-V2-Pro, but only “when the models are stable enough to deserve it.”
Xiaomi is the world’s third-largest smartphone manufacturer and has also forayed into the electric vehicle market, with a focus on combining hardware, software and now advanced AI reasoning under one roof. As per a report by VentureBeat, Xiaomi is focusing on what it calls the “action space” of intelligence.
Here’s Fuli Luo’s full post on the Xiaomi MiMo-V2-Pro AI model
MiMo-V2-Pro & Omni & TTS is out. Our first full-stack model family built truly for the Agent era.I call this a quiet ambush — not because we planned it, but because the shift from Chat to Agent paradigm happened so fast, even we barely believed it. Somewhere in between was a process that was thrilling, painful, and fascinating all at once.The 1T base model started training months ago. The original goal was long-context reasoning efficiency. Hybrid Attention carries real innovation, without overreaching — and it turns out to be exactly the right foundation for the Agent era. 1M context window. MTP inference for ultra-low latency and cost. These architectural decisions weren't trendy. They were a structural advantage we built before we needed it.What changed everything was experiencing a complex agentic scaffold — what I'd call orchestrated Context — for the first time. I was shocked on day one. I tried to convince the team to use it. That didn't work. So I gave a hard mandate: anyone on MiMo Team with fewer than 100 conversations tomorrow can quit. It worked. Once the team's imagination was ignited by what agentic systems could do, that imagination converted directly into research velocity.People ask why we move so fast. I saw it firsthand building DeepSeek R1. My honest summary:— Backbone and Infra research has long cycles. You need strategic conviction a year before it pays off.— Posttrain agility is a different muscle: product intuition driving evaluation, iteration cycles compressed, paradigm shifts caught early.— And the constant: curiosity, sharp technical instinct, decisive execution, full commitment — and something that's easy to underestimate: a genuine love for the world you're building for.We will open-source — when the models are stable enough to deserve it.From Beijing, very late, not quite awake.