The Sequence Weekly Alpha #686: Kimi K2 is a Trillion Parameter Open Source Model You Must Know About
The new Chinese model is pushing the boundaries of open-source AI.
This is a new experimental section where we highlight some of the most important AI releases from the past week, especially those that may have flown under the radar. We pick one, occasionally two, developments per week. Our aim is to keep you informed without overwhelming you with technical jargon. The pace of innovation in AI is so rapid that key developments are often missed, particularly when they come from outside the major Western labs.
To kick things off, I want to spotlight a remarkably impressive release from China that deserves your attention. You’ve probably heard of the DeepSeek models, right? But Kimi? Maybe not. And yet—you absolutely should. A new release last week is certainly making noise. Let’s dive in.
Kimi K2 is just a MASSIVE model and one that could mark a new milestone in open-source language modeling. The model packs roughly a trillion total parameters, but its sparse activation scheme means only a small fraction of them (on the order of 32B) does work for any given token, which is what keeps inference efficient at that scale. This essay explains K2’s motivation, its sparsely routed Mixture-of-Experts (MoE) design, and its performance across coding, reasoning, and agentic benchmarks.
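If the "trillion parameters but efficient" claim sounds contradictory, the sketch below shows the basic mechanism behind MoE sparsity: a router picks the top-k experts for each token, so per-token compute scales with k rather than with the total number of experts. This is a minimal illustration, not K2's actual implementation; the expert counts and layer sizes here are toy values chosen for readability.

```python
# Minimal sketch of top-k Mixture-of-Experts routing (illustrative, not K2's exact config).
# The point: each token consults only k of E experts, so only a small fraction of the
# total parameters is active per token, even though all experts count toward model size.
import numpy as np

rng = np.random.default_rng(0)

d_model = 64        # hidden size (toy value)
num_experts = 16    # E: total experts (K2 uses hundreds; 16 here for readability)
top_k = 2           # k: experts activated per token

# Each "expert" is a tiny feed-forward layer: d_model -> d_model.
experts = [
    (rng.standard_normal((d_model, d_model)) * 0.02,   # W1
     rng.standard_normal((d_model, d_model)) * 0.02)   # W2
    for _ in range(num_experts)
]
router = rng.standard_normal((d_model, num_experts)) * 0.02  # gating weights

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs by gate weight."""
    logits = x @ router                                  # (tokens, E)
    topk = np.argsort(logits, axis=-1)[:, -top_k:]       # indices of chosen experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                          # per token
        gates = logits[t, topk[t]]
        gates = np.exp(gates - gates.max())
        gates /= gates.sum()                             # softmax over the chosen k only
        for gate, e in zip(gates, topk[t]):
            w1, w2 = experts[e]
            h = np.maximum(x[t] @ w1, 0.0)               # ReLU feed-forward
            out[t] += gate * (h @ w2)
    return out

tokens = rng.standard_normal((4, d_model))
y = moe_layer(tokens)
print(y.shape)                                           # (4, 64)
print(f"active experts per token: {top_k}/{num_experts}")
```

The trade-off this buys: total capacity (and the headline parameter count) grows with the number of experts, while per-token FLOPs track only the few experts the router selects.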
Let’s start with the creators of Kimi.