The Sequence Radar #869: Last Week in AI: The Token Becomes the Unit of Account — Opus 4.8, OpenRouter, Cognition, Snowflake, and a papal warning

Opus 4.8 and remarkable fundraising events.

May 31, 2026

Next Week in The Sequence:

We continue our series about transformer alternatives.
In the AI of the Week section, we discuss Opus 4.8.
The opinion of the week discusses the strategic differences between companies like Google, NVIDIA, Microsoft, OpenAI and Anthropic when it comes to their ownership of different areas of the AI stack.

📝 Editorial: Last Week in AI: The Token Becomes the Unit of Account — Opus 4.8, OpenRouter, Cognition, Snowflake, and a papal warning

For two years the AI boom was an argument about the future, told in benchmarks and term sheets. This week it became an argument about the present, told in revenue.

Start with the substrate. Anthropic shipped Claude Opus 4.8 and, in the same breath, disclosed it’s tracking toward its first operating profit — roughly $10.9B in projected Q2 revenue, up about 130% quarter over quarter — while closing a $65B round. Sit with that. A lab still doing frontier training runs is approaching operational profitability. The “labs are structurally unprofitable” assumption that anchored every bear case just lost its load-bearing wall.

The model itself Anthropic described, refreshingly, as “a modest but tangible improvement” — agentic coding nudges from ~64% to ~69%, reasoning-with-tools from ~55% to ~58%, at the same price as 4.7. The interesting stuff is underneath the benchmarks. Three changes matter. First, an effort control that lets you dial how hard the model thinks per task — explicit governance over the compute-versus-quality tradeoff that every agent builder has been hacking around with prompt tricks. Second, dynamic workflows: a Claude Code capability where the model plans a large task, spins up parallel sub-agents to attack the pieces, verifies their outputs, and reports back — paired with a Messages API that now accepts live edits to the message array mid-run without breaking the prompt cache, so you can steer a long job without tearing it down and restarting. Third, honesty as a measured capability: 4.8 is roughly 4x less likely than 4.7 to let a flaw in its own code slip through unflagged, and surfaces its own uncertainty more readily. Stack those and you get the thing that actually matters once a model runs unattended for hours: it plans, it parallelizes, it checks its own work, and — because you can’t read every diff — it’s trained to distrust itself. It also burns tokens by the fistful doing all of it.
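The plan, parallelize, verify, report loop described above can be sketched in a few lines. This is only an illustration of the orchestration pattern, not Claude Code's implementation or any Anthropic API: every function name here (`plan`, `run_subagent`, `verify`, `orchestrate`) is hypothetical, and the sub-agents are stubs that return a string instead of calling a model.

```python
import asyncio

# Hypothetical stand-ins: none of these names come from Anthropic's API.

def plan(task: str) -> list[str]:
    """Split a large task into independent pieces (stubbed as a fixed 3-way split)."""
    return [f"{task}: part {i}" for i in range(1, 4)]

async def run_subagent(piece: str) -> str:
    """Pretend to do the work; a real sub-agent would call a model here."""
    await asyncio.sleep(0)  # yield control so the pieces can interleave
    return f"done({piece})"

def verify(piece: str, output: str) -> bool:
    """Cheap check that the output matches the piece it was assigned."""
    return output == f"done({piece})"

async def orchestrate(task: str) -> dict:
    pieces = plan(task)
    # Run all sub-agents concurrently; gather preserves input order.
    outputs = await asyncio.gather(*(run_subagent(p) for p in pieces))
    checked = {p: verify(p, out) for p, out in zip(pieces, outputs)}
    # Report back a summary instead of every raw output.
    return {
        "task": task,
        "pieces": len(pieces),
        "all_verified": all(checked.values()),
        "failed": [p for p, ok in checked.items() if not ok],
    }

report = asyncio.run(orchestrate("refactor billing module"))
print(report)
```

The design point is the verification step sitting between the sub-agents and the report: the orchestrator does not trust a sub-agent's output just because it came back.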

Then watch where the money flows, because it tells you the unit of account is now the token. OpenRouter raised $113M at $1.3B doing something almost embarrassingly simple: routing across 400+ models and taking ~5% of the inference spend that passes through. Its weekly throughput went from 5T to 25T tokens in six months — 5x. That’s not a forecast; that’s a meter. Cognition raised $1B at $26B, and buried in the announcement was the line that should reorganize your priors: 89% of code committed inside Cognition is now written by Devin, up from 13% in December. Run-rate revenue went from $37M to $492M in a year. Autonomous software engineering stopped being a demo and became the default committer.
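The arithmetic behind those figures is easy to check. The sketch below uses only numbers quoted above (5T to 25T weekly tokens in six months, a ~5% take rate, $37M to $492M run-rate); the helper names are made up for illustration, and the $1M of routed spend in the last line is a hypothetical input, since no dollar volume is given.

```python
def growth_multiple(start: float, end: float) -> float:
    """How many times larger the end value is than the start value."""
    return end / start

def implied_monthly_rate(start: float, end: float, months: int) -> float:
    """Constant month-over-month growth rate that compounds start into end."""
    return (end / start) ** (1 / months) - 1

def router_take(inference_spend_usd: float, take_rate: float = 0.05) -> float:
    """Revenue kept by a router charging a flat cut of routed spend."""
    return inference_spend_usd * take_rate

# OpenRouter weekly throughput: 5T -> 25T tokens in six months.
print(growth_multiple(5e12, 25e12))                    # multiple over the period
print(f"{implied_monthly_rate(5e12, 25e12, 6):.1%}")   # compounding monthly rate

# Cognition run-rate revenue: $37M -> $492M in a year.
print(f"{growth_multiple(37e6, 492e6):.1f}x")

# Hypothetical: what a 5% cut of $1M in routed inference spend would be.
print(router_take(1_000_000))
```

A 5x jump in six months works out to roughly 31% compounded growth every month, which is the sense in which the throughput figure is a meter and not a forecast.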

Snowflake closes the loop on the public side. Product revenue up 34%, guidance raised, stock up ~36% in a single session — and the two tells are a $6B AWS compute deal and the acquisition of Natoma, an MCP platform for governing agent access. The data layer is repricing itself around agents that consume, not analysts that query. The whole stack — model, router, agent, substrate — is converging on one business model: charge by the token, because the token is the work.

Which is exactly the moment Pope Leo XIV chose to publish Magnifica Humanitas, his first encyclical, presented alongside Anthropic’s Chris Olah. Stripped of its theology, the argument is an engineering critique the field should take seriously: technology is never neutral, because it inherits the incentives of whoever builds and funds it — and the danger isn’t malice but quiet disintermediation, decisions sliding out of human hands one delegated commit at a time.

Hold those two facts together. Cognition’s 89% is the encyclical’s thesis restated as a metric. The meter that makes the economics work is the same meter measuring how much judgment we’ve handed off. The bull case and the moral case are reading the same number.

The flywheel is no longer a slide. It’s on the income statement. The open question is what it’s optimizing for.

🔎 AI Research

Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization

AI Lab: CMU & Amazon

Summary: To address the challenge of evaluating whether AI agents can accurately translate informal programming intent into formal specifications, the researchers introduce the VERUS-SPECBENCH benchmark and the VERUS-SPECGYM agentic environment. By extending an execution mechanism to test generated specifications against both official tests and adversarial “hacks,” the study reveals that specification autoformalization remains highly brittle even for models capable of generating correct code.

Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

AI Lab: Tsinghua University, NVIDIA, University of Toronto, & Vector Institute

Summary: Gamma-World presents a scalable, generative multi-agent world model that moves beyond traditional single-agent simulations by utilizing Simplex Rotary Agent Encoding for permutation-symmetric identities and Sparse Hub Attention for efficient cross-agent communication. Through conditional teacher-student distillation and KV-cached streaming, the framework achieves real-time, action-responsive rollouts at 24 FPS that maintain strong consistency across virtual gaming and physical robotic environments.

Self-Improving Language Models with Bidirectional Evolutionary Search

AI Lab: Harvard University & MIT

Summary: Bidirectional Evolutionary Search (BES) overcomes the limitations of sparse verification signals and narrow autoregressive expansion by coupling forward candidate evolution with backward goal decomposition. By recombining trajectory segments to escape narrow probability distributions and scoring them against fine-grained sub-goals, BES significantly outperforms existing open-source frameworks on complex logical reasoning and open problem-solving tasks.

MobileMoE: Scaling On-Device Mixture of Experts

AI Lab: Meta AI

Summary: MobileMoE introduces a family of sub-billion active-parameter Mixture-of-Experts (MoE) language models specifically optimized for efficient deployment on edge devices like smartphones. Guided by a novel on-device scaling law and supported by a custom fused MoE kernel, these models achieve state-of-the-art performance while delivering substantially faster prefill and decode speeds compared to dense baselines at a similar memory footprint.

Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini

AI Lab: Google DeepMind

Summary: Gemini Embedding 2 is a native multimodal embedding model that seamlessly maps text, image, audio, and video inputs into a single, unified representation space without relying on intermediate transcriptions. Trained via large-scale contrastive learning in a multi-task setup, the model establishes new state-of-the-art performance across unimodal, cross-modal, and multimodal retrieval benchmarks while demonstrating robust zero-shot generalization across diverse enterprise and specialized domains.

The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence

AI Lab: MiniMax

Summary: The MiniMax-M2 series introduces a highly efficient 229.9B parameter Mixture-of-Experts model that activates only 9.8B parameters per token, specifically engineered for complex, long-horizon agentic workflows. By leveraging agent-driven data pipelines, a specialized reinforcement learning system called Forge, and autonomous self-evolution capabilities, the model achieves frontier-level performance across coding, deep search, and reasoning benchmarks while maintaining a minimal computational footprint.

🤖 AI Tech Releases

Claude Opus 4.8

Anthropic released the new version of its marquee model, with strong agentic and coding capabilities.

📡 10 AI News Stories You Need to Know About

Anthropic raises $65B in Series H at $965B post-money valuation — Anthropic raised $65 billion in Series H funding (co-led by Altimeter, Dragoneer, Greenoaks, and Sequoia) at a $965 billion post-money valuation, disclosing that run-rate revenue crossed $47 billion earlier this month, and bringing on Micron, Samsung, and SK hynix as strategic memory/storage partners alongside $15B in previously committed hyperscaler investment (including $5B from Amazon).
Cognition raises $1B at $25B pre-money valuation — Cognition, maker of the AI software engineer Devin, raised more than $1 billion (led by Lux Capital, General Catalyst, and 8VC) at a ~$26B post-money valuation, more than doubling in eight months as it hit a $492M annualized revenue run-rate.
Robinhood lets AI agents trade stocks — Robinhood launched Agentic Trading and an Agentic Credit Card in beta, letting customers connect third-party AI agents (via MCP) to a separate, funded account to autonomously trade equities and make purchases.
OpenRouter doubles valuation to $1.3B — The multi-model AI inference-routing startup raised a $113M Series B led by Alphabet’s CapitalG at a ~$1.3B valuation, more than double its level a year ago, as weekly volume grew from 5T to 25T tokens.
Hark raises $700M Series A — Brett Adcock’s secretive AI startup raised $700M at a $6B post-money valuation (led by Parkway Venture Capital) to build a “universal” agentic AI assistant with proprietary multimodal models and custom hardware, with first models due summer 2026.
Mistral signs Airbus and BMW — Mistral AI expanded into “physical AI” for manufacturing, announcing partnerships to apply its models to Airbus (aircraft design, flight safety, defense/space) and BMW’s “Large Industry Model” crash-simulation initiative, plus a new French data center.
MiniMax doubles sales ahead of new model — The Chinese AI developer’s annualized revenue more than doubled over two months to at least $300M, driven by its M2.7 model and a fivefold jump in enterprise users, ahead of its next flagship launch. Original source: a Bloomberg Television interview with co-founder Yun Yeyi.
SK Hynix joins the $1 trillion club — Shares of the South Korean memory maker surged ~9–15% to push its market value above $1 trillion for the first time, driven by HBM demand for AI, joining rivals Samsung and Micron. Original source: Reuters, https://money.usnews.com/investing/news/articles/2026-05-26/sk-hynix-joins-1-trillion-club-after-samsung-micron-on-ai-chip-boom
Pope Leo warns AI shouldn’t dominate humanity — In his first encyclical, Magnifica Humanitas, Pope Leo XIV called for “disarming” AI to keep it human-friendly and free of monopolistic control, warning it risks deepening inequality and eroding human agency. Original source: the encyclical itself, published by the Vatican (vatican.va).
Snowflake signs $6B AWS deal for Graviton chips — Snowflake committed $6B over five years to AWS — its largest infrastructure commitment ever — expanding use of Amazon’s ARM-based Graviton CPUs and GPUs to power agentic AI workloads.
