The Sequence Radar #824: Last Week in AI: Sovereign Lobsters, Self-Coding Agents, and Gigawatt Factories
Major releases from Google and Anthropic, a fun project from Karpathy and massive compute deals.
Next Week in The Sequence:
Our series about world models continues with a review of World Labs’ Marble.
AI of the Week will, of course, discuss Karpathy’s new autoresearch project.
The opinion section will discuss the topic of OpenClaw-like architectures.
We have a new, super cool interview.
📝 Editorial: Last Week in AI: Sovereign Lobsters, Self-Coding Agents, and Gigawatt Factories
Welcome to The Sequence. If this week proved anything, it is that we have decisively crossed the threshold from AI as a conversational assistant to AI as an autonomous, persistent worker. From self-improving research loops to massive sovereign agent deployments in Asia, the infrastructure and application layers of AI are evolving at a breakneck pace.
Let’s start with the most fascinating glimpse into our automated future: Andrej Karpathy’s Autoresearch. Karpathy has open-sourced an autonomous optimization loop where an AI agent iteratively modifies its own PyTorch training scripts. Operating with a strict five-minute compute budget per experiment, the agent generates hypotheses, edits code, runs the training, and evaluates validation loss. If the change improves performance, the agent commits the code. It is the scientific method running at machine speed—a profound paradigm shift where AI models actively improve themselves while researchers sleep.
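The propose, run, evaluate, commit cycle described above can be sketched in a few lines. This is a toy illustration, not Karpathy's code: `evaluate`, `propose`, and `research_loop` are hypothetical names, the "training run" is a made-up loss function over two settings, and a random perturbation stands in for the agent's hypothesis. A real loop would edit a training script and run it under the fixed time budget.

```python
import random


def evaluate(config):
    """Toy stand-in for 'train under a fixed budget, return validation loss'."""
    # Invented objective: lowest at lr=0.01 and width=256.
    return (config["lr"] - 0.01) ** 2 * 1e4 + (config["width"] - 256) ** 2 / 1e4


def propose(config, rng):
    """Toy stand-in for the agent's hypothesis: perturb a single setting."""
    candidate = dict(config)
    if rng.random() < 0.5:
        candidate["lr"] = max(1e-5, candidate["lr"] * rng.choice([0.5, 2.0]))
    else:
        candidate["width"] = max(16, candidate["width"] + rng.choice([-32, 32]))
    return candidate


def research_loop(initial, n_experiments, seed=0):
    """Run experiments one at a time, keeping only changes that lower the loss."""
    rng = random.Random(seed)
    best, best_loss = initial, evaluate(initial)
    history = [best_loss]
    for _ in range(n_experiments):
        candidate = propose(best, rng)
        loss = evaluate(candidate)
        if loss < best_loss:  # the "commit" step: accept improvements only
            best, best_loss = candidate, loss
        history.append(best_loss)
    return best, best_loss, history
```

Because a change is kept only when the loss drops, the best loss never gets worse from one experiment to the next, which is the property that lets such a loop run unattended.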
Anthropic is applying a similar multi-agent philosophy to software engineering with the launch of Claude Code Review. Moving far beyond traditional syntax linters, this system deploys multiple specialized Claude agents in parallel to analyze GitHub pull requests. They cross-check codebase logic, filter out false positives, and rank deep contextual bugs that rushed human reviewers routinely miss. With an astonishingly low false-positive rate and high internal adoption, Anthropic is proving that multi-agent collaboration is the new standard for enterprise code verification.
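As a rough picture of the pattern (parallel specialized reviewers, a verification pass that drops false positives, then ranking), here is a minimal sketch. It does not reflect how Claude Code Review is built: the reviewers are simple string checks rather than model calls, and every function name and severity score is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def security_reviewer(lines):
    """Hypothetical specialist: flags eval() calls."""
    return [{"line": i, "issue": "eval() call", "severity": 5}
            for i, text in enumerate(lines, 1) if "eval(" in text]


def logic_reviewer(lines):
    """Hypothetical specialist: flags '== None' comparisons."""
    return [{"line": i, "issue": "use 'is None'", "severity": 2}
            for i, text in enumerate(lines, 1) if "== None" in text]


def verify(finding, lines):
    """Second pass: discard findings that sit on comment-only lines."""
    return not lines[finding["line"] - 1].lstrip().startswith("#")


def review(lines, reviewers):
    """Run reviewers in parallel, filter false positives, rank by severity."""
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda reviewer: reviewer(lines), reviewers))
    findings = [f for batch in batches for f in batch]
    confirmed = [f for f in findings if verify(f, lines)]
    return sorted(confirmed, key=lambda f: f["severity"], reverse=True)
```

Given a change whose first line is a comment mentioning `eval(`, the security reviewer flags it, but the verification pass removes it before ranking.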
Meanwhile, a massive consumer and enterprise agent wave is sweeping China, driven by the open-source OpenClaw phenomenon. Dubbed “raising lobsters” by locals (a nod to the project’s crustacean mascot), OpenClaw allows users to run persistent, locally hosted AI agents capable of controlling operating systems and executing complex workflows. The frenzy is so intense that Alibaba just debuted “JVS Claw,” a mobile app designed to help non-coders install and deploy OpenClaw agents in minutes. Tech giants like Baidu and Tencent are also racing to provide OpenClaw cloud infrastructure, fueling a “one-person company” boom even as Beijing authorities scramble to address the inherent security risks of sovereign agents with deep system access.
But perhaps the clearest signal that the industry is looking beyond traditional LLMs to power these autonomous systems comes from AI pioneer Yann LeCun. The former Meta AI chief just secured a staggering $1.03 billion seed round—Europe’s largest ever—valuing his new Paris-based startup, AMI Labs, at $3.5 billion. Instead of chasing the next generative text model, AMI is doubling down on “world models.” Designed to learn abstract representations of physical reality, these systems aim to reason, plan, and understand cause and effect without the hallucinations that plague autoregressive models. Backed by heavyweights like Nvidia, Bezos Expeditions, and Temasek, LeCun’s massive bet highlights a fundamental shift from language prediction to true, grounded machine intelligence.
Of course, this explosion in autonomous agents and world models requires staggering amounts of compute. This week saw historic capital events in the AI infrastructure layer. London-based nScale secured a massive $2 billion Series C, catapulting its valuation to $14.6 billion as it scales its “Stargate Norway” GPU clusters. Simultaneously, Amsterdam’s Nebius is riding an incredible 700% ARR growth wave, aggressively raising capital and deploying gigawatt-scale AI factories equipped with next-generation NVIDIA Blackwell chips. The physical backbone of the AI revolution is attracting unprecedented capital.
Finally, on the model front, Google released Gemini Embedding 2. This is a massive leap forward for Retrieval-Augmented Generation (RAG). As Google’s first natively multimodal embedding model, it projects text, images, video, audio, and documents into a single, unified vector space. Developers can now process interleaved modalities in a single API call, fundamentally transforming how enterprise data is indexed and retrieved.
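To see why a single vector space matters for retrieval, consider this minimal sketch. It makes no calls to Google's API: the three-dimensional vectors are invented by hand, and `cosine` and `search` are hypothetical helpers. The point is only that once documents, videos, and images share one space, a single similarity ranking covers all of them.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hand-made vectors for illustration; a real index would store model outputs.
index = [
    {"id": "report.pdf", "modality": "document", "vector": [0.9, 0.1, 0.0]},
    {"id": "demo.mp4", "modality": "video", "vector": [0.1, 0.9, 0.1]},
    {"id": "chart.png", "modality": "image", "vector": [0.8, 0.2, 0.1]},
]


def search(query_vector, index, k=2):
    """Rank every item, regardless of modality, by similarity to the query."""
    ranked = sorted(index,
                    key=lambda item: cosine(query_vector, item["vector"]),
                    reverse=True)
    return [item["id"] for item in ranked[:k]]
```

With separate per-modality models, the same query would need one index per modality plus a way to merge scores that are not directly comparable.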
From autonomous coding loops to sovereign lobsters and gigawatt GPU factories, the AI stack is maturing across every layer.
🔎 AI Research
Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training
AI Lab: Meta Superintelligence Labs and Yale University
Summary: This paper investigates the effectiveness of using reasoning large language models as judges for reinforcement learning-based alignment in domains where output correctness cannot be directly verified. The authors discover that while reasoning judges outperform non-reasoning ones in preventing standard reward hacking, they inadvertently train policies to achieve high scores by generating sophisticated adversarial outputs that deceive evaluators.
AI Must Embrace Specialization via Superhuman Adaptable Intelligence
AI Lab: Columbia University, Distyl, and New York University
Summary: This paper argues that the pursuit of Artificial General Intelligence (AGI) is conceptually flawed because human intelligence is fundamentally specialized rather than universally general. Instead, the authors propose shifting the field’s focus toward Superhuman Adaptable Intelligence (SAI), which emphasizes an AI’s ability to rapidly learn and achieve superhuman performance on specific, high-utility tasks using self-supervised learning and world models.
RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation
AI Lab: Collaborative Research
Summary: The authors propose RbtAct, a method that leverages existing peer review rebuttals as implicit supervision to train large language models to generate more actionable, specific, and implementable review feedback. By fine-tuning and applying preference optimization on a newly curated dataset (RMR-75K) that maps reviews to author rebuttals, the resulting model significantly improves the practical usefulness of AI-generated peer reviews.
Code-Space Response Oracles: Generating Interpretable Multi-Agent Policies with Large Language Models
AI Lab: Google DeepMind
Summary: This paper introduces Code-Space Response Oracles (CSRO), a framework that replaces opaque deep reinforcement learning oracles in multi-agent systems with Large Language Models (LLMs) to generate human-readable policy code. By reframing best-response computation as a code generation task and utilizing techniques like zero-shot prompting and distributed evolution, CSRO achieves competitive game-theoretic equilibria while producing highly interpretable and explainable strategies.
AgentRx: Diagnosing AI Agent Failures from Execution Trajectories
AI Lab: Microsoft
Summary: The authors present AgentRx, an automated and domain-agnostic diagnostic framework designed to pinpoint critical failure steps in lengthy, multi-agent execution trajectories. By synthesizing and evaluating constraints step-by-step to produce an auditable validation log, the system allows an LLM-based judge to accurately localize root causes and assign failure categories across diverse tasks.
🤖 AI Tech Releases
Claude Code Review
Anthropic released Claude Code Review, an agentic system for code reviews.
Gemini Embedding 2
Google released Gemini Embedding 2, its first natively multimodal embedding model, spanning text, images, video, audio, and documents.
Nemotron 3 Super
NVIDIA released Nemotron 3 Super, a 120 billion parameter model optimized for agentic tasks.
Autoresearch
Andrej Karpathy open-sourced autoresearch, a self-contained agent loop that runs short training experiments autonomously.
📡 AI News You Need to Know About
Yann LeCun’s AMI Labs raises $1.03 billion to build world models
Advanced Machine Intelligence (AMI Labs), co-founded by Yann LeCun, raised $1.03 billion in seed funding to develop “world models” that learn abstract representations of physical reality rather than just predicting text.
Legora raises $550 million Series D to fuel US growth
Collaborative AI platform Legora raised $550 million in a Series D round led by Accel, tripling its valuation to $5.55 billion to accelerate its expansion and operations across the United States.
NVIDIA and Thinking Machines Lab Announce Long-Term Gigawatt-Scale Strategic Partnership
Thinking Machines Lab announced a multi-year strategic partnership to deploy at least one gigawatt of next-generation NVIDIA Vera Rubin systems, alongside a significant strategic investment from NVIDIA.
OpenAI to acquire Promptfoo
OpenAI has acquired AI security platform Promptfoo to integrate its vulnerability identification, red-teaming, and remediation tools directly into the OpenAI Frontier enterprise platform.
Replit snags $9B valuation 6 months after hitting $3B
Replit secured a $400 million Series D funding round, tripling its valuation to $9 billion as its AI-powered “vibe coding” platform sees massive enterprise adoption across Fortune 500 companies.
Shantanu Narayen Announces Decision to Transition as Adobe’s CEO Once Successor is Named
Adobe CEO Shantanu Narayen announced his decision to step down after 18 years of leading the company, though he will remain as Chair of the Board to support his eventual successor.
Lovable says it added $100M in revenue last month alone with just 146 employees
Natural-language app builder Lovable reportedly crossed $400 million in annual recurring revenue after adding an astonishing $100 million in ARR in just a single month using a lean team of under 150 employees.
Sandberg, Clegg join nScale board as this “Stargate Norway” startup hits $14.6B valuation
UK-based AI infrastructure company nScale raised $2 billion in a Series C round, hitting a $14.6 billion valuation and adding high-profile tech executives Sheryl Sandberg, Susan Decker, and Nick Clegg to its board.
Announcing Gumloop’s $50M Series B
AI automation platform Gumloop secured a $50 million Series B investment led by Benchmark to empower non-technical employees to easily build, share, and deploy sophisticated AI agents.
Wonderful Raises $150M Series B to Accelerate Enterprise AI Adoption in 30+ Markets
Enterprise AI platform Wonderful raised $150 million in a Series B round led by Insight Partners to scale its localized, physically-embedded deployment teams across more than 30 countries globally.
Meta Hires Duo Behind Moltbook
Meta has acquired Moltbook, a viral social network designed exclusively for AI agents to interact with one another, integrating its co-founders into Meta Superintelligence Labs to accelerate the company’s development of autonomous, task-executing AI.

