The Sequence Radar #783: SoftBank, DeepSeek, MiniMax and The Sequence 2026
A new DeepSeek paper everyone is talking about, SoftBank's major deals, and MiniMax's IPO.
Next Week in The Sequence:
We continue our series on synthetic data, exploring the potential of world models. We review the new DeepSeek paper everyone is talking about. In the opinion section, we discuss models vs. systems and how frontier models are absorbing many agentic capabilities.
Subscribe and don’t miss out:
📝 Editorial: SoftBank, DeepSeek, MiniMax and The Sequence 2026
As we cross the threshold into a new year, I am upholding my annual tradition of experimenting with the structure of The Sequence. Our “north star” remains unchanged: to provide content that is technical, opinionated, and distinct from the commodity news cycle found elsewhere. We do not focus on mainstream headlines; we focus on practical AI knowledge and original perspectives.
2025 was a pivotal year for my own journey in the trenches of the industry. It was marked by the acquisition of my small-model company, NeuralFabric, by Cisco; the merger and $25M fundraise of Sentora (focusing on crypto-AI financial services); and the rapid scaling of LayerLens in the AI evaluation space. These experiences—spanning edge computing, decentralized finance, and benchmarking—have heavily influenced my perspective on AI systems.
To reflect this, I am introducing several structural evolutions to the newsletter this year:
A New Interview Section: I have been spending time with extraordinary engineering teams and have lined up deep-dive conversations that we will begin publishing shortly.
Reimagined Knowledge Series: We observed that while our readers value the “concept of the day,” the accompanying research analysis was often too dense for a Tuesday read. We will now expand the conceptual deep-dives and reserve research analysis for more targeted updates.
Engineering First: Drawing from my recent development work, I will be introducing posts focused specifically on practical engineering lessons and best practices.
Timely, Flexible Coverage: The pace of innovation does not adhere to a rigid content calendar. We will adopt a more flexible structure to cover high-impact developments as they happen, rather than forcing them into pre-set slots.
With that framework in mind—prioritizing technical rigor and market reality—let’s examine some of the AI events that defined the transition from 2025 to 2026.
The first week of 2026 has set a definitive tone for the year ahead, characterized by a widening schism in the ecosystem. On one side, we witness the brute-force consolidation of infrastructure by incumbents; on the other, a relentless drive for algorithmic efficiency by challengers. The events of late December—specifically SoftBank’s infrastructure acquisitions, DeepSeek’s latest architectural breakthroughs, and MiniMax’s public market debut—illustrate this bifurcation perfectly.
SoftBank Group has effectively declared the era of “Artificial Super Intelligence” (ASI) open, backing its vision with the largest capital deployment in the industry’s history. Masayoshi Son has executed a two-pronged strategy to corner the market on both intelligence and the hardware that powers it. First, SoftBank completed its staggering $40 billion investment commitment to OpenAI on December 26. But the more telling move was the acquisition of DigitalBridge for $4 billion on December 29. By purchasing DigitalBridge, SoftBank has moved beyond merely funding models to owning the physical “ground” they run on—data centers and edge infrastructure. This vertical integration suggests a future where the distinction between the model provider and the infrastructure provider becomes increasingly blurred.
While SoftBank attempts to conquer AGI through capital dominance, Chinese lab DeepSeek continues to demonstrate that smarter math can rival larger budgets. In the final days of 2025, DeepSeek released research detailing DeepSeek-V3.2, a refinement of their Mixture-of-Experts (MoE) architecture. The core breakthrough is Group Relative Policy Optimization (GRPO). This technique removes the need for a resource-heavy “critic” model during the reinforcement learning phase, allowing the AI to self-evaluate group outputs. DeepSeek has managed to match the reasoning capabilities of leading Western models like GPT-5 and Gemini 3.0 Pro, but with a training efficiency that requires a fraction of the GPU hours. This proves that while capital is a moat, algorithmic innovation acts as a bridge.
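The critic-free trick at the heart of GRPO is easy to sketch: instead of a learned value model estimating a baseline, each prompt's sampled group of responses serves as its own baseline. The helper below is an illustrative reconstruction of that normalization step, not DeepSeek's actual code.

```python
import statistics

def grpo_advantages(group_rewards):
    """Group-relative advantages, GRPO-style.

    A reward model scores G sampled responses to the same prompt;
    each reward is normalized against the group's own mean and
    standard deviation, so no separate critic network is needed.
    """
    mean = statistics.mean(group_rewards)
    std = statistics.pstdev(group_rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in group_rewards]

# Four sampled answers to one prompt, scored by a reward model.
rewards = [0.2, 0.9, 0.5, 0.4]
advs = grpo_advantages(rewards)
```

By construction the advantages sum to zero: above-average responses are reinforced, below-average ones are penalized, and the expensive critic forward passes disappear from the training loop.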
Finally, the ecosystem saw a major test of market reality. MiniMax, the Shanghai-based unicorn known for character-based interactions and video generation, filed for an IPO on the Hong Kong Stock Exchange on December 31. Targeting a raise of roughly $539 million, this IPO is a watershed moment for the “Application Layer” of AI. Unlike the foundational model wars, MiniMax is testing the public market’s appetite for applied AI. By securing public capital, MiniMax ensures it has the “dry powder” to navigate tightening hardware export controls and fierce domestic competition.
2026 has begun not with a whimper, but with the roar of engines firing on three distinct tracks: SoftBank’s infinite capital, DeepSeek’s mathematical precision, and MiniMax’s commercial validation.
🔎 AI Research
Training AI Co-Scientists Using Rubric Rewards
AI Lab: Meta Superintelligence Labs
Summary: This paper introduces a method to train AI “co-scientists” by extracting research goals and goal-specific grading rubrics from existing scientific papers to serve as reinforcement learning rewards. By using a frozen policy as a grader to verify generated plans against these rubrics, the approach significantly improves the model’s ability to generate rigorous, constraint-satisfying research plans across diverse domains like machine learning and medicine.
Web World Models
AI Lab: Princeton University
Summary: The authors propose “Web World Models,” a hybrid architecture that uses standard web code to define deterministic “physics” and logic while leveraging Large Language Models to generate creative content and narratives on demand. This approach enables the creation of scalable, persistent, and controllable environments—such as infinite travel atlases or galaxy simulations—that bridge the gap between static web frameworks and fully generative world models.
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
AI Lab: Google DeepMind
Summary: This research presents RISE, an unsupervised framework that uses sparse auto-encoders to identify “reasoning vectors” within a model’s activation space, corresponding to interpretable behaviors like reflection and backtracking. By manipulating these vectors during inference, the authors demonstrate the ability to controllably steer a model’s reasoning process and confidence levels without the need for supervised training data.
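The steering mechanism the paper describes amounts to adding a scaled direction to a hidden state at inference time. The snippet below is a generic activation-steering sketch under that assumption; the vector names and scale are hypothetical, not taken from RISE.

```python
import numpy as np

def steer(hidden, vector, alpha=2.0):
    """Inject a unit-norm 'reasoning vector' into a hidden state.

    Vectors discovered by a sparse autoencoder are added to the
    residual-stream activation; alpha controls how strongly the
    associated behavior (e.g. reflection) is induced.
    """
    v = vector / np.linalg.norm(vector)
    return hidden + alpha * v

rng = np.random.default_rng(0)
h = rng.normal(size=8)        # stand-in for one residual-stream activation
reflect = rng.normal(size=8)  # stand-in for a discovered reasoning vector
steered = steer(h, reflect, alpha=2.0)
```

In a real model the addition would happen inside a forward hook at a chosen layer; the arithmetic, however, is exactly this simple, which is why inference-time steering needs no supervised training.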
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation
AI Lab: Meta Superintelligence Labs
Summary: To address the lack of physical realism in AI-generated videos, this paper introduces a data construction pipeline called PhyAugPipe to collect physics-rich training examples and a new optimization framework, PhyGDPO, that incorporates physics-aware rewards. The method utilizes a vision-language model to guide preference optimization, significantly improving the physical consistency and plausibility of generated videos compared to state-of-the-art models.
mHC: Manifold-Constrained Hyper-Connections
AI Lab: DeepSeek-AI
Summary: This work proposes Manifold-Constrained Hyper-Connections (mHC), a framework that projects the expanded residual connection space of Hyper-Connections onto a specific manifold to restore the critical identity mapping property. By combining this topological constraint with efficient infrastructure optimizations like kernel fusion, mHC achieves superior scalability and training stability for large language models while retaining performance gains.
SurgWorld: Learning Surgical Robot Policies from Videos via World Modeling
AI Lab: NVIDIA
Summary: This paper introduces SurgWorld, a domain-specific world model trained on a newly curated dataset of expert-annotated surgical videos to generate photorealistic scenes and pseudo-kinematics. By synthesizing paired video-action data to augment scarce real-world demonstrations, the framework significantly improves the performance and generalization of robotic policies for tasks like surgical needle manipulation.
🤖 AI Tech Releases
Qwen-Image-2512
Alibaba Qwen open sourced Qwen-Image-2512, a new version of its text-to-image model.
📡 AI Radar
OpenAI is reportedly shifting its product strategy to prioritize voice-first interactions, a notable departure from the industry's screen-based interfaces.
Manus has announced it is joining Meta to help accelerate the development of next-generation immersive hardware and AI agents.
The Indian government has sanctioned a $4.6 billion investment package aimed at strengthening the country’s domestic electronics manufacturing ecosystem.
Elon Musk’s xAI is reportedly preparing to significantly increase the compute capacity of its Colossus data center.
SoftBank Group has entered into a definitive agreement to acquire the digital infrastructure firm DigitalBridge for approximately $4 billion.
US regulators have renewed TSMC’s annual license, allowing the chipmaker to continue exporting equipment to its manufacturing facilities in China.
Chinese artificial intelligence startup MiniMax is planning to list on the Hong Kong Stock Exchange with an IPO target of roughly $539 million.
SoftBank has finalized the funding for its massive $40 billion investment into OpenAI, solidifying its stake in the AI giant.