TheSequence

The Sequence Opinion #778: After Scaling: The Era of Research and New Recipes for Frontier AI

Some ideas about new techniques that could unlock the next wave of innovation in frontier models.

Dec 25, 2025

For the last few years, AI progress has felt almost… procedural. Take a transformer. Pour in internet-scale text. Add a mountain of GPUs. Train until the loss curve politely bends. Then do post-training—RLHF, preference tuning, tool-use fine-tuning—until the thing behaves. If you did this with enough care (and enough capex), capability would arrive on schedule.
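
As a toy illustration of that "standard recipe," here is a minimal sketch of the two stages in PyTorch: next-token pretraining followed by a DPO-style preference-tuning pass. Everything in it (the TinyLM module, the toy vocabulary and context sizes, the random tensors standing in for text and preference pairs) is a hypothetical stand-in for illustration, not any lab's actual pipeline.

```python
# A minimal sketch of the "standard recipe": (1) next-token pretraining,
# (2) preference-based post-training. Sizes and data are toy stand-ins.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, CTX = 1000, 64, 32  # toy sizes; real runs are orders of magnitude larger


class TinyLM(nn.Module):
    """A tiny causal language model standing in for 'a transformer'."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.encoder(self.embed(tokens), mask=mask)
        return self.head(h)  # (batch, seq, vocab) logits


def sequence_logprob(model, tokens):
    """Sum of log-probabilities the model assigns to a token sequence."""
    logits = model(tokens[:, :-1])
    logp = F.log_softmax(logits, dim=-1)
    return logp.gather(-1, tokens[:, 1:].unsqueeze(-1)).squeeze(-1).sum(-1)


model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Stage 1: pretraining -- next-token cross-entropy on (here, random) text.
for _ in range(10):
    batch = torch.randint(0, VOCAB, (8, CTX))  # stand-in for internet-scale text
    logits = model(batch[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: post-training -- a DPO-style preference loss against a frozen reference.
ref = copy.deepcopy(model).eval()
beta = 0.1
for _ in range(10):
    chosen = torch.randint(0, VOCAB, (8, CTX))    # stand-in for preferred responses
    rejected = torch.randint(0, VOCAB, (8, CTX))  # stand-in for dispreferred ones
    with torch.no_grad():
        ref_c, ref_r = sequence_logprob(ref, chosen), sequence_logprob(ref, rejected)
    pol_c, pol_r = sequence_logprob(model, chosen), sequence_logprob(model, rejected)
    loss = -F.logsigmoid(beta * ((pol_c - ref_c) - (pol_r - ref_r))).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```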

That mood is what Ilya Sutskever calls the “age of scaling”—a period where one word (“scaling”) basically told an entire industry what to do next. And his claim, in the Dwarkesh podcast, is that we’re exiting that era and returning to something messier, more eclectic, and ultimately more interesting: an “age of research again, just with big computers.”

The reason is not that scaling “stops working” overnight. It’s that the cleanest axis—pretraining on ever more data—runs into a very physical ceiling: the data is finite. So the question becomes: when the easy recipe is exhausted, what are the new recipes? What are the techniques that convert compute into genuine generalization—models that learn faster, adapt better, and make fewer weird mistakes?

Below is a map of the most promising technique-clusters that could plausibly unlock the next wave of frontier innovation. It’s not a list of “one weird trick.” It’s more like a toolbox for the post-pretraining world.

1) “Souped-up pretraining”: same idea, different physics
