The Sequence Chat: Small Specialists vs. Large Generalist Models and What if NVIDIA Becomes Sun Microsystems

A controversial debate and a crazy thesis.

Nov 13, 2024
Created Using DALL-E

Massive transformer models have dominated the generative AI revolution over the last few years. The emergent properties that surface at scale in large foundation models are nothing short of magical. At the same time, running pretraining, fine-tuning, and inference workloads on large models is cost-prohibitive for most organizations. As a result, we have seen the emergence of smaller models that are more specialized in a given domain. The rise of smaller models also challenges the dependency on NVIDIA GPUs, as these models are perfectly able to run on commodity hardware.
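
To make the commodity-hardware point concrete, here is a minimal sketch of running a small open model on CPU only, using the Hugging Face transformers pipeline API. The specific model (Qwen/Qwen2.5-0.5B-Instruct), prompt, and generation settings are illustrative assumptions, not anything benchmarked in this post.

    # Minimal sketch: a small open model running on CPU-only commodity hardware.
    # Assumes the Hugging Face `transformers` library with PyTorch installed;
    # the model choice is an illustrative assumption, not one named in this post.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="Qwen/Qwen2.5-0.5B-Instruct",  # ~0.5B parameters, small enough for a laptop CPU
        device=-1,                           # -1 pins the pipeline to CPU; no NVIDIA GPU required
    )

    prompt = "Summarize the trade-off between small specialist and large generalist models."
    result = generator(prompt, max_new_tokens=120, do_sample=False)
    print(result[0]["generated_text"])

The same workload on a frontier-scale model would require a cluster of high-end GPUs, which is exactly the dependency the small-model trend puts in question.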

Today, I would like to explore two complex theses:

  1. The debate between small specialist and large generalist models.

  2. What happens to NVIDIA in a world in which new AI architectures do not depend on its GPU topologies?
