TheSequence

The Sequence Chat: Why Transformers are the Best Thing that Ever Happened to NVIDIA
A discussion of some controversial and original ideas in AI.

Oct 21, 2024 ∙ Paid
Created Using DALL-E

I wanted to devote some installments of The Sequence to outlining reflections on several controversial ideas in AI. After all, one of the rarest things to find in today’s market, plagued with hundreds of AI newsletters, is a publication that discusses original ideas. I think this section will be a nice complement to our interview series and, if nothing else, might force you to think about these topics even if you disagree with my opinion 😉

Today, I would like to start with a simple but controversial thesis that I was discussing with some of my students recently: the transformer architecture used in foundation models is, arguably, the best thing that ever happened to NVIDIA.

Have you ever heard the claim that the only company turning a real profit in AI is NVIDIA? Well, transformers have a lot to do with that. The main reasons are both technical and market-related:

1. Technical: The transformer architecture is the first model in which knowledge scales with pre- and post-training data without clear limits.

2. Market: The fact that transformers have become the dominant AI paradigm has given NVIDIA time to optimize its hardware for that architecture.

3. Scale: Past a certain scale, virtually all transformer models are trained and served on NVIDIA GPUs.
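The first point is the engine behind the other two, and it can be made concrete with scaling laws. As a hedged sketch, the snippet below uses the parametric loss form and fitted constants from the Chinchilla paper (Hoffmann et al., 2022); the exact numbers are an illustrative assumption, not something claimed in this article. The point it demonstrates: growing either parameters or training tokens keeps lowering predicted loss, with no hard ceiling short of the irreducible term, which is precisely what keeps demand for GPUs growing.

```python
# Sketch of transformer scaling (assumption: Chinchilla parametric fit,
# Hoffmann et al. 2022). Constants are illustrative, not from this article.

def transformer_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pre-training loss for a transformer with n_params
    parameters trained on n_tokens tokens."""
    E, A, B = 1.69, 406.4, 410.7   # irreducible loss and fitted scale terms
    alpha, beta = 0.34, 0.28       # fitted exponents for params and data
    return E + A / n_params**alpha + B / n_tokens**beta

# Scaling up both axes lowers predicted loss; there is no sharp cutoff.
small = transformer_loss(1e9, 2e10)     # ~1B params, 20B tokens
large = transformer_loss(7e10, 1.4e12)  # ~70B params, 1.4T tokens
assert large < small
```

Under this fit, the only floor is the irreducible term E, so every generation of larger models justifies another generation of hardware.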

Let’s dive into these points:

This post is for paid subscribers.
© 2025 Jesus Rodriguez