TheSequence

Edge 435: Learn About Hungry Hungry Hippos and SSMs

One of the most important layers for state space models.

Oct 01, 2024


In this issue:

  1. An overview of Hungry Hungry Hippos (H3).

  2. A review of the original H3 paper.

  3. An introduction to Character.ai’s PromptPoet framework.

💡 ML Concept of the Day: Hungry Hungry Hippos (H3), an Important Layer for SSMs

State space models (SSMs) have long been a fundamental concept in signal processing. Recent work has demonstrated their effectiveness as sequence models, particularly in capturing long-range dependencies. They have set new performance standards across various benchmarks, such as the Long-Range Arena (LRA), and have shown impressive results in tasks like speech generation. Despite these successes, SSMs have historically fallen short compared to attention mechanisms in language modeling tasks.
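For readers who want the mechanics, a discrete SSM maps an input sequence to an output sequence through a recurrent hidden state: x_k = A·x_{k-1} + B·u_k, with readout y_k = C·x_k. Below is a minimal NumPy sketch of that recurrence; the specific matrices, dimensions, and values are illustrative assumptions, not a trained model:

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Minimal discrete SSM: x_k = A x_{k-1} + B u_k, y_k = C x_k."""
    d = A.shape[0]
    x = np.zeros(d)          # hidden state
    ys = []
    for u_k in u:
        x = A @ x + B * u_k  # state update (B: (d,), u_k scalar)
        ys.append(C @ x)     # linear readout (C: (d,))
    return np.array(ys)

# Toy usage: a 4-dimensional state on a random length-16 input.
rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)          # stable, diagonal transition
B = rng.standard_normal(4)
C = rng.standard_normal(4)
y = ssm_scan(A, B, C, rng.standard_normal(16))
print(y.shape)  # (16,)
```

Because the state update is linear, the whole scan can also be computed as a convolution, which is what makes SSMs attractive for long sequences.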

The obvious question then is: how can this performance gap be closed? A novel layer called H3, short for “Hungry Hungry Hippos,” was developed by researchers at Stanford University specifically to tackle associative recall challenges. Replacing nearly all of the attention layers in GPT-style transformers with H3 yields comparable or superior quality.
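At a high level, H3 combines two SSMs with multiplicative interactions between query-, key-, and value-like projections of the input. The sketch below is a simplified reading of that structure, not the paper's full implementation (which uses learned shift projections, FFT-based convolutions, and multiple heads); every weight and decay value here is an illustrative assumption:

```python
import numpy as np

def shift_ssm(x):
    """Shift SSM, reduced to its simplest instance: a one-step delay
    (output at step t is the input at step t-1)."""
    out = np.zeros_like(x)
    out[1:] = x[:-1]
    return out

def diag_ssm(x, a):
    """Diagonal SSM per channel: h_t = a * h_{t-1} + x_t (an EMA-style scan)."""
    h = np.zeros(x.shape[1])
    out = np.empty_like(x)
    for t in range(x.shape[0]):
        h = a * h + x[t]
        out[t] = h
    return out

def h3_layer(u, Wq, Wk, Wv, a):
    """H3-style layer: Q * diag_SSM(shift_SSM(K) * V), elementwise products.
    The shift SSM lets K 'remember' the previous token; multiplying by V and
    running a diagonal SSM accumulates key-value pairs over time; multiplying
    by Q retrieves them -- the mechanism behind associative recall."""
    Q, K, V = u @ Wq, u @ Wk, u @ Wv
    return Q * diag_ssm(shift_ssm(K) * V, a)

# Toy usage: sequence length 8, model width 16.
rng = np.random.default_rng(0)
u = rng.standard_normal((8, 16))
Wq, Wk, Wv = (rng.standard_normal((16, 16)) * 0.1 for _ in range(3))
a = np.full(16, 0.9)  # per-channel decay (illustrative values)
y = h3_layer(u, Wq, Wk, Wv, a)
print(y.shape)  # (8, 16)
```

The multiplicative Q/K/V structure is what lets the layer mimic attention's ability to compare and retrieve tokens, while both inner operations remain SSM scans rather than quadratic attention.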
