Edge 435: Learn About Hungry Hungry Hippos and SSMs
An exploration of one of the most important layers in state space models.
In this issue:
An overview of Hungry Hungry Hippos (H3).
A review of the original H3 paper.
An introduction to Character.ai’s PromptPoet framework.
💡 ML Concept of the Day: Hungry Hungry Hippos (H3), an Important Layer for SSMs
State space models (SSMs) have long been a fundamental concept in signal processing. Recent work has demonstrated their effectiveness as sequence models, particularly in capturing long-range dependencies. They have set new performance standards across various benchmarks, such as the Long-Range Arena (LRA), and have shown impressive results in tasks like speech generation. Despite these successes, SSMs have historically fallen short compared to attention mechanisms in language modeling tasks.
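At their core, SSMs process a sequence through a linear recurrence: a hidden state is updated at each step from the previous state and the current input, and an output is read off the state. A minimal sketch of this recurrence, using a hypothetical scalar state and made-up coefficients purely for illustration:

```python
# Minimal discrete state space model (SSM) recurrence (illustrative only).
# Scalar case: x_t = a * x_{t-1} + b * u_t, y_t = c * x_t.
# a, b, c are stand-in coefficients; real SSMs learn matrices A, B, C.
def ssm_scan(u, a=0.9, b=1.0, c=1.0):
    """Run a 1-D SSM over the input sequence u and return the outputs."""
    x, ys = 0.0, []
    for u_t in u:
        x = a * x + b * u_t   # state update: decay old state, mix in input
        ys.append(c * x)      # linear readout of the state
    return ys

print(ssm_scan([1.0, 0.0, 0.0], a=0.5))  # impulse response decays by a each step
```

Because the recurrence is linear, the state carries an exponentially weighted summary of the whole history, which is what lets SSMs capture long-range dependencies efficiently.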
The obvious question is how to close this performance gap. Researchers at Stanford University developed a novel layer called H3, short for “Hungry Hungry Hippos,” specifically to tackle the associative recall tasks where SSMs lag behind attention. Replacing nearly all of the attention layers in GPT-style transformers with H3 yields comparable or superior quality.
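The H3 paper describes the layer as combining two SSMs, a shift SSM applied to the key projection and a diagonal SSM applied to the gated key-value product, with multiplicative interactions against the query. The sketch below shows this data flow on 1-D sequences; the scalar shapes, the decay constant `a`, and the function names are simplifications for illustration, since the real layer uses learned projections and multi-dimensional states:

```python
# Hedged sketch of H3's data flow: out = Q ⊙ diag_SSM( shift_SSM(K) ⊙ V ).
# Sequences are plain lists of floats here; real H3 operates on vectors.

def shift_ssm(k):
    # Shift SSM: the state simply holds the previous input, so the output
    # at step t is the input at step t-1 (a one-step delay line).
    return [0.0] + k[:-1]

def diag_ssm(x, a=0.9):
    # Diagonal SSM: an exponentially decaying running sum over the sequence,
    # acting as a soft memory of past (key, value) products.
    s, out = 0.0, []
    for x_t in x:
        s = a * s + x_t
        out.append(s)
    return out

def h3_layer(q, k, v, a=0.9):
    # Multiplicative gating: the shifted key gates the value before it is
    # written to memory, and the query gates what is read back out. This is
    # what lets the layer perform associative recall.
    kv = [ks * vs for ks, vs in zip(shift_ssm(k), v)]
    return [qs * ms for qs, ms in zip(q, diag_ssm(kv, a))]
```

The two gating steps mirror what attention does in recall tasks: store a token, then retrieve it later when a matching query appears, but with recurrent state instead of a quadratic attention matrix.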