TheSequence
Edge 427: Jamba Combines SSMs, Transformers and MOEs in a Single Model

Can a hybrid design outperform each of its baseline architectures?

Sep 03, 2024
Created Using Ideogram

In this issue:

  1. An overview of the concepts behind the Jamba model, which combines transformers and SSMs.

  2. A review of the original Jamba paper by AI21 Labs.

  3. A walkthrough of DeepEval, a framework for LLM evaluation.

💡 ML Concept of the Day: Understanding Jamba

State Space Models (SSMs) are regularly positioned as an alternative to transformer models, but that doesn't have to be the case. This is the thesis behind Jamba, a new model released by the ambitious team at AI21 Labs. Jamba combines transformers, SSMs, and mixture-of-experts (MoE) layers in a single architecture that could open new avenues for the future of LLMs, and it aims to address some of the limitations of SSM performance relative to transformers in many traditional scenarios.
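To make the hybrid idea concrete, here is a minimal PyTorch sketch of a stack that interleaves an SSM-style recurrent block, a self-attention block, and a routed mixture-of-experts feed-forward layer. The layer ordering, dimensions, toy diagonal recurrence, and top-1 routing are illustrative assumptions for exposition, not AI21's actual Jamba configuration.

```python
# Illustrative hybrid stack: SSM block + attention block, each followed by an MoE MLP.
# All hyperparameters and the simplified SSM recurrence are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleSSMBlock(nn.Module):
    """Toy state-space layer: per-channel linear recurrence h_t = a * h_{t-1} + b * x_t."""
    def __init__(self, d_model):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.a = nn.Parameter(torch.full((d_model,), 0.9))  # per-channel decay
        self.b = nn.Parameter(torch.ones(d_model))
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):                       # x: (batch, seq, d_model)
        u = self.norm(x)
        h = torch.zeros_like(u[:, 0])
        outs = []
        for t in range(u.size(1)):               # sequential scan, linear in sequence length
            h = self.a * h + self.b * u[:, t]
            outs.append(h)
        return x + self.out(torch.stack(outs, dim=1))


class AttentionBlock(nn.Module):
    """Standard pre-norm self-attention layer."""
    def __init__(self, d_model, n_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        u = self.norm(x)
        y, _ = self.attn(u, u, u, need_weights=False)
        return x + y


class MoEFeedForward(nn.Module):
    """Top-1 routed mixture-of-experts MLP (a simplification of top-k routing)."""
    def __init__(self, d_model, n_experts=4, d_ff=256):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        u = self.norm(x)
        gates = F.softmax(self.router(u), dim=-1)   # (batch, seq, n_experts)
        weight, idx = gates.max(dim=-1)             # pick one expert per token
        y = torch.zeros_like(u)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                y[mask] = expert(u[mask]) * weight[mask].unsqueeze(-1)
        return x + y


class HybridStack(nn.Module):
    """Interleaves SSM and attention blocks, each followed by an MoE feed-forward layer."""
    def __init__(self, d_model=64, n_pairs=2):
        super().__init__()
        layers = []
        for _ in range(n_pairs):
            layers += [SimpleSSMBlock(d_model), MoEFeedForward(d_model),
                       AttentionBlock(d_model), MoEFeedForward(d_model)]
        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return self.layers(x)


if __name__ == "__main__":
    model = HybridStack()
    tokens = torch.randn(2, 16, 64)                 # (batch, seq, d_model)
    print(model(tokens).shape)                       # torch.Size([2, 16, 64])
```

The point of the interleaving is the trade-off: the SSM blocks keep per-token cost and memory low over long sequences, the occasional attention blocks recover the in-context precision transformers are known for, and the MoE layers add capacity without increasing the compute per token.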

This post is for paid subscribers
