Edge 427: Jamba Combines SSMs, Transformers and MoEs in a Single Model
Can a hybrid design outperform each one of the baseline architectures?
In this issue:
An overview of the concepts in the Jamba model that combines transformers and SSMs.
A review of the original Jamba paper by AI21 Labs.
A walkthrough of DeepEval, a framework for LLM evaluation.
💡 ML Concept of the Day: Understanding Jamba
State Space Models (SSMs) are regularly positioned as an alternative to transformer models, but that doesn't have to be the case. This was the thesis behind Jamba, a new model released by the ambitious team at AI21 Labs. Jamba combines transformers and SSMs in a single architecture that could open new avenues for the future of LLMs. In doing so, Jamba tries to address some of the limitations of SSMs' performance relative to transformers in many traditional scenarios.
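To make the hybrid idea concrete, here is a minimal, conceptual sketch (not AI21's implementation) of a decoder stack that interleaves SSM-style layers with periodic attention layers and swaps some MLPs for a small mixture-of-experts. The layer ratios, expert count, and the GRU stand-in for a real selective SSM are all illustrative assumptions.

```python
# Conceptual sketch of a hybrid SSM + attention + MoE stack.
# Not Jamba's actual code; ratios and the GRU-as-SSM placeholder are assumptions.
import torch
import torch.nn as nn


class SSMBlock(nn.Module):
    """Stand-in for a Mamba/SSM layer; a real model would use a selective SSM."""
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)  # placeholder recurrence

    def forward(self, x):
        out, _ = self.rnn(self.norm(x))
        return x + out


class AttentionBlock(nn.Module):
    """Standard pre-norm self-attention block."""
    def __init__(self, dim, heads=8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out


class MoEMLP(nn.Module):
    """Token-level top-1 routing over a few expert MLPs (simplified)."""
    def __init__(self, dim, n_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x):
        scores = self.router(x).softmax(dim=-1)   # (batch, seq, experts)
        idx = scores.argmax(dim=-1)               # chosen expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = expert(x[mask])
        return x + out


class HybridStack(nn.Module):
    """Mostly SSM layers, with periodic attention layers and MoE MLPs."""
    def __init__(self, dim, n_layers=8, attn_every=4, moe_every=2):
        super().__init__()
        layers = []
        for i in range(n_layers):
            layers.append(AttentionBlock(dim) if i % attn_every == 0 else SSMBlock(dim))
            if i % moe_every == 0:
                layers.append(MoEMLP(dim))
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


if __name__ == "__main__":
    x = torch.randn(2, 16, 64)              # (batch, seq_len, dim)
    print(HybridStack(dim=64)(x).shape)     # torch.Size([2, 16, 64])
```

The intuition the sketch captures is that most layers can be cheap, sequence-length-friendly SSM blocks, while occasional attention layers restore the in-context recall transformers are good at, and MoE MLPs add capacity without proportional compute per token.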