TheSequence

Edge 441: SSMs Beyond Language

Oct 22, 2024
Created Using Midjourney

In this issue:

  1. Exploring SSMs for non-language modalities.

  2. Meta AI research on SSMs for speech recognition.

  3. The Llama-Factory framework for pretraining LLMs.

💡 ML Concept of the Day: SSMs Beyond Language

Throughout this series, we have explored the fundamentals and latest research in state space models (SSMs) as one of the main alternatives to transformer architectures. SSMs scale more efficiently with sequence length than transformers, which makes them well suited to models with large context windows. Given the state of the market, the core focus of SSM work has been on LLMs but, surprisingly, some of the core applications of SSMs are surfacing in other modalities.
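
To make the scaling point concrete, here is a minimal sketch of the discrete linear recurrence at the heart of an SSM, written with NumPy. It is an illustration only and not the implementation of any model mentioned in this issue: the function name ssm_scan and the toy matrices A, B, and C are assumptions chosen for the example. Each step updates a fixed-size state, so the cost grows linearly with the length of the input.

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Run a discrete linear SSM over a 1-D input sequence u.

    x[t] = A @ x[t-1] + B * u[t]   (state update)
    y[t] = C @ x[t]                (readout)
    """
    x = np.zeros(A.shape[0])       # hidden state, fixed size regardless of len(u)
    ys = []
    for u_t in u:                  # one constant-cost step per input sample
        x = A @ x + B * u_t
        ys.append(C @ x)
    return np.array(ys)

# Toy parameters (hypothetical values, chosen only for illustration).
A = np.array([[0.9, 0.0],
              [0.0, 0.5]])         # diagonal, stable state transition
B = np.array([1.0, 1.0])
C = np.array([1.0, 1.0])

# Feed a unit impulse and observe the decaying response.
u = np.array([1.0, 0.0, 0.0, 0.0])
y = ssm_scan(A, B, C, u)
print(y)
```

The state x never grows with the sequence, which is the property that makes long inputs such as raw audio tractable; production SSMs replace this Python loop with parallel scans or convolutions.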

Take audio, for instance. SSMs have emerged as one of the most efficient techniques in this modality given how well they process continuous, irregular data. Models like AudioMamba, RawMamba, and some of the work done by Cartesia are great examples of SSMs applied to audio.

© 2025 Jesus Rodriguez