TheSequence

Edge 264: Inside Muse: Google’s New Text-to-Image Super Model

The new generative AI model shows significant efficiency improvements over models like Stable Diffusion, Imagen and Parti.

Jan 26
Created Using: Stable Diffusion

Text-to-image (TTI) models have been at the center of the generative AI revolution, with models such as DALL-E, Stable Diffusion, and Midjourney capturing the headlines. This explosion in high-quality TTI models has been fundamentally powered by diffusion and autoregressive methods that can effectively learn the correspondence between text and images. The nascent nature of these architectures makes them computationally prohibitive, and there is still a lot of work to be done to improve their efficiency and cost. Recently, Google unveiled Muse, a TTI model that achieves state-of-the-art image quality while remaining more efficient than diffusion and autoregressive models.
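
As a rough illustration of that cost, the sketch below (ours, not from the post) runs Stable Diffusion through the Hugging Face diffusers library: every one of the `num_inference_steps` denoising steps is a full sequential pass through the U-Net, which is exactly the expense Muse aims to cut.

```python
# A minimal sketch of diffusion-based text-to-image generation using the
# Hugging Face diffusers library (an illustration, not code from the post).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Each inference step is a full U-Net forward pass, so sampling cost grows
# linearly with num_inference_steps (typically 25-50 for good quality).
image = pipe(
    "a photorealistic portrait of an astronaut riding a horse",
    num_inference_steps=50,
).images[0]
image.save("astronaut.png")
```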

Muse follows Google’s active work in TTI, which includes diffusion models such as Imagen and autoregressive models like Parti. Muse builds on the lessons learned from building both architectures to improve computational efficiency while achieving the same level of image quality.
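
For background on where those efficiency gains come from: the Muse paper describes generating images as grids of discrete VQ tokens and decoding them in parallel over a fixed, small number of steps, in the style of MaskGIT, rather than running hundreds of sequential denoising or token-by-token steps. The toy sketch below illustrates only that decoding loop; every name in it is hypothetical, and the dummy predictor stands in for Muse's text-conditioned transformer.

```python
import numpy as np

# Toy sketch of MaskGIT-style parallel decoding, the mechanism behind
# Muse's sampling efficiency. All names are hypothetical; a real model
# predicts VQGAN token IDs with a transformer conditioned on text.

VOCAB = 1024          # size of the discrete image-token codebook
NUM_TOKENS = 256      # e.g. a 16x16 grid of image tokens
STEPS = 12            # a few dozen parallel steps, not hundreds

rng = np.random.default_rng(0)

def dummy_predictor(tokens, mask):
    """Stand-in for the transformer: per-token logits over the codebook."""
    return rng.normal(size=(NUM_TOKENS, VOCAB))

tokens = np.zeros(NUM_TOKENS, dtype=np.int64)
mask = np.ones(NUM_TOKENS, dtype=bool)   # start fully masked

for step in range(STEPS):
    logits = dummy_predictor(tokens, mask)
    probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
    pred = probs.argmax(-1)              # predict ALL tokens in parallel
    conf = probs.max(-1)

    # Cosine schedule: the number of tokens left masked shrinks each step.
    keep_masked = int(NUM_TOKENS * np.cos((step + 1) / STEPS * np.pi / 2))

    # Commit the most confident predictions among still-masked positions;
    # the rest stay masked and are re-predicted next step.
    conf = np.where(mask, conf, -np.inf)
    order = np.argsort(-conf)            # highest confidence first
    to_fill = order[: mask.sum() - keep_masked]
    tokens[to_fill] = pred[to_fill]
    mask[to_fill] = False

print(f"decoded {NUM_TOKENS} tokens in {STEPS} parallel steps; "
      f"{mask.sum()} still masked")
```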
