😶🌫️ Edge#225: Understanding Latent Diffusion Models
In this issue:
we explain latent diffusion models;
we discuss the original latent diffusion paper;
we explore Hugging Face Diffusers, a library for state-of-the-art diffusion models.
Enjoy the learning!
💡 ML Concept of the Day: Understanding Latent Diffusion Models
In the previous edition of our series about text-to-image synthesis (Edge#223), we explored the different types of diffusion techniques used by models in this space. Today, we would like to focus on one technique that has been gaining significant adoption in recent years. Latent diffusion has quickly emerged as one of the most viable options for deploying diffusion models in the real world without breaking the bank.
Diffusion models have achieved state-of-the-art performance in image synthesis. Their formulation allows guiding mechanisms to control the generation process without the need for retraining. Not surprisingly, diffusion models have become a favorite technique to combine with methods like CLIP for text-to-image generation. As often happens in machine learning (ML), methods with a robust technical foundation can run into practical problems. In the case of diffusion models, the challenge is that they operate directly in pixel space, which requires hundreds of GPU days of training and makes inference quite expensive as well.
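To make the cost gap concrete, here is a toy NumPy sketch of the idea behind latent diffusion: run the noising step on a compressed latent tensor rather than on raw pixels. The shapes are illustrative assumptions (an RGB 512×512 image and a 4×64×64 latent, similar to common Stable Diffusion defaults), not values from the original paper:

```python
import numpy as np

# Assumed, illustrative shapes: pixel space 3x512x512 vs. a 4x64x64 latent
# produced by an autoencoder with 8x spatial downsampling.
pixel_shape = (3, 512, 512)
latent_shape = (4, 64, 64)

pixel_elems = int(np.prod(pixel_shape))
latent_elems = int(np.prod(latent_shape))
reduction = pixel_elems // latent_elems  # how many fewer values to denoise

# One forward-diffusion (noising) step applied in latent space:
# z_t = sqrt(alpha_bar_t) * z_0 + sqrt(1 - alpha_bar_t) * eps
rng = np.random.default_rng(0)
z0 = rng.standard_normal(latent_shape)  # stand-in for the encoder's output
alpha_bar_t = 0.5                       # cumulative noise-schedule value (toy)
eps = rng.standard_normal(latent_shape)
zt = np.sqrt(alpha_bar_t) * z0 + np.sqrt(1.0 - alpha_bar_t) * eps

print(reduction)  # → 48
```

Under these assumed shapes, every denoising step touches 48x fewer values in latent space than in pixel space, which is the core of the efficiency argument.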
Latent diffusion models (LDMs) address this challenge by running the diffusion process in the lower-dimensional latent space of a pretrained autoencoder rather than in pixel space, which drastically reduces the cost of both training and inference while preserving image quality.