🌅 Edge#229: VQGAN + CLIP

+the original VQGAN+CLIP paper; +VQGAN+CLIP implementations

Sep 27, 2022
In this issue:

  • we introduce the VQGAN+CLIP architecture;

  • we discuss the original VQGAN+CLIP paper;

  • we explore VQGAN+CLIP implementations.

Enjoy the learning!  


💡 ML Concept of the Day: VQGAN + CLIP 

In Edge#219, we introduced OpenAI’s CLIP as a method able to represent both text and images in the same feature space. OpenAI published CLIP and DALL-E together but open-sourced only the former. Almost immediately after CLIP was made available, AI researchers started exploring the idea of combining it with generative adversarial networks (GANs). That line of work produced one of the most popular text-to-image synthesis techniques in the current ecosystem: VQGAN+CLIP.

The idea of combining CLIP and GANs is very intuitive. CLIP can score how well an image matches a text prompt, so that score can serve as a guidance signal: the generator’s latent input is iteratively adjusted until the decoded image aligns with the prompt.
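To make that guidance loop concrete, here is a minimal sketch of CLIP-guided optimization, the core mechanism behind VQGAN+CLIP. It uses OpenAI’s open-sourced clip package; the decode function is a hypothetical stand-in for a pretrained VQGAN decoder (here it simply squashes the latent into pixels so the sketch runs end to end), and the latent shape, prompt, and hyperparameters are illustrative, not taken from the original paper.

```python
import torch
import clip  # OpenAI's CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float().eval()  # fp32 keeps the sketch simple
for p in clip_model.parameters():
    p.requires_grad_(False)  # only the latent is optimized, not CLIP

def decode(z: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for a pretrained VQGAN decoder so the sketch
    # runs end to end; a real VQGAN+CLIP pipeline decodes `z` through VQGAN.
    return torch.sigmoid(z)

prompt = "a watercolor painting of a lighthouse at dawn"
with torch.no_grad():
    text_features = clip_model.encode_text(clip.tokenize([prompt]).to(device))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# CLIP's published input normalization constants.
mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

# Optimize a latent so its decoded image's CLIP embedding moves toward the prompt's.
z = torch.randn(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([z], lr=0.05)

for step in range(200):
    image = (decode(z) - mean) / std          # decode, then apply CLIP normalization
    image_features = clip_model.encode_image(image)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    loss = -(image_features * text_features).sum()  # maximize cosine similarity
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The real method optimizes VQGAN’s codebook-quantized latents and typically adds image augmentations and regularizers for stability; the sketch above only shows the CLIP guidance loop itself.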
