Edge 449: Getting Into Adversarial Distillation
A model distillation method inspired by generative adversarial networks (GANs).
In this issue:
Understanding adversarial distillation.
Alibaba’s paper about Introspective Adversarial Distillation (IAD).
LMQL framework for querying LLMs.
💡 ML Concept of the Day: An Overview of Adversarial Distillation
Previously, we covered the main types of model distillation techniques, including online, offline, and self-distillation, which are fundamentally based on student-teacher interactions. For the remainder of this series, we are going to focus on specific knowledge distillation techniques that are widely adopted in the space of foundation models. The first stop is a method known as adversarial distillation.
As its name indicates, adversarial distillation draws inspiration from generative adversarial networks (GANs), which use a generator-discriminator architecture. In that setting, the generator creates synthetic samples close to the true data distribution while the discriminator learns to differentiate between the synthetic and original data samples. Applying these ideas to knowledge distillation, we end up with a simple workflow:
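To make the generator-discriminator analogy concrete before walking through that workflow, here is a minimal, hypothetical PyTorch sketch. The model sizes, the 0.1 loss weighting, and the choice of feeding raw logits to the discriminator are illustrative assumptions, not the setup of any specific paper; the key idea is that the student plays the generator role while a small discriminator learns to tell teacher outputs from student outputs.

```python
# Minimal sketch of adversarial distillation (assumed setup, not a specific paper's).
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10

# Teacher: a larger, frozen network whose knowledge we want to transfer.
teacher = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, NUM_CLASSES))
teacher.eval()

# Student: a smaller network acting as the "generator", trying to produce
# outputs indistinguishable from the teacher's.
student = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))

# Discriminator: learns to tell teacher logits apart from student logits.
discriminator = nn.Sequential(nn.Linear(NUM_CLASSES, 32), nn.ReLU(), nn.Linear(32, 1))

opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(x):
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)

    # 1) Discriminator step: "real" = teacher logits, "fake" = student logits.
    #    The student logits are detached so this step only updates the discriminator.
    d_real = discriminator(t_logits)
    d_fake = discriminator(s_logits.detach())
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Student step: try to fool the discriminator while also matching the
    #    teacher's soft labels with a standard KL distillation loss.
    adv_out = discriminator(s_logits)
    adv_loss = bce(adv_out, torch.ones_like(adv_out))
    kd_loss = F.kl_div(F.log_softmax(s_logits, dim=-1),
                       F.softmax(t_logits, dim=-1), reduction="batchmean")
    s_loss = kd_loss + 0.1 * adv_loss  # 0.1 is an arbitrary balancing weight
    opt_s.zero_grad(); s_loss.backward(); opt_s.step()
    return d_loss.item(), s_loss.item()

# Example: one training step on a random batch of flattened 28x28 inputs.
d_l, s_l = train_step(torch.randn(32, 784))
```

Note the two alternating updates, mirroring GAN training: the discriminator sees detached student logits so only its own weights move, and the student then receives gradients both from the distillation loss and, through the discriminator, from the adversarial loss.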