Edge 447: Not All Model Distillations are Created Equal
Understanding the different types of model distillation.
In this issue:
Understanding the different types of model distillation.
The original model distillation paper.
The Haystack framework for RAG applications.
💡 ML Concept of the Day: Types of Model Distillation
In the previous issue of this series, we introduced the concept of model distillation. The core idea of this technique is to use a teacher model to train a smaller, more efficient student model that retains the teacher's accuracy on a specific set of domains. As you can imagine, there are multiple ways to accomplish this, depending on how the teacher and student interact.
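To make the teacher-student setup concrete, below is a minimal sketch of one common distillation objective, assuming PyTorch and hypothetical teacher and student classifiers that produce logits of the same shape. The student is trained to match the teacher's softened output distribution while still fitting the ground-truth labels; the function name and hyperparameters are illustrative only.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend a soft-target loss (teacher) with hard-label cross-entropy.

    Hypothetical helper for illustration; shapes are (batch, num_classes).
    """
    # Soften both distributions with the same temperature.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between the softened teacher and student outputs,
    # scaled by T^2 as in the classic soft-target formulation.
    soft_loss = F.kl_div(soft_student, soft_targets,
                         reduction="batchmean") * (temperature ** 2)
    # Standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

The temperature is the key design choice here: higher values flatten the teacher's distribution, exposing the relative probabilities it assigns to incorrect classes, which is much of what the student actually learns from.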
Model distillation techniques fall into three primary categories: response-based, feature-based, and relation-based. Each emphasizes a different aspect of transferring knowledge from the larger, more complex teacher to the smaller, more efficient student, and each comes with its own strengths and limitations. Understanding all three is worthwhile, even though response-based distillation is usually the easiest to implement.
1. Response-Based Distillation: