Edge 398: Inside Phi-3: Microsoft's Amazing Small Language Model
The new family of models famously outperforms models many times its size.
Phi-3 was one of the models at the center of Microsoft’s generative AI announcements at its BUILD conference, so we thought it would be relevant to take a deep dive.
The Phi family of models has been associated with the term small language model (SLM), which has been gaining traction in the world of generative AI. The term was originally coined by Microsoft after it published a paper with a catchy title, “Textbooks Are All You Need,” which challenged the assumption that scale is essential by creating a small coding language model trained solely on textbook-quality datasets. The paper unveiled the Phi model, which was able to outperform much larger alternatives. That was followed by the release of the Phi-2 models late last year, which showed Microsoft’s commitment to continuing this line of work. A few weeks ago, Microsoft unveiled Phi-3, which, although not that small anymore, outperforms models 25 times its size.
Let’s dive into some of the details behind Phi-3.