Inside Mixtral 8x7B: One of the Most Exciting Open Source LLM Releases of the Year
The model follows Mistral 7B with an innovative mixture-of-experts architecture that deviates from monolithic transformer models.
Mistral AI is one of the most innovative companies pushing the boundaries of open-source LLMs. Mistral’s first release, Mistral 7B, has become one of the most widely adopted open-source LLMs in the market. A few days ago, the company dropped a torrent link to Mixtral 8x7B, its second release, which is quite intriguing. Today we dive into this release, which many consider one of the most important developments in open-source generative AI this year.
What makes Mixtral 8x7B so interesting is that it explores a new architectural paradigm that contrasts with the monolithic approach followed by most LLMs. The model is based on a mixture-of-experts approach which, although not new, had not been proven at scale in the LLM space.
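To make the idea concrete, here is a minimal, illustrative sketch of a sparse mixture-of-experts layer in PyTorch. This is not Mistral’s actual implementation; the layer sizes, the eight experts, and the top-2 routing are assumptions chosen to mirror what has been reported about Mixtral. The core idea is that a small router scores a set of expert feed-forward networks, and each token is processed only by its top-scoring experts.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Toy sparse mixture-of-experts layer: a router picks the top-k
    experts for each token and returns their weighted combination.
    Sizes and routing are illustrative assumptions, not Mixtral's config."""

    def __init__(self, d_model: int = 64, d_ff: int = 256,
                 num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router: one logit per expert for each token.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten tokens for routing.
        tokens = x.reshape(-1, x.shape[-1])
        logits = self.router(tokens)                       # (tokens, experts)
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)              # normalize over the chosen experts only
        out = torch.zeros_like(tokens)
        # Only the selected experts run for each token (sparse activation).
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(tokens[mask])
        return out.reshape_as(x)

# Example: route a small batch of token embeddings through the toy layer.
layer = SparseMoELayer()
y = layer(torch.randn(2, 4, 64))
print(y.shape)  # torch.Size([2, 4, 64])
```

The key property this sketch illustrates is that, while the layer holds the parameters of all experts, each token only activates a couple of them, so the compute per token stays far below what a dense model of the same total parameter count would require.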
Not much has been published about Mixtral 8x7B yet, but below I’ve outlined some details that might be relevant: