Edge 302: Inside MPT-7B: MosaicML's Suite of Open Source LLMs that Supports 65k Tokens
The new suite, released by MosaicML, includes models optimized for instructions, chat, stories, and more.
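For readers who want to experiment, the sketch below illustrates one way the MPT-7B variants can be loaded with the Hugging Face transformers library. It is a minimal illustration, not part of MosaicML's announcement; the repository names under the mosaicml organization, the choice of variant, and the generation settings are assumptions for the example.

```python
# Minimal sketch (assumption: the MPT-7B variants are published on the
# Hugging Face Hub under the "mosaicml" organization and `transformers`
# plus a PyTorch backend are installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

MPT_VARIANTS = [
    "mosaicml/mpt-7b",              # base model
    "mosaicml/mpt-7b-instruct",     # tuned for instruction following
    "mosaicml/mpt-7b-chat",         # tuned for dialogue
    "mosaicml/mpt-7b-storywriter",  # tuned for long-context story writing
]

model_name = MPT_VARIANTS[1]  # e.g. the instruction-tuned variant
tokenizer = AutoTokenizer.from_pretrained(model_name)
# MPT ships custom modeling code, so remote code must be explicitly trusted.
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

prompt = "Explain what makes MPT-7B different from other open-source LLMs."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```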
The world is undergoing a transformative shift, courtesy of the remarkable impact of large language models (LLMs). However, for teams outside well-funded industry laboratories, training and deploying these models remains an arduous task. As a consequence, there has been an upsurge of activity centered around open-source LLMs. Prominent examples include Meta’s LLaMA series, EleutherAI’s Pythia series, StabilityAI’s StableLM series, and Berkeley AI Research’s OpenLLaMA model.
MosaicML recently introduced a new model series named MPT (MosaicML Pretrained Transformer) to address the limitations of the aforementioned models. The release aims to provide an open-source model that is commercially usable and surpasses the capabilities of LLaMA-7B in several respects. Key features of the MPT model series include: