🤖Edge#142: How Microsoft Built a 530 Billion Parameter Model

What’s New in AI is a deep dive into one of the freshest research papers or technology frameworks worth your attention. Our goal is to keep you up to date with new developments in AI and to complement the concepts we debate in other editions of our newsletter.


💥 What’s New in AI: How Microsoft built Megatron-Turing NLG, one of the largest language models in history  

Another month, another big transformer model. This time, it was Microsoft’s turn. In collaboration with NVIDIA, the Redmond giant announced a 530-billion-parameter model called Megatron-Turing Natural Language Generation (MT-NLG). The model is a successor to Turing-NLG, which, a few months ago, was considered the biggest language model in the world.

Large pretrained language models are always impressive, but they have become the norm in the NLP space. For that reason, it’s worth looking at the aspects of MT-NLG that genuinely set it apart from the alternatives.
