🗣🏎 The Race for Big Language Models Continues

📝 Editorial 

Massively large pretrained models have become the norm in natural language processing (NLP). It seems that every other month we hit a new milestone in the size of language models, and we can’t stop writing about it because it’s so fascinating. When GPT-3 reached 175 billion parameters a few months ago, it seemed that we were close to the peak in language model size. Since then, models such as Switch Transformer and the recently announced Wu Dao 2.0 have comfortably surpassed 1 trillion parameters. Just this week, Microsoft Research and NVIDIA announced a new generative language model with a remarkable 530 billion parameters.

Named Megatron-Turing NLG, the new model leverages Microsoft’s parallel training technologies such as DeepSpeed. The model achieved state-of-the-art performance in several highly complex disciplines such as reading comprehension, common sense reasoning, and natural language inference. Just like Google and OpenAI, Microsoft is a big believer in large pretrained language models. Still, we have to wonder whether this path is sustainable at all. At some point, language models will need to become smarter without necessarily becoming much bigger if we want to achieve mainstream adoption of these technologies.



🔺🔻TheSequence Scope – our Sunday edition with the industry’s development overview – is free. To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻

🗓 Next week in TheSequence Edge:

Edge#133: we discuss Self-Supervised Learning for Speech; we explore AVID, an SSL model for audiovisual tasks; we overview s3prl, an open-source framework for SSL speech models.

Edge#134: fascinating results of Run:ai’s AI Infrastructure Survey.


Now, let’s review the most important developments in the AI industry this week

🔎 ML Research

MT-NLG 

Microsoft Research and NVIDIA published the research behind Megatron-Turing Natural Language Generation, one of the largest generative language models ever built with over 530 billion parameters →read more on Microsoft Research blog

RGB-Stacking 

DeepMind published a paper outlining a new benchmark for robotic applications →read more on DeepMind blog

Self-Supervised Learning and Medical Imaging

Google Research published a paper about recent applications of self-supervised learning to medical image classification →read more on Google Research blog

Predicting Traffic Crashes 

Researchers from MIT published a paper outlining a deep learning model that can predict traffic crashes based on high-resolution images and sensor inputs →read more on MIT News  


🛠 Real World ML

TensorFlowJS and DermAssist 

Google Health published a blog post detailing the TensorFlow JS architecture powering DermAssist, a dermatology-focused digital assistant →read more on TensorFlow blog

Optimizing Uber HDFS Infrastructure 

The Uber engineering team published a blog post about the practices followed to optimize its HDFS infrastructure for cost efficiency →read more on Uber blog


🤖 Cool AI Tech Releases

Microsoft Translator 100 Languages Milestone 

Microsoft added 12 new languages to Microsoft Translator, pushing the total to 100 supported languages →read more on Microsoft AI blog

MiniHack 

Facebook AI Research open-sourced MiniHack, a new environment for advancing research in open-ended reinforcement learning scenarios →read more on FAIR blog


🔦 Career highlights

AI2 Incubator alumnus Yoodli – an AI-enabled software platform that analyzes public speaking and gives personalized tips for improvement – is looking for a Fullstack Developer, an Applied AI Engineer, and a Head of Growth. Positions are open in Seattle, US, or remote.

Check out all AI2 positions


💸 Money in AI

For ML&AI:

  • AI chipmaker Hailo raised $136 million in a Series C funding round led by Poalim Equity and entrepreneur Gil Agmon. Hiring in Tel Aviv, Israel.

  • Developer-first MLOps platform Weights & Biases raised more than $100 million in a Series C funding round co-led by Felicis Ventures, BOND, Insight Partners and Coatue. Hiring in multiple locations.

  • Autonomous Kubernetes management platform Cast AI raised $10 million in Series A funding led by Cota Capital. Hiring in Vilnius, Lithuania / San Francisco, US.

  • Intelligent applications startup Spice AI raised a $1 million seed funding round. Hiring in Seattle, US.

AI-powered:

  • Employee behavior analytics startup Aware raised $60 million in a Series C round led by Goldman Sachs Growth Equity. Hiring remote.

  • Customer service analytics platform SupportLogic raised $50 million in Series B funding led by WestBridge Capital Partners and General Catalyst. Hiring.

  • Sales engagement platform Groove raised $45 million in a Series B funding round led by Viking Global Investors. Hiring in multiple locations.

  • Geospatial analytics startup AiDash raised $27 million in Series B funding led by G2 Venture Partners. Hiring in San Jose, US.

  • People enablement platform AmplifAI raised $18.5 million in a Series A funding round led by Greycroft.

  • GPT-3-powered writing tool Copy.ai raised $11 million in a Series A round, led by Wing Venture Capital. Hiring remote.

  • Healthcare startup ScienceIO raised $8 million in seed funding. Hiring in the US/remote.

  • Synthetic data startup AI.Reverie (website is no longer available) was quietly acquired by Facebook.