🗣🗣🗣 No Language Left Behind
📝 Editorial
Natural language understanding (NLU) is the area of deep learning that has seen the most impressive breakthroughs in recent years. However, most of the large-scale NLU models that impressed us are typically optimized for a small set of high-resource languages. NLU models that exhibit remarkable performance in areas such as question answering, text completion and machine translation in languages like English, Spanish or French struggle when applied to the hundreds of languages and dialects that don’t have large training datasets. The result is growing inequality among the segments of the world population that can benefit from high-quality NLU solutions. This disparity is even more apparent for languages spoken outside Europe and North America.
Extending NLU research to low-resource languages is a known challenge in the space. One of the most impressive achievements of recent years came last week from Meta AI with the release of the No Language Left Behind (NLLB)-200 model. This single neural network is able to translate text across 200 different languages, achieving state-of-the-art results. To train NLLB-200, Meta AI used a two-step curriculum approach in which knowledge acquired from high-resource language training epochs was used in low-resource languages. The result was a massive 54-billion-parameter model that had to be trained on Meta’s new Research SuperCluster (RSC) supercomputer. Together with NLLB-200, Meta AI open-sourced the FLORES-200 dataset for evaluating machine translation models. It is also providing $200,000 in grants to non-profit organizations building applications that use NLLB-200. Altogether, NLLB-200 represents one of the most impressive milestones ever achieved in machine translation for low-resource languages.
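To make the two-step idea concrete, here is a toy sketch in Python. It is not Meta AI's training code. It assumes a curriculum that trains on high-resource languages first and adds low-resource languages at a fixed switch point; the function name `active_languages`, the `switch_fraction` parameter and the language groups are all hypothetical and chosen only for illustration.

```python
from typing import List


def active_languages(step: int, total_steps: int,
                     high_resource: List[str], low_resource: List[str],
                     switch_fraction: float = 0.5) -> List[str]:
    """Return the languages sampled at a given training step.

    Phase 1 (before the switch point): high-resource languages only.
    Phase 2 (from the switch point on): low-resource languages join the mix.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    switch_step = int(total_steps * switch_fraction)
    if step < switch_step:
        return list(high_resource)
    return list(high_resource) + list(low_resource)


# Hypothetical language groups, for illustration only.
HIGH = ["eng", "spa", "fra"]
LOW = ["wol", "lug"]

for step in (0, 4, 5, 9):
    print(step, active_languages(step, 10, HIGH, LOW))
```

With ten steps and the default switch point, steps 0 to 4 draw only from the high-resource group and steps 5 to 9 draw from both. A real curriculum would likely tune the switch point per language rather than use a single global value.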
🔺🔻TheSequence Scope – our Sunday edition with the industry’s development overview – is free. To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻
🗓 Next week in TheSequence Edge:
Edge#207: we summarize our graph neural networks (GNNs) series.
Edge#208: we explore Google Brain’s Minerva, which can solve complex mathematical and scientific problems using step-by-step reasoning.
Now, let’s review the most important developments in the AI industry this week.
🔎 ML Research
Translating Across 200 Languages
Meta AI published a paper detailing a new model that can perform high-quality translations across 200 languages →read more on Meta AI blog
Director – a Hierarchical RL Agent
Google Research published a paper detailing Director, a reinforcement learning agent that can learn hierarchical behaviors from raw pixels →read more on Google Research blog
Joint Image-Text Representations
Amazon Research published a paper presenting a model for alignment of features in image and text datasets →read more on Amazon Research blog
Speech Disfluency Detection
Google Research published a paper detailing a BERT-like model that can detect disfluency in natural speech →read more on Google Research blog
☝️ We Recommend – Try the Real-Time Database for Continuously Changing Data
You can now enroll in Molecula’s 7-day Cloud trial (without installation or infrastructure management) or install FeatureBase in your own environment to meet your needs (no credit card required) →See which trial experience is right for you
🤖 Cool AI Tech Releases
PyTorch 1.12
A new release of PyTorch is available, with new capabilities including TorchArrow for batch data preprocessing, a functional API for modules and many others →read more on PyTorch blog
🛠 Real World ML
Anomaly Detection at Walmart
Walmart details the ML architecture used for anomaly detection in its e-commerce infrastructure →read more on the Walmart Tech Labs blog
Uber Spark Architecture
Uber discusses some of the updates for data shuffling in its Spark architecture →read more on Uber Engineering blog
💸 Money in AI
Deep tech startup Celus raised $25.6 million in a Series A round of funding led by Earlybird Venture Capital. Hiring in Munich/Germany.
AI chip developer Rebellions raised a $22.8 million extension to its Series A financing from strategic investor KT. Hiring in South Korea.
Acquisitions
Data observability startup Databand has been acquired by IBM, which aims to extend its leadership in observability. Details of the deal weren’t disclosed. Hiring in New York/US and Tel Aviv/Israel.