The MoE Momentum
Weekly news digest curated by industry insiders
📝 Editorial
Massively large neural networks have become the dominant pattern in the deep learning space. The size and complexity of deep learning models keep reaching new levels, particularly in models that try to master multiple tasks. Such large models are not only difficult to understand but also incredibly challenging to train and run without incurring significant computational expense. In recent years, Mixture of experts (MoE) has emerged as one of the most efficient techniques for building and training large multi-task models. While MoE is not a novel ML technique, it has certainly experienced a renaissance with the rapid emergence of massively large deep learning models.
Conceptually, MoE is rooted in the simple idea of decomposing a large multi-task network into smaller expert networks, each of which can master an individual task. This might sound similar to ensemble learning, but the key difference is that an MoE model activates only a small subset of its experts (often just one) for any given input. The greatest benefit of MoE models is that their computational cost scales sub-linearly with respect to their size. As a result, MoE has become one of the most widely adopted architectures for large-scale models. Just this week, Microsoft and Google Research published papers outlining techniques to improve the scalability of MoE models. As big ML models continue to dominate the deep learning space, MoE techniques are likely to become more mainstream in real-world ML solutions.
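To make the routing idea concrete, below is a minimal sketch of an MoE layer with top-1 routing, written in PyTorch. It illustrates the general technique rather than any specific paper's code, and all names (SimpleMoE, the gate, the toy feed-forward experts) are our own: a small gating network scores the experts for each token, and only the highest-scoring expert actually runs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoE(nn.Module):
    """Toy MoE layer: a gate picks one expert per token (top-1 routing)."""

    def __init__(self, d_model: int, num_experts: int):
        super().__init__()
        # Each expert is a small feed-forward block; in large models these
        # are the expensive parts that routing lets us skip.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.ReLU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        gate_probs = F.softmax(self.gate(x), dim=-1)   # (num_tokens, num_experts)
        top_prob, top_idx = gate_probs.max(dim=-1)     # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            routed = top_idx == i
            if routed.any():
                # Scaling by the gate probability keeps routing differentiable.
                out[routed] = top_prob[routed].unsqueeze(-1) * expert(x[routed])
        return out

moe = SimpleMoE(d_model=64, num_experts=8)
tokens = torch.randn(16, 64)
print(moe(tokens).shape)  # torch.Size([16, 64])
```

Because each token passes through a single expert, adding more experts grows the model's parameter count without growing its per-token compute, which is the sub-linear cost scaling mentioned above.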
🔺🔻 TheSequence Scope is our Sunday free digest. To receive high-quality educational content about the most relevant concepts, research papers, and developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻
🗓 Next week in TheSequence Edge:
Edge#159: we recap our MLOps series (two parts!);
Edge#160: we take a deep dive into Aporia, an ML observability platform.
Now, let's review the most important developments in the AI industry this week.
🔎 ML Research
Data2vec
Meta (Facebook) AI Research (FAIR) published a paper unveiling data2vec, a self-supervised learning method that uses the same training objective across speech, language, and computer vision tasks → read more on FAIR blog
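For intuition about how data2vec works, here is a toy sketch of its student-teacher objective in PyTorch; this is our simplified reading of the paper, not FAIR's code. A student encodes a masked input and regresses the latent representations that an exponential-moving-average (EMA) teacher produces from the unmasked input. The actual method targets an average of several top transformer layers and uses learned mask tokens; for brevity we substitute the final layer output and zeroed timesteps.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

# Student and teacher share one architecture; the teacher is an EMA copy.
student = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
teacher = copy.deepcopy(student)  # updated by EMA, never by gradients
for p in teacher.parameters():
    p.requires_grad_(False)

def ema_update(teacher: nn.Module, student: nn.Module, tau: float = 0.999):
    # Teacher weights drift slowly toward the student's.
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.data.mul_(tau).add_(ps.data, alpha=1.0 - tau)

x = torch.randn(8, 32, 64)        # (batch, time, features)
mask = torch.rand(8, 32) < 0.15   # mask ~15% of the timesteps
x_masked = x.clone()
x_masked[mask] = 0.0              # crude stand-in for a learned mask token

with torch.no_grad():
    targets = teacher(x)          # teacher sees the unmasked input
preds = student(x_masked)         # student sees the masked input
loss = F.mse_loss(preds[mask], targets[mask])
loss.backward()
ema_update(teacher, student)
```

Because the targets are continuous latent vectors rather than modality-specific tokens or pixels, the same loop applies to speech, text, and images.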
MoE Task Routing
Google Research published a paper introducing TaskMoE, a technique to extract smaller, more efficient subnetworks from large multi-task models based on Mixture of experts (MoE) architectures → read more on Google Research blog
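Roughly speaking, the insight is that if routing decisions depend on the task rather than on individual tokens, every input for a given task flows through the same experts, so a compact per-task subnetwork can be carved out for serving. The sketch below is our own hypothetical illustration of that idea, not code from the paper; all names (TaskRoutedMoE, extract_subnetwork) are invented for this example.

```python
import torch
import torch.nn as nn

class TaskRoutedMoE(nn.Module):
    """Toy layer that routes by task id instead of by token."""

    def __init__(self, d_model: int, num_experts: int, num_tasks: int):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(num_experts)
        )
        # One learned expert preference per task, shared by all of its tokens.
        self.task_gate = nn.Embedding(num_tasks, num_experts)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        logits = self.task_gate.weight[task_id]
        expert_idx = int(logits.argmax())  # every token of the task agrees
        return self.experts[expert_idx](x)

    def extract_subnetwork(self, task_id: int) -> nn.Module:
        # Since routing depends only on the task, a deployment for one task
        # can keep just that task's expert and discard the rest.
        expert_idx = int(self.task_gate.weight[task_id].argmax())
        return self.experts[expert_idx]

moe = TaskRoutedMoE(d_model=64, num_experts=8, num_tasks=3)
x = torch.randn(16, 64)
y = moe(x, task_id=1)                # full multi-task model
compact = moe.extract_subnetwork(1)  # small per-task model for serving
```

At deployment time, only the experts a task actually uses need to be kept, so the served model is a fraction of the full multi-task network's size.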
DeepSpeed and MoE
Microsoft Research published a detailed blog post explaining how to use its DeepSpeed framework to scale the training of Mixture of experts (MoE) models → read more on Microsoft Research blog
StylEx – Visual Interpretability of Classifiers
Google Research published a paper proposing StylEx, a method to visualize the influence that individual attributes have on the output of ML classifiers → read more on Google Research blog
🤖 Cool AI Tech Releases
Macaw Demo
The Allen Institute for AI (AI2) open-sourced a demo that compares its Macaw model against OpenAI's GPT-3 → read more on AI2 blog
🛠 Real World ML
AI Fairness at LinkedIn
The LinkedIn engineering team published details about how it integrates fairness as a first-class concern in its AI products → read more on LinkedIn Engineering blog
💸 Money in AI
AI-powered revenue operations platform Clari raised $225 million in a Series F round of funding led by Blackstone. Hiring across the US and remote.
Agtech startup Green Labs raised a $140 million Series C led by BRV Capital Management. Hiring in South Korea.
Codeless AI infrastructure company Pixis raised a $100 million Series C funding round led by SoftBank Vision Fund 2. Hiring in India and the US.
Banking and financial services platform Personetics raised $85 million in growth funding from Thoma Bravo. Hiring across the globe.
Security company Ambient.ai raised $52 million in venture funding led by a16z. Hiring mostly in Palo Alto/US.
Time management and smart calendar tool Clockwise raised $45 million in Series C funding led by Coatue. Hiring in San Francisco/US and remote.
HR management platform flex raised a $32 million Series B round led by Greenoaks. Hiring in South Korea.
Support automation platform Capacity raised an additional $27 million in a Series C round led by existing investors. Hiring remote.
Revenue growth platform Proton.ai raised a $20 million Series A round led by Felicis Ventures. Hiring remote.
Logistics platform 7bridges raised $17 million in a Series A round led by Eight Roads. Hiring in London/UK.
SaaS CPG platform Turing Labs raised a $16.5 million Series A round led by Insight Partners. Hiring remote.
Data science knowledge capture and sharing solution Vectice raised a $12.6 million Series A round co-led by Sorenson Ventures and Crosslink Capital. Hiring in Nantes/France and San Francisco/US.
Intelligent project-tracking platform StructionSite raised $10 million in a funding round led by 500 Global. Hiring remote in the US.
People intelligence platform Diversio raised $6.5 million in Series A funding from a group of investors. Hiring in Canada, the US, and the UK.
Healthcare customer support platform BirchAI (a spinout from our long-term partner, the Allen Institute for AI (AI2)) raised $3.1 million in seed financing led by Radical Ventures. You can read our interview with BirchAI's CTO here. Hiring in Seattle/US.