TheSequence Scope: The Challenge of Data-Efficient Machine Learning
Weekly newsletter that discusses impactful ML research papers, cool tech releases, the money in AI, and real-life implementations
📝 Editorial
Supervised learning is the dominant paradigm in machine learning solutions. The idea of training a model on a labeled dataset in order to master a task seems intuitive. In practice, however, many supervised learning techniques require large labeled datasets in order to generalize even very simple knowledge.
This challenge is overwhelming for both startups and big companies, and it is one of the main roadblocks to the mainstream adoption of ML. We all love to hear about breakthroughs like AlphaGo or GPT-3, until we realize the enormous size of the training datasets used to create those models.
The idea of building machine learning methods that can operate with smaller labeled datasets is an active area of research, and there is no shortage of ideas. Semi-supervised learning attempts to incorporate unlabeled datasets into the training process. Generative models aim to create new labeled datasets from existing ones. Self-supervised learning builds models that derive their own supervisory signals from unlabeled data. Transfer learning tries to reuse knowledge across tasks, while meta-learning pursues the ambitious goal of building models that learn how to learn. Just this week, DeepMind proposed a new meta-learning technique for building more efficient reinforcement learning models. Some of the best minds in machine learning are paving the way to more data-efficient models. To make one of these ideas concrete, the sketch below shows transfer learning in its simplest form.
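A minimal transfer-learning sketch, assuming TensorFlow/Keras: an ImageNet-pretrained backbone is frozen and only a small new classification head is trained, so a new task can be learned from far fewer labeled examples. The dataset and the 10-class task are hypothetical placeholders.

```python
import tensorflow as tf

# Load an ImageNet-pretrained backbone and freeze its weights,
# so its general-purpose visual features are reused as-is.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False

# Attach a small task-specific head; only these weights are trained.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),  # hypothetical 10-class task
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(small_labeled_dataset, epochs=5)  # placeholder dataset
```

Because the pretrained features already encode most of the visual knowledge, only the small head needs labeled data, which is the essence of the data-efficiency argument above.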
🗓 Next week in TheSequence Edge
July 28, Edge#7: the concept of generative models; Optimus, one of the most innovative research efforts in generative models, recently published by Microsoft Research; ART, an open-source framework that uses generative models to protect neural networks.
July 30, Edge#8: the concept of generative adversarial networks; the original GAN paper by Ian Goodfellow; a deep dive into TF-GAN.
To stay up to date and receive TheSequence Edge every Tuesday and Thursday, please consider joining our community. Until August 15, you can subscribe with a permanent 20% discount. The Sunday edition of TheSequence Scope is always free.
Now, let’s review the most important developments in AI research and technology this week.
🔎 ML Research
Using Meta-Learning to Generate Reinforcement Learning Models
DeepMind researchers published a paper proposing a meta-learning method that can automatically generate reinforcement learning models ->read the original paper
Self-Supervised Learning for Image Classification
Facebook AI Research (FAIR) is at the forefront of self-supervised learning. Recently, they published a paper proposing a self-supervised learning method for training image classification models ->read more in Facebook AI blog
Data-Efficient Reinforcement Learning
The prestigious Berkeley AI Lab published a couple of papers about techniques for making reinforcement learning operate with smaller training datasets ->read more in Berkeley AI Lab blog
🤖 Cool AI Tech Releases
LinkedIn’s LIquid
LinkedIn’s engineering team detailed their work on LIquid, a new type of graph database ->read more in LinkedIn’s engineering blog
TensorFlow Lite XNNPACK Integration
The TensorFlow team unveiled support for the XNNPACK hardware acceleration library. The integration enables 2-3x faster inference in TensorFlow Lite models ->read more in TensorFlow blog
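For context, here is a minimal sketch, assuming TensorFlow's Python API, of running inference with a TensorFlow Lite model. XNNPACK operates at the delegate level, so once it is enabled in the TFLite build, the speedup is transparent and the inference code stays the same; "model.tflite" is a placeholder path.

```python
import numpy as np
import tensorflow as tf

# Load a converted TFLite model; with XNNPACK compiled into the runtime,
# supported float operators are accelerated transparently.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input matching the model's expected shape and dtype.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
result = interpreter.get_tensor(output_details[0]["index"])
```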
💬 Useful Tweet
GPT-3 (generative pre-trained transformer) is the newest in the family of NLU (natural language understanding) models, though it has been around for a few months. It uses the transformer architecture that we covered in Edge#3. The hype around GPT-3 is huge, but language generators are still at a very nascent stage. There are great opportunities to join OpenAI and work on the development of this fascinating technology.
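As a refresher on the mechanism behind the transformer, here is a toy NumPy sketch of scaled dot-product attention, the building block that GPT-3 stacks at scale. The shapes are illustrative only, nothing like GPT-3's actual configuration.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Compare every query to every key, scale, then softmax over keys.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mix of the value vectors.
    return weights @ V

# Illustrative shapes: 4 token positions, embedding dimension 8.
Q = np.random.randn(4, 8)
K = np.random.randn(4, 8)
V = np.random.randn(4, 8)
out = scaled_dot_product_attention(Q, K, V)  # shape (4, 8)
```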
💸 Money in AI
AI-based crowdsourcing startup StuffThatWorks raised a $9 million seed round. Its idea is to let people collaborate to find the most effective treatments. The human input is enhanced by ML algorithms that look for valuable insights.
Autonomous technology company and intelligence systems provider Sea Machines Robotics raised $15 million to accelerate the deployment of its AI-powered situational awareness technology in the market for unmanned naval boats and ships. They are hiring.
Educational platform Riiid has raised $41.8 million for its AI-powered test-prep solutions. After successfully validating the concept in Korea, Japan, and Vietnam, Riiid plans to expand across the U.S., South America, and the Middle East.
Big data analytics startup Quantexa, which built the machine learning platform Contextual Decision Intelligence (CDI), has raised $64.7 million. The platform gathers scattered data points and analyzes them to uncover risky activities, enhance customer intelligence, and address credit risk and fraud challenges.
If you find our newsletter useful, please consider supporting our efforts. Subscribe, or make it a gift for someone who can benefit from it. It’s a permanent 20% discount until August 15.
TheSequence is a summary of groundbreaking ML research papers, engaging explanations of ML concepts, and explorations of new ML frameworks and platforms. It also keeps you up to date with the news, trends, and technology developments in the AI field.
5 minutes of your time, 3 times a week, and you will steadily become knowledgeable about everything happening in the AI space.