➗✖️ OpenAI's New NLP Challenge: Mathematical Reasoning
Weekly news digest curated by industry insiders
Mathematical reasoning has long been one of the most elusive challenges for ML models. In recent years, we have seen tremendous advances in natural language processing (NLP), with methods such as transformers powering models like GPT-3 that can solve many complex language tasks. However, even those models struggle when presented with multi-step mathematical reasoning problems. Take a simple math word problem such as the following:
“Tim grows 5 trees. Each year he collects 6 lemons from each tree. How many lemons does he get in a decade?”
From an ML perspective, creating a model that solves these types of problems presents some fundamental challenges. Beyond the sophisticated language interpretation they require, math reasoning problems are very sensitive to cascading errors: in the problem above, a slip in computing the lemons collected per year (5 × 6 = 30) carries straight into the total for the decade (30 × 10 = 300). Mathematical reasoning models need to chain together a complex sequence of steps and correct mistakes along the way. Not surprisingly, many experts believe that mathematical reasoning problems are a way to expose the limitations of NLP models.
Despite the challenges, the AI community has been steadily making progress towards creating ML models specialized in multi-step mathematical reasoning. A few days ago, OpenAI published a paper outlining some methods to tackle math word problems. As part of their research, OpenAI also open-sourced GSM8K, a dataset of 8.5K high-quality, linguistically diverse problems at the grade school math level. The research from OpenAI favors a two-part method in which one model generates many candidate solutions and a verifier model evaluates the correctness of each one. This student-teacher approach resembles how we learn math in the first place 😉. OpenAI's research is one of the most impressive developments in one of the toughest areas of NLP.
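To make the generate-then-verify idea concrete, here is a minimal Python sketch using the lemon problem from above. Everything in it is an assumption made for illustration: `toy_generate`, `toy_verify`, `best_of_n` and the hard-coded candidate solutions are hypothetical stand-ins, not OpenAI's code. In the research, both the generator and the verifier are trained language models; here the "verifier" is just a rule that re-checks the multiplication in each step.

```python
from typing import Callable, List, Tuple

PROBLEM = (
    "Tim grows 5 trees. Each year he collects 6 lemons from each tree. "
    "How many lemons does he get in a decade?"
)

# Hypothetical candidate solutions, written as "a * b = c" steps.
# Two of them contain an arithmetic slip.
CANDIDATES: List[str] = [
    "5 * 6 = 35; 35 * 10 = 350",   # slip in step one cascades into the answer
    "5 * 6 = 30; 30 * 10 = 300",   # correct
    "5 * 6 = 30; 30 * 10 = 3000",  # slip in step two
]


def toy_generate(problem: str, n: int) -> List[str]:
    """Stand-in for sampling n solutions from a generator model."""
    return CANDIDATES[:n]


def toy_verify(problem: str, solution: str) -> float:
    """Stand-in verifier: fraction of steps whose multiplication checks out.

    A trained verifier would output a learned correctness score instead.
    This rule only checks arithmetic, not whether the steps fit the problem.
    """
    steps = [s.strip() for s in solution.split(";")]
    correct = 0
    for step in steps:
        left, right = step.split("=")
        a, b = (int(x) for x in left.split("*"))
        if a * b == int(right):
            correct += 1
    return correct / len(steps)


def best_of_n(
    problem: str,
    generate: Callable[[str, int], List[str]],
    verify: Callable[[str, str], float],
    n: int,
) -> Tuple[str, float]:
    """Generate n candidates and return the one the verifier scores highest."""
    candidates = generate(problem, n)
    scored = [(c, verify(problem, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


best, score = best_of_n(PROBLEM, toy_generate, toy_verify, n=3)
print(best, score)
```

The selection step is the whole trick: the generator is allowed to be wrong most of the time, as long as the verifier can tell a good solution from a bad one.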
🔺🔻TheSequence Scope – our Sunday edition with the industry’s development overview – is free. To receive high-quality content about the most relevant developments in the ML world every Tuesday and Thursday, please subscribe to TheSequence Edge 🔺🔻
🗓 Next week in TheSequence Edge:
Edge#139: we start a new series about MLOps; we explore TFX, a TensorFlow-based architecture created by Google to manage machine learning models; we overview MLflow, a platform for end-to-end machine learning lifecycle management.
Edge#140: a deep dive into new Metacloud by cnvrg.io, an ML platform that offers flexibility and choice over what infrastructure to use for your AI workloads.
Now, let’s review the most important developments in the AI industry this week
🔎 ML Research
Solving Math Word Problems
OpenAI published a paper and dataset presenting a method that uses a verifier model to solve math word problems →read more on OpenAI blog
AGI and Real-World Problems
DeepMind published a thoughtful post outlining some fundamental real-world problems, such as weather forecasting, that could pave a path towards artificial general intelligence (AGI) →read more on DeepMind blog
A New Language-Image Model
Microsoft Research published a paper proposing Turing Bletchley, a 2.5 billion parameter model that mastered several image-language tasks in multiple languages →read more on Microsoft Research blog
Reversible Actions in RL Agents
Google Research published a paper unveiling a method for detecting reversible actions in RL environments, a key capability for enabling safer RL models →read more on Google Research blog
🤖 Cool AI Tech Releases
Azure OpenAI Service
Microsoft announced the release of the Azure OpenAI Service, which enables access to the OpenAI GPT-3 API through the Azure platform →read more on Microsoft AI blog
RStudio in SageMaker
AWS announced the availability of a fully managed version of the popular RStudio on Amazon SageMaker →read more on Amazon blog
The PyTorch team unveiled Torch FX, a new TorchVision utility library for feature extraction and transformation →read more on PyTorch blog
💸 Money in AI
Data management startup SiaSearch was acquired by Scale AI under undisclosed financial terms.