🔹🔸Edge#104: AllenNLP Makes Cutting-Edge NLP Models Look Easy

This is an example of TheSequence Edge, a Premium newsletter that our subscribers receive every Tuesday and Thursday. On Thursdays, we do deep dives into one of the freshest research papers or technology frameworks that is worth your attention.

💥 What’s New in AI: AllenNLP Makes Cutting-Edge NLP Models Look Easy

Natural language processing (NLP) is one of the fastest-growing areas in the deep learning space. The advent of new deep learning architectures such as transformers has turned out to be something of a Sputnik moment for NLP, triggering a race in innovation at a pace we have never seen before. Breakthroughs such as Google BERT or OpenAI GPT-3 have raised both the possibilities and the expectations for NLP applications. However, despite the indisputable progress in NLP research, implementing these advanced techniques in real-world applications remains a challenge. AllenNLP is an open-source framework that streamlines the implementation of state-of-the-art NLP models for a diverse set of linguistic tasks.

In a machine learning ecosystem with an overwhelming number of NLP stacks, it becomes increasingly difficult to develop solid criteria for selecting one technology stack over another. If we look deeper into the current composition of the NLP tech space, we will find a disproportionately large number of frameworks that enable basic capabilities such as sentiment analysis or text classification using relatively simple NLP models. The result is that it's really hard to find frameworks that enable the implementation and usage of new cutting-edge NLP techniques without requiring monumental development efforts. A handful of frameworks have set out to tackle this challenge. Among them, AllenNLP has gained significant traction as one of the most advanced frameworks for NLP research and implementation.

Created by the Allen Institute for AI, AllenNLP provides a simple and modular programming model for applying advanced deep learning techniques to NLP research, streamlining the creation of NLP experiments and abstracting the core building blocks of NLP models. AllenNLP is based on PyTorch and has quickly become a favorite of the NLP research and development community. What are the key capabilities of AllenNLP? Let's dive in.


At its core, AllenNLP is an open-source library that encapsulates the common operations that are typically done in the implementation of NLP models. Since its inception, AllenNLP was designed to enable a fundamental set of capabilities: 

  1. NLP Model Abstractions: Abstract the core components of NLP models, making it easy to create NLP models at a higher level of abstraction while enabling the reuse of low-level components.

  2. Low-Level NLP Abstractions: Provide abstractions for low-level NLP tasks such as masking and padding, keeping the implementation details detached from the core NLP model. 

  3. Experiment Design: Enable the design of NLP experiments using a declarative model that makes experiments easier to change and version.

  4. Education: Share implementations of sophisticated NLP models through interactive online demos so they are easily accessible to the data science community.
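To make the second capability more concrete, here is a minimal, plain-Python sketch of what padding and masking involve. It does not use AllenNLP; the function names `pad_and_mask` and `masked_mean` are hypothetical and exist only for illustration. The idea is that variable-length sequences are padded to a common length, and a mask records which positions hold real tokens so later computations can ignore the padding.

```python
from typing import List, Tuple


def pad_and_mask(sequences: List[List[int]], pad_id: int = 0) -> Tuple[List[List[int]], List[List[bool]]]:
    """Pad every sequence to the batch's longest length and build a matching mask.

    Hypothetical helper for illustration only; assumes `sequences` is non-empty.
    """
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # True marks a real token, False marks a padding position.
    mask = [[True] * len(seq) + [False] * (max_len - len(seq)) for seq in sequences]
    return padded, mask


def masked_mean(values: List[int], mask: List[bool]) -> float:
    """Average only the positions that the mask marks as real tokens."""
    real = [v for v, keep in zip(values, mask) if keep]
    return sum(real) / len(real)


batch = [[5, 8, 2], [7], [3, 9]]
padded, mask = pad_and_mask(batch)
print(padded)                           # [[5, 8, 2], [7, 0, 0], [3, 9, 0]]
print(masked_mean(padded[1], mask[1]))  # 7.0, the two padding zeros are ignored
```

Without the mask, the average of the second row would be dragged down by the padding zeros. Keeping this bookkeeping out of the model code is the kind of detail the framework aims to abstract.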

From a functional standpoint, AllenNLP abstracts the main components of the lifecycle of an NLP solution, from reading data to building and training a model. The declarative nature of the experiment configuration is one of the key differentiators of AllenNLP compared to alternative NLP frameworks. In AllenNLP, data scientists can orchestrate the interactions between the main components of an NLP workflow using configuration files instead of code. 

The process of building an NLP solution in AllenNLP starts with reading and processing the training dataset. These tasks are abstracted by the DatasetReader class. The objective of the DatasetReader class is to process a text input and produce a series of labeled fields that can be used to train an NLP model.
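The role of a dataset reader can be sketched in plain Python. The code below is not AllenNLP's DatasetReader; `TsvClassificationReader` and `Instance` are hypothetical stand-ins, and the tab-separated "text, label" format is an assumption chosen for the example. It only illustrates the idea of turning raw text into labeled fields.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Instance:
    """One training example: a list of tokens plus its label (illustrative only)."""
    tokens: List[str]
    label: str


class TsvClassificationReader:
    """Hypothetical reader that expects one 'text<TAB>label' example per line."""

    def __init__(self, lowercase: bool = True) -> None:
        self.lowercase = lowercase

    def text_to_instance(self, text: str, label: str) -> Instance:
        # Naive whitespace tokenization; a real reader would use a proper tokenizer.
        if self.lowercase:
            text = text.lower()
        return Instance(tokens=text.split(), label=label)

    def read(self, lines: Iterable[str]) -> Iterator[Instance]:
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            text, label = line.rsplit("\t", 1)
            yield self.text_to_instance(text, label)


raw = ["A wonderful film\tpositive", "", "Dull and far too long\tnegative"]
instances = list(TsvClassificationReader().read(raw))
for instance in instances:
    print(instance.label, instance.tokens)
```

Separating the per-example logic (`text_to_instance`) from the file-level loop (`read`) means the same conversion can be reused at prediction time, when there is a single input rather than a file.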

The outputs produced by the DatasetReader are used to build specific NLP models. In AllenNLP, the process of implementing a model is abstracted by the, well, Model class. Given that AllenNLP is built on PyTorch, every Model is implemented as a PyTorch Module, which streamlines the interoperability with other components of the PyTorch ecosystem. 

The training process in AllenNLP is fundamentally driven by JSON configuration files (written in Jsonnet, a superset of JSON), which set up the parameters for training specific NLP models. AllenNLP configuration files enable the configuration of a complete training loop, from the reading and transformation of the input dataset to the training optimization.
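To illustrate the declarative idea, here is a minimal sketch of configuration-driven construction in plain Python. It is not AllenNLP's implementation: the `register` and `build` helpers, the class names, and the configuration keys are all hypothetical. The sketch assumes each configuration section names a component through a "type" key and passes the remaining keys as constructor arguments.

```python
import json

# Hypothetical registry mapping a "type" string in the config to a Python class.
REGISTRY = {}


def register(name):
    """Class decorator that records a component under a config-friendly name."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


@register("whitespace_reader")
class WhitespaceReader:
    def __init__(self, lowercase=False):
        self.lowercase = lowercase


@register("bag_of_words_classifier")
class BagOfWordsClassifier:
    def __init__(self, embedding_dim, num_labels):
        self.embedding_dim = embedding_dim
        self.num_labels = num_labels


def build(section):
    """Instantiate the class named by section["type"] using the remaining keys."""
    params = dict(section)  # copy so the original config is not mutated
    cls = REGISTRY[params.pop("type")]
    return cls(**params)


# Illustrative experiment description; every key and value here is made up.
CONFIG = json.loads("""
{
  "dataset_reader": {"type": "whitespace_reader", "lowercase": true},
  "model": {"type": "bag_of_words_classifier", "embedding_dim": 10, "num_labels": 2},
  "trainer": {"num_epochs": 5}
}
""")

reader = build(CONFIG["dataset_reader"])
model = build(CONFIG["model"])
print(type(reader).__name__, reader.lowercase)
print(type(model).__name__, model.embedding_dim, CONFIG["trainer"]["num_epochs"])
```

With this pattern, swapping the model or changing a hyperparameter is an edit to the configuration rather than to the code, which is what makes experiments easy to change and version.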

Once the models have been initialized and trained, AllenNLP provides mechanisms to streamline important aspects of their lifecycle, such as hyperparameter optimization, model serving, debugging and several others.

A Large Portfolio of NLP Tasks 

One of the key benefits of AllenNLP is its large catalog of NLP tasks, ranging from pre-trained models to encoding and transformation tasks, that can be assembled into comprehensive solutions. This library covers, among others, the following categories:

  • Text Generation: Includes tasks such as summarization and others that involve generating text based on a larger input text. 

  • Language Modeling: Tasks that focus on learning the probability distribution over sequences of tokens. 

  • Multiple Choice: Tasks that require selecting an option among several alternatives. 

  • Pair Classification: Tasks that take two sentences as input to determine the relationships between their facts. 

  • Structured Prediction: Tasks that extract from a sentence structured representations that can answer specific questions, such as who did what to whom. 

  • Sequence Tagging: Tasks that assign a label to each token in a sentence, for example to identify named entities. 

  • Text + vision: Multi-modal tasks such as visual question answering that can answer textual questions about the contents of images. 
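As a worked illustration of the language modeling category above, the sketch below estimates a probability distribution over token sequences with a simple bigram count model. This is a deliberately tiny stand-in, not the neural models AllenNLP ships; the function names and the toy corpus are hypothetical.

```python
from collections import Counter


def train_bigram_model(corpus):
    """Count how often each token follows each context token."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])  # every token except the end marker acts as a context
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return contexts, bigrams


def sequence_probability(sentence, contexts, bigrams):
    """P(sentence) as a product of P(token | previous token), with no smoothing."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        if contexts[prev] == 0:
            return 0.0  # unseen context: the unsmoothed model assigns zero probability
        prob *= bigrams[(prev, cur)] / contexts[prev]
    return prob


corpus = ["the cat sat", "the dog sat"]
contexts, bigrams = train_bigram_model(corpus)
print(sequence_probability("the cat sat", contexts, bigrams))  # 0.5
print(sequence_probability("the cat ran", contexts, bigrams))  # 0.0
```

In the toy corpus, "the" is followed by "cat" half of the time and every other transition in "the cat sat" is certain, so the sentence gets probability 0.5. Neural language models replace the counts with learned parameters but estimate the same kind of distribution.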

The implementation of these tasks is abstracted as PyTorch modules, which can be used with or without AllenNLP. Additionally, AllenNLP contains a large portfolio of pre-trained models, including several transformer architectures that drastically simplify the process of using these state-of-the-art techniques in NLP solutions. 

A Framework for Modern NLP 

In the current state of the deep learning ecosystem, the task of selecting an NLP stack might seem overwhelming. However, the process becomes drastically simpler if you filter for frameworks that incorporate state-of-the-art NLP research methods such as transformer architectures. AllenNLP is one of those rare NLP frameworks that includes implementations of cutting-edge NLP techniques using a simple and modular programming model based on PyTorch. AllenNLP was created by researchers with the objective of advancing NLP research and implementations, and the first iterations of this framework certainly achieve that objective.

Further Reading: More details about AllenNLP can be found on the project’s website: https://allennlp.org/

Previously in ‘What’s New in AI’: Edge#102: DeepMind Redefines One of the Most Important Algorithms in ML as a Game