📕📖📗 Natural Language Understanding Recap
Recap collections help you navigate specific topics and fill the gaps if you missed something
💡 Natural Language Understanding (NLU)
In the AI context, Natural Language Processing (NLP) is the umbrella field covering the disciplines that tackle the interaction between computer systems and human natural languages. From that perspective, NLP includes sub-disciplines such as discourse analysis, relationship extraction, natural language understanding (NLU) and a few other language analysis areas.
Natural Language Understanding (NLU) is a subset of NLP that focuses on reading comprehension and semantic analysis. NLU has experienced remarkable growth in the last few years. It is one of the areas of AI that has seen the greatest adoption into mainstream applications. From basic chatbots to sophisticated digital assistants, conversational applications are becoming a common trend in the software industry.
In previous Edges, we’ve discussed several of the top technologies, cutting-edge research papers and concepts in the NLU space. Below, we list some of our favorite editions on NLU topics.
Edge#21 (read without subscription): Among the NLU disciplines receiving lots of attention from the research community, question-answering models (QA models) are close to the top of the list. The universe of question-answering models can be divided into two main groups: closed-domain and open-domain. Closed-domain question-answering models focus on answering a limited set of questions about a specific topic or domain. These techniques have been popular for powering machine reading applications in fields like telemedicine, or research and discovery applications in legal fields. Open-domain question-answering is a more interesting and far more complex challenge. The idea of open-domain question-answering is to create models that can answer questions about any topic and across an arbitrary set of documents. Also in Edge#21: the paper in which the Google Research team presents a new dataset for training and evaluating question-answering systems; and DeepPavlov, an open-source framework for advanced NLU methods including question-answering.
Edge#22: In this issue, we explore the controversial world of text or language generation models. By controversial, we refer to the fact that deep learning methods for text generation have been at the center of issues such as fake news, which has surfaced some of the nefarious uses of machine learning applications. However, text generation stands on its own merits as one of the most important NLU disciplines today. Also in Edge#22: the research behind Microsoft’s Turing-NLG, one of the largest pre-trained language models in history; and AllenNLP, an open-source framework for advanced natural language research.
Edge#23: In this issue, we expand on the topic of QA models by focusing on another discipline that is getting a lot of attention from the deep learning research community: machine reading comprehension (MRC). Conceptually, MRC looks to replicate humans’ cognitive ability to understand a text with little or no previous context. If we want to test someone’s understanding of a given text, we typically ask different questions of varying complexity. MRC looks to recreate that ability in deep learning models. Also in Edge#23: the concept of machine reading comprehension; an evaluation of the SQuAD 2.0 dataset from Stanford University; and an introduction to the spaCy framework.
Edge#24: Here we explore the topic of text summarization, which consists of methods that generate a succinct summary from a longer-form text. The act of summarizing large pieces of information into brief textual outlines is one of the marvels of human cognition. When summarizing long-form texts, we not only make them shorter, we also produce texts that capture the essence of the given information. Text summarization looks to recreate this skill in machine learning models. Also in Edge#24: an overview of PEGASUS, Google’s new research in abstractive text summarization; and an exploration of Stanford’s CoreNLP framework.
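To make the summarization idea from Edge#24 a bit more concrete, here is a minimal sketch of the simplest variant: extractive summarization, which picks the highest-scoring sentences from the original text. Note that this is not the abstractive approach PEGASUS takes (abstractive models write new sentences); it is a toy word-frequency heuristic. The function names, the tiny stopword list and the scoring rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger ones).
STOPWORDS = {"a", "an", "and", "the", "is", "was", "of", "to", "in", "it"}


def tokenize(text):
    """Lowercase the text and return its non-stopword word tokens."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, max_sentences=2):
    """Return up to max_sentences of the highest-scoring sentences, in original order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        # Average document-level frequency of the sentence's words.
        tokens = tokenize(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    sample = "Cats purr. Cats chase mice. The weather was mild."
    print(summarize(sample, max_sentences=1))
```

A heuristic like this only shortens the text by selecting sentences. Capturing "the essence" of a document in newly written sentences is what abstractive models are trained to do.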
By reading TheSequence Edge regularly, you become smarter about ML and AI. The content is trusted by major AI labs, enterprises, and universities around the world.