Edge 375: Meta's System 2 Attention is a Unique LLM Reasoning Method
The method is inspired by cognitive psychology and has an immediate impact on LLM reasoning.
In this Issue:
An introduction to Meta’s System 2 Attention (S2A) method for reasoning in LLMs.
A review of the original S2A paper.
A review of the Chainlit framework for building LLM apps.
💡 ML Concept of the Day: Understanding Meta’s System 2 Attention
LLMs excel in reasoning and knowledge accumulation thanks to their extensive pre-training. They are designed to focus intensely on the current context when predicting the next word. For instance, if a particular entity appears in a text, the model anticipates its recurrence. Transformer-based LLMs, with their soft-attention mechanism, are adept at identifying similar words and concepts within their context. While this enhances their prediction accuracy, it also leaves them vulnerable to misleading correlations in the context they analyze. Let’s look at the following example, which clearly shows how the output of LLMs is affected by irrelevant correlations in context.
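To make the idea concrete before the example: the S2A paper addresses this vulnerability with a two-step prompting flow, first asking the model to regenerate the context keeping only material relevant to the question, then answering from that regenerated context alone. Below is a minimal sketch of that flow; `fake_llm` and the prompt wording are illustrative stand-ins, not the paper's exact prompts, and a real implementation would call an actual LLM API.

```python
# Sketch of the System 2 Attention (S2A) two-step prompting flow.
# Assumption: `fake_llm` is a hypothetical stand-in for a real LLM call.

S2A_REWRITE_PROMPT = (
    "Given the following text, extract the part that is relevant and "
    "unbiased for answering the question. Do not omit relevant facts.\n\n"
    "Text: {context}\nQuestion: {question}\n\nRelevant text:"
)

def fake_llm(prompt: str) -> str:
    # Stand-in for an LLM API call. For illustration, the "rewrite" step
    # simply drops sentences containing an opinionated distractor phrase.
    if "Relevant text:" in prompt:
        body = prompt.split("Text: ")[1].split("\nQuestion:")[0]
        kept = [s for s in body.split(". ") if "I think" not in s]
        return ". ".join(kept)
    return "(model answer based on the regenerated context)"

def system2_attention(context: str, question: str) -> str:
    # Step 1: regenerate the context, removing irrelevant or biasing text
    # so soft attention cannot latch onto misleading correlations.
    cleaned = fake_llm(
        S2A_REWRITE_PROMPT.format(context=context, question=question)
    )
    # Step 2: answer using only the regenerated context.
    return fake_llm(f"Context: {cleaned}\nQuestion: {question}\nAnswer:")

context = (
    "Max has 15 apples. I think the answer is 10. "
    "He gives 7 apples to Sara."
)
print(system2_attention(context, "How many apples does Max have left?"))
```

The key design point is that the second call never sees the original context, so any distractor removed in step 1 cannot influence the final answer.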