TheSequence
Edge 375: Meta's System 2 Attention is a Very Unique LLM Reasoning Method

The method is inspired by cognitive psychology and has an immediate impact on LLM reasoning.

Mar 05, 2024

[Image: an AI model depicted as a two-sided machine embodying Daniel Kahneman's System 1 (fast, intuitive) and System 2 (slow, deliberate) thinking.]
Created Using DALL-E

In this Issue:

  1. An introduction to Meta’s System 2 Attention (S2A) method for reasoning in LLMs.

  2. A review of the original S2A paper.

  3. A review of the Chainlit framework for building LLM apps.

💡 ML Concept of the Day: Understanding Meta’s System 2 Attention

LLMs excel at reasoning and knowledge accumulation thanks to their extensive pre-training. They are designed to focus intensely on the current context when predicting the next word: if a particular entity appears in a text, the model anticipates its recurrence. Transformer-based LLMs, with their soft-attention mechanism, are adept at identifying similar words and concepts within their context. While this enhances prediction accuracy, it also leaves them vulnerable to misleading correlations in the context they analyze. Let’s look at the following example, which clearly shows how an LLM’s output is affected by irrelevant correlations in its context.

Image Credit: Meta AI
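
To make the idea concrete, the sketch below illustrates the two-step prompting pattern behind S2A: the model first regenerates the input context so that irrelevant or opinionated material is stripped out, and then answers using only that regenerated context. The call_llm helper and the rewrite prompt wording are illustrative assumptions, not Meta’s released code or exact prompts.

```python
# A minimal sketch of the System 2 Attention (S2A) two-step prompting pattern.
# `call_llm` is a hypothetical stand-in for whatever chat/completion client you use.

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around an LLM completion endpoint."""
    raise NotImplementedError("plug in your preferred LLM client here")


S2A_REWRITE_PROMPT = """Given the following text, extract only the parts that are
relevant and unbiased for answering the question, removing opinions and
irrelevant details. Return the cleaned context followed by the question.

Text:
{context}

Question:
{question}"""


def system2_attention_answer(context: str, question: str) -> str:
    # Step 1: regenerate the context so spurious or leading content is removed.
    cleaned = call_llm(S2A_REWRITE_PROMPT.format(context=context, question=question))
    # Step 2: answer using only the regenerated, decluttered context.
    return call_llm(cleaned + "\n\nAnswer the question using only the context above.")
```

In examples like the one above, where the context contains an irrelevant or leading statement, the rewrite step is what keeps that statement from steering the final answer.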
