TheSequence

The Sequence Research #558: The New Reinforcement Learning from Internal Feedback Allows LLMs to Reason Without External Rewards

The new method from UC Berkeley provides an interesting complement to traditional RLHF methods.

Jun 06, 2025

Reinforcement learning has become a key technique for enhancing the capabilities of large language models (LLMs), particularly in complex reasoning tasks. The two dominant approaches have delivered impressive results: Reinforcement Learning from Human Feedback (RLHF) aligns models with human preferences, while Reinforcement Learning with Verifiable Rewards (RLVR) improves correctness by rewarding outputs that pass an automatic check. Yet both come with intrinsic limitations. RLHF demands labor-intensive and costly human annotation, while RLVR is constrained to domains where answers can be objectively verified, either by running test suites or by matching gold-standard outputs.

In response to these limitations, the paper "Learning to Reason without External Rewards" proposes a radically different paradigm: Reinforcement Learning from Internal Feedback (RLIF). This approach enables LLMs to learn from their own internal signals, without relying on any form of external supervision. The central idea is instantiated through a new algorithm called INTUITOR, which uses a model's internal measure of confidence, termed self-certainty, as the only reward signal for policy optimization.
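To make the idea of a confidence-based reward concrete, here is a minimal sketch in plain Python. The excerpt above does not define self-certainty, so the formula used here is an assumption: the average, over generated tokens, of the KL divergence from a uniform distribution over the vocabulary to the model's next-token distribution. The function names (`log_softmax`, `self_certainty`, `group_advantages`) and the toy logits are hypothetical, and the group-normalization step is a GRPO-style choice assumed for illustration, since the text says only "policy optimization".

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def self_certainty(token_logits):
    """Assumed definition: mean over positions of KL(U || p), with U uniform.

    token_logits holds one list of logits per generated token.
    The score is 0 when every next-token distribution is uniform and grows
    as the distributions become more peaked (the model is more confident).
    """
    total = 0.0
    for logits in token_logits:
        logp = log_softmax(logits)
        v = len(logits)
        # KL(U || p) = sum_j (1/V) * (log(1/V) - log p_j)
        total += -math.log(v) - sum(logp) / v
    return total / len(token_logits)


def group_advantages(rewards, eps=1e-8):
    """Assumed GRPO-style step: standardize rewards within one group of samples."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


# Toy example: three sampled answers to the same prompt, vocabulary of size 4.
unsure = [[0.0, 0.0, 0.0, 0.0], [0.1, 0.0, 0.0, 0.0]]
medium = [[2.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
confident = [[10.0, 0.0, 0.0, 0.0], [0.0, 10.0, 0.0, 0.0]]

rewards = [self_certainty(sample) for sample in (unsure, medium, confident)]
advantages = group_advantages(rewards)
```

In this toy setup the most peaked sample receives the largest advantage, so a policy-gradient update would push the model toward outputs it is already confident about. No human label or answer checker appears anywhere in the loop, which is the property the RLIF framing emphasizes.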

© 2025 Jesus Rodriguez