TheSequence
Edge 342: Who's Harry Potter? Inside One of the Most Fascinating Papers Published This Year

Microsoft Research details a fine-tuning method for unlearning concepts in LLMs

Nov 09, 2023

Created Using DALL-E

Large language models (LLMs) are regularly trained on vast amounts of unlabeled data, which leads them to acquire knowledge of incredibly diverse subjects. The datasets used to pretrain LLMs often include copyrighted material, which triggers legal and ethical concerns for developers, users, and original content creators. Quite often, specific knowledge must be removed from an LLM in order to adapt it to a particular domain. While learning in LLMs is certainly impressive, unlearning specific concepts remains a nascent area of exploration. Fine-tuning methods are effective for incorporating new concepts, but can they be used to unlearn specific knowledge?

In one of the most fascinating papers of this year, Microsoft Research explores an unlearning technique for LLMs. The challenge was nothing less than making Llama-7B forget any knowledge of Harry Potter.
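
To make that question concrete before diving into the paper, the sketch below shows one way fine-tuning can be repurposed for unlearning: contrast the base model with a "reinforced" copy fine-tuned on the target content to spot concept-specific tokens, then fine-tune the base model toward "generic" next-token distributions that suppress those tokens. This is a minimal, hypothetical sketch, not the paper's exact recipe; the model paths, the alpha hyperparameter, and the loss are illustrative assumptions.

```python
# Hypothetical sketch: unlearning a concept by fine-tuning a causal LM toward
# "generic" next-token distributions on text about that concept.
# All model paths and hyperparameters below are assumptions for illustration.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name = "meta-llama/Llama-2-7b-hf"                 # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base_name)

baseline = AutoModelForCausalLM.from_pretrained(base_name)   # frozen reference
reinforced = AutoModelForCausalLM.from_pretrained(
    "path/to/model-finetuned-on-target-content"              # assumed "reinforced" copy
)
model = AutoModelForCausalLM.from_pretrained(base_name)      # the copy we unlearn

baseline.eval()
reinforced.eval()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
alpha = 5.0  # assumed strength of suppression for concept-specific tokens

def unlearning_step(input_ids, attention_mask):
    """One gradient step toward generic predictions on concept-related text."""
    with torch.no_grad():
        base_logits = baseline(input_ids, attention_mask=attention_mask).logits
        reinf_logits = reinforced(input_ids, attention_mask=attention_mask).logits
        # Tokens the reinforced model upweights relative to the baseline are
        # likely concept-specific; push their probability down in the targets.
        generic_logits = base_logits - alpha * torch.relu(reinf_logits - base_logits)
        targets = F.softmax(generic_logits, dim=-1)

    logits = model(input_ids, attention_mask=attention_mask).logits
    loss = F.kl_div(F.log_softmax(logits, dim=-1), targets, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Illustrative usage on a snippet of concept-related text:
batch = tokenizer(["Harry picked up his wand and..."], return_tensors="pt")
print(unlearning_step(batch["input_ids"], batch["attention_mask"]))
```

The design choice worth noting in this sketch is the KL loss toward softened targets rather than plain gradient ascent on the original labels: it steers the model away from the concept while keeping its behavior on unrelated text close to the baseline.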

The Unlearning Challenge in LLMs
