TheSequence

Edge 452: The AI Magic Behind Google's NotebookLM Audio Features

How does NotebookLM generate such cool podcasts?

Nov 28, 2024

Google’s NotebookLM has rapidly become one of the most popular AI tools released since ChatGPT, and podcast generation is by far its most popular feature. These days I constantly find social media threads that use audio clips generated by NotebookLM, to the point that I am starting to recognize the voices in the podcasts. NotebookLM’s audio generation captures humor, regular questions, interruptions, and other conversational aspects that are incredibly hard to master. How did Google achieve this? NotebookLM’s audio generation capabilities are the result of combining several techniques developed by Google DeepMind over the last few years. Specifically, the audio magic in NotebookLM is powered by innovations in two key models, SoundStorm and AudioLM, which underpin Google DeepMind’s approach to audio generation.

Audio generation is a burgeoning area of AI research. The field centers on building systems capable of generating realistic and coherent sounds, including speech and music. Google DeepMind has made notable strides here, pioneering novel techniques that are significantly shaping audio generation.

© 2025 Jesus Rodriguez