Edge 328: Inside AudioCraft: Meta AI’s New Family of Generative Audio Models
A review of Meta's EnCodec, AudioGen and MusicGen models.
Audio is rapidly becoming one of the new frontiers of generative AI. Generating high-fidelity audio requires modeling intricate signals and patterns at diverse scales. Among audio types, music is especially daunting because it combines local and long-range patterns, from individual notes to complex musical structures with multiple instruments. Conventional approaches rely on symbolic representations such as MIDI or piano rolls, but these fall short of capturing the expressive nuances and stylistic richness intrinsic to music. Meta AI recently introduced AudioCraft, a family of generative AI models for high-quality audio generation.