Google researchers have introduced MusicLM, an AI model that can generate high-fidelity music from text descriptions. MusicLM generates music at 24 kHz that remains consistent over several minutes by treating conditional music generation as a hierarchical sequence-to-sequence modeling problem.
According to the research paper, MusicLM was trained on a dataset of 280,000 hours of music so that it can produce coherent songs from complex descriptions. The researchers also claim that their model outperforms previous systems in both audio quality and adherence to the text description.
MusicLM samples include five-minute pieces generated from only one or two words, such as "melodic techno," as well as 30-second samples that sound like entire