Discover JASCO: Meta FAIR’s Innovative AI Model for High-Quality Text-to-Music Generation

Meta FAIR researchers introduce JASCO, a groundbreaking text-to-music AI model offering unprecedented control with symbolic and audio-based inputs. Learn how JASCO is transforming AI-generated music.

In a significant step forward for AI-generated music, Meta FAIR researchers have unveiled JASCO, a new generative text-to-music model that offers unprecedented control over the music creation process. Unlike other text-to-music approaches, JASCO can accept a wide range of conditioning inputs, such as chords or beats, allowing for greater flexibility and precision in the types of outputs it generates.

JASCO, which stands for “Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation,” is a temporally controlled text-to-music generation model that utilizes both symbolic and audio-based conditions. This innovative approach enables JASCO to generate high-quality music samples conditioned on global text descriptions, along with fine-grained local controls.

JASCO’s Conditioning Method

The model is based on the flow-matching modelling paradigm and a novel conditioning method, allowing for music generation controlled both locally (e.g., chords) and globally (text description). The researchers apply information bottleneck layers and temporal blurring to extract relevant information for specific controls. This allows the incorporation of symbolic and audio-based conditions in the same text-to-music model.
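As a rough illustration of the temporal blurring idea, the sketch below averages a sequence of feature frames over fixed, non-overlapping time windows, so that only coarse temporal information survives. This is a minimal sketch under stated assumptions, not the researchers' implementation: the function name `temporal_blur`, the window size, and the toy frames are all hypothetical, and the real model's feature representation and pooling details may differ.

```python
from typing import List


def temporal_blur(frames: List[List[float]], window: int) -> List[List[float]]:
    """Replace each frame with the mean of its non-overlapping time window.

    Hypothetical helper for illustration only; it is not part of any
    released JASCO code.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    blurred: List[List[float]] = []
    for start in range(0, len(frames), window):
        chunk = frames[start:start + window]
        dim = len(chunk[0])
        # Average each feature dimension across the frames in this window.
        mean = [sum(frame[d] for frame in chunk) / len(chunk) for d in range(dim)]
        # Repeat the averaged frame so the sequence length is unchanged.
        blurred.extend(list(mean) for _ in chunk)
    return blurred


# Toy input: five frames with two features each (made-up values).
frames = [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0], [6.0, 8.0], [10.0, 0.0]]
blurred = temporal_blur(frames, window=2)
```

With a larger window, fine frame-to-frame detail is averaged away and only the slowly varying shape of the signal is passed on, which is the general intuition behind using blurring to limit how much a conditioning signal can reveal.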

Meta FAIR researchers conducted several studies using different symbolic conditioning signals, such as chords and melody, and different audio representations, such as separate drum tracks or a full mix. The results suggest that JASCO's generation quality is comparable to that of the evaluated baselines, while it offers significantly better and more versatile control over the generated music.

To demonstrate these capabilities, Meta has published a page of JASCO clips that turn simple public-domain melodies into full music tracks. For example, a melody from Maurice Ravel’s Boléro is transformed into an ‘80s driving pop song and into a folk song featuring accordion and acoustic guitar. Tchaikovsky’s Swan Lake is rendered as a ‘Chinese traditional’ piece with guzheng, percussion, and bamboo flute, and as an R&B track with deep bass, electronic drums, and a lead trumpet.

Meta has a track record of releasing its AI research publicly. With JASCO, the company has published the research paper and plans to release, within a couple of weeks, the inference code under the MIT license and the pre-trained JASCO model under a Creative Commons license. This should make it easier for other developers to use the model and build further AI applications on top of it.

According to Meta FAIR representatives, with artificial intelligence advancing at a fast pace, the company considers initiatives like this one crucial for collaboration with the international AI community.

JASCO follows Meta's release last year of MusicGen, a text-to-music model that can produce 12-second tracks from simple prompts. More recently, Google DeepMind, a renowned artificial intelligence research lab, introduced a video-to-audio technology known as V2A. Stability AI, the firm behind the AI art generator Stable Diffusion, has released Stable Audio Open, an open-source model that can generate audio clips of up to 47 seconds free of charge.

In short, JASCO offers finer control over its output through symbolic and audio-based conditions. It generates high-quality music with local and global controls, at a quality comparable to other AI tools while offering more versatility. Along with other advances in AI music, JASCO could transform the industry, but it also raises copyright and authorship concerns.

This post was last modified on June 22, 2024 12:21 am

Tech Chilli Desk

Tech Chilli News Desk is a conglomeration of Tech enthusiasts who are committed to delving deep into the evolving new-age technology of Web 3.0, Artificial Intelligence (AI), Robotics, Fintech, Crypto and more. This desk brings the latest information on Digital Transformation through use cases, implementations, coverage, case studies, reporting and deep analysis.
