The Fundamental AI Research team at Meta has launched four new AI models, now accessible to researchers and developers for creating innovative applications. A detailed paper about one of these models, JASCO, has been published on the arXiv preprint server, describing its potential uses.
Amid growing interest in AI applications, leading companies in the sector are building models that let other organizations add AI capabilities to their own products. As part of this effort, Meta’s team has released four models: JASCO, AudioSeal, and two versions of Chameleon.
JASCO is designed to accept various types of audio input and use them to shape the music it generates. According to the team, users can adjust attributes such as drum patterns, guitar chords, and melodies to build a piece of music, and can also supply text input to influence the style of the resulting tune.
For instance, one could instruct the model to produce a bluesy tune with prominent bass and drums, then refine it with similar instructions for other instruments. Meta’s team also compared JASCO with similar systems and reports that it outperformed them on three key metrics.
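As a rough sketch of what text-conditioned music generation looks like in code, the example below uses MusicGen from Meta's open-source audiocraft library as a stand-in, since JASCO's own interface adds chord, drum, and melody conditioning on top of text and its exact API may differ; the checkpoint name and parameters shown are illustrative assumptions.

```python
# Minimal text-to-music sketch using MusicGen from Meta's audiocraft
# library as a stand-in for JASCO; JASCO additionally accepts chord,
# drum, and melody conditioning, and its API may differ.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load a small pretrained checkpoint (illustrative choice).
model = MusicGen.get_pretrained("facebook/musicgen-small")
model.set_generation_params(duration=8)  # generate 8 seconds of audio

# Text prompt in the spirit of the example above.
descriptions = ["a bluesy tune with prominent bass and drums"]
wav = model.generate(descriptions)  # returns a batch of waveforms

# Save the first generated clip with loudness normalization.
audio_write("bluesy_tune", wav[0].cpu(), model.sample_rate, strategy="loudness")
```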
AudioSeal is intended to watermark speech generated by AI applications, making it easily identifiable as artificially created. The team notes that it can also watermark AI-generated speech segments mixed with real speech, and it will be available with a commercial license.
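In practice, AudioSeal's workflow has two halves: a generator embeds an imperceptible watermark into a waveform, and a detector later estimates the probability that a clip carries it. The sketch below follows the open-source audioseal package's published usage; the checkpoint names and exact call signatures are assumptions that may vary between releases.

```python
# Watermark-and-detect sketch based on the audioseal package's published
# usage; checkpoint names and signatures may vary between releases.
import torch
from audioseal import AudioSeal

generator = AudioSeal.load_generator("audioseal_wm_16bits")
detector = AudioSeal.load_detector("audioseal_detector_16bits")

# Placeholder one-second mono clip at 16 kHz (batch, channels, samples);
# in practice this would be AI-generated speech.
sample_rate = 16000
audio = torch.randn(1, 1, sample_rate)

# The generator predicts an additive, imperceptible watermark signal.
watermark = generator.get_watermark(audio, sample_rate)
watermarked = audio + watermark

# The detector returns the probability that the clip is watermarked,
# plus the decoded hidden message bits.
probability, message = detector.detect_watermark(watermarked, sample_rate)
print(f"watermark probability: {probability:.3f}")
```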
The Chameleon models, 7B and 34B, convert text into visual representations and are being released with limited functionality. Because they must understand both text and images, they can handle tasks such as generating captions for pictures.
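For an image-to-text task such as captioning, a sketch along the following lines should work, assuming the Hugging Face transformers integration of Chameleon (a recent transformers version and access to the gated model weights are required); the prompt format and generation settings are illustrative assumptions.

```python
# Captioning sketch assuming the Hugging Face transformers port of
# Chameleon; requires a recent transformers version and access to the
# gated "facebook/chameleon-7b" weights.
import torch
from PIL import Image
from transformers import ChameleonProcessor, ChameleonForConditionalGeneration

processor = ChameleonProcessor.from_pretrained("facebook/chameleon-7b")
model = ChameleonForConditionalGeneration.from_pretrained(
    "facebook/chameleon-7b", torch_dtype=torch.bfloat16
)

# The <image> token marks where the picture is spliced into the prompt.
image = Image.open("photo.jpg")
prompt = "Describe this image in one sentence.<image>"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(
    model.device, dtype=torch.bfloat16
)
output = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(output[0], skip_special_tokens=True))
```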