In July, Meta introduced Make-a-Scene, an AI system that converts text into images. Now Meta CEO Mark Zuckerberg has unveiled Make-a-Video, which converts text into video.
Make-a-Video is “a new AI system that allows people to turn text descriptions into short, high-quality video clips,” Zuckerberg wrote.
Functionally, Make-a-Video works the same way as Make-a-Scene: it relies on a combination of natural language processing and generative neural networks to turn non-visual descriptions into visual content, only in a different output format. The researchers say the model learns what the world looks like, and how it is described, from paired text-image datasets, and learns how the world moves from video. This approach allowed the team to reduce the time required to train the video model and eliminate the need for paired text-video data while maintaining diversity.
Meta offers Make-a-Video as an open-source project. The company is ready to “share this generative research and AI results with the community for feedback.” The company also says it wants to prevent the tool from being used for harmful purposes. To that end, the research team filtered NSFW images (nudity, gore, pornography, scenes of violence) and toxic phrases out of the Make-a-Video training dataset in advance.
Source: Engadget