Vall-E is Microsoft’s new AI technology that very accurately imitates a person’s voice based on a 3-second sample

Microsoft researchers have created a new artificial intelligence model, Vall-E, capable of reproducing a voice that sounds identical to a human speaker's. Vall-E is said to be trained on “discrete codes derived from a standard neural audio codec model” and on 60,000 hours of speech (100 times more than existing systems) from more than 7,000 speakers. Most of the recordings are taken from the public LibriVox audiobook site.

Vall-E is based on the EnCodec technology that Meta announced in October 2022. It analyzes a person’s voice, breaks that information down into components, and synthesizes how the voice would sound speaking other phrases. Even after listening to just a three-second sample, Vall-E can reproduce the timbre and emotional tone of the speaker.
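To give a rough sense of what "breaking a voice down into components" can mean, here is a toy sketch of residual vector quantization, the general technique neural audio codecs such as EnCodec use to turn audio features into discrete codes. It is not Microsoft's or Meta's code: the function names, the random codebooks, and the 4-number "frame" are all assumptions made up for illustration, and a real codec learns its codebooks from data.

```python
import random


def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared distance)."""
    best_i, best_d = 0, float("inf")
    for i, entry in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(entry, vec))
        if d < best_d:
            best_i, best_d = i, d
    return best_i


def rvq_encode(frame, codebooks):
    """Turn one feature frame into one discrete code per quantizer stage."""
    residual = list(frame)
    codes = []
    for book in codebooks:
        idx = nearest(book, residual)
        codes.append(idx)
        # The next stage only has to describe what this stage missed.
        residual = [r - c for r, c in zip(residual, book[idx])]
    return codes


def rvq_decode(codes, codebooks):
    """Rebuild an approximate frame by summing the chosen codebook entries."""
    out = [0.0] * len(codebooks[0][0])
    for idx, book in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, book[idx])]
    return out


def make_codebooks(stages, size, dim, seed=0):
    """Hypothetical random codebooks; each later stage covers a finer scale."""
    rng = random.Random(seed)
    books = []
    for stage in range(stages):
        scale = 1.0 / (2 ** stage)
        book = [[rng.uniform(-scale, scale) for _ in range(dim)]
                for _ in range(size)]
        # A zero entry guarantees that adding a stage never increases the error.
        book[0] = [0.0] * dim
        books.append(book)
    return books


if __name__ == "__main__":
    books = make_codebooks(stages=4, size=16, dim=4)
    frame = [0.3, -0.7, 0.5, 0.1]  # stand-in for one slice of audio features
    codes = rvq_encode(frame, books)
    print("codes:", codes)
    print("reconstruction:", rvq_decode(codes, books))
```

In this sketch, a short list of integers stands in for a slice of audio. According to the researchers' description quoted above, Vall-E is trained on codes of this kind rather than on raw waveforms.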

“The results of the experiment show that Vall-E is significantly superior to the current TTS system [an AI that reproduces voices it has never heard] from the point of view of the naturalness of speech and similarity to the speaker,” the researchers’ article states.

You can listen to examples of Vall-E's voices on GitHub. Most sound identical to the original recordings, even though only short fragments were used as input. A few sound more robotic, closer to traditional text-to-speech voices.

Microsoft researchers believe that Vall-E could be used in the future as a text-to-speech tool, as a speech editing method, and, in combination with other generative AI models such as GPT-3, as an audio generation system.

As with other AI models, there are concerns about Vall-E being misused, for example to mimic the voices of public figures, politicians, or celebrities (especially in conjunction with deepfakes). Criminals could also obtain sensitive data by convincing a person that they are talking to family, friends or officials. Some security systems also rely on voice recognition, which a cloned voice could deceive. As for its impact on jobs, Vall-E is likely to become a cheaper alternative to hiring voice actors for dubbing.

But Vall-E researchers say all these risks can be reduced:

“A model can be built that will determine whether the audio was synthesized by Vall-E.”

Microsoft, it seems, has firmly committed to developing AI technologies and building them into its own products. The company will try to integrate OpenAI’s GPT language model into Word, Outlook and PowerPoint, and ChatGPT, a chatbot that generates human-like text and gives detailed answers to questions, is to be added to a version of the Bing search engine starting in March.

According to media reports, Microsoft is also in talks to invest $10 billion in OpenAI. The agreement stipulates that the company will receive 75% of the AI lab’s profits until it recoups its investment. After reaching that goal, Microsoft will hold 49% of the startup’s shares, other investors will hold another 49%, and OpenAI’s non-profit parent organization will hold 2%.

Source: Techspot
