VASA-1: Lifelike Audio-Driven Talking Faces
Generated in Real Time
How does it work?
Image + Audio Clip
Single portrait photo + speech audio = hyper-realistic talking face video with precise lip-audio sync, lifelike facial behavior, and naturalistic head movements, generated in real time.
Website: https://www.microsoft.com/en-us/research/project/vasa-1/
Microsoft recently introduced VASA-1, an AI framework for generating lifelike talking faces for virtual characters. From just a single static image and a speech audio clip, the company says, VASA-1 can produce realistic short videos. The model also offers several options for modifying the generated video.