Audio2Face Overview

NVIDIA Omniverse™ Audio2Face is a combination of artificial intelligence (AI)-based technologies that generate facial animation and lip sync driven only by an audio source. Intuitive character retargeting lets you connect and animate your own characters.

With Audio2Face, you can:

Analyze an audio sample and automatically animate emotions in a character’s performance.
Animate all the features of your character’s face, including their eyes and tongue.
Retarget the performance of the entire face, including eyes, teeth, and tongue, from the A2F default model to your own characters with Character Setup and Transfer.
Live stream blendshape animation to other applications with LiveLink.

You can use Audio2Face at runtime or in more traditional content creation pipelines. It offers various output formats and includes the ability to connect a custom blendshape mesh and export the resulting blend weights.
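As a rough illustration of what exported blend weights are for, the sketch below applies one frame of weights to a mesh using the standard linear blendshape model (posed vertex = neutral vertex + sum of weight × delta). It does not read any Audio2Face file format. The shape names (`jawOpen`, `mouthSmile`), the tiny two-vertex mesh, and the function `apply_blend_weights` are all hypothetical and chosen only for the example.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def apply_blend_weights(
    neutral: List[Vec3],
    deltas: Dict[str, List[Vec3]],
    weights: Dict[str, float],
) -> List[Vec3]:
    """Return neutral + sum_i(weight_i * delta_i) for every vertex.

    `deltas` maps a blendshape name to per-vertex offsets from the
    neutral pose; `weights` maps the same names to one frame's weights.
    """
    result = [list(v) for v in neutral]
    for name, weight in weights.items():
        for vi, (dx, dy, dz) in enumerate(deltas[name]):
            result[vi][0] += weight * dx
            result[vi][1] += weight * dy
            result[vi][2] += weight * dz
    return [(x, y, z) for x, y, z in result]


# Hypothetical two-vertex mesh and two blendshapes (illustration only).
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deltas = {
    "jawOpen": [(0.0, -1.0, 0.0), (0.0, -0.5, 0.0)],
    "mouthSmile": [(0.5, 0.0, 0.0), (0.0, 0.0, 0.0)],
}
frame_weights = {"jawOpen": 0.5, "mouthSmile": 1.0}

posed = apply_blend_weights(neutral, deltas, frame_weights)
print(posed)
```

In a real pipeline the weights would come from the exported animation, one set per frame, and the deltas from the custom blendshape mesh.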

Minimum Mesh Requirements for Full Face Character Setup

A2F requires the head mesh to be broken down into its individual mesh components. The Head Mesh, Left Eye, Right Eye, Lower Teeth, and Tongue must be individual meshes and cannot contain sub meshes. See the online documentation and the NVOD A2F tutorial videos for further guidance.
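The mesh rule above can be expressed as a small pre-flight check. This is a minimal sketch that works on a plain dictionary, not on an actual scene: the `meshes` mapping (mesh name to a list of its sub mesh names) and the function `check_character_meshes` are assumptions made for illustration, and in practice you would fill the mapping from your own DCC tool or USD stage.

```python
from typing import Dict, List

# The five components named in the requirement above.
REQUIRED_MESHES = ["Head Mesh", "Left Eye", "Right Eye", "Lower Teeth", "Tongue"]


def check_character_meshes(meshes: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means the check passed.

    `meshes` maps each mesh name to the names of its sub meshes
    (an empty list when the mesh has none).
    """
    problems = []
    for name in REQUIRED_MESHES:
        if name not in meshes:
            problems.append(f"missing mesh: {name}")
        elif meshes[name]:
            subs = ", ".join(meshes[name])
            problems.append(f"{name} contains sub meshes: {subs}")
    return problems


# Hypothetical character: the teeth still contain a sub mesh, and the
# tongue has not been separated out yet.
character = {
    "Head Mesh": [],
    "Left Eye": [],
    "Right Eye": [],
    "Lower Teeth": ["gums"],
}

for problem in check_character_meshes(character):
    print(problem)
```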

Requirements

To use Audio2Face, you must have Omniverse Nucleus and one of the following operating systems:

Windows 64-bit, version 1909 or newer
Ubuntu Linux 20.04 or newer
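The operating system requirements can be checked from a script before installing. The sketch below uses only the Python standard library and makes a few assumptions: that Windows version 1909 corresponds to build number 18363, that `platform.version()` returns a string such as `10.0.18363` on Windows, and that Python 3.10 or newer is available on Linux for `platform.freedesktop_os_release()`. It does not check for Omniverse Nucleus, and the function names are illustrative.

```python
import platform

MIN_WINDOWS_BUILD = 18363  # assumed build number for Windows version 1909
MIN_UBUNTU = (20, 4)       # Ubuntu 20.04


def windows_ok(version: str) -> bool:
    """Check a 'major.minor.build' string such as '10.0.18363'."""
    parts = version.split(".")
    return (
        len(parts) >= 3
        and parts[2].isdigit()
        and int(parts[2]) >= MIN_WINDOWS_BUILD
    )


def ubuntu_ok(version_id: str) -> bool:
    """Check a VERSION_ID value such as '20.04' from /etc/os-release."""
    try:
        major, minor = (int(p) for p in version_id.split(".")[:2])
    except ValueError:
        return False
    return (major, minor) >= MIN_UBUNTU


def platform_supported() -> bool:
    """Return True when the current OS meets the listed requirements."""
    system = platform.system()
    if system == "Windows":
        # 64-bit machines report names such as 'AMD64'.
        return platform.machine().endswith("64") and windows_ok(platform.version())
    if system == "Linux":
        try:
            info = platform.freedesktop_os_release()  # Python 3.10+
        except (OSError, AttributeError):
            return False
        return info.get("ID") == "ubuntu" and ubuntu_ok(info.get("VERSION_ID", ""))
    return False


print("Supported platform:", platform_supported())
```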

If you’re trying to access sample assets in the local Omniverse mount, you need to install the Nucleus application from the Omniverse Launcher.

Note

Audio2Face can optionally use the Riva text-to-speech extension, which must first be enabled in the Extension Manager. The Riva extension is available only in Audio2Face, not in other NVIDIA Omniverse™ Apps.