Nvidia has unveiled a new cloud-based suite of GPU-accelerated AI video conferencing software aimed at enhancing streaming video quality and improving the overall video conferencing experience.
Nvidia Maxine is a cloud-native streaming video AI platform that allows service providers to bring new AI-powered capabilities to the more than 30 million web meetings estimated to take place every day. By running the platform on the company’s GPUs in the cloud, video conferencing providers can offer users AI effects including gaze correction, super-resolution, noise cancellation, face relighting and more.
One of Nvidia Maxine’s biggest advantages is that end users can enjoy all of these new features without the need for specialized hardware, as the data from their video conferencing calls is processed in the cloud rather than on their local devices. Ian Buck, vice president and general manager of Accelerated Computing at Nvidia, provided further insight on the company’s new platform in a press release, saying:
“Video conferencing is now a part of everyday life, helping millions of people work, learn and play, and even see the doctor. NVIDIA Maxine integrates our most advanced video, audio and conversational AI capabilities to bring breakthrough efficiency and new capabilities to the platforms that are keeping us all connected.”
The Nvidia Maxine platform can also dramatically reduce the bandwidth required for video calls: the AI software analyzes the key facial points of each person on a call and then intelligently re-animates the face in the video on the other side.
Using the company’s new AI-based video compression technology running on Nvidia GPUs, developers can reduce video bandwidth consumption to one-tenth of what the H.264 streaming video compression standard requires. This not only cuts costs for providers but also delivers a smoother video conferencing experience, even for users with less-than-ideal internet speeds.
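To put the claimed tenfold reduction in perspective, here is a back-of-the-envelope sketch. The H.264 bitrate below is an illustrative assumption (a 720p video call commonly runs at roughly 1.5 Mbps), not a figure from Nvidia:

```python
# Rough illustration of the claimed 10x bandwidth savings vs. H.264.
# H264_KBPS is an assumed typical bitrate for a 720p call, not an Nvidia figure.
H264_KBPS = 1500           # assumed H.264 call bitrate in kilobits per second
COMPRESSION_FACTOR = 10    # Nvidia's claimed reduction relative to H.264

maxine_kbps = H264_KBPS / COMPRESSION_FACTOR

def monthly_gb(kbps: float, hours: float) -> float:
    """Data transferred, in gigabytes, for `hours` of calls at `kbps`."""
    return kbps * 1000 / 8 * hours * 3600 / 1e9

# For 40 hours of calls a month (roughly two hours per workday):
print(f"H.264:  {monthly_gb(H264_KBPS, 40):.1f} GB")   # → 27.0 GB
print(f"Maxine: {monthly_gb(maxine_kbps, 40):.1f} GB") # → 2.7 GB
```

Under these assumptions, a heavy user’s monthly call traffic drops from tens of gigabytes to a few, which is why the savings matter both for providers’ costs and for users on constrained connections.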
Maxine will also help make video conferencing feel more like a face-to-face conversation, as service providers will be able to leverage Nvidia’s research in generative adversarial networks (GANs) to offer a variety of new features. These include face alignment, so that people appear to be facing each other during a call; gaze correction, which helps simulate eye contact; and animated avatars whose realistic animation is automatically driven by the speaker’s voice and emotional tone in real time.
With the Nvidia Jarvis SDK, developers can even integrate virtual assistants that use state-of-the-art AI language models for speech recognition, language understanding and speech generation. These virtual assistants can take notes, set action items and answer questions using human-like voices. At the same time, additional conversational AI services such as translation, closed captioning and transcription help ensure participants know what is being discussed in a call.
Interested computer vision AI developers, software partners, startups and computer manufacturers creating audio and video apps can now apply for early access to the Nvidia Maxine platform.