A Textured Approach: NVIDIA Research Shows How Gen AI Helps Create and Edit Photorealistic Materials

NVIDIA researchers are taking the stage at SIGGRAPH, the world’s largest computer graphics conference, to demonstrate a generative AI workflow that helps artists rapidly create and iterate on materials for 3D scenes.

The research demo, which will be presented today at the show’s Real-Time Live event, showcases how artists can use text or image prompts to generate custom textured materials — such as fabric, wood and stone — faster and with finer creative control. These capabilities will be coming to NVIDIA Picasso, allowing enterprises, software creators and service providers to create custom generative AI models for materials, developed using their own fully licensed data.

This set of AI models will facilitate iterative creation and editing of materials, enabling companies to offer new tools that’ll help artists rapidly refine a 3D object’s appearance until they achieve the desired result.

In the demo, NVIDIA researchers experiment with a living-room scene, as an AI-assisted interior designer might in any 3D rendering application. In this case, the researchers use NVIDIA Omniverse USD Composer — a reference application for scene assembly and composition using Universal Scene Description, known as OpenUSD — to add a brick-textured wall, create and modify fabric choices for the sofa and throw pillows, and incorporate an abstract animal design in a specific area of the wall.

Generative AI Enables Iterative Design 

The Real-Time Live demo combines several optimized AI models — a palette of tools that developers using Picasso will be able to customize and integrate into creative applications for artists.

Once integrated into creative applications, these features will allow artists to enter a brief text prompt to generate materials — such as a brick or a mosaic pattern — that are tileable, meaning they can be seamlessly replicated over a surface of any size. Or, they can import a reference image, such as a swatch of flannel fabric, and apply it to any object in the virtual scene.
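Tileable has a concrete meaning here: the texture can be laid edge to edge without visible seams. As a simplified, stand-alone illustration (not part of the Picasso tooling), one sufficient condition is that a texture’s opposite edges match, which can be checked directly:

```python
def is_tileable(texture):
    """Return True if a 2D grayscale texture wraps seamlessly under the
    simple criterion that its opposite edges match exactly."""
    top, bottom = texture[0], texture[-1]
    left = [row[0] for row in texture]
    right = [row[-1] for row in texture]
    return top == bottom and left == right

def tile(texture, nx, ny):
    """Replicate a texture nx copies across and ny copies down.
    Rows are shared between copies, which is fine for read-only use."""
    return [row * nx for row in texture] * ny

pattern = [      # a tiny pattern whose edges wrap
    [1, 2, 1],
    [3, 4, 3],
    [1, 2, 1],
]
print(is_tileable(pattern))     # True
surface = tile(pattern, 4, 4)   # a 12x12 surface with no visible seam
```

Real generated materials satisfy a stronger condition, with content flowing continuously across the seam rather than merely matching at the border, but the edge-wrapping idea is the same.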

An AI editing tool lets artists modify a specific area of the material they’re working on, such as the center of a coffee table texture.

The AI-generated materials support physics-based rendering, responding realistically to changes in the scene’s lighting. They include normal, roughness and ambient occlusion maps — features that are critical to creating and fine-tuning materials for photorealistic 3D scenes.
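Each of those maps plays a distinct role in shading. The sketch below is a deliberately simplified, library-agnostic model of how a renderer might combine a roughness value and an ambient occlusion value per texel; real physically based renderers use far more elaborate BRDFs:

```python
def shade(albedo, normal, light_dir, roughness, ao, light_intensity=1.0):
    """Minimal physically inspired shading for one texel (illustrative only).
    albedo: base color (r, g, b) in 0..1
    normal: unit surface normal, e.g. sampled from the normal map
    light_dir: unit vector toward the light
    roughness, ao: scalars in 0..1 from their respective maps."""
    # Lambert diffuse term: cosine of the angle between normal and light
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    # A crude specular term whose highlight spreads and dims as roughness rises
    shininess = 2.0 / max(roughness ** 2, 1e-4)
    specular = n_dot_l ** shininess * (1.0 - roughness)
    # Ambient occlusion darkens crevices by scaling the indirect light
    ambient = 0.05 * ao
    return tuple(
        min(1.0, c * (ambient + light_intensity * n_dot_l) + specular)
        for c in albedo
    )

# The same texel lit head-on: a smooth material vs. a rough one
smooth = shade((0.6, 0.4, 0.3), (0, 0, 1), (0, 0, 1), roughness=0.1, ao=1.0)
rough  = shade((0.6, 0.4, 0.3), (0, 0, 1), (0, 0, 1), roughness=0.9, ao=1.0)
```

Because the maps feed the lighting equation directly, regenerating them (rather than just the color image) is what lets the materials respond realistically when the scene’s lighting changes.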

When accelerated on NVIDIA Tensor Core GPUs, materials can be generated in near real time, and can be upscaled in the background, achieving up to 4K resolution while creators continue to refine other parts of the scene.

Across creative industries — including architecture, game development and interior design — these capabilities could help artists quickly explore ideas and experiment with different aesthetic styles to create multiple versions of a scene.

A game developer, for example, could use these generative AI features to speed up the process of designing an open world environment or creating a character’s wardrobe. An architect could experiment with different styles of building facades in various lighting environments.

Build Generative AI Services With NVIDIA Picasso 

These capabilities for physics-based material generation will be made available in NVIDIA Picasso, a cloud-based foundry that allows companies to build, optimize and fine-tune their own generative AI foundation models for visual content.

Picasso enables content providers to develop generative AI tools and services trained on fully licensed, rights-reserved data. It’s part of NVIDIA AI Foundations, a set of model-making services that advance generative AI across text, visual content and biology.

At today’s SIGGRAPH keynote, NVIDIA founder and CEO Jensen Huang also announced a new Picasso feature to generate photorealistic 360 HDRi environment maps to light 3D scenes using simple text or image prompts.

See This Research at SIGGRAPH’s Real-Time Live 

Real-Time Live is one of the most anticipated events at SIGGRAPH. This year, the showcase features more than a dozen jury-reviewed projects, including those from teams at Roblox, the University of Utah and Metaphysic, a member of the NVIDIA Inception program for cutting-edge startups.

At the event, NVIDIA researchers will present this interactive materials research live, including a demo of the super resolution tool. Conference attendees can catch the session today at 6 p.m. PT in West Hall B at the Los Angeles Convention Center.

Learn about the latest advances in generative AI, graphics and more by joining NVIDIA at SIGGRAPH, running through Thursday, Aug. 10.

DENZA Collaborates With WPP to Build and Deploy Advanced Car Configurators on NVIDIA Omniverse Cloud

DENZA, the luxury EV joint venture between BYD and Mercedes-Benz, has collaborated with marketing and communications giant WPP to build and deploy its next generation of car configurators on NVIDIA Omniverse Cloud, NVIDIA founder and CEO Jensen Huang announced at SIGGRAPH.

WPP is using Omniverse Cloud — a platform for developing, deploying and managing industrial digitalization applications — to help unify the automaker’s highly complex design and marketing pipeline.

Omniverse Cloud enables WPP to build a single, physically accurate, real-time digital twin of the DENZA N7 model by integrating full-fidelity design data from the EV maker’s preferred computer-aided design tools via Universal Scene Description, or OpenUSD.

OpenUSD is a 3D framework that enables interoperability between software tools and data types for the building of virtual worlds.

The implementation of a new unified asset pipeline breaks down proprietary data silos, fostering enhanced data accessibility and facilitating collaborative, iterative reviews for the organization’s large design teams and stakeholders. It enables WPP to work on launch campaigns earlier in the design process, making iterations faster and less costly.

Unifying Asset Pipelines With Omniverse Cloud

Using Omniverse Cloud, WPP’s teams can connect their own pipeline of OpenUSD-enabled design and content creation tools, such as Autodesk Maya and Adobe Substance 3D Painter, to develop a new configurator for the DENZA N7. With a unified asset pipeline in Omniverse, WPP’s teams of artists can iterate on and edit a path-traced view of the full engineering dataset of the DENZA N7 in real time — ensuring the virtual car accurately represents the physical car.

Traditional car configurators require hundreds of thousands of images to be prerendered to represent all possible options and variants. OpenUSD makes it possible for WPP to create a digital twin of the car that includes all possible variants in one single asset. No prerendered images are required.
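The contrast between the two approaches is easy to quantify: prerendered images multiply across every option and viewpoint, while a digital twin authors each option once. A back-of-the-envelope sketch with hypothetical option counts (illustrative, not DENZA’s actual catalog):

```python
from math import prod

# Hypothetical configurator options (illustrative, not DENZA's actual catalog)
options = {
    "paint": 8,
    "wheels": 5,
    "interior_trim": 6,
    "seat_fabric": 4,
    "camera_angle": 36,   # one render per viewpoint around the car
}

# Prerendering: one image per combination of options and viewpoint
prerendered_images = prod(options.values())   # 8 * 5 * 6 * 4 * 36 = 34,560

# Digital twin: each option is authored once as a variant of a single asset
authored_variants = sum(options.values())     # 59 assets, composed at render time

print(prerendered_images, authored_variants)  # 34560 59
```

Because the image count grows multiplicatively while the variant count grows additively, adding one more paint color costs one new asset in the digital twin but thousands of new renders in the prerendered pipeline.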

In parallel, WPP’s environmental artists create fully interactive, live 3D virtual sets. These can start with a scan of a real-world environment, such as those WPP captures with its robot dog, or tap into generative AI tools from providers such as Shutterstock to instantly generate 360-degree HDRi backgrounds, maximizing opportunities for personalization.

Shutterstock is using NVIDIA Picasso — a foundry for building generative AI visual models — to develop a variety of generative AI services that accelerate 3D workflows. At SIGGRAPH, Shutterstock announced the first of these new services, 360 HDRi, which creates photorealistic HDR environment maps to relight a scene. With this feature, artists can rapidly create custom environments that fit their needs.

One-Click Publish to GDN

Once the 3D experience is complete, with just one click, WPP can publish it to Graphics Delivery Network (GDN), part of NVIDIA Omniverse Cloud. GDN is a network of data centers built to serve real-time, high-fidelity 3D content to nearly any web device, enabling interactive experiences in the dealer showroom as well as on consumers’ mobile devices.

This eliminates the tedious process of manually packaging, deploying, hosting and managing the experience themselves. If updates are needed, just like with the initial deployment, WPP can publish them with a single click.

Learn more about Omniverse Cloud and GDN.

NVIDIA H100 Tensor Core GPU Used on New Microsoft Azure Virtual Machine Series Now Generally Available

Microsoft Azure users can now turn to the latest NVIDIA accelerated computing technology to train and deploy their generative AI applications.

Available today, Microsoft Azure ND H100 v5 VMs — which use NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking — enable users to scale generative AI, high performance computing (HPC) and other applications with a click from a browser.

Available to customers across the U.S., the new instance arrives as developers and researchers are using large language models (LLMs) and accelerated computing to uncover new consumer and business use cases.

The NVIDIA H100 GPU delivers supercomputing-class performance through architectural innovations, including fourth-generation Tensor Cores, a new Transformer Engine for accelerating LLMs and the latest NVLink technology that lets GPUs talk to each other at 900GB/sec.
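To put the NVLink figure in perspective, here is a rough back-of-the-envelope calculation (our own, not from the announcement) of how quickly 900GB/sec could move the weights of a GPT-3-class model between GPUs:

```python
# Back-of-the-envelope: time to move FP16 model weights over NVLink.
# The 900 GB/s figure is NVIDIA's; the model size is an assumption.
params = 175e9            # parameters in a GPT-3-class model
bytes_per_param = 2       # FP16 precision
nvlink_bw = 900e9         # 900 GB/s GPU-to-GPU over NVLink

model_bytes = params * bytes_per_param      # 350 GB of weights
transfer_seconds = model_bytes / nvlink_bw  # ignores protocol overhead
print(f"{transfer_seconds:.2f} s")          # about 0.39 s
```

At that rate, shuttling activations and weight shards between GPUs stops being the bottleneck for many multi-GPU training and inference layouts.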

The inclusion of NVIDIA Quantum-2 CX7 InfiniBand with 3,200 Gbps cross-node bandwidth ensures seamless performance across the GPUs at massive scale, matching the capabilities of top-performing supercomputers globally.

Scaling With v5 VMs

ND H100 v5 VMs are ideal for training and running inference for increasingly complex LLMs and computer vision models. These neural networks drive the most demanding and compute-intensive generative AI applications, including question answering, code generation, audio, video and image generation, speech recognition and more.

The ND H100 v5 VMs achieve up to 2x speedup in LLMs like the BLOOM 175B model for inference versus previous generation instances, demonstrating their potential to further optimize AI applications.

NVIDIA and Azure

NVIDIA H100 Tensor Core GPUs on Azure provide enterprises the performance, versatility and scale to supercharge their AI training and inference workloads. The combination streamlines the development and deployment of production AI with the NVIDIA AI Enterprise software suite integrated with Azure Machine Learning for MLOps, and delivers record-setting AI performance in industry-standard MLPerf benchmarks.

In addition, by connecting the NVIDIA Omniverse platform to Azure, NVIDIA and Microsoft are providing hundreds of millions of Microsoft enterprise users with access to powerful industrial digitalization and AI supercomputing resources.

Learn more about new Azure v5 instances powered by NVIDIA H100 GPUs.

SIGGRAPH Special Address: NVIDIA CEO Brings Generative AI to LA Show

As generative AI continues to sweep an increasingly digital, hyperconnected world, NVIDIA founder and CEO Jensen Huang made a thunderous return to SIGGRAPH, the world’s premier computer graphics conference.

“The generative AI era is upon us, the iPhone moment if you will,” Huang told an audience of thousands Tuesday during an in-person special address in Los Angeles.

News highlights include the next-generation GH200 Grace Hopper Superchip platform, NVIDIA AI Workbench — a new unified toolkit that introduces simplified model tuning and deployment on NVIDIA AI platforms — and a major upgrade to NVIDIA Omniverse with generative AI and OpenUSD.

The announcements are about bringing all of the past decade’s innovations — AI, virtual worlds, acceleration, simulation, collaboration and more — together.

“Graphics and artificial intelligence are inseparable, graphics needs AI, and AI needs graphics,” Huang said, explaining that AI will learn skills in virtual worlds, and that AI will help create virtual worlds.

A packed house at the SIGGRAPH professional graphics conference attended NVIDIA founder and CEO Jensen Huang’s keynote address.

Fundamental to AI, Real-Time Graphics

Five years ago at SIGGRAPH, NVIDIA reinvented graphics by bringing AI and real-time ray tracing to GPUs. But “while we were reinventing computer graphics with artificial intelligence, we were reinventing the GPU altogether for artificial intelligence,” Huang said.

The result: increasingly powerful systems such as the NVIDIA HGX H100, which harnesses eight GPUs — and a total of 1 trillion transistors — to offer dramatic acceleration over CPU-based systems.

“This is the reason why the world’s data centers are rapidly transitioning to accelerated computing,” Huang told the audience. “The more you buy, the more you save.”

To continue AI’s momentum, NVIDIA created the Grace Hopper Superchip, the NVIDIA GH200, which combines a 72-core Grace CPU with a Hopper GPU, and which went into full production in May.

Huang announced that the NVIDIA GH200, which is already in production, will be complemented by an additional version with cutting-edge HBM3e memory.

He followed up by announcing the next-generation GH200 Grace Hopper Superchip platform, which can connect multiple GPUs for exceptional performance and easily scalable server design.

Built to handle the world’s most complex generative AI workloads, spanning large language models, recommender systems and vector databases, the new platform will be available in a wide range of configurations.

The dual configuration — which delivers up to 3.5x more memory capacity and 3x more bandwidth than the current-generation offering — comprises a single server with 144 Arm Neoverse cores, eight petaflops of AI performance and 282GB of the latest HBM3e memory technology.

Leading system manufacturers are expected to deliver systems based on the platform in the second quarter of 2024.

NVIDIA AI Workbench Speeds Adoption of Custom Generative AI

To speed custom adoption of generative AI for the world’s enterprises, Huang announced NVIDIA AI Workbench. It provides developers with a unified, easy-to-use toolkit to quickly create, test and fine-tune generative AI models on a PC or workstation — then scale them to virtually any data center, public cloud or NVIDIA DGX Cloud.

AI Workbench removes the complexity of getting started with an enterprise AI project. Accessed through a simplified interface running on a local system, it allows developers to fine-tune models from popular repositories such as Hugging Face, GitHub and NGC using custom data. The models can then be shared easily across multiple platforms.

While hundreds of thousands of pretrained models are now available, customizing them with the many open-source tools available can be challenging and time consuming.

“In order to democratize this ability, we have to make it possible to run pretty much everywhere,” Huang said.

With AI Workbench, developers can customize and run generative AI in just a few clicks. It allows them to pull together all necessary enterprise-grade models, frameworks, software development kits and libraries into a unified developer workspace.

“Everybody can do this,” Huang said.

Leading AI infrastructure providers — including Dell Technologies, Hewlett Packard Enterprise, HP Inc., Lambda, Lenovo and Supermicro — are embracing AI Workbench for its ability to bring enterprise generative AI capability to wherever developers want to work — including a local device.

Huang also announced a partnership between NVIDIA and startup Hugging Face, which has 2 million users, that will put generative AI supercomputing at the fingertips of millions of developers building large language models and other advanced AI applications.

Developers will be able to access NVIDIA DGX Cloud AI supercomputing within the Hugging Face platform to train and tune advanced AI models.

“This is going to be a brand new service to connect the world’s largest AI community to the world’s best training and infrastructure,” Huang said.

In a video, Huang showed how AI Workbench and ChatUSD bring it all together, allowing a user to start a project on a GeForce RTX 4090 laptop and scale it seamlessly to a workstation or the data center as the project grows more complex.

Using Jupyter Notebook, a user can prompt the model to generate a picture of Toy Jensen in space. When the model provides a result that doesn’t work, because it’s never seen Toy Jensen, the user can fine-tune the model with eight images of Toy Jensen and then prompt it again to get a correct result.

Then with AI Workbench, the new model can be deployed to an enterprise application.

New NVIDIA AI Enterprise 4.0 Software Advances AI Deployment

In a further step to accelerate the adoption of generative AI, NVIDIA announced the latest version of its enterprise software suite, NVIDIA AI Enterprise 4.0.

NVIDIA AI Enterprise gives businesses access to the tools needed to adopt generative AI, while also offering the security and API stability required for large-scale enterprise deployments.

Major Omniverse Release Converges Generative AI, OpenUSD for Industrial Digitalization

Huang announced a major release of NVIDIA Omniverse, an OpenUSD-native development platform for building, simulating and collaborating across tools and virtual worlds, offering new foundation applications and services that let developers and industrial enterprises optimize and enhance their 3D pipelines with the OpenUSD framework and generative AI.

He also announced NVIDIA’s contributions to OpenUSD, the framework and universal interchange for describing, simulating and collaborating across 3D tools.

Updates to the Omniverse platform include advancements to Omniverse Kit — the engine for developing native OpenUSD applications and extensions — as well as to the NVIDIA Omniverse Audio2Face foundation app and spatial-computing capabilities.

Cesium, Convai, Move AI, SideFX Houdini and Wonder Dynamics are now connected to Omniverse via OpenUSD.

And expanding their collaboration across Adobe Substance 3D, generative AI and OpenUSD initiatives, Adobe and NVIDIA announced plans to make Adobe Firefly — Adobe’s family of creative generative AI models — available as APIs in Omniverse.

Omniverse users can now build content, experiences and applications that are compatible with other OpenUSD-based spatial computing platforms such as ARKit and RealityKit.

Huang announced a broad range of frameworks, resources and services for developers and companies to accelerate the adoption of Universal Scene Description, known as OpenUSD, including contributions such as geospatial data models, metrics assembly and simulation-ready, or SimReady, specifications for OpenUSD.

Huang also announced four new Omniverse Cloud APIs built by NVIDIA for developers to more seamlessly implement and deploy OpenUSD pipelines and applications.

ChatUSD — a large language model (LLM) agent that assists developers and artists working with OpenUSD data and scenes, generating Python-USD code scripts from text and answering USD knowledge questions.
RunUSD — a cloud API that translates OpenUSD files into fully path-traced rendered images, checking the compatibility of uploaded files against OpenUSD release versions and generating renders with Omniverse Cloud.
DeepSearch — an LLM agent that enables fast semantic search through massive databases of untagged assets.
USD-GDN Publisher — a one-click service that lets enterprises and software makers publish high-fidelity, OpenUSD-based experiences to the Omniverse Cloud Graphics Delivery Network (GDN) from an Omniverse-based application such as USD Composer, and stream them in real time to web browsers and mobile devices.
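Semantic search of the kind DeepSearch performs typically works by comparing embedding vectors rather than matching keywords. The models behind DeepSearch aren’t described here, but the core idea can be sketched with cosine similarity over hypothetical embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings; a real system would get these
# from a trained model and use hundreds of dimensions per asset
assets = {
    "worn_leather_sofa": [0.9, 0.1, 0.0],
    "oak_coffee_table":  [0.2, 0.9, 0.1],
    "brick_wall_red":    [0.1, 0.2, 0.9],
}
query = [0.85, 0.15, 0.05]  # embedding of a query such as "distressed couch"

# Rank untagged assets by similarity to the query embedding
best = max(assets, key=lambda name: cosine(query, assets[name]))
print(best)  # worn_leather_sofa
```

Because similarity is computed on vectors, assets never need manual tags; the embeddings themselves carry the meaning the search matches against.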

These contributions are an evolution of last week’s announcement of NVIDIA’s co-founding of the Alliance for OpenUSD along with Pixar, Adobe, Apple and Autodesk.

Powerful New Desktop Systems, Servers

Providing more computing power for all of this, Huang said NVIDIA and global workstation manufacturers are announcing powerful new RTX workstations for development and content creation in the age of generative AI and industrial digitalization.

The systems, including those from BOXX, Dell Technologies, HP and Lenovo, are based on NVIDIA RTX 6000 Ada Generation GPUs and incorporate NVIDIA AI Enterprise and NVIDIA Omniverse Enterprise software.

Separately, NVIDIA released three new desktop workstation Ada Generation GPUs — the NVIDIA RTX 5000, RTX 4500 and RTX 4000 — to deliver the latest AI, graphics and real-time rendering technology to professionals worldwide.

Huang also detailed how, together with global data center system manufacturers, NVIDIA is continuing to supercharge generative AI and industrial digitalization with new NVIDIA OVX systems featuring the new NVIDIA L40S GPU, a powerful, universal data center GPU.

The powerful new systems will accelerate the most compute-intensive, complex applications, including AI training and inference, 3D design and visualization, video processing and industrial digitalization with the NVIDIA Omniverse platform.

NVIDIA Research Bringing New Capabilities

More innovations are coming, thanks to NVIDIA Research.

At the show’s Real-Time Live event, NVIDIA researchers will demonstrate a generative AI workflow that helps artists rapidly create and iterate on materials for 3D scenes, using text or image prompts to generate custom textured materials faster and with finer creative control.

NVIDIA Research also demonstrated how AI can take video conferencing to the next level with new 3D features, recently publishing a paper that shows how AI could power a 3D video-conferencing system with minimal capture equipment.

The production version of Maxine, now available in NVIDIA AI Enterprise, allows professionals, teams, creators and others to tap into the power of AI to create high-quality audio and video effects, even when using standard microphones and webcams.

Watch Huang’s full special address at NVIDIA’s SIGGRAPH event site, where there are also details of labs, presentations and more happening throughout the show.

Startup Pens Generative AI Success Story With NVIDIA NeMo

Machine learning helped Waseem Alshikh plow through textbooks in college. Now he’s putting generative AI to work, creating content for hundreds of companies.

Born and raised in Syria, Alshikh spoke no English, but he was fluent in software, a talent that served him well when he arrived at college in Lebanon.

“The first day they gave me a stack of textbooks, each one a thousand pages thick, and all of it in English,” he recalled.

So, he wrote a program — a crude but effective statistical classifier that summarized the books — then he studied the summaries.

From Concept to Company

In 2014, he shared his story with May Habib, an entrepreneur he met while working in Dubai. They agreed to create a startup that could help marketing departments — which are always pressured to do more with less — use machine learning to quickly create copy for their web pages, blogs, ads and more.

“Initially, the tech was not there, until transformer models were announced — that was something we could build on,” said Alshikh, the startup’s CTO.

Writer co-founders Habib, CEO, and Alshikh, CTO.

“We found a few engineers and spent almost six months building our first model, a neural network that barely worked and had about 128 million parameters,” he said, referring to an often-used measure of an AI model’s capability.

Along the way, the young company won some business, changed its name to Writer and connected with NVIDIA.

A Startup Accelerated

“Once we got introduced to NVIDIA NeMo, we were able to build industrial-strength models with three, then 20 and now 40 billion parameters, and we’re still scaling,” he said.

NeMo is an application framework that helps companies curate their training datasets, build and customize large language models (LLMs), and run them in production at scale. Organizations everywhere from Korea to Sweden are using it to customize LLMs for their local languages and industries.

“Before NeMo, it took us four and a half months to build a new billion-parameter model. Now we can do it in 16 days — this is mind blowing,” Alshikh said.

Models Make Opportunities

In the first six months of this year, the startup’s team of fewer than 20 AI engineers used NeMo to develop 10 models, each with 30 billion parameters or more.

That translates into big opportunities. Hundreds of businesses now use Writer’s models, customized with NeMo for finance, healthcare, retail and other vertical markets.

Writer’s Recap tool creates written summaries from audio recordings of an interview or event.

The startup’s customer list includes household names such as Deloitte, L’Oréal, Intuit and Uber, as well as many Fortune 500 companies.

Writer’s success with NeMo is just the start of the story. Dozens of other companies have already downloaded NeMo.

The software will be available soon for anyone to use. It’s part of NVIDIA AI Enterprise, full-stack software optimized to accelerate generative AI workloads and backed by enterprise-grade support, security and application programming interface stability.

Writer offers a full-stack platform for enterprise users.

A Trillion API Calls a Month

Some customers run Writer’s models on their own systems or cloud services. Others ask Writer to host the models, or they use Writer’s API.

“Our cloud infrastructure, managed basically by two people, hosts a trillion API calls a month — we’re generating 90,000 words a second,” Alshikh said. “We’re delivering high-quality models that compete with products from companies with larger teams and bigger budgets.”

NVIDIA NeMo supports an end-to-end flow for generative AI from data curation to inference.

Writer uses the Triton Inference Server that’s packaged with NeMo to run models in production for its customers. Alshikh reports that Triton, used by many companies running LLMs, enables lower latency and greater throughput than alternative programs.

“This means you can run a service for $20,000, instead of $100,000, so we can invest more in building meaningful features,” he said.

A Wide Horizon

Writer is also a member of NVIDIA Inception, a program that nurtures cutting-edge startups. “Thanks to Inception, we got early access to NeMo and some amazing people who guided us through the process of finding and using the tools we need,” he said.

Now that Writer’s text products are getting traction, Alshikh, who splits his time between homes in Florida and California, is searching the horizon for what’s next. In today’s broad frontier of generative AI, he sees opportunities in images, audio, video, 3D — maybe all of the above.

“We see multimodality as the future,” he said.

Check out this page to get started with NeMo. And learn about the early access program for multimodal NeMo here.

And if you enjoyed this story, let folks on social networks know using the following, a summary suggested by Writer:

“Learn how startup Writer uses NVIDIA NeMo software to generate content for hundreds of companies and rack up impressive revenues with a small staff and budget.”

NVIDIA Makes Extended-Reality Streaming More Scalable, Customizable for Enterprises and Developers

Organizations across industries are using extended reality (XR) to redesign workflows and boost productivity, whether for immersive training or collaborative design reviews.

With the growing use of all-in-one (AIO) headsets, more teams have adopted and integrated XR. But while AIO headsets ease XR use, their modest compute and rendering power can limit the graphics quality of streamed experiences.

NVIDIA is enabling more enterprises and developers to adopt high-quality XR with its CloudXR Suite. Built to greatly simplify streaming, CloudXR enables anyone with an AIO headset or mobile XR device to experience high-fidelity, immersive environments from any location.

CloudXR Suite combines the power of NVIDIA RTX GPUs and NVIDIA RTX Virtual Workstation (vWS) software to stream high-fidelity XR applications to Android and iOS devices. By dynamically adjusting to network conditions, CloudXR maximizes image quality and frame rates to power next-level, wireless augmented-reality and virtual-reality experiences.
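CloudXR’s actual adaptation logic isn’t published, but “dynamically adjusting to network conditions” generally means a feedback loop that trades bitrate against latency. A toy sketch of such a controller, with all thresholds and step sizes invented for illustration:

```python
def adapt_bitrate(current_kbps, rtt_ms, loss_pct,
                  floor_kbps=5_000, ceiling_kbps=100_000):
    """Toy adaptive-streaming controller: back off sharply when the
    network degrades, probe upward gently when it is healthy.
    Thresholds and step sizes are invented, not CloudXR's."""
    if loss_pct > 1.0 or rtt_ms > 50:
        target = current_kbps * 0.7   # congestion: cut quality to protect latency
    elif rtt_ms < 20 and loss_pct < 0.1:
        target = current_kbps * 1.1   # healthy link: probe for headroom
    else:
        target = current_kbps         # borderline: hold steady
    return max(floor_kbps, min(ceiling_kbps, round(target)))

rate = 40_000
rate = adapt_bitrate(rate, rtt_ms=80, loss_pct=0.0)   # congested: drops to 28_000
rate = adapt_bitrate(rate, rtt_ms=10, loss_pct=0.0)   # recovering: rises to 30_800
```

The asymmetry (fast decrease, slow increase) is the standard way such loops keep latency low: a single congested measurement sheds load immediately, while recovery is probed gradually to avoid oscillation.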

With CloudXR, enterprises can gain the flexibility to effectively orchestrate and scale XR workloads, and developers can use the advanced platform to create custom XR products for their users. The suite offers high-quality streaming across both public and private networks.

Ericsson and VMware are among the first companies to use CloudXR.

Taking XR Workflows to the Next Level

CloudXR Suite offers performance that’s comparable to tethered VR experiences.

It comprises three components, each with new updates:

CloudXR Essentials, the suite’s underlying streaming layer, brings new improvements such as 5G L4S optimizations, QoS algorithms and enhanced logging tools. Essentials also includes the SteamVR plug-in, along with sample clients and a new server-side application programming interface.
CloudXR Server Extensions improves server-side interfaces with a source-code addition to the Monado OpenXR runtime. The new CloudXR Server API contained in CloudXR Essentials and the OpenXR API represent the gateway to scaling XR distribution for orchestration partners.
CloudXR Client Extensions include, as a first offering, a CloudXR plug-in built for the Unity Editor. This lets developers build custom CloudXR client applications using already-familiar Unity development tools. Plus, Unity app developers can more easily build applications with branded custom interfaces and lobbies before connecting to their CloudXR streaming server using the plug-in.

Teams can tap into the power of NVIDIA RTX GPUs to achieve ultimate graphics performance on mobile devices. Enterprises can scale to data center and edge networks, and stream to concurrent users with NVIDIA RTX vWS software.

In addition, users can stream stunning XR content from any OpenVR or OpenXR application at the edge using high-bandwidth, low-latency 5G signals.

Partners Experience Enterprise-Grade XR Streaming

Organizations across industries use XR streaming to advance their workflows.

To provide optimal streaming performance, NVIDIA is working with leading companies like Ericsson to implement low-latency, low-loss scalable throughput (L4S) in NVIDIA CloudXR. L4S helps reduce lag in interactive, cloud-based video streaming, so CloudXR users will be able to experience photorealistic XR environments on high-bandwidth, low-latency networks.

“At Ericsson, we believe innovations like L4S are fundamental building blocks to enable latency-critical applications,” said Sibel Tombaz, head of product line for 5G Radio Access Network at Ericsson. “As a key part of Ericsson’s Time-Critical Communication capabilities, L4S will significantly improve user experience for use cases like cloud gaming, and it’s great news that NVIDIA is making L4S a production element of CloudXR. We’re excited to be working with NVIDIA to further enhance the XR experience for enterprises, developers and consumers.”

More professionals can elevate XR streaming from the cloud with VMware Workspace ONE XR Hub, which includes an integration of CloudXR.

Workspace ONE XR Hub enhances user experiences with VR headsets through advanced authentication and customization options. Combined with the streaming capabilities of CloudXR, Workspace ONE XR Hub allows teams across industries to quickly and securely access complex immersive environments using AIO headsets.

“With this new integration, access to high-fidelity immersive experiences is even easier because streaming lets users tap into the power of RTX GPUs from anywhere,” said Matt Coppinger, director of product management for end-user computing at VMware. “Workspace ONE XR Hub and CloudXR will allow our customers to stream rich XR content, and more teams can boost productivity and integrate realistic, virtual experiences into their workflows.”


CloudXR Suite will be available to download soon, so users can stream a wide range of XR applications over the network without worrying about demanding graphics requirements.

For example, independent software vendors (ISVs) can create a single, high-quality version of their application that’s built to take advantage of powerful GPUs. And with CloudXR streaming, ISVs can target users with mobile XR devices.

Mobile-device manufacturers can also offer their ISV partners and end users access to high-performance GPU acceleration for unparalleled graphics experiences.

In addition, cloud service providers, orchestrators and system integrators can extend their GPU services with interactive graphics to support next-generation XR applications.

Learn more about NVIDIA CloudXR Suite.

Extended Cut: NVIDIA Expands Maxine for Video Editing, Showcases 3D Virtual Conferencing Research

Professionals, teams, creators and others can tap into the power of AI to create high-quality audio and video effects — even using standard microphones and webcams — with the help of NVIDIA Maxine.

The suite of GPU-accelerated software development kits and cloud-native microservices lets users deploy AI features that enhance audio, video and augmented-reality effects for real-time communications services and platforms. Maxine will also expand features for video editing, enabling teams to reach new heights in video communication.

Plus, an NVIDIA Research demo at this week’s SIGGRAPH conference displays how AI can take video conferencing to the next level with 3D features.

NVIDIA Maxine Features Expand to Video Editing

Wireless connectivity has enabled people to join virtual meetings from more locations than ever. But audio and video quality often suffer when a caller is on the move or in a location with poor connectivity.

Advanced, real-time Maxine features — such as Background Noise Removal, Super Resolution and Eye Contact — allow remote users to enhance interpersonal communication experiences.

In addition, Maxine can now be used for video editing. NVIDIA partners are transforming this professional workflow with the same Maxine features that elevate video conferencing. The goal when editing a video, whether a sales pitch or a webinar, is to engage the broadest audience possible. Using Maxine, professionals can tap into AI features that enhance audio and video signals.

With Maxine, a spokesperson can look away from the screen to reference notes or a script while their gaze appears to stay locked on the camera. Users can also film videos in low resolution and enhance the quality later. Plus, Maxine lets people record a video in any of several languages and export it in English.

Maxine features to be released in early access this year include:

Interpreter: Translates from Simplified Chinese, Russian, French, German and Spanish to English while animating the user’s image to show them speaking English.
Voice Font: Enables users to apply characteristics of a speaker’s voice and map it to the audio output.
Audio Super Resolution: Improves audio quality by increasing the temporal resolution of the audio signal and extending its bandwidth. It currently supports upsampling from 8,000 Hz to 16,000 Hz and from 16,000 Hz to 48,000 Hz. The feature has also been updated to cut latency by more than 50% and deliver up to 2x better throughput.
Maxine Client: Brings the AI capabilities of Maxine’s microservices to video-conferencing sessions on PCs. The application is optimized for low-latency streaming and will use the cloud for all of its GPU compute requirements. Maxine Client will be available on Windows this fall, with additional OS support to follow.
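Maxine's learned Audio Super Resolution model isn't public, but the sample-rate conversion it improves on can be illustrated with a naive baseline: linear interpolation from 8 kHz to 16 kHz doubles the number of samples per second without recovering the high-frequency content the AI model synthesizes. A minimal NumPy sketch (illustrative only, not Maxine's method):

```python
import numpy as np

def naive_upsample(signal: np.ndarray, in_rate: int, out_rate: int) -> np.ndarray:
    """Linearly interpolate a mono signal to a higher sample rate.

    Unlike Maxine's learned model, this adds no new high-frequency
    content; it only increases the number of samples per second.
    """
    n_in = len(signal)
    n_out = int(n_in * out_rate / in_rate)
    t_in = np.arange(n_in) / in_rate     # original sample times (seconds)
    t_out = np.arange(n_out) / out_rate  # denser output sample times
    return np.interp(t_out, t_in, signal)

# One second of 8 kHz audio becomes 16,000 samples at 16 kHz.
upsampled = naive_upsample(np.zeros(8000), 8000, 16000)
```

The AI model's value lies precisely in the gap this baseline leaves: reconstructing plausible detail above the original Nyquist frequency.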

Maxine can be deployed in the cloud, on premises or at the edge, meaning quality communication can be accessible from nearly anywhere.

Taking Video Conferencing to New Heights

Many partners and customers are experiencing high-quality video conferencing and editing with Maxine. Two features of Maxine — Eye Contact and Live Portrait — are now available in production releases on the NVIDIA AI Enterprise software platform. Eye Contact simulates direct eye contact with the camera by estimating and aligning the user’s gaze with the camera. And Live Portrait animates a person’s portrait photo through their live video feed.

Software company Descript aims to make video a staple of every communicator’s toolkit, alongside docs and slides. With NVIDIA Maxine, professionals and beginners who use Descript can access AI features that improve their video-content workflows.

“With the NVIDIA Maxine Eye Contact feature, users no longer have to worry about memorizing scripts or doing tedious video retakes,” said Jay LeBoeuf, head of business and corporate development at Descript. “They can maintain a perfect on-screen presence while nailing their script every time.”

Reincubate’s Camo app aims to broaden access to great video by taking advantage of the hardware and devices people already own. It does this by giving users greater control over their image and by implementing a powerful, efficient processing pipeline for video effects and transformation. Using technologies enabled by NVIDIA Maxine, Camo offers users an easier way to create incredible video.

“Integrating NVIDIA Maxine into Camo couldn’t have been easier, and it’s enabled us to get high performance from users’ RTX GPUs right out of the box,” said Aidan Fitzpatrick, founder and CEO of Reincubate. “With Maxine, the team’s been able to move faster and with more confidence.”

Quicklink’s Cre8 is a powerful video production platform for creating professional, on-brand productions, virtual and hybrid live events. The user-friendly interface combines an intuitive design with all the tools needed to build, edit and customize a professional-looking production. Cre8 incorporates NVIDIA Maxine technology to maximize productivity and the quality of video productions, offering complete control to the operator.

“Quicklink Cre8 now offers the most advanced video production platform on the planet,” said Richard Rees, CEO of Quicklink. “With NVIDIA Maxine, we were able to add advanced features, including Auto Framing, Video Noise Removal, Noise and Echo Cancellation, and Eye Contact Simulation.”

Los Angeles-based company gemelo.ai provides a platform for creating AI twins that can scale a user’s voice, content and interactions. Using Maxine’s Live Portrait feature, the gemelo.ai team can unlock new opportunities for scaled, personalized content and one-on-one interactions.

“The realism of Live Portrait has been a game-changer, unlocking new realms of potential for our AI twins,” said Paul Jaski, CEO of gemelo.ai. “Our customers can now design and deploy incredibly realistic digital twins with the superpowers of unlimited scalability in content production and interaction across apps, websites and mixed-reality experiences.”

NVIDIA Research Shows How 3D Video Enhances Immersive Communication

In addition to powering the advanced features of Maxine, NVIDIA AI enhances video communication with 3D. NVIDIA Research recently published a paper demonstrating how AI could power a 3D video-conferencing system with minimal capture equipment.

3D telepresence systems are typically expensive, require a large space or production studio, and use high-bandwidth, volumetric video streaming — all of which limit the technology’s accessibility. NVIDIA Research shared a new method, built on a novel vision transformer-based encoder, that takes 2D video input from a standard webcam and turns it into a 3D video representation. Instead of requiring 3D data to be passed back and forth between participants, AI keeps the call’s bandwidth requirements the same as for a 2D conference.

The technology takes a user’s 2D video and automatically creates a 3D representation called a neural radiance field, or NeRF, using volumetric rendering. As a result, participants can stream 2D videos, like they would for traditional video conferencing, while decoding high-quality 3D representations that can be rendered in real time. And with Maxine’s Live Portrait, users can bring their portraits to life in 3D.
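NVIDIA's system itself isn't public, but the volume-rendering step a NeRF uses to produce each pixel is standard: every sample along a camera ray contributes its color, weighted by its own opacity and by the transmittance of everything in front of it. A minimal NumPy sketch of that compositing step (the general NeRF formulation, not NVIDIA's specific pipeline):

```python
import numpy as np

def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray (standard NeRF volume rendering).

    densities: (N,) non-negative volume density sigma_i at each sample
    colors:    (N, 3) RGB color c_i at each sample
    deltas:    (N,) distance between adjacent samples along the ray
    """
    alphas = 1.0 - np.exp(-densities * deltas)             # opacity of each segment
    trans = np.cumprod(np.append(1.0, 1.0 - alphas))[:-1]  # light surviving to sample i
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)         # final pixel color

# An effectively opaque red sample in front hides everything behind it.
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
pixel = composite_ray(np.array([1e9, 1e9]), colors, np.array([1.0, 1.0]))
```

In a full NeRF, the densities and colors come from a neural network queried at each sample position; the compositing above is what turns those predictions into a rendered image in real time.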

AI-mediated 3D video conferencing could significantly reduce the cost for 3D capture, provide a high-fidelity 3D representation, accommodate photorealistic or stylized avatars, and enable mutual eye contact in video conferencing. Related research projects show how AI can help elevate communications and virtual interactions, as well as inform future NVIDIA technologies for video conferencing.

See the system in action below. SIGGRAPH attendees can visit the Emerging Technologies booth, where groups will be able to simultaneously view the live demo on a 3D display designed by New York-based company Looking Glass.


Learn more about NVIDIA Maxine, which is now available on NVIDIA AI Enterprise.

And see more of the research behind the 3D video conference project.

Featured image courtesy of NVIDIA Research.

Content Creation ‘In the NVIDIA Studio’ Gets Boost From New Professional GPUs, AI Tools, Omniverse and OpenUSD Collaboration Features

AI and accelerated computing were in the spotlight at SIGGRAPH — the world’s largest gathering of computer graphics experts — as NVIDIA founder and CEO Jensen Huang announced during his keynote address updates to NVIDIA Omniverse, a platform for building and connecting 3D tools and applications, as well as acceleration for Universal Scene Description (known as OpenUSD), the open and extensible ecosystem for 3D worlds.

This follows the recent announcement of NVIDIA joining Pixar, Adobe, Apple and Autodesk to form the Alliance for OpenUSD. It marks a major leap toward unlocking the next era of 3D graphics, design and simulation by ensuring compatibility in 3D tools and content for digitalization across industries.

NVIDIA launched three new desktop workstation Ada Generation GPUs — the NVIDIA RTX 5000, RTX 4500 and RTX 4000 — which deliver the latest AI, graphics and real-time rendering technology to professionals worldwide.

Shutterstock is bringing generative AI to 3D scene backgrounds with a foundation model trained using NVIDIA Picasso, a cloud-based foundry for building visual generative AI models. Picasso-trained models can now generate photorealistic, 8K, 360-degree high-dynamic-range imaging (HDRi) environment maps for quicker scene development. Autodesk will also integrate generative AI content-creation services — developed using foundation models in Picasso — with its popular Autodesk Maya software.

Each month, NVIDIA Studio Driver releases provide artists, creators and 3D developers with the best performance and reliability when working with creative applications. Available today, the August NVIDIA Studio Driver gives creators peak reliability for using their favorite creative apps. It includes support for updates to Omniverse, XSplit Broadcaster and Reallusion iClone.

Plus, this week’s featured In the NVIDIA Studio artist Andrew Averkin shows how AI influenced his process in building a delightful cup of joe for his Natural Coffee piece.

Omniverse Expands

Omniverse received a major upgrade, bringing new connectors and advancements to the platform.

These updates are showcased in Omniverse foundation applications, which are fully customizable reference applications that creators, enterprises and developers can copy, extend or enhance.

Upgraded Omniverse applications include Omniverse USD Composer, which lets 3D users assemble large-scale, OpenUSD-based scenes. Omniverse Audio2Face — which provides generative AI application programming interfaces that create realistic facial animations and gestures from only an audio file — now includes multilingual support and a new female base model.

The update brings boosted efficiency and an improved user experience. New rendering optimizations take full advantage of the NVIDIA Ada Lovelace architecture enhancements in NVIDIA RTX GPUs with DLSS 3 technology fully integrated into the Omniverse RTX Renderer. In addition, a new AI denoiser enables real-time 4K path tracing of massive industrial scenes.

New application and experience templates give developers getting started with OpenUSD and Omniverse a major head start with minimal coding.

A new Omniverse Kit Extension Registry, a central repository for accessing, sharing and managing Omniverse extensions, lets developers easily turn functionality on and off in their application, making it easier than ever to build custom apps from over 500 core Omniverse extensions provided by NVIDIA.

New extended-reality developer tools let users build spatial-computing options natively into their Omniverse-based applications, giving users the flexibility to experience their 3D projects and virtual worlds however they like.

Expanding their collaboration across Adobe Substance 3D, generative AI and OpenUSD initiatives, Adobe and NVIDIA announced plans to make Adobe Firefly, Adobe’s family of creative generative AI models, available as APIs in Omniverse, enabling developers and creators to enhance their design processes.

Developers and industrial enterprises have new foundation apps and services to optimize and enhance 3D pipelines with the OpenUSD framework and generative AI.

Studio professionals can connect the world of generative AI to their workflows to accelerate entire projects — from environment creation and character animation to scene-setting and more. With Kit AI Agent, OpenUSD Connectors and extensions to prompt top generative AI tools and APIs, Omniverse can aggregate the final result in a unified viewport — collectively reducing the time from conception to creation.

RTX: The Next Generation

The new NVIDIA RTX 5000, RTX 4500 and RTX 4000 Ada Generation professional desktop GPUs feature the latest NVIDIA Ada Lovelace architecture technologies, including DLSS 3, for smoother rendering and real-time interactivity in 3D applications such as Unreal Engine.

These workstation-class GPUs feature third-generation RT Cores with up to 2x the throughput of the previous generation. This enables users to work with larger, higher-fidelity images in real time, helping artists and designers maintain their creative flow.

Fourth-generation Tensor Cores deliver up to 2x the AI performance of the previous generation for AI training and development as well as inferencing and generative AI workloads. Large GPU memory enables AI-augmented multi-application workflows with the latest generative AI-enabled tools and applications.

The Ada architecture provides these new GPUs with up to twice the video encode and decode capability of the previous generation, encoding up to 8K60 video in real time, with support for AV1 encode and decode. Combined with next-generation AI performance, these capabilities make the new professional GPUs ideal for multi-stream video editing workflows with high-resolution content using AI-augmented video editing applications such as Adobe Premiere and DaVinci Resolve.

Designed for high-end creative, multi-application professional workflows that require large models and datasets, these new GPUs provide large GDDR6 memory: 20GB for the RTX 4000, 24GB for the RTX 4500 and 32GB for the RTX 5000 — all supporting error-correcting code for error-free computing.

A Modern-Day Picasso

3D artists regularly face the monumental task of bringing scenes to life by artistically mixing hero assets with props, materials, backgrounds and lighting. Generative AI technologies can help streamline this workflow by generating secondary assets, like environment maps that light the scene.

At SIGGRAPH, Shutterstock announced that it’s tapping into NVIDIA Picasso to train a generative AI model that can create photorealistic 360-degree HDRi environment maps. The model is built using Shutterstock’s responsibly licensed libraries.

Shutterstock using NVIDIA Picasso to create 360 HDRi photorealistic environment maps.

Previously, artists needed to use expensive 360-degree cameras to create backgrounds and environment maps from scratch, or choose from fixed options that may not precisely match their 3D scene. Now, from simple prompts or using their desired background as a reference, the Shutterstock generative AI feature will quickly generate custom 360-degree, 8K-resolution, HDRi environment maps, which artists can use to set a background and light a scene. This allows more time to work on hero 3D assets, which are the primary assets of a 3D scene that viewers will focus on.

Autodesk also announced that it will integrate generative AI content-creation services — developed using foundation models in Picasso — with its popular 3D software Autodesk Maya.

Autodesk Maya generative AI content-creation services developed using foundation models in Picasso.

August Studio Driver Delivers

The August Studio Driver supports these updates and more, including the latest release of XSplit Broadcaster, the popular streaming software that lets users simultaneously stream to multiple platforms.

XSplit Broadcaster 4.5 introduces NVIDIA Encoder (NVENC) AV1 support. GeForce and NVIDIA RTX 40 Series GPU users can now stream in high-quality 4K 60 frames per second directly to YouTube Live, dramatically improving video quality.

XSplit Broadcaster 4.5 adds AV1 livestreaming support for YouTube.

Streaming in AV1 with RTX GPUs provides 40% better efficiency than H.264, reducing bandwidth requirements for livestreaming or reducing file size for high-quality local captures.
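As back-of-the-envelope arithmetic, a 40% efficiency gain means AV1 should match the quality of an 8 Mbps H.264 stream at roughly 8 × (1 − 0.4) = 4.8 Mbps. A tiny helper makes the calculation explicit (the 40% figure is NVIDIA's; actual savings vary with content):

```python
def av1_equivalent_bitrate(h264_bitrate_mbps: float,
                           efficiency_gain: float = 0.40) -> float:
    """Bitrate AV1 needs to match H.264 quality, given a relative efficiency gain."""
    return h264_bitrate_mbps * (1.0 - efficiency_gain)

print(av1_equivalent_bitrate(8.0))  # 4.8 Mbps for an 8 Mbps 4K60 H.264 source
```

The same math applies to local captures: a recording at the same visual quality is about 40% smaller on disk.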

H.264 vs. AV1: 4K60 source encoded at 8 Mbps.

An update to the Reallusion iClone Omniverse Connector includes new features such as real-time synchronization of projects, as well as enhanced import functionality for OpenUSD. This makes work between iClone and Omniverse quicker, smoother and more efficient.

Brew-tiful Artwork

Words can’t espresso the stunning 3D scene Natural Coffee.

Do they accept reservations?

NVIDIA artist Andrew Averkin has over 15 years of experience in the creative field. He finds joy in a continuous journey — blending art and technology — to bring his vivid imagination to life.

His work, Natural Coffee, has a compelling origin story. Once upon a time, in a bustling office, there was a cup of “natural coffee” known for its legendary powers. It gave artists nerves of steel at work, improved performance across the board and, as a small bonus, offered magical music therapy.

Averkin used an image generator to quickly cycle through visual ideas created from simple text-based prompts. Using AI to brainstorm imagery at the start of a creative workflow is becoming more popular among artists looking to save time on iteration.

Averkin iterates for inspiration.

With a visual foundation in place, Averkin sped up the process by acquiring 3D assets from online stores to quickly build a 3D blockout of the scene — a rough draft assembled from simple 3D shapes, without fine detail or polish.

Next, Averkin polished individual assets in Autodesk 3ds Max, sculpting models with fine detail, testing and applying different textures and materials. His GeForce RTX 4090 GPU unlocked RTX-accelerated AI denoising — with the default Autodesk Arnold renderer — delivering interactive 3D modeling, which helped tremendously while composing the scene.

Averkin working in Autodesk 3ds Max.

“I chose a GeForce RTX graphics card for quality, speed and safety, plain and simple,” said Averkin.

Averkin then exported Natural Coffee to the NVIDIA Omniverse USD Composer app via the Autodesk 3ds Max Connector. “Inside USD Composer I added more details, played a lot with a built-in collection of materials, plus did a lot of lighting work to make composition look more realistic,” he explained.

Real-time rendering in Omniverse USD Composer.

One of the biggest benefits of USD Composer is the ability to review scenes rendered in real time with photorealistic light, shadows, textures and more. This dramatically speeds up the editing of massive 3D scenes. Averkin was even able to add a camera fly-through animation, further elevating the scene.

The final step was to add a few touch-ups in Adobe Photoshop. Over 30 GPU-accelerated features gave Averkin plenty of options for playing with colors and contrast, and making final image adjustments smoothly and quickly.

Averkin encourages advanced 3D artists to experiment with the OpenUSD framework. “I use it a lot in my work at NVIDIA and in personal projects,” he said. “OpenUSD is very powerful. It helps with work in multiple creative apps in a non-destructive way, and other great features make the entire process easier and more flexible.”

NVIDIA artist Andrew Averkin.

Check out Averkin’s portfolio on ArtStation.

Follow NVIDIA Studio on Instagram, Twitter and Facebook. Access tutorials on the Studio YouTube channel and get updates directly in your inbox by subscribing to the Studio newsletter.

Shutterstock Brings Generative AI to 3D Scene Backgrounds With NVIDIA Picasso

Picture this: creators can quickly generate and customize 3D scene backgrounds with the help of generative AI, thanks to cutting-edge tools from Shutterstock.

The visual-content provider is building services using NVIDIA Picasso — a cloud-based foundry for developing generative AI models for visual design.

The work incorporates Picasso’s latest feature — announced today during NVIDIA founder and CEO Jensen Huang’s SIGGRAPH keynote — which will help artists enhance and light 3D scenes based on simple text or image prompts, all with AI models built using fully licensed, rights-reserved data.

From these prompts, the new gen AI feature quickly generates custom 360-degree, 8K-resolution, high-dynamic-range imaging (HDRi) environment maps, which artists can use to set a background and light a scene.

This expands on NVIDIA’s collaboration with Shutterstock to empower the next generation of digital content-creation tools and accelerate 3D model generation.

To meet a surging demand for immersive visuals in films, games, virtual worlds, advertising and more, the 3D artist community is rapidly expanding, with over 20% growth in the past year.

Many of these artists are tapping generative AI to bolster their complex workflows — and will be able to use the technology to quickly create and customize environment maps. This allows more time to work on hero 3D assets, which are the primary assets of a 3D scene that viewers will focus on. It makes a panoramic difference when creating compelling 3D visuals.

“We’re committed to hyper-enabling 3D artists and collaborators — helping them build the immersive environments they envision faster than ever before and streamlining their content-creation workflows using NVIDIA Picasso,” said Dade Orgeron, vice president of 3D innovation at Shutterstock.

Generating Photorealistic Environment Maps

Previously, artists needed to buy expensive 360-degree cameras to create backgrounds and environment maps from scratch, or choose from fixed options that may not precisely match their 3D scene.

Now, users can simply provide a prompt — whether text or a reference image — and the 360 HDRi services built on Picasso will quickly generate panoramic images. Plus, thanks to generative AI, the custom environment map can automatically match the background image provided as a prompt.

Users can then customize the maps and quickly iterate on ideas until they achieve the vision they want.

Collaboration to Boost 3D World-Building

Autodesk, a provider of 3D software and tools for creators in media and entertainment, is focused on giving artists the creative freedom to inspire and delight audiences worldwide.

Enabling artists to trade mundane tasks for unbridled creativity, Autodesk will integrate generative AI content-creation services — developed using foundation models in Picasso — with its popular 3D software Maya.

Supercharging Autodesk customer workflows with AI allows artists to focus on creating — and to ultimately produce content faster.

Generative AI Model Foundry

Picasso is part of NVIDIA AI Foundations, which advances enterprise-level generative AI for text, visual content and even biology.

The foundry will also adopt new NVIDIA research, demonstrated at SIGGRAPH’s Real-Time Live competition, to generate physics-based rendering materials from text and image prompts. This will enable content providers to create 3D services, software and tools that enhance and expedite the simulation of diverse physical materials, such as tiles, metals and wood — complete with normal, roughness and ambient-occlusion texture maps.

Picasso runs on the NVIDIA Omniverse Cloud platform-as-a-service and is accessible via a serverless application programming interface that content and service providers like Shutterstock can easily connect to their websites and applications.

Learn about the latest advances in generative AI, graphics and more by joining NVIDIA at SIGGRAPH, running through Thursday, Aug. 10.

NVIDIA CEO Jensen Huang Returns to SIGGRAPH

One pandemic and one generative AI revolution later, NVIDIA founder and CEO Jensen Huang returns to the SIGGRAPH stage next week to deliver a live keynote at the world’s largest professional graphics conference.

The address, slated for Tuesday, Aug. 8, at 8 a.m. PT in Los Angeles, will feature an exclusive look at some of NVIDIA’s newest breakthroughs, including award-winning research, OpenUSD developments and the latest AI-powered solutions for content creation.

NVIDIA founder and CEO Jensen Huang.

Huang’s address comes after NVIDIA joined forces last week with Pixar, Adobe, Apple and Autodesk to found the Alliance for OpenUSD, a major leap toward unlocking the next era of interoperability in 3D graphics, design and simulation.

The group will standardize and extend OpenUSD, the open-source Universal Scene Description framework that’s the foundation of interoperable 3D applications and projects ranging from visual effects to industrial digital twins.

Huang will also offer a perspective on what’s been a raucous year for AI, with wildly popular new generative AI applications — including ChatGPT and Midjourney — providing a taste of what’s to come as developers worldwide get to work.

Throughout the conference, NVIDIA will participate in sessions on immersive visualization, 3D interoperability and AI-mediated video conferencing, and will present 20 research papers. Attendees will also get the opportunity to join hands-on labs.

Join SIGGRAPH to witness the evolution of AI and visual computing. Watch the keynote on this page.


Image source: Ron Diering, via Flickr, some rights reserved.