Staying in Sync: NVIDIA Combines Digital Twins With Real-Time AI for Industrial Automation

Real-time AI is helping with the heavy lifting in manufacturing, factory logistics and robotics.

In such industries — often involving bulky products, expensive equipment, cobot environments and logistically complex facilities — a simulation-first approach is ushering in the next phase of automation.

NVIDIA founder and CEO Jensen Huang today demonstrated in his GTC keynote how developers can use digital twins to develop, test and refine their large-scale, real-time AIs entirely in simulation before rolling them out in industrial infrastructure, saving significant time and cost.

NVIDIA Omniverse, Metropolis, Isaac and cuOpt interact in AI gyms where developers can train AI agents to help robots and humans navigate unpredictable or complex events.

In the demo, a digital twin of a 100,000-square-foot warehouse — built using the NVIDIA Omniverse platform for developing and connecting OpenUSD applications — operates as a simulation environment for dozens of digital workers and multiple autonomous mobile robots (AMRs), vision AI agents and sensors.

Each AMR, running the NVIDIA Isaac Perceptor multi-sensor stack, processes visual information from six sensors, all simulated in the digital twin.

At the same time, the NVIDIA Metropolis platform for vision AI creates a single centralized map of worker activity across the entire warehouse, fusing together data from 100 simulated ceiling-mounted camera streams with multi-camera tracking. This centralized occupancy map helps inform optimal AMR routes calculated by the NVIDIA cuOpt engine for solving complex routing problems.

cuOpt, a record-breaking optimization AI microservice, solves complex routing problems with multiple constraints using GPU-accelerated evolutionary algorithms.

All of this happens in real time, while Isaac Mission Control coordinates the entire fleet using map data and route graphs from cuOpt to send and execute AMR commands.

An AI Gym for Industrial Digitalization

AI agents can assist in large-scale industrial environments by, for example, managing fleets of robots in a factory or identifying streamlined configurations for human-robot collaboration in supply chain distribution centers. To build these complex agents, developers need digital twins that function as AI gyms — physically accurate environments for AI evaluation, simulation and training.

Such software-in-the-loop AI testing enables AI agents and AMRs to adapt to real-world unpredictability.

In the demo, an incident occurs along an AMR’s planned route, blocking the path and preventing it from picking up a pallet. NVIDIA Metropolis updates an occupancy grid, mapping all humans, robots and objects in a single view. cuOpt then plans an optimal route, and the AMR responds accordingly to minimize downtime.
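The re-planning step described above can be sketched in a few lines. This is a toy illustration only, not cuOpt or Metropolis: it assumes a small hypothetical occupancy grid (0 for free, 1 for blocked) and uses plain breadth-first search as a stand-in for the optimizer. The function name `plan_route` and the grid layout are invented for the example.

```python
from collections import deque

def plan_route(grid, start, goal):
    """Return a shortest list of (row, col) cells from start to goal, or None.

    Breadth-first search over free cells (value 0), moving in 4 directions.
    This is a simple stand-in for a real route optimizer.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through predecessors to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# Hypothetical 3x4 floor: the AMR starts top-left, the pallet sits top-right.
grid = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
amr, pallet = (0, 0), (0, 3)
original = plan_route(grid, amr, pallet)

# An incident blocks a cell on the planned route; the occupancy grid is updated.
grid[0][2] = 1
detour = plan_route(grid, amr, pallet)
```

Here the first plan runs straight along the top row, and the second plan routes around the blocked cell. A production system would weigh many more constraints, such as other robots, worker positions and delivery windows.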

With Metropolis vision foundation models powering the NVIDIA Visual Insight Agent (VIA) framework, AI agents can be built to help operations teams answer questions like, “What situation occurred in aisle three of the factory?” And the generative AI-powered agent offers immediate insights such as, “Boxes fell from the shelves at 3:30 p.m., blocking the aisle.”

Developers can use the VIA framework to build AI agents capable of processing large amounts of live or archived videos and images with vision-language models — whether deployed at the edge or in the cloud. This new generation of visual AI agents will help nearly every industry summarize, search and extract actionable insights from video using natural language.

All of these AI functions can be enhanced through continuous, simulation-based training and are deployed as modular NVIDIA NIM inference microservices.

Learn more about the latest advancements in generative AI and industrial digitalization at NVIDIA GTC, a global AI conference running through Thursday, March 21, at the San Jose Convention Center and online.

At Your Microservice: NVIDIA Smooths Businesses’ Journey to Generative AI

NVIDIA’s AI platform is available to any forward-thinking business — and it’s easier to use than ever.

Launched today, NVIDIA AI Enterprise 5.0 includes NVIDIA microservices, downloadable software containers for deploying generative AI applications and accelerated computing. It’s available from leading cloud service providers, system builders and software vendors — and it’s in use at customers such as Uber.

“Our adoption of NVIDIA AI Enterprise inference software is important for meeting the high performance our users expect,” said Albert Greenberg, vice president of platform engineering at Uber. “Uber prides itself on being at the forefront of adopting and using the latest, most advanced AI innovations to deliver a customer service platform that sets the industry standard for effectiveness and excellence.”

Microservices Speed App Development

Developers are turning to microservices as an efficient way to build modern enterprise applications at a global scale. Working from a browser, they use cloud APIs, or application programming interfaces, to compose apps that can run on systems and serve users worldwide.

NVIDIA AI Enterprise 5.0 now includes a wide range of microservices: NVIDIA NIM for deploying AI models in production and the NVIDIA CUDA-X collection of microservices, which includes NVIDIA cuOpt.

NIM microservices optimize inference for dozens of popular AI models from NVIDIA and its partner ecosystem.

Powered by NVIDIA inference software — including Triton Inference Server, TensorRT, and TensorRT-LLM — NIM slashes deployment times from weeks to minutes. It provides security and manageability based on industry standards as well as compatibility with enterprise-grade management tools.

NVIDIA cuOpt is a GPU-accelerated AI microservice that’s set world records for route optimization and can empower dynamic decision-making that reduces cost, time and carbon footprint. It’s one of the CUDA-X microservices that help industries put AI into production.

More capabilities are in the works. For example, the NVIDIA RAG LLM operator, now in early access, will move co-pilots and other generative AI applications that use retrieval-augmented generation from pilot to production without rewriting any code.

NVIDIA microservices are being adopted by leading application and cybersecurity platform providers including CrowdStrike, SAP and ServiceNow.

More Tools and Features

Three other updates in version 5.0 are worth noting.

The platform now packs NVIDIA AI Workbench, a developer toolkit for quickly downloading, customizing, and running generative AI projects. The software is now generally available and supported with an NVIDIA AI Enterprise license.

Version 5.0 also now supports Red Hat OpenStack Platform, the environment most Fortune 500 companies use for creating private and public cloud services. Maintained by Red Hat, it provides developers a familiar option for building virtual computing environments. IBM Consulting will help customers deploy these new capabilities.

In addition, version 5.0 expands support to cover a wide range of the latest NVIDIA GPUs, networking hardware and virtualization software.

Available to Run Anywhere

The enhanced NVIDIA AI platform is easier to access than ever.

NIM and CUDA-X microservices and all the 5.0 features will be available soon on the AWS, Google Cloud, Microsoft Azure and Oracle Cloud marketplaces.

For those who prefer to run code in their own data centers, VMware Private AI Foundation with NVIDIA will support the software, so it can be deployed in the virtualized data centers of Broadcom’s customers.

Companies have the option of running NVIDIA AI Enterprise on Red Hat OpenShift, allowing them to deploy on bare-metal or virtualized environments. It’s also supported on Canonical’s Charmed Kubernetes as well as Ubuntu.

In addition, the AI platform will be part of the software available on HPE ProLiant servers from Hewlett Packard Enterprise (HPE). HPE’s enterprise computing solution for generative AI handles inference and model fine-tuning using NVIDIA AI Enterprise.

In addition, Anyscale, Dataiku and DataRobot — three leading providers of the software for managing machine learning operations — will support NIM on their platforms. They join an NVIDIA ecosystem of hundreds of MLOps partners, including Microsoft Azure Machine Learning, Dataloop AI, Domino Data Lab and Weights & Biases.

However they access it, NVIDIA AI Enterprise 5.0 users can benefit from software that’s secure, production-ready and optimized for performance. It can be flexibly deployed for applications in the data center, the cloud, on workstations or at the network’s edge.

NVIDIA AI Enterprise is available through leading system providers, including Cisco, Dell Technologies, HP, HPE, Lenovo and Supermicro.

Hear Success Stories at GTC

Users will share their experiences with the software at NVIDIA GTC, a global AI conference, running March 18-21 at the San Jose Convention Center.

For example, ServiceNow chief digital information officer Chris Bedi will speak on a panel about harnessing generative AI’s potential. In a separate talk, ServiceNow vice president of AI Products Jeremy Barnes will discuss using NVIDIA AI Enterprise to achieve maximum developer productivity.

Executives from BlackRock, Medtronic, SAP and Uber will discuss their work in finance, healthcare, enterprise software, and business operations using the NVIDIA AI platform.

In addition, executives from ControlExpert, a global application provider for car insurance companies based in Germany, will share how they developed an AI-powered claims management solution using NVIDIA AI Enterprise software.

They’re among a growing set of companies that benefit from NVIDIA’s work evaluating hundreds of internal and external generative AI projects — all integrated into a single package that’s been tested for stability and security.

And get the full picture from NVIDIA CEO and founder Jensen Huang in his GTC keynote.

Safe and Found: NVIDIA Generative AI Microservices Help Enterprises Detect and Address Software Security Issues in Seconds

Software is writing software, thanks to generative AI.

Now, it can even help check software for cybersecurity and other risks.

NVIDIA founder and CEO Jensen Huang today unveiled in his GTC keynote how the company’s generative AI technologies can help enterprises rapidly detect and address common vulnerabilities and exposures (CVEs) and other software security issues.

Working together, the new NVIDIA NIM and NeMo Retriever microservices and the NVIDIA Morpheus accelerated AI framework can identify such problems in just seconds, rather than the hours, or even days, it would take security analysts using traditional tools.

While traditional methods require substantial manual effort to pinpoint solutions for any discovered vulnerabilities, these technologies enable quick, automatic and actionable CVE risk analysis using large language models (LLMs) and retrieval-augmented generation, aka RAG.

This lets analysts function as CEO-like decision-makers in so-called “enterprises of the future,” where artificial intelligence accelerates much of the operational work and provides data-driven insights to inform human choices.

Generative AI for cybersecurity will become increasingly important and prevalent, as last year saw record-high reported software security flaws in the CVE public database.

Watch the application demo:

How Generative AI Works in Cybersecurity

Gartner predicts that generative AI will enable a 30% reduction in false-positive rates for application security testing and threat detection by 2027.

Offered as part of the NVIDIA AI Enterprise software platform, the NVIDIA generative AI microservices and Morpheus quickly accomplish CVE risk analysis with an extremely high level of accuracy, matching the results of most human experts.

Security analysts can use these technologies to determine whether a software package includes exploitable and vulnerable components, using LLMs and event-driven RAG triggered by the creation of a new software package or the detection of a CVE.

In the NVIDIA application demoed above, an LLM generates a list of tasks to check a software package for vulnerabilities.

Then, the NVIDIA AI-powered LLM agent searches data sources — both internal and external — for any safety actions that should be taken to bring the software into compliance.

These steps are repeated until every item on the checklist has been triaged. Then, the application summarizes the interaction and creates justifications for action, which are passed on to a human analyst to decide appropriate next steps.
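The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the NVIDIA application: the LLM and retrieval steps are replaced by simple hypothetical stand-ins (`generate_checklist`, `search_sources`), and the package and data source are made up.

```python
def generate_checklist(package):
    """Stand-in for the LLM step: list tasks to check for a package (hypothetical)."""
    return [f"Is {dep} reachable from application code?"
            for dep in package["dependencies"]]

def search_sources(task, sources):
    """Stand-in for the agent's retrieval step: keep notes whose key appears in the task."""
    return [note for key, note in sources.items() if key in task]

def triage(package, sources):
    checklist = generate_checklist(package)
    findings = []
    # Repeat until every checklist item has been triaged.
    while checklist:
        task = checklist.pop(0)
        evidence = search_sources(task, sources)
        findings.append({
            "task": task,
            "evidence": evidence,
            "needs_review": bool(evidence),
        })
    # Summarize the run; the final decision stays with a human analyst.
    flagged = [f for f in findings if f["needs_review"]]
    summary = f"{len(findings)} items triaged, {len(flagged)} flagged for analyst review"
    return {"summary": summary, "findings": findings}

# Made-up inputs for illustration only.
package = {"name": "demo-container", "dependencies": ["libfoo", "libbar"]}
sources = {"libfoo": "Internal advisory: libfoo parser is exposed to user input."}
report = triage(package, sources)
print(report["summary"])
```

The point of the structure is that the automated loop only gathers and summarizes evidence. The report it returns is an input to the analyst's decision, not a replacement for it.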

In this way, event-driven RAG lets humans oversee security measures while generative AI dramatically accelerates the bulk of the research and investigative tasks that would otherwise take days to complete.

Enterprises Harness NVIDIA Generative AI for Security

NVIDIA is using the application to ensure the security of its own internal software development workflows.

On average, the application takes only seconds to perform over 400 internet searches and make more than 500 queries on various enterprise data sources when analyzing a single software container, a task that could take a human analyst days. NVIDIA scans 1,000+ containers per day.

Cybersecurity leader CrowdStrike is collaborating with NVIDIA to implement generative AI and RAG.

“Our industry has reached a crucial pivot point as AI becomes an equalizer for security teams and adversaries,” said Sven Krasser, senior vice president and chief scientist at CrowdStrike. “Today, threat actors are leveraging the latest AI advancements to compromise organizations with increased velocity. To stay one step ahead, security and operations teams need advanced threat detection and response capabilities that force-multiply their efforts by coupling together the power of data with targeted AI to accelerate investigations, identify potential vulnerabilities and prevent breaches in their environments.”

NIM, NeMo Retriever and Morpheus are available through NVIDIA AI Enterprise, a cloud-native software platform that provides accelerated and efficient runtime for generative AI foundation models. It streamlines generative AI adoption with security, stability, manageability and support.

Availability

Developers can experiment with NVIDIA microservices for free at ai.nvidia.com. Enterprises can deploy production-grade NIM microservices with NVIDIA AI Enterprise 5.0 running on NVIDIA-Certified Systems and leading cloud marketplaces.

See the software security application in action by joining Cybersecurity Developer Day at NVIDIA GTC, a global AI conference running through Thursday, March 21, online and at the San Jose Convention Center.

NVIDIA BioNeMo Expands Computer-Aided Drug Discovery With New Foundation Models

Pharma and biology researchers developing the next generation of therapeutics can now take advantage of NVIDIA BioNeMo’s expanded generative AI toolkit, along with new ways to access its models.

The latest BioNeMo foundation models can analyze DNA sequences, predict how proteins will change shape in response to a drug molecule, and determine a cell’s function based on its RNA.

Models for accelerating protein structure prediction, generative chemistry and molecular docking prediction are now available as microservices through NVIDIA NIM, a collection of models for inference announced today at NVIDIA GTC — and available through the NVIDIA AI Enterprise platform.

BioNeMo models will soon be accessible on AWS HealthOmics, a purpose-built service that helps healthcare and life sciences organizations store, query and analyze biological data including DNA and RNA.

With these capabilities, drug discovery teams can easily integrate generative AI into their workflows to better understand and design drug molecules virtually — and reduce the need for time- and resource-heavy physical experiments.

BioNeMo Expands to Foundation Models for Genomics, Protein Design

Among the new foundation models available in BioNeMo is its first genomics model, DNABERT. Trained on DNA sequences, the model can be used to predict the function of specific regions of the genome, analyze the effects of gene mutations and variants, and more.

A second model coming soon to BioNeMo, scBERT, is trained on data from single-cell RNA sequencing, enabling users to apply it to downstream tasks such as predicting the effects of gene knockouts — where a specific gene is removed or deactivated — or identifying cell types such as neurons, blood cells or muscle cells.

EquiDock, a third, joins a collection of BioNeMo models that can predict the 3D structure of how two proteins interact, which is critical to understanding if a drug molecule will be effective.

At Your Service: New Microservices Enable AI Insights

The NIM catalog of containerized AI microservices features more than two dozen healthcare models. Among them are DiffDock, which predicts the 3D structure of potential drug candidates and their protein candidates, and ESMFold, which can predict protein structure based on a single amino acid sequence.

Another NIM, MolMIM, generates drug candidates optimized for properties defined by the user — and is even able to design molecules that are optimized to bind to a specific protein target.

Developers can access production-grade NIM microservices through NVIDIA AI Enterprise using NVIDIA-Certified Systems on premises as well as leading cloud marketplaces, including Amazon Web Services (AWS), Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure.

100+ Companies Integrate BioNeMo-Powered AI Into Drug Discovery Workflows

NVIDIA BioNeMo is being used by more than 100 companies worldwide, including:

Astellas Pharma: The Tokyo-based company is using BioNeMo to accelerate molecular simulations and large language models for drug discovery applications. The company will use the Tokyo-1 AI supercomputer to further advance its work.
Cadence: A leading developer of computation software, San Jose, Calif.-based Cadence is integrating BioNeMo microservices with its Orion platform to accelerate molecular simulation.
Iambic: Based in San Diego, the drug discovery company has adopted BioNeMo and will contribute its NeuralPLexer model as a BioNeMo cloud API, or application programming interface, for noncommercial use, helping researchers predict how a protein’s 3D structure changes in response to a drug molecule.
Insilico Medicine: A premier member of the NVIDIA Inception program for startups, New York City-based Insilico has integrated BioNeMo in its AI-accelerated drug discovery workflow, developing a pipeline of over 30 therapeutic assets — including six in clinical stages.
Recursion: The Salt Lake City-based drug discovery company is a hosting partner offering its Phenom-Beta AI model through BioNeMo. The transformer model extracts insights from cellular microscopy images to help researchers better understand cell function.
Terray Therapeutics: The biotech company, based in Southern California, is using BioNeMo to help develop a multi-target structural binding model, and is training generative AI models for small molecule design on NVIDIA DGX Cloud.

Discover the latest in AI and healthcare at GTC, a global AI conference running in San Jose, Calif., and online through Thursday, March 21. Tune in to a special address on generative AI in healthcare delivered by Kimberly Powell, vice president of healthcare at NVIDIA, on Tuesday at 8 a.m. PT.

Reach for the Stars: Eight Out-of-This-World Games Join the Cloud

The stars align this GFN Thursday as more top titles from Ubisoft and Square Enix join the cloud.

Star Wars Outlaws will be coming to the GeForce NOW library at launch later this year, while STAR OCEAN THE SECOND STORY R and PARANORMASIGHT: The Seven Mysteries of Honjo are part of eight new titles joining this week.

Additionally, four other games are getting NVIDIA RTX enhancements, all arriving at next week’s Game Developers Conference.

NARAKA: BLADEPOINT and Portal with RTX are adding full ray tracing and NVIDIA DLSS 3.5 Ray Reconstruction capabilities. This month’s Diablo IV update will add ray tracing. And Sengoku Dynasty — available to stream today — was recently updated with DLSS 3 Frame Generation.

Coming Soon

A galaxy far, far away is coming to the cloud.

GeForce NOW members will be able to stream Star Wars Outlaws, the first open-world Star Wars game from Ubisoft, when it comes to the cloud at launch later this year.

Set between the events of The Empire Strikes Back and Return of the Jedi, the game lets players explore distinct planets across the galaxy, both iconic and new. Risk it all as Kay Vess, a scoundrel seeking freedom and a fresh new start. Members will fight, steal and outwit their way through the galaxy’s crime syndicates to become the galaxy’s most wanted.

The game will launch with DLSS 3 and ray-traced effects, as well as NVIDIA RTX Direct Illumination (RTXDI) and ray-traced global illumination lighting, taking visuals to the next level. Turn RTX ON, available to Ultimate and Priority members as well as Day Pass users. And both Ultimate members and Day Pass users get the added benefit of NVIDIA DLSS 3 and NVIDIA Reflex for a streaming experience nearly indistinguishable from playing locally.

Adventure Awaits

Play two of Square Enix’s latest games, thanks to the cloud.

With GeForce NOW, there’s always something new to play. This week, Japan-based publisher Square Enix brings two of its latest role-playing adventures to the cloud.

Witness an awakened destiny in STAR OCEAN THE SECOND STORY R, the highly acclaimed remake of the STAR OCEAN series’ second installment. Brought to life with a unique 2.5D aesthetic, which fuses 2D pixel characters and 3D environments, the remake includes all the iconic aspects of the original release while adding fresh elements. Experience new battle mechanics, full Japanese and English voice-overs, original and rearranged music, fast-travel and more. Discover the modernized, classic Japanese role-playing game perfect for newcomers and long-time fans alike.

Members can also try STAR OCEAN THE SECOND STORY R – DEMO this week before purchasing the full game.

Plus, solve a century-old mystery in PARANORMASIGHT: The Seven Mysteries of Honjo, a horror-adventure visual novel surrounding a Japanese tale, in which a mysterious “Rite of Resurrection” leads to conflict between those who have the power to curse others. Players conduct investigations throughout immersive, ambient, 360-degree environments to unravel the mysteries of Honjo, including by conversing with many interesting, and suspicious, characters.

Ultimate members can stream these games at up to 4K resolution for amazing visual quality across nearly any device and access NVIDIA GeForce RTX 4080 servers for extended session lengths. Upgrade today.

Shine Bright Like a New Game

Play crazy poker hands, discover game-changing jokers and trigger outrageous combos in Balatro, streaming this week.

Members can look for the following new games this week:

Hellbreach: Vegas (New release on Steam, March 11)
Deus Ex: Mankind Divided (New release on Epic Games Store, Free March 14)
Outcast – A New Beginning (New release on Steam, March 15)
Balatro (Steam)
PARANORMASIGHT: The Seven Mysteries of Honjo (Steam)
Space Engineers (Xbox, available on PC Game Pass)
STAR OCEAN THE SECOND STORY R (Steam)
STAR OCEAN THE SECOND STORY R – DEMO (Steam)
Warhammer 40,000: Boltgun (Xbox, available on PC Game Pass)

What are you planning to play this weekend? Let us know on X or in the comments below.

NVIDIA GTC 2024: A Glimpse Into the Future of AI With Jensen Huang

NVIDIA’s GTC 2024 AI conference will set the stage for another leap forward in AI.

At the heart of this highly anticipated event: the opening keynote by Jensen Huang, NVIDIA’s visionary founder and CEO, who speaks on Monday, March 18, at 1 p.m. Pacific, at the SAP Center in San Jose, Calif.

Planning Your GTC Experience

There are two ways to watch.

Register to attend GTC in person to secure a spot for an immersive experience at the SAP Center. The center is a short walk from the San Jose Convention Center, where the rest of the conference takes place. Doors open at 11 a.m., and badge pickup starts at 10:30 a.m.

The keynote will also be livestreamed at www.nvidia.com/gtc/keynote/.

Whether attending in person or virtually, commit to joining us all week. GTC is more than just a conference. It’s a gateway to the next wave of AI innovations.

Transforming AI: Hear more from Huang as he discusses the origins and impact of transformer neural network architecture with its creators and industry pioneers. He’ll host a panel with all eight authors of the legendary 2017 paper that introduced the concept of transformers: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Wed., March 20, 11-11:50 a.m. Pacific.
Join Visionaries Transforming Our World: Hear from leaders such as xAI cofounder Igor Babuschkin; Microsoft Vice President of GenAI Sebastian Bubeck; Stanford University’s Fei-Fei Li; Meta Vice President of AI Research Joelle Pineau; OpenAI Chief Operating Officer Brad Lightcap; Adept AI founder and CEO David Luan; Waabi founder and CEO Raquel Urtasun; Mistral CEO Arthur Mensch; and many others at the forefront of AI across various industries.
Be Part of What Comes Next: Engage from March 17-21 in workshops and peer networking and connect with the experts. This year’s session catalog is packed with topics covering everything from robotics to generative AI, showcasing real-world applications and the latest in AI innovation.
Stay Connected: Tune in online to engage with the event and fellow attendees using #GTC24 on social media.

With visionary speakers and a comprehensive program covering the essentials of AI and computing, GTC promises to be an enlightening experience for all.

Don’t miss your chance to be at the forefront of AI’s evolution. Register now.

Currents of Change: ITIF President Daniel Castro on Energy-Efficient AI and Climate Change

AI-driven change is in the air, as are concerns about the technology’s environmental impact. In this episode of NVIDIA’s AI Podcast, Daniel Castro, vice president of the Information Technology and Innovation Foundation and director of its Center for Data Innovation, speaks with host Noah Kravitz about the motivation behind his AI energy use report, which addresses misconceptions about the technology’s energy consumption. Castro also touches on the need for policies and frameworks that encourage the development of energy-efficient technology. Tune in to discover the crucial role of GPU acceleration in enhancing sustainability and how AI can help address climate change challenges.

The AI Podcast · ITIF President Daniel Castro on Energy-Efficient AI and Climate Change

Register for NVIDIA GTC, a global AI developer conference running March 18-21 in San Jose, Calif., to explore sessions on energy-efficient computing and using AI to combat climate change.

You Might Also Like…

Overjet on Bringing AI to Dentistry – Ep. 179

Dentists get a bad rap. Dentists also get more people out of more aggravating pain than just about anyone, which is why the more technology dentists have, the better. Overjet, a member of the NVIDIA Inception program for startups, is moving fast to bring AI to dentists’ offices.

DigitalPath’s Ethan Higgins on Using AI to Fight Wildfires – Ep. 211

DigitalPath is igniting change in the Golden State, using computer vision, generative adversarial networks and a network of thousands of cameras to detect signs of fire in real time.

Anima Anandkumar on Using Generative AI to Tackle Global Challenges – Ep. 204

Anima Anandkumar, Bren Professor at Caltech and senior director of AI research at NVIDIA, speaks to generative AI’s potential to make splashes in the scientific community, from accelerating drug and vaccine research to predicting extreme weather events like hurricanes or heat waves.

Doing the Best They Can: EverestLabs Ensures Fewer Recyclables Go to Landfills – Ep. 184

All of us recycle. Or, at least, all of us should. Now, AI is joining the effort. JD Ambati, founder and CEO of EverestLabs, developer of RecycleOS, discusses developing the first AI-enabled operating system for recycling.

Show Notes

1:41: Context on and findings from the AI energy use report
10:36: How GPU acceleration has transformed the energy efficiency of AI, particularly in weather and climate forecasting
12:31: Examples of how GPU acceleration has improved the energy efficiency of AI operations
15:51: Castro’s insights on sustainability and AI
20:01: Policies and frameworks to encourage energy-efficient AI
26:43: Castro’s outlook on the interplay among advancing AI technology, energy sustainability and climate change

Subscribe to the AI Podcast

Get the AI Podcast through iTunes, Google Podcasts, Google Play, Amazon Music, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.

Make the AI Podcast better: Have a few minutes to spare? Fill out this listener survey.

 

AI Decoded: Demystifying Large Language Models, the Brains Behind Chatbots

Editor’s note: This post is part of our AI Decoded series, which aims to demystify AI by making the technology more accessible, while showcasing new hardware, software, tools and accelerations for RTX PC and workstation users.

If AI is having its iPhone moment, then chatbots are one of its first popular apps.

They’re made possible thanks to large language models, deep learning algorithms pretrained on massive datasets — as expansive as the internet itself — that can recognize, summarize, translate, predict and generate text and other forms of content. They can run locally on PCs and workstations powered by NVIDIA GeForce and RTX GPUs.

LLMs excel at summarizing large volumes of text, classifying and mining data for insights, and generating new text in a user-specified style, tone or format. They can facilitate communication in any language, even beyond ones spoken by humans, such as computer code or protein and genetic sequences.

While the first LLMs dealt solely with text, later iterations were trained on other types of data. These multimodal LLMs can recognize and generate images, audio, videos and other content forms.

Chatbots like ChatGPT were among the first to bring LLMs to a consumer audience, with a familiar interface built to converse with and respond to natural-language prompts. LLMs have since been used to help developers write code and to help scientists drive drug discovery and vaccine development.

But the AI models that power those functions are computationally intensive. Combining advanced optimization techniques and algorithms like quantization with RTX GPUs, which are purpose-built for AI, helps make LLMs compact enough and PCs powerful enough to run locally — no internet connection required. And a new breed of lightweight LLMs like Mistral — one of the LLMs powering Chat with RTX — sets the stage for state-of-the-art performance with lower power and storage demands.
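To give a sense of what quantization does, here is a minimal sketch of one common variant, symmetric 8-bit quantization with a single shared scale. It is an illustration under assumptions, not the scheme used by any particular NVIDIA tool, and the function names and sample weights are made up.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    # Fall back to a scale of 1.0 if every weight is zero.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q_weights, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in q_weights]

# Made-up weights for illustration.
weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per weight is at most half of one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each quantized value fits in one byte instead of the four bytes of a 32-bit float, which is where the size reduction comes from. The cost is a small rounding error per weight, and production schemes are considerably more elaborate than this sketch.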

Why Do LLMs Matter?

LLMs can be adapted for a wide range of use cases, industries and workflows. This versatility, combined with their speed, offers performance and efficiency gains across virtually all language-based tasks.

LLMs are widely used in language translation apps such as DeepL, which runs on NVIDIA GPUs in the cloud and uses AI and machine learning to provide accurate text translations.

Medical researchers are training LLMs on textbooks and other medical data to enhance patient care. Retailers are leveraging LLM-powered chatbots to deliver stellar customer support experiences. Financial analysts are tapping LLMs to transcribe and summarize earnings calls and other important meetings. And that’s just the tip of the iceberg.

Chatbots — like Chat with RTX — and writing assistants built atop LLMs are making their mark on every facet of knowledge work, from content marketing and copywriting to legal operations. Coding assistants were among the first LLM-powered applications to point toward the AI-assisted future of software development. Now, projects like ChatDev are combining LLMs with AI agents — smart bots that act autonomously to help answer questions or perform digital tasks — to spin up an on-demand, virtual software company. Just tell the system what kind of app is needed and watch it get to work.

Learn more about LLM agents on the NVIDIA developer blog.

Easy as Striking Up a Conversation 

Many people’s first encounter with generative AI came by way of a chatbot such as ChatGPT, which simplifies the use of LLMs through natural language, making user action as simple as telling the model what to do.

LLM-powered chatbots can help generate a draft of marketing copy, offer ideas for a vacation, craft an email to customer service and even spin up original poetry.

Advances in image generation and multimodal LLMs have extended the chatbot’s realm to include analyzing and generating imagery — all while maintaining the wonderfully simple user experience. Just describe an image to the bot or upload a photo and ask the system to analyze it. It’s chatting, but now with visual aids.

For more on how these bots are designed, check out the on-demand webinar on Building Intelligent AI Chatbots Using RAG.

Future advancements will help LLMs expand their capacity for logic, reasoning, math and more, giving them the ability to break complex requests into smaller subtasks.

Progress is also being made on AI agents, applications capable of taking a complex prompt, breaking it into smaller ones, and engaging autonomously with LLMs and other AI systems to complete them. ChatDev is an example of an AI agent framework, but agents aren’t limited to technical tasks.

For example, users could ask a personal AI travel agent to book a family vacation abroad. The agent would break that task into subtasks — itinerary planning, booking travel and lodging, creating packing lists, finding a dog walker — and independently execute them in order.
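The break-it-down-then-execute pattern in the travel example can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: `plan` is a hard-coded stand-in for what would really be an LLM call, and `fake_execute` is a hypothetical placeholder for the tools an agent would invoke. Neither reflects how ChatDev or any specific agent framework is implemented.

```python
def plan(task: str) -> list[str]:
    """Stand-in for an LLM call that would decompose the request into subtasks."""
    # Hard-coded for illustration; a real agent would generate this list from the task.
    return [
        "plan itinerary",
        "book travel and lodging",
        "create packing list",
        "find dog walker",
    ]


def run_agent(task: str, execute) -> list[tuple[str, str]]:
    """Run each subtask in order and collect (subtask, result) pairs."""
    results = []
    for subtask in plan(task):
        results.append((subtask, execute(subtask)))
    return results


def fake_execute(subtask: str) -> str:
    """Hypothetical executor; a real one would call booking APIs, search tools and so on."""
    return f"done: {subtask}"


results = run_agent("Book a family vacation abroad", fake_execute)
```

Keeping planning and execution as separate functions means either side can be swapped out, for example replacing the hard-coded plan with a model-generated one, without touching the loop.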

Unlock Personal Data With RAG

As powerful as LLMs and chatbots are for general use, they can become even more helpful when combined with an individual user’s data. With access to that data, they can help analyze email inboxes to uncover trends, comb through dense user manuals to find the answer to a technical question about some hardware, or summarize years of bank and credit card statements.

Retrieval-augmented generation, or RAG, is one of the easiest and most effective ways to hone LLMs for a particular dataset.

An example of RAG on a PC.

RAG enhances the accuracy and reliability of generative AI models with facts fetched from external sources. By connecting an LLM with practically any external resource, RAG lets users chat with data repositories while also giving the LLM the ability to cite its sources. The user experience is as simple as pointing the chatbot toward a file or directory.

For example, a standard LLM will have general knowledge about content strategy best practices, marketing tactics and basic insights into a particular industry or customer base. But connecting it via RAG to marketing assets supporting a product launch would allow it to analyze the content and help plan a tailored strategy.

RAG works with any LLM, as long as the application supports it. NVIDIA’s Chat with RTX tech demo is an example of RAG connecting an LLM to a personal dataset. It runs locally on systems with a GeForce RTX or NVIDIA RTX professional GPU.
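The retrieve-then-generate flow behind RAG can be sketched with a toy example. The Python below is a minimal sketch that assumes simple word-overlap (cosine) scoring over three made-up documents; real systems, including Chat with RTX, typically rely on learned embeddings and a vector index, and none of the names or file contents here come from NVIDIA's implementation. The final step, sending the prompt to an LLM, is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k (name, text) pairs most similar to the query."""
    qv = Counter(tokenize(query))
    ranked = sorted(
        documents.items(),
        key=lambda item: cosine(qv, Counter(tokenize(item[1]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Prepend the retrieved passages, tagged by source, so the model can cite them."""
    context = "\n".join(f"[{name}] {text}" for name, text in retrieve(query, documents, k))
    return f"Answer using only the sources below and cite them by name.\n{context}\n\nQuestion: {query}"


# Hypothetical local files, invented for illustration.
documents = {
    "router_manual.txt": "Hold the reset button for ten seconds to restore factory settings on the router.",
    "bank_2023.txt": "Total credit card spending in 2023 was highest in December.",
    "launch_plan.txt": "The product launch campaign targets small business owners through email.",
}
prompt = build_prompt("How do I reset the router?", documents, k=1)
```

Because each retrieved passage is tagged with its file name, the model's answer can point back to the source, which is what lets a RAG chatbot cite where its facts came from.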

To learn more about RAG and how it compares to fine-tuning an LLM, read the tech blog, RAG 101: Retrieval-Augmented Generation Questions Answered.

Experience the Speed and Privacy of Chat with RTX

Chat with RTX is a local, personalized chatbot demo that’s easy to use and free to download. It’s built with RAG functionality, TensorRT-LLM and RTX acceleration. It supports multiple open-source LLMs, including Meta’s Llama 2 and Mistral’s Mistral. Support for Google’s Gemma is coming in a future update.

Chat with RTX connects users to their personal data through RAG.

Users can easily connect local files on a PC to a supported LLM simply by dropping files into a folder and pointing the demo to that location. Doing so enables it to answer queries with quick, contextually relevant answers.

Since Chat with RTX runs locally on Windows with GeForce RTX PCs and NVIDIA RTX workstations, results are fast — and the user’s data stays on the device. Rather than relying on cloud-based services, Chat with RTX lets users process sensitive data on a local PC without the need to share it with a third party or have an internet connection.

To learn more about how AI is shaping the future, tune in to NVIDIA GTC, a global AI developer conference running March 18-21 in San Jose, Calif., and online.

Head of the Class: Explore AI’s Potential in Higher Education and Research at GTC

For students, researchers and educators eager to delve into AI, GTC — NVIDIA’s conference on AI and accelerated computing — is in a class of its own.

Taking place from March 18-21 at the San Jose Convention Center, GTC features over 900 talks presented by world-renowned experts in fields such as generative AI, high performance computing, healthcare, energy and environment, and robotics.

See some of the top sessions for attendees in higher education below. And don’t miss NVIDIA founder and CEO Jensen Huang’s GTC keynote on how AI is transforming industries, on Monday, March 18, at 1 p.m. PT.

For Researchers 

Transforming AI is a panel featuring Huang and the eight authors of “Attention Is All You Need,” a groundbreaking paper that introduced the transformer neural network architecture.
Fireside Chat With Fei-Fei Li and Bill Dally: The High-Speed Revolution in AI and Managing the Impact on Humanity, featuring Dally, chief scientist and senior vice president of research at NVIDIA, and Li, Sequoia Professor of computer science at Stanford University.
Fireside Chat With Christian Szegedy and Bojan Tunguz: Automated Reasoning for More Advanced Software Synthesis and Verification, featuring Szegedy, research scientist and founder of xAI, and Tunguz, data scientist at NVIDIA.

See more sessions for researchers.

For Educators

Priming Researchers and Students for AI and Accelerated Computing Breakthroughs With Self-Sustaining Training Programs, featuring Israel Chaparro-Cruz, lecturer and co-investigator from Universidad Nacional Jorge Basadre Grohmann; Mohammad Mostafanejad, lead software scientist at the Molecular Sciences Software Institute at Virginia Tech; and Joe Bungo, Deep Learning Institute program manager at NVIDIA.
Learn How Educators Are Integrating Generative AI, Simulation and Design Into Their Curricula, featuring Deepak Chetty, area head for virtual production and assistant professor of practice at the University of Texas, Austin; Barbara Mones, teaching professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington; and Laura Scholl, senior content developer for the NVIDIA Deep Learning Institute.
Omniverse Educator Summit, an opportunity for educators to explore the NVIDIA Omniverse platform, connect with peers and discover practical resources for classrooms from NVIDIA, including the Deep Learning Institute’s Teaching Kits and University Ambassador Program. Register to attend.

Find more sessions for educators.

For Students

AI Secrets I Wish I Knew, featuring speakers from Stanford University, NVIDIA and education journalism initiative EdSurge.
NVIDIA Graduate Fellowship Fast Forward Talks, featuring Dally and fellowship recipients from Caltech, Cornell University, Stanford University and UC Berkeley.
Bridging the AI Divide: Expanding Access and Training to Nontraditional Talents and Underserved Communities, featuring speakers from Black Women In Artificial Intelligence, Create Labs, Cortex Innovation District and NVIDIA.

Discover more sessions for students and apply to join the NVIDIA Student Network.

To gain hands-on experience, check out training labs and full-day technical workshops at GTC.

Eco-System Upgrade: AI Plants a Digital Forest at NVIDIA GTC

The ecosystem around NVIDIA’s technologies has always been verdant — but this is absurd.

After a stunning premiere at the World Economic Forum in Davos, immersive artworks based on Refik Anadol Studio’s Large Nature Model will come to the U.S. for the first time at NVIDIA GTC.

Offering a deep dive into the synergy between AI and the natural world, Anadol’s multisensory work, “Large Nature Model: A Living Archive,” will be situated prominently on the main concourse of the San Jose Convention Center, where the global AI event is taking place, from March 18-21.

Fueled by NVIDIA’s advanced AI technology, including powerful DGX A100 stations and high-performance GPUs, the exhibit offers a captivating journey through our planet’s ecosystems with stunning visuals, sounds and scents.

These scenes are rendered in breathtaking clarity across screens with a total output of 12.5 million pixels, immersing attendees in an unprecedented digital portrayal of Earth’s ecosystems.

Refik Anadol, recognized by The Economist as “the artist of the moment,” has emerged as a key figure in AI art. His work, notable for its use of data and machine learning, places him at the forefront of a generation pushing the boundaries between technology, interdisciplinary research and aesthetics. Anadol’s influence reflects a wider movement in the art world towards embracing digital innovation, setting new precedents in how art is created and experienced.

Exhibition Details

Location: Main concourse at the San Jose McEnery Convention Center, ensuring easy access for all GTC attendees.
Total experience hours: Available from 5-7 p.m., providing a curated window to engage with the installation fully.
Screen dimensions: The installation features two towering screens, each four meters high. The larger, four-by-12-meter screen displays “Large Nature Model: A Living Archive,” Anadol’s centerpiece. A second, four-by-six-meter screen offers a glimpse into the process of building the Large Nature Model.

A Gateway to Digital Nature

Large Nature Model is a generative AI model focused exclusively on nature.

This installation exemplifies AI’s unique potential to capture nature’s inherent intelligence, aiming to redefine our engagement with and appreciation of Earth’s ecosystems.

Anadol has been working with nature-based datasets throughout his career, and began working with rainforest data years ago.

The Large Nature Model, on which the work being shown at GTC is based, continues to evolve. It represents the work of a team of 29 data scientists, graphic designers and AI specialists from around the world, all working under the umbrella of the Refik Anadol Studio.

The Large Nature Model showcased at GTC is fine-tuned using the Getty Images foundation model built using the NVIDIA Edify architecture. The model is fine-tuned on an extensive dataset of approximately 750,000 images, comprising 274,947 images of flora, 358,713 images of fauna and 130,282 images of fungi — showcasing the rich biodiversity of the Amazonian rainforest.

Insights Into the Making

Alongside the visual feast, a panel discussion featuring Anadol and colleagues from the Refik Anadol Studio will provide insights into their research and design processes.

Moderated by Brian Dowdy, a senior technical marketing engineer at NVIDIA, the discussion will explore the collaborative efforts, technical challenges and creative processes that make such pioneering art possible.

The creation of the Large Nature Model represents six months of rigorous development and collaboration with NVIDIA researchers, underscoring the dedication and interdisciplinary effort required to bring this innovative vision to life.

Register for GTC today to join this immersive journey into the heart of nature, art and AI innovation.