I Tested Best AI Note Takers: Fellow, Otter, and TL;DV

After using Fellow, Otter, and TL;DV for transcription, recording, and summarization of online meetings, I evaluated the features, pros, and cons of these popular AI note takers. Here are my takeaways.

How to Build a Claims Processor Agent from Scratch?

We’ll use the Stack AI workflow builder for claims automation, creating an AI agent that lets users upload accounting documents—like invoices, receipts, and claim forms—and automatically converts them into structured JSON using OCR and GPT-based processing. The extracted data can then be sent to a Google Sheet or used in custom apps and databases.
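To make that flow concrete, here is a minimal Python sketch of the same idea outside Stack AI: OCR a scanned document, ask a GPT model to return structured JSON, and append the result to a Google Sheet. The prompt, field schema, model name, and the "Claims" spreadsheet are illustrative assumptions, not Stack AI's actual configuration.

```python
import json

import gspread                 # Google Sheets client (service-account credentials assumed)
import pytesseract             # wrapper around the Tesseract OCR engine
from PIL import Image
from openai import OpenAI

client = OpenAI()              # assumes OPENAI_API_KEY is set in the environment

def extract_claim_fields(image_path: str) -> dict:
    """OCR a scanned claim document, then have a GPT model structure it as JSON."""
    raw_text = pytesseract.image_to_string(Image.open(image_path))
    response = client.chat.completions.create(
        model="gpt-4o-mini",                       # illustrative model choice
        response_format={"type": "json_object"},   # force valid JSON output
        messages=[
            {"role": "system", "content": (
                "Extract vendor, date, line_items (description, amount) and "
                "total from the document text. Reply with JSON only."
            )},
            {"role": "user", "content": raw_text},
        ],
    )
    return json.loads(response.choices[0].message.content)

fields = extract_claim_fields("receipt.png")

# Append one row per processed document to a sheet named "Claims" (hypothetical).
sheet = gspread.service_account().open("Claims").sheet1
sheet.append_row([fields["vendor"], fields["date"], fields["total"]])
```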

Wildfire Prevention: AI Startups Support Prescribed Burns, Early Alerts

Artificial intelligence is helping identify and treat diseases faster with better results for humankind. Natural disasters like wildfires are next.

Fires in the Los Angeles area have claimed more than 16,000 homes and other structures so far this year. Damages in January were estimated as high as $164 billion, making it potentially the worst natural disaster financially in U.S. history, according to Bloomberg.

The U.S. Department of Agriculture and the U.S. Forest Service have reportedly been redirecting resources in recent months toward beneficial fires to reduce overgrowth.

AI enables fire departments to keep more eyes on controlled burns, making them safer and more accepted in communities, say industry experts.

“This is just like cancer treatment,” said Sonia Kastner, CEO and founder of Pano AI, based in San Francisco. “You can do early screening, catch it when it’s in phase one, and hit it with really aggressive treatment so it doesn’t progress — what we’ve seen this fire season is proof that our customers across the country use our solution in this way.”

San Ramon, California-based Green Grid, which specializes in AI for utility companies, in September alerted its customer at a Big Bear resort that a fire that had started in the San Bernardino National Forest was nearby, said Chinmoy Saha, the company’s CEO. By acting early, the resort was able to prepare suppression measures before the fire reached its property and became uncontrollable, he said. Thanks to favorable weather conditions, the fire never reached the customer’s territory.

During the recent Los Angeles area fires, Saha said he had been in discussion with a customer seeking to bring AI to cameras located at the site of the now-devastated Eaton fire, which has claimed 17 lives and more than 9,000 buildings.

“If we had our system there, this fire could have been mitigated,” said Saha. “Early detection is the key, so the fire is contained and it doesn’t become a catastrophic wildfire.”

Aiding First Responders With Accelerated Computing

Pano’s service provides human-in-the-loop, AI-driven fire detection and alerts that have enabled fire departments to act faster than they could on 911 calls alone, accelerating containment efforts, said Kastner.

The company’s Pano Station uses two ultra-high-definition cameras mounted on mountaintops, like a cell tower, rotating 360 degrees every minute to capture views 10 miles in all directions. Those images are transmitted to the cloud every minute, where AI models running on GPUs perform inference for smoke detection.

Pano AI’s Pano Station in Rancho Palos Verdes

Pano has a daytime smoke detection model, a nighttime near-infrared model that looks for smoke and a nighttime geostationary satellite model. A human in the loop verifies the detections, which can be confirmed using digital zoom and time-lapse imagery.
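Pano hasn’t published its internals, but the pipeline described above (periodic image capture, GPU inference, human confirmation) maps onto a loop like the following sketch; the model file, threshold and classifier interface are hypothetical stand-ins, not Pano’s code.

```python
import torch
from PIL import Image
from torchvision import transforms

# Hypothetical stand-in: any binary smoke/no-smoke classifier could slot in here.
model = torch.jit.load("smoke_detector.pt").eval().cuda()
preprocess = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
])

def detect_smoke(frame: Image.Image, threshold: float = 0.8) -> bool:
    """Run GPU inference on one camera frame; flag it if the model is confident."""
    batch = preprocess(frame).unsqueeze(0).cuda()
    with torch.no_grad():
        prob = torch.sigmoid(model(batch)).item()
    return prob >= threshold

if __name__ == "__main__":
    frame = Image.open("panorama_tile.jpg")   # one tile of a 360-degree sweep
    if detect_smoke(frame):
        # In a Pano-style system, a human analyst confirms the detection
        # using digital zoom and time-lapse before an alert goes out.
        print("Possible smoke detected: route to human review.")
```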

It trains on NVIDIA GPUs locally and runs inference on NVIDIA GPUs in the cloud.

Harnessing AI for Controlled Burns

The California Department of Forestry and Fire Protection (CAL FIRE) is carrying out prescribed fires, or controlled burns, to reduce dry vegetation that creates fuel for wildfires.

“Controlled burns are necessary, and we didn’t do a good job in California for the past 30 or 40 years,” said Saha. Green Grid has deployed its trailer-mounted AI camera sensors to monitor wildfires and controlled burns before they get out of control.

Fire departments can use Pano’s AI-driven cameras to monitor controlled burn zones, making sure plumes of smoke don’t appear outside the permitted zone and maintaining safety.

The company has its cameras stationed at Rancho Palos Verdes, south of the recent Los Angeles area fires.

“The area around the Palisades fire was a very overgrown forest, and with a lot of dead fuels, so our hope is that there is going to be more focus on prescribed fires,” said Kastner.

Embracing AI at Fire Departments for Faster Mitigation

CAL FIRE has partnered with Alert California and UC San Diego on a network of cameras owned by investor-owned utilities, CAL FIRE, the U.S. Forest Service and other U.S. Department of the Interior agencies.

Through that network, they’ve implemented an AI program that looks for new fire starts. Each camera pans every two minutes and continuously updates, giving Alert California the most up-to-date information from the network.

If AI can enable fire departments to get to the scene of a fire when it’s just a few acres, it’s a lot easier to control than if it’s 50 or more acres, said David Acuna, battalion chief at CAL FIRE, Clovis, California. This is particularly important in remote areas where it might take hours before a human sees and reports a fire, he added.

“They use AI to determine if this looks like a new start,” said Acuna. “Now the key here is the program will then send an email to the relevant emergency command center, saying ‘Hey, I think we spotted a new start, what do you think?’ And it has to be verified by a human.”

Join the Family: GeForce NOW Welcomes 2K’s Acclaimed ‘Mafia’ Franchise to the Cloud

Calling all wiseguys — 2K’s acclaimed Mafia franchise is available to stream from the cloud.

Step into the gritty underworld of organized crime and experience the cinematic storytelling and action-packed gameplay that made the Mafia franchise a classic, captivating both newcomers and longtime fans of the saga.

It’s all part of nine games joining the cloud family this week, including Towerborne from Stoic and Xbox Game Studios.

Plus, the family is waiting — the Mafia saga’s highly anticipated prequel, Mafia: The Old Country, will join the cloud at launch.

The Cloud Is Made

Step into the world of organized crime with Mafia, Mafia: Definitive Edition, Mafia II and Mafia III now streaming on GeForce NOW for those new and returning to the family. Experience the gritty underworld drama and cinematic storytelling that made Mafia a legend — no need to wait for a sitdown with the don.

Mafia on GeForce NOW
Keep your friends close and your enemies closer.

Start with the first game in the series, Mafia. An inadvertent brush with the mob thrusts cab driver Tommy Angelo into the world of organized crime. He’s initially uneasy about falling in with the Salieri family, but the rewards become too big to ignore. Plus, check out Mafia: Definitive Edition, a remake from the ground up of the classic first title, with an expanded story, gameplay and original score.

Mafia II on GeForce NOW
Sleeping with the fishies.

Years later, the criminal legacy continues with Mafia II. Vito Scaletta has started to make a name for himself on the streets of Empire Bay as someone who can be trusted to get a job done. Together with his buddy Joe, he’s working to prove himself to the mafia, quickly climbing the family ladder with crimes of greater reward, status and consequence — but life as a wiseguy isn’t quite as untouchable as it seems.

Mafia III on GeForce NOW
Every family has its secrets.

The saga expands to 1968 New Bordeaux in Mafia III: Definitive Edition. After years of combat in Vietnam, Lincoln Clay knows this truth: family isn’t who you’re born with, it’s who you die for. When his surrogate family is wiped out by the Italian mafia, Lincoln builds a new family and blazes a path of revenge through the mafioso responsible.

Step into the tailored suits and fedoras of the criminal underworld with the Mafia series on GeForce NOW. Play anytime, anywhere across devices, just like a true mobster on the move.

The Cloud Is Full of Aces

Towerborne on GeForce NOW
Fight as one.

From Stoic and Xbox Game Studios comes Towerborne, a new kind of looter brawler, combining side-scrolling combat with action role-playing game loot progression and customization — available in the cloud at high performance for GeForce NOW members.

Take on the role of an Ace — a hero reborn from the spirit realm — to protect the last bastion of humanity, the Belfry, from monstrous threats. Embark on a series of daring expeditions beyond the city’s walls and into the wilds. Engage in fast-paced, combo-driven combat and experiment with different weapon classes like the heavy-hitting Rockbreaker, agile Shadowstriker and more.

Experience the vibrant, action-packed world of Towerborne on GeForce NOW, with high dynamic range support for richer colors and deeper contrast. Defend the Belfry on a Performance or Ultimate membership to get immersed in stunning, high-contrast visuals that bring explosive battles to life. Members can adventure solo or team with up to three friends for co-op action and stream the game instantly from the cloud, no downloads required.

Gacha New Games

Genshin Impact 5.6 on GeForce NOW
Keep cool in the cloud when the kitchen heats up.

Genshin Impact version 5.6, called “Paralogism,” brings a new story chapter in which players return to the city of Mondstadt and help solve a mysterious crisis involving the character Albedo. The update also introduces an event in the city of Fontaine, where players can build and run their own amusement park. Plus, two new characters join the game: Escoffier, a chef who uses cryo elemental power, and Ifa, a Saurian veterinarian who fights alongside his Saurian companion, Cacucu. Catch it in the cloud without waiting for downloads.

Look for the following games available to stream in the cloud this week:

What are you planning to play this weekend? Let us know on X or in the comments below.

LM Studio Accelerates LLM Performance With NVIDIA GeForce RTX GPUs and CUDA 12.8

As AI use cases continue to expand — from document summarization to custom software agents — developers and enthusiasts are seeking faster, more flexible ways to run large language models (LLMs).

Running models locally on PCs with NVIDIA GeForce RTX GPUs enables high-performance inference, enhanced data privacy and full control over AI deployment and integration. Tools like LM Studio — free to try — make this possible, giving users an easy way to explore and build with LLMs on their own hardware.

LM Studio has become one of the most widely adopted tools for local LLM inference. Built on the high-performance llama.cpp runtime, the app allows models to run entirely offline and can also serve as OpenAI-compatible application programming interface (API) endpoints for integration into custom workflows.

The release of LM Studio 0.3.15 brings improved performance for RTX GPUs thanks to CUDA 12.8, significantly improving model load and response times. The update also introduces new developer-focused features, including enhanced tool use via the “tool_choice” parameter and a redesigned system prompt editor.

Together, these changes improve LM Studio’s performance and usability — delivering the highest throughput yet on RTX AI PCs. This means faster responses, snappier interactions and better tools for building and integrating AI locally.

Where Everyday Apps Meet AI Acceleration

LM Studio is built for flexibility — suited to both casual experimentation and full integration into custom workflows. Users can interact with models through a desktop chat interface or enable developer mode to serve OpenAI-compatible API endpoints. This makes it easy to connect local LLMs to workflows in apps like VS Code or bespoke desktop agents.

For example, LM Studio can be integrated with Obsidian, a popular markdown-based knowledge management app. Using community-developed plug-ins like Text Generator and Smart Connections, users can generate content, summarize research and query their own notes — all powered by local LLMs running through LM Studio. These plug-ins connect directly to LM Studio’s local server, enabling fast, private AI interactions without relying on the cloud.
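Because the local server speaks the OpenAI API, any compatible client can call it. A minimal sketch with the openai Python package, assuming LM Studio’s default address of localhost:1234 (the model identifier shown here is a placeholder; use the one listed in your server tab):

```python
from openai import OpenAI

# LM Studio's local server exposes an OpenAI-compatible API; no cloud key is needed.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # placeholder: use the identifier LM Studio shows for your model
    messages=[{"role": "user", "content": "Summarize my meeting notes in three bullets."}],
)
print(response.choices[0].message.content)
```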

Example of using LM Studio to generate notes accelerated by RTX.

The 0.3.15 update adds new developer capabilities, including more granular control over tool use via the “tool_choice” parameter and an upgraded system prompt editor for handling longer or more complex prompts.

The tool_choice parameter lets developers control how models engage with external tools — whether by forcing a tool call, disabling it entirely or allowing the model to decide dynamically. This added flexibility is especially valuable for building structured interactions, retrieval-augmented generation (RAG) workflows or agent pipelines. Together, these updates enhance both experimentation and production use cases for developers building with LLMs.
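In the OpenAI-style API that LM Studio exposes, tool_choice accepts “auto” (the model decides), “none” (tools disabled) or a forced call to a named function. A hedged sketch, with search_notes as a hypothetical tool your application would implement:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

tools = [{
    "type": "function",
    "function": {
        "name": "search_notes",  # hypothetical tool; your app supplies the implementation
        "description": "Search the user's local notes for a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

# tool_choice="auto" lets the model decide; "none" disables tool calls entirely;
# {"type": "function", "function": {"name": "search_notes"}} would force this tool.
response = client.chat.completions.create(
    model="local-model",  # placeholder identifier
    messages=[{"role": "user", "content": "What did I write about CUDA 12.8?"}],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message)
```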

LM Studio supports a broad range of open models — including Gemma, Llama 3, Mistral and Orca — and a variety of quantization formats, from 4-bit to full precision.

Common use cases span RAG, multi-turn chat with long context windows, document-based Q&A and local agent pipelines. And by using local inference servers powered by the NVIDIA RTX-accelerated llama.cpp software library, users on RTX AI PCs can integrate local LLMs with ease.

Whether optimizing for efficiency on a compact RTX-powered system or maximizing throughput on a high-performance desktop, LM Studio delivers full control, speed and privacy — all on RTX.

Experience Maximum Throughput on RTX GPUs

At the core of LM Studio’s acceleration is llama.cpp — an open-source runtime designed for efficient inference on consumer hardware. NVIDIA partnered with the LM Studio and llama.cpp communities to integrate several enhancements to maximize RTX GPU performance.

Key optimizations include:

Data measured using different versions of LM Studio and CUDA backends on a GeForce RTX 5080 with the DeepSeek-R1-Distill-Llama-8B model. All configurations measured using Q4_K_M GGUF (Int4) quantization at BS=1, ISL=4000, OSL=200, with Flash Attention ON. Graph showcases ~27% speedup with the latest version of LM Studio due to NVIDIA contributions to the llama.cpp inference backend.

With a compatible driver, LM Studio automatically upgrades to the CUDA 12.8 runtime, enabling significantly faster model load times and higher overall performance.

These enhancements deliver smoother inference and faster response times across the full range of RTX AI PCs — from thin, light laptops to high-performance desktops and workstations.

Get Started With LM Studio

LM Studio is free to download and runs on Windows, macOS and Linux. With the latest 0.3.15 release and ongoing optimizations, users can expect continued improvements in performance, customization and usability — making local AI faster, more flexible and more accessible.

Users can load a model through the desktop chat interface or enable developer mode to expose an OpenAI-compatible API.

To quickly get started, download the latest version of LM Studio and open up the application.

  1. Click the magnifying glass icon on the left panel to open up the Discover menu.
  2. Select the Runtime settings on the left panel and search for the CUDA 12 llama.cpp (Windows) runtime in the availability list. Select the button to Download and Install.
  3. After the installation completes, configure LM Studio to use this runtime by default by selecting CUDA 12 llama.cpp (Windows) in the Default Selections dropdown.
  4. For the final steps in optimizing CUDA execution, load a model in LM Studio and enter the Settings menu by clicking the gear icon to the left of the loaded model.
  5. From the resulting dropdown menu, toggle “Flash Attention” to be on and offload all model layers onto the GPU by dragging the “GPU Offload” slider to the right.

Once these features are enabled and configured, the local setup is ready to run NVIDIA GPU-accelerated inference.

LM Studio supports model presets, a range of quantization formats and developer controls like tool_choice for fine-tuned inference. For those looking to contribute, the llama.cpp GitHub repository is actively maintained and continues to evolve with community- and NVIDIA-driven performance enhancements.

Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NVIDIA NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Follow NVIDIA Workstation on LinkedIn and X.

Cadence Taps NVIDIA Blackwell to Accelerate AI-Driven Engineering Design and Scientific Simulation

A new supercomputer offered by Cadence, a leading provider of technology for electronic design automation, is poised to support a suite of engineering design and life sciences applications accelerated by NVIDIA Blackwell systems and NVIDIA CUDA-X software libraries.

Available to deploy in the cloud and on premises, the Millennium M2000 Supercomputer features NVIDIA HGX B200 systems and NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. Combined with optimized software, the supercomputer delivers up to 80x higher performance for electronic design automation, system design and life sciences workloads compared to its predecessor, a CPU-based system.

With this boost in computational capability, engineers can run massive simulations to drive breakthroughs in the design and development of autonomous machines, drug molecules, semiconductors, data centers and more.

Anirudh Devgan, president and CEO of Cadence, discussed the collaboration with NVIDIA founder and CEO Jensen Huang onstage at CadenceLIVE, taking place today in Santa Clara, California.

“This is years in the making,” Devgan said during the conversation with Huang. “It’s a combination of advancement on the hardware and system side by NVIDIA — and then, of course, we have to rewrite our software to take advantage of that.”

The pair discussed how NVIDIA and Cadence are working together on AI factories, digital twins and agentic AI.

“The work that we’re doing together recognizes that there’s a whole new type of factory that’s necessary. We call them AI factories,” Huang said. “AI is going to infuse into every single aspect of everything we do. Every company will be run better because of AI, or they’ll build better products because of AI.”

Huang also announced that NVIDIA plans to purchase 10 Millennium Supercomputer systems based on the NVIDIA GB200 NVL72 platform to accelerate the company’s chip design workflows.

“This is a big deal for us,” he said. “We started building our data center to get ready for it.”

Enabling Intelligent Design Across Industries 

The Millennium Supercomputer harnesses accelerated software from NVIDIA and Cadence for applications including circuit simulation, computational fluid dynamics, data center design and molecular design.

Cadence Millennium M2000 Supercomputer

With the supercomputer’s optimized hardware and AI software, engineers and researchers can build more complex, detailed simulations that are capable of delivering more accurate insights to enable faster silicon, systems and drug development.

Through this collaboration, Cadence and NVIDIA are solving key design challenges with diverse applications across industries — for example, simulating thermal dynamics for chip design, fluid dynamics for aerospace applications and molecular dynamics for pharmaceutical research.

NVIDIA engineering teams used Cadence Palladium emulation platforms and Protium prototyping platforms to support design verification and chip bring-up workflows for the development of NVIDIA Blackwell.

Cadence used NVIDIA Grace Blackwell-accelerated systems to calculate the fluid dynamics at work when an aircraft takes off and lands. Using NVIDIA GB200 Grace Blackwell Superchips and the Cadence Fidelity CFD Platform, Cadence ran highly complex simulations in under 24 hours that would take several days to complete on a CPU cluster with hundreds of thousands of cores.

Cadence also used NVIDIA Omniverse application programming interfaces to visualize these intricate fluid dynamics.

Computational fluid dynamics simulation on the wing and engine of an airplane
NVIDIA Blackwell accelerates computer-aided engineering software by orders of magnitude, enabling complex simulations of fluid dynamics for the aerospace industry.

The company has integrated NVIDIA BioNeMo NIM microservices into Orion, Cadence’s molecular design platform — and NVIDIA Llama Nemotron reasoning models into the Cadence JedAI Platform.

Cadence has also adopted the NVIDIA Omniverse Blueprint for AI factory digital twins. Connected to the Cadence Reality Digital Twin Platform, the blueprint enables engineering teams to test and optimize power, cooling and networking in an AI factory with physically based simulations — long before construction starts in the real world. With these capabilities, engineers can make faster configuration decisions and future-proof the next generation of AI factories.

Learn more about the collaboration between NVIDIA and Cadence and watch this NVIDIA GTC session on advancing physics-based simulation technology for AI factory design.

Images courtesy of Cadence.

NVIDIA’s Rama Akkiraju on How AI Platform Architects Help Bridge Business Vision and Technical Execution

Enterprises across industries are exploring AI to rethink problem-solving and redefine business processes. But making these ventures successful requires the right infrastructure, such as AI factories, which allow businesses to convert data into tokens and outcomes.

Rama Akkiraju, vice president of IT for AI and machine learning at NVIDIA, joined the AI Podcast to discuss how enterprises can build the right foundations for AI success.

Drawing on over two decades of experience in the field, Akkiraju provided her perspective on AI’s evolution, from perception AI to generative AI to agentic AI, which allows systems to reason, plan and act autonomously, as well as physical AI, which enables autonomous machines to act in the real world.

What’s striking, Akkiraju pointed out, is the acceleration in the technology’s evolution: the shift from perception to generative AI took about 30 years, but the leap to agentic AI happened in just two. She also emphasized that AI is transforming software development by becoming an integral layer in application architecture — not just a tool.

“Treat AI like a new layer in the development stack, which is fundamentally reshaping the way we write software,” she said.

Akkiraju also spoke about the critical role of AI platform architects in designing and building AI infrastructure based on specific business needs. Enterprise implementations require complex stacks including data ingestion pipelines, vector databases, security controls and evaluation frameworks — and platform architects serve as the bridge between strategic business vision and technical execution.

Looking ahead, Akkiraju identified three trends shaping the future of AI infrastructure: the integration of specialized AI architecture into native enterprise systems, the emergence of domain-specific models and hardware optimized for particular use cases, and increasingly autonomous agentic systems requiring sophisticated memory and context management.

Time Stamps

1:27 – How Akkiraju’s team builds enterprise AI platforms, chatbots and copilots.

4:49 – The accelerated evolution from perception AI to generative AI to agentic AI.

11:22 – The comprehensive stack required for implementing AI in enterprise settings.

29:53 – Three major trends shaping the future of AI infrastructure.

You Might Also Like… 

NVIDIA’s Jacob Liberman on Bringing Agentic AI to Enterprises 

Jacob Liberman, director of product management at NVIDIA, explains how agentic AI bridges the gap between powerful AI models and practical enterprise applications, enabling intelligent multi-agent systems that reason, act and execute complex tasks with autonomy.

Isomorphic Labs Rethinks Drug Discovery With AI 

Isomorphic Labs’ leadership team discusses their AI-first approach to drug discovery, viewing biology as an information processing system and building generalizable AI models capable of learning from the entire universe of protein and chemical interactions.

AI Agents Take Digital Experiences to the Next Level in Gaming and Beyond

AI agents with advanced perception and cognition capabilities are making digital experiences more dynamic and personalized across industries. Inworld AI’s Chris Covert discusses how intelligent digital humans are reshaping interactive experiences, from gaming to healthcare.

How can India decarbonize its coal-dependent electric power system?

As the world struggles to reduce climate-warming carbon emissions, India has pledged to do its part, and its success is critical: In 2023, India was the third-largest carbon emitter worldwide. The Indian government has committed to having net-zero carbon emissions by 2070.

To fulfill that promise, India will need to decarbonize its electric power system, and that will be a challenge: Fully 60 percent of India’s electricity comes from coal-burning power plants that are extremely inefficient. To make matters worse, the demand for electricity in India is projected to more than double in the coming decade due to population growth and increased use of air conditioning, electric cars, and so on.

Despite having set an ambitious target, the Indian government has not proposed a plan for getting there. Indeed, as in other countries, in India the government continues to permit new coal-fired power plants to be built, and aging plants to be renovated and their retirement postponed.

To help India define an effective — and realistic — plan for decarbonizing its power system, key questions must be addressed. For example, India is already rapidly developing carbon-free solar and wind power generators. What opportunities remain for further deployment of renewable generation? Are there ways to retrofit or repurpose India’s existing coal plants that can substantially and affordably reduce their greenhouse gas emissions? And do the responses to those questions differ by region?

With funding from IHI Corp. through the MIT Energy Initiative (MITEI), Yifu Ding, a postdoc at MITEI, and her colleagues set out to answer those questions by first using machine learning to determine the efficiency of each of India’s current 806 coal plants, and then investigating the impacts that different decarbonization approaches would have on the mix of power plants and the price of electricity in 2035 under increasingly stringent caps on emissions.

First step: Develop the needed dataset

An important challenge in developing a decarbonization plan for India has been the lack of a complete dataset describing the current power plants in India. While other studies have generated plans, they haven’t taken into account the wide variation in the coal-fired power plants in different regions of the country. “So, we first needed to create a dataset covering and characterizing all of the operating coal plants in India. Such a dataset was not available in the existing literature,” says Ding.

Making a cost-effective plan for expanding the capacity of a power system requires knowing the efficiencies of all the power plants operating in the system. For this study, the researchers used as their metric the “station heat rate,” a standard measurement of the overall fuel efficiency of a given power plant. The station heat rate of each plant is needed in order to calculate the fuel consumption and power output of that plant as plans for capacity expansion are being developed.
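For reference (general background, not specific to this study's data), the station heat rate is the fuel energy a plant consumes per unit of electricity it delivers, so lower is better. Since 1 kWh equals 3,600 kJ, the heat rate converts directly into thermal efficiency:

```latex
\text{station heat rate} = \frac{\text{fuel energy input [kJ]}}{\text{net electricity generated [kWh]}},
\qquad
\eta_{\text{thermal}} = \frac{3600~\text{kJ/kWh}}{\text{station heat rate}}
```

For example, a plant with a heat rate of 10,800 kJ/kWh is only about 33 percent efficient, while one at 8,500 kJ/kWh exceeds 42 percent.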

Some of the Indian coal plants’ efficiencies were recorded before 2022, so Ding and her team used machine-learning models to predict the efficiencies of all the Indian coal plants operating now. In 2024, they created and posted online the first comprehensive, open-sourced dataset for all 806 power plants in 30 regions of India. The work won the 2024 MIT Open Data Prize. This dataset includes each plant’s power capacity, efficiency, age, load factor (a measure indicating how much of the time it operates), water stress, and more.

In addition, they categorized each plant according to its boiler design. A “supercritical” plant operates at a relatively high temperature and pressure, which makes it thermodynamically efficient, so it produces a lot of electricity for each unit of heat in the fuel. A “subcritical” plant runs at a lower temperature and pressure, so it’s less thermodynamically efficient. Most of the Indian coal plants are still subcritical plants running at low efficiency.

Next step: Investigate decarbonization options

Equipped with their detailed dataset covering all the coal power plants in India, the researchers were ready to investigate options for responding to tightening limits on carbon emissions. For that analysis, they turned to GenX, a modeling platform that was developed at MITEI to help guide decision-makers as they make investments and other plans for the future of their power systems.

Ding built a GenX model based on India’s power system in 2020, including details about each power plant and transmission network across 30 regions of the country. She also entered the coal price, potential resources for wind and solar power installations, and other attributes of each region. Based on the parameters given, the GenX model would calculate the lowest-cost combination of equipment and operating conditions that can fulfill a defined future level of demand while also meeting specified policy constraints, including limits on carbon emissions. The model and all data sources were also released as open-source tools for all viewers to use.
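Schematically, and omitting GenX's storage, transmission and unit-commitment detail, the model solves a least-cost planning problem of this form, where CAP_g is the installed capacity of generator g, gen_{g,t} its output in hour t, d_t the demand, e_g its emissions rate and E_cap the carbon cap (a simplified sketch, not the full formulation):

```latex
\min_{\mathrm{CAP},\,\mathrm{gen}}
  \sum_{g} c^{\mathrm{inv}}_{g}\,\mathrm{CAP}_{g}
  + \sum_{g,t} c^{\mathrm{op}}_{g}\,\mathrm{gen}_{g,t}
\quad \text{s.t.} \quad
  \sum_{g} \mathrm{gen}_{g,t} = d_{t} \;\;\forall t,
\qquad
  \mathrm{gen}_{g,t} \le \mathrm{CAP}_{g},
\qquad
  \sum_{g,t} e_{g}\,\mathrm{gen}_{g,t} \le E_{\mathrm{cap}}
```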

Ding and her colleagues — Dharik Mallapragada, a former principal research scientist at MITEI who is now an assistant professor of chemical and biomolecular engineering at NYU Tandon School of Engineering and a MITEI visiting scientist; and Robert J. Stoner, the founding director of the MIT Tata Center for Technology and Design and former deputy director of MITEI for science and technology — then used the model to explore options for meeting demands in 2035 under progressively tighter carbon emissions caps, taking into account region-to-region variations in the efficiencies of the coal plants, the price of coal, and other factors. They describe their methods and their findings in a paper published in the journal Energy for Sustainable Development.

In separate runs, they explored plans involving various combinations of current coal plants, possible new renewable plants, and more, to see their outcome in 2035. Specifically, they assumed the following four “grid-evolution scenarios”:

Baseline: The baseline scenario assumes limited onshore wind and solar photovoltaics development and excludes retrofitting options, representing a business-as-usual pathway.

High renewable capacity: This scenario calls for the development of onshore wind and solar power without any supply chain constraints.

Biomass co-firing: This scenario assumes the baseline limits on renewables, but here all coal plants — both subcritical and supercritical — can be retrofitted for “co-firing” with biomass, an approach in which clean-burning biomass replaces some of the coal fuel. Certain coal power plants in India already co-fire coal and biomass, so the technology is known.

Carbon capture and sequestration plus biomass co-firing: This scenario is based on the same assumptions as the biomass co-firing scenario with one addition: All of the high-efficiency supercritical plants are also retrofitted for carbon capture and sequestration (CCS), a technology that captures and removes carbon from a power plant’s exhaust stream and prepares it for permanent disposal. Thus far, CCS has not been used in India. This study specifies that 90 percent of all carbon in the power plant exhaust is captured.

Ding and her team investigated power system planning under each of those grid-evolution scenarios and four assumptions about carbon caps: no cap, which is the current situation; 1,000 million tons (Mt) of carbon dioxide (CO2) emissions, which reflects India’s announced targets for 2035; and two more-ambitious targets, namely 800 Mt and 500 Mt. For context, CO2 emissions from India’s power sector totaled about 1,100 Mt in 2021. (Note that transmission network expansion is allowed in all scenarios.)

Key findings

Running the model under the four scenarios and the assumed carbon caps generated a vast array of detailed numerical results. But taken together, the results show interesting trends in the cost-optimal mix of generating capacity and the cost of electricity under the different scenarios.

Even without any limits on carbon emissions, most new capacity additions will be wind and solar generators — the lowest-cost option for expanding India’s electricity-generation capacity. Indeed, this is observed to be the case now in India. However, the increasing demand for electricity will still require some new coal plants to be built. Model results show a 10 to 20 percent increase in coal plant capacity by 2035 relative to 2020.

Under the baseline scenario, renewables are expanded up to the maximum allowed under the assumptions, implying that more deployment would be economical. More coal capacity is built, and as the cap on emissions tightens, there is also investment in natural gas power plants, as well as batteries to help compensate for the now-large amount of intermittent solar and wind generation. When a 500 Mt cap on carbon is imposed, the cost of electricity generation is twice as high as it was with no cap.

The high renewable capacity scenario reduces the development of new coal capacity and produces the lowest electricity cost of the four scenarios. Under the most stringent cap — 500 Mt — onshore wind farms play an important role in bringing the cost down. “Otherwise, it’ll be very expensive to reach such stringent carbon constraints,” notes Ding. “Certain coal plants that remain run only a few hours per year, so are inefficient as well as financially unviable. But they still need to be there to support wind and solar.” She explains that other backup sources of electricity, such as batteries, are even more costly. 

The biomass co-firing scenario assumes the same capacity limit on renewables as in the baseline scenario, and the results are much the same, in part because the biomass replaces such a low fraction — just 20 percent — of the coal in the fuel feedstock. “This scenario would be most similar to the current situation in India,” says Ding. “It won’t bring down the cost of electricity, so we’re basically saying that adding this technology doesn’t contribute effectively to decarbonization.”

But CCS plus biomass co-firing is a different story. It also assumes the limits on renewables development, yet it is the second-best option in terms of reducing costs. Under the 500 Mt cap on CO2 emissions, retrofitting for both CCS and biomass co-firing produces a 22 percent reduction in the cost of electricity compared to the baseline scenario. In addition, as the carbon cap tightens, this option reduces the extent of deployment of natural gas plants and significantly improves overall coal plant utilization. That increased utilization “means that coal plants have switched from just meeting the peak demand to supplying part of the baseline load, which will lower the cost of coal generation,” explains Ding.

Some concerns

While those trends are enlightening, the analyses also uncovered some concerns for India to consider, in particular with the two approaches that yielded the lowest electricity costs.

The high renewables scenario is, Ding notes, “very ideal.” It assumes that there will be little limiting the development of wind and solar capacity, so there won’t be any issues with supply chains, which is unrealistic. More importantly, the analyses showed that implementing the high renewables approach would create uneven investment in renewables across the 30 regions. Resources for onshore and offshore wind farms are mainly concentrated in a few regions in western and southern India. “So all the wind farms would be put in those regions, near where the rich cities are,” says Ding. “The poorer cities on the eastern side, where the coal power plants are, will have little renewable investment.”

So the approach that’s best in terms of cost is not best in terms of social welfare, because it tends to benefit the rich regions more than the poor ones. “It’s like [the government will] need to consider the trade-off between energy justice and cost,” says Ding. Enacting state-level renewable generation targets could encourage a more even distribution of renewable capacity installation. Also, as transmission expansion is planned, coordination among power system operators and renewable energy investors in different regions could help in achieving the best outcome.

CCS plus biomass co-firing — the second-best option for reducing prices — solves the equity problem posed by high renewables, and it assumes a more realistic level of renewable power adoption. However, CCS hasn’t been used in India, so there is no precedent in terms of costs. The researchers therefore based their cost estimates on the cost of CCS in China and then increased the required investment by 10 percent, the “first-of-a-kind” index developed by the U.S. Energy Information Administration. Based on those costs and other assumptions, the researchers conclude that coal plants with CCS could come into use by 2035 when the carbon cap for power generation is less than 1,000 Mt.

But will CCS actually be implemented in India? While there’s been discussion about using CCS in heavy industry, the Indian government has not announced any plans for implementing the technology in coal-fired power plants. Indeed, India is currently “very conservative about CCS,” says Ding. “Some researchers say CCS won’t happen because it’s so expensive, and as long as there’s no direct use for the captured carbon, the only thing you can do is put it in the ground.” She adds, “It’s really controversial to talk about whether CCS will be implemented in India in the next 10 years.”

Ding and her colleagues hope that other researchers and policymakers — especially those working in developing countries — may benefit from gaining access to their datasets and learning about their methods. Based on their findings for India, she stresses the importance of understanding the detailed geographical situation in a country in order to design plans and policies that are both realistic and equitable.

Your Service Teams Just Got a New Coworker — and It’s a 15B-Parameter Super Genius Built by ServiceNow and NVIDIA

ServiceNow is accelerating enterprise AI with a new reasoning model built in partnership with NVIDIA — enabling AI agents that respond in real time, handle complex workflows and scale functions like IT, HR and customer service teams worldwide.

Unveiled today at ServiceNow’s Knowledge 2025 — where NVIDIA CEO and founder Jensen Huang joined ServiceNow chairman and CEO Bill McDermott during his keynote address — Apriel Nemotron 15B is compact, cost-efficient and tuned for action. It’s designed to drive the next step forward in enterprise large language models (LLMs).

Apriel Nemotron 15B was developed with NVIDIA NeMo, the open NVIDIA Llama Nemotron Post-Training Dataset and ServiceNow domain-specific data, and was trained on NVIDIA DGX Cloud running on Amazon Web Services (AWS).

The news follows the April release of the NVIDIA Llama Nemotron Ultra model, which harnesses the NVIDIA open dataset that ServiceNow used to build its Apriel Nemotron 15B model. Ultra is among the strongest open-source models at reasoning, including scientific reasoning, coding, advanced math and other agentic AI tasks.

Smaller Model, Bigger Impact

Apriel Nemotron 15B is engineered for reasoning — drawing inferences, weighing goals and navigating rules in real time. It’s smaller than some of the latest general-purpose LLMs that can run to more than a trillion parameters, which means it delivers faster responses and lower inference costs, while still packing enterprise-grade intelligence.

The model’s post-training took place on NVIDIA DGX Cloud hosted on AWS, tapping high-performance infrastructure to accelerate development. The result? An AI model that’s optimized not just for accuracy, but for speed, efficiency and scalability — key ingredients for powering AI agents that can support thousands of concurrent enterprise workflows.

A Closed Loop for Continuous Learning

Beyond the model itself, ServiceNow and NVIDIA are introducing a new data flywheel architecture — integrating ServiceNow’s Workflow Data Fabric with NVIDIA NeMo microservices, including NeMo Customizer and NeMo Evaluator.

This setup enables a closed-loop process that refines and improves AI performance by using workflow data to personalize responses and improve accuracy over time. Guardrails ensure customers are in control of how their data is used in a secure and compliant manner.

From Complexity to Clarity

In a keynote demo, ServiceNow is showing how these agentic models have been deployed in real enterprise scenarios, including with AstraZeneca, where AI agents will help employees resolve issues and make decisions with greater speed and precision — giving 90,000 hours back to employees.

“The Apriel Nemotron 15B model — developed by two of the most advanced enterprise AI companies — features purpose-built reasoning to power the next generation of intelligent AI agents,” said Jon Sigler, executive vice president of Platform and AI at ServiceNow. “This achieves what generic models can’t, combining real-time enterprise data, workflow context and advanced reasoning to help AI agents drive real productivity.”

“Together with ServiceNow, we’ve built an efficient, enterprise-ready model to fuel a new class of intelligent AI agents that can reason to boost team productivity,” added Kari Briski, vice president of generative AI software at NVIDIA. “By using the NVIDIA Llama Nemotron Post-Training Dataset and ServiceNow domain-specific data, Apriel Nemotron 15B delivers advanced reasoning capabilities in a smaller size, making it faster, more accurate and cost-effective to run.”

Scaling the AI Agent Era

The collaboration marks a shift in enterprise AI strategy. Enterprises are moving from static models to intelligent systems that evolve. It also marks another milestone in the partnership between ServiceNow and NVIDIA, pushing agentic AI forward across industries.

For businesses, this means faster resolution times, greater productivity and more responsive digital experiences. For technology leaders, it’s a model that fits today’s performance and cost requirements — and can scale as needs grow.

Availability

ServiceNow AI Agents, powered by Apriel Nemotron 15B, are expected to roll out following Knowledge 2025. The model will support ServiceNow’s Now LLM services and will become a key engine behind the company’s agentic AI offerings.

Learn more about the launch and how NVIDIA and ServiceNow are shaping the future of enterprise AI at Knowledge 2025. 

Hybrid AI model crafts smooth, high-quality videos in seconds

What would a behind-the-scenes look at a video generated by an artificial intelligence model be like? You might think the process is similar to stop-motion animation, where many images are created and stitched together, but that’s not quite the case for “diffusion models” like OpenAI’s Sora and Google’s Veo 2.

Instead of producing a video frame-by-frame (or “autoregressively”), these systems process the entire sequence at once. The resulting clip is often photorealistic, but the process is slow and doesn’t allow for on-the-fly changes. 

Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Adobe Research have now developed a hybrid approach, called “CausVid,” to create videos in seconds. Much like a quick-witted student learning from a well-versed teacher, a full-sequence diffusion model trains an autoregressive system to swiftly predict the next frame while ensuring high quality and consistency. CausVid’s student model can then generate clips from a simple text prompt, turning a photo into a moving scene, extending a video, or altering its creations with new inputs mid-generation.

This dynamic tool enables fast, interactive content creation, cutting a 50-step process into just a few actions. It can craft many imaginative and artistic scenes, such as a paper airplane morphing into a swan, woolly mammoths venturing through snow, or a child jumping in a puddle. Users can also make an initial prompt, like “generate a man crossing the street,” and then make follow-up inputs to add new elements to the scene, like “he writes in his notebook when he gets to the opposite sidewalk.”

The CSAIL researchers say that the model could be used for different video editing tasks, like helping viewers understand a livestream in a different language by generating a video that syncs with an audio translation. It could also help render new content in a video game or quickly produce training simulations to teach robots new tasks.

Tianwei Yin SM ’25, PhD ’25, a recently graduated student in electrical engineering and computer science and CSAIL affiliate, attributes the model’s strength to its mixed approach.

“CausVid combines a pre-trained diffusion-based model with autoregressive architecture that’s typically found in text generation models,” says Yin, co-lead author of a new paper about the tool. “This AI-powered teacher model can envision future steps to train a frame-by-frame system to avoid making rendering errors.”

Yin’s co-lead author, Qiang Zhang, is a research scientist at xAI and a former CSAIL visiting researcher. They worked on the project with Adobe Research scientists Richard Zhang, Eli Shechtman, and Xun Huang, and two CSAIL principal investigators: MIT professors Bill Freeman and Frédo Durand.

Caus(Vid) and effect

Many autoregressive models can create a video that’s initially smooth, but the quality tends to drop off later in the sequence. A clip of a person running might seem lifelike at first, but their legs begin to flail in unnatural directions, indicating frame-to-frame inconsistencies (also called “error accumulation”).

Error-prone video generation was common in prior causal approaches, which learned to predict frames one-by-one on their own. CausVid instead uses a high-powered diffusion model to teach a simpler system its general video expertise, enabling it to create smooth visuals, but much faster.
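The paper’s full recipe is more involved, but the core teacher-student idea can be sketched in a few lines of PyTorch: the student predicts each frame causally, from earlier frames only, while the teacher’s denoising of the full sequence supplies the training target. All interfaces below (denoise, the student’s call signature) are illustrative assumptions, not the authors’ code.

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, video, optimizer):
    """One hypothetical distillation step: the autoregressive student learns to
    match the full-sequence diffusion teacher, one frame at a time."""
    with torch.no_grad():
        target = teacher.denoise(video)      # teacher processes the whole clip at once

    loss = 0.0
    for t in range(1, video.shape[1]):       # causal: frame t depends on frames < t only
        pred = student(video[:, :t])         # student predicts the next frame
        loss = loss + F.mse_loss(pred, target[:, t])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference time only the student runs, which is why generation starts in seconds and frames can stream out one by one.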

CausVid displayed its video-making aptitude when researchers tested its ability to make high-resolution, 10-second-long videos. It outperformed baselines like “OpenSORA” and “MovieGen,” working up to 100 times faster than its competition while producing the most stable, high-quality clips.

Then, Yin and his colleagues tested CausVid’s ability to put out stable 30-second videos, where it also topped comparable models on quality and consistency. These results indicate that CausVid may eventually produce stable, hours-long videos, or even videos of indefinite length.

A subsequent study revealed that users preferred the videos generated by CausVid’s student model over its diffusion-based teacher.

“The speed of the autoregressive model really makes a difference,” says Yin. “Its videos look just as good as the teacher’s ones, but with less time to produce, the trade-off is that its visuals are less diverse.”

CausVid also excelled when tested on over 900 prompts using a text-to-video dataset, receiving the top overall score of 84.27. It boasted the best metrics in categories like imaging quality and realistic human actions, eclipsing state-of-the-art video generation models like “Vchitect” and “Gen-3.”

While an efficient step forward in AI video generation, CausVid may soon be able to design visuals even faster — perhaps instantly — with a smaller causal architecture. Yin says that if the model is trained on domain-specific datasets, it will likely create higher-quality clips for robotics and gaming.

Experts say that this hybrid system is a promising upgrade from diffusion models, which are currently bogged down by processing speeds. “[Diffusion models] are way slower than LLMs [large language models] or generative image models,” says Carnegie Mellon University Assistant Professor Jun-Yan Zhu, who was not involved in the paper. “This new work changes that, making video generation much more efficient. That means better streaming speed, more interactive applications, and lower carbon footprints.”

The team’s work was supported, in part, by the Amazon Science Hub, the Gwangju Institute of Science and Technology, Adobe, Google, the U.S. Air Force Research Laboratory, and the U.S. Air Force Artificial Intelligence Accelerator. CausVid will be presented at the Conference on Computer Vision and Pattern Recognition in June.