The sweet taste of a new idea

Behavioral economist Sendhil Mullainathan has never forgotten the pleasure he felt the first time he tasted a delicious, crisp-yet-gooey Levain cookie. He compares the experience to encountering a new idea.

“That hedonic pleasure is pretty much the same pleasure I get hearing a new idea, discovering a new way of looking at a situation, or thinking about something, getting stuck and then having a breakthrough. You get this kind of core basic reward,” says Mullainathan, the Peter de Florez Professor with dual appointments in the MIT departments of Economics and Electrical Engineering and Computer Science, and a principal investigator at the MIT Laboratory for Information and Decision Systems (LIDS).

Mullainathan’s love of new ideas, and by extension of going beyond the usual interpretation of a situation or problem by looking at it from many different angles, seems to have started very early. As a child in school, he says, the multiple-choice answers on tests all seemed to offer possibilities for being correct.

“They would say, ‘Here are three things. Which of these choices is the fourth?’ Well, I was like, ‘I don’t know.’ There are good explanations for all of them,” Mullainathan says. “While there’s a simple explanation that most people would pick, natively, I just saw things quite differently.”

Mullainathan says the way his mind works, and has always worked, is “out of phase” — that is, not in sync with how most people would readily pick the one correct answer on a test. He compares the way he thinks to “one of those videos where an army’s marching and one guy’s not in step, and everyone is thinking, what’s wrong with this guy?”

Luckily, Mullainathan says, “being out of phase is kind of helpful in research.”

And apparently so. Mullainathan has received a MacArthur “Genius Grant,” has been designated a “Young Global Leader” by the World Economic Forum, was named a “Top 100 thinker” by Foreign Policy magazine, was included in the “Smart List: 50 people who will change the world” by Wired magazine, and won the Infosys Prize, the largest monetary award in India recognizing excellence in science and research.

Another key aspect of who Mullainathan is as a researcher — his focus on financial scarcity — also dates back to his childhood. When he was about 10, just a few years after his family moved to the Los Angeles area from India, his father lost his job as an aerospace engineer because of a change in security clearance laws regarding immigrants. When his mother told him that without work, the family would have no money, he says he was incredulous.

“At first I thought, that can’t be right. It didn’t quite process,” he says. “So that was the first time I thought, there’s no floor. Anything can happen. It was the first time I really appreciated economic precarity.”

His family got by running a video store and then other small businesses, and Mullainathan made it to Cornell University, where he studied computer science, economics, and mathematics. Although he was doing a lot of math, he found himself drawn not to standard economics, but to the behavioral economics of an early pioneer in the field, Richard Thaler, who later won the Nobel Memorial Prize in Economic Sciences for his work. Behavioral economics brings the psychological, and often irrational, aspects of human behavior into the study of economic decision-making.

“It’s the non-math part of this field that’s fascinating,” says Mullainathan. “What makes it intriguing is that the math in economics isn’t working. The math is elegant, the theorems. But it’s not working because people are weird and complicated and interesting.”

Behavioral economics was so new as Mullainathan was graduating that he says Thaler advised him to study standard economics in graduate school and make a name for himself before concentrating on behavioral economics, “because it was so marginalized. It was considered super risky because it didn’t even fit a field,” Mullainathan says.

Unable to resist thinking about humanity’s quirks and complications, however, Mullainathan focused on behavioral economics, got his PhD at Harvard University, and says he then spent about 10 years studying people.

“I wanted to get the intuition that a good academic psychologist has about people. I was committed to understanding people,” he says.

As Mullainathan was formulating theories about why people make certain economic choices, he wanted to test these theories empirically.

In 2013, he published a paper in Science titled “Poverty Impedes Cognitive Function.” The research measured sugarcane farmers’ performance on intelligence tests in the days before their yearly harvest, when they were out of money, sometimes nearly to the point of starvation. In the controlled study, the same farmers took tests after their harvest was in and they had been paid for a successful crop — and they scored significantly higher.

Mullainathan says he is gratified that the research had far-reaching impact, and that those who make policy often take its premise into account.

“Policies as a whole are kind of hard to change,” he says, “but I do think it has created sensitivity at every level of the design process, that people realize that, for example, if I make a program for people living in economic precarity hard to sign up for, that’s really going to be a massive tax.”

To Mullainathan, the most important effect of the research was on individuals, an impact he saw in reader comments that appeared after the research was covered in The Guardian.

“Ninety percent of the people who wrote those comments said things like, ‘I was economically insecure at one point. This perfectly reflects what it felt like to be poor.’”

Such insights into the way outside influences affect personal lives could be among important advances made possible by algorithms, Mullainathan says.

“I think in the past era of science, science was done in big labs, and it was actioned into big things. I think the next age of science will be just as much about allowing individuals to rethink who they are and what their lives are like.”

Last year, Mullainathan came back to MIT (after having previously taught at MIT from 1998 to 2004) to focus on artificial intelligence and machine learning.

“I wanted to be in a place where I could have one foot in computer science and one foot in a top-notch behavioral economic department,” he says. “And really, if you just objectively said ‘what are the places that are A-plus in both,’ MIT is at the top of that list.”

While AI can automate tasks and systems, such automation of abilities humans already possess is “hard to get excited about,” he says. Computer science can instead be used to expand human abilities, an effort limited only by our creativity in asking questions.

“We should be asking, what capacity do you want expanded? How could we build an algorithm to help you expand that capacity? Computer science as a discipline has always been so fantastic at taking hard problems and building solutions,” he says. “If you have a capacity that you’d like to expand, that seems like a very hard computing challenge. Let’s figure out how to take that on.”

The sciences that “are very far from having hit the frontier that physics has hit,” like psychology and economics, could be on the verge of huge developments, Mullainathan says. “I fundamentally believe that the next generation of breakthroughs is going to come from the intersection of understanding of people and understanding of algorithms.”

He explains a possible use of AI in which a decision-maker, for example a judge or doctor, could have access to what their average decision would be related to a particular set of circumstances. Such an average would be potentially freer of day-to-day influences — such as a bad mood, indigestion, slow traffic on the way to work, or a fight with a spouse.

Mullainathan sums the idea up as “average-you is better than you. Imagine an algorithm that made it easy to see what you would normally do. And that’s not what you’re doing in the moment. You may have a good reason to be doing something different, but asking that question is immensely helpful.”
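
As a toy illustration of that idea (not any system Mullainathan has built), one could fit a simple model to a decision-maker's own past calls and use it to flag when a decision in the moment departs from their usual pattern. The case features, the logistic-regression choice, and the deviation threshold below are all hypothetical.

```python
# Hypothetical sketch of the "average-you" idea: estimate what a decision-maker
# would typically do for a case like this one, then flag when today's call deviates.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Past cases: a few numeric case features and the judge's past yes/no decisions.
past_features = rng.normal(size=(500, 4))
past_decisions = (past_features[:, 0] + 0.5 * past_features[:, 1]
                  + rng.normal(scale=0.5, size=500) > 0).astype(int)

# "Average-you": a model of how this judge usually decides cases like these.
average_you = LogisticRegression().fit(past_features, past_decisions)

# A new case, and the decision actually being contemplated today.
todays_case = rng.normal(size=(1, 4))
todays_decision = 1

typical_prob = average_you.predict_proba(todays_case)[0, 1]
print(f"Average-you would say 'yes' with probability {typical_prob:.2f}")
if abs(todays_decision - typical_prob) > 0.5:
    print("Today's call departs from your usual pattern -- worth asking why.")
```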

Going forward, Mullainathan will certainly keep working toward such new ideas, because to him, they offer such a delicious reward.

With AI, researchers predict the location of virtually any protein within a human cell

A protein located in the wrong part of a cell can contribute to several diseases, such as Alzheimer’s, cystic fibrosis, and cancer. But there are about 70,000 different proteins and protein variants in a single human cell, and since scientists can typically only test for a handful in one experiment, it is extremely costly and time-consuming to identify proteins’ locations manually.

A new generation of computational techniques seeks to streamline the process using machine-learning models that often leverage datasets containing thousands of proteins and their locations, measured across multiple cell lines. One of the largest such datasets is the Human Protein Atlas, which catalogs the subcellular behavior of over 13,000 proteins in more than 40 cell lines. But as enormous as it is, the Human Protein Atlas has only explored about 0.25 percent of all possible pairings of all proteins and cell lines within the database.

Now, researchers from MIT, Harvard University, and the Broad Institute of MIT and Harvard have developed a new computational approach that can efficiently explore the remaining uncharted space. Their method can predict the location of any protein in any human cell line, even when both protein and cell have never been tested before.

Their technique goes one step further than many AI-based methods by localizing a protein at the single-cell level, rather than as an averaged estimate across all the cells of a specific type. This single-cell localization could pinpoint a protein’s location in a specific cancer cell after treatment, for instance.

The researchers combined a protein language model with a special type of computer vision model to capture rich details about a protein and cell. In the end, the user receives an image of a cell with a highlighted portion indicating the model’s prediction of where the protein is located. Since a protein’s localization is indicative of its functional status, this technique could help researchers and clinicians more efficiently diagnose diseases or identify drug targets, while also enabling biologists to better understand how complex biological processes are related to protein localization.

“You could do these protein-localization experiments on a computer without having to touch any lab bench, hopefully saving yourself months of effort. While you would still need to verify the prediction, this technique could act like an initial screening of what to test for experimentally,” says Yitong Tseo, a graduate student in MIT’s Computational and Systems Biology program and co-lead author of a paper on this research.

Tseo is joined on the paper by co-lead author Xinyi Zhang, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and the Eric and Wendy Schmidt Center at the Broad Institute; Yunhao Bai of the Broad Institute; and senior authors Fei Chen, an assistant professor at Harvard and a member of the Broad Institute, and Caroline Uhler, the Andrew and Erna Viterbi Professor of Engineering in EECS and the MIT Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS). The research appears today in Nature Methods.

Collaborating models

Many existing protein prediction models can only make predictions based on the protein and cell data on which they were trained or are unable to pinpoint a protein’s location within a single cell.

To overcome these limitations, the researchers created a two-part method for prediction of unseen proteins’ subcellular location, called PUPS.

The first part utilizes a protein sequence model to capture the localization-determining properties of a protein and its 3D structure based on the chain of amino acids that forms it.

The second part incorporates an image inpainting model, which is designed to fill in missing parts of an image. This computer vision model looks at three stained images of a cell to gather information about the state of that cell, such as its type, individual features, and whether it is under stress.

PUPS joins the representations created by each model to predict where the protein is located within a single cell, using an image decoder to output a highlighted image that shows the predicted location.

“Different cells within a cell line exhibit different characteristics, and our model is able to understand that nuance,” Tseo says.

A user inputs the sequence of amino acids that form the protein and three cell stain images — one for the nucleus, one for the microtubules, and one for the endoplasmic reticulum. Then PUPS does the rest.
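
To make the division of labor concrete, here is a heavily simplified, hypothetical sketch of a PUPS-style pipeline in PyTorch: a sequence branch embeds the amino acids, an image branch encodes the three stains, and a decoder outputs a highlighted localization map. The toy layers, dimensions, and fusion scheme are illustrative assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class ToyPUPS(nn.Module):
    def __init__(self, vocab_size=21, embed_dim=64):
        super().__init__()
        # Protein branch: embed amino acids, average-pool into one vector.
        self.aa_embed = nn.Embedding(vocab_size, embed_dim)
        # Cell branch: a small CNN over the 3 stain channels (nucleus,
        # microtubules, endoplasmic reticulum).
        self.cell_encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, embed_dim, 3, padding=1), nn.ReLU(),
        )
        # Decoder: fuse both representations and output a per-pixel score map.
        self.decoder = nn.Conv2d(2 * embed_dim, 1, 1)

    def forward(self, aa_tokens, stain_images):
        protein_vec = self.aa_embed(aa_tokens).mean(dim=1)          # (B, D)
        cell_map = self.cell_encoder(stain_images)                  # (B, D, H, W)
        b, d, h, w = cell_map.shape
        protein_map = protein_vec[:, :, None, None].expand(b, d, h, w)
        fused = torch.cat([cell_map, protein_map], dim=1)
        return torch.sigmoid(self.decoder(fused))                   # highlighted image

model = ToyPUPS()
aa_tokens = torch.randint(0, 21, (1, 300))        # a 300-residue protein
stains = torch.rand(1, 3, 64, 64)                 # nucleus / microtubule / ER stains
localization_map = model(aa_tokens, stains)
print(localization_map.shape)                     # torch.Size([1, 1, 64, 64])
```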

A deeper understanding

The researchers employed a few tricks during the training process to teach PUPS how to combine information from each model in such a way that it can make an educated guess on the protein’s location, even if it hasn’t seen that protein before.

For instance, they assign the model a secondary task during training: to explicitly name the compartment of localization, like the cell nucleus. This is done alongside the primary inpainting task to help the model learn more effectively.

A good analogy might be a teacher who asks their students to draw all the parts of a flower in addition to writing their names. This extra step was found to help the model improve its general understanding of the possible cell compartments.
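
A hedged sketch of how such a secondary task can enter training: the total loss is the main inpainting (reconstruction) loss plus a smaller term that asks the model to name the compartment. The loss weighting and the classification head shown here are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def training_loss(pred_map, true_map, compartment_logits, compartment_label, aux_weight=0.1):
    inpainting_loss = F.mse_loss(pred_map, true_map)                      # primary task
    naming_loss = F.cross_entropy(compartment_logits, compartment_label)  # secondary task
    return inpainting_loss + aux_weight * naming_loss

# Toy tensors standing in for one training example.
pred_map = torch.rand(1, 1, 64, 64)
true_map = torch.rand(1, 1, 64, 64)
compartment_logits = torch.randn(1, 10)        # scores over 10 possible compartments
compartment_label = torch.tensor([3])          # e.g., index 3 = "nucleus"
print(training_loss(pred_map, true_map, compartment_logits, compartment_label))
```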

In addition, the fact that PUPS is trained on proteins and cell lines at the same time helps it develop a deeper understanding of where in a cell image proteins tend to localize.

PUPS can even understand, on its own, how different parts of a protein’s sequence contribute separately to its overall localization.

“Most other methods usually require you to have a stain of the protein first, so you’ve already seen it in your training data. Our approach is unique in that it can generalize across proteins and cell lines at the same time,” Zhang says.

Because PUPS can generalize to unseen proteins, it can capture changes in localization driven by unique protein mutations that aren’t included in the Human Protein Atlas.

The researchers verified that PUPS could predict the subcellular location of new proteins in unseen cell lines by conducting lab experiments and comparing the results. In addition, when compared to a baseline AI method, PUPS exhibited on average less prediction error across the proteins they tested.

In the future, the researchers want to enhance PUPS so the model can understand protein-protein interactions and make localization predictions for multiple proteins within a cell. In the longer term, they want to enable PUPS to make predictions for living human tissue, rather than cultured cells.

This research is funded by the Eric and Wendy Schmidt Center at the Broad Institute, the National Institutes of Health, the National Science Foundation, the Burroughs Wellcome Fund, the Searle Scholars Foundation, the Harvard Stem Cell Institute, the Merkin Institute, the Office of Naval Research, and the Department of Energy.

Study shows vision-language models can’t handle queries with negation words

Imagine a radiologist examining a chest X-ray from a new patient. She notices the patient has swelling in the tissue but does not have an enlarged heart. Looking to speed up diagnosis, she might use a vision-language machine-learning model to search for reports from similar patients.

But if the model mistakenly identifies reports with both conditions, the most likely diagnosis could be quite different: If a patient has tissue swelling and an enlarged heart, the condition is very likely to be cardiac related, but with no enlarged heart there could be several underlying causes.

In a new study, MIT researchers have found that vision-language models are extremely likely to make such a mistake in real-world situations because they don’t understand negation — words like “no” and “doesn’t” that specify what is false or absent. 

“Those negation words can have a very significant impact, and if we are just using these models blindly, we may run into catastrophic consequences,” says Kumail Alhamoud, an MIT graduate student and lead author of this study.

The researchers tested the ability of vision-language models to identify negation in image captions. The models often performed no better than a random guess. Building on those findings, the team created a dataset of images with corresponding captions that include negation words describing missing objects.

They show that retraining a vision-language model with this dataset leads to performance improvements when a model is asked to retrieve images that do not contain certain objects. It also boosts accuracy on multiple choice question answering with negated captions.

But the researchers caution that more work is needed to address the root causes of this problem. They hope their research alerts potential users to a previously unnoticed shortcoming that could have serious implications in high-stakes settings where these models are currently being used, from determining which patients receive certain treatments to identifying product defects in manufacturing plants.

“This is a technical paper, but there are bigger issues to consider. If something as fundamental as negation is broken, we shouldn’t be using large vision/language models in many of the ways we are using them now — without intensive evaluation,” says senior author Marzyeh Ghassemi, an associate professor in the Department of Electrical Engineering and Computer Science (EECS) and a member of the Institute for Medical Engineering and Science and the Laboratory for Information and Decision Systems.

Ghassemi and Alhamoud are joined on the paper by Shaden Alshammari, an MIT graduate student; Yonglong Tian of OpenAI; Guohao Li, a former postdoc at Oxford University; Philip H.S. Torr, a professor at Oxford; and Yoon Kim, an assistant professor of EECS and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. The research will be presented at the Conference on Computer Vision and Pattern Recognition.

Neglecting negation

Vision-language models (VLMs) are trained using huge collections of images and corresponding captions, which they learn to encode as sets of numbers, called vector representations. The models use these vectors to distinguish between different images.

A VLM utilizes two separate encoders, one for text and one for images, and the encoders learn to output similar vectors for an image and its corresponding text caption.
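
The following minimal sketch shows the dual-encoder idea in PyTorch: two separate encoders map images and captions into the same vector space, and a CLIP-style contrastive loss pulls matching pairs together. The tiny linear "encoders" are placeholders, not any specific production VLM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

image_encoder = nn.Linear(2048, 256)   # stands in for a vision backbone
text_encoder = nn.Linear(512, 256)     # stands in for a text transformer

image_features = torch.randn(8, 2048)  # a batch of 8 images (pre-extracted features)
caption_features = torch.randn(8, 512) # their 8 matching captions

img_vecs = F.normalize(image_encoder(image_features), dim=-1)
txt_vecs = F.normalize(text_encoder(caption_features), dim=-1)

# Similarity matrix: entry (i, j) compares image i with caption j.
logits = img_vecs @ txt_vecs.t() / 0.07
targets = torch.arange(8)              # the matching caption for image i is caption i
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
print(loss)
```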

“The captions express what is in the images — they are a positive label. And that is actually the whole problem. No one looks at an image of a dog jumping over a fence and captions it by saying ‘a dog jumping over a fence, with no helicopters,’” Ghassemi says.

Because the image-caption datasets don’t contain examples of negation, VLMs never learn to identify it.

To dig deeper into this problem, the researchers designed two benchmark tasks that test the ability of VLMs to understand negation.

For the first, they used a large language model (LLM) to re-caption images in an existing dataset by asking the LLM to think about related objects not in an image and write them into the caption. Then they tested models by prompting them with negation words to retrieve images that contain certain objects, but not others.

For the second task, they designed multiple choice questions that ask a VLM to select the most appropriate caption from a list of closely related options. These captions differ only by adding a reference to an object that doesn’t appear in the image or negating an object that does appear in the image.
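
A hedged sketch of how that multiple-choice test can be scored with such a dual-encoder model: embed the image once, embed each candidate caption, and pick the most similar one. The random "embeddings" below stand in for real encoder outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_image(image):
    return rng.normal(size=256)          # placeholder for the image encoder

def embed_caption(caption):
    return rng.normal(size=256)          # placeholder for the text encoder

candidates = [
    "a dog jumping over a fence",
    "a dog jumping over a fence, with a helicopter overhead",
    "a dog jumping over a fence, with no helicopters",   # negated caption
]
image_vec = embed_image("photo.jpg")
scores = [np.dot(image_vec, embed_caption(c)) for c in candidates]
print("model picks:", candidates[int(np.argmax(scores))])
# Affirmation bias means the model often scores captions as if the negation
# words were not there, so negated options are rarely preferred when they should be.
```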

The models often failed at both tasks, with image retrieval performance dropping by nearly 25 percent with negated captions. When it came to answering multiple choice questions, the best models only achieved about 39 percent accuracy, with several models performing at or even below random chance.

One reason for this failure is a shortcut the researchers call affirmation bias — VLMs ignore negation words and focus on objects in the images instead.

“This does not just happen for words like ‘no’ and ‘not.’ Regardless of how you express negation or exclusion, the models will simply ignore it,” Alhamoud says.

This was consistent across every VLM they tested.

“A solvable problem”

Since VLMs aren’t typically trained on image captions with negation, the researchers developed datasets with negation words as a first step toward solving the problem.

Using a dataset with 10 million image-text caption pairs, they prompted an LLM to propose related captions that specify what is excluded from the images, yielding new captions with negation words.
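
The recaptioning step might look roughly like the sketch below: for each caption, an LLM is asked to name a related object that is absent and to restate the caption so it mentions that absence. The call_llm function is a placeholder and the prompt wording is an assumption, not the one used in the study.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: in practice this would call whichever LLM the pipeline uses.
    return "A dog jumping over a fence, with no helicopters in the sky."

def make_negated_caption(original_caption: str) -> str:
    prompt = (
        "Here is an image caption: "
        f"'{original_caption}'. "
        "Name a related object that is NOT in the scene, then rewrite the caption "
        "so it naturally states that this object is absent."
    )
    return call_llm(prompt)

print(make_negated_caption("A dog jumping over a fence."))
```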

They had to be especially careful that these synthetic captions still read naturally, or it could cause a VLM to fail in the real world when faced with more complex captions written by humans.

They found that finetuning VLMs with their dataset led to performance gains across the board. It improved models’ image retrieval abilities by about 10 percent, while also boosting performance in the multiple-choice question answering task by about 30 percent.

“But our solution is not perfect. We are just recaptioning datasets, a form of data augmentation. We haven’t even touched how these models work, but we hope this is a signal that this is a solvable problem and others can take our solution and improve it,” Alhamoud says.

At the same time, he hopes their work encourages more users to think about the problem they want to use a VLM to solve and design some examples to test it before deployment.

In the future, the researchers could expand upon this work by teaching VLMs to process text and images separately, which may improve their ability to understand negation. In addition, they could develop additional datasets that include image-caption pairs for specific applications, such as health care.

How can India decarbonize its coal-dependent electric power system?

As the world struggles to reduce climate-warming carbon emissions, India has pledged to do its part, and its success is critical: In 2023, India was the third-largest carbon emitter worldwide. The Indian government has committed to having net-zero carbon emissions by 2070.

To fulfill that promise, India will need to decarbonize its electric power system, and that will be a challenge: Fully 60 percent of India’s electricity comes from coal-burning power plants that are extremely inefficient. To make matters worse, the demand for electricity in India is projected to more than double in the coming decade due to population growth and increased use of air conditioning, electric cars, and so on.

Despite having set an ambitious target, the Indian government has not proposed a plan for getting there. Indeed, as in other countries, in India the government continues to permit new coal-fired power plants to be built, and aging plants to be renovated and their retirement postponed.

To help India define an effective — and realistic — plan for decarbonizing its power system, key questions must be addressed. For example, India is already rapidly developing carbon-free solar and wind power generators. What opportunities remain for further deployment of renewable generation? Are there ways to retrofit or repurpose India’s existing coal plants that can substantially and affordably reduce their greenhouse gas emissions? And do the responses to those questions differ by region?

With funding from IHI Corp. through the MIT Energy Initiative (MITEI), Yifu Ding, a postdoc at MITEI, and her colleagues set out to answer those questions by first using machine learning to determine the efficiency of each of India’s current 806 coal plants, and then investigating the impacts that different decarbonization approaches would have on the mix of power plants and the price of electricity in 2035 under increasingly stringent caps on emissions.

First step: Develop the needed dataset

An important challenge in developing a decarbonization plan for India has been the lack of a complete dataset describing the current power plants in India. While other studies have generated plans, they haven’t taken into account the wide variation in the coal-fired power plants in different regions of the country. “So, we first needed to create a dataset covering and characterizing all of the operating coal plants in India. Such a dataset was not available in the existing literature,” says Ding.

Making a cost-effective plan for expanding the capacity of a power system requires knowing the efficiencies of all the power plants operating in the system. For this study, the researchers used as their metric the “station heat rate,” a standard measurement of the overall fuel efficiency of a given power plant. The station heat rate of each plant is needed in order to calculate the fuel consumption and power output of that plant as plans for capacity expansion are being developed.

Some of the Indian coal plants’ efficiencies were recorded before 2022, so Ding and her team used machine-learning models to predict the efficiencies of all the Indian coal plants operating now. In 2024, they created and posted online the first comprehensive, open-sourced dataset for all 806 power plants in 30 regions of India. The work won the 2024 MIT Open Data Prize. This dataset includes each plant’s power capacity, efficiency, age, load factor (a measure indicating how much of the time it operates), water stress, and more.
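
A hedged sketch of that gap-filling step: fit a regression model on plants whose station heat rates are known, predict heat rates for the rest from attributes such as capacity, age, and load factor, and convert a predicted heat rate into fuel consumption. The gradient-boosting choice and the synthetic numbers are illustrative assumptions, not the study's exact model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Columns: capacity (MW), age (years), load factor (0-1).
known_plants = rng.uniform([100, 0, 0.2], [1000, 40, 0.9], size=(200, 3))
known_heat_rates = (2200 + 10 * known_plants[:, 1]
                    - 0.3 * known_plants[:, 0] * known_plants[:, 2]
                    + rng.normal(scale=50, size=200))  # kcal per kWh, toy relationship

model = GradientBoostingRegressor().fit(known_plants, known_heat_rates)

unmeasured_plant = np.array([[500, 25, 0.6]])
heat_rate = model.predict(unmeasured_plant)[0]            # kcal/kWh
annual_generation_kwh = 500 * 1000 * 0.6 * 8760           # capacity * load factor * hours
annual_fuel_kcal = heat_rate * annual_generation_kwh      # fuel needed for that output
print(f"Predicted heat rate: {heat_rate:.0f} kcal/kWh, "
      f"annual fuel: {annual_fuel_kcal:.2e} kcal")
```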

In addition, they categorized each plant according to its boiler design. A “supercritical” plant operates at a relatively high temperature and pressure, which makes it thermodynamically efficient, so it produces a lot of electricity for each unit of heat in the fuel. A “subcritical” plant runs at a lower temperature and pressure, so it’s less thermodynamically efficient. Most of the Indian coal plants are still subcritical plants running at low efficiency.

Next step: Investigate decarbonization options

Equipped with their detailed dataset covering all the coal power plants in India, the researchers were ready to investigate options for responding to tightening limits on carbon emissions. For that analysis, they turned to GenX, a modeling platform that was developed at MITEI to help guide decision-makers as they make investments and other plans for the future of their power systems.

Ding built a GenX model based on India’s power system in 2020, including details about each power plant and transmission network across 30 regions of the country. She also entered the coal price, potential resources for wind and solar power installations, and other attributes of each region. Based on the parameters given, the GenX model would calculate the lowest-cost combination of equipment and operating conditions that can fulfill a defined future level of demand while also meeting specified policy constraints, including limits on carbon emissions. The model and all data sources were also released as open-source tools for all viewers to use.
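
The kind of least-cost question GenX answers can be illustrated, far more simply than the real model, as a small linear program: choose how much generation to source from each technology to meet demand at minimum cost while staying under a carbon cap. All costs, emission factors, and limits below are made-up numbers for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

techs = ["coal", "solar", "wind", "gas"]
cost = np.array([60.0, 40.0, 45.0, 70.0])        # $ per MWh, illustrative
emissions = np.array([1.0, 0.0, 0.0, 0.45])      # tCO2 per MWh, illustrative
max_gen = np.array([800.0, 300.0, 250.0, 400.0]) # MWh available from each tech

demand = 1000.0       # MWh that must be served
carbon_cap = 350.0    # tCO2 allowed

result = linprog(
    c=cost,
    A_ub=[emissions],  b_ub=[carbon_cap],   # keep emissions below the cap
    A_eq=[np.ones(4)], b_eq=[demand],       # generation meets demand
    bounds=list(zip(np.zeros(4), max_gen)),
)
for tech, gen in zip(techs, result.x):
    print(f"{tech:>5}: {gen:6.1f} MWh")
print(f"total cost: ${result.fun:,.0f}")
```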

Ding and her colleagues — Dharik Mallapragada, a former principal research scientist at MITEI who is now an assistant professor of chemical and biomolecular engineering at NYU Tandon School of Engineering and a MITEI visiting scientist; and Robert J. Stoner, the founding director of the MIT Tata Center for Technology and Design and former deputy director of MITEI for science and technology — then used the model to explore options for meeting demands in 2035 under progressively tighter carbon emissions caps, taking into account region-to-region variations in the efficiencies of the coal plants, the price of coal, and other factors. They describe their methods and their findings in a paper published in the journal Energy for Sustainable Development.

In separate runs, they explored plans involving various combinations of current coal plants, possible new renewable plants, and more, to see their outcome in 2035. Specifically, they assumed the following four “grid-evolution scenarios”:

Baseline: The baseline scenario assumes limited onshore wind and solar photovoltaics development and excludes retrofitting options, representing a business-as-usual pathway.

High renewable capacity: This scenario calls for the development of onshore wind and solar power without any supply chain constraints.

Biomass co-firing: This scenario assumes the baseline limits on renewables, but here all coal plants — both subcritical and supercritical — can be retrofitted for “co-firing” with biomass, an approach in which clean-burning biomass replaces some of the coal fuel. Certain coal power plants in India already co-fire coal and biomass, so the technology is known.

Carbon capture and sequestration plus biomass co-firing: This scenario is based on the same assumptions as the biomass co-firing scenario with one addition: All of the high-efficiency supercritical plants are also retrofitted for carbon capture and sequestration (CCS), a technology that captures and removes carbon from a power plant’s exhaust stream and prepares it for permanent disposal. Thus far, CCS has not been used in India. This study specifies that 90 percent of all carbon in the power plant exhaust is captured.

Ding and her team investigated power system planning under each of those grid-evolution scenarios and four assumptions about carbon caps: no cap, which is the current situation; 1,000 million tons (Mt) of carbon dioxide (CO2) emissions, which reflects India’s announced targets for 2035; and two more-ambitious targets, namely 800 Mt and 500 Mt. For context, CO2 emissions from India’s power sector totaled about 1,100 Mt in 2021. (Note that transmission network expansion is allowed in all scenarios.)

Key findings

Running the four grid-evolution scenarios under the different carbon caps generated a vast array of detailed numerical results. But taken together, the results show interesting trends in the cost-optimal mix of generating capacity and in the cost of electricity.

Even without any limits on carbon emissions, most new capacity additions will be wind and solar generators — the lowest-cost option for expanding India’s electricity-generation capacity. Indeed, this is observed to be the case now in India. However, the increasing demand for electricity will still require some new coal plants to be built. Model results show a 10 to 20 percent increase in coal plant capacity by 2035 relative to 2020.

Under the baseline scenario, renewables are expanded up to the maximum allowed under the assumptions, implying that more deployment would be economical. More coal capacity is built, and as the cap on emissions tightens, there is also investment in natural gas power plants, as well as batteries to help compensate for the now-large amount of intermittent solar and wind generation. When a 500 Mt cap on carbon is imposed, the cost of electricity generation is twice as high as it was with no cap.

The high renewable capacity scenario reduces the development of new coal capacity and produces the lowest electricity cost of the four scenarios. Under the most stringent cap — 500 Mt — onshore wind farms play an important role in bringing the cost down. “Otherwise, it’ll be very expensive to reach such stringent carbon constraints,” notes Ding. “Certain coal plants that remain run only a few hours per year, so are inefficient as well as financially unviable. But they still need to be there to support wind and solar.” She explains that other backup sources of electricity, such as batteries, are even more costly. 

The biomass co-firing scenario assumes the same capacity limit on renewables as in the baseline scenario, and the results are much the same, in part because the biomass replaces such a low fraction — just 20 percent — of the coal in the fuel feedstock. “This scenario would be most similar to the current situation in India,” says Ding. “It won’t bring down the cost of electricity, so we’re basically saying that adding this technology doesn’t contribute effectively to decarbonization.”

But CCS plus biomass co-firing is a different story. It also assumes the limits on renewables development, yet it is the second-best option in terms of reducing costs. Under the 500 Mt cap on CO2 emissions, retrofitting for both CCS and biomass co-firing produces a 22 percent reduction in the cost of electricity compared to the baseline scenario. In addition, as the carbon cap tightens, this option reduces the extent of deployment of natural gas plants and significantly improves overall coal plant utilization. That increased utilization “means that coal plants have switched from just meeting the peak demand to supplying part of the baseline load, which will lower the cost of coal generation,” explains Ding.

Some concerns

While those trends are enlightening, the analyses also uncovered some concerns for India to consider, particularly with the two approaches that yielded the lowest electricity costs.

The high renewables scenario is, Ding notes, “very ideal.” It assumes that there will be little limiting the development of wind and solar capacity, so there won’t be any issues with supply chains, which is unrealistic. More importantly, the analyses showed that implementing the high renewables approach would create uneven investment in renewables across the 30 regions. Resources for onshore and offshore wind farms are mainly concentrated in a few regions in western and southern India. “So all the wind farms would be put in those regions, near where the rich cities are,” says Ding. “The poorer cities on the eastern side, where the coal power plants are, will have little renewable investment.”

So the approach that’s best in terms of cost is not best in terms of social welfare, because it tends to benefit the rich regions more than the poor ones. “It’s like [the government will] need to consider the trade-off between energy justice and cost,” says Ding. Enacting state-level renewable generation targets could encourage a more even distribution of renewable capacity installation. Also, as transmission expansion is planned, coordination among power system operators and renewable energy investors in different regions could help in achieving the best outcome.

CCS plus biomass co-firing — the second-best option for reducing prices — solves the equity problem posed by high renewables, and it assumes a more realistic level of renewable power adoption. However, CCS hasn’t been used in India, so there is no precedent in terms of costs. The researchers therefore based their cost estimates on the cost of CCS in China and then increased the required investment by 10 percent, the “first-of-a-kind” index developed by the U.S. Energy Information Administration. Based on those costs and other assumptions, the researchers conclude that coal plants with CCS could come into use by 2035 when the carbon cap for power generation is less than 1,000 Mt.

But will CCS actually be implemented in India? While there’s been discussion about using CCS in heavy industry, the Indian government has not announced any plans for implementing the technology in coal-fired power plants. Indeed, India is currently “very conservative about CCS,” says Ding. “Some researchers say CCS won’t happen because it’s so expensive, and as long as there’s no direct use for the captured carbon, the only thing you can do is put it in the ground.” She adds, “It’s really controversial to talk about whether CCS will be implemented in India in the next 10 years.”

Ding and her colleagues hope that other researchers and policymakers — especially those working in developing countries — may benefit from gaining access to their datasets and learning about their methods. Based on their findings for India, she stresses the importance of understanding the detailed geographical situation in a country in order to design plans and policies that are both realistic and equitable.

Hybrid AI model crafts smooth, high-quality videos in seconds

What would a behind-the-scenes look at a video generated by an artificial intelligence model be like? You might think the process is similar to stop-motion animation, where many images are created and stitched together, but that’s not quite the case for “diffusion models” like OpenAI’s SORA and Google’s VEO 2.

Instead of producing a video frame-by-frame (or “autoregressively”), these systems process the entire sequence at once. The resulting clip is often photorealistic, but the process is slow and doesn’t allow for on-the-fly changes. 

Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Adobe Research have now developed a hybrid approach, called “CausVid,” to create videos in seconds. Much like a quick-witted student learning from a well-versed teacher, a full-sequence diffusion model trains an autoregressive system to swiftly predict the next frame while ensuring high quality and consistency. CausVid’s student model can then generate clips from a simple text prompt, turning a photo into a moving scene, extending a video, or altering its creations with new inputs mid-generation.
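
A toy-scale, hypothetical sketch of that teacher-student setup: a full-sequence "teacher" produces a reference clip, and a strictly causal "student" learns to predict each next frame from the frames so far so that it matches the teacher. The tiny networks and plain regression loss are stand-ins for the actual distillation recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

frames, height, width = 8, 16, 16

def teacher_generate(prompt_embedding):
    # Placeholder for the slow full-sequence diffusion model.
    torch.manual_seed(0)
    return torch.rand(frames, height * width)

class CausalStudent(nn.Module):
    def __init__(self):
        super().__init__()
        self.next_frame = nn.Linear(height * width, height * width)

    def forward(self, video):
        # Predict frame t+1 from frame t only -- strictly causal.
        return self.next_frame(video[:-1])

student = CausalStudent()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
target_video = teacher_generate(prompt_embedding=None)

for step in range(100):
    predicted_next = student(target_video)               # frames 1..T-1 predicted
    loss = F.mse_loss(predicted_next, target_video[1:])  # match the teacher's frames
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print("final distillation loss:", loss.item())
```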

This dynamic tool enables fast, interactive content creation, cutting a 50-step process into just a few actions. It can craft many imaginative and artistic scenes, such as a paper airplane morphing into a swan, woolly mammoths venturing through snow, or a child jumping in a puddle. Users can also make an initial prompt, like “generate a man crossing the street,” and then make follow-up inputs to add new elements to the scene, like “he writes in his notebook when he gets to the opposite sidewalk.”

The CSAIL researchers say that the model could be used for different video editing tasks, like helping viewers understand a livestream in a different language by generating a video that syncs with an audio translation. It could also help render new content in a video game or quickly produce training simulations to teach robots new tasks.

Tianwei Yin SM ’25, PhD ’25, a recently graduated student in electrical engineering and computer science and CSAIL affiliate, attributes the model’s strength to its mixed approach.

“CausVid combines a pre-trained diffusion-based model with autoregressive architecture that’s typically found in text generation models,” says Yin, co-lead author of a new paper about the tool. “This AI-powered teacher model can envision future steps to train a frame-by-frame system to avoid making rendering errors.”

Yin’s co-lead author, Qiang Zhang, is a research scientist at xAI and a former CSAIL visiting researcher. They worked on the project with Adobe Research scientists Richard Zhang, Eli Shechtman, and Xun Huang, and two CSAIL principal investigators: MIT professors Bill Freeman and Frédo Durand.

Caus(Vid) and effect

Many autoregressive models can create a video that’s initially smooth, but the quality tends to drop off later in the sequence. A clip of a person running might seem lifelike at first, but their legs begin to flail in unnatural directions, indicating frame-to-frame inconsistencies (also called “error accumulation”).

Error-prone video generation was common in prior causal approaches, which learned to predict frames one-by-one on their own. CausVid instead uses a high-powered diffusion model to teach a simpler system its general video expertise, enabling it to create smooth visuals, but much faster.

CausVid displayed its video-making aptitude when researchers tested its ability to make high-resolution, 10-second-long videos. It outperformed baselines like “OpenSORA” and “MovieGen,” working up to 100 times faster than its competition while producing the most stable, high-quality clips.

Then, Yin and his colleagues tested CausVid’s ability to put out stable 30-second videos, where it also topped comparable models on quality and consistency. These results indicate that CausVid may eventually produce stable videos that last hours, or even run indefinitely.

A subsequent study revealed that users preferred the videos generated by CausVid’s student model over its diffusion-based teacher.

“The speed of the autoregressive model really makes a difference,” says Yin. “Its videos look just as good as the teacher’s ones, but with less time to produce, the trade-off is that its visuals are less diverse.”

CausVid also excelled when tested on over 900 prompts using a text-to-video dataset, receiving the top overall score of 84.27. It boasted the best metrics in categories like imaging quality and realistic human actions, eclipsing state-of-the-art video generation models like “Vchitect” and “Gen-3.”

While an efficient step forward in AI video generation, CausVid may soon be able to design visuals even faster — perhaps instantly — with a smaller causal architecture. Yin says that if the model is trained on domain-specific datasets, it will likely create higher-quality clips for robotics and gaming.

Experts say that this hybrid system is a promising upgrade from diffusion models, which are currently bogged down by processing speeds. “[Diffusion models] are way slower than LLMs [large language models] or generative image models,” says Carnegie Mellon University Assistant Professor Jun-Yan Zhu, who was not involved in the paper. “This new work changes that, making video generation much more efficient. That means better streaming speed, more interactive applications, and lower carbon footprints.”

The team’s work was supported, in part, by the Amazon Science Hub, the Gwangju Institute of Science and Technology, Adobe, Google, the U.S. Air Force Research Laboratory, and the U.S. Air Force Artificial Intelligence Accelerator. CausVid will be presented at the Conference on Computer Vision and Pattern Recognition in June.

Q&A: A roadmap for revolutionizing health care through data-driven innovation

What if data could help predict a patient’s prognosis, streamline hospital operations, or optimize human resources in medicine? A book fresh off the shelves, “The Analytics Edge in Healthcare,” shows that this is already happening, and demonstrates how to scale it. 

Authored by Dimitris Bertsimas, MIT’s vice provost for open learning, along with two of Bertsimas’ former students — Agni Orfanoudaki PhD ’21, associate professor of operations management at University of Oxford’s Saïd Business School, and Holly Wiberg PhD ’22, assistant professor of public policy and operations research at Carnegie Mellon University — the book provides a practical introduction to the field of health-care analytics. With an emphasis on real-world applications, the first part of the book establishes technical foundations — spanning machine learning and optimization — while the second part of the book presents integrated case studies that cover various clinical specialties and problem types using descriptive, predictive, and prescriptive analytics. 

Part of a broader series, “The Analytics Edge in Healthcare” demonstrates how to leverage data and models to make better decisions within the healthcare sector, while its predecessor, “The Analytics Edge,” dives into the science of using data to build models, improve decisions, and add value to institutions and individuals. 

Bertsimas, who is also the associate dean of business analytics and the Boeing Leaders for Global Operations Professor of Management at the MIT Sloan School of Management, is the innovator behind class 15.071 (The Analytics Edge), a course on MIT Open Learning’s MITx that has attracted hundreds of thousands of online learners and served as the inspiration behind the book series. Bertsimas took a break from research and his work at MIT Open Learning to discuss how the field of analytics is transforming the health care system and share some surprising ways analytics are already being used in hospitals. 

Q: How is the field of analytics changing the way hospitals provide care and manage their operations?

A: As an academic, I’ve always aspired to educate, write publications, and utilize what we do in practice. Therefore, I founded Holistic Hospital Optimization (H2O) with the goal of optimizing hospital operations with machine learning to improve patient care. We have developed a variety of tools at MIT and implemented them at hospitals around the world. For example, we manage patients’ length of stay and their deterioration indexes (a computerized tool that predicts a patient’s risk of clinical deterioration); we manage nurse optimization and how hospitals can allocate human resources appropriately; and we optimize blocks for surgeries. This is the beginning of a change where analytics and AI methods are now being utilized quite widely. My hope would be that this work and this book will accelerate the effect of using these tools.

Additionally, I have taught a nine-lecture course twice with Agni and Holly at the Hartford Hospital System, where I realized that these analytics methods — which are typically not taught in medical schools — can be demonstrated for health care practitioners, including physicians, nurses, and administrators. To have an impact, you need to have appropriate methods, implement them, and apply them, but you also need to educate people on how to use them. This links well with my role at Open Learning, where our objective is to educate learners globally. In fact, Open Learning is launching this fall Universal AI, a dynamic online learning experience that provides comprehensive knowledge on artificial intelligence, preparing a global audience of learners for employment in our rapidly evolving job market. 

Q: What are some surprising ways analytics are being used in health care that most people wouldn’t expect?

A: Using analytics, we have reduced patients’ length of stay at Hartford Hospital from 5.67 days to five days. We have an algorithm that predicts patients’ probability of being released; therefore, doctors prioritize the patients with the highest probability, preparing them for discharge. This means that the hospital can treat far more patients, and the patients stay in the hospital less time.

Furthermore, when hospitals saw an increase in nurse turnover during the Covid-19 pandemic, we developed an analytics system that takes into account equity and fairness and decreases overtime costs, giving preferred slots to nurses and decreasing overall turnover substantially. These are just two examples; there are many others where an analytical perspective to health care and medicine has made a material difference. 

Q: Looking ahead, how do you see artificial intelligence shaping the future of health care?

A: In a very significant way — we use machine learning to make better predictions, but generative AI can explain them. I already see a movement in that direction. It’s really the evolution of AI that made this possible, and it is exciting. It’s also important for the world, because of its capabilities to improve care and save lives. 

For example, through our program at the Hartford Hospital System, we discovered that a patient was getting worse and predicted through analytics that they would get even worse. After our prediction, the doctors examined the patient more closely and discovered the patient had an early case of sepsis, a life-threatening condition in which the body responds improperly to an infection. If we hadn’t detected sepsis earlier, the patient might have died. This made an actual difference in saving a person’s life. 

Q: If you had to describe “The Analytics Edge in Healthcare” in one or two words, what would they be, and why? 

A: The book is a phase transition in health care because it is capable of affecting the health-care sector in a way that has not been done before. The book really outlines my work in health care and its applications in the last decade.

New tool evaluates progress in reinforcement learning

If there’s one thing that characterizes driving in any major city, it’s the constant stop-and-go as traffic lights change and as cars and trucks merge and separate and turn and park. This constant stopping and starting is extremely inefficient, driving up the amount of pollution, including greenhouse gases, that gets emitted per mile of driving. 

One approach to counter this is known as eco-driving, which can be installed as a control system in autonomous vehicles to improve their efficiency.

How much of a difference could that make? Would the impact of such systems in reducing emissions be worth the investment in the technology? Addressing such questions is one of a broad category of optimization problems that have been difficult for researchers to address, and it has been difficult to test the solutions they come up with. These are problems that involve many different agents, such as the many different kinds of vehicles in a city, and different factors that influence their emissions, including speed, weather, road conditions, and traffic light timing.

“We got interested a few years ago in the question: Is there something that automated vehicles could do here in terms of mitigating emissions?” says Cathy Wu, the Thomas D. and Virginia W. Cabot Career Development Associate Professor in the Department of Civil and Environmental Engineering and the Institute for Data, Systems, and Society (IDSS) at MIT, and a principal investigator in the Laboratory for Information and Decision Systems. “Is it a drop in the bucket, or is it something to think about?” she wondered.

To address such a question involving so many components, the first requirement is to gather all available data about the system, from many sources. One is the layout of the network’s topology, Wu says, in this case a map of all the intersections in each city. Then there are U.S. Geological Survey data showing the elevations, to determine the grade of the roads. There are also data on temperature and humidity, data on the mix of vehicle types and ages, and on the mix of fuel types.

Eco-driving involves making small adjustments to minimize unnecessary fuel consumption. For example, as cars approach a traffic light that has turned red, “there’s no point in me driving as fast as possible to the red light,” she says. By just coasting, “I am not burning gas or electricity in the meantime.” If one car, such as an automated vehicle, slows down at the approach to an intersection, then the conventional, non-automated cars behind it will also be forced to slow down, so the impact of such efficient driving can extend far beyond just the car that is doing it.

That’s the basic idea behind eco-driving, Wu says. But to figure out the impact of such measures, “these are challenging optimization problems” involving many different factors and parameters, “so there is a wave of interest right now in how to solve hard control problems using AI.” 

The new benchmark system that Wu and her collaborators developed based on urban eco-driving, which they call “IntersectionZoo,” is intended to help address part of that need. The benchmark was described in detail in a paper presented at the 2025 International Conference on Learning Representations in Singapore.

Looking at approaches that have been used to address such complex problems, Wu says an important category of methods is multi-agent deep reinforcement learning (DRL), but a lack of adequate standard benchmarks to evaluate the results of such methods has hampered progress in the field.

The new benchmark is intended to address an important issue that Wu and her team identified two years ago, which is that with most existing deep reinforcement learning algorithms, when trained for one specific situation (e.g., one particular intersection), the result does not remain relevant when even small modifications are made, such as adding a bike lane or changing the timing of a traffic light, even when they are allowed to train for the modified scenario.

In fact, Wu points out, this problem of non-generalizability “is not unique to traffic,” she says. “It goes back down all the way to canonical tasks that the community uses to evaluate progress in algorithm design.” But because most such canonical tasks do not involve making modifications, “it’s hard to know if your algorithm is making progress on this kind of robustness issue, if we don’t evaluate for that.”

While there are many benchmarks that are currently used to evaluate algorithmic progress in DRL, she says, “this eco-driving problem features a rich set of characteristics that are important in solving real-world problems, especially from the generalizability point of view, and that no other benchmark satisfies.” This is why the 1 million data-driven traffic scenarios in IntersectionZoo uniquely position it to advance the progress in DRL generalizability.  As a result, “this benchmark adds to the richness of ways to evaluate deep RL algorithms and progress.”
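
A hedged illustration of the kind of generalization check such a benchmark enables: evaluate one trained policy across many variants of the task (different signal timings, lane counts, and so on) rather than only the scenario it was trained on. The toy "policy" and scenario model below are placeholders, not the IntersectionZoo API.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_episode(policy, scenario):
    # Placeholder environment: reward degrades as the scenario drifts from the
    # configuration the policy was trained on (green_time=30, lanes=2).
    mismatch = abs(scenario["green_time"] - 30) / 30 + abs(scenario["lanes"] - 2)
    return -1.0 * mismatch + rng.normal(scale=0.05)

policy = "trained-on-one-intersection"     # stands in for a learned controller
variants = [{"green_time": g, "lanes": l} for g in (20, 30, 40) for l in (1, 2, 3)]

for scenario in variants:
    returns = [run_episode(policy, scenario) for _ in range(20)]
    print(scenario, f"avg return: {np.mean(returns):+.2f}")
# A robust algorithm should score well across all variants, not just the one it saw.
```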

And as for the initial question about city traffic, one focus of ongoing work will be applying this newly developed benchmarking tool to address the particular case of how much impact on emissions would come from implementing eco-driving in automated vehicles in a city, depending on what percentage of such vehicles are actually deployed.

But Wu adds that “rather than making something that can deploy eco-driving at a city scale, the main goal of this study is to support the development of general-purpose deep reinforcement learning algorithms, that can be applied to this application, but also to all these other applications — autonomous driving, video games, security problems, robotics problems, warehousing, classical control problems.”

Wu adds that “the project’s goal is to provide this as a tool for researchers, that’s openly available.” IntersectionZoo, and the documentation on how to use it, are freely available at GitHub.

Wu is joined on the paper by lead authors Vindula Jayawardana, a graduate student in MIT’s Department of Electrical Engineering and Computer Science (EECS); Baptiste Freydt, a graduate student from ETH Zurich; and co-authors Ao Qu, a graduate student in transportation; Cameron Hickert, an IDSS graduate student; and Zhongxia Yan PhD ’24. 

Novel AI model inspired by neural dynamics from the brain

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a novel artificial intelligence model inspired by neural oscillations in the brain, with the goal of significantly advancing how machine learning algorithms handle long sequences of data.

AI often struggles with analyzing complex information that unfolds over long periods of time, such as climate trends, biological signals, or financial data. One new type of AI model, called “state-space models,” has been designed specifically to understand these sequential patterns more effectively. However, existing state-space models often face challenges — they can become unstable or require a significant amount of computational resources when processing long data sequences.

To address these issues, CSAIL researchers T. Konstantin Rusch and Daniela Rus have developed what they call “linear oscillatory state-space models” (LinOSS), which leverage principles of forced harmonic oscillators — a concept deeply rooted in physics and observed in biological neural networks. This approach provides stable, expressive, and computationally efficient predictions without overly restrictive conditions on the model parameters.
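
For intuition, the core ingredient can be sketched as a bank of forced harmonic oscillators driven by the input sequence, with a linear readout over their states. The explicit update below is a toy illustration only, not the stable discretization derived in the LinOSS paper.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, n_osc = 200, 8

freqs = rng.uniform(0.5, 3.0, size=n_osc)     # each oscillator has its own frequency
forcing = rng.normal(size=(seq_len, n_osc))   # input sequence projected onto oscillators
dt = 0.05

position = np.zeros(n_osc)
velocity = np.zeros(n_osc)
states = []
for t in range(seq_len):
    # Forced harmonic oscillator: x'' = -omega^2 * x + u(t)
    velocity += dt * (-(freqs ** 2) * position + forcing[t])
    position += dt * velocity
    states.append(position.copy())

states = np.array(states)                      # (seq_len, n_osc) hidden trajectory
readout = states @ rng.normal(size=n_osc)      # linear readout over oscillator states
print(states.shape, readout.shape)
```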

“Our goal was to capture the stability and efficiency seen in biological neural systems and translate these principles into a machine learning framework,” explains Rusch. “With LinOSS, we can now reliably learn long-range interactions, even in sequences spanning hundreds of thousands of data points or more.”

The LinOSS model is unique in ensuring stable predictions while requiring far less restrictive design choices than previous methods. Moreover, the researchers rigorously proved the model’s universal approximation capability, meaning it can approximate any continuous, causal function relating input and output sequences.

Empirical testing demonstrated that LinOSS consistently outperformed existing state-of-the-art models across various demanding sequence classification and forecasting tasks. Notably, LinOSS outperformed the widely used Mamba model by nearly two times in tasks involving sequences of extreme length.

Recognized for its significance, the research was selected for an oral presentation at ICLR 2025 — an honor awarded to only the top 1 percent of submissions. The MIT researchers anticipate that the LinOSS model could significantly impact any fields that would benefit from accurate and efficient long-horizon forecasting and classification, including health-care analytics, climate science, autonomous driving, and financial forecasting.

“This work exemplifies how mathematical rigor can lead to performance breakthroughs and broad applications,” Rus says. “With LinOSS, we’re providing the scientific community with a powerful tool for understanding and predicting complex systems, bridging the gap between biological inspiration and computational innovation.”

The team expects that the emergence of a new paradigm like LinOSS will give machine learning practitioners a foundation to build upon. Looking ahead, the researchers plan to apply their model to an even wider range of data modalities. They also suggest that LinOSS could provide valuable insights into neuroscience, potentially deepening our understanding of the brain itself.

Their work was supported by the Swiss National Science Foundation, the Schmidt AI2050 program, and the U.S. Department of the Air Force Artificial Intelligence Accelerator.

Making AI models more trustworthy for high-stakes settings

The ambiguity in medical imaging can present major challenges for clinicians who are trying to identify disease. For instance, in a chest X-ray, pleural effusion, an abnormal buildup of fluid in the lungs, can look very much like pulmonary infiltrates, which are accumulations of pus or blood.

An artificial intelligence model could assist the clinician in X-ray analysis by helping to identify subtle details and boosting the efficiency of the diagnosis process. But because so many possible conditions could be present in one image, the clinician would likely want to consider a set of possibilities, rather than only having one AI prediction to evaluate.

One promising approach for producing a set of possibilities, called conformal classification, is convenient because it can be readily implemented on top of an existing machine-learning model. However, the sets it produces can be impractically large.

MIT researchers have now developed a simple and effective improvement that can reduce the size of prediction sets by up to 30 percent while also making predictions more reliable.

Having a smaller prediction set may help a clinician zero in on the right diagnosis more efficiently, which could improve and streamline treatment for patients. This method could be useful across a range of classification tasks — say, for identifying the species of an animal in an image from a wildlife park — as it provides a smaller but more accurate set of options.

“With fewer classes to consider, the sets of predictions are naturally more informative in that you are choosing between fewer options. In a sense, you are not really sacrificing anything in terms of accuracy for something that is more informative,” says Divya Shanmugam PhD ’24, a postdoc at Cornell Tech who conducted this research while she was an MIT graduate student.

Shanmugam is joined on the paper by Helen Lu ’24; Swami Sankaranarayanan, a former MIT postdoc who is now a research scientist at Lilia Biosciences; and senior author John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering at MIT and a member of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the Conference on Computer Vision and Pattern Recognition in June.

Prediction guarantees

AI assistants deployed for high-stakes tasks, like classifying diseases in medical images, are typically designed to produce a probability score along with each prediction so a user can gauge the model’s confidence. For instance, a model might predict that there is a 20 percent chance an image corresponds to a particular diagnosis, like pleurisy.

But it is difficult to trust a model’s predicted confidence because much prior research has shown that these probabilities can be inaccurate. With conformal classification, the model’s prediction is replaced by a set of the most probable diagnoses along with a guarantee that the correct diagnosis is somewhere in the set.

But the inherent uncertainty in AI predictions often causes the model to output sets that are far too large to be useful.

For instance, if a model is classifying an animal in an image as one of 10,000 potential species, it might output a set of 200 predictions so it can offer a strong guarantee.

“That is quite a few classes for someone to sift through to figure out what the right class is,” Shanmugam says.

The technique can also be unreliable because tiny changes to inputs, like slightly rotating an image, can yield entirely different sets of predictions.
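For readers unfamiliar with the mechanics, the following is a minimal sketch of split conformal classification on top of a generic softmax classifier. The 1-minus-probability score and the quantile correction are standard choices from the conformal prediction literature, not necessarily the exact ones used in this work, and the toy data is purely illustrative.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal calibration: find a score threshold so that prediction
    sets contain the true class with probability >= 1 - alpha."""
    n = len(cal_labels)
    # Nonconformity score: 1 - predicted probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile of the calibration scores.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, q_level, method="higher")

def prediction_set(probs, qhat):
    """All classes whose nonconformity score 1 - p falls below the threshold."""
    return np.where(1.0 - probs <= qhat)[0]

# Toy example with 10,000 classes and a diffuse (uncertain) classifier.
rng = np.random.default_rng(0)
n_cal, n_classes = 500, 10_000
cal_probs = rng.dirichlet(np.ones(n_classes) * 0.01, size=n_cal)
cal_labels = rng.integers(0, n_classes, size=n_cal)
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.1)

test_probs = rng.dirichlet(np.ones(n_classes) * 0.01)
print(len(prediction_set(test_probs, qhat)))  # large when the model is uncertain
```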

To make conformal classification more useful, the researchers applied a technique developed to improve the accuracy of computer vision models called test-time augmentation (TTA).

TTA creates multiple augmentations of a single image in a dataset, perhaps by cropping the image, flipping it, zooming in, etc. Then it applies a computer vision model to each version of the same image and aggregates its predictions.

“In this way, you get multiple predictions from a single example. Aggregating predictions in this way improves predictions in terms of accuracy and robustness,” Shanmugam explains.
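As a rough illustration, test-time augmentation can be sketched as follows, using a handful of fixed augmentations and a stand-in classifier. The specific augmentations and the plain averaging are assumptions for illustration; the method described here instead learns how to aggregate the augmented predictions.

```python
import numpy as np

def augmentations(image):
    """A few simple test-time augmentations: identity, horizontal flip,
    vertical flip, and a central crop (illustrative choices only)."""
    h, w = image.shape[:2]
    crop = image[h // 8 : h - h // 8, w // 8 : w - w // 8]
    return [image, image[:, ::-1], image[::-1, :], crop]

def tta_predict(model, image):
    """Average the model's class probabilities over augmented copies."""
    probs = [model(aug) for aug in augmentations(image)]
    return np.mean(probs, axis=0)

# Toy stand-in for a classifier that returns softmax probabilities.
def dummy_model(img, n_classes=5):
    rng = np.random.default_rng(int(img.sum() * 1000) % (2**32))
    logits = rng.normal(size=n_classes)
    e = np.exp(logits - logits.max())
    return e / e.sum()

image = np.random.rand(64, 64, 3)
print(tta_predict(dummy_model, image))
```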

Maximizing accuracy

To apply TTA, the researchers hold out some labeled image data used for the conformal classification process. They learn to aggregate the augmentations on these held-out data, automatically augmenting the images in a way that maximizes the accuracy of the underlying model’s predictions.

Then they run conformal classification on the model’s new, TTA-transformed predictions. The conformal classifier outputs a smaller set of probable predictions for the same confidence guarantee.

“Combining test-time augmentation with conformal prediction is simple to implement, effective in practice, and requires no model retraining,” Shanmugam says.
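Putting the pieces together, a hypothetical version of the combined pipeline might look like the sketch below, which reuses the augmentations, conformal_threshold, and prediction_set helpers from the earlier sketches. The simple grid search over augmentation weights is a stand-in for the learned aggregation described above, not the paper’s actual procedure.

```python
import numpy as np
from itertools import product

def weighted_tta_probs(model, image, weights):
    """Weighted average of class probabilities over the augmented copies."""
    probs = np.stack([model(aug) for aug in augmentations(image)])
    return weights @ probs / weights.sum()

def learn_weights(model, images, labels, grid=(0.0, 0.5, 1.0)):
    """Pick augmentation weights that maximize top-1 accuracy on held-out data."""
    best_w, best_acc = None, -1.0
    for w in product(grid, repeat=4):          # 4 augmentations in the sketch
        w = np.array(w)
        if w.sum() == 0:
            continue
        preds = [np.argmax(weighted_tta_probs(model, x, w)) for x in images]
        acc = np.mean(np.array(preds) == np.array(labels))
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w

# Pipeline: learn weights on one held-out split, calibrate conformal sets on another.
# weights = learn_weights(model, images_weight_split, labels_weight_split)
# cal_probs = np.stack([weighted_tta_probs(model, x, weights) for x in images_cal])
# qhat = conformal_threshold(cal_probs, labels_cal, alpha=0.1)
# test_set = prediction_set(weighted_tta_probs(model, test_image, weights), qhat)
```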

Compared to prior work on conformal prediction across several standard image classification benchmarks, their TTA-augmented method reduced prediction set sizes by 10 to 30 percent across experiments.

Importantly, the technique achieves this reduction in prediction set size while maintaining the probability guarantee.

The researchers also found that, even though they are sacrificing some labeled data that would normally be used for the conformal classification procedure, TTA boosts accuracy enough to outweigh the cost of losing those data.

“It raises interesting questions about how we used labeled data after model training. The allocation of labeled data between different post-training steps is an important direction for future work,” Shanmugam says.

In the future, the researchers want to validate the effectiveness of such an approach in the context of models that classify text instead of images. To further improve the work, the researchers are also considering ways to reduce the amount of computation required for TTA.

This research is funded, in part, by the Wistron Corporation.

Novel method detects microbial contamination in cell cultures

Researchers from the Critical Analytics for Manufacturing Personalized-Medicine (CAMP) interdisciplinary research group of the Singapore-MIT Alliance for Research and Technology (SMART), MIT’s research enterprise in Singapore, in collaboration with MIT, A*STAR Skin Research Labs, and the National University of Singapore, have developed a novel method that can quickly and automatically detect and monitor microbial contamination in cell therapy products (CTPs) early on during the manufacturing process. By measuring ultraviolet light absorbance of cell culture fluids and using machine learning to recognize light absorption patterns associated with microbial contamination, this preliminary testing method aims to reduce the overall time taken for sterility testing and, subsequently, the time patients need to wait for CTP doses. This is especially crucial where timely administration of treatments can be life-saving for terminally ill patients.
 
Cell therapy represents a promising new frontier in medicine, especially in treating diseases such as cancers, inflammatory diseases, and chronic degenerative disorders by manipulating or replacing cells to restore function or fight disease. However, a major challenge in CTP manufacturing is quickly and effectively ensuring that cells are free from contamination before being administered to patients.
 
Existing sterility testing, which is based on microbiological methods, is labor-intensive and requires up to 14 days to detect contamination, which could adversely affect critically ill patients who need immediate treatment. While advanced techniques such as rapid microbiological methods (RMMs) can reduce the testing period to seven days, they still require complex processes such as cell extraction and growth enrichment mediums, and they are highly dependent on skilled workers for procedures such as sample extraction, measurement, and analysis. This creates an urgent need for new methods that offer quicker outcomes without compromising the quality of CTPs, meet the patient-use timeline, and use a simple workflow that does not require additional preparation.
 
In a paper titled “Machine learning aided UV absorbance spectroscopy for microbial contamination in cell therapy products,” published in the journal Scientific Reports, SMART CAMP researchers described how they combined UV absorbance spectroscopy with machine learning to develop a method for label-free, noninvasive, and real-time detection of cell contamination during the early stages of manufacturing.
 
This method offers significant advantages over both traditional sterility tests and RMMs, as it eliminates the need to stain cells to identify labeled organisms, avoids the invasive process of cell extraction, and delivers results in under half an hour. It provides an intuitive, rapid “yes/no” contamination assessment, facilitating automation of cell culture sampling with a simple workflow. Furthermore, the method does not require specialized equipment, resulting in lower costs.
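As a rough sketch of how such a preliminary check could be implemented, the snippet below trains a generic classifier on simulated absorbance spectra to produce a yes/no contamination call. The synthetic data, the scikit-learn random forest, and the minimal preprocessing are all assumptions for illustration; the paper’s actual model and spectra differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical data: each row is a UV absorbance spectrum of a cell culture
# sample (absorbance at evenly spaced wavelengths), labeled 1 if the culture
# was contaminated and 0 if sterile. Real spectra would come from the reader.
rng = np.random.default_rng(0)
n_samples, n_wavelengths = 400, 200
X = rng.normal(0.2, 0.05, size=(n_samples, n_wavelengths))
y = rng.integers(0, 2, size=n_samples)
# Simulate a contamination signature: extra absorbance in one wavelength band.
X[y == 1, 80:120] += 0.1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A simple classifier over the raw spectra yields a rapid "yes/no" call.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```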
 
“This rapid, label-free method is designed to be a preliminary step in the CTP manufacturing process as a form of continuous safety testing, which allows users to detect contamination early and implement timely corrective actions, including the use of RMMs only when possible contamination is detected. This approach saves costs, optimizes resource allocation, and ultimately accelerates the overall manufacturing timeline,” says Shruthi Pandi Chelvam, senior research engineer at SMART CAMP and first author of the paper.
 
“Traditionally, cell therapy manufacturing is labor-intensive and subject to operator variability. By introducing automation and machine learning, we hope to streamline cell therapy manufacturing and reduce the risk of contamination. Specifically, our method supports automated cell culture sampling at designated intervals to check for contamination, which reduces manual tasks such as sample extraction, measurement, and analysis. This enables cell cultures to be monitored continuously and contamination to be detected at early stages,” says Rajeev Ram, the Clarence J. LeBel Professor in Electrical Engineering and Computer Science at MIT, a principal investigator at SMART CAMP, and the corresponding author of the paper.
 
Moving forward, future research will focus on broadening the application of the method to encompass a wider range of microbial contaminants, specifically those representative of current good manufacturing practices environments and previously identified CTP contaminants. Additionally, the model’s robustness can be tested across more cell types beyond mesenchymal stromal cells (MSCs). Beyond cell therapy manufacturing, this method can also be applied to the food and beverage industry as part of microbial quality control testing to ensure food products meet safety standards.