Reducing Bias in AI

In a technology hungry for data, how do we ensure it consumes good information and builds models that benefit every patient?


EVERY NEW DOCTOR recites the Hippocratic Oath, swearing to uphold ethical standards in treating their patients.

Yet, as it goes about diagnosing illness, uncovering new treatments, or speeding up the more mundane tasks of health care, an increasingly important player in medicine — artificial intelligence, and the data scientists rapidly developing it — operates with no such code of conduct.

AI holds vast potential in terms of speed, efficiency and accuracy in analyzing and making predictions from oceans of clinical and research data. But, without caution, it could also exacerbate problems such as long-standing biases and disparities in the health care system.

Take machine learning. Developers feed patient data into a computer. It hunts for patterns and relationships at a scale too big for a human to manage. Then it analyzes new data and makes predictions.
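
Purely as an illustration of that loop (the patient numbers and labels below are invented, and this is generic scikit-learn code, not any IU pipeline), the whole cycle can fit in a few lines of Python:

```python
# A minimal sketch of the machine-learning loop described above:
# feed labeled patient data to a model, let it find patterns,
# then ask it to predict for new patients. All data is made up.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical tabular patient data: each row is a patient,
# each column a measurement (say, age and a lab value).
X = [[63, 1.8], [54, 0.9], [71, 2.4], [49, 0.7], [66, 2.1], [58, 1.0]]
y = [1, 0, 1, 0, 1, 0]  # 1 = condition recorded, 0 = not recorded

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)    # hunt for patterns in past patients
print(model.predict(X_test))   # predict for patients it hasn't seen
```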

Its work is efficient, but the algorithms come with risks. Without careful tweaks to the formula, when minority populations or women are less likely to be screened for a particular condition, the technology may infer that those patients are less prone to it.
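
A toy simulation makes that mechanism concrete. Assume, hypothetically, two groups with identical true prevalence but different screening rates; the recorded labels then understate risk for the less-screened group, and any model trained on those labels inherits the gap:

```python
# Hypothetical simulation of the screening gap described above.
# Both groups have the same true prevalence, but group B is
# screened half as often, so its *recorded* rate looks lower.
import random

random.seed(0)
TRUE_PREVALENCE = 0.20
SCREEN_RATE = {"A": 0.80, "B": 0.40}  # group B screened less often
N = 10_000

recorded_positive = {"A": 0, "B": 0}
for group in ("A", "B"):
    for _ in range(N):
        has_condition = random.random() < TRUE_PREVALENCE
        screened = random.random() < SCREEN_RATE[group]
        # Unscreened cases get recorded as negative: the hidden bias.
        if has_condition and screened:
            recorded_positive[group] += 1

for group in ("A", "B"):
    print(group, recorded_positive[group] / N)
# Prints roughly A 0.16, B 0.08: a model trained on these labels
# would "learn" that group B is half as prone to the condition.
```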

In his lab at IU’s Luddy School of Informatics, Computing, and Engineering, Saptarshi Purkayastha, PhD, is trying to understand how machine learning models become biased. His collaborations with IU School of Medicine faculty in radiology, pathology and pediatrics aim to reduce or eliminate those biases.

The reality is that the results produced by machines can be influenced by how the data they are fed is collected, cleaned and stored. That, Purkayastha says, points to the most challenging aspect of AI development: labeling bias.

To build AI programs, humans must label a lot of data in great detail. Often, the process isn’t uniformly supervised. At the School of Medicine, physicians, nurses and researchers do the work — and their input can vary as much as a research paper does from a social media post.

AI also takes shortcuts, sometimes helpful ones, sometimes not. Purkayastha’s lab found that models can learn shortcuts from biases in labeled data and use them to identify a patient’s race, even when race is irrelevant to the task. His lab also found that an AI program can use physical aspects of an X-ray, such as pixel intensity, to infer a patient’s race, gender and age with alarming accuracy.
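
One common audit for this kind of hidden signal, sketched here with synthetic stand-in data (the feature values and the planted signal are invented for illustration), is to train a simple probe to predict the protected attribute from the same features the clinical model sees. A score well above chance flags a shortcut:

```python
# A hedged sketch of a shortcut audit: try to predict a protected
# attribute (here, a hypothetical binary race label) from image-level
# features. If the probe's AUC sits well above 0.5, the features
# carry a signal the clinical model could exploit.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000

race = rng.integers(0, 2, size=n)      # hypothetical protected label
features = rng.normal(size=(n, 5))     # stand-ins for image summaries
features[:, 0] += 0.8 * race           # planted "pixel intensity" signal

probe = LogisticRegression(max_iter=1000)
auc = cross_val_score(probe, features, race, cv=5,
                      scoring="roc_auc").mean()
print(f"probe AUC: {auc:.2f}")  # well above 0.5 -> hidden signal found
```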

“Something within the imaging process or the medical physics of capturing those images acts as a hidden signal,” he said. “How that’s done is fascinating. But it’s not (fascinating) when an AI model uses that feature to decide which patient needs something more thorough, like a PET scan, right?”

Not all shortcuts are bad; some can speed up processing. “The only way to differentiate between good and bad is when we add our human social understanding of bias,” he said.

Purkayastha said we should be able to explain such results, but he noted that common drugs people take every day work even though we don’t always know why. Accuracy and fairness in the models must be the priority, he said, even when demanding explainability would come at the expense of performance.

“What’s most important is not causing harm. We need a setup for AI models to test them in the field, gain FDA approval, and then see in deployment that they don’t cause harm,” he said.

One key to reducing bias, Purkayastha says, is ensuring diversity in the pool of people creating the algorithms.

“If you have a diverse group of thinkers in your team and everybody has a chance to speak or contribute, many of the problems we’re discussing get solved,” Purkayastha said. “Model biases reflect the lack of diversity in the thinking process.”

“The technology you build embeds the philosophies that you hold — whether you know it or not.”

Glossary Terms

ChatGPT: Developed by the tech firm OpenAI, it produces human-like text based on contextual questions and past conversations. A large language model powers the application, and its output is an example of generative AI.

Neural Network: Computer systems made up of linked “artificial neurons.” Inspired by how our brains function, they gradually learn to recognize patterns in data.
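
For a concrete, if toy, picture of that gradual learning, here is a minimal network built with scikit-learn on made-up data. It learns XOR, a classic pattern no single neuron can capture on its own:

```python
# A toy neural network in the spirit of the glossary entry above:
# linked "artificial neurons" that adjust, step by step, to fit a
# pattern. XOR is the classic example a single neuron cannot learn.
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]  # XOR

net = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                    max_iter=5000, random_state=0)
net.fit(X, y)          # repeated small weight updates
print(net.predict(X))  # ideally [0 1 1 0]
```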

Related stories

Unlocking the Power of AI

IU School of Medicine embraces a powerful new tool to speed research and treat patients.


How Radiology is Becoming a Leader in Adopting AI

Few clinical areas have adopted AI tools faster than radiology, easing workloads and helping overcome a shortage of clinicians.


AI is Coming to an Exam Room Near You

Ambient listening interprets conversations in real-time to update electronic health records — and help physicians reconnect with their patients.


AI That Learns Without Borders

Decentralized approaches to AI make it easier for scientists to share data, protect sensitive data and develop tools that reflect the diversity of patients.


"We're on the Precipice"

AI powers tools that help pathologists spend less time counting cells and use their refined skills to make complicated diagnoses.


Skin in the Game

An IU researcher’s AI tool shows promise in predicting whether melanoma will return.


Finding a Signal in Noise

Machine learning enables cancer researchers to sift reams of genetic data and identify a protein potentially powering multiple myeloma.