“Doctors may also lose their skills, as over-reliance on AI results diminishes critical thinking,” Shegewi said. “Large-scale implementations will likely raise issues related to patient data privacy and regulatory compliance. The risk of bias, inherent in any AI model, is also enormous and could harm underrepresented populations.”
Additionally, the increasing use of AI by health insurance companies often does not serve the best interests of patients. Doctors facing an avalanche of AI-generated patient care denials from insurance companies are fighting back, using the same technology to automate their appeals.
“One of the reasons AI has surpassed humans is that it is very good at thinking about why it might be wrong,” Rodman said. “So it’s good at spotting what doesn’t fit the hypothesis, which is a skill humans aren’t very good at. We are not good at disagreeing with ourselves. We have cognitive biases.”
Of course, AI has its own biases, Rodman noted. Racial and gender bias has been well documented in LLMs, but they are probably less prone to bias than people, he said.
Still, bias in classical AI has been a long-standing problem, and genAI has the potential to exacerbate it, according to Gartner’s Walk. “I think one of the biggest risks is that technology is outpacing the industry’s ability to train and prepare clinicians to detect, respond to, and report these biases,” he said.
GenAI models are inherently prone to bias due to their training on data sets that may disproportionately represent certain populations or scenarios. For example, models trained primarily on data from dominant demographic groups could perform poorly for underrepresented groups, said Mutaz Shegewi, senior research director for IDC’s Digital Strategies for Global Healthcare Providers group.
“Prompt design can further amplify bias, as poorly designed prompts can reinforce disparities,” he said. “Furthermore, genAI’s focus on common patterns risks missing rare but important cases.”
For example, the research literature ingested by LLMs is often biased toward white men, creating critical data gaps regarding other populations, Shegewi said. “Because of this, AI models may not recognize atypical disease presentations in different groups. The symptoms of certain diseases, for example, may have marked differences between groups, and failure to recognize those differences could lead to delayed or misguided treatment,” he said.
Under current regulatory structures, LLMs and their genAI interfaces cannot accept responsibility the way a human doctor can. Therefore, for “official purposes,” a human will likely still need to remain in the loop to provide the accountability, judgment, nuance, and the many other layers of assessment and support that patients need.
Chen said he wouldn’t be surprised if doctors were already using LLMs for low-risk purposes, such as explaining medical conditions or generating treatment options for less severe symptoms.
“For better or worse, ready or not, Pandora’s box has already been opened, and we need to figure out how to effectively use these tools and advise patients and doctors on appropriate, safe, and reliable ways to do so,” Chen said.