Can Chatbots Spot Mental Health Drug Side Effects?

Summary: As mental healthcare gaps persist, people increasingly turn to AI chatbots for help with psychiatric medication side effects. A new study evaluated how well large language models detect and respond to these complex, high-risk situations.

While AI often mirrors a psychiatrist’s tone, researchers found it struggles with accurately identifying adverse drug reactions and offering actionable advice. The study highlights the need for safer, more effective chatbots tailored to mental health needs.

Key Facts:

  • Accuracy Gaps: AI chatbots often misidentify psychiatric medication side effects or offer vague, non-actionable advice.
  • Emotional Tone vs. Expertise: While AI mimics human tone, its clinical guidance often falls short of expert standards.
  • High Stakes: The findings stress the risks of relying on LLMs in mental health emergencies, especially for underserved populations.

Source: Georgia Institute of Technology

Asking artificial intelligence for advice can be tempting. Powered by large language models (LLMs), AI chatbots are available 24/7, are often free to use, and draw on troves of data to answer questions.

Now, people with mental health conditions are asking AI for advice when experiencing potential side effects of psychiatric medicines — a decidedly higher-risk situation than asking it to summarize a report. 

One question puzzling the AI research community is how AI performs when asked about mental health emergencies.

Globally, including in the U.S., there is a significant gap in mental health treatment, with many individuals having limited to no access to mental healthcare. It’s no surprise that people have started turning to AI chatbots with urgent health-related questions.

Now, researchers at the Georgia Institute of Technology have developed a new framework to evaluate how well AI chatbots can detect potential adverse drug reactions in chat conversations, and how closely their advice aligns with human experts.

The study was led by Munmun De Choudhury, J.Z. Liang Associate Professor in the School of Interactive Computing, and Mohit Chandra, a third-year computer science Ph.D. student. De Choudhury is also a faculty member in the Georgia Tech Institute for People and Technology.

“People use AI chatbots for anything and everything,” said Chandra, the study’s first author.

“When people have limited access to healthcare providers, they are increasingly likely to turn to AI agents to make sense of what’s happening to them and what they can do to address their problem.

“We were curious how these tools would fare, given that mental health scenarios can be very subjective and nuanced.”

De Choudhury, Chandra, and their colleagues introduced their new framework at the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics on April 29, 2025.

Putting AI to the Test

Going into their research, De Choudhury and Chandra wanted to answer two main questions: First, can AI chatbots accurately detect whether someone is having side effects or adverse reactions to medication? Second, if they can accurately detect these scenarios, can AI agents then recommend good strategies or action plans to mitigate or reduce harm? 

The researchers collaborated with a team of psychiatrists and psychiatry students to establish clinically accurate answers from a human perspective and used those to analyze AI responses.

To build their dataset, they went to the internet’s public square, Reddit, where many have gone for years to ask questions about medication and side effects. 

They evaluated nine LLMs, including general-purpose models (such as GPT-4o and Llama-3.1) and specialized models trained on medical data.

Using the evaluation criteria provided by the psychiatrists, they computed how precise the LLMs were in detecting adverse reactions and correctly categorizing the types of adverse reactions caused by psychiatric medications.
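As a rough illustration of this kind of scoring, the Python sketch below compares model outputs against expert labels. It is a minimal example under stated assumptions, not the study's actual pipeline: the function names, the toy labels, and the category names ("cognitive", "physical") are hypothetical, and the article does not describe the exact metrics the researchers used.

```python
def detection_scores(expert, model):
    """Precision and recall of binary adverse-reaction detection vs. expert labels."""
    tp = sum(1 for e, m in zip(expert, model) if e and m)       # both say "reaction"
    fp = sum(1 for e, m in zip(expert, model) if not e and m)   # model only
    fn = sum(1 for e, m in zip(expert, model) if e and not m)   # expert only
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def type_accuracy(expert_types, model_types):
    """Share of expert-confirmed reactions whose type the model labeled the same way."""
    pairs = [(e, m) for e, m in zip(expert_types, model_types) if e is not None]
    if not pairs:
        return 0.0
    return sum(1 for e, m in pairs if e == m) / len(pairs)


# Toy data (hypothetical): one entry per post.
expert_adr = [True, True, False, True, False]
model_adr = [True, False, False, True, True]

# Hypothetical category names; None means the expert saw no adverse reaction.
expert_types = ["cognitive", "physical", None, "physical", None]
model_types = ["cognitive", "cognitive", None, "physical", "physical"]

precision, recall = detection_scores(expert_adr, model_adr)
accuracy = type_accuracy(expert_types, model_types)
print(f"precision={precision:.2f} recall={recall:.2f} type_accuracy={accuracy:.2f}")
```

Reporting precision and recall separately matters in a setting like this, because a missed adverse reaction and a false alarm carry very different risks.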

Additionally, they prompted LLMs to generate answers to queries posted on Reddit and compared how closely those answers aligned with the clinicians' answers across four criteria: (1) emotion and tone expressed, (2) answer readability, (3) proposed harm-reduction strategies, and (4) actionability of the proposed strategies.
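To make one of these comparisons concrete, the Python sketch below scores agreement on harm-reduction strategies as the overlap between the strategies an expert and an LLM each propose. This is an assumption for illustration only: the article does not say how alignment was measured, and the `jaccard` helper and the strategy tags are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of strategy tags (1.0 = identical sets)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty answers agree trivially
    return len(a & b) / len(a | b)


# Hypothetical strategy tags extracted from one expert answer and one LLM answer.
expert_strategies = {"contact prescriber", "do not stop abruptly", "monitor symptoms"}
llm_strategies = {"contact prescriber", "monitor symptoms", "stay hydrated"}

overlap = jaccard(expert_strategies, llm_strategies)
print(f"strategy overlap: {overlap:.2f}")
```

A set-overlap score like this captures only whether the same strategies appear, not whether the advice is specific enough to act on, which is why actionability is listed as a separate criterion.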

The research team found that LLMs stumble when comprehending the nuances of an adverse drug reaction and distinguishing different types of side effects.

They also discovered that while LLMs sounded like human psychiatrists in their tones and emotions — such as being helpful and polite — they had difficulty providing true, actionable advice aligned with the experts. 

Better Bots, Better Outcomes

The team’s findings could help AI developers build safer, more effective chatbots. Chandra’s ultimate goals are to inform policymakers of the importance of accurate chatbots and help researchers and developers improve LLMs by making their advice more actionable and personalized. 

Chandra notes that improving AI for psychiatric and mental health concerns would be particularly life-changing for communities that lack access to mental healthcare.

“When you look at populations with little or no access to mental healthcare, these models are incredible tools for people to use in their daily lives,” Chandra said.

“They are always available, they can explain complex things in your native language, and they become a great option to go to for your queries.

“When the AI gives you incorrect information by mistake, it could have serious implications on real life,” Chandra added. “Studies like this are important, because they help reveal the shortcomings of LLMs and identify where we can improve.”

Citation: Lived Experience Not Found: LLMs Struggle to Align with Experts on Addressing Adverse Drug Reactions from Psychiatric Medication Use (Chandra et al., NAACL 2025).

Funding: National Science Foundation (NSF), American Foundation for Suicide Prevention (AFSP), Microsoft Accelerate Foundation Models Research grant program. The findings, interpretations, and conclusions of this paper are those of the authors and do not represent the official views of NSF, AFSP, or Microsoft.

About this AI and psychopharmacology research news

Author: Catherine Barzler
Source: Georgia Institute of Technology
Contact: Catherine Barzler – Georgia Institute of Technology
Image: The image is credited to Neuroscience News

Original Research: The findings were presented at the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics.
