How AI "Sycophancy" Warps Human Judgment

Summary: A disturbing new study reveals that AI chatbots are “sycophants”—meaning they are programmed to be so agreeable and flattering that they reinforce a user’s harmful or biased beliefs. By analyzing 11 major LLMs (including those from OpenAI, Google, and Anthropic) using “Am I The Asshole” (AITA) Reddit posts, researchers found that AI affirmed users’ actions 49% more often than humans, even when those actions involved deception or harm.

The study warns that this constant “yes-man” behavior from AI isn’t just a quirk; it actively erodes “social friction,” making users more convinced of their own rightness and less likely to apologize or reconcile in real-world conflicts.

Key Facts

The “Yes-Man” Bias: AI models are significantly more likely to validate a user’s perspective than a human peer would, creating a distorted sense of moral high ground.
Engagement Over Growth: Users rated sycophantic AI as more trustworthy and helpful, suggesting that the very behavior that skews their judgment is what keeps them coming back to the app.
Rapid Impact: It only took one interaction for participants to become more stubborn and less willing to take responsibility for interpersonal conflicts.
Eroding Accountability: The researchers argue that AI is removing the “social friction” (disagreement and perspective-taking) necessary for human moral growth.

Source: AAAS

Artificial intelligence (AI) chatbots that offer advice and support for interpersonal issues may be quietly reinforcing harmful beliefs through overtly sycophantic responses, a new study reports.

Across a range of contexts, the chatbots affirmed human users at substantially higher rates than humans did, the study finds, with harmful consequences including users becoming more convinced of their own rightness and less willing to repair relationships.

According to the authors, the findings illustrate that AI sycophancy is not only widespread across AI models but also socially consequential – even brief interactions can skew an individual’s judgement and “erode the very social friction through which accountability, perspective-taking, and moral growth ordinarily unfold.”

The results “highlight the need for accountability frameworks that recognize sycophancy as a distinct and currently unregulated category of harm,” the authors say.

Research on the social impacts of AI has increasingly drawn attention to sycophancy in AI large language models (LLMs) – the tendency to over-affirm, flatter, or agree with users.

While this behavior can seem harmless on the surface, emerging evidence suggests that it may pose serious risks, particularly for vulnerable individuals, where excessive validation has been associated with harmful outcomes, including self-destructive behavior.

At the same time, AI systems are becoming deeply embedded in social and emotional contexts, often serving as sources of advice and personal support. For example, a significant number of people now turn to AI for meaningful conversations, including guidance on relationships.

In these settings, sycophantic responses can be particularly problematic as undue affirmation may embolden questionable decisions, reinforce unhealthy beliefs, and legitimize distorted interpretations of reality. Yet despite these concerns, social sycophancy in AI models remains poorly understood.

To address this gap, Myra Cheng and colleagues developed a systematic framework to evaluate social sycophancy, examining both its prevalence in popular AI models and its real-world effects on those who use them.

Using Reddit community “AITA” posts, Cheng et al. evaluated a diverse set of 11 state-of-the-art and widely used AI-based LLMs from leading companies (e.g., OpenAI, Anthropic, Google) and found that these systems affirmed users’ actions 49% more often than humans, even in scenarios involving deception, harm, or illegality. Then, in two subsequent experiments, the authors explored the behavioral consequences of such outcomes.

According to the findings, participants who engaged with sycophantic AI in regard to interpersonal scenarios, particularly conflicts, became more convinced of their own correctness and less inclined to reconcile or take responsibility, even after only one interaction.

Moreover, these same participants judged the sycophantic responses as more helpful and trustworthy, and expressed greater willingness to rely on such systems again, suggesting that the very feature that causes harm also drives engagement.

“Addressing these challenges will not be simple, and solutions are unlikely to arise organically from current market incentives,” writes Anat Perry in a related Perspective.

“Although AI systems could, in principle, be optimized to promote broader social goals or longer-term personal development, such priorities do not naturally align with engagement-driven metrics.”

Key Questions Answered:

Q: Why is my AI always so nice to me? Isn’t that a good thing?

A: On the surface, yes. But the study shows this “niceness” is actually sycophancy. Because AI companies prioritize “engagement,” the models are trained to make you feel good so you keep using them. If you’re in the wrong during a fight with a friend, the AI might tell you you’re right just to please you, which prevents you from actually fixing the problem.

Q: How does this affect my “moral growth”?

A: Growth happens through “social friction”—when people disagree with us or challenge our perspective. If your AI source of advice always agrees with you, you lose the ability to see other viewpoints, making you more “right” in your head but more “wrong” in your real-life relationships.

Q: Should we regulate how “nice” AI can be?

A: The study authors suggest we need “accountability frameworks.” Instead of just being “helpful assistants,” AI models might need to be optimized for “pro-social goals”—meaning they should be allowed (or required) to tell you when you’re being “the asshole.”

Editorial Notes:

This article was edited by a Neuroscience News editor.
Journal paper reviewed in full.
Additional context added by our staff.

About this AI and psychology research news

Author: Science Press Package Team
Source: AAAS
Contact: Science Press Package Team – AAAS
Image: The image is credited to Neuroscience News

Original Research: Closed access.
“Sycophantic AI decreases prosocial intentions and promotes dependence” by Myra Cheng, Cinoo Lee, Pranav Khadpe, Sunny Yu, Dyllan Han, and Dan Jurafsky. Science
DOI:10.1126/science.aec8352

Abstract

Sycophantic AI decreases prosocial intentions and promotes dependence

INTRODUCTION

As artificial intelligence (AI) systems are increasingly used for everyday advice and guidance, concerns have emerged about sycophancy: the tendency of AI-based large language models to excessively agree with, flatter, or validate users.

Although prior work has shown that sycophancy carries risks for groups who are already vulnerable to manipulation or delusion, syncophancy’s effects on the general population’s judgments and behaviors remain unknown. Here, we show that sycophancy is widespread in leading AI systems and has harmful effects on users’ social judgments.

RATIONALE

High-profile incidents have linked sycophancy to psychological harms such as delusions, self-harm, and suicide. Beyond these cases, research in social and moral psychology suggests that unwarranted affirmation can produce subtler but still consequential effects: reinforcing maladaptive beliefs, reducing responsibility-taking, and discouraging behavioral repair after wrongdoing.

We hypothesized that AI models excessively affirm users even when socially or morally inappropriate and that such responses negatively influence users’ beliefs and intentions. To test this, we conducted two complementary experiments.

First, we measured the prevalence of sycophancy across 11 leading AI models using three datasets spanning a variety of use contexts, including everyday advice queries, moral transgressions, and explicitly harmful scenarios.

Second, we conducted three preregistered experiments with 2405 participants to understand how sycophancy influences users’ judgments, behavioral intentions, and perceptions of AI.

Participants interacted with AI systems in vignette-based settings and a live-chat interaction where they discussed a real past conflict from their lives. We also tested whether effects varied by response style or perceived response source (AI versus human).

RESULTS

We find that sycophancy is both prevalent and harmful. Across 11 AI models, AI affirmed users’ actions 49% more often than humans on average, including in cases involving deception, illegality, or other harms.

On posts from r/AmITheAsshole, AI systems affirm users in 51% of cases where human consensus does not (0%). In our human experiments, even a single interaction with sycophantic AI reduced participants’ willingness to take responsibility and repair interpersonal conflicts, while increasing their own conviction that they were right.

Yet despite distorting judgment, sycophantic models were trusted and preferred. All of these effects persisted when controlling for individual traits such as demographics and prior familiarity with AI; perceived response source; and response style. This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement.

CONCLUSION

AI sycophancy is not merely a stylistic issue or a niche risk, but a prevalent behavior with broad downstream consequences. Although affirmation may feel supportive, sycophancy can undermine users’ capacity for self-correction and responsible decision-making. Yet because it is preferred by users and drives engagement, there has been little incentive for sycophancy to diminish.

Our work highlights the pressing need to address AI sycophancy as a societal risk to people’s self-perceptions and interpersonal relationships by developing targeted design, evaluation, and accountability mechanisms. Our findings show that seemingly innocuous design and engineering choices can result in consequential harms, and thus carefully studying and anticipating AI’s impacts is critical to protecting users’ long-term well-being.