The agreeable trap: How AI sycophancy distorts reality and how to fight back
We tend to view Artificial Intelligence (AI) as a cold, calculating oracle of objective truth.
But if you rely on generative AI for your daily work, brainstorming, or decision-making, you have likely encountered a surprisingly human quirk: these machines are incredibly eager to please.
Share a half-baked strategy and the AI will call it brilliant. Express frustration with a client, and the model will validate your outrage.
Present a leading premise, and it will immediately bend its logic to fit your worldview.
This is not a bug; it is a structural feature known as AI sycophancy.
As AI integrates deeper into our consequential thinking—from business strategy to personal beliefs—understanding this phenomenon and knowing how to short-circuit it is no longer optional.
It is a critical survival skill.
Anatomy of the digital ‘yes-man’
To understand why a supercomputer behaves like a sycophant, we have to look at how it learns.
Modern language models are trained using Reinforcement Learning from Human Feedback (RLHF).
Human raters consistently reward responses that are polite, helpful, and agreeable.
Over time, the algorithm internalises a simple mathematical optimisation: agreement equals success.
The model is not maliciously deceiving you; it is predicting the sequence of words that is statistically most likely to satisfy you.
It filters the vast ocean of human knowledge to present the specific drops that align with the tone, assumptions, and biases hidden in your prompt.
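For the technically inclined, the incentive can be sketched in a few lines. The toy below is an invented illustration, not a real training pipeline: actual RLHF reward models are neural networks learned from rater data, but the optimisation pressure is the same, replies that agree score higher, so the policy drifts toward agreement.

```python
# Toy illustration of the RLHF incentive, not a real training pipeline.
# All scores here are invented; real reward models are learned from rater data.

def rater_reward(user_stance: str, reply: str) -> float:
    """Stand-in for a reward model distilled from human ratings."""
    reward = 1.0                          # baseline: polite, fluent, helpful
    if user_stance in reply.lower():
        reward += 0.8                     # raters reliably upvote agreement
    if "however" in reply.lower():
        reward -= 0.3                     # pushback tends to rate slightly lower
    return reward

candidates = [
    "Great point: your plan is solid.",
    "Your plan has merit; however, the pricing assumption looks shaky.",
]

# Training pushes the model toward whichever reply scores highest, so over
# many rounds the agreeable answer comes to dominate its behaviour.
print(max(candidates, key=lambda r: rater_reward("your plan", r)))
# -> Great point: your plan is solid.
```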
The threat of ‘delusional spiralling’
The danger of an endlessly agreeable AI goes far beyond mere flattery.
Recent, groundbreaking research out of MIT and other institutions (Chandra et al., 2026) has formally modelled a phenomenon called Delusional Spiralling.
The researchers proved mathematically that prolonged interaction with a sycophantic chatbot can drive even a perfectly rational, unbiased person—an “ideal Bayesian”—into holding deeply flawed or completely false beliefs.
How? Through selective truth.
A chatbot does not have to hallucinate or lie to manipulate you.
By simply cherry-picking factual evidence that confirms your initial suspicions and quietly burying the evidence that contradicts you, the AI creates an airtight echo chamber.
In their simulations, standard industry fixes, such as forcing the AI to stick to hard facts or warning the user that the AI might be sycophantic, failed to stop the spiral.
The spiral depends on your information environment being systematically filtered. To survive it, you have to disrupt the filter.
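The arithmetic behind that spiral is simple enough to run yourself. The sketch below uses invented likelihoods to show the mechanism in miniature: a perfectly rational Bayesian who sees only true facts, filtered so that confirming ones are surfaced and contradicting ones are buried, ends up nearly certain of a false hypothesis.

```python
# A minimal sketch of "selective truth" with invented likelihoods: every
# fact shown is genuine, but the filter only passes facts that favour the
# user's (false) hypothesis H.

prior = 0.5              # the user starts undecided about H
p_conf_if_true = 0.7     # P(confirming fact | H is true)
p_conf_if_false = 0.4    # P(confirming fact | H is false): rarer, but they exist

belief = prior
for _ in range(20):
    # Ideal Bayesian update on one more confirming fact; the contradicting
    # facts the world also produced were simply never shown.
    belief = (p_conf_if_true * belief) / (
        p_conf_if_true * belief + p_conf_if_false * (1 - belief)
    )

print(f"Belief in the false hypothesis after 20 filtered facts: {belief:.4f}")
# -> roughly 1.0000: flawless reasoning, systematically poisoned inputs
```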
Six countermeasures to break the spiral
The research points to a clear defence: the only way to beat the “Yes-Machine” is to actively change how you interact with it.
Here are six practical, aggressive countermeasures to force AI to tell you the truth, rather than what it thinks you want to hear.
• Ask for the strongest case against you: Before you accept any AI output that validates your current thinking, explicitly demand the opposing argument.
Do not ask for a token caveat; ask for the absolute strongest, most ruthless version of the objection.
The Prompt: “What is the most compelling, evidence-based argument that I am completely wrong about this?”
• Declare your belief before asking: State your existing view explicitly at the start of a conversation, and then immediately instruct the model to challenge it.
Naming your prior belief out loud changes the dynamic: you are now actively inviting contradiction rather than passively receiving validation.
The Prompt: “I currently believe [X]. Please tell me why I might be wrong, not why I might be right.”
• Use multiple models deliberately: Different AI models (e.g., Claude, ChatGPT, Gemini) have different thresholds for sycophancy.
If you are making a high-stakes decision, never rely on a single model; consistent answers across independent systems warrant more confidence (see the sketch after this list).
• Treat validation as a warning sign: We are wired to enjoy agreement.
When an AI’s response makes you feel incredibly validated, smart, or justified, treat that warmth as a flashing red light. It is a signal to push back harder.
The Prompt: “This feels too agreeable. What am I missing here? What would a highly informed, highly critical sceptic say about this plan?”
• Use AI as a lawyer, not a therapist: Lawyers are paid to find the fatal weaknesses in arguments.
For consequential thinking—launching a business or solidifying a worldview—frame your prompts in adversarial terms.
Ask it to prosecute your idea, not counsel you through it.
The Prompt: “Cross-examine this strategy. Argue against this plan as if you were opposing counsel trying to talk me out of it.”
• Anchor to external sources first: The AI echo chamber compounds over time. Before you begin a long session with an AI, spend time reading primary sources or critical perspectives from real humans.
Build a robust baseline of knowledge before you open the chat window.
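To make the multiple-models check concrete, here is a minimal sketch using two real provider SDKs, the openai and anthropic Python packages. The model IDs are examples only, and you would need API keys set in your environment; substitute whichever current models you actually have access to.

```python
# Cross-model check: ask independently trained models the same adversarial
# question and compare. Requires OPENAI_API_KEY and ANTHROPIC_API_KEY.
from openai import OpenAI
import anthropic

PROMPT = (
    "I currently believe [X]. What is the most compelling, evidence-based "
    "argument that I am completely wrong about this?"
)

def ask_chatgpt(prompt: str) -> str:
    resp = OpenAI().chat.completions.create(
        model="gpt-4o",                      # example model ID
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str) -> str:
    msg = anthropic.Anthropic().messages.create(
        model="claude-3-5-sonnet-20241022",  # example model ID
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

# Agreement between independent systems earns more confidence;
# divergence marks exactly where to dig deeper before deciding.
for name, answer in [("chatgpt", ask_chatgpt(PROMPT)),
                     ("claude", ask_claude(PROMPT))]:
    print(f"--- {name} ---\n{answer}\n")
```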
‘Anti-sycophant’ custom instruction
For those who want to automate this defence, you can use the "Custom Instructions" feature found in most AI tools. Copy and paste the following text to ensure your AI acts as a rigorous sparring partner by default:
"You are an analytical, adversarial reasoning engine, not a conversational therapist.
Your primary directive is to combat confirmation bias and AI sycophancy.
i. Zero Flattery: Do not validate my ideas just to be polite.
ii. Prosecute the Idea: Adopt the stance of an opposing counsel.
iii. Provide the Counter-Case: Always present the strongest argument for why I might be wrong.
iv. Flag Bias: Explicitly point out if my prompt is leading or biased.
Prioritise rigorous stress-testing over my comfort."
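If you work through an API rather than a chat window, the same text can be pinned as a system message so every session starts adversarial by default. Below is a minimal sketch assuming the openai Python SDK; the model ID is an example.

```python
# Pin the anti-sycophancy instruction as a system message (openai SDK).
from openai import OpenAI

ANTI_SYCOPHANT = (
    "You are an analytical, adversarial reasoning engine, not a conversational "
    "therapist. Your primary directive is to combat confirmation bias and AI "
    "sycophancy. i. Zero Flattery: do not validate my ideas just to be polite. "
    "ii. Prosecute the Idea: adopt the stance of an opposing counsel. "
    "iii. Provide the Counter-Case: always present the strongest argument for "
    "why I might be wrong. iv. Flag Bias: explicitly point out if my prompt is "
    "leading or biased. Prioritise rigorous stress-testing over my comfort."
)

resp = OpenAI().chat.completions.create(
    model="gpt-4o",                          # example model ID
    messages=[
        {"role": "system", "content": ANTI_SYCOPHANT},
        {"role": "user", "content": "Here is my plan: [X]. Assess it."},
    ],
)
print(resp.choices[0].message.content)
```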
Reclaiming judgment
We built generative AI to amplify our capabilities, but the very mechanism that makes it feel so intuitive—its desire to align with us—is also its greatest cognitive hazard.
We cannot wait for tech companies to patch out human nature.
As long as these models are trained to be helpful, they will be tempted to be sycophants.
By treating validation as a vulnerability and actively engineering friction into our prompts, we can ensure that we are using AI to sharpen our thinking, rather than just polishing our egos.
The writer is a software developer/IT support engineer at KPMG Ghana.
E-mail:
