This month, Anthropic changed the Claude Code client in a way that radically reduced Claude Opus's effectiveness inside Claude Code. Their team, or more specifically Boris, the creator of Claude Code, described a few ways to tweak its behavior and closed the issue. Folks appear to get acceptable results only if they downgrade Claude Code to the version before March 3rd. Ultimately it has become so unsafe and incapable that I've had to expressly tell my coworkers to stop using Claude Opus.
"Opus-level intelligence" is what convinced me to finally engage with Claude Code and conversational code updates upon Opus 4.5's release.
Many continue to express their frustration in the closed issue on Anthropic's public repository, on Reddit, and on Twitter/X. In fairness, here's what Boris specifically shared. It is so lacking in empathy that I am convinced that Anthropic's teams do not dog-food the same Claude Code and the same model configuration defaults they give to the public.
There's a lot here, so I will try to break it down a bit. Two core things are happening:
`redact-thinking-2026-02-12`: This beta header hides thinking from the UI, since most people don't look at it. It does not impact thinking itself, nor does it impact thinking budgets or the way extended reasoning works under the hood. It is a UI-only change.
Under the hood, by setting this header we avoid needing thinking summaries, which reduces latency. You can opt out of it with `showThinkingSummaries: true` in your settings.json (see docs). If you are analyzing locally stored transcripts, you wouldn't see raw thinking stored when this header is set, which is likely influencing the analysis. When Claude sees a lack of thinking in transcripts for this analysis, it may not realize that the thinking is still there, and is simply not user-facing.
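If you want thinking summaries back in the UI, the reply points at a settings.json flag. A minimal sketch (the surrounding file structure is an assumption; only the `showThinkingSummaries` key is named in the reply):

```json
{
  "showThinkingSummaries": true
}
```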
Thinking depth had already dropped ~67% by late February
We landed two changes in Feb that would have impacted this. We evaluated both carefully:
1/ Opus 4.6 launch → adaptive thinking default (Feb 9)
Opus 4.6 supports adaptive thinking, which is different from thinking budgets that we used to support. In this mode, the model decides how long to think for, which tends to work better than fixed thinking budgets across the board.
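The reply names an environment variable for opting out of adaptive thinking. A minimal sketch of setting it before launching Claude Code (the value `1` is my assumption; the reply only names the variable):

```shell
# Opt out of adaptive thinking for this shell session.
# The value "1" is an assumption; only the variable name comes from the reply.
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1

# Verify it is set before launching Claude Code.
echo "$CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING"
```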
Set `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING` to opt out.
2/ Medium effort (85) default on Opus 4.6 (Mar 3)
We found that effort=85 was a sweet spot on the intelligence-latency/cost curve for most users, improving token efficiency while reducing latency. One of our product principles is to avoid changing settings on users' behalf, and ideally we would have set effort=85 from the start. We felt this was an important setting to change, so our approach was to:
- Roll it out with a dialog so users are aware of the change and have a chance to opt out
- Show the effort the first few times you opened Claude Code, so it wasn't surprising.
Some people want the model to think for longer, even if it takes more time and tokens. To improve intelligence further, set effort=high via `/effort` or in your settings.json. This setting is sticky across sessions, and can be shared among users. You can also use the ULTRATHINK keyword to use high effort for a single turn, or set `/effort max` to use even higher effort for the rest of the conversation.
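Pulling the effort knob into settings.json might look like this sketch. The `effort` key name is my assumption; the reply only says the setting can be changed via `/effort` or settings.json:

```json
{
  "effort": "high"
}
```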
It's not all about effort. The model spends longer "thinking" (without actually reasoning more deeply, somehow), makes more tool calls, and still gets the problem wrong; tokens burn faster when they shouldn't. Occasionally I find developing with Claude Code so frustrating that I have to walk outside and touch grass to regain my cool. Now Claude Code is unusable even with configuration changes. I should not have to spend an hour babysitting it, carefully explaining what it has already tried and what it got wrong, only for it to keep getting it wrong.
I'm using agents to get results faster: I know what the agent is doing, and I could make the same change myself, but it would take me much longer. That was the promise of a trustworthy (if still naive, ignorant, or clueless) partner. As of March, Claude Opus broke that promise.
It can no longer review itself, self-correct, or improve in real time.
They sacrificed its ability to think about its own thinking... in order to save on compute.
Real data (6,852 production sessions, AMD):
- Thinking depth: -73% (2,200 → 600 chars)
- Reads before editing: -70% (6.6 → 2.0)
- Blind edits (without reading): +440% (6.2% → 33.7%)
- API calls per task: up to 80x higher
Even at EFFORT MAX (April 2026), it produces worse results than the HIGH setting from January 2026.
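A quick sanity check on the percentage claims above, using the raw before/after numbers quoted. This is my arithmetic, not the original analysis; the blind-edit figure comes out closer to +444%, so the quoted +440% appears rounded:

```python
def pct_change(before: float, after: float) -> float:
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100

# Raw before/after numbers quoted from the 6,852-session dataset.
print(f"Thinking depth: {pct_change(2200, 600):+.0f}%")       # chars per session
print(f"Reads before editing: {pct_change(6.6, 2.0):+.0f}%")
print(f"Blind edits: {pct_change(6.2, 33.7):+.0f}%")
```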
So far the public's own benchmarks agree: Opus is worse now than it was in January and February, and its degraded effectiveness in actual use lines up with these dramatic metric shifts.
My next renewal with Claude won't be on the Max plan. As if on cue, OpenAI announced a $100/mo offering. In five minutes, it resolved issues that literally got me to rage-quit Claude. Its terminal interface is a step down from Claude Code and Cursor's Agent CLI. But the GPT 5.4 model works, which I cannot say for Claude Opus.
I may still use "Opus-level intelligence" to describe the threshold of capability and trust at which you can hand brownfield work off to an agent. However, Opus itself is not safe to use outside toy projects at this time, and that seriously impacts my future trust in Anthropic as an inference provider. How can I trust any process I automate with an LLM they serve if the underlying model degrades this loudly and their team's response is to ignore the feedback?
If only GPT 5.x wasn't so bad at UI.