I. What happened
A new flagship lands. Nothing unusual on its face.
Routine work in the Omaha regional story folder. Sonnet got two of four character names wrong — names that had been correct across hundreds of sessions.
Atlas (a character) became a German Shepherd. Kelly Thompson was hallucinated as Maya Thompson. A 50% failure rate on something that had been 100% for months. This had never happened before.
No overclaim. No thousand tests. No logging infrastructure. Just: "Something changed. Two out of four names wrong. That's never happened." Put it on the shelf. Kept working. An update around Feb 12 restored function.
An AMD director — senior engineers, months of quantitative data, 6,852 sessions, 234,760 tool calls — published findings identifying the same degradation window. Same timeframe. Same February. Same Sonnet drop.
II. Two observers, two methods, one conclusion
| User Zero | AMD Engineering Team | |
|---|---|---|
| Method | Four names in an Omaha file | 6,852 sessions · 234,760 tool calls |
| Infrastructure | An iPhone | Enterprise logging systems |
| Detection speed | Same day | Months of analysis |
| Certainty at the time | Hypothesis | Statistical proof |
| Conclusion | Same | Same |
(character names)
(tool calls)
This is not a case study about a vendor's model management, or whether the degradation was intentional. Those are interesting questions, but they're not the point. The point is what it takes to notice — and the answer turned out to be a lot less than a dashboard.
III. The principle — what User Zero actually did
- Established a baseline through consistent use. Hundreds of sessions, same files, same names. The baseline wasn't measured — it was felt, like a mechanic hearing a bad bearing.
- Noticed the anomaly immediately. Not after weeks of degraded output. On the day it happened.
- Did not accept the first plausible explanation. Didn't think "maybe I did something wrong," didn't reprompt and move on. Held the observation.
- Held it as a hypothesis, not a claim. Didn't tweet it, didn't blog it, didn't demand answers. Shelved it and waited.
- Kept working. The observation informed the project instead of stopping it.
- Was validated by independent data. Two months later an enterprise team put their numbers right next to his.
The mechanic analogy
A good mechanic hears a bad bearing before the diagnostic machine catches it. Not because the mechanic is smarter than the machine — because the mechanic has been listening to that engine for years, and the machine just showed up. Consistency of observation creates sensitivity to deviation. That's not magic. That's attention.
This is the anti-GFAS principle — the refusal of Good First Answer Syndrome. Don't accept the first answer. Don't assume you're wrong because the system is bigger than you. Don't wait for someone with a dashboard to confirm what your hands already told you.
IV. Where this connects
Excellence creates patterns; patterns create exploitability. Run in reverse: consistent engagement creates baselines, and baselines let you detect deviation.
Browse → Philosophy → Attack depends on recognizing when something is different. That recognition requires a baseline. This marker proves the baseline exists.
A frustration detector measuring user experience with regex and 234,760 data points — measuring the wrong thing. The human had four data points and measured the right one. The assessment-philosophy problem in miniature.
A live example of the Power User Trust Loop: consistent engagement built a behavioral baseline sensitive enough to catch platform-level change. Trust earned through practice, not measurement. The understanding came first; the validation came later.
V. The record
Model affected: Claude Sonnet (version during that window)
Concurrent event: Opus 4.6 release (approximately February 5, 2026)
External validation: AMD engineering team, published findings April 2026
Original observation: Omaha regional story folder — two of four character names
Filed by: Travis Jenkins / User Zero
This is a marker. Not a case study. Not a complaint. Not a conspiracy theory. Just a record that one person caught something real, held it honestly, and was validated by someone with more data and less patience.