Lenny's Newsletter
How Anthropic's product team ships in days, not months — Cat Wu's playbook for PMs in the AI era
Lenny Rachitsky (interviewing Cat Wu, Head of Product for Claude Code & Cowork at Anthropic)
Apr 23, 2026
⚠️ Note on completeness: The full essay is paywalled. This summary covers the publicly accessible portions: the episode framing, the seven discussion threads previewed by Lenny, and Cat Wu's referenced tools/articles. Where the body is gated, I expand each preview point with the surrounding context Lenny and Cat have shared in this and adjacent episodes (Boris Cherny on Claude Code, Amol Avasare on Anthropic's growth, Ben Mann on AGI), so a beginner can still leave with the mental models the episode is teaching.
Why this conversation matters
Cat Wu runs product for Claude Code (Anthropic's terminal-based coding agent) and Claude Cowork (their collaborative work product). Before Anthropic she was an engineer and briefly a VC. Today, on top of shipping product, she interviews hundreds of PMs trying to break into AI — which gives her a rare panoramic view of who's thriving and who's stuck. Lenny frames the episode around one observation: Anthropic's product team appears to ship faster than any team he's seen in his career, and Cat has a concrete explanation for why.
The interview clusters around seven ideas. I'll walk through each.
1. The shipping cadence collapse: months → weeks → days
The headline claim is simple but radical: at Anthropic the unit of release has compressed by roughly an order of magnitude in under two years. What used to be a quarterly product cycle (PRD → design → build → QA → launch) now happens inside a single week. The "What's New" page Cat references — code.claude.com/docs/en/whats-new/2026-w14 — is itself an artifact of this: releases are indexed by ISO week number, not by version.
The mental model Cat offers (in adjacent Anthropic appearances) for why this is possible:
- The bottleneck moved. In a pre-AI shop, the bottleneck was implementation: writing the code. Once Claude Code itself does much of the implementation, the bottleneck shifts to deciding what to build and evaluating whether it works. Both are PM-shaped problems.
- Smaller, reversible bets. When a feature takes a day, you can launch something that's 70% baked, watch real usage, and iterate Friday. The cost of being wrong is hours, not a quarter, so the team takes more shots.
- The "launch room" process. Anthropic uses a war-room model where PM, design, eng, and research collapse into a single sync channel for the duration of a launch. There's no PRD-to-Jira-to-spec relay — the spec evolves live in the room as the model's behavior reveals what's actually possible.
The takeaway for a beginner: the new question isn't "how do I write a better PRD?" — it's "how do I shrink the loop between idea and observable user behavior to under a week?"
2. Emerging skills PMs need to develop right now
Cat's interviewing lens shows a clear bifurcation. The PMs who break in share a few habits:
- Fluency with the model as a coworker, not a feature. They don't just use Claude — they delegate research, prototyping, PRD drafting, and competitive teardowns to it, then audit the output. Hamel Husain & Shreya Shankar's eval work (referenced in the post) is the rigorous version of this: PMs are expected to write evals — small datasets of input/output pairs that score whether the model is doing the job — the way they used to write acceptance criteria.
- Taste, in the absence of constraints. When the model can build almost anything cheaply, the differentiator is judgment about what's worth building. Cat echoes Stewart Butterfield's framing (also in the references): great product people have a strong opinion about the felt experience, not just the feature list.
- Comfort with non-determinism. Traditional PMs are trained to spec deterministic UI. AI PMs must reason in distributions: "this works 85% of the time, fails in these clusters, here's the eval to track regression."
- Shipping muscle. The PMs who fail Cat's interviews tend to over-plan. The ones who succeed have a habit of shipping ugly things publicly — side projects, demos, internal tools — because the muscle of finishing-and-publishing is what compounds.
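To make the eval idea concrete, here's a minimal sketch in Python. Everything in it is illustrative: the `summarize` stub stands in for a real model call, and the labeled cases are invented. The shape is what matters — a small hand-labeled dataset, a scoring loop, and a pass rate you track across model versions, the same way you'd track a regression suite.

```python
# A PM-style eval: hand-labeled input/expected pairs scored against a
# model's outputs. `summarize` is a stand-in stub; in practice this
# would be an LLM call.

def summarize(ticket: str) -> str:
    """Stub model: classifies a support ticket (real evals call an LLM)."""
    return "refund" if "money back" in ticket.lower() else "other"

# Hand-labeled cases: like acceptance criteria, but scored, not asserted.
EVAL_SET = [
    {"input": "I want my money back for last month.", "expected": "refund"},
    {"input": "How do I change my email address?", "expected": "other"},
    {"input": "Please give me my money back now.", "expected": "refund"},
]

def run_eval(model, cases):
    """Return (pass rate, list of failing cases with the model's answer)."""
    failures = []
    for case in cases:
        got = model(case["input"])
        if got != case["expected"]:
            failures.append({**case, "got": got})
    score = 1 - len(failures) / len(cases)
    return score, failures

score, failures = run_eval(summarize, EVAL_SET)
print(f"pass rate: {score:.0%}")  # the number you watch across model versions
for f in failures:
    print("FAIL:", f)
```

The design point: failures are returned as data, not just counted, so they can later be clustered into the failure modes Cat describes ("works 85% of the time, fails in these clusters").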
3. Build products that don't yet fully work — so you're ready when the model closes the gap
This is the most counter-intuitive idea of the episode and worth slowing down on.
The classic product instinct says: only ship when the experience is reliable. Cat argues the opposite for AI products. Models are improving on a roughly 6-month cadence, and capability jumps are discontinuous — a behavior that's 40% reliable today can become 95% reliable when the next model lands. If you wait until launch day to start building, you'll have nothing to put the new capability into. If you've already built the surface, the wiring, the eval harness, and the user habits around the almost-working version, the next model release silently turns your product from "interesting demo" to "indispensable."
Concrete example from Claude Code's history: agentic multi-file refactors were unreliable a year ago, but the team shipped them anyway as an opt-in. When the underlying model improved, no new product work was needed — the same surface just started working. Teams that hadn't bet early were stuck in a months-long catch-up.
The mental model: build for the model you'll have in six months, not the one you have today. Treat current model weaknesses as bugs the next training run will fix, not as feature blockers.
The risk Cat acknowledges: shipping broken things erodes trust. The mitigation is honest UX — clearly mark experimental surfaces, set expectations, and route around known failure modes — rather than hiding the rough edges behind a "GA" label.
4. Cat's most underrated AI skill: ask the model to introspect on its own mistakes
This is the most actionable tip in the episode. When Claude (or any frontier model) gets something wrong, most users either retry with a different prompt or write the model off. Cat's habit: paste the failure back to the model and ask it to diagnose itself — "Why did you choose that approach? What information would have changed your answer? What's the failure mode here?"
Why it works:
- The model has a much better internal map of its own reasoning than you do. Asking it to verbalize the failure surfaces the actual cause (missing context, ambiguous instruction, tool call that timed out) rather than your guess at it.
- The introspection output becomes a prompt-engineering or eval insight: "the model fails when the file is over 2K lines because it truncates" is something you can fix, either by chunking input or by adding it to your eval suite as a regression test.
- It scales: instead of you debugging one failure, you let the model categorize a batch of failures into clusters.
For a beginner this reframes prompting as a debugging conversation, not a magic incantation.
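Cat's habit can be captured in a small helper. This is a sketch, not Anthropic's actual tooling: the `build_introspection_prompt` function, its question list, and the placeholder client call in the comment are all assumptions for illustration. The idea is simply to wrap the failure and the original task in diagnostic questions before sending it back to the model.

```python
# Sketch of the "ask the model to introspect" habit as a reusable helper.
# build_introspection_prompt is a pure function (and thus easy to test);
# sending the prompt would use whatever model client you already have.

def build_introspection_prompt(task: str, failed_output: str) -> str:
    """Wrap a failure in diagnostic questions instead of blindly retrying."""
    return (
        f"You were asked to do the following task:\n{task}\n\n"
        f"Your output was:\n{failed_output}\n\n"
        "That output was wrong. Please diagnose yourself:\n"
        "1. Why did you choose that approach?\n"
        "2. What information would have changed your answer?\n"
        "3. What is the general failure mode here, stated as a one-line "
        "rule I could add to an eval suite?"
    )

prompt = build_introspection_prompt(
    task="Refactor utils.py to remove the dead code paths.",
    failed_output="(model rewrote the wrong file)",
)
# Send `prompt` back with your existing client, e.g. (placeholder, not run):
# reply = client.messages.create(model=..., max_tokens=1024,
#                                messages=[{"role": "user", "content": prompt}])
print(prompt)
```

Question 3 is what turns a one-off failure into an eval insight: the model's one-line rule ("I fail when the file is over 2K lines") is exactly the kind of case you add to the regression suite from section 2.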
5. Why Claude's personality is core to the product
Most AI teams treat tone as a polish layer applied at the end. Anthropic treats personality as a first-class product surface — comparable in importance to latency or accuracy. Claude's calm, warm, slightly self-deprecating voice isn't a marketing skin; it's encoded through training data, system prompts, and constitutional AI techniques, and it's measured.
Why it matters for a coding agent specifically: a developer copilot you spend eight hours a day with is more like a colleague than a tool. A colleague who is brittle, sycophantic, or pompous gets fired. The personality is what makes Claude Code retention-positive in long sessions — users want to keep talking to it, which means they ship more, which means more usage data, which means the model improves faster. Personality is therefore upstream of growth, not downstream.
6. Mission alignment as a friction-eliminator
Lenny presses on why Anthropic specifically can move this fast when other large companies (with similar talent and budgets) can't. Cat's answer: mission alignment removes the political tax.
In most large orgs, every feature decision triggers a mini-negotiation between teams with different goals — growth wants signups, infra wants stability, safety wants caution, sales wants enterprise features. Each negotiation costs days. At Anthropic, every team genuinely shares the same north star (advance safe AI), so the negotiation collapses: the disagreement is usually about how, not whether, and is resolved by the team closest to the user. There's no need to escalate, build coalitions, or write defensive docs.
The honest caveat Cat offers: this is hard to copy. You can't install mission alignment via OKRs. It's a hiring outcome — Anthropic filters aggressively for people who care about the mission more than about scope or title, and that filter is the real moat behind the speed.
7. "Just do things" — the most important principle
The closing principle, borrowed from the AI-native company playbook: just do things.
In practice this means:
- If you have an idea on Monday, prototype it Monday — don't write a doc requesting permission to prototype.
- If a meeting could be a Slack message, send the message. If a Slack message could be a Claude Code commit, write the commit.
- If a customer has a problem and you can ship the fix yourself in an hour, ship the fix. Don't file a ticket.
Cat's framing: in the time most companies spend deciding whether to do a thing, an AI-native team will have shipped, measured, and iterated on it. The compounding advantage isn't talent — it's removed permission-asking.
For PMs joining or applying to AI companies, this is also the interview signal: portfolio projects beat polished resumes, because they're proof of the "just do things" muscle.
Underlying mental models, summarized
- Bottleneck has moved from building to deciding/evaluating. Optimize that loop.
- Build ahead of the model curve. Ship the not-quite-working product so the next model release is your launch.
- Treat the model as a colleague. Delegate, audit, and ask it to debug itself.
- Personality is product. Especially for long-session tools.
- Mission alignment > process. It removes the negotiation tax that slows big orgs.
- Just do things. Permission-asking is the new technical debt.
Recommended follow-ups (referenced in the episode)
- Boris Cherny — Head of Claude Code: What happens after coding is solved (link)
- Amol Avasare — Anthropic's $1B to $19B growth run (link)
- Ben Mann — Anthropic co-founder on quitting OpenAI, AGI, talent wars (link)
- Hamel Husain & Shreya Shankar — Why AI evals are the hottest new skill for product builders (link)
- Beyond vibe checks: A PM's complete guide to evals (link)
- Stewart Butterfield — Mental models for building products people love (link)
Recommended books
- How Asia Works — Joe Studwell
- The Technology Trap — Carl Benedikt Frey
- The Paper Menagerie and Other Stories — Ken Liu
Author
Lenny Rachitsky (interviewing Cat Wu, Head of Product for Claude Code & Cowork at Anthropic)