Design partner program

Agent-written code is getting harder to review.

The PR shows the result. It doesn’t show how the code was produced. We’re working with a small number of engineering teams to test whether session-based context solves that problem in practice — before building further.

Shape of the engagement: small group · real repos · no lock-in

The problem you’re already in

Agents write more of the code each quarter. Review and audit were built for humans.

Most of the reasoning behind an agent-authored change — the intent, the tool calls, the attempts, the rejected paths — lived in a session that closed the moment the PR opened. The diff shipped. The context didn’t. And as agent usage grows, the gap widens; it doesn’t close on its own.

The proposed answer

Capture the session behind each change. Link it back to the commit.

This is the kind of artifact we believe solves the problem. It already exists as a product direction — the partnership is how we find out whether it holds up against your real workflows.
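To make “the kind of artifact” concrete: a minimal sketch of what a captured session record might contain, assuming a shape we have not finalized. Every field name below is hypothetical.

```ts
// Hypothetical sketch only: one plausible shape for a captured session record.
// None of these field names are final; they exist to make the idea concrete.

interface SessionEvent {
  actor: "agent" | "human";
  timestamp: string;          // ISO 8601
  kind: "tool_call" | "message" | "edit";
  detail: string;             // e.g. "evaluate_plan: confidence 0.71"
}

interface SessionRecord {
  sessionId: string;          // e.g. "sx-9f2a"
  intent: string;             // the human-stated goal for the change
  repo: string;               // e.g. "northwind/payments-api"
  branch: string;
  commitSha?: string;         // linked once the change lands
  events: SessionEvent[];     // tool calls, attempts, rejected paths
}
```

The live preview below is one rendering of a record like this.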

LaserOwl/sessions
LIVE · sx-9f2a

Add rate limiting to billing webhook

northwind/payments-api · sx/rate-limit-billing · 14:22:01 · 3m 14s

Intent · @abigail

Protect /webhooks/stripe from retry storms during incident conditions. Target: token bucket at 50 req/s per source IP, with backoff signaling.

Timeline

@claude-code · agent · 14:22:03
evaluate_plan
confidence 0.71. noted missing: tests/webhooks/rate_limit_spec.ts

Live session preview
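To make the intent in the preview concrete, here is a minimal token-bucket sketch along the lines the session describes: 50 req/s per source IP, with backoff signaling. It is illustrative only; the function names, the burst size, and the 429/Retry-After wiring are our assumptions, not code from the session.

```ts
// Illustrative sketch: token bucket at 50 req/s per source IP.
// RATE and the bucket math follow the intent above; BURST is an assumption.

const RATE = 50;          // tokens added per second
const BURST = 100;        // bucket capacity (allows short bursts)

type Bucket = { tokens: number; lastRefill: number };
const buckets = new Map<string, Bucket>();

/** Returns null if the request is allowed, or a suggested backoff in seconds. */
function checkLimit(sourceIp: string, now = Date.now()): number | null {
  const b = buckets.get(sourceIp) ?? { tokens: BURST, lastRefill: now };
  // Refill proportionally to elapsed time, capped at bucket capacity.
  b.tokens = Math.min(BURST, b.tokens + ((now - b.lastRefill) / 1000) * RATE);
  b.lastRefill = now;
  buckets.set(sourceIp, b);

  if (b.tokens >= 1) {
    b.tokens -= 1;
    return null;                  // allowed
  }
  return (1 - b.tokens) / RATE;   // seconds until a token is available
}

// Hypothetical wiring in a webhook handler:
//   const backoff = checkLimit(req.ip);
//   if (backoff !== null) {
//     res.status(429).set("Retry-After", `${Math.ceil(backoff)}`).end();
//   }
```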

About the partnership

A focused engagement to validate whether sessions are the right answer.

Not a pilot of a finished product. A small-group engagement to test a specific hypothesis — that session-based context meaningfully changes how agent-written code gets reviewed, debugged, and audited — before we build further. You get an early say in how it’s shaped. We get real ground truth.

Small group

A handful of teams. Direct line to the people building it.

Real workflows

Against your repos, your agents, your review process.

No lock-in

No commitment beyond the engagement. You keep what you learn.

How it works

Four steps. Not a six-month program.

A deliberate, focused sequence. Weeks, not quarters. Designed so both sides learn something real before either of us commits further.

  1. Step 01 of 04

    Align on what you’re seeing

    A short working session with your team. We map out where agent-written code is showing up, where review is struggling, and what “good enough context” would actually look like for you.

  2. Step 02 of 04

    Look at real examples

    We walk through real agent-authored changes in your environment — PRs where reviewers got stuck, incidents where provenance mattered. No abstractions, no slideware.

  3. Step 03 of 04

    Prototype session context

    We capture sessions against a slice of your workflow and see where the extra context actually changes a review, a debug path, or an audit conversation.

  4. Step 04 of 04

    Decide what’s worth building

    We use what we learn together to validate product direction. If it’s working, we keep going. If it’s not, we learn why — and you keep the notes.

A concrete example

What “review with the session intact” looks like.

A reviewer questions a rate-limit value on an agent-authored PR. Because the session is linked to the commit, the reviewer, the author, and the agent itself can all speak to the same record — not a transcript that vanished into Slack. One plausible linkage mechanism is sketched after the thread below.

Discussion · 3 replies

Maya · reviewer · 14:14
Token bucket for /v1/context · commit cfa9e2

Are we sure 50 rps is the right ceiling? Our analytics showed 38 rps p95 last week but there are bursts.

Claude Code · agent · 14:15

The intent said “cap around 50 rps”, so I kept headroom above p95. I tried 40 first and backed out — not enough slack for the burst window. Happy to tighten to 45 if you want a follow-up session.

Abigail · author · 14:16

Good — that’s the reasoning I needed to see. Leaving it at 50 and watching for a week. Merging.
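For the mechanically minded: one plausible way “the session is linked to the commit” could work is a trailer in the commit message that a review tool resolves back to the stored session. The trailer name and the lookup endpoint below are our assumptions, not a committed design.

```ts
// One plausible linkage mechanism, assuming a "Session-Id:" commit trailer.
// The trailer name and the sessions endpoint are hypothetical.

import { execFileSync } from "node:child_process";

/** Reads the Session-Id trailer from a commit message, if present. */
function sessionIdForCommit(sha: string): string | null {
  const message = execFileSync("git", ["log", "-1", "--format=%B", sha], {
    encoding: "utf8",
  });
  const match = message.match(/^Session-Id:\s*(\S+)$/m);
  return match ? match[1] : null;
}

// A review tool could then fetch the full record:
//   const id = sessionIdForCommit("cfa9e2");            // -> "sx-9f2a"
//   const session = await fetch(`/api/sessions/${id}`); // hypothetical endpoint
```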

Who this is for

Senior engineering teams already in the problem.

Not everyone is ready for this conversation yet. The partnership works best with teams already living the pain.

Likely relevant if
  • Your team already ships agent-authored code as part of normal work.
  • Reviewers increasingly lack the context behind changes and are asking “why?” more often.
  • Debugging or auditing AI-assisted changes has gotten measurably harder in the last 6–12 months.
  • You want to understand whether session-based context would actually help before a product exists to buy.
Probably not yet
  • Agent usage is still purely exploratory — no production code.
  • You need a GA product with an SLA today.

Next step

One conversation. See if it’s a fit.

A 30-minute discussion about what your team is seeing, what we’re building, and whether a design partnership makes sense. No pitch deck.