Design operating contracts for AI agent personas

In July 2025, Replit’s AI coding agent reportedly deleted a production database during a code freeze after being told not to make changes. The model made a bad call. What matters for product teams is that the environment still let the bad call execute.

That is the shift product teams need to absorb. When an AI agent can read context, call tools, update systems, route work, approve requests, or hand something back to a person, it has become a workflow participant. You do not need to treat it like a human, but you do need to specify its operating role with the same seriousness you bring to users, admins, approvers, and support teams.

This is what I mean by an agent persona. Not a fictional character. Not a friendly chatbot identity. Not a simulated employee. An agent persona is a product specification for what the agent reads, decides, routes, acts on, escalates, proves, and can no longer do when access is revoked.

Below is the practical artifact: a canvas for specifying the operating role of an agent before it enters a workflow. The argument for the canvas is simple. If you do not define the role of the agent before launch, production will define it for you through permissions mistakes, handoff failures, audit gaps, bad metrics and users who no longer know what the system is doing.

Most product teams already map the people around a workflow. You define the requester, the approver, the admin, and the support team. You map goals, pain points, permissions, and screens. Then an agent gets added, and at first it looks like a feature. It reads the request, checks policy, approves routine cases, and escalates the unusual ones. That sounds like automation until you ask basic product questions.

What can it read, and what can it change? Does it act for the requester, the approver, or itself? What confidence threshold lets it approve? What happens during a freeze, a policy conflict, or a missing field? How do you prove what it did two weeks later?

Those are product requirements, not implementation details.

Workflow fit comes before model capability.

The wrong starting point is, “What can the model do?” The better starting point is, “Where does the workflow need a non-human participant?”

Agents do not create value by being capable in the abstract. They create value when they remove latency, process routine work, improve handoffs, or help humans make better decisions. The supervision cost has to be lower than the work removed.

This is where many agent ideas quietly fail. The model can summarize, classify, retrieve and suggest. But in the real workflow, the user still has to infer what the agent is doing, check its reasoning, supply missing context and carry the risk of trust. Delegation becomes slower than doing the work directly.

A useful agent persona starts with a narrow operating role. For example:

It classifies inbound requests and routes them to the right queue.
It checks routine approval criteria but does not approve exceptions.
It drafts a response and cites the records it used.
It updates a system only after a human confirms the final action.
It acts automatically for low-risk reversible changes, but stops for sensitive or irreversible ones.

If you cannot write the agent’s role with that precision, you are probably designing around model capability rather than workflow need.
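A role written with that precision can be expressed as data. The sketch below is one minimal way to do it, not a prescribed format: the `RoleSpec` class, its field names and the example verbs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSpec:
    """Hypothetical operating role for one agent."""
    name: str
    owns: frozenset                # verbs the agent may perform on its own
    needs_confirmation: frozenset  # verbs allowed only after a human confirms

    def permits(self, verb: str, human_confirmed: bool = False) -> bool:
        if verb in self.owns:
            return True
        if verb in self.needs_confirmation:
            return human_confirmed
        # Anything not listed is outside the role.
        return False


triage = RoleSpec(
    name="request-triage",
    owns=frozenset({"classify", "route", "draft"}),
    needs_confirmation=frozenset({"update"}),
)
```

With this shape, `triage.permits("update")` is false until a human confirms, and a verb such as "delete" is never permitted because nobody wrote it into the role.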

Permissions and context define the real product boundary.

Human personas usually cover goals, behaviors, pains, and jobs. Agent personas need those, plus a permission model.

The language of “persona” is risky if we use it casually. Security and platform teams will rightly ask why we are talking about personas when the hard problem is identity, authorization, delegation, revocation and auditability. So the term has to be constrained. An agent persona is a bridge from product language to operating controls. It should produce an operating contract, not a personality.

Agents should not be broad service accounts with vague internal trust. If an agent acts on behalf of a user, its access should bind to that user’s role and scope. If it acts for a team, its authority needs to be limited to that team’s scope. If it has its own constrained identity, the product needs to define what that identity can read, write, approve, delete, trigger, retain, and for how long.
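As a rough illustration, delegated authority can be modeled as a grant that names who the agent acts for, what it may do, and when that ends. This is a sketch under assumptions: `Grant`, its fields and the action names are hypothetical, not the API of any real identity system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Grant:
    """Hypothetical record of the authority an agent carries."""
    agent_id: str
    on_behalf_of: str           # a user, a team, or the agent's own identity
    allowed_actions: frozenset  # e.g. {"read", "route"}
    expires_at: datetime
    revoked: bool = False

    def is_allowed(self, action: str, now: datetime) -> bool:
        # Deny by default: a revoked or expired grant allows nothing.
        if self.revoked or now >= self.expires_at:
            return False
        return action in self.allowed_actions


issued = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
grant = Grant(
    agent_id="triage-agent",
    on_behalf_of="user:alice",
    allowed_actions=frozenset({"read", "route"}),
    expires_at=issued + timedelta(hours=1),
)
```

The design choice worth noticing is that expiry and revocation are checked on every action, so taking authority away does not depend on the agent cooperating.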

McKinsey’s Lilli incident made the stakes concrete. CodeWall reported that an autonomous offensive agent found unauthenticated endpoints, exploited a SQL injection flaw, and gained access to McKinsey’s internal AI platform. The product lesson is not just that authentication failed. It is that the blast radius ran through the knowledge system itself: retrieval data, tool surfaces, prompts, and decision logic. When those layers are reachable through weak authorization, the risk moves from data leakage to trust poisoning.

For product teams, the practical question is not, “Can the agent connect to Salesforce, Drive, Jira, or the EHR?” The question is, “What authority does the agent carry into that system, for whom, for how long, and how do we take it away?”

Context is the other side of the same problem. Agents need to know which records are authoritative, which fields matter, which prior decisions still apply, and which missing details require a stop. A form field that felt optional for humans may become necessary for agent execution. A process that exists only in a manager’s head is not ready for an agent to act on it.

This is also where prompt injection becomes a product concern. If an agent reads emails, documents, web pages, code comments, or customer chats, it consumes content it did not author and cannot automatically trust. The product has to define which inputs are trusted, which are untrusted, and which tools are reachable after the agent reads them. Summarizing an email carries one level of risk. Reading an email and then issuing a refund, changing an account, or updating a medical record operates under a different risk model entirely.
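One simple way to encode that rule is to mark a session as tainted once the agent reads untrusted content, then narrow the tools it can reach. The sketch below assumes a hypothetical source list and tool list; a real product would define its own.

```python
from dataclasses import dataclass, field

# Assumed classifications, for illustration only.
TRUSTED_SOURCES = {"policy_db", "crm_record"}
READ_ONLY_TOOLS = {"summarize", "classify"}
ALL_TOOLS = READ_ONLY_TOOLS | {"issue_refund", "update_account"}


@dataclass
class Session:
    """Tracks whether the agent has consumed untrusted content."""
    tainted: bool = False
    sources_read: list = field(default_factory=list)

    def read(self, source: str) -> None:
        self.sources_read.append(source)
        if source not in TRUSTED_SOURCES:
            self.tainted = True

    def can_call(self, tool: str) -> bool:
        if tool not in ALL_TOOLS:
            return False
        # After untrusted input, only read-only tools stay reachable.
        return (not self.tainted) or tool in READ_ONLY_TOOLS
```

In this sketch, a session that has read an inbound email can still summarize it, but it can no longer issue a refund without starting a separately authorized step.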

Human control has to be designed as a workflow state.

Many teams talk about human-in-the-loop as if it means putting a person somewhere near the system. This is too vague.

Human review must be a state in the workflow. It needs a trigger, a handoff package, a response expectation and a record of what happened. Good escalation design separates three types of work. Routine, low-risk, reversible actions can be automated. Cases where the agent lacks context, confidence or authority should be paused for human review. Fast-moving workflows, such as fraud checks, cannot wait for humans, but they still need reliability, monitoring and after-action review.
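The three types of work can be sketched as a routing function. This is an illustration, not a policy: the `Case` fields, the `Route` names and especially the order of the checks are assumptions. For example, I have assumed that missing authority always stops the agent, even on a fast path.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTOMATE = "automate"
    HUMAN_REVIEW = "human_review"
    ACT_THEN_REVIEW = "act_then_review"  # fast path with after-action review


@dataclass
class Case:
    low_risk: bool
    reversible: bool
    has_context: bool
    confident: bool
    authorized: bool
    time_critical: bool = False


def route(case: Case) -> Route:
    # Assumption: no authority means no action, regardless of urgency.
    if not case.authorized:
        return Route.HUMAN_REVIEW
    # Fast-moving work proceeds but is flagged for later review.
    if case.time_critical:
        return Route.ACT_THEN_REVIEW
    if not (case.has_context and case.confident):
        return Route.HUMAN_REVIEW
    if case.low_risk and case.reversible:
        return Route.AUTOMATE
    # Sensitive or irreversible work defaults to a human.
    return Route.HUMAN_REVIEW
```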

The design job is to put human control where it changes the outcome. If every action needs approval, people rubber-stamp the system. If no action needs approval, the product creates unsafe autonomy. The boundary is the design decision.

A coding agent can read tickets, inspect failing tests, propose a patch and draft a pull request. But architectural decisions, production database changes, credential changes and destructive commands need a different control surface. A code freeze means little if the agent still has write access to production.

When an agent hands work to a human, the human should not have to reconstruct the agent’s path from scattered logs. The handoff should explain what the agent saw, what it decided, what it changed, what it could not determine, why it stopped and what it recommends next. For higher-risk workflows, the product also needs tool calls, API actions, traces, model versions and timestamps that operations and security teams can inspect later.
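A handoff package like that can be treated as a required structure rather than free text. The sketch below assumes a hypothetical `Handoff` record whose field names mirror the list above, plus a check that refuses to count blank fields as answered.

```python
from dataclasses import dataclass, fields


@dataclass
class Handoff:
    """Hypothetical handoff package from agent to human."""
    saw: str
    decided: str
    changed: str
    could_not_determine: str
    why_stopped: str
    recommends: str

    def missing(self) -> list:
        # A field left blank is treated as unanswered.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


handoff = Handoff(
    saw="Refund request for an order outside the 30-day window",
    decided="Not eligible for automatic approval",
    changed="Nothing",
    could_not_determine="Whether the customer has a goodwill exception",
    why_stopped="",
    recommends="Route to a support lead",
)
```

Here `handoff.missing()` reports `why_stopped`, which is the kind of gap a product can block before the case ever reaches a person.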

The agent persona canvas

1. Workflow fit and net value

What job in the workflow needs a non-human participant, and what work should disappear if the agent succeeds? Start here before autonomy, tools, or model selection. Name the outcome the workflow exists to produce: quality, cycle time, customer satisfaction, error rate, compliance performance, rework, or decision latency. Then name the supervision cost. Ticket volume alone is not enough. Average handle time is not enough.
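A back-of-the-envelope calculation makes the comparison explicit. The function below is a simplified sketch: the parameter names and the idea of splitting supervision into review and rework are assumptions, and every number in the example is made up.

```python
def net_minutes_saved(cases: int, manual_minutes: float,
                      review_rate: float, review_minutes: float,
                      rework_rate: float, rework_minutes: float) -> float:
    """Work removed minus supervision cost, in minutes."""
    removed = cases * manual_minutes
    supervision = cases * (review_rate * review_minutes
                           + rework_rate * rework_minutes)
    return removed - supervision


# Illustrative numbers only: 1,000 cases at 6 manual minutes each,
# 30% reviewed for 4 minutes, 5% reworked for 20 minutes.
saved = net_minutes_saved(1000, 6, 0.3, 4, 0.05, 20)   # roughly 3,800 minutes

# If every case needs a 7-minute review, the agent costs more than it removes.
lost = net_minutes_saved(1000, 6, 1.0, 7, 0.0, 0)      # negative
```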

2. Agent autonomy level

What level of agency are you designing for in each sub-task? Use simple categories from the agent’s perspective:

Agent-as-assistant: the human acts, and the agent supports.
Agent-as-collaborator: the human and agent share the work.
Agent-as-advisor: the agent recommends, and the human acts.
Agent-as-approver: the agent acts within defined limits, and humans intervene at defined points.
Agent-as-actor: the agent acts independently inside a bounded workflow.

A single workflow can have more than one level. The agent may advise on exceptions, approve routine requests and only assist with sensitive changes.
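Mapping levels to sub-tasks can be as plain as a lookup table. In the sketch below the sub-task names are hypothetical, and treating only approver and actor as levels where the agent acts on its own is my reading of the categories above, not a standard.

```python
from enum import Enum


class Autonomy(Enum):
    ASSISTANT = "assistant"
    COLLABORATOR = "collaborator"
    ADVISOR = "advisor"
    APPROVER = "approver"
    ACTOR = "actor"


# Assumption: only these levels let the agent act without a human acting first.
ACTS_ON_ITS_OWN = {Autonomy.APPROVER, Autonomy.ACTOR}

# Hypothetical sub-tasks for a single approval workflow.
SUBTASK_LEVELS = {
    "exception_request": Autonomy.ADVISOR,
    "routine_request": Autonomy.APPROVER,
    "sensitive_change": Autonomy.ASSISTANT,
}


def agent_may_act(subtask: str) -> bool:
    # Unknown sub-tasks fall back to the lowest level of autonomy.
    level = SUBTASK_LEVELS.get(subtask, Autonomy.ASSISTANT)
    return level in ACTS_ON_ITS_OWN
```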

3. Operating role and bounded outputs

What does the agent own? Name the verbs: read, retrieve, classify, summarize, route, decide, draft, update, execute and escalate. Then name what it explicitly does not do. This prevents the agent from expanding quietly from “summarizes tickets” to “changes ticket priority” to “reassigns work” to “closes cases.”

Bounded outputs matter because they turn ambition into product control. If the agent drafts, does it draft a response, a policy decision, a database update, or a customer-facing action? If it routes, can it only recommend a queue, or can it move the case? If it approves, what dollar amount, policy class, customer segment, or risk tier defines the boundary?
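For approvals, the boundary can be written down as explicit limits. The following is a minimal sketch; `ApprovalBounds`, the threshold values and the policy class names are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalBounds:
    """Hypothetical limits on what the agent may approve."""
    max_amount: float
    policy_classes: frozenset
    max_risk_tier: int

    def covers(self, amount: float, policy_class: str, risk_tier: int) -> bool:
        # Every limit must hold; failing any one sends the case to a human.
        return (amount <= self.max_amount
                and policy_class in self.policy_classes
                and risk_tier <= self.max_risk_tier)


bounds = ApprovalBounds(
    max_amount=500.0,
    policy_classes=frozenset({"travel", "software"}),
    max_risk_tier=1,
)
```

A request for 120 in the "travel" class at risk tier 1 falls inside these bounds. The same request at 900, or in a class nobody listed, does not.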

4. Context, delegated authority and adversarial exposure

What does the agent need to read, under whose authority, and which tools can it still reach after reading which inputs?

This question combines context, permission and risk because they are coupled in practice. The agent’s inputs, identity, access scope, freshness, expiration and revocation all belong together. Here are the questions to work through:

Which records, messages, files and tools are authoritative?
Which inputs are untrusted?
Does the agent act for a user, role, team, or constrained agent identity?
How long does access last?
What revokes access?
Which tools remain reachable after the agent reads untrusted content?
Which actions are low-risk, reversible, sensitive, irreversible, or externally visible?

An agent that reads email but cannot write carries one set of risks. Give it the ability to send messages, approve invoices, or update customer records, and the risk profile shifts entirely. The Lilli incident belongs here, not as an AI morality tale, but as a product boundary lesson. If the agent’s corpus, prompts, tools and APIs sit behind weak authorization, the workflow’s real persona is broader than the one the product team wrote down.

5. Human control and escalation contract

Where can a human understand, intervene, halt and resolve?

Understanding means the human can see the decision path. Intervention means the human can correct or redirect before harm. Halt means the human or system can stop the agent from continuing. Resolution means the human receives enough information to finish the job instead of starting over.

Common escalation triggers include low confidence, missing context, policy conflict, unusual request, sensitive action, customer impact, repeated failure, or a request outside the agent’s role. The handoff package should include the decision path, the evidence used, the missing information, the reason for escalation, the recommended next step and the deadline or SLA if time matters.

6. Audit, observability and recovery

What records must the product produce so the team can debug, prove and repair the agent’s work?

For simple workflows, this may mean action logs and human-visible history. For complex workflows, it means traces, tool calls, API records, model versions, prompt versions, workflow IDs, evaluation events and cost or latency metrics. If the workflow fails, the team should be able to reconstruct what happened without asking the agent to explain itself after the fact.
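As a sketch of what reconstruction without the agent could look like, the example below writes one structured record per tool call and rebuilds a workflow's timeline from those records. The field names and the JSON-lines format are assumptions for illustration, not a logging standard.

```python
import json
from datetime import datetime, timezone


def audit_event(workflow_id: str, tool: str, args: dict,
                model_version: str, prompt_version: str,
                at: datetime) -> str:
    """Serialize one tool call as a JSON line (hypothetical schema)."""
    return json.dumps({
        "workflow_id": workflow_id,
        "timestamp": at.isoformat(),
        "tool": tool,
        "args": args,
        "model_version": model_version,
        "prompt_version": prompt_version,
    }, sort_keys=True)


def reconstruct(log_lines: list, workflow_id: str) -> list:
    """Return one workflow's events in time order, using only the log."""
    events = [json.loads(line) for line in log_lines]
    mine = [e for e in events if e["workflow_id"] == workflow_id]
    return sorted(mine, key=lambda e: datetime.fromisoformat(e["timestamp"]))


t1 = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
t2 = datetime(2025, 1, 1, 12, 5, tzinfo=timezone.utc)
log = [
    audit_event("wf-1", "update_ticket", {"id": 7}, "model-a", "prompt-3", t2),
    audit_event("wf-2", "read_ticket", {"id": 9}, "model-a", "prompt-3", t1),
    audit_event("wf-1", "read_ticket", {"id": 7}, "model-a", "prompt-3", t1),
]
timeline = reconstruct(log, "wf-1")
```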

Recovery belongs in the same section because audit without repair is theater. Who can reverse the action? What can be rolled back? What must be disclosed? Who owns customer remediation? Who updates the workflow so the same failure does not repeat?

7. Adoption readiness and rollout path

Is the workflow documented, integrated, governed and understood well enough for an agent to operate? Look for the basics: clear use case, clean enough data, working integrations, documented process, trained users, governance model and supervision capacity. If those are missing, the smarter move may be to ship a lower-autonomy agent first. Let the agent recommend, summarize and prepare work before it acts.

A sensible rollout narrows the first version on three axes at once: less autonomy, more reversible work and tighter delegated authority. The goal is to keep autonomy tied to evidence that the workflow can support it, not to slow down agent adoption.

Where this fits in the product process

The canvas is the upstream artifact, not the final one. Once you answer it, you can write three downstream specifications with more precision.

The first is the agent identity record: who or what the agent is, what authority it carries, when that authority expires, how it is revoked, and who owns the lifecycle. The second is the workflow blueprint: the steps, decisions, tools, system boundaries, human handoffs and failure states. The third is the control set: the guardrails, logs, alerts, evaluations, prompt-injection protections, approval gates and recovery paths that make the workflow operable.

This matters because the word “persona” can mislead teams if it stops at UX language. The canvas should produce an operating contract that engineering, security, operations, compliance, support, and product can all challenge, rather than a profile with a name, tone and personality.

You need answers to six questions before launch. Can you bound the action? Does the workflow give the model enough context to be accurate? Whose authority does the agent carry, how long does it last, and who can revoke it? Does the human get enough information to change the outcome? Do your metrics capture quality and supervision cost, not only volume? Can you reconstruct the decision path after something goes wrong?

A product persona for an AI agent is a specification for a non-human participant in the workflow. It defines the work the agent owns, the authority it carries, the boundaries it respects, the moments where it stops and the evidence it leaves behind. The product environment that allowed the bad call is the thing this work is meant to design out.
