Production AI Institute — vendor-neutral certification for AI practitioners
Guide · Cursor SDK · Published launch day, 30 April 2026

Cursor SDK in Production:
Three PSF-Compliant Deployment Patterns

CI/CD pipeline agents, event-driven ambient agents, and embedded product agents — with all eight PSF domains applied to each. Architecture, required controls, and the anti-patterns that will hurt you in production.

Published 30 April 2026 — Cursor officially launched the Cursor SDK today. These patterns reflect the SDK at its v1 release. CC BY 4.0.

Why the PSF applies directly to Cursor SDK deployments

The PSF was designed for exactly this class of deployment: AI systems that take consequential actions in production environments, with inputs from external sources and outputs that affect real data, real communications, and real infrastructure. The Cursor SDK, in any of its three deployment contexts, is precisely this.

D1 Input Governance
SDK inputs come from PRs, webhooks, or users — all external, all untrusted
D2 Output Validation
Agent outputs become code commits, customer communications, or product data
D3 Data Protection
Agent context routinely contains source code, credentials, and customer data
D4 Observability
SSE streaming gives you the hooks — but monitoring must be built
D5 Deployment Safety
Blast radius spans filesystems, APIs, and communication channels
D6 Human Oversight
Every consequential action needs a review gate before execution
D7 Security
Filesystem + terminal + MCP = widest attack surface of any framework
D8 Vendor Resilience
Cloud execution creates a hard dependency on Cursor infrastructure
Pattern P1

CI/CD Pipeline Agent

Automate code review, test generation, and security scanning on every PR
USE CASE

A Cursor SDK agent runs as a GitHub Actions step. On every pull request, it performs automated code review against your organisation's standards, generates missing test cases, scans for common security patterns, and posts a structured review comment — all before human review begins.

Architecture

1. PR opened: GitHub webhook fires
2. Input governance layer: validate diff size, file types, branch scope before agent call
3. Cursor SDK agent run: per-prompt run with scoped filesystem access to the PR diff only
4. Output validation: parse agent output against ReviewSchema before posting
5. Human review gate: agent comment is advisory; PR merge still requires human approval

PSF Controls Required

D1 Input Governance
Required

Validate the PR diff before passing to the agent: check file types are in scope, reject diffs that modify files outside the PR's declared scope, cap diff size at a defined token limit to prevent context flooding.
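These checks can be sketched as a small pre-agent gate. The extension allowlist, scope list, and token cap below are illustrative values to tune per repository, not PSF-mandated limits:

```typescript
// Pre-agent diff gate: reject out-of-scope or oversized diffs before any
// SDK call. All thresholds here are assumptions, not prescribed values.
const ALLOWED_EXTENSIONS = new Set([".ts", ".tsx", ".js", ".py", ".go"]);
const MAX_DIFF_TOKENS = 20_000; // rough cap to prevent context flooding

interface DiffFile {
  path: string;
  patch: string;
}

interface GateResult {
  ok: boolean;
  reason?: string;
}

function approxTokens(text: string): number {
  // Crude heuristic: roughly 4 characters per token.
  return Math.ceil(text.length / 4);
}

export function validateDiff(files: DiffFile[], declaredScope: string[]): GateResult {
  for (const file of files) {
    const ext = file.path.slice(file.path.lastIndexOf("."));
    if (!ALLOWED_EXTENSIONS.has(ext)) {
      return { ok: false, reason: `file type out of scope: ${file.path}` };
    }
    if (!declaredScope.some((dir) => file.path.startsWith(dir))) {
      return { ok: false, reason: `file outside declared scope: ${file.path}` };
    }
  }
  const total = files.reduce((sum, f) => sum + approxTokens(f.patch), 0);
  if (total > MAX_DIFF_TOKENS) {
    return { ok: false, reason: `diff too large: ~${total} tokens` };
  }
  return { ok: true };
}
```

A gate like this runs before the agent is ever invoked, so a flooding attempt fails closed instead of consuming the agent's context window.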

D2 Output Validation
Required

Parse agent review output against a ReviewSchema before posting. Required fields: summary string, issues array (each with file/line/severity/description), verdict enum (approved/changes-requested/advisory). Reject malformed output rather than posting it.
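A minimal hand-rolled validator for the ReviewSchema described above might look like this. The severity values are an assumption (the guide only specifies the field); a production system would more likely use a schema library such as zod:

```typescript
// ReviewSchema validator: returns null for malformed output so the caller
// rejects it rather than posting it. Severity values are assumed.
type Severity = "info" | "warning" | "error";
type Verdict = "approved" | "changes-requested" | "advisory";

interface ReviewIssue { file: string; line: number; severity: Severity; description: string; }
interface Review { summary: string; issues: ReviewIssue[]; verdict: Verdict; }

const SEVERITIES = new Set(["info", "warning", "error"]);
const VERDICTS = new Set(["approved", "changes-requested", "advisory"]);

export function parseReview(raw: string): Review | null {
  let data: unknown;
  try { data = JSON.parse(raw); } catch { return null; }
  const r = data as Partial<Review>;
  if (typeof r.summary !== "string" || !Array.isArray(r.issues)) return null;
  if (typeof r.verdict !== "string" || !VERDICTS.has(r.verdict)) return null;
  for (const issue of r.issues as any[]) {
    if (typeof issue?.file !== "string" || typeof issue?.line !== "number") return null;
    if (!SEVERITIES.has(issue?.severity) || typeof issue?.description !== "string") return null;
  }
  return r as Review;
}
```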

D3 Data Protection
Required

The agent context will contain source code. Ensure the agent's cloud execution environment does not have access to .env files, secrets, or files outside the PR diff. Pass only the diff to the agent context — not the full repository.

D4 Observability
Required

Log every agent run: PR ID, run duration, token consumption, review verdict, issue count. Alert on runs exceeding 3 minutes or 50k tokens. Retain logs for 90 days for audit and calibration.

D5 Deployment Safety
Strong by default

Per-prompt runs provide natural blast-radius containment. Each PR triggers one bounded agent run. The agent cannot carry state between PRs. Use the SDK lifecycle to ensure runs are archived after completion.

D6 Human Oversight
Strong by default

The agent produces advisory output only. A human engineer must still review and approve the PR before merge. Never configure branch protection rules that allow agent approval to substitute for human review.

D7 Security
Required

Run the agent under a service account with read-only repository access. The CI/CD step should not grant the agent write permissions to the main branch. Treat the agent's review output as untrusted before validation.

D8 Vendor Resilience
Required

Implement a fallback path for Cursor SDK unavailability: the CI/CD step should be non-blocking on Cursor service outages. Configure the step to produce a skip result (not a failure) when the SDK is unreachable, and alert the team to run manual review.
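One way to sketch the non-blocking behaviour: wrap the SDK call so an outage yields a skip result plus an alert instead of a failed check. `runAgentReview` and `notifyTeam` are placeholder callbacks for this sketch, not real SDK functions:

```typescript
// Non-blocking CI wrapper: a Cursor outage produces a "skipped" outcome
// (which the CI step reports as a skip, not a failure) and pages a human.
type CiOutcome = { status: "reviewed" | "skipped"; detail: string };

export async function reviewOrSkip(
  runAgentReview: () => Promise<string>,
  notifyTeam: (msg: string) => Promise<void>,
): Promise<CiOutcome> {
  try {
    const review = await runAgentReview();
    return { status: "reviewed", detail: review };
  } catch (err) {
    // Service unreachable: do not fail the pipeline. Surface a skip and
    // alert the team to run manual review instead.
    await notifyTeam(`Cursor SDK unavailable, manual review required: ${err}`);
    return { status: "skipped", detail: String(err) };
  }
}
```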

⚠ ANTI-PATTERN

Giving the agent write access to commit changes directly. The agent must remain advisory — it can identify issues but humans apply the changes.

IMPLEMENTATION NOTE

Use GitHub Actions with the Cursor SDK TypeScript client. Set CURSOR_API_KEY from GitHub Secrets. Scope filesystem access to the checked-out PR diff directory only.

Pattern P2

Event-Driven Ambient Agent

Respond to business events — emails, tickets, Slack messages — with structured AI actions
USE CASE

A Cursor agent is triggered by incoming support tickets. When a ticket arrives matching a defined category (e.g., billing dispute, integration error), the agent analyses the ticket, retrieves relevant account data via a read-only CRM MCP, drafts a structured response with supporting documentation links, and places it in a review queue — never sending directly.

Architecture

1. Event arrives: webhook from support platform (Zendesk, Linear, etc.)
2. Input classification: classify ticket category and check against allowed agent categories
3. Scope validation: verify the event is within agent operational scope before proceeding
4. Cursor SDK agent run: agent retrieves account context via read-only MCP, drafts response
5. Output validation + queue: validate response structure, place in human review queue; nothing is sent
6. Human approval gate: support agent reviews draft, edits if needed, sends manually

PSF Controls Required

D1 Input Governance
Critical — implement before launch

This is the most critical domain for event-driven agents. Every incoming event is from an external, potentially untrusted source. Implement: event source authentication (verify webhook signatures), input classification (is this event within the agent's defined operational scope?), content sanitisation (strip HTML, normalise encoding, check for injection patterns in user-submitted text before passing to the agent).
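Webhook signature verification is the first of those checks. The sketch below follows the common HMAC-SHA256 scheme (for example GitHub's X-Hub-Signature-256 header); confirm the exact header name and encoding in your support platform's documentation:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify an HMAC-SHA256 webhook signature over the raw request body.
// Header format ("sha256=<hex>") is a common convention, assumed here.
export function verifySignature(payload: string, signatureHeader: string, secret: string): boolean {
  const expected = "sha256=" + createHmac("sha256", secret).update(payload).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so check length first.
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```

Using `timingSafeEqual` rather than `===` avoids leaking signature prefixes through timing differences.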

D2 Output Validation
Critical — implement before launch

Define an explicit output schema for every action the agent can take. For a response draft: ResponseDraft schema with required fields (subject, body, category, confidence, account_id). Reject drafts that don't conform. Validate that referenced documentation links exist before including them. Never pass raw agent output to a communication channel.
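A minimal check over those fields, with a simple allowlist-prefix test standing in for full link-existence verification. The docs host and the 0-to-1 confidence range are assumptions for this sketch:

```typescript
// ResponseDraft validator using the fields listed above. Returns a list
// of errors; an empty list means the draft may enter the review queue.
interface ResponseDraft {
  subject: string;
  body: string;
  category: string;
  confidence: number;
  account_id: string;
}

const DOC_PREFIX = "https://docs.example.com/"; // hypothetical docs host

export function validateDraft(d: Partial<ResponseDraft>): string[] {
  const errors: string[] = [];
  if (typeof d.subject !== "string" || d.subject.length === 0) errors.push("missing subject");
  if (typeof d.body !== "string" || d.body.length === 0) errors.push("missing body");
  if (typeof d.category !== "string") errors.push("missing category");
  if (typeof d.confidence !== "number" || d.confidence < 0 || d.confidence > 1) errors.push("bad confidence");
  if (typeof d.account_id !== "string") errors.push("missing account_id");
  // Every link in the body must point at the documentation host.
  const links = (d.body ?? "").match(/https?:\/\/\S+/g) ?? [];
  for (const link of links) {
    if (!link.startsWith(DOC_PREFIX)) errors.push(`link outside docs allowlist: ${link}`);
  }
  return errors;
}
```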

D3 Data Protection
Critical — implement before launch

The agent will process customer data from the support ticket and potentially retrieve account data via MCP. Ensure: the agent context contains only the data required for the specific task, CRM MCP access is read-only with explicit field-level scope (do not grant access to payment data or full account history when only account status is needed), agent runs are not retained longer than required, PII in agent logs is masked.

D4 Observability
Required

Log every event processed: ticket ID, classification result, agent run ID, run duration, token usage, output validation result, queue placement. Build a dashboard showing daily volume, classification distribution, draft acceptance rate (after human review), and reject rate. This data calibrates the agent and identifies drift over time.

D5 Deployment Safety
Required

Implement action budgets: if the agent attempts more than N MCP tool calls in a single run, abort and flag for review. Use per-prompt runs for each event. Define explicit category scope — if a ticket is not in the agent's defined categories, route to human handling, do not pass to the agent.
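The action budget can be as simple as a counter that throws past the limit; the default of 10 calls below is an assumption to tune per workload:

```typescript
// Action budget: count MCP tool calls within one agent run and abort the
// run once the limit is exceeded, rather than allowing unbounded loops.
export class ActionBudget {
  private used = 0;
  constructor(private readonly limit: number = 10) {}

  spend(action: string): void {
    this.used += 1;
    if (this.used > this.limit) {
      // Caller catches this, aborts the run, and flags it for review.
      throw new Error(`action budget exceeded at '${action}' (${this.used}/${this.limit})`);
    }
  }
}
```

Instantiate one budget per agent run so counts never carry across events.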

D6 Human Oversight
Critical — implement before launch

No communication leaves the organisation without human review. The draft review queue is not optional — it is the core control. If queue processing falls behind (SLA breach), escalate to human handling rather than allowing unsupervised agent sends. Track time-in-queue and alert on unusual patterns.

D7 Security
Required

Treat every incoming event as potentially adversarial. A sophisticated attacker can craft a support ticket designed to manipulate the agent's response or exploit the MCP connection. Implement: input content classification before agent access, MCP read-only scope enforcement, output review before any external action. Rotate MCP connection credentials on a regular schedule.

D8 Vendor Resilience
Required

All events must be handled — Cursor service availability cannot be a single point of failure for customer support. Maintain a fallback routing path that sends tickets directly to human queues when the SDK is unavailable. Monitor SDK availability and alert on sustained outages.

⚠ ANTI-PATTERN

Connecting the agent directly to a send-capable email or messaging MCP without a review queue. The agent must never be the final actor on communications.

IMPLEMENTATION NOTE

Use a webhook receiver (e.g., Next.js API route or serverless function) to validate and enqueue events. The Cursor SDK call happens in a queue worker, not in the webhook handler. This separates event receipt from agent execution and gives you retry and backpressure controls.
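The shape of that split, with an in-memory queue purely for illustration (production would use a durable queue such as SQS, Pub/Sub, or BullMQ):

```typescript
// Receive/enqueue/worker split: the webhook handler only validates and
// enqueues; the Cursor SDK call would live in the queue worker.
type TicketEvent = { id: string; payload: string };

const queue: TicketEvent[] = [];

// Webhook handler: validate and enqueue only. No agent call here.
export function receiveWebhook(event: TicketEvent, signatureValid: boolean): boolean {
  if (!signatureValid) return false; // reject unauthenticated events
  queue.push(event);
  return true;
}

// Queue worker: process one event. Retries and backpressure belong to
// the queue layer, not the webhook handler.
export function drainOne(process: (e: TicketEvent) => void): boolean {
  const next = queue.shift();
  if (!next) return false;
  process(next);
  return true;
}
```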

Pattern P3

Embedded Product Agent

Ship AI agent capabilities inside your product — with defined scope and user oversight
USE CASE

A SaaS product embeds a Cursor agent as a 'Generate draft' feature. Users click a button in the UI, describe what they want, and the agent produces a structured output within the product — a report, a configuration file, a data pipeline definition. The output appears in an editable field before any action is taken.

Architecture

1. User action in product UI: user writes intent and triggers the agent in a bounded context
2. Input governance: server-side, authenticate user, validate intent against allowed operations, check rate limits
3. Context assembly: build agent context from the user's data only, with no cross-tenant data
4. Cursor SDK agent run: scoped to the user's workspace data, per-prompt run
5. Output validation: parse and validate agent output against the product's data schemas
6. User review + confirm: output rendered as editable draft; user explicitly applies

PSF Controls Required

D1 Input Governance
Critical — implement before launch

User-submitted intent text is untrusted input. Validate: length limits, content classification (is this within the feature's intended scope?), injection pattern detection. Implement server-side validation — never rely on client-side validation alone. Rate-limit per user per time window. Log all inputs with user ID for audit.
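A fixed-window per-user limiter is the simplest form of that rate limit. The window and limit below are assumptions, and a multi-instance deployment would back this with Redis or similar rather than in-process state:

```typescript
// Fixed-window per-user rate limiter. Limit and window are illustrative
// defaults; tune per feature and enforce server-side only.
export class RateLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();
  constructor(private readonly limit = 20, private readonly windowMs = 60_000) {}

  allow(userId: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(userId);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // New window for this user.
      this.counts.set(userId, { windowStart: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}
```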

D2 Output Validation
Critical — implement before launch

Every agent output must be validated against the product's data schemas before rendering to the user or allowing application. For a config file generator: parse the output, validate schema conformance, check for required fields, validate value ranges. An agent-generated config that bypasses schema validation is a security risk.

D3 Data Protection
Critical — implement before launch

Multi-tenant isolation is non-negotiable. The agent context must contain only the requesting user's data. Implement: tenant-scoped data access in your context assembly step, explicit verification that the assembled context contains no cross-tenant data, audit logging of what data was included in each agent context. GDPR and similar regulations apply to data processed by the agent.

D4 Observability
Required

Track per-user and per-feature agent usage: run count, token consumption, latency, output validation pass rate, user acceptance rate (did the user apply the draft or discard it?). Usage spikes may indicate abuse or unexpected feature adoption. Cost visibility per tenant is required for billing and capacity planning.

D5 Deployment Safety
Required

Implement per-user rate limits and monthly usage caps. Use per-prompt runs — embedded product agents should not maintain session state between user requests. Define explicit scope limits: the agent can access the user's data in the current workspace, nothing else. Implement hard limits on agent run duration (15s for sync, 5min for async).
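The hard duration limit can be sketched with `Promise.race`; `agentRun` here stands in for the actual SDK call, and the timeout value comes from your sync/async budget above:

```typescript
// Hard run-duration limit: reject if the agent run outlives its budget.
export async function withTimeout<T>(agentRun: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`agent run exceeded ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([agentRun, timeout]);
  } finally {
    clearTimeout(timer); // avoid a stray rejection after a fast success
  }
}
```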

D6 Human Oversight
Required

The user is the human oversight layer. Never apply agent output without an explicit user confirmation step. Render output as an editable draft, not as an immediately applied change. For bulk operations (e.g., applying a config to multiple resources), require explicit confirmation per resource or a batched-confirmation UI with clear impact summary.

D7 Security
Critical — implement before launch

Embedded agents are a high-value attack target. Inputs come from users who may attempt to manipulate the agent to access other users' data or bypass product controls. Implement: server-side input validation, strict tenant isolation in context assembly, output schema validation before rendering, content security policy headers to prevent XSS if agent output is rendered as HTML.

D8 Vendor Resilience
Required

The agent feature must degrade gracefully when Cursor is unavailable. Implement a feature flag that disables the agent UI and shows a 'currently unavailable' message rather than failing silently. Monitor SDK latency — if p95 latency exceeds your product's acceptable threshold, circuit-break to the fallback. Communicate planned maintenance to users proactively.
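A minimal consecutive-failure circuit breaker for that fallback path (the threshold and cooldown values are illustrative, and a latency-based trigger would layer on top of this):

```typescript
// Circuit breaker: trip after N consecutive failures, route requests to
// the fallback while open, and allow a trial request after the cooldown.
export class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  constructor(private readonly threshold = 3, private readonly cooldownMs = 30_000) {}

  isOpen(now: number = Date.now()): boolean {
    if (this.openedAt === null) return false;
    if (now - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: half-open, permit a trial request.
      this.openedAt = null;
      this.failures = 0;
      return false;
    }
    return true;
  }

  recordFailure(now: number = Date.now()): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }
}
```

When `isOpen()` returns true, the product shows the 'currently unavailable' state instead of calling the SDK.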

⚠ ANTI-PATTERN

Applying agent output to production data without a user confirmation step. The draft-and-confirm pattern is the minimum acceptable oversight model for embedded agents.

IMPLEMENTATION NOTE

Build context assembly as a separate server-side function that explicitly constructs the agent's context from scoped data queries. This function is auditable, testable, and can be reviewed independently of the agent logic.

Cross-Cutting Requirements

These requirements apply to all three patterns regardless of deployment context.

Secret management

Never store API keys, database credentials, or OAuth tokens in files accessible to the agent. Use a secrets manager (AWS Secrets Manager, GitHub Secrets, Doppler, Vault) and inject credentials as environment variables at runtime. A Cursor agent operating on a codebase that contains .env files will process those secrets in its context.

MCP scope review

Before connecting any MCP server to a Cursor SDK agent, document: what data the MCP server can read, what actions it can take, and whether those actions are reversible. Grant the minimum OAuth scope required. Separate read MCPs from write MCPs architecturally — an agent should not have write access to a service it only needs to read from.

Run ID logging

Every Cursor SDK agent run produces a run ID. Log this ID with your request metadata at the point of invocation, and store it in your observability system. When an incident occurs, the run ID lets you retrieve the full action trace from the SDK's durable agent store.

Model independence

The PSF is model-agnostic. The controls above apply regardless of whether your Cursor agent is running GPT-4o, Claude Sonnet, Gemini, or Cursor's own Composer model. The framework evaluates your deployment architecture and operational controls — not your model choice. Model changes are a D8 (Vendor Resilience) concern.

Testing before production

Before promoting a Cursor SDK agent to a production deployment, test: does input governance correctly reject out-of-scope inputs? Does output validation correctly reject malformed outputs? Does the human oversight gate work as designed? Is the fallback path operational? Run adversarial test cases: what happens if you pass a prompt injection attempt through the input layer?

Related guides

- Cursor SDK PSF Assessment: domain-by-domain rating of the SDK at launch
- Ambient Agents: safety requirements for agents embedded in existing tools
- D1 Input Governance: implementation guide for production input controls
- D6 Human Oversight: HITL patterns for production agents
- D7 Security Guide: prompt injection defence and MCP security
- Full PSF Framework: all 8 domains with implementation guidance