Cairn · May 28, 2026 ↑Agent Safety and Memory

MCP Is an Interface Boundary

Why agent tools need typed contracts, policy, and confirmation instead of clever prompt glue · ~16 min read · Suggested by Q engineeringoperations

ai architecture security tools

Model Context Protocol is easy to describe as USB-C for agents. That metaphor is useful, but it hides the engineering decision that matters most: MCP is where a team decides what an agent is allowed to know, what it is allowed to do, and what proof must exist before a side effect happens.

aiarchitecturesecuritytools

The protocol is less interesting than the boundary

MCP matters because it moves agent integration out of vague prompt glue and into explicit interfaces. A useful server does not just expose a capability; it names the operation, validates inputs, describes side effects, and gives the client enough context to ask for confirmation before mutation. That makes MCP a boundary between language and authority.

Tools, resources, and prompts are different promises

The protocol separates surfaces that teams often blur. Resources are context, prompts are reusable interaction patterns, and tools are executable capabilities. Treating those as separate promises keeps an agent from turning every piece of text into permission to act.

The Osprey prototype shows the right shape

Strike’s MCP package is deliberately small: one read-only geocoder, one ECO creation tool, schema validation, session IDs, sanitized logging, and explicit warnings that production needs TLS and real auth. That is enough to teach the pattern: bind agent-friendly language to deterministic application commands.

A tool is also a product decision

Every exposed tool encodes product policy. createEco says which fields matter before an emergency callout exists, when geocoding is allowed, how attachments are treated today, and where job-number reservation belongs. If that policy lives only in the model’s prompt, it will drift. If it lives in a tool contract, it can be reviewed, tested, and versioned.

Security starts before authorization

OAuth and scopes matter, but the earlier security decision is tool shape. Read-only and mutating tools should be separate. Open-world tools should be marked and treated differently. Inputs need schemas. Errors need redaction. Network exposure needs TLS. Human confirmation belongs before irreversible operations, not after.

What to build next

The next useful MCP work is not a giant catalog of tools. It is a narrow set of operationally important actions with typed inputs, durable audit trails, confirmation checkpoints, and tests around the exact mutations we care about. Start with one workflow that humans already perform by hand.

What the team should take away

MCP is not magic. It is a standard place to put ordinary engineering judgment: typed inputs, separated capabilities, transport choices, auth boundaries, confirmation moments, logging rules, tests, and audit. When an agent crosses from advice into action, the answer should be in the boundary, not merely in the prompt.

Discussion Prompts

The useful team question is which workflows deserve tool contracts because they already cross trust boundaries, not which ones would be flashy in a demo.

References

Osprey Strike packages/mcp source - The internal prototype that grounds the article: ECO creation, geocoding, transport, validation, and logging.
Model Context Protocol architecture - The official host, client, and server framing and the distinction between tool, resource, and prompt capabilities.
MCP tools specification - Tool listing, invocation, structured content, and annotations such as read-only and open-world hints.

The easiest way to misunderstand MCP is to treat it as a connector marketplace. The phrase “USB-C for AI” is memorable, and it is not wrong, but it points attention at plug shape when the more important question is authority. Once an assistant can call tools, the team has crossed from text generation into operational software. A tool can read a customer record, reserve an identifier, file a ticket, create an ECO, send a message, or mutate production state. At that point the interesting object is not the model. It is the interface boundary around the action.

That boundary is where we decide what the agent may know, what it may do, how inputs are validated, how users confirm intent, how errors are redacted, how audit survives, and which operations remain impossible no matter how persuasive the conversation becomes. MCP gives us a common protocol for making those decisions visible. It does not make the decisions for us.

This cairn uses Osprey Strike’s experimental MCP package as the concrete example. It is intentionally small: a server that exposes a geocoder and an emergency-callout creation workflow to MCP-aware clients. It is not production-critical infrastructure yet. That is what makes it a good teaching specimen. The important parts are visible without a giant platform around them.

Key Takeaway

MCP is most valuable when we treat it as a typed operational boundary, not as a nicer way to paste API docs into a prompt.

The protocol is less interesting than the boundary

A language model can improvise prose, but production systems should not depend on improvised authority. Before MCP-style tool contracts, many AI integrations looked like this: put API instructions in a system prompt, hope the model extracts the right fields, call a backend endpoint through custom glue, and clean up mistakes later. That can work for demos. It is a poor place to put operational trust.

MCP changes the shape by giving the agent client a protocol-level way to discover capabilities and call them. A server can advertise a tool named createEco, a schema for its inputs, annotations about its behavior, and structured responses. The host and client can mediate that call instead of treating the model’s next paragraph as an implicit command.

The subtle win is not that every app gets one universal integration. The win is that tool behavior becomes inspectable. A reviewer can ask ordinary engineering questions: What are the required fields? Which tool mutates state? Does this call reach the open internet? What gets logged? What error data can leak? What confirmation should the human see? What should be tested?

That is a better set of questions than “is the model smart enough?” Smart models still need bounded authority. In fact, smarter models need it more, because they are better at finding paths through vague instructions.

The USB-C metaphor is still useful for interoperability: hosts, clients, and servers can agree on one connection model. The metaphor stops being sufficient when the connected device can spend money, touch customer data, or change operational state.

flowchart TD
  U[Human intent] --> H[Agent host]
  H --> M[Model]
  H --> C[MCP client]
  C --> S[MCP server]
  S --> T1[Read-only tool]
  S --> T2[Mutating tool]
  S --> R[Resources]
  S --> P[Prompts]
  T2 --> A[Application command]
  A --> E[Audit trail]

The diagram is deliberately boring. Boring is good here. The boundary should make the dangerous path obvious: human intent becomes model reasoning, the host mediates tool access, the server validates a request, and the application command records a side effect.

Tools, resources, and prompts are different promises

MCP’s capability types matter because they prevent a category error we make too easily in agent systems. Not every piece of context is a command. Not every reusable instruction is a permission. Not every API endpoint should be exposed as a tool.

Resources are material the model can read: documents, files, records, or generated context. Prompts are reusable interaction templates. Tools are callable functions. Those three surfaces may sit next to each other in the same server, but they carry different promises.

The distinction matters most when content is untrusted. A resource can contain a customer email, a vendor PDF, a web page, or a support transcript. Any of those can include hostile text. If the system blurs resource text and system authority, a malicious document can try to steer the agent. The Injection Problem covers that failure mode in detail; MCP does not remove it. MCP gives us a place to enforce that resource content is data, while tools remain separately defined capabilities.

Definition

Tool contract: the reviewed interface that says a model may ask for a specific operation with specific inputs, but the host, server, application, and human confirmation path still decide whether it happens.

That framing helps with ordinary product design too. Suppose an assistant has access to project documents and a createChangeOrder tool. The document can say a change order is needed. The tool contract can still require project ID, scope, amount, approver, supporting evidence, and confirmation. The document suggests; the tool decides what a valid request looks like.

This is why “just give the agent database access” is usually the wrong first move. Direct access maximizes flexibility and minimizes product judgment. Good MCP design does the opposite: expose a narrow operation that matches how the business wants work to happen.

The Osprey prototype shows the right shape

Osprey Strike’s MCP package is marked experimental, which is the correct posture. It is not part of the official deployment, and the README says to treat it with reduced rigor compared with the API and web packages. Still, it already demonstrates several good instincts.

The server exposes two tools. geocodeAddress converts a free-form location string into candidate GeoJSON geometries through Nominatim. It is annotated as read-only, idempotent, and open-world because it reads from a public service. createEco guides the assistant through creating a fully specified emergency callout, optionally geocoding locations, reserving a job number if needed, and calling the createECO GraphQL mutation.

Those two tools should not be one tool. Geocoding is exploratory. ECO creation is mutating. Separating them gives the host and the model a chance to handle them differently. The assistant can call the read-only lookup, present candidates, ask the human which geometry is right, summarize the final payload, and only then invoke the mutating tool.

Scenario: Turning a Slack-ish request into a bounded tool flow

@Corey Open an ECO for loss of signal near 20th and Main. Looks like vandalism.

@Q I need job type and a location confirmation. I can geocode that intersection, show candidates, then summarize the ECO payload before creation.

@Corey CMR. Use the Conway candidate.

@Q Confirming: CMR ECO, loss of signal/vandalism, Conway 20th and Main geometry, no attachments, job number reserved automatically. Create it?

The important thing in that scenario is not conversational polish. It is that the flow creates a checkpoint between interpretation and mutation. The human’s first sentence is not enough authority to create production state. It is enough authority to gather missing fields.

The implementation reinforces that shape. The server uses schema validation for environment and tool inputs. It validates MCP session IDs as UUIDs before looking them up. It caps JSON body size. It sanitizes GraphQL errors before logging so bearer tokens and request details are not casually emitted. The README calls out that bearer tokens and request data are plaintext over HTTP unless a reverse proxy terminates TLS, and it warns not to expose the MCP HTTP port directly to the internet.

None of that is glamorous. All of it is what separates an integration from a prompt trick.

A tool is also a product decision

Every tool name hides policy. createEco sounds like a technical adapter, but it is really a product decision about how emergency callouts should enter the system. It says an ECO needs enough structure to satisfy the GraphQL mutation. It says job-number reservation belongs server-side when the user does not provide one. It says geocoding can help when the human gives an address, but the geometry should be confirmed. It says attachments are string identifiers for the current proof of concept, not a full upload system.

Those details matter because agents are good at making vague workflows look complete. A model can write a beautiful summary of an ECO with no durable effect. It can also call a backend with half-guessed fields if the tool lets it. The tool contract is where we stop both failures.

The same pattern applies outside Strike. In From Intake Folder to Project Memory, the core rule was that agents should not freehand writes; mutations should go through MCP tools or equivalent validated interfaces. That is not a protocol fetish. It is a way to keep source truth, derived artifacts, identity, lineage, and audit from collapsing into a model’s latest confident answer.

Key Takeaway

If a workflow has business meaning, the tool should expose the business operation, not the lowest-level database or API primitive.

For a project knowledge system, that might mean ingestDocument, classifyDocument, linkRevision, or materializeProjectRegister, not writeFile. For an operations system, it might mean acknowledgeOutage, assignCrew, or closeECO, not updateRow. For an internal assistant, it might mean draftReply and requestSendApproval, not sendEmail as a casual default.

The shape of the tool tells the model what kind of work exists. More importantly, it tells humans what kind of work can be safely delegated.

Security starts before authorization

Authorization is necessary, but it is not the first security control. Long before OAuth scopes and bearer tokens, the team has to decide whether the operation should exist as a tool, how narrow it should be, what input schema it accepts, and what confirmation path it requires.

MCP tool annotations help here, but they are hints, not a security perimeter. A readOnlyHint tells the client and model how to think about a tool. It does not replace server-side enforcement. A mutating tool still needs application authorization. An open-world tool still needs data-handling judgment. A destructive operation still needs explicit human confirmation and probably stronger controls than a local convenience script.

The Strike prototype points at several practical controls worth keeping:

Control	Why it matters
Separate read-only from mutating tools	Lets the host, model, and human treat lookup differently from side effect.
Validate inputs with schemas	Prevents the model from inventing shape and moves errors to the boundary.
Sanitize logs	Keeps tokens, headers, and payload details out of ordinary error output.
Validate session identifiers	Avoids accepting arbitrary header values as session keys.
Require TLS in production	Keeps bearer tokens and operational payloads off plaintext transport.
Confirm before mutation	Gives the human one last chance to catch bad interpretation.

That list is not exotic. It is the normal discipline of service design applied to agent tools. The novelty is that the caller is probabilistic, conversational, and vulnerable to hostile text in its context window.

MCP annotations such as read-only, destructive, idempotent, and open-world are useful because they make tool behavior legible to clients and models. Treat them like labels on a breaker panel: important for operation, insufficient as the only safety mechanism.

This also connects to Three Gates, One Identity. Browser users, webhooks, upstream APIs, and agent tools should not all share one mushy trust boundary. MCP should get its own gate: client identity, server authorization, application command authorization, and audit that says the action came through an agent-mediated path.

What to build next

The tempting MCP roadmap is a large tool catalog. Resist that. Tool catalogs are useful after the first few contracts are excellent. Before that, they multiply vague authority.

The better next step is to pick one workflow that already matters and make it boringly solid. ECO creation is a good candidate because the current prototype already names the shape. A production-grade version would likely add real auth wiring, deployment behind TLS, tighter tenant authorization, richer tests, durable audit events, attachment upload orchestration, and a deterministic planning step that can enumerate missing fields before any mutation is possible.

That deterministic planning idea is worth pausing on. The README mentions a possible planCreateEco mutation. This is exactly the kind of design that belongs near agent tools. Instead of asking the model to remember every required field and validation nuance, the application can answer: here is what you have, here is what is missing, here are valid choices, here are geocode candidates, here is whether the operation is ready to confirm. The model remains useful as the conversational layer, but the application owns readiness.

sequenceDiagram
  participant Human
  participant Agent
  participant MCP as MCP Server
  participant App as Strike API
  Human->>Agent: Open an ECO near 20th and Main
  Agent->>MCP: geocodeAddress(query)
  MCP-->>Agent: candidates
  Agent-->>Human: confirm location and missing fields
  Human->>Agent: CMR, Conway candidate
  Agent-->>Human: summarize payload, ask create confirmation
  Human->>Agent: confirmed
  Agent->>MCP: createEco(payload)
  MCP->>App: reserveJobId + createECO
  App-->>MCP: ECO created
  MCP-->>Agent: structured result

The same planning pattern would help document ingestion, change-order drafting, crew assignment, GitHub issue triage, Notion updates, and Slack delivery. The agent can negotiate with the human. The application should decide when the request is complete enough to execute.

What the team should take away

MCP is not magic, and that is exactly why it is useful. It gives us a standard place to put old-fashioned engineering judgment: typed inputs, separated capabilities, transport choices, auth boundaries, confirmation moments, logging rules, tests, and audit.

For Constructured, the useful mental model is simple. When an agent needs to cross from advice into action, look for the boundary. If the action is low-risk and local, a shell command or repo script may be enough. If it touches business state, customer data, project truth, messages, tickets, deployments, or external systems, it probably deserves a tool contract. MCP is one way to make that contract portable across clients. The discipline is the real asset.

That discipline also gives us a shared review language. A proposed MCP server should be reviewed like any other production-facing interface. What authority does it grant? What state can it mutate? What input does it validate? What data can the model see? What happens when the model is wrong? What happens when the source content is hostile? What audit remains after the conversation is gone?

The answer should not be “the prompt says to be careful.” The answer should be in the boundary.

MCP is an interface boundary. Its value is not just interoperability; it is a place to turn agent authority into reviewed contracts.
Separate context from capability. Resources, prompts, and tools carry different promises, and only tools should represent executable authority.
Expose business operations, not raw plumbing. A good tool says createEco, ingestDocument, or assignCrew, with the policy and validation those operations require.
Security starts with shape. Narrow tools, schemas, annotations, confirmation, redacted logs, TLS, and server-side authorization matter before the model ever asks for a call.
Build the first tool deeply before building many. One operationally important, tested, audited workflow teaches more than a broad demo catalog.

Discussion Prompts

Which current workflow crosses from advice into business-state mutation and should get a typed tool contract before we delegate it further?
For the Strike MCP prototype, should the next production step be auth, deterministic planning, attachment handling, audit events, or tenant-bound authorization?
Where are we still letting agents freehand writes through generic filesystem, browser, or API access when a narrower business operation would be safer?

References

Osprey Strike packages/mcp/README.md - Internal prototype overview, local setup, example ECO workflows, and security notes about TLS, binding, and future deterministic planning.
Osprey Strike packages/mcp/src/tools/createEcoTool.ts and geocodeTool.ts - The concrete tool contracts: mutating ECO creation versus read-only open-world geocoding.
Osprey Strike packages/mcp/src/server.ts, graphql/client.ts, and config/env.ts - Transport, session handling, schema-validated configuration, GraphQL calls, and sanitized error logging.
Model Context Protocol: Architecture - Official host, client, and server model for MCP, including the capability categories used in this article.
Model Context Protocol: Tools Specification - Official tool discovery and invocation model, including structured tool results and behavior annotations.
Model Context Protocol: Authorization - Current authorization guidance for HTTP-based MCP, including the OAuth-centered direction for protected resource access.
OpenAI Apps SDK Reference - OpenAI's MCP-compatible app/tool surface, useful as evidence that MCP-style tool contracts are becoming a common integration layer rather than one vendor's private API.
OWASP GenAI: LLM01 Prompt Injection - Security background for why untrusted content must not be allowed to become tool authority.

Generated by Cairns · Agent-powered with Claude

← Back to Trailhead