Reference architecture: a 4-week MCP server build

Last verified: June 2026· reference architecture

Starting state

The buyer is in the following situation on day 0. We name it honestly because the engagement can't start from a fantasy starting point.

The team has identified a specific system to expose to AI agents (e.g. Datadog, Postgres, Snowflake, an internal API)
A named internal owner with on-call access to that system
No production MCP server for that system yet (or a community / vendor server that doesn't meet the team's security bar)
The team uses Claude Code, Cursor, or Cline; MCP support is in place

Week-by-week plan

The work, phase by phase. Every phase has a clear focus, a clear deliverable, and a clear handoff to the next phase.

Week 1
Scope + auth + read-only smoke test
Deliverables
- One-page spec: the system, the use case, the tool surface (3–5 tools), the auth story, the production concern
- Read-only credential for the system (least-privilege, scoped, rotatable)
- A first server implementation, with a read-only smoke test that proves the credential + scope
Week 2
Tools + transport + production concern
Deliverables
- Production tool surface: list_X, get_X, search_X — verb-first, typed, with one-paragraph descriptions
- Transport picked (stdio for local dev, HTTP+SSE for shared) and stuck with
- The production concern designed around (e.g. read-only DB role, no auto-transition, restricted Stripe key)
Week 3
Rollout + audit log + human gate
Deliverables
- Server behind a feature flag; rolled to one engineer, then a small team, then the org
- Audit log on every tool call (who, what, when, request, response) wired to the team's observability stack
- Human-approval gate for high-impact tools (destructive actions, cost thresholds)
Week 4
Handoff + runbook + next server
Deliverables
- One-page runbook the internal owner can use to debug, version, and roll the server forward
- MCP config in version control, with the server version pinned
- Pattern documentation so the second server (Postgres → Snowflake, say) takes 2 weeks, not 4

End state

What the buyer has when the engagement ends. Quantified where we can quantify it; named where we can name it.

One production MCP server, behind a feature flag, with audit logging and a human-approval gate
The team can answer "should the agent be able to call this tool?" with a documented yes/no and a scope
Every tool call is logged to the same observability stack as the rest of production
The named internal owner can debug, version, and roll the server without us
The second server in the same category takes 2 weeks, not 4

What the buyer owns

Everything. The code is in the team's repo. The dashboards are in the team's stack. The runbook is in the team's wiki. The credentials are in the team's secret store. We do not operate managed services and we do not retain access after handoff. The point of the engagement is to leave the team running the system themselves, well enough to hire in-house and transition out cleanly.

The server code, in the team's repo, in the team's preferred language
The credentials, in the team's secret store
The audit log, in the team's observability stack
The runbook, owned by the named internal owner
The MCP config, in version control, with versions pinned

Reference architecture: a 4-week MCP server build

Starting state

Week-by-week plan

Week 1

Week 2

Week 3

Week 4

End state

What the buyer owns

Related